<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Volunteered Geographic Information &#187; uncertainty</title>
	<atom:link href="http://danieljlewis.org/tag/uncertainty/feed/" rel="self" type="application/rss+xml" />
	<link>http://danieljlewis.org</link>
	<description>A Geography/GIS blog by Daniel J Lewis</description>
	<lastBuildDate>Tue, 20 Dec 2011 17:15:30 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4-alpha-20124</generator>
		<item>
		<title>Multi Dimensional Scaling of Southwark OAC data</title>
		<link>http://danieljlewis.org/2010/04/28/multi-dimensional-scaling-of-southwark-oac-data/</link>
		<comments>http://danieljlewis.org/2010/04/28/multi-dimensional-scaling-of-southwark-oac-data/#comments</comments>
		<pubDate>Wed, 28 Apr 2010 16:34:47 +0000</pubDate>
		<dc:creator>Daniel Lewis</dc:creator>
				<category><![CDATA[Representation]]></category>
		<category><![CDATA[geodemographics]]></category>
		<category><![CDATA[LOAC]]></category>
		<category><![CDATA[MDS]]></category>
		<category><![CDATA[OAC]]></category>
		<category><![CDATA[Southwark]]></category>
		<category><![CDATA[uncertainty]]></category>

		<guid isPermaLink="false">http://danieljlewis.org/?p=283</guid>
		<description><![CDATA[Geodemographic classifications are funny things, they report a view of the world which suggests that areas can be split into groups within which all areas share the same or similar characteristics. This is not an inherently bad thing, for large scale analyses it can be a very useful way of simplifying a diverse array of [...]]]></description>
			<content:encoded><![CDATA[
<p style="text-align: center">
<p>Geodemographic classifications are funny things: they report a view of the world which suggests that areas can be split into groups within which all areas share the same or similar characteristics. This is not an inherently bad thing; for large scale analyses it can be a very useful way of simplifying a diverse array of variables into something that characterises the underlying patterns in the distribution of data. However, for smaller scale analyses I am increasingly finding that non-bespoke geodemographics are limited. I attempted to demonstrate this on a national scale by looking at the entropy scores for each OA in the UK with respect to distance from all supergroup cluster centres <a title="Entropy Scores for OAC Supergroups" href="http://danieljlewis.org/2010/02/10/oac-quality-using-entropy-scores/" target="_blank">(here)</a>. <a title="Dr Pete Fisher - Leicester" href="http://www.le.ac.uk/gg/staff/academic_fisher.html" target="_blank">Pete Fisher</a> presented some very clever work in this vein at the recent GISRUK 2010 conference: he used fuzzy classification strategies to account for the likelihood that each OA does not fit exactly into any particular grouping, and that different OAs fit differently into the same group. <a title="Aidan Slingsby - City" href="http://www.soi.city.ac.uk/~sbbb717/" target="_blank">Aidan Slingsby</a> at City also showed this very nicely visually with his &#8216;OAC Explorer&#8217;. Proceedings from the conference can be found <a title="GISRUK Proceedings 2010" href="http://eprints.ucl.ac.uk/19284/" target="_blank">here</a>.</p>
<p>With this in mind, I wanted to test the variability of the data in Southwark, my study site, with respect to OAC. OAC paints a very flat picture of the population of Southwark, as shown below, and this had led me to use LOAC, a London-specific variant of OAC created by Jacob Petersen, a previous research student at UCL, and available as a layer on the <a title="London Profiler" href="http://www.londonprofiler.org/" target="_blank">London Profiler</a>. Using OAC, Southwark is primarily &#8216;multicultural&#8217;; there is, however, more variability in the LOAC classification, as is evident.</p>
<p style="text-align: center">
<div id="attachment_284" class="wp-caption aligncenter" style="width: 560px"><a href="http://danieljlewis.org/files/2010/04/OACandLOACSwk.png"><img class="size-full wp-image-284 " title="OACandLOACSwk" src="http://danieljlewis.org/files/2010/04/OACandLOACSwk.png" alt="" width="550" height="378" /></a><p class="wp-caption-text">OAC and LOAC classifications for Southwark.</p></div>
<p>Inspired by some of <a title="JC's Academic Context" href="http://spatialanalysis.co.uk/surnames/" target="_blank">James Cheshire&#8217;s great work with surnames</a>, I employed a method called &#8220;Multi Dimensional Scaling&#8221;, or MDS. MDS is great for exploring similarities and dissimilarities in data: rather than clustering data, as in the creation of OAC, it positions datapoints so that similar datapoints receive similar values. One of its great advantages is that it allows data with many dimensions, such as the 41 OAC variables, to be scaled into fewer dimensions representative of those 41, which can subsequently be visualised. Traditional approaches in geography have used MDS to scale many dimensions into 2, using these 2 to adjust spatial coordinates to &#8216;blow apart&#8217; maps, moving similar places closer together and dissimilar places further apart. Such representations challenge the validity of Tobler&#8217;s 1st law &#8211; near things are more similar than distant things. In this case, however, I don&#8217;t want to blow up Southwark, so I follow Cheshire&#8217;s lead in using the scaling to specify a colour for each area, so that similar colours indicate areas that are similar in terms of OAC variables and different colours indicate dissimilar areas. I experimented with both greyscale and RGB colour scales for this representation. Firstly though, a note on how I got there:</p>
<ol>
<li>Download the OAC variables from CASWEB, using the &#8216;recipe&#8217; specified by <a title="Vickers Working Paper OAC" href="http://eprints.whiterose.ac.uk/5003/1/05-2.pdf" target="_blank">Vickers et al (2005)</a>.</li>
<li>Standardise all the variables &#8211; I used z-scores without checking the variables for normality, although checking first would be preferable &#8211; Vickers suggests some other methods of standardisation.</li>
<li>Compute a distance matrix for the MDS. This means calculating the dissimilarity of each pair of OAs; given n OAs this leads to an n x n matrix, a size that can rapidly become unmanageable beyond local scales. I used &#8216;Canberra distance&#8217; (an arbitrary choice) to compute the matrix, which is given by: <a href="http://danieljlewis.org/files/2010/04/CodeCogsEqn3.gif"><img class="aligncenter size-full wp-image-287" title="CodeCogsEqn(3)" src="http://danieljlewis.org/files/2010/04/CodeCogsEqn3.gif" alt="" width="171" height="61" /></a>where i indexes the first object in a pair and j the second, and k denotes the variable in question.</li>
<li>This matrix is then input into an MDS solver. As a Python fan, I used the fantastic code written using NumPy by <a title="MDS Python Script" href="http://code.google.com/p/pyrouette/source/browse/alg/mds.py" target="_blank">Jeremy Stober</a>, although I added to it so that the standardisation, distance matrix creation etc. are done as part of one logical process.</li>
<li>Specifying the number of output dimensions (I used 1 and 3) allows you to reduce the large distance matrix into a vector (1d) or matrix (3d) of values; these can then be scaled between 0 and 255 to be converted into digital numbers for visual display. Thanks to James Cheshire for the ArcGIS script to assign RGB values in Arc.</li>
</ol>
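<p>As a rough illustration of steps 2 to 4, here is a minimal NumPy sketch. It is not the script I actually ran: the input is random placeholder data rather than the OAC variables, the function names are made up for this example, the Canberra distance uses the common form with |x| + |y| in the denominator, and the solver is classical (Torgerson) MDS, which I am assuming is close enough to the linked script for illustration.</p>

```python
import numpy as np


def zscore(data):
    """Standardise each column to mean 0 and standard deviation 1."""
    return (data - data.mean(axis=0)) / data.std(axis=0)


def canberra_matrix(data):
    """Return the n x n matrix of pairwise Canberra distances between rows."""
    diff = np.abs(data[:, None, :] - data[None, :, :])
    total = np.abs(data[:, None, :]) + np.abs(data[None, :, :])
    # Terms where both values are zero are conventionally counted as 0.
    terms = np.divide(diff, total, out=np.zeros_like(diff), where=total != 0)
    return terms.sum(axis=2)


def classical_mds(dist, dims):
    """Classical (Torgerson) MDS: embed a distance matrix in `dims` dimensions."""
    n = dist.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centring @ (dist ** 2) @ centring
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dims]
    # Clip small negative eigenvalues, which arise for non-Euclidean distances.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


rng = np.random.default_rng(0)
variables = rng.normal(size=(50, 41))      # placeholder: 50 areas, 41 variables
distances = canberra_matrix(zscore(variables))
coords_1d = classical_mds(distances, 1)    # basis for a greyscale map
coords_3d = classical_mds(distances, 3)    # basis for an RGB map
```

<p>The pairwise step builds an n x n x 41 intermediate array, which is fine for a few hundred areas but is exactly the part that stops scaling beyond local studies.</p>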
<p>The results I got from this preliminary exploration were as follows:</p>
<p style="text-align: left">
<div id="attachment_289" class="wp-caption aligncenter" style="width: 516px"><a href="http://danieljlewis.org/files/2010/04/OACswkBW.jpg"><img class="size-large wp-image-289 " title="OACswkBW" src="http://danieljlewis.org/files/2010/04/OACswkBW-723x1024.jpg" alt="" width="506" height="717" /></a><p class="wp-caption-text">MDS Scaling of OAC variables into Greyscale Representation.</p></div>
<p>This is a very interesting way of looking at the OAC data: the comfortable uniformity of the seven classes has been lost, and instead we can see trends and similarities, but also a fair amount of discontinuity and noise. The black and white representation presents a spectrum in which very dark and very light shades are the most dissimilar, with intermediate greys lying between them. The resultant mapping clearly displays areas of similarity: the more affluent southern tip of Southwark, the Southbank region in the north of the borough, and the former docklands in the north-west. In counterpoint to these areas is the middle band of Southwark, represented by darker hues and roughly aligned with known areas of deprivation characterised by high levels of social housing, higher levels of non-white residents, lower levels of educational attainment, poorer health etc. What is clear, though, is that the picture is not uniform as suggested by OAC, and that there exist notable pockets of difference, possibly interpretable as gentrification, particularly around parks. There is also evidence of some fairly notable discontinuities in demographic structure which aren&#8217;t immediately obvious in the OAC classification.</p>
<p>I also mapped an MDS output for 3 dimensions onto an RGB colour scale, as below:</p>
<div id="attachment_292" class="wp-caption aligncenter" style="width: 540px"><a href="http://danieljlewis.org/files/2010/04/OACSwkRGB.jpg"><img class="size-full wp-image-292 " title="OACSwkRGB" src="http://danieljlewis.org/files/2010/04/OACSwkRGB.jpg" alt="OAC" width="530" height="750" /></a><p class="wp-caption-text">MDS Scaling of OAC variables into Colour Representation.</p></div>
<p style="text-align: left">The colour representation should give a more nuanced reading of the similarities and differences, although it is immediately more challenging to interpret. One interesting feature is how the southern area of Southwark, mostly characterised by a blue/purple colour, has now been distanced from the Southbank and former docklands areas, suggesting they are more distinguishable than the greyscale map implied. The previously dark area is now a pinkish hue, again suggesting a uniformity in that area; however, it is flecked with a variety of colours, suggesting that the areas which deviate from the surrounding high deprivation are not similar to each other, but are distinct enclaves each with their own specific character.</p>
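<p>For anyone wanting to reproduce the colouring outside ArcGIS, the scaling of MDS output to 0&#8211;255 digital numbers can be sketched as below. This is an assumption-laden illustration rather than James Cheshire&#8217;s script: the function name and the four-row embedding are invented, and each output dimension is simply min-max scaled on its own before being read as a red, green or blue channel.</p>

```python
import numpy as np


def to_digital_numbers(coords):
    """Min-max scale each column of `coords` to integers in the range 0-255."""
    coords = np.asarray(coords, dtype=float)
    low = coords.min(axis=0)
    span = coords.max(axis=0) - low
    span[span == 0] = 1.0  # avoid dividing by zero for a constant column
    return np.rint((coords - low) / span * 255).astype(int)


# Hypothetical 3-dimensional MDS output for four areas.
embedding = np.array([
    [-1.0, 0.5, 2.0],
    [0.0, 0.0, 0.0],
    [1.0, -0.5, 1.0],
    [0.5, 1.5, -2.0],
])
rgb = to_digital_numbers(embedding)
# One hex colour string per area, ready to join back onto the boundary data.
hex_colours = ["#{:02x}{:02x}{:02x}".format(*row) for row in rgb.tolist()]
```

<p>Scaling each dimension independently uses the full colour range, but it also means a dimension carrying little variation looks as prominent as one carrying a lot, which is worth bearing in mind when reading the map.</p>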
<p style="text-align: left">This constituted a preliminary study; time permitting, I will continue to investigate interesting methods such as this. It is, however, a computationally intensive process, and a treatment of, for example, the whole UK in this manner is out of the question. Nevertheless, I may update it at a different scale in the future.</p>
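<p>To put a rough number on that, the storage needed for a dense n x n distance matrix can be worked out as follows. The two area counts are round illustrative figures, not the actual number of OAs in Southwark or the UK, and the calculation assumes 8-byte floating point values.</p>

```python
def distance_matrix_gigabytes(n_areas, bytes_per_value=8):
    """Memory needed for a dense n x n matrix of floating-point distances."""
    return n_areas ** 2 * bytes_per_value / 1e9


# Both counts are illustrative assumptions, not figures from this study.
for label, n_areas in [("borough-sized", 1_000), ("national-sized", 200_000)]:
    print(f"{label}: {n_areas} areas, {distance_matrix_gigabytes(n_areas):.3f} GB")
```

<p>The quadratic growth is the point: multiplying the number of areas by 200 multiplies the matrix size by 40,000, before any eigendecomposition is attempted.</p>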
<p style="text-align: left">Acknowledgment: Boundaries Crown Copyright 2010 Ordnance Survey. A UKBorders/JISC Supplied Service. Data from CASWeb.</p>
]]></content:encoded>
			<wfw:commentRss>http://danieljlewis.org/2010/04/28/multi-dimensional-scaling-of-southwark-oac-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data Uncertainty: Southwark&#8217;s Disappearing Estates</title>
		<link>http://danieljlewis.org/2010/02/09/data-uncertainty-southwarks-disappearing-estates/</link>
		<comments>http://danieljlewis.org/2010/02/09/data-uncertainty-southwarks-disappearing-estates/#comments</comments>
		<pubDate>Tue, 09 Feb 2010 20:31:23 +0000</pubDate>
		<dc:creator>Daniel Lewis</dc:creator>
				<category><![CDATA[PhD Work]]></category>
		<category><![CDATA[Southwark]]></category>
		<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[estates]]></category>
		<category><![CDATA[geocoding]]></category>
		<category><![CDATA[hidden]]></category>
		<category><![CDATA[regeneration]]></category>
		<category><![CDATA[uncertainty]]></category>

		<guid isPermaLink="false">http://danieljlewis.org/?p=170</guid>
		<description><![CDATA[I&#8217;ve spent some time recently working towards a situation in which the whole dataset for patients registered to General Practices in the London Borough of Southwark is coded to address level. Previously I had been working with the data at postcode level and I wanted to start investigating the effects of households on uptake of [...]]]></description>
			<content:encoded><![CDATA[
<p>I&#8217;ve spent some time recently working towards a situation in which the whole dataset for patients registered to General Practices in the London Borough of Southwark is coded to address level. Previously I had been working with the data at postcode level, and I wanted to start investigating the effects of households on uptake of service, as well as profiling patients at a finer granularity and integrating geographically more sensitive analyses. The geocoding project obeyed the general rules set out for this kind of work: it was reasonably easy, in the end, to address match 92% of the data by scripting, somewhat frustrating to push that total up to 99% (through semi-automated methods of address matching) and all but impossible to match the last 0.5% of patients.</p>
<p>This last group is roughly equivalent to 1500 people who have given addresses which I cannot, even manually, match. This tends to be because, perhaps unwittingly, a postcode has been given that doesn&#8217;t exist, because there is too much uncertainty, meaning the address could relate to several possible places, or because the house or the road simply does not exist. In some cases it was easy to clean up the data; for instance, it became clear that a number of the addresses actually related to boats moored in <a title="South Dock Marina - Google Maps" href="http://maps.google.co.uk/maps?q=SE16+7SZ&amp;oe=utf-8&amp;rls=org.mozilla:en-GB:official&amp;client=firefox-a&amp;um=1&amp;ie=UTF-8&amp;hq=&amp;hnear=London+SE16+7SZ&amp;gl=uk&amp;ei=ynlxS7qKGYnu0gTz3pWpCw&amp;sa=X&amp;oi=geocode_result&amp;ct=image&amp;resnum=1&amp;ved=0CAsQ8gEwAA" target="_blank">South Dock Marina</a>, London&#8217;s largest marina. Obviously people who live on boats still need health care but do not have an address as such, so in this case I registered the boats to the Dock Office. Similar issues occurred with students registered as living in one of Southwark&#8217;s numerous student residences; the students&#8217; transient nature meant that there were numerous different ways of recording their residences. In a similar vein, it was interesting to deal with the fairly substantial group of people who were registered either as NFA (no fixed abode) with the GP surgery&#8217;s postcode, or to one of several shelters or missions such as the Salvation Army or St. Mungos. This aspect of the data gives an insight that is otherwise quite hard to get at: naturally homeless people require health care from time to time, and in order to receive it they need to go into the system in some way, but the fixed address structure of registration means these people occur as somewhat anomalous results within the database. This has the potential to give an insight into the homeless situation in Southwark.
Finally, there seemed to be some trouble matching patients who were registered as living in care homes. Again these were easy to address match; it was simply that the address information itself had been misreported, or simply read the name of the particular care home in question.</p>
<p>Having gone through the unmatched patients and weeded out cases such as those above, which were valid patients who didn&#8217;t neatly fit into a database with an address-based structure, I was left with what appeared to be whole sets of estates that were completely unmatched. I ran a series of wildcard searches on the AddressLayer2 database I have set up in order to try and find these estates, but kept returning empty sets of results. One of the estates that I couldn&#8217;t match was the &#8220;Sumner Estate&#8221;. This rang a bell, as I used to live in Peckham and cycled through this estate every day on the way to LSE, and I vaguely remembered reading about its scheduled demolition in The Economist in about 2006-2007. I did a quick Google search and found that it was in fact part of the Aylesbury Regeneration scheme, a £2.5bn regeneration by Southwark Council that aimed to clear and rebuild some of Southwark&#8217;s worst and most notorious social housing estates. This estate was bad from the beginning and in fact lasted fewer than 50 years, with the most recent 20 being acknowledged as in a state of critical decay.</p>
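<p>The general shape of that staged matching (an exact match on cleaned-up text first, then a fuzzy pass whose suggestions are checked by hand) can be sketched with the Python standard library. This is not my actual matching script: the addresses and postcodes are invented, the function names are hypothetical, and the 0.85 similarity cutoff is an arbitrary choice for the example.</p>

```python
import difflib
import re


def normalise(address):
    """Upper-case, strip punctuation and collapse whitespace."""
    cleaned = re.sub(r"[^A-Za-z0-9 ]", " ", address).upper()
    return " ".join(cleaned.split())


def match_address(raw, gazetteer, cutoff=0.85):
    """Return (matched address or None, how it was matched)."""
    lookup = {normalise(entry): entry for entry in gazetteer}
    key = normalise(raw)
    if key in lookup:
        return lookup[key], "exact"
    # Fall back to the single closest candidate above the similarity cutoff.
    close = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    if close:
        best = close[0]
        return lookup[best], "fuzzy: check by hand"
    return None, "unmatched"


# Invented reference addresses standing in for a real address gazetteer.
gazetteer = ["12 Example Road, SE1 1AA", "Flat 3, Sample House, SE15 2BB"]
print(match_address("12 example road se1 1aa", gazetteer))
print(match_address("Flat 3 Sampel House SE15 2BB", gazetteer))
print(match_address("Unknown Estate", gazetteer))
```

<p>The third call is the interesting case: nothing in the reference list is close enough, so the record falls through to manual inspection.</p>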
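<p>For illustration, a wildcard search of this kind looks something like the following. The sketch uses an in-memory SQLite table with a made-up two-column schema and invented rows; my AddressLayer2 database is set up differently, so treat the table and column names as placeholders.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a real address product carries many more fields.
conn.execute("CREATE TABLE addresses (building TEXT, street TEXT)")
conn.executemany(
    "INSERT INTO addresses VALUES (?, ?)",
    [("Sample House", "Example Road"), ("Placeholder Court", "Example Road")],
)


def wildcard_search(connection, term):
    """Return rows whose building or street contains `term`."""
    pattern = f"%{term}%"
    query = (
        "SELECT building, street FROM addresses "
        "WHERE building LIKE ? OR street LIKE ?"
    )
    return connection.execute(query, (pattern, pattern)).fetchall()


print(wildcard_search(conn, "sample"))   # one row: the building is in the table
print(wildcard_search(conn, "sumner"))   # empty list: nothing to match against
```

<p>An empty result for every spelling variant of a name is a hint that the place is missing from the reference data altogether, rather than merely misrecorded in the patient register.</p>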
<p style="text-align: center">
<div id="attachment_175" class="wp-caption aligncenter" style="width: 490px"><a href="http://danieljlewis.org/files/2010/02/Aylesbury.jpg"><img class="size-full wp-image-175 " title="Aylesbury" src="http://danieljlewis.org/files/2010/02/Aylesbury.jpg" alt="" width="480" height="360" /></a><p class="wp-caption-text">Aylesbury Estate. Source: http://www.flickr.com/photos/se9</p></div>
<p>I conducted a number of further searches on Google for the following places: Wood Dene; Alison House; Marchant House; Yeoman House; Saul House; Sharpness House; Rainswick Court; Lambourne House; Silwood Estate; Kingshill; Dobson House; Dufrey House; Ayton House; Habington House; Hordle Promenade South and North; North Peckham Estate. I found that all of these houses or estates had been demolished at some point in the mid 2000s. This accounted for around 600-700 patients in my dataset. The larger issue here is data uncertainty: if there exist people in the dataset who don&#8217;t actually exist in reality then we have an issue. Having said that, the 600 people that I uncovered as having a defunct registered address only account for 0.17% of the dataset, so maybe it&#8217;s not too bad. What I actually wanted to focus on here is the hidden nature of these regenerated places.</p>
<p>In conducting internet-based searches for information on the various housing estates listed above I found a very dark picture. To start with, information is very scarce: there is little record on the Southwark Council website regarding regeneration and which blocks were torn down. Some information came from copies of local papers and bulletins. Sadly, a great deal was also associated with news media that reported the regeneration of an estate as an aside to far graver news, most notably the murder of Damilola Taylor on the North Peckham Estate. Indeed several estates were conspicuous by their absence from any online resource or comment other than court documents acknowledging that a defendant hailed from such an estate. In redeveloping large estates, whole roads were removed: the aforementioned Hordle Promenade North and South, as well as Clanfield Way and Walkford Way. The legacy these roads leave, however, is quite interesting: <a title="Hordle Promenade North - Google Maps" href="http://maps.google.co.uk/maps?hl=en&amp;source=hp&amp;q=Hordle+Promenade+N,+Camberwell,+Greater+London+SE15+6,+UK&amp;um=1&amp;ie=UTF-8&amp;hq=&amp;hnear=Hordle+Promenade+N,+Camberwell,+Greater+London+SE15+6&amp;gl=uk&amp;ei=H8BxS6foIaX20wS6yKmmCw&amp;sa=X&amp;oi=geocode_result&amp;ct=title&amp;resnum=1&amp;ved=0CAgQ8gEwAA" target="_blank">Hordle Promenade North</a> is a Google Maps POI despite no longer existing.
Similarly the postcode for Clanfield Way &#8211; <a title="Clanfield Way - Google Maps" href="http://maps.google.co.uk/maps?hl=en&amp;gl=uk&amp;q=London+SE15+6EW,+UK&amp;oq=&amp;um=1&amp;ie=UTF-8&amp;hq=&amp;hnear=London+SE15+6EW&amp;gl=uk&amp;ei=QsFxS7bnMpLu0gS52O2xCw&amp;sa=X&amp;oi=geocode_result&amp;ct=image&amp;resnum=1&amp;ved=0CAsQ8gEwAA" target="_blank">SE15 6EW</a> &#8211; remains a POI, now allocated to a different stretch of road, as does <a title="Walkford Way - Google Maps" href="http://maps.google.co.uk/maps?hl=en&amp;q=SE15+6EY&amp;oq=&amp;um=1&amp;gl=uk&amp;resnum=1&amp;ie=UTF-8&amp;hq=&amp;hnear=London+SE15+6EY&amp;gl=uk&amp;ei=18NxS_SvBJ380wTE65SjCw&amp;sa=X&amp;oi=geocode_result&amp;ct=image&amp;resnum=1&amp;ved=0CAsQ8gEwAA" target="_blank">SE15 6EY</a>, the former postcode for the now demolished Walkford Way. Occasionally planning documents deal with house and estate clearing in a very matter of fact way. There is almost no voice online for any of the inhabitants of these places.</p>
<p>It is easy to view the city as a static entity: it changes so slowly compared to the pace of life, and yet when changes do occur they are easily assimilated into our internal map, as if a change never occurred. However, these estates still linger, as hidden reminders of the palimpsestic nature of the city &#8211; slums torn down and regenerated, deprivation papered over, tragic events of the past lapsing into memory, slowly forgotten as the city turns over and adjusts its morphology.</p>
]]></content:encoded>
			<wfw:commentRss>http://danieljlewis.org/2010/02/09/data-uncertainty-southwarks-disappearing-estates/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

