<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Volunteered Geographic Information &#187; Representation</title>
	<atom:link href="http://danieljlewis.org/category/representation/feed/" rel="self" type="application/rss+xml" />
	<link>http://danieljlewis.org</link>
	<description>A Geography/GIS blog by Daniel J Lewis</description>
	<lastBuildDate>Tue, 20 Dec 2011 17:15:30 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4-alpha-20124</generator>
		<item>
		<title>Generalising OS MasterMap Buildings</title>
		<link>http://danieljlewis.org/2011/12/20/generalising-os-mastermap-buildings/</link>
		<comments>http://danieljlewis.org/2011/12/20/generalising-os-mastermap-buildings/#comments</comments>
		<pubDate>Tue, 20 Dec 2011 17:15:30 +0000</pubDate>
		<dc:creator>Daniel Lewis</dc:creator>
				<category><![CDATA[GIS]]></category>
		<category><![CDATA[Representation]]></category>
		<category><![CDATA[buildings]]></category>
		<category><![CDATA[generalisation]]></category>
		<category><![CDATA[geometry]]></category>
		<category><![CDATA[mastermap]]></category>
		<category><![CDATA[scale]]></category>

		<guid isPermaLink="false">http://danieljlewis.org.blogs.splintdev.geog.ucl.ac.uk/?p=547</guid>
		<description><![CDATA[The purpose of map generalisation is to represent spatial data in a way that makes it possible to effectively view the data at scales smaller than that for which it was originally intended. In the case of the Ordnance Survey&#8217;s MasterMap product you have data at an incredibly fine level of spatial resolution, which is [...]]]></description>
			<content:encoded><![CDATA[
<p>The purpose of map generalisation is to represent spatial data so that it can be viewed effectively at scales smaller than that for which it was originally intended. The Ordnance Survey&#8217;s MasterMap product provides data at an incredibly fine level of spatial resolution, ideally viewed at a scale of approximately 1:1000, give or take 500. When you are reliant on MasterMap but need to create a map of a wider area, you are faced with the challenge of generalising the data so that it can be readily understood. This means reducing the complexity of components of the spatial data, for instance smoothing wiggly lines or transforming complicated polygons into simpler ones, as well as aggregating an abundance of small features into larger ones. Such interventions are necessary because it is increasingly difficult to resolve fine detail as map scale decreases, so complex shapes appear messy and disordered when visualised at scales smaller than that for which they were intended. The scale of a map gives us an insight into the types of objects that it is possible to resolve: at a scale of 1:1000 a map distance of 100cm represents 1km, whereas at 1:10000 and 1:100000 the same 1km is covered by 10cm and 1cm respectively. If we conservatively suggest that we can resolve features that are 5mm across on the map, then at 1:1000, 1:10000 and 1:100000 the smallest real-world objects that can be represented are 5m, 50m and 500m across respectively. These distances equate to real-world objects such as large cars and trucks (c. 5m in length) and Olympic-sized swimming pools and office buildings (50m), whilst a distance of 500m is roughly twice the span of Tower Bridge. Evidently, there are significant differences in what constitutes appropriate detail at each of these scales.</p>
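<p>As a rough check on the scale arithmetic, the sketch below converts a distance measured on the map into the ground distance it covers. The function name ground_distance_m is a hypothetical helper introduced for illustration, and the 5mm value is simply the legibility threshold suggested above.</p>

```python
def ground_distance_m(map_distance_mm, scale_denominator):
    """Ground distance in metres covered by a map distance in millimetres."""
    # 1 mm on the map covers scale_denominator mm on the ground; 1000 mm = 1 m
    return map_distance_mm * scale_denominator / 1000.0


# smallest resolvable feature at each scale, assuming a 5 mm threshold
for denominator in (1000, 10000, 100000):
    smallest = ground_distance_m(5, denominator)
    print("1:%d -> %.0f m" % (denominator, smallest))
```

<p>The same function confirms that 100cm (1000mm) on a 1:1000 map covers 1km on the ground.</p>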
<p>I&#8217;ve been dealing with one such problem recently, involving the representation of MasterMap building outlines at a scale of 1:10000, somewhat smaller than the 1:1000ish that the data was intended for. In order to create an effective map I needed to generalise the building outlines. Unfortunately, I don&#8217;t have access to ESRI&#8217;s ArcGIS &#8220;simplify building&#8221; tool due to licensing restrictions, so I had to come up with another solution. Initially I attempted the classic line generalisation procedure, the Douglas-Peucker algorithm, which simplifies a curve by reducing the number of points in it subject to some pre-specified threshold value. However, buildings are strong geometric shapes, often rectangular and orthogonal, so an algorithm such as Douglas-Peucker can disrupt the geometric regularity of building outlines, removing corners and so on. What is required is a polygon simplification algorithm that preserves orthogonality. I couldn&#8217;t find anything accessible that did this, so instead I had to come up with a procedure that approximates a generalisation of the building polygons by other means. The image below reveals the result, which I think is successful enough to use.</p>
<p style="text-align: center"><a href="http://danieljlewis.org/files/2011/12/RawGenBuildings.png"><img class="aligncenter size-large wp-image-548" src="http://danieljlewis.org/files/2011/12/RawGenBuildings-1024x724.png" alt="" width="502" height="355" /></a></p>
<p style="text-align: left">In the image above, A is the raw data, and B is the generalised data. I experimented with a few approaches, but the one I assessed as being the best was to position an enclosing rectangle around each building polygon, so that the area of the enclosing rectangle was minimised, and subsequently buffer the result to close any small gaps, choosing to dissolve as well in order to further reduce the complexity. Subsequently I removed the particularly small buildings. The generalisation is more in evidence in the image below, in which A and B are the same as before.</p>
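<p>The enclosing-rectangle step can be sketched in plain Python. This is not the GIS tool used for the maps here, only a minimal illustration under stated assumptions: the function names convex_hull and min_area_rectangle are hypothetical, the building is given as a list of (x, y) vertices, and the buffering, dissolving and small-building removal steps are left out. It relies on the fact that a minimum-area enclosing rectangle has one side lying along an edge of the convex hull, so it is enough to try each hull edge in turn.</p>

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if 2 >= len(pts):
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and 0 >= cross(lower[-2], lower[-1], p):
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and 0 >= cross(upper[-2], upper[-1], p):
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_area_rectangle(points):
    """Return (area, corners) of the smallest rectangle enclosing the points."""
    hull = convex_hull(points)
    best = None
    for i in range(len(hull)):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % len(hull)]
        theta = math.atan2(y2 - y1, x2 - x1)
        c, s = math.cos(theta), math.sin(theta)
        # rotate the hull by -theta so that this edge lies along the x axis
        rot = [(x * c + y * s, -x * s + y * c) for x, y in hull]
        xs = [p[0] for p in rot]
        ys = [p[1] for p in rot]
        area = (max(xs) - min(xs)) * (max(ys) - min(ys))
        if best is None or best[0] > area:
            box = [(min(xs), min(ys)), (max(xs), min(ys)),
                   (max(xs), max(ys)), (min(xs), max(ys))]
            # rotate the box corners back into the original coordinate frame
            corners = [(x * c - y * s, x * s + y * c) for x, y in box]
            best = (area, corners)
    return best


# hypothetical L-shaped building outline, coordinates in metres
building = [(0, 0), (10, 0), (10, 4), (4, 4), (4, 8), (0, 8)]
area, corners = min_area_rectangle(building)
print(area)
print(corners)
```

<p>For this L-shaped outline the rectangle is the 10m by 8m box around it, which is exactly the kind of simplification that keeps corners square at the cost of some extra area.</p>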
<p style="text-align: center"><a href="http://danieljlewis.org/files/2011/12/RawGenBuildings1500.png"><img class="aligncenter size-large wp-image-551" src="http://danieljlewis.org/files/2011/12/RawGenBuildings1500-1024x724.png" alt="" width="502" height="355" /></a></p>
<p style="text-align: left">I am reasonably pleased with the result, which was achieved after a little trial and error. Whilst technical approaches to orthogonal simplification exist, I can&#8217;t imagine them being much more effective at this scale, although perhaps at smaller scales they would be more appropriate, as they can create meaningful aggregations of buildings based upon characteristics such as nearest-neighbour distance.</p>
]]></content:encoded>
			<wfw:commentRss>http://danieljlewis.org/2011/12/20/generalising-os-mastermap-buildings/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Weighted Mean Direction Surfaces in Python</title>
		<link>http://danieljlewis.org/2011/08/31/weighted-mean-direction-surfaces-in-python/</link>
		<comments>http://danieljlewis.org/2011/08/31/weighted-mean-direction-surfaces-in-python/#comments</comments>
		<pubDate>Wed, 31 Aug 2011 13:18:18 +0000</pubDate>
		<dc:creator>Daniel Lewis</dc:creator>
				<category><![CDATA[GIS]]></category>
		<category><![CDATA[Modeling]]></category>
		<category><![CDATA[Representation]]></category>
		<category><![CDATA[Southwark]]></category>
		<category><![CDATA[Brunsdon]]></category>
		<category><![CDATA[Charlton]]></category>
		<category><![CDATA[circular statistics]]></category>
		<category><![CDATA[mean direction]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[weighting]]></category>

		<guid isPermaLink="false">http://danieljlewis.org.blogs.splintdev.geog.ucl.ac.uk/?p=537</guid>
		<description><![CDATA[I work a lot with flows and spatial interactions, one thing that I&#8217;ve wanted to do for a while is compute a mean flow direction surface. Unfortunately, arithmetic means don&#8217;t work for angular data, this is because it cannot account for the circular nature of the distribution of angular measurements. For instance the angles 5 [...]]]></description>
			<content:encoded><![CDATA[
<p>I work a lot with flows and spatial interactions, and one thing that I&#8217;ve wanted to do for a while is compute a mean flow direction surface. Unfortunately, arithmetic means don&#8217;t work for angular data, because they cannot account for the circular nature of the distribution of angular measurements. For instance, the angles 5 degrees and 355 degrees are separated by only 10 degrees, but their arithmetic mean is 180 degrees. That is way off; it should be 0 degrees!</p>
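<p>A quick numeric check of the 5 and 355 degree example, using only Python&#8217;s standard library. The circular mean here is computed by treating each angle as a unit vector, summing the vectors and taking the direction of the result; the variable names are illustrative.</p>

```python
import cmath
import math

angles_deg = [5.0, 355.0]

# the arithmetic mean ignores the wrap-around at 0/360 degrees
arithmetic = sum(angles_deg) / len(angles_deg)

# circular mean: sum the unit vectors exp(i*theta), then take the argument
resultant = sum(cmath.exp(1j * math.radians(a)) for a in angles_deg)
circular = math.degrees(cmath.phase(resultant))

print(arithmetic)          # 180.0
print(round(circular, 6))  # zero, up to floating point sign
```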
<p>Luckily, <a title="Local trend Statistics for Direction Data" href="http://leicester.academia.edu/ChrisBrunsdon/Papers/534394/Local_trend_statistics_for_directional_data--A_moving_window_approach">Brunsdon and Charlton</a> have published on this very subject, so I took it upon myself to implement a weighted circular mean function in Python. The key obstacle was learning about complex numbers, about which, up until this point, I had no idea at all!</p>
<p>The first thing to do is calculate the angle between each candidate point (such as a person) and its allocated service (such as a Medical Centre). This is simple enough to do using Python&#8217;s math module, and would look something like:</p>
<pre>import math</pre>
<pre>math.atan2((y2-y1),(x2-x1))</pre>
<p>Here the pair (x1,y1) is the location of the candidate point, and (x2,y2) the location of the allocated service for that candidate point. The line linking these two points defines a flow from a candidate point to a service, and vice versa.</p>
<p>Having calculated all of the angles, I used ArcGIS to create an output grid, at the extent of the candidate points, using the &#8220;fishnet&#8221; function which creates a vector grid of prespecified dimensions.</p>
<p>The beauty of Brunsdon and Charlton&#8217;s method is that it uses a local method of approximation. This means that for each cell in the output grid, a mean direction can be calculated based upon the values of nearby points, and applying a weighting allows more distant points to have less of an effect on the mean direction.</p>
<p>Firstly, I read all the candidate points into a KDTree structure, which allows me to search for local points. At the same time I also create an array of the angles for those candidate points.</p>
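<pre>from scipy.spatial import cKDTree
import numpy as np

# treepoints: numpy array of candidate point coordinate pairs
tree = cKDTree(treepoints)
# testpoint: one grid cell location, passed as a 2D array so results are indexed with [0]
res, idx = tree.query(testpoint, k=300000, eps=0, p=2, distance_upper_bound=100)
# neighbours beyond the 100 m bound come back with distance inf; drop them
res = res[0][np.where(res[0] &lt; np.inf)[0]]
idx = idx[0][:len(res)]</pre>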
<p>The tree takes a numpy array of coordinate pairs, and the query method returns an array of distances to points (res) and their index values in the original array of coordinates (idx). The testpoint is a cell in the vector grid; 300000 is k, the number of nearest neighbours to find, which I have simply set arbitrarily high in the context of my dataset; 0 is the tolerance for approximate nearest neighbours, so here I&#8217;ve specified an exact search; 2 indicates the use of Euclidean distance; and 100 is the distance threshold, so neighbours won&#8217;t be returned if they are further than 100 metres away. Points beyond the threshold are returned with the distance value Inf, so the penultimate line simply shortens the array to those distances that are finite, i.e. within 100m.</p>
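<p>When only the distance cut-off matters, cKDTree also offers query_ball_point, which avoids having to pick an arbitrarily large k. The sketch below is an alternative, not the approach used in my script, and the coordinates are made up for illustration. Note that query_ball_point returns indices only and in no particular distance order, so the distances are computed separately.</p>

```python
import numpy as np
from scipy.spatial import cKDTree

# hypothetical candidate points (metres) and a single grid cell location
treepoints = np.array([[0.0, 0.0], [30.0, 40.0], [60.0, 70.0], [300.0, 400.0]])
testpoint = np.array([0.0, 0.0])

tree = cKDTree(treepoints)

# indices of every point within 100 m of the cell (Euclidean distance, p=2)
idx = np.array(sorted(tree.query_ball_point(testpoint, r=100.0)))

# query_ball_point does not return distances, so compute them directly
res = np.linalg.norm(treepoints[idx] - testpoint, axis=1)

print(idx)
print(res)
```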
<p>The next step is to actually compute the mean direction, which requires a special approach using complex numbers. Brunsdon and Charlton show that a direction can be stated as a complex number <em>z</em> in which <em>z = exp(iθ)</em>, which is equivalent to <em>z = cos(θ) + i sin(θ)</em>, where <em>i</em> is the imaginary unit. We can restate our directions in Python using:</p>
<pre># angles for the neighbours found by the cKDTree query (radians)
thetas = angles[idx]
cThetas = []
# range replaces the Python 2 xrange
for i in range(0,len(thetas)):
    cThetas.append(complex(np.cos(thetas[i]),np.sin(thetas[i])))
cThetas = np.array(cThetas)</pre>
<p>Here, the complex function allows the complex number representing an angle to be stored in a list, which I convert (lazily) to a numpy array. The first term, thetas, is using the idx array from the cKDTree to cleverly index the relevant angle records from the angles array which stores all the angle values in the order of entries for the cKDTree.</p>
<p>Next a temporary variable is created which calculates the mean direction:</p>
<pre>temp = np.sum(cThetas)/np.absolute(np.sum(cThetas))
MeanDir = np.angle(temp, deg = True)</pre>
<p>The mean direction is given by the argument (Arg) of the resultant complex number. NumPy implements this with the np.angle function, where deg = True returns the angle in degrees, and False in radians.</p>
<p>So far this is the unweighted mean, aggregating directional observations within a 100m disk (see also: uniform disk smoothing). To introduce weighting we must first define a weighting scheme. I&#8217;ve used the one suggested by Brunsdon and Charlton, which is Gaussian, and might look a bit like this:</p>
<pre>def gaussW(dists,band):
    # Gaussian weight for each distance, given a bandwidth in the same units
    out = np.zeros(dists.shape)
    for i in range(0,len(out)):
        temp = np.power(dists[i],2)/(2.0*np.power(float(band),2))
        out[i] = np.exp(-1.0 * temp)
    return out

weight = gaussW(res,100)</pre>
<p>Quite simply, I pass the distance array res to the gaussW function and it gives me back an array of weights for that ordering of distances. Using this I can redo the mean direction thus:</p>
<pre>temp = np.sum(weight*cThetas)/np.absolute(np.sum(weight*cThetas))
MeanDir = np.angle(temp, deg = True)</pre>
<p>There you have it! Attached is the script I used. Obviously, Brunsdon and Charlton implement a variance and a couple of visualisation devices, but these should be simple enough to implement now!</p>
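<p>For reference, the weighting and mean direction steps can also be written without explicit loops. The sketch below is a vectorised restatement under assumptions, not the attached script: the function names gauss_weights and weighted_mean_direction are hypothetical, the sample flows are made up, and the variance shown is the common &#8220;one minus mean resultant length&#8221; definition of circular variance, which may differ in detail from the one Brunsdon and Charlton use.</p>

```python
import numpy as np


def gauss_weights(dists, band):
    """Gaussian kernel weights; vectorised equivalent of the gaussW loop."""
    dists = np.asarray(dists, dtype=float)
    return np.exp(-(dists ** 2) / (2.0 * float(band) ** 2))


def weighted_mean_direction(thetas, weights):
    """Return (mean direction in degrees, circular variance in [0, 1])."""
    z = np.exp(1j * np.asarray(thetas, dtype=float))  # unit vectors exp(i*theta)
    weights = np.asarray(weights, dtype=float)
    resultant = np.sum(weights * z)
    mean_dir = np.angle(resultant, deg=True)
    # mean resultant length R lies in [0, 1]; 1 - R is taken as the variance
    variance = 1.0 - np.absolute(resultant) / np.sum(weights)
    return mean_dir, variance


# hypothetical flows: angles in radians, distances from the grid cell in metres
thetas = np.radians([80.0, 90.0, 100.0])
dists = np.array([10.0, 50.0, 90.0])
mean_dir, variance = weighted_mean_direction(thetas, gauss_weights(dists, 100))
print(mean_dir, variance)
```

<p>Because the nearest flow points at 80 degrees, the weighted mean is pulled slightly below 90 degrees, and the variance is close to 0 because the three directions agree closely.</p>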
<p>I created an output for flows of patients to GPs in Southwark, visualised using one of ESRI&#8217;s circular/direction colour ramps from <a title="Mapping Resources" href="http://mappingcenter.esri.com/index.cfm?fa=arcgisResources.gateway">colour ramp pack 2</a>. Not sure how best to visualise the legend at this point though. NB. 90 is north, -90 is South, 0/-0 is East and 180/-180 is West. The map is visualised to show the 4 cardinal directions, but the output is in fact continuous.</p>
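<p>Since the output uses the mathematical convention (0 is East, angles increase counter-clockwise), a small conversion may help when a compass bearing is wanted for the legend. The helper name math_angle_to_bearing is hypothetical and the sample coordinates are illustrative.</p>

```python
import math


def math_angle_to_bearing(angle_deg):
    """Convert an angle measured counter-clockwise from east into a compass
    bearing measured clockwise from north, in the range [0, 360)."""
    return (90.0 - angle_deg) % 360.0


# a flow from (0, 0) to (0, 10) points due north
angle = math.degrees(math.atan2(10 - 0, 0 - 0))
print(angle, math_angle_to_bearing(angle))

# the four cardinal directions as stored in the output surface
for a in (90.0, 0.0, -90.0, 180.0):
    print(a, math_angle_to_bearing(a))
```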
<p style="text-align: left"><a href="http://danieljlewis.org/files/2011/08/MeanDirectionFlows.png"><img class="aligncenter size-large wp-image-538" src="http://danieljlewis.org/files/2011/08/MeanDirectionFlows-724x1024.png" alt="" width="434" height="614" /></a>My example script is <a href="http://danieljlewis.org/files/2011/08/meanDirection.txt">here. </a> Note that I am using dbfpy to read and write to shapefile DBF tables directly.</p>
]]></content:encoded>
			<wfw:commentRss>http://danieljlewis.org/2011/08/31/weighted-mean-direction-surfaces-in-python/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>A Spatial Approach to Location Quotients</title>
		<link>http://danieljlewis.org/2011/06/17/a-spatial-approach-to-location-quotients-2/</link>
		<comments>http://danieljlewis.org/2011/06/17/a-spatial-approach-to-location-quotients-2/#comments</comments>
		<pubDate>Fri, 17 Jun 2011 14:46:21 +0000</pubDate>
		<dc:creator>Daniel Lewis</dc:creator>
				<category><![CDATA[Geography]]></category>
		<category><![CDATA[GIS]]></category>
		<category><![CDATA[Representation]]></category>
		<category><![CDATA[Southwark]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[density]]></category>
		<category><![CDATA[KDE]]></category>
		<category><![CDATA[Location Quotient]]></category>

		<guid isPermaLink="false">http://danieljlewis.org.blogs.splintdev.geog.ucl.ac.uk/?p=529</guid>
		<description><![CDATA[The intent of this post is not simply to uncover where the highest density of people belonging to a particular ethnic group are, but rather to use the ‘location quotient’ (LQ) technique to compare the ethnic density in any one area to the overall ethnic density in Southwark, thus providing a relative insight into where [...]]]></description>
			<content:encoded><![CDATA[
<p>The intent of this post is not simply to uncover where the highest density of people belonging to a particular ethnic group is, but rather to use the ‘location quotient’ (LQ) technique to compare the ethnic density in any one area to the overall ethnic density in Southwark, thus providing a relative insight into where the density of particular groups is higher than, lower than or the same as expected.</p>
<p>Location Quotients tend to work with areal units, characterising different areas subject to a larger region and providing a basic insight into where functions are clustered. Because the Southwark patient register data is address geocoded, we would be losing some spatial information if we choose to aggregate the data, not to mention the question of which areal aggregation is best. More info on how to create location quotients <a title="Wikipedia with LQs" href="http://en.wikipedia.org/wiki/Economic_base_analysis">here</a>.</p>
<p>A Location Quotient has 3 possible interpretations. If it is around 1 then the ethnic population in that area is at the level we would expect given what we observe nationally. If the LQ is less than 1 then that area has a smaller population of a particular ethnic group than we would expect based upon national figures. Finally, if the LQ value is over 1 this suggests a concentration of the ethnic group in the area which is greater than we would expect given nationally observed levels. An LQ is quite simply a rate-ratio.</p>
<p>Instead of the standard areal approach, the maps here use a density estimation approach in which disaggregate point data is transformed into a representation of the continuous density function of the point distribution. The LQ can then be computed for each cell based on the density of that cell with respect to the total density of the surface. This creates a smoothed LQ surface which is readily interpretable in the same manner as above. The Kernel Density Estimation used to create the ethnic and total population density surfaces should be parameterised in the same way; these examples use a 250m bandwidth and a 25m cell size, which is largely empirically redundant given the input dataset’s spatial resolution, but creates a more aesthetically appealing mapped representation. Naturally, the procedure works well for clustered data, in Southwark’s case for the African and Muslim groups.</p>
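<p>One way to read the per-cell calculation is as the usual rate-ratio applied to two density surfaces on the same grid: the group&#8217;s share of the population in a cell, divided by the group&#8217;s share over the whole surface. The sketch below assumes exactly that form; the function name location_quotient_surface and the tiny 2 by 2 grids are hypothetical, and cells with no population are left undefined rather than forced to a value.</p>

```python
import numpy as np


def location_quotient_surface(group_density, total_density):
    """Cell-wise LQ: local share of the group divided by its overall share."""
    group = np.asarray(group_density, dtype=float)
    total = np.asarray(total_density, dtype=float)
    overall_share = group.sum() / total.sum()
    lq = np.full(group.shape, np.nan)  # cells with no population stay undefined
    populated = total > 0
    lq[populated] = (group[populated] / total[populated]) / overall_share
    return lq


# hypothetical density surfaces, e.g. KDE outputs with identical parameters
group = np.array([[2.0, 1.0], [0.0, 1.0]])
total = np.array([[4.0, 4.0], [0.0, 8.0]])
print(location_quotient_surface(group, total))
```

<p>In this toy grid the group makes up a quarter of the population overall, so a cell where it makes up half of the local density has an LQ of 2, and a cell where it makes up an eighth has an LQ of 0.5.</p>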
<p style="text-align: center"><a href="http://danieljlewis.org/files/2011/06/AfricanLQ.png"><img class="aligncenter size-large wp-image-530" src="http://danieljlewis.org/files/2011/06/AfricanLQ-724x1024.png" alt="" width="463" height="655" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://danieljlewis.org/2011/06/17/a-spatial-approach-to-location-quotients-2/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Network Population Density for Southwark</title>
		<link>http://danieljlewis.org/2011/05/03/network-population-density-for-southwark/</link>
		<comments>http://danieljlewis.org/2011/05/03/network-population-density-for-southwark/#comments</comments>
		<pubDate>Tue, 03 May 2011 02:29:45 +0000</pubDate>
		<dc:creator>Daniel Lewis</dc:creator>
				<category><![CDATA[Modeling]]></category>
		<category><![CDATA[Representation]]></category>
		<category><![CDATA[density]]></category>
		<category><![CDATA[network]]></category>
		<category><![CDATA[population]]></category>
		<category><![CDATA[sanet]]></category>
		<category><![CDATA[visualisation]]></category>

		<guid isPermaLink="false">http://danieljlewis.org.blogs.splintdev.geog.ucl.ac.uk/?p=517</guid>
		<description><![CDATA[Using the excellent SANET extension for ArcGIS 9.3 I was able to take some of my data for Southwark that I had geocoded to address level, and estimate the population density using the OS Mastermap ITN product. The procedure is essentially a Kernel Density Estimation that takes place on a given network rather than across 2D space, this effectively controls [...]]]></description>
			<content:encoded><![CDATA[
<p>Using the excellent <a title="SANET Website" href="http://sanet.csis.u-tokyo.ac.jp/">SANET</a> extension for ArcGIS 9.3 I was able to take some of my data for Southwark that I had geocoded to address level, and estimate the population density using the OS MasterMap ITN product. The procedure is essentially a Kernel Density Estimation that takes place on a given network rather than across 2D space. This effectively controls for the effect of spatial structure, such as urban form, on data that relates to residential locations. The estimation is made for c.300,000 people in Southwark on a network with around 30,000 road segments, so it is to be expected that the calculation takes several hours to run. The KDE process is parameterised in much the same way as the straightforward density estimation procedures in the ArcGIS Spatial Analyst toolboxes: bandwidth and cell size are specified. In this case, though, cell size relates to the length of the segments into which the network has to be cut in order to represent the output. Additionally, SANET allows you to control how road intersections are handled, using either a continuous or a discontinuous approach to the bifurcation. I arbitrarily chose the continuous approach, essentially meaning that the density estimation can turn corners. A straightforward representation can be made in 2D as below.</p>
<p style="text-align: center"><a href="http://danieljlewis.org/files/2011/05/SouthwarkNetworkDensitySanet.png"><img class="aligncenter size-large wp-image-518" src="http://danieljlewis.org/files/2011/05/SouthwarkNetworkDensitySanet-791x1024.png" alt="" width="428" height="553" /></a></p>
<p style="text-align: left">The interesting aspect of this image, which is obscured in 2D smoothed representations, is the relative usage of different streets. Clearly visible are the residential streets, as distinct from the more commercial areas on Southwark&#8217;s Bankside and along major roads, and the effect of open space and water features in reducing network density (i.e. where only one side of a road has residences on it). I&#8217;ve attempted to explore this further by using ArcScene&#8217;s 3D visualisation capabilities, but the complexity of the data makes this an incredibly arduous process. The one result I was able to obtain, apart from ArcScene simply crashing, is below.</p>
<p style="text-align: center"><a href="http://danieljlewis.org/files/2011/05/testhigherres.png"><img class="aligncenter size-large wp-image-521" src="http://danieljlewis.org/files/2011/05/testhigherres-1024x610.png" alt="" width="553" height="329" /></a></p>
<p style="text-align: left">In this example, Southwark is presented in a kind of 2.5D perspective in which the streets have been extruded so that their height represents the population density at that point. I&#8217;ve included some contextual elements: the Thames, parks, wooded areas and other water features. Whether or not this image is in any way an improvement over a simple 2D representation is open to debate, but the selections below do present an interesting cross section of the data.</p>
<p style="text-align: left"><a href="http://danieljlewis.org/files/2011/05/SelectionSanetSwk.png"><img class="aligncenter size-medium wp-image-522" src="http://danieljlewis.org/files/2011/05/SelectionSanetSwk-300x178.png" alt="" width="300" height="178" /></a><a href="http://danieljlewis.org/files/2011/05/SelectionSanetSwk2.png"><img class="aligncenter size-medium wp-image-523" src="http://danieljlewis.org/files/2011/05/SelectionSanetSwk2-300x178.png" alt="" width="300" height="178" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://danieljlewis.org/2011/05/03/network-population-density-for-southwark/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Frozen Britain and No Central Heating?</title>
		<link>http://danieljlewis.org/2010/12/20/frozen-britain-and-no-central-heating/</link>
		<comments>http://danieljlewis.org/2010/12/20/frozen-britain-and-no-central-heating/#comments</comments>
		<pubDate>Mon, 20 Dec 2010 19:12:15 +0000</pubDate>
		<dc:creator>Daniel Lewis</dc:creator>
				<category><![CDATA[Representation]]></category>
		<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[cartogram]]></category>
		<category><![CDATA[central heating]]></category>
		<category><![CDATA[snow]]></category>
		<category><![CDATA[winter]]></category>

		<guid isPermaLink="false">http://danieljlewis.org.blogs.splintdev.geog.ucl.ac.uk/?p=475</guid>
		<description><![CDATA[I liked Ben Hennig&#8217;s population cartogram of the UK under snow, but I thought it could perhaps show something a little more serious than simply where the people are. To do this I went to the UK Census 2001 (I know, an old data source, but the only thing I was aware of that could [...]]]></description>
			<content:encoded><![CDATA[
<p>I liked <a title="Views of the World" href="http://www.viewsoftheworld.net/?p=1101" target="_blank">Ben Hennig&#8217;s population cartogram of the UK under snow</a>, but I thought it could perhaps show something a little more serious than simply where the people are. To do this I went to the UK Census 2001 (I know, an old data source, but the only thing I was aware of that could help me) and downloaded a dataset of counts by area (LSOA) of households without central heating. Using these counts as a base population, I created the cartogram below.</p>
<p><a href="http://danieljlewis.org/files/2010/12/UKSnowCentralHeat.png"></a><a href="http://danieljlewis.org/files/2010/12/UKSnowCentralHeat1.png"><img class="aligncenter size-large wp-image-478" src="http://danieljlewis.org/files/2010/12/UKSnowCentralHeat1-723x1024.png" alt="" width="520" height="737" /></a></p>
<p style="text-align: left">Whilst very similar to Ben&#8217;s cartogram, there are some differences; notably, Scotland is not as prominent as in Ben&#8217;s. Perhaps the higher frequency of harsh winters in Scotland has made central heating a necessity. This also seems to be true in the far north of England. Likewise, Wales shrinks away in all areas aside from Cardiff, which is a notable bulge of people without central heating. It is clear, however, that the people most affected by a lack of central heating are those who live in the south and middle of England in large population centres such as London. Perhaps complacency about cold weather, plus a stock of substandard housing, or high levels of deprivation have caused this. Needless to say, it is likely to be these people who disproportionately feel the cold this winter.</p>
]]></content:encoded>
			<wfw:commentRss>http://danieljlewis.org/2010/12/20/frozen-britain-and-no-central-heating/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Representing Populations: a Spatial Ecology</title>
		<link>http://danieljlewis.org/2010/12/20/representing-populations-a-spatial-ecology/</link>
		<comments>http://danieljlewis.org/2010/12/20/representing-populations-a-spatial-ecology/#comments</comments>
		<pubDate>Mon, 20 Dec 2010 12:50:09 +0000</pubDate>
		<dc:creator>Daniel Lewis</dc:creator>
				<category><![CDATA[Representation]]></category>
		<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[choropleth]]></category>
		<category><![CDATA[dasymetric]]></category>
		<category><![CDATA[dot density]]></category>
		<category><![CDATA[new york times]]></category>
		<category><![CDATA[uncertainties]]></category>

		<guid isPermaLink="false">http://danieljlewis.org/?p=467</guid>
		<description><![CDATA[A subtitle to this post might also be: Are we all being mislead by the New York Times? In stating this I am referring to the recent maps released by the New York Times looking at ethnic distributions from the US Census Bureau&#8217;s American Community Survey. The most immediate thing we can learn about this [...]]]></description>
			<content:encoded><![CDATA[
<p>A subtitle to this post might also be: Are we all being misled by the New York Times? In stating this I am referring to the recent maps released by the <a title="NYT Census Explorer" href="http://projects.nytimes.com/census/2010/explorer" target="_blank">New York Times</a> looking at ethnic distributions from the US Census Bureau&#8217;s American Community Survey.</p>
<p>The most immediate thing we can learn about this project is that it is a spatial ecology, that is, an examination of the spatial patterning of a phenomenon (here, ethnicity) at a given level of spatial aggregation, in this case &#8220;every city, every block&#8221;. This much is apparent both when you drag the mouse across the geography of America and the Census areas are highlighted, and when you zoom in and move from the Census tract level to the Census block level, a finer scale of areal aggregation.</p>
<p>On the one hand, what has been achieved in this map is tremendous, and the use of dot density mapping allows for a singular look at multivariate data. The sheer level of residential segregation in the US also makes the dot density approach a very persuasive cartographic representation. However, first let us consider what the dot density approach is.</p>
<p>First and foremost, it is important to note that the dot density approach does not represent the real-world locations of individuals. Far from it: dot density maps are simply another way of drawing a choropleth map. Choropleth maps show data aggregated into predefined areas (e.g. Census Blocks) and thematically colour these areas based upon some classification of the share of the mapped phenomenon that each area has. In a dot density map, each dot represents an observation, or a number of observations, that occur within an area, and each dot is then randomly positioned within that area. This means that phenomena are not necessarily drawn where they were sampled, which can (in increasingly large areas) lead to increasingly large uncertainties and misrepresentations. A higher number of dots within an area indicates a greater number of observations, with density described by the relative spacing of the dots in each area: smaller spacings indicate higher density.</p>
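<p>To make the random placement concrete, here is a minimal sketch in plain Python. It is an illustration only, not the method the New York Times actually used: the function names, the L-shaped example area and the <code>people_per_dot</code> parameter are all assumptions of mine. Candidate positions are drawn inside the bounding box of the area and kept only if they fall inside the polygon.</p>

```python
import random


def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) tuples."""
    inside = False
    n = len(ring)
    for k in range(n):
        x1, y1 = ring[k]
        x2, y2 = ring[(k + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x position at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside


def dot_density(ring, count, people_per_dot=1, seed=None):
    """Place count // people_per_dot dots at random positions inside the ring."""
    rng = random.Random(seed)
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    xmin, xmax, ymin, ymax = min(xs), max(xs), min(ys), max(ys)
    n_dots = count // people_per_dot
    dots = []
    while n_dots > len(dots):
        # Draw a candidate in the bounding box, keep it only if it is inside
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        if point_in_polygon(x, y, ring):
            dots.append((x, y))
    return dots


# A hypothetical L-shaped census area holding 50 people, one dot per 10 people
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
dots = dot_density(l_shape, 50, people_per_dot=10, seed=1)
```

<p>Nothing in this procedure knows where people actually live within the area, which is exactly the weakness discussed here: every position inside the boundary is equally likely to receive a dot.</p>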
<p>Herein lies the difficulty &#8211; most ways of dividing up territory, and census delineations in particular, use a space-covering approach. This continuous, spatially extensive way of dividing up land means that all land areas, even areas that have no people living in them, are potentially subject to the random placement of a dot; in the image below this is shown by the placement of dots in water bodies. Dot density can also be logically unsound, particularly when two adjoining census blocks have significantly different population densities, which shows up as apparently hard &#8216;edges&#8217; at areal boundaries, as in the image below.</p>
<p style="text-align: center"><a href="http://danieljlewis.org/files/2010/12/NYTethnicity.png"><img class="aligncenter size-full wp-image-470" src="http://danieljlewis.org/files/2010/12/NYTethnicity.png" alt="" width="451" height="244" /></a></p>
<p style="text-align: left">One solution that could work to mitigate the issue of representing areal data using dot density maps would be to apply dasymetric mapping. The dasymetric mapping technique is a method of reallocating a population recorded on a continuous areal basis to one which is a better representation of where people actually are. To do this, more information than simply population counts is usually required, such as land use classifications, or delineations of developed area. In reallocating population counts from an areal unit created on a continuous basis to one which aims at a more realistic placing of people in space, the volume of people per area is preserved; this means that you will never end up with more or fewer people than you started with. David Martin has, in the UK, been responsible for some notable dasymetric outputs with regard to the UK Census, and provides a software tool, <a title="Dave Martin Surface Builder" href="http://www.public.geog.soton.ac.uk/users/martindj/davehome/software.htm" target="_blank">SurfaceBuilder, here</a>.</p>
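<p style="text-align: left">The volume-preserving idea can be sketched in a few lines of Python. This is a simplified illustration and not SurfaceBuilder&#8217;s algorithm: the zone names, areas and weights below are invented, and I am assuming that the population is simply shared out in proportion to area multiplied by a land use weight.</p>

```python
def dasymetric_reallocate(population, zones):
    """Share `population` among sub-zones in proportion to area * weight.

    `zones` maps a (hypothetical) zone name to an (area, weight) pair; a
    weight of 0 marks uninhabited land such as water.
    """
    scores = {name: area * weight for name, (area, weight) in zones.items()}
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no habitable area to receive the population")
    return {name: population * score / total for name, score in scores.items()}


# One census block of 1050 people, split by an invented land use layer
block = {
    "residential": (2.0, 1.0),
    "parkland": (1.0, 0.1),
    "water": (3.0, 0.0),
}
allocated = dasymetric_reallocate(1050, block)
```

<p style="text-align: left">Because every share is a fraction of the same total, the allocated values sum back to the original count (up to floating point rounding), and the water zone receives nobody.</p>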
<p style="text-align: left">The overarching goal of dasymetric mapping is to circumvent the ecological fallacy, which manifests itself in the issues I have suggested exist in the dot density mapping of the US. Whilst dasymetric mapping would resolve some issues, dot density would still be subject to some mislocation of data, which largely stems from the conflicting ontology of representing areal data, such as a population count by census area, as a series of points within that area; it is too easy for the viewer to interpret the points as having some level of significance above and beyond the areal container within which they sit. Therefore it is useful that the New York Times mapping also provides an option to look solely at the thematic choropleths, classified by colouring the areas for each individual ethnicity. In this representation the viewer cannot confer the same kind of absolute interpretation upon the meaning or location of points as they may do for dot density representations.</p>
]]></content:encoded>
			<wfw:commentRss>http://danieljlewis.org/2010/12/20/representing-populations-a-spatial-ecology/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Carl Steinitz, Symap and Place</title>
		<link>http://danieljlewis.org/2010/08/11/carl-steinitz-symap-and-place/</link>
		<comments>http://danieljlewis.org/2010/08/11/carl-steinitz-symap-and-place/#comments</comments>
		<pubDate>Wed, 11 Aug 2010 16:54:14 +0000</pubDate>
		<dc:creator>Daniel Lewis</dc:creator>
				<category><![CDATA[Cartography]]></category>
		<category><![CDATA[Representation]]></category>
		<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Boston]]></category>
		<category><![CDATA[LSE]]></category>
		<category><![CDATA[MIT]]></category>
		<category><![CDATA[place]]></category>
		<category><![CDATA[Steinitz]]></category>
		<category><![CDATA[Symap]]></category>

		<guid isPermaLink="false">http://danieljlewis.org/?p=387</guid>
		<description><![CDATA[Recently, all and sundry had the chance to rummage through LSE Geography&#8217;s map library and liberate any maps of their choosing. Naturally some got over excited (cf. James Cheshire) and took numerous maps of all sorts. I was slightly more selective, and whilst being mostly on the look out for maps that represented social areas [...]]]></description>
			<content:encoded><![CDATA[
<p>Recently, all and sundry had the chance to rummage through LSE Geography&#8217;s map library and liberate any maps of their choosing. Naturally some got over excited (cf. <a title="James Cheshire's Blog" href="http://spatialanalysis.co.uk/">James Cheshire</a>) and took numerous maps of all sorts. I was slightly more selective, and whilst being mostly on the look out for maps that represented social areas (cf. <a title="LSE Booth Map Portal" href="http://booth.lse.ac.uk/" target="_blank">Booth Maps</a>) I did find one particularly interesting map.</p>
<p style="text-align: center"><a href="http://danieljlewis.org/files/2010/08/Steinitz.jpg"><img class="aligncenter size-full wp-image-388" title="Steinitz" src="http://danieljlewis.org/files/2010/08/Steinitz.jpg" alt="" width="436" height="645" /></a></p>
<p>The map is by Carl Steinitz, from a time when he was at the MIT Department of City and Regional Planning, and it appears to have been made using Symap. The map is entitled &#8220;The Principle Local Activity of a Place&#8221;. I think this title is both fascinating and, in terms of the development of spatial analysis, quite telling. First, however, some background.</p>
<p>Carl Steinitz is a Professor at Harvard Graduate School of Design, and has been a regular visitor at CASA for as long as I&#8217;ve been at UCL. He trained as an architect and planner, but became known as an early evangelist of Geographic Information Systems (GIS); his ongoing work concerns the design of environments, often urban, and the use of GIS to describe possible development trajectories. I respect him most for his impassioned stance against needless 3D visualisations, particularly if those visualisations have a musical backing.</p>
<p>Symap, aka the Synagraphic Mapping System, is one of the first software packages that could create outputs that actively resemble current desktop GIS outputs. It was developed in the mid-1960s and carried with it a distinctive style of using ASCII characters in order to draw map elements. Andrew Crooks has a couple of interesting examples and some background on his <a title="Symap info from GIS Agents" href="http://gisagents.blogspot.com/2009/10/symap-movie.html" target="_blank">blog</a>.</p>
<p>Now the map in question here doesn&#8217;t seem to have a date, which is a shame, and it does not give a specific location for the mapped area, although given that the map was made at MIT it becomes apparent that the area in question is Boston, Massachusetts, with the Charles River Basin in the north of the map and Boston Harbour to the east. The legend denotes different kinds of &#8216;principle local activity&#8217;, using different ASCII characters to create a colour gradation. Unfortunately some of the particular legend categories are lost due to the low quality of reproduction on this particular map; nevertheless we see that Boston exhibits a distinct spatial patterning with respect to principle activity. This kind of map is not unusual: land use mapping is still an actively researched area that continues to generate copious debate. What interests me actually seems rather minor: it is the description of the map as presenting &#8220;The principle local activity of a <em>place</em>&#8221; (emphasis added). Initially I wondered whether this phrasing was simply standard boilerplate, but a Google search couldn&#8217;t find the exact phrase, or variations on it, anywhere else on the web (which is not to say that it isn&#8217;t standard, simply that it doesn&#8217;t exist on Google; I imagine it would have appeared had it been related to statistical reporting at some time or another). What it may mean, then, is that it marks the way in which the author, Carl Steinitz, saw the representation at the time: as a representation of the local activity of a place.</p>
<p>This is interesting. First, it is easy to assume that by place he meant &#8216;Boston&#8217;; Boston is, after all, a place. However, scale has a very interesting role to play in how we think about place: we can conceive of many places within Boston centred around communities of all kinds, and these places will be at least partially defined by the &#8216;local activities&#8217; that occur there. As such the gridded representation of this map hints at the possibility of lots of places within Boston, each with particular autobiographies, and each engaging people in different ways and offering different opportunities. Subsequent advances in GIS formalised the discourse of &#8216;space&#8217; and spatial analysis; after all, GIS does fundamentally hinge on the Euclidean system of representation, and as such the vast, expansive idea of space sits much better than a nuanced, specific, local concept such as place. It would be easy to disregard Steinitz&#8217;s map and say that of course it simply assesses land use in Boston by a grid of systematically defined areas, but that designation of &#8216;place&#8217; &#8211; purposeful or not &#8211; adds another layer of interpretation. Fundamentally it gives a different sense to what is being represented here.</p>
]]></content:encoded>
			<wfw:commentRss>http://danieljlewis.org/2010/08/11/carl-steinitz-symap-and-place/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>UK OAC map in Python</title>
		<link>http://danieljlewis.org/2010/06/02/uk-oac-map-in-python/</link>
		<comments>http://danieljlewis.org/2010/06/02/uk-oac-map-in-python/#comments</comments>
		<pubDate>Wed, 02 Jun 2010 11:05:57 +0000</pubDate>
		<dc:creator>Daniel Lewis</dc:creator>
				<category><![CDATA[Cartography]]></category>
		<category><![CDATA[GIS]]></category>
		<category><![CDATA[Representation]]></category>
		<category><![CDATA[map]]></category>
		<category><![CDATA[OAC]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[shapely]]></category>
		<category><![CDATA[UK]]></category>

		<guid isPermaLink="false">http://danieljlewis.org/?p=336</guid>
		<description><![CDATA[Here is a quick confirmation that you can use Python to draw very detailed maps; using the previously specified method I was unable to get python to draw all UK OAs due to their great number (c.220,000) and high complexity (c.50,000,000 vertices). Additionally I was unable to use the generalised OA boundaries for the UK [...]]]></description>
			<content:encoded><![CDATA[
<p>Here is a quick confirmation that you can use Python to draw very detailed maps. Using the previously specified method I was unable to get Python to draw all UK OAs due to their great number (c.220,000) and high complexity (c.50,000,000 vertices). Additionally I was unable to use the generalised OA boundaries for the UK from UKBorders as they contain topological errors that the shapefile reader cannot deal with. ArcGIS is obviously a bit clever in how it handles bad topologies. So I extracted all the vertices and fed them into shapely polygons, and visualised them in the same way but without reading shapefiles directly into Python, and was able to output this:</p>
<p style="text-align: left"><a href="http://danieljlewis.org/files/2010/06/UKOAC.png"><img class="aligncenter size-large wp-image-337" title="UKOAC" src="http://danieljlewis.org/files/2010/06/UKOAC-640x1024.png" alt="" width="576" height="922" /></a>This method has had an impact on the speed of computation, as it can take roughly 25 minutes to output this map. The map looks pretty good, aside from a slightly odd polygon in the Bristol Channel. Nevertheless, coupled with the operations that shapely, and other geo-libraries, can do, this is an increasing indication of the maturity of GIS on a variety of platforms. Oh, and it&#8217;s all free!</p>
]]></content:encoded>
			<wfw:commentRss>http://danieljlewis.org/2010/06/02/uk-oac-map-in-python/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>More Thematic Maps in Python &#8211; shapely and descartes</title>
		<link>http://danieljlewis.org/2010/05/27/more-thematic-maps-in-python-shapely-and-descartes/</link>
		<comments>http://danieljlewis.org/2010/05/27/more-thematic-maps-in-python-shapely-and-descartes/#comments</comments>
		<pubDate>Thu, 27 May 2010 16:58:14 +0000</pubDate>
		<dc:creator>Daniel Lewis</dc:creator>
				<category><![CDATA[Representation]]></category>
		<category><![CDATA[descartes]]></category>
		<category><![CDATA[matplotlib]]></category>
		<category><![CDATA[OAC]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[shapely]]></category>
		<category><![CDATA[Wales]]></category>

		<guid isPermaLink="false">http://danieljlewis.org/?p=326</guid>
		<description><![CDATA[Thanks to Sean Gillies for commenting on my last post, he put me onto a couple of Python packages that he&#8217;s been involved in creating that allow you to do some really excellent geospatial things. The shapely package is a great implementation of a lot of spatial analyses that you can do on projected (i.e. [...]]]></description>
			<content:encoded><![CDATA[
<p>Thanks to <a title="Sean Gillies Homepage" href="http://sgillies.net/" target="_blank">Sean Gillies</a> for commenting on my last post; he put me onto a couple of Python packages that he&#8217;s been involved in creating that allow you to do some really excellent geospatial things. The <a title="shapely" href="http://trac.gispython.org/lab/wiki/Shapely" target="_blank">shapely</a> package is a great implementation of a lot of spatial analyses that you can do on projected (i.e. flattened) datasets, including topological operations and a full set of object types. The <a title="Descartes package" href="http://pypi.python.org/pypi/descartes/1.0" target="_blank">descartes</a> package allows better integration of matplotlib with spatial data, particularly in terms of not having to use the &#8220;fill&#8221; plotting function repeatedly, but instead creating a more efficient set of &#8220;patches&#8221; which can then be added to the figure plot. The overall impression I got from descartes is that it wasn&#8217;t spectacularly different from the method detailed in my previous post, but it gives you more control and stability over the map plotting process: whereas using raw matplotlib you are inclined to hope that the map outputs correctly (it all seems a bit up to chance), using descartes you have a more robust and easily manipulable output.</p>
<p>In order to test this I rewrote my previous thematic map script to: firstly convert the shapefile geometries into shapely polygons, and secondly to pass those shapely polygons to descartes and draw a map plot using descartes-matplotlib. The only slightly odd piece of functionality that I found was that you can&#8217;t pass the shapely polygon object a list of shapely points in order to create the polygon, rather you have to pass a list of x,y tuples &#8211; much less satisfying!</p>
<p>Nonetheless, the changes were easy to implement and, taking the previous script as given, basically amount to:</p>
<pre>from shapely.geometry import Polygon

# Inside the loop over geometries (index i) from the previous script:
# the vertices of the ith geometry, assuming a singlepart shapefile.
vertices = shpRecords[i]['shp_data']['parts'][0]['points']

# Polygon takes a sequence of (x, y) tuples rather than shapely Points.
points = []
for vertex in vertices:
 points.append((float(vertex['x']), float(vertex['y'])))
polygon = Polygon(points)
</pre>
<p>The above method creates a simple polygon without holes; shapely can accommodate holes if need be, though. Having created the shapely polygons, all that remains is to create a patch.</p>
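<pre>from descartes import PolygonPatch

# fc, ec and lw set the fill colour, edge colour and line width;
# these are example values in the style of the previous post's plt.fill call
patch = PolygonPatch(polygon, fc='#1B9E77', ec='0.7', lw=0.1)
</pre>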
<p>Then you simply add the patch to the matplotlib figure you have already created so:</p>
<pre>from matplotlib import pyplot

fig = pyplot.figure(1, figsize = [10,10], dpi = 300)   #create 10x10 figure
ax = fig.add_subplot(111)    #Add the map frame (single plot)

# here you create all the polygons and patches

ax.add_patch(patch)   # simply add the patch to the subplot
# set plot vars
ax.set_xlim(<em>get xmin and xmax values from data</em>)
ax.set_ylim(<em>get ymin and ymax values from data</em>)
ax.set_aspect(1)

pyplot.show()
</pre>
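<p>The set_xlim and set_ylim calls above need the extent of the data. One way to get it, sketched here in plain Python, is to scan the same lists of x,y tuples that were used to build the polygons. The <code>bounding_box</code> helper and its <code>margin</code> argument are my own illustration rather than part of shapely or descartes, and the coordinates are invented.</p>

```python
def bounding_box(polygons, margin=0.0):
    """Return (xmin, xmax, ymin, ymax) over several lists of (x, y) tuples."""
    xs = [x for points in polygons for x, _ in points]
    ys = [y for points in polygons for _, y in points]
    return (min(xs) - margin, max(xs) + margin,
            min(ys) - margin, max(ys) + margin)


# Two made-up polygons, each a list of (x, y) tuples
polygons = [
    [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0)],
    [(-2.0, 3.0), (4.0, 8.0), (1.0, 1.0)],
]
xmin, xmax, ymin, ymax = bounding_box(polygons, margin=1.0)
# ax.set_xlim(xmin, xmax) and ax.set_ylim(ymin, ymax) would then frame the map
```

<p>A small margin keeps the outermost edges from sitting directly on the frame of the plot.</p>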
<p>Using these basics I was able to create a basic OAC map using Welsh OAs as an example:</p>
<p style="text-align: center"><a href="http://danieljlewis.org/files/2010/05/WalesOAC1.png"><img class="aligncenter size-full wp-image-328" title="WalesOAC" src="http://danieljlewis.org/files/2010/05/WalesOAC1.png" alt="" width="520" height="545" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://danieljlewis.org/2010/05/27/more-thematic-maps-in-python-shapely-and-descartes/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>A Thematic Map in Python</title>
		<link>http://danieljlewis.org/2010/05/25/a-thematic-map-in-python/</link>
		<comments>http://danieljlewis.org/2010/05/25/a-thematic-map-in-python/#comments</comments>
		<pubDate>Tue, 25 May 2010 19:08:09 +0000</pubDate>
		<dc:creator>Daniel Lewis</dc:creator>
				<category><![CDATA[Representation]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[automated]]></category>
		<category><![CDATA[categorical]]></category>
		<category><![CDATA[Maps]]></category>
		<category><![CDATA[matplotlib]]></category>
		<category><![CDATA[OAC]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[shapefile]]></category>

		<guid isPermaLink="false">http://danieljlewis.org/?p=309</guid>
		<description><![CDATA[I thought I would explore the possibility of creating thematic maps using Python, this post documents my initial attempt. The output is hence rather basic, but encouraging. The primary reason that I wanted to test the mapping potential of python is to allow for some basic automated map production in order to quickly visually assess [...]]]></description>
			<content:encoded><![CDATA[
<p>I thought I would explore the possibility of creating thematic maps using Python; this post documents my initial attempt. The output is hence rather basic, but encouraging. The primary reason that I wanted to test the mapping potential of Python is to allow for some basic automated map production in order to quickly visually assess the geographical patterns contained within large data sets. This is something that I am at a loss to do in ESRI&#8217;s ArcGIS, although that might change in ArcGIS 10. For fans of R, I know it can be done there; however, R is too tricky for me! My colleague James Cheshire explains the method in R <a title="Making Maps in R" href="http://spatialanalysis.co.uk/2010/01/13/making-maps-with-r/" target="_blank">here.</a></p>
<p>The first hurdle in map making is getting the data in; for this I used the <a title="Shapefile Reader" href="http://indiemaps.com/blog/2008/03/easy-shapefile-loading-in-python/" target="_blank">shapefile reader</a> that <a title="Indiemaps Home" href="http://indiemaps.com/" target="_blank">Zachary Forest Johnson</a> put together for his excellent blog &#8216;<a title="Indiemaps Blog" href="http://indiemaps.com/blog">IndieMaps.com</a>&#8217;. This allowed me to read in any of my masses of pre-existing Shapefile format datafiles, and indeed use the Python scripting functionality in ArcGIS to perform spatial operations and then output a map quickly and without the hassle of dealing with ArcGIS layouts.</p>
<p>Once you have downloaded the shapefile reader, it is easily implemented using:</p>
<pre>import shpUtils   #imports the shapefile reader
#Load a shapefile into an object called shpRecords
#(raw string, so that '\f' is not read as a form feed escape)
shpRecords = shpUtils.loadShapefile(r'\filename.shp')</pre>
<p>This is undoubtedly simple; what you then have is a (slightly) complex object which contains all of the shapefile data nested as lists and dictionaries. In order to get my head round this I spent some time investigating it. A standard shapefile that contains areal geographies (e.g. UK Output Areas) will have a similar set-up to this:</p>
<ul>
<li>The first list (shpRecords[i]) records the number of complete geometries, this corresponds to the number of rows in the attribute table. Thus a single polygon has 1 row in the attribute table and 1 list (list index 0) in Python.</li>
<li>The second dictionary (shpRecords[i]['key']) records two branches, reporting either the &#8216;dbf_data&#8217; from the attribute table, or the &#8216;shp_data&#8217; from the .shp file describing the underlying geometry.</li>
<li>Choosing the &#8216;dbf_data&#8217; key (shpRecords[i]['dbf_data']) allows you to see the attributes recorded column-by-column for each row (and hence each geometry) in the attribute table. Thus shpRecords[i]['dbf_data']['name'] will return the attribute value for the field &#8216;name&#8217; for the <em>i</em>th geometry in the shapefile.</li>
<li>Choosing the &#8216;shp_data&#8217; key (shpRecords[i]['shp_data']) allows you to access the various components of the shapefile&#8217;s geometry. In the case of a polyline/polygon you get dictionary items &#8216;ymax&#8217;, &#8216;ymin&#8217;, &#8216;xmax&#8217;, &#8216;xmin&#8217;, &#8216;numpoints&#8217;, &#8216;numparts&#8217; and &#8216;parts&#8217;. Clearly the first 6 items are properties of the <em>i</em>th geometry you are querying, so it allows you to form a bounding box, get the number of vertices in the line/polygon, and draw separate lines/polygons if the shapefile is setup to have spatially discontinuous shapes for each row.</li>
<li>The thing we are most interested in is the &#8216;parts&#8217; dictionary key, as this contains all the coordinates for the particular geometry being considered; this is accessed as: shpRecords[i]['shp_data']['parts']. The next list (shpRecords[i]['shp_data']['parts'][j]) thus allows you to distinguish between parts in a multipart file, i.e. the <em>j</em>th part of the <em>i</em>th geometry. Each part holds its vertices as a list under the &#8216;points&#8217; key (shpRecords[i]['shp_data']['parts'][j]['points']).</li>
<li>Having come this far, one final dictionary allows us to see the coordinates themselves; this dictionary simply offers us &#8216;x&#8217; or &#8216;y&#8217;. Thus the x-coordinate of the <em>k</em>th vertex in the <em>j</em>th part of the <em>i</em>th geometry is accessed by: shpRecords[i]['shp_data']['parts'][j]['points'][k]['x'] &#8211; simple!</li>
</ul>
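<p>To see the nesting at a glance, here is a small hand-built stand-in for the object the reader returns, together with a loop that walks it. Everything in <code>mock_records</code> is invented for illustration (a single unit square), and the &#8216;points&#8217; level follows the keys used by the plotting script further down; the real reader may well return additional keys.</p>

```python
# A hand-built stand-in for the reader's output: one geometry made of a
# single part (a unit square). All values are invented for illustration.
mock_records = [
    {
        'dbf_data': {'name': 'Example OA', 'supergroup': 7.0},
        'shp_data': {
            'xmin': 0.0, 'xmax': 1.0, 'ymin': 0.0, 'ymax': 1.0,
            'numparts': 1, 'numpoints': 5,
            'parts': [
                {'points': [{'x': 0.0, 'y': 0.0}, {'x': 1.0, 'y': 0.0},
                            {'x': 1.0, 'y': 1.0}, {'x': 0.0, 'y': 1.0},
                            {'x': 0.0, 'y': 0.0}]},
            ],
        },
    },
]


def describe(records):
    """Summarise each geometry as (name, number of parts, number of vertices)."""
    summary = []
    for record in records:
        parts = record['shp_data']['parts']
        n_vertices = sum(len(part['points']) for part in parts)
        summary.append((record['dbf_data']['name'], len(parts), n_vertices))
    return summary


# x-coordinate of the first vertex of the first part of the first geometry
first_x = mock_records[0]['shp_data']['parts'][0]['points'][0]['x']
summary = describe(mock_records)
```

<p>Running something like <code>describe</code> over a freshly loaded file is a quick way to check that the number of geometries and vertices matches what you expect before any plotting is attempted.</p>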
<p>I have been using <a title="MatPlotLib @ Sourceforge" href="http://matplotlib.sourceforge.net/" target="_blank">matplotlib</a> &#8211; a Python library for scientific visualisation &#8211; a lot recently, and have found it a very simple and powerful resource, so I thought I&#8217;d see if it could be made to draw a map.</p>
<p>Firstly import the pyplot element which does all the figure drawing:</p>
<pre>import matplotlib.pyplot as plt
</pre>
<p>Now let&#8217;s use the &#8220;fill&#8221; component of matplotlib to draw all the geometries in a shapefile &#8211; my shapefile is Output Areas in Southwark. Firstly we need to loop through each geometry, and then draw a polygon using all the points contained within each geometry. I omitted a loop for multipart geometries as my shapefile has none; however, this would be very easy if the data did have multiple parts &#8211; simply add a loop in the middle!</p>
<pre>for i in range(0,len(shpRecords)):
 # x and y are empty lists to be populated with the coords of each geometry.
 x = []
 y = []
 for j in range(0,len(shpRecords[i]['shp_data']['parts'][0]['points'])):
  # This is the number of vertices in the ith geometry.
  # The parts list is [0] as it is singlepart.

  # get x and y coordinates.
  tempx = float(shpRecords[i]['shp_data']['parts'][0]['points'][j]['x'])
  tempy = float(shpRecords[i]['shp_data']['parts'][0]['points'][j]['y'])
  x.append(tempx)
  y.append(tempy) # Populate the lists  

 # Creates a polygon in matplotlib for each geometry in the shapefile
 plt.fill(x,y)

plt.axis('equal')
# This sets the x and y axes as equal intervals.
# NB this script will only work for projected data, for geographical
# coordinate systems get ready to do some maths  

plt.show() # Draws the map!</pre>
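<p>For data that does have multiple parts, the extra loop in the middle might look something like the sketch below. It only collects the coordinate lists, so it runs without matplotlib; the <code>part_coordinates</code> name and the two-part example record are assumptions of mine. Note that it treats every part as its own filled shape, so parts that are really holes in a polygon would be painted over rather than cut out.</p>

```python
def part_coordinates(records):
    """Yield (x, y) coordinate lists for every part of every geometry."""
    for record in records:
        # The extra loop: one pass per part instead of assuming parts[0]
        for part in record['shp_data']['parts']:
            x = [float(point['x']) for point in part['points']]
            y = [float(point['y']) for point in part['points']]
            yield x, y


# A made-up record with two parts (say, a mainland and an island)
records = [{'shp_data': {'parts': [
    {'points': [{'x': 0, 'y': 0}, {'x': 2, 'y': 0}, {'x': 1, 'y': 2}]},
    {'points': [{'x': 5, 'y': 5}, {'x': 6, 'y': 5}, {'x': 6, 'y': 6}]},
]}}]

rings = list(part_coordinates(records))
# Each (x, y) pair could then be drawn with plt.fill(x, y)
```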
<p>This is the simplest form of the script; it will simply draw the shapefile with each area filled an arbitrary colour (matplotlib simply cycles through its default colours). This is not that useful, but it is easy to create thematic maps of categorical data, so let&#8217;s investigate a way of doing that. I&#8217;ve got data for the Output Area Classification, which is a clustering of areas by social characteristics. I know that there are 7 supergroups in the classification, named numerically, so before all the processing of the shapefile I can create a dictionary of colour choices for each group. I&#8217;m using hexadecimal colours that I got from <a title="Colour Brewer" href="http://colorbrewer2.org/" target="_blank">Cynthia Brewer&#8217;s</a> website for a &#8216;qualitative&#8217; 7 class classification. The dictionary looks like this:</p>
<pre>oacSGroups = {'1':'#A6761D','2':'#E6AB02','3':'#66A61E','4':'#E7298A',\
'5':'#7570B3','6':'#D95F02','7': '#1B9E77'}
</pre>
<p>Thus the key &#8216;1&#8217; returns the associated hex colour; this can be linked to the &#8216;dbf_data&#8217; key in the shapefile. In the plt.fill() component I simply have to specify the colour choice, thus we alter the line in the above script to read:</p>
<pre>plt.fill(x,y,fc = oacSGroups[str(int(shpRecords[i]['dbf_data']['supergroup']))]\
,ec = '0.7',lw=0.1)
</pre>
<p>&#8216;fc&#8217; is the &#8216;face colour&#8217;: we are asking Python to make the colour equal to the value in the oacSGroups dictionary where the key is the value contained in the attribute table for the <em>i</em>th row in the &#8216;supergroup&#8217; field. Thus if the <em>i</em>th row had a &#8216;supergroup&#8217; value of &#8216;7&#8217;, the face colour would be set to &#8216;#1B9E77&#8217;. &#8216;ec&#8217; is &#8216;edge colour&#8217; and &#8216;lw&#8217; is line width; here I have set the values to display fine, light grey lines.</p>
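<p>The str(int(...)) conversion in that line is there because the attribute value may not arrive as the string the dictionary expects. A slightly more defensive version of the lookup could be sketched as below; the <code>supergroup_colour</code> helper and the grey fallback colour are my own additions, not part of the original script.</p>

```python
oacSGroups = {'1': '#A6761D', '2': '#E6AB02', '3': '#66A61E', '4': '#E7298A',
              '5': '#7570B3', '6': '#D95F02', '7': '#1B9E77'}


def supergroup_colour(value, colours=oacSGroups, missing='#CCCCCC'):
    """Map an attribute value such as 7, 7.0 or '7' to its hex colour.

    Values that cannot be read as a whole number, or that have no entry in
    the dictionary, get the (assumed) grey `missing` colour instead.
    """
    try:
        key = str(int(float(value)))
    except (TypeError, ValueError, OverflowError):
        return missing
    return colours.get(key, missing)
```

<p>The call in the script would then become something like plt.fill(x, y, fc = supergroup_colour(shpRecords[i]['dbf_data']['supergroup']), ec = '0.7', lw = 0.1), and an area with a blank or unexpected supergroup is drawn in grey rather than stopping the script with a KeyError.</p>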
<p>Finally, as basic a map as this will turn out to be, we wouldn&#8217;t be anywhere without a legend. The following is a very basic, wholly manual way to add a legend to the map:</p>
<pre>p1 = plt.Rectangle((0, 0), 1, 1, fc="#A6761D")
p2 = plt.Rectangle((0, 0), 1, 1, fc="#E6AB02")
p3 = plt.Rectangle((0, 0), 1, 1, fc="#66A61E")
p4 = plt.Rectangle((0, 0), 1, 1, fc="#E7298A")
p5 = plt.Rectangle((0, 0), 1, 1, fc="#7570B3")
p6 = plt.Rectangle((0, 0), 1, 1, fc="#D95F02")
p7 = plt.Rectangle((0, 0), 1, 1, fc="#1B9E77")

plt.legend([p1,p2,p3,p4,p5,p6,p7], ["Super Group 1","Super Group 2",\
"Super Group 3","Super Group 4","Super Group 5","Super Group 6","Super Group 7"], loc = 4)
</pre>
<p>This simply creates 7 rectangular patches which don&#8217;t appear on the plotted output, but instead are passed to the legend creator. Each rectangle has the appropriate colour to match the mapped representation, and a label; the rectangles and labels are passed to the legend as two ordered lists. The &#8216;loc&#8217; tag sets where the legend will appear; 4 denotes the bottom right corner. The tag &#8216;title&#8217; allows you to add a title to the legend as a string.</p>
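<p>The two ordered lists do not have to be typed out by hand. As a rough sketch, they could be derived from the colour dictionary itself, so that the legend cannot drift out of step with the map; the <code>legend_entries</code> helper is hypothetical and only builds the lists, leaving the plotting calls as comments.</p>

```python
oacSGroups = {'1': '#A6761D', '2': '#E6AB02', '3': '#66A61E', '4': '#E7298A',
              '5': '#7570B3', '6': '#D95F02', '7': '#1B9E77'}


def legend_entries(colours, prefix="Super Group"):
    """Return (colour list, label list), both ordered by numeric group key."""
    keys = sorted(colours, key=int)
    return ([colours[k] for k in keys],
            ["%s %s" % (prefix, k) for k in keys])


legend_colours, legend_labels = legend_entries(oacSGroups)
# With matplotlib imported as plt, the legend could then be built as:
#   handles = [plt.Rectangle((0, 0), 1, 1, fc=c) for c in legend_colours]
#   plt.legend(handles, legend_labels, loc=4)
```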
<p style="text-align: left">An example output looks something like this:</p>
<p style="text-align: left"><a href="http://danieljlewis.org/files/2010/05/OACPythonMap1.png"><img class="aligncenter size-full wp-image-323" title="OACPythonMap" src="http://danieljlewis.org/files/2010/05/OACPythonMap1.png" alt="" width="564" height="650" /></a>This took a couple of seconds to produce, and accounts for 846 individual geometries, which actually have quite a number of vertices.</p>
<p style="text-align: left">I&#8217;ll update the blog should I find new methods to visualise spatial data in python.</p>
]]></content:encoded>
			<wfw:commentRss>http://danieljlewis.org/2010/05/25/a-thematic-map-in-python/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

