<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Volunteered Geographic Information &#187; top 20</title>
	<atom:link href="http://danieljlewis.org/tag/top-20/feed/" rel="self" type="application/rss+xml" />
	<link>http://danieljlewis.org</link>
	<description>A Geography/GIS blog by Daniel J Lewis</description>
	<lastBuildDate>Tue, 20 Dec 2011 17:15:30 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4-alpha-20124</generator>
		<item>
		<title>Analysis of Surnames from Southwark Patient Register</title>
		<link>http://danieljlewis.org/2010/03/03/analysis-of-surnames-from-southwark-patient-register/</link>
		<comments>http://danieljlewis.org/2010/03/03/analysis-of-surnames-from-southwark-patient-register/#comments</comments>
		<pubDate>Wed, 03 Mar 2010 15:32:51 +0000</pubDate>
		<dc:creator>Daniel Lewis</dc:creator>
				<category><![CDATA[Southwark]]></category>
		<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[James Cheshire]]></category>
		<category><![CDATA[population]]></category>
		<category><![CDATA[surnames]]></category>
		<category><![CDATA[top 20]]></category>

		<guid isPermaLink="false">http://danieljlewis.org/?p=243</guid>
		<description><![CDATA[My colleague James Cheshire&#8217;s research deals with understanding and classifying spatial patterns in surnames. He has been able to show, through various techniques, that there exists in the UK a regional geography of surnames. With this in mind, I thought I&#8217;d interrogate my database of NHS patient registrations for Southwark and see what was going on [...]]]></description>
			<content:encoded><![CDATA[
<p>My colleague <a title="JC's Blog" href="http://spatialanalysis.co.uk/" target="_blank">James Cheshire&#8217;s</a> research deals with understanding and classifying spatial patterns in surnames. He has been able to show, through various techniques, that there exists in the UK a regional geography of surnames. With this in mind, I thought I&#8217;d interrogate my database of NHS patient registrations for Southwark and see what was going on in surname terms there. This first table shows the 20 most common surnames in Southwark, ranked by occurrence.</p>
<div id="attachment_247" class="wp-caption aligncenter" style="width: 430px"><a href="http://danieljlewis.org/files/2010/03/Top20namesSouthwark.png"><img class="size-full wp-image-247" title="Top20namesSouthwark" src="http://danieljlewis.org/files/2010/03/Top20namesSouthwark.png" alt="" width="420" height="421" /></a><p class="wp-caption-text">Figure 1: Top 20 Surnames in Southwark, by occurrence.</p></div>
<p>Unsurprisingly perhaps, the top places are dominated by surnames native to the UK: classically Smith, Williams, Jones and so on. However, in line with Southwark&#8217;s reputation as a diverse borough, and in light of its high in-migration figures, it is also clear that some of these top 20 surnames are connected to in-migrant populations: Kamara, Ahmed, Ali, Patel and Khan are all surnames associated with earlier periods of migration to the UK. Interestingly, the Vietnamese population is very small, less than 1% of the population of Southwark, but around 23% of its members have the surname &#8216;Nguyen&#8217;. The ethnicity of the surnames is derived from <a title="Onomap" href="http://www.onomap.org/" target="_blank">Onomap</a>.</p>
<p>The frequency distribution of Southwark surnames looks like this:</p>
<div id="attachment_246" class="wp-caption aligncenter" style="width: 584px"><a href="http://danieljlewis.org/files/2010/03/SurnameFreq.png"><img class="size-large wp-image-246" title="SurnameFreq" src="http://danieljlewis.org/files/2010/03/SurnameFreq-1024x416.png" alt="" width="574" height="233" /></a><p class="wp-caption-text">Figure 2: Surname Frequency Distribution for Southwark, 2009</p></div>
<p style="text-align: left">Note the characteristic long tail: there are a huge number of unique, or almost unique, surnames, and considerably fewer surnames that are shared by a large number of people. Such a distribution seems to obey a <a title="Wiki Power Law" href="http://en.wikipedia.org/wiki/Power_law" target="_blank">power law</a> of some sort.</p>
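<p style="text-align: left">A frequency distribution of this kind could be tabulated roughly as follows. This is only a minimal sketch, not the code used for this post: the list <code>surnames</code> is a made-up stand-in for the patient register, with one entry per registered person.</p>

```python
from collections import Counter

# Hypothetical toy register: one surname per registered person.
surnames = ["Smith", "Smith", "Smith", "Jones", "Jones", "Ali", "Khan", "Nguyen"]

# Number of people carrying each surname.
surname_counts = Counter(surnames)

# Number of surnames that occur exactly k times ("frequency of frequencies").
frequency_distribution = Counter(surname_counts.values())

for occurrences, n_surnames in sorted(frequency_distribution.items()):
    print(occurrences, n_surnames)
```

<p style="text-align: left">In a long-tailed register, the count for surnames occurring once or twice dwarfs the count for surnames occurring thousands of times.</p>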
<p style="text-align: left">We can dig deeper into this phenomenon by looking at the number of surnames that comprise a given percentage of the population:</p>
<div id="attachment_245" class="wp-caption aligncenter" style="width: 530px"><a href="http://danieljlewis.org/files/2010/03/PopSurnametablegraph.png"><img class="size-full wp-image-245" title="PopSurnametablegraph" src="http://danieljlewis.org/files/2010/03/PopSurnametablegraph.png" alt="" width="520" height="213" /></a><p class="wp-caption-text">Figure 3: Surnames comprising given percentages of the Southwark Population</p></div>
<p style="text-align: left">As we can see from the above figure, only 56 names account for 10% of the Southwark population, even though there are 88,124 distinct surnames in Southwark in total. Again there is a characteristic decay to the curve.</p>
<p style="text-align: left">Finally, let us consider just the characteristics of the long tail of the distribution:</p>
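<p style="text-align: left">The count of names needed to cover a given share of the population can be sketched like this. The function name <code>names_needed</code> and the toy data are assumptions for illustration, not part of the original analysis.</p>

```python
from collections import Counter

def names_needed(surnames, share):
    """Smallest number of most-common surnames that covers at least
    `share` (a fraction between 0 and 1) of the people in `surnames`."""
    counts = sorted(Counter(surnames).values(), reverse=True)
    target = share * len(surnames)
    covered = 0
    for n_names, count in enumerate(counts, start=1):
        covered += count
        if covered >= target:
            return n_names
    return len(counts)

# Hypothetical toy register: one surname per registered person.
toy_register = ["Smith", "Smith", "Smith", "Jones", "Jones", "Ali", "Khan", "Nguyen"]
print(names_needed(toy_register, 0.5))
```

<p style="text-align: left">Calling the function for 10%, 20%, 30% and so on gives the kind of table plotted in figure 3.</p>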
<div id="attachment_244" class="wp-caption aligncenter" style="width: 560px"><a href="http://danieljlewis.org/files/2010/03/longtailsurnamegraphtable.png"><img class="size-full wp-image-244" title="longtailsurnamegraphtable" src="http://danieljlewis.org/files/2010/03/longtailsurnamegraphtable.png" alt="" width="550" height="221" /></a><p class="wp-caption-text">Figure 4: Focus on the long-tail - percentage population for given surname frequencies.</p></div>
<p style="text-align: left">From figure 4 it is clear that almost 25% of the Southwark population have a surname that is shared by fewer than 11 people; indeed, just over 16% of the Southwark population have a surname that is unique within the Southwark patient register. The shape of the curve in figure 4 demonstrates the effect of the long tail seen in figure 2.</p>
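<p style="text-align: left">The share of people in the long tail can be computed along these lines. Again this is a sketch under stated assumptions: <code>tail_share</code> and the toy register are hypothetical names, and the real register is not reproduced here.</p>

```python
from collections import Counter

def tail_share(surnames, max_bearers):
    """Fraction of people whose surname is carried by at most
    `max_bearers` people in the register."""
    counts = Counter(surnames)
    people_in_tail = sum(c for c in counts.values() if max_bearers >= c)
    return people_in_tail / len(surnames)

# Hypothetical toy register: one surname per registered person.
toy_register = ["Smith", "Smith", "Smith", "Jones", "Jones", "Ali", "Khan", "Nguyen"]

print(tail_share(toy_register, 1))  # share of people with a unique surname
print(tail_share(toy_register, 2))  # share with a surname held by at most 2 people
```

<p style="text-align: left">On the real register, <code>max_bearers</code> values of 1 and 10 would correspond to the &#8216;unique&#8217; and &#8216;fewer than 11 people&#8217; figures quoted above.</p>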
<p style="text-align: left">For more information on surnames research, check out <a title="JC's Blog" href="http://spatialanalysis.co.uk/" target="_blank">James Cheshire&#8217;s blog</a>, his <a title="JC's WP 149" href="http://www.casa.ucl.ac.uk/publications/workingPaperDetail.asp?ID=149" target="_blank">working paper</a>, or <a title="Pablo's WP 116" href="http://www.casa.ucl.ac.uk/publications/workingPaperDetail.asp?ID=116" target="_blank">Pablo Mateos&#8217; working paper</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://danieljlewis.org/2010/03/03/analysis-of-surnames-from-southwark-patient-register/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

