I’ve spent a chunk of time recently address-geocoding the Southwark PCT patient register to Ordnance Survey Address Layer 2 data. This means that I can start identifying and (later) classifying households, which will allow me to ask questions about how different households approach healthcare. More broadly, it gives me an insight into the demographic character of Southwark.
The data actually extends past the Southwark boundary, as people in Lambeth, Lewisham, Bromley and Croydon also use Southwark primary healthcare services (GPs) to some extent. This means that although Southwark’s population is only c.300,000, the dataset I’m using covers just over 340,000 people.
There is naturally some uncertainty in the data, and it comes from both of the datasets used. On the one hand, addresses recorded in the Southwark patient register are not all complete: for example, the register sometimes fails to record which particular subdivision of a house someone lives in, or which flat in a larger block of social housing. On the other hand, the Address Layer 2 data, although very rich, is not necessarily complete either. This could be due to the presence of unacknowledged subdivisions in residential housing, and although most social housing estates seem well documented, some commercial developments are not registered beyond the building level. Similarly, there are a number of instances of social institutions, such as the Salvation Army and St Mungo’s, or marinas and dormitories, having a single registered address for a high number of residents. This may have the effect of skewing the data slightly. With this in mind, I created the following graph from the dataset, plotting the number of households against the number of inhabitants per household:
This shows that there is still a major trend for single-person households, but equally that around a quarter of all households are co-habited. The long tail in the graph (which I have truncated here) is caused by a few special cases, some examples of which are acknowledged in the previous paragraph. The average household size of 3.10 is higher than the UK average household size of 2.36 reported after the 2001 census; at the time, the borough of Newham in East London had the highest household occupancy rate, at 2.64. Of course, there are any number of reasons why these data are not comparable. To start with, the census took place 8 years before the Southwark dataset was created. The uncertainty in the Southwark dataset is also higher, as it was not created with the primary purpose of being able to locate every patient (more often than not, patients go to the doctor and not vice versa), whereas the census is distributed to every household. The Southwark dataset also includes particularly transient communities which are missed by the census, such as the homeless, who don’t have a fixed address (and hence may be using shelter or hostel addresses) but still require medical treatment at times.
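As a rough illustration of the counting behind a graph like this, here is a minimal Python sketch. It assumes each patient record has already been matched to a single address identifier; the `records` list, the identifiers and the function names are all made up for the example and are not taken from the Southwark register or from Address Layer 2.

```python
from collections import Counter

# Hypothetical geocoded records: (patient_id, address_id) pairs.
# In this toy data, address "A" has 2 residents, "B" has 1, "C" has 3, "D" has 1.
records = [
    ("p1", "A"), ("p2", "A"),
    ("p3", "B"),
    ("p4", "C"), ("p5", "C"), ("p6", "C"),
    ("p7", "D"),
]


def household_size_distribution(records):
    """Map household size -> number of households of that size."""
    # First count residents at each address (one address = one household here).
    residents_per_address = Counter(address_id for _, address_id in records)
    # Then count how many addresses share each resident count.
    return Counter(residents_per_address.values())


def mean_household_size(distribution):
    """Average residents per household, from the size distribution."""
    households = sum(distribution.values())
    residents = sum(size * count for size, count in distribution.items())
    return residents / households


def share_multi_person(distribution):
    """Fraction of households with more than one resident."""
    households = sum(distribution.values())
    multi = sum(count for size, count in distribution.items() if size > 1)
    return multi / households


distribution = household_size_distribution(records)
for size in sorted(distribution):
    print(size, distribution[size])
print(mean_household_size(distribution))
print(share_multi_person(distribution))
```

Treating one address as one household is the simplifying assumption here, and it is exactly where institutional addresses such as hostels would inflate the long tail and pull the mean upwards.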
Nevertheless, an interesting first look. The next steps will involve evaluating and validating the dataset to the best of my ability and then moving on to look at ways of examining and classifying household structure.