As an extension of the previous post, I created MDS mappings for Inner London using the OAC variables. I am using the ONS definition of Inner London, as opposed to the commonly accepted definition; thus Inner London includes Haringey, Newham and the City, but excludes Greenwich.
MDS works by scaling a pairwise distance matrix, so the number of entries in the matrix grows as the square of the number of observations and rapidly becomes very large. There are 9,759 OAs in Inner London, requiring a distance matrix of 95,238,081 cells. I would like to compute the MDS mappings for Greater London as well; however, there are 24,140 OAs, giving a distance matrix of 582,739,600 cells. That is over half a billion (American billion) entries, which cannot be dimensioned on a 32-bit machine. I’ve looked into 64-bit Python, but have yet to find a solution that doesn’t completely fill my computer’s memory and create an enormous pagefile.
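To put rough numbers on the memory problem, here is a small sketch of the arithmetic. It assumes each cell is stored as an 8-byte float (the actual dtype I used may differ), and the helper names are purely illustrative. It also shows the saving from storing only the upper triangle of the symmetric matrix, although an MDS routine may still need the full square form.

```python
def matrix_cells(n_obs):
    """Cells in a full square n x n distance matrix."""
    return n_obs * n_obs


def condensed_cells(n_obs):
    """Cells when only the pairs with i < j are stored (symmetric matrix, zero diagonal)."""
    return n_obs * (n_obs - 1) // 2


def gib(n_cells, bytes_per_cell=8):
    """Size in GiB, assuming 8-byte (float64) cells by default."""
    return n_cells * bytes_per_cell / 2**30


for name, n in [("Inner London", 9759), ("Greater London", 24140)]:
    full = matrix_cells(n)
    half = condensed_cells(n)
    print(f"{name}: full {full:,} cells ({gib(full):.2f} GiB), "
          f"condensed {half:,} cells ({gib(half):.2f} GiB)")
```

Under the 8-byte assumption the full Greater London matrix needs more than 4 GiB, which is beyond what a 32-bit process can address at all.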
In the previous example I used Canberra distance, but found that in the larger data space of Inner London I was having issues with very small fractions skewing the outcome, so I calculated the distance matrix using Bray-Curtis distance from SciPy’s spatial.distance library. The formula for Bray-Curtis is as follows (you’ll note it is a data normalisation method too):
$$d_{ij} = \frac{\sum_k |x_{ik} - x_{jk}|}{\sum_k |x_{ik} + x_{jk}|}$$
where $k$ indexes the variables for the pair of observations $x_i$ and $x_j$, and $d_{ij}$ is the corresponding entry of the resulting distance matrix.
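As a minimal sketch of why very small fractions matter, the snippet below compares the two metrics using `pdist` and `squareform` from `scipy.spatial.distance`. The three rows are invented values, not OAC data, and the variable names are only for illustration.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Toy stand-in for the OAC variables: rows are output areas, columns are variables.
# These values are invented for illustration only.
X = np.array([
    [0.001, 10.0, 20.0],   # area 0
    [0.003, 10.0, 20.0],   # area 1: differs from area 0 only in a tiny fraction
    [0.001, 30.0, 20.0],   # area 2: differs from area 0 in a large variable
])

# pdist returns the condensed (upper-triangle) form; squareform expands it.
bray = squareform(pdist(X, metric="braycurtis"))
canb = squareform(pdist(X, metric="canberra"))

print("Bray-Curtis 0-1:", bray[0, 1], " 0-2:", bray[0, 2])
print("Canberra    0-1:", canb[0, 1], " 0-2:", canb[0, 2])
```

Canberra normalises each variable separately, so the difference between 0.001 and 0.003 counts exactly as much as the difference between 10 and 30. Bray-Curtis normalises by the total across all variables, so the tiny fraction contributes almost nothing.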
The mappings were, as before, produced in greyscale and RGB.
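For readers who want to try something similar, here is a rough sketch of one way to turn a distance matrix into greyscale and RGB values. It makes several assumptions that are not spelled out in this post: it uses classical (Torgerson) MDS, which may not be the variant used for the maps; it takes a one-dimensional solution for greyscale and a three-dimensional one for RGB; and it rescales each dimension to the range 0 to 1. The data is random and the function names are hypothetical.

```python
import numpy as np


def classical_mds(D, n_components):
    """Classical (Torgerson) MDS of a square distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centred squared distances
    eigvals, eigvecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    # Non-Euclidean distances can give small negative eigenvalues; clip them.
    lam = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(lam)


def to_unit_range(coords):
    """Rescale each column to [0, 1] so it can be used as a colour channel."""
    lo = coords.min(axis=0)
    span = coords.max(axis=0) - lo
    span[span == 0] = 1.0                      # avoid dividing by zero
    return (coords - lo) / span


# Invented data: 50 "areas" with 5 non-negative variables each.
rng = np.random.default_rng(0)
X = rng.random((50, 5))

# Bray-Curtis distances written out with NumPy broadcasting.
num = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
den = (X[:, None, :] + X[None, :, :]).sum(axis=2)
D = num / den

grey = to_unit_range(classical_mds(D, 1))[:, 0]   # one value per area
rgb = to_unit_range(classical_mds(D, 3))          # one (r, g, b) triple per area
print(grey[:3])
print(rgb[:3])
```

Each row of `rgb` can then be used as the fill colour for the corresponding area in whatever mapping tool is to hand.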
I feel that the representations produced are quite effective in demonstrating the wide mix of social environments in Inner London. The greyscale is particularly good at picking up the acknowledged patterns of deprivation, notably east/west and north/south. The colour representation does this too, but with an additional layer of complexity that seems to give a more nuanced reading of social stratification in Inner London, with specific colour groups apparently marking out notional neighbourhoods, complementing the reading of London as a ‘city of villages’.
Acknowledgement
Census data is Crown Copyright 2010 from CasWeb; boundaries are Crown Copyright 2010 from UKBorders, an Edina/JISC supplied service.


