Making P² People & Places


Each demographic data provider builds its dataset in its own way, and the choice of source data makes each dataset better suited to some users than others. To help you decide whether P² is the best demographic data for your organisation, this page gives an overview of how we compile this rich dataset, how we process it into the final product, and how we keep it accurate and up to date.

Data Sources

We built P² People & Places using census data and further augmented this with Living Costs and Food Survey (LCF) and British Population Survey (BPS) data. We used census data for several key reasons:

  • The census is the UK’s only compulsory survey. This means that we can include details about every household in the UK.
  • The census is comprehensive. There are thousands of variables providing millions of data points.
  • The census is updated every 10 years, and the census offices release population updates every year. For such a large and rich dataset, the census is reliable and provides an excellent measure of change over time.

Census data contains population statistics and age breakdowns. It also explores social factors like employment, housing types, overcrowding, household composition and the life-stage of occupants. Census data is released for each Output Area and provides the accuracy required to make informed decisions. Though highly detailed, P² does not disclose personal details that can be used to ascertain the identity of individuals. This dataset is supplied by the government census offices.

P² People & Places variables were carefully selected from census data to show differences between the demographic types, and ultimately provide scope for detailed segmentation. The variables include:

    •  Age and demographics
    •  Health
    •  Household structure
    •  Ethnicity, migration and religion
    •  Household type, tenure, overcrowding and size
    •  Qualifications and skills
    •  Employment
    •  Car ownership and commuting


On top of this core census data, we overlay data from the Living Costs and Food Survey (LCF) and British Population Survey (BPS) to enhance the P² descriptions and paint a vivid picture of the lifestyles and characteristics of UK consumers.


Living Costs and Food Survey (LCF)

The LCF is a survey undertaken by the Office for National Statistics (ONS). Face-to-face interviews are carried out to assess topics such as spending patterns and the cost of living. LCF information accurately reflects weekly household budgets across the country. It is the most significant consumer survey in the UK and produces a modelled dataset based on about 12,000 household interviews each year. LCF is an important source of economic and social data for government and other research agencies. It also contributes to RPI (Retail Prices Index) calculations.

We worked with the ONS to code P² Trees and Branches with selected LCF variables. The LCF variables enhance the descriptions and provide details about household spending plus gross and disposable income. Covering eating and drinking, holidays, recreation, savings, tobacco use and more, LCF data provides valuable insight into lifestyles, attitudes and health.

British Population Survey (BPS)

The BPS records household income as well as shopping and spending habits. BPS data is collected using face-to-face interviews, with between 6,000 and 8,000 interviews completed each month. In total, more than 80,000 interviews are carried out each year.

The interview panel is structured to give a representative sample of the UK population. BPS covers a wide range of topics such as household income, preferred supermarkets, internet access and internet use.

There is also a subset of the BPS which describes the effectiveness and use of marketing channels and the industries that use them. A selection of BPS variables was chosen to enhance the descriptions of P² People & Places classifications, rather than duplicate existing LCF or census data.

Processing the data

Before starting the population analysis, we recognised that the UK could be divided into groups based on economic potential rather than simply location. We identified 9 groups, each bringing together different areas that display similar characteristics such as commercial nature and industrial history. Our 9 groups of economic potential provide a better base for analysis than geography alone, which can miss important details. These groups, built using census data and government economic statistics, reduced the risk that the analysis would be adversely influenced by strong geodemographic factors (e.g. the many affluent households with no cars in London should not affect how affluent rural households in Hampshire are classified). The regional grouping represented the dynamic nature of a changing Britain, proving better than the snapshot approach of some other classifications.

Millions of data points analysed to uncover the detail you need

Hundreds of census variables for millions of households naturally generate a lot of data points, even when aggregated to Output Areas. We analysed those data points with our partners from the Department of Geography and Planning at the University of Liverpool: Emeritus Professor Peter Batey (BSc MCD PhD CGeog FRTPI FRSA AcSS) and Dr. Peter Brown (BEng MCD PhD MRTPI FCILT). They have a deep understanding of, and experience in, reducing huge datasets to manageable and comprehensible proportions.

From the initial datasets we identified a number of key variables that were the most useful. After testing these variables, we discarded those where there was little variation or duplicate characteristics. We were left with around 90 groups that were common across the different UK study regions.
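As an illustration only, the screening step described above could look like the following Python sketch. The variable names, sample values and thresholds (`min_variance`, `max_abs_corr`) are assumptions made for the example, not those used to build P².

```python
from math import sqrt

def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def screen_variables(table, min_variance=1.0, max_abs_corr=0.95):
    """Keep variables that vary between areas and do not duplicate a kept one."""
    kept = []
    for name, values in table.items():
        if variance(values) < min_variance:
            continue  # little variation between areas
        if any(abs(pearson(values, table[k])) > max_abs_corr for k in kept):
            continue  # near-duplicate of a variable we already keep
        kept.append(name)
    return kept

# Hypothetical percentages for five Output Areas (illustrative values only).
census = {
    "pct_aged_65_plus": [12.0, 30.0, 18.0, 25.0, 9.0],
    "pct_no_car": [40.0, 10.0, 22.0, 35.0, 55.0],
    "pct_one_plus_cars": [60.0, 90.0, 78.0, 65.0, 45.0],    # mirrors pct_no_car
    "pct_usual_residents": [99.8, 99.9, 99.7, 99.8, 99.9],  # barely varies
}

print(screen_variables(census))
```

In this toy data the car-ownership variable is dropped because it mirrors `pct_no_car`, and the usual-residents variable is dropped because it hardly changes from one area to the next.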

Using Principal Component Analysis we then built a set of 'clusters' (groups of output areas that share similar characteristics). There is no unique "best" set of clusters - we were looking for those that captured the maximum distinction between clusters while having the widest application areas. The quality of the clusters was tested for:

  • Compactness: Do they cover too large a spread of variables?
  • Size: Do they have enough cases to make them relevant?
  • Distribution: Is the geographical distribution sensible?
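The three checks above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the two-dimensional scores, cluster labels, region names and the `min_size` threshold are all hypothetical, and compactness is measured here as the mean distance of members from their cluster centre, which is one common choice rather than a documented P² method.

```python
from math import dist

def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def cluster_report(points, labels, regions, min_size=3):
    """Summarise compactness, size and regional spread for each cluster."""
    members = {}
    for point, label, region in zip(points, labels, regions):
        members.setdefault(label, []).append((point, region))
    report = {}
    for label, rows in members.items():
        pts = [p for p, _ in rows]
        centre = centroid(pts)
        report[label] = {
            # Compactness: mean distance of members from the cluster centre.
            "compactness": sum(dist(p, centre) for p in pts) / len(pts),
            # Size: number of areas, and whether that meets the threshold.
            "size": len(pts),
            "large_enough": len(pts) >= min_size,
            # Distribution: which regions the cluster appears in.
            "regions": sorted({r for _, r in rows}),
        }
    return report

# Hypothetical standardised scores for six Output Areas.
points = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
labels = ["A", "A", "A", "A", "B", "B"]
regions = ["North", "South", "North", "London", "London", "London"]

report = cluster_report(points, labels, regions)
for label, summary in report.items():
    print(label, summary)
```

A cluster that fails the size check, or that appears in only one region, would be a candidate for merging with a similar neighbour.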

When the initial set of clusters was completed we continued the Principal Component Analysis and created the working classification. The clusters were grouped together to provide 180 groups which were merged into 44 Branches and 16 Trees. The Branches and Trees conveniently describe people’s lifestyles, attitudes and behaviours. Once the Trees and Branches were finalised we attached labels to them to make them more intuitive to work with.
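To show how fine-grained clusters can be rolled up into broader levels, here is a minimal Python sketch that repeatedly merges the two closest groups until a target number remains. It is an assumption for illustration: the page does not say which merging rule was used, and the five toy centroids stand in for the real groups, Branches and Trees.

```python
from math import dist

def merge_to(centroids, target):
    """Greedily merge the two closest groups until `target` groups remain."""
    # Each group holds (centroid, number of original clusters, member names).
    groups = [(tuple(c), 1, [name]) for name, c in centroids.items()]
    while len(groups) > target:
        # Find the pair of groups whose centroids are closest together.
        i, j = min(
            ((a, b) for a in range(len(groups)) for b in range(a + 1, len(groups))),
            key=lambda ab: dist(groups[ab[0]][0], groups[ab[1]][0]),
        )
        (ci, wi, mi), (cj, wj, mj) = groups[i], groups[j]
        # The merged centroid is the average weighted by clusters merged so far.
        merged = tuple((x * wi + y * wj) / (wi + wj) for x, y in zip(ci, cj))
        groups[i] = (merged, wi + wj, mi + mj)
        del groups[j]  # j > i, so index i is unaffected
    return [members for _, _, members in groups]

# Hypothetical cluster centroids in a two-dimensional score space.
clusters = {
    "c1": (0.0, 0.0),
    "c2": (0.0, 1.0),
    "c3": (5.0, 5.0),
    "c4": (5.0, 6.5),
    "c5": (10.0, 0.0),
}

branches = merge_to(clusters, 3)  # finer level
trees = merge_to(clusters, 2)     # broader level
print(branches)
print(trees)
```

Because the broader level simply continues the merging from the finer one, every finer group sits inside exactly one broader group, which is the property a Tree and Branch hierarchy needs.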

Finally, we cross-referenced the classification with LCF and BPS data to provide a rich, well-rounded portrait of each Tree and Branch.
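One common way to express this kind of cross-referencing is an index in which 100 represents the overall average. The Python sketch below assumes that approach; the Branch names and weekly spend figures are invented for the example and are not LCF or BPS results.

```python
from collections import defaultdict

def index_by_branch(responses):
    """Return each Branch's mean value as an index where 100 = overall mean."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for branch, value in responses:
        totals[branch] += value
        counts[branch] += 1
    overall = sum(totals.values()) / sum(counts.values())
    return {b: round(100 * (totals[b] / counts[b]) / overall) for b in totals}

# Hypothetical survey responses: (Branch, weekly grocery spend in pounds).
survey = [
    ("Branch 1", 90.0), ("Branch 1", 110.0),
    ("Branch 2", 50.0), ("Branch 2", 70.0),
    ("Branch 3", 80.0), ("Branch 3", 80.0),
]

indices = index_by_branch(survey)
print(indices)
```

An index above 100 means a Branch spends more than average on that item, and an index below 100 means it spends less.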

Keeping the data up-to-date

Using Census data ensures that P² is current. We also include yearly population updates from the ONS and Royal Mail postcode updates to keep P² as up-to-date as possible.

Read how P² has changed, reflecting changes in the UK population.


For more details or a demonstration
Call 01904 701020.