Small area inequalities in health: Are we underestimating them?☆
Introduction
Spatially aggregated data are frequently used in order to present results, both because individual level data are not available and to protect respondent confidentiality. Examples can be seen for many official statistics, including statistics for health, crime, employment and housing, which are commonly used by public service organizations for monitoring and planning purposes. Aggregated data are also being used by an increasing number of researchers investigating area differences in morbidity and mortality and the contextual determinants of health (Davey Smith et al., 2001, Diez-Roux et al., 1997, Duncan et al., 1995, Fryer et al., 1979, Kawachi et al., 1999, Krieger, 1992, Shelton et al., 2007). However, the geography of the areal unit has important implications for reporting and analysis. Results can be influenced by the number of areas used and the choice of boundaries which define those areas. This has been termed the modifiable areal unit problem (MAUP) and is a well-known phenomenon (Downey, 2006, Flowerdew et al., 2001, Openshaw, 1984, Unwin, 1996). However, its implication for the monitoring and understanding of area inequalities in health has received little empirical attention in the public health literature. Here we outline the problem and assess the utility of different methods of data aggregation for health inequalities research and practice. We aim to create new areas that better represent the area level variation in factors that might influence health. We then compare the extent of inequalities in health and health-related behaviour using these newly designed areas with standard administrative boundaries.
Numerous studies from several countries show that mortality and morbidity vary across small areas, as reviewed, for example, by Pickett and Pearl (2001) and Riva, Gauvin, and Barnett (2007). Multilevel techniques have been used to analyse individual and area level data to partition the variation in health into that which occurs between areas and that which occurs between individuals living in the same area (Goldstein, 1995). Using a multilevel approach based on administrative boundaries, many studies have found statistically significant variations in health across areas (Duncan et al., 1995, Gould and Jones, 1996, Shouls et al., 1996, Stafford et al., 2001, Wiggins et al., 2002). However, area variations in health are generally found to be small in magnitude compared to those found between individuals (Oakes, 2004, Pickett and Pearl, 2001). From this, some researchers have inferred that the important determinants of health operate at an individual level and that area level determinants and residential segregation are relatively less important. However, because estimates of between-area variation depend on the way that area boundaries are defined, it remains open to debate whether area effects are fully captured by the areas and aggregations used. There are two aspects to the modifiable areal unit problem: the scale problem (results are influenced by the number of areas used) and the aggregation or zoning problem (results are influenced by the choice of boundaries which define areas so that, for a fixed number of output zones, different results will be found with different sets of boundaries). The larger the number of areas, the greater the variation between them will be. The absolute magnitude of the variance therefore cannot be compared across different numbers of areas, although the between-area variation as a proportion of the total variation (known as the intra-class correlation coefficient) and other statistics which correct for the number of areas can be compared.
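As a worked illustration of the intra-class correlation coefficient, consider a two-level random-intercept model for a continuous outcome measured on individual i living in area j. The notation below is an assumption introduced here for illustration and is not taken from the study's own model specification.

```latex
y_{ij} = \beta_0 + u_j + e_{ij}, \qquad
u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)

\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
```

Here \sigma_u^2 is the between-area variance and \sigma_e^2 the between-individual (within-area) variance. With hypothetical values \sigma_u^2 = 0.05 and \sigma_e^2 = 0.95, \rho = 0.05, meaning that 5% of the total variation lies between areas. Because \rho is a proportion, it can be compared across boundary definitions in a way that the raw between-area variance cannot.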
More difficult to assess is the seriousness of the aggregation problem for inequalities research. As Haynes, Daras, Reading, and Jones (2007) point out, modifiable areal units present a problem only where those units are arbitrary. In relation to health inequalities monitoring and research, administrative boundaries are somewhat arbitrary, having been designed for various reasons but not specifically to assess the extent of health inequalities. Local electoral ward boundaries are subject to annual change in conjunction with local elections and are required to satisfy a number of conditions, including that they should fit within existing district or borough boundaries, and that the ratio of electors per local councillor should as far as possible be the same for all wards within a district. Having met these conditions, further advice is offered (The Electoral Commission, 2002) that boundaries should be easily identifiable, should reflect existing boundaries such as parish councils and should take into account local identity. Thus wards bear some relationship to perceived local geography, although it is clear that the legal requirements for use of wards in local elections are paramount. Even given these constraints, it may be the case that the boundaries used do not reflect homogeneous exposure to differing determinants of particular health outcomes. The challenge here is to create new areas that better represent the area level variation in factors that might influence health outcomes or behaviour using a more theoretical approach (Boyle and Willms, 1999, Curtis and Rees-Jones, 1998, Diez-Roux, 2001, Macintyre et al., 2002). If these alternatively defined areas are more homogeneous on the factors that determine health, or if they correspond more closely to residents' experienced or perceived boundaries, then they will yield larger health inequalities between areas compared with administrative units. Conversely, they may be equally or less homogeneous, or may not reflect residents' views, and so yield the same or smaller inequalities.
Here we focus on small residential areas – neighbourhoods – as providing one physical, social and cultural context for residents' lifestyle choices. “Neighbourhoods” have been variously defined and no single definition has been universally accepted, but the term generally refers to shared spaces possessing similar attributes within which people interact (Galster, 2001). Census boundaries, typically census output areas, enumeration districts and electoral wards in the UK and census tracts and census blocks in the US, have often been used to define neighbourhoods for quantitative analysis. Whilst most studies have simply used administrative boundaries to define neighbourhoods, some have used alternative approaches.
One alternative approach is automated zone design. Very small areas, typically very small administrative units, are grouped together into larger zones according to set criteria to optimise homogeneity, population size, shape, or some other characteristics (Cockings and Martin, 2005, Haynes et al., 2007). Under automated zone design, a large number of component areas are randomly aggregated into a pre-determined number of larger zones. From this starting position, an iterative procedure is followed in which a component area lying on the boundary of a larger zone is randomly selected and then swapped between neighbouring zones. An objective function, such as homogeneity of the resident population, is re-calculated after each swap, and the swap is retained if the function results have improved. Multiple runs with different random starting positions are usually used in order to optimise the final results. For example, in order to test the association between neighbourhood deprivation and mortality, automated zone design might be employed to build up very small areas into larger zones with maximal internal homogeneity of deprivation. This would increase the statistical power to detect mortality differences across levels of deprivation in neighbourhoods defined by these larger zones.
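The iterative procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm used in the study or in any published zone design software: the component areas are assumed to lie on a regular grid (so that neighbours are easy to find), the deprivation scores are made up, the objective is the within-zone sum of squares, and constraints that real zone design tools usually enforce (contiguity, population size, shape) are omitted. All function and variable names are hypothetical.

```python
import random


def within_zone_ss(values, zone_of, n_zones):
    """Objective: total within-zone sum of squares (lower = more homogeneous)."""
    total = 0.0
    for z in range(n_zones):
        members = [values[a] for a in range(len(values)) if zone_of[a] == z]
        if members:
            mean = sum(members) / len(members)
            total += sum((v - mean) ** 2 for v in members)
    return total


def grid_neighbours(a, rows, cols):
    """Component areas sharing an edge with area a on a rows x cols grid."""
    r, c = divmod(a, cols)
    out = []
    if r > 0:
        out.append(a - cols)
    if r < rows - 1:
        out.append(a + cols)
    if c > 0:
        out.append(a - 1)
    if c < cols - 1:
        out.append(a + 1)
    return out


def zone_design(values, rows, cols, n_zones, n_iter=2000, seed=0):
    rng = random.Random(seed)
    n = rows * cols
    # Random starting aggregation; every zone receives at least one area.
    # (Simplification: the starting zones are not forced to be contiguous.)
    zone_of = [i % n_zones for i in range(n)]
    rng.shuffle(zone_of)
    start = best = within_zone_ss(values, zone_of, n_zones)
    for _ in range(n_iter):
        a = rng.randrange(n)
        other = [zone_of[b] for b in grid_neighbours(a, rows, cols)
                 if zone_of[b] != zone_of[a]]
        if not other:
            continue  # area is not on a zone boundary
        old = zone_of[a]
        if zone_of.count(old) == 1:
            continue  # never empty a zone
        zone_of[a] = rng.choice(other)  # move the area to a neighbouring zone
        score = within_zone_ss(values, zone_of, n_zones)
        if score < best:
            best = score       # objective improved: retain the swap
        else:
            zone_of[a] = old   # otherwise undo it
    return zone_of, start, best


# Made-up deprivation scores for 36 component areas on a 6 x 6 grid.
data_rng = random.Random(42)
values = [data_rng.uniform(0, 100) for _ in range(36)]

# Several runs from different random starting positions; keep the best.
runs = [zone_design(values, 6, 6, 4, seed=s) for s in range(5)]
zones, start_ss, final_ss = min(runs, key=lambda run: run[2])
```

Accepting only improving swaps means each run can stop at a local optimum, which is one reason for repeating the procedure from several random starting positions.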
Others have defined neighbourhoods using a combination of local knowledge, maps and cluster analysis of very small area administrative data (Luginaah et al., 2001, Ross et al., 2004, Sampson et al., 1997). This approach aims to define neighbourhoods through local knowledge and mapping of natural boundaries (e.g. rivers and contours) and the man-made landscape (e.g. major roads). Qualitative work suggests that residents use such physical attributes to demarcate their neighbourhood boundaries (Lebel, Pampalon & Villeneuve, 2007) and so neighbourhoods defined in this way should have meaning for local residents.
A “one-size-fits-all” definition of the neighbourhood may be too simplistic. The most appropriate neighbourhood boundary may depend on the epidemiological outcome of interest. A resident is likely to assess one geographical area as providing (or not) opportunities for safe and pleasant walking and another as providing opportunities to buy cigarettes, for example. We hypothesise that boundaries defined by physical attributes will be more relevant for physical activity, especially walking, which is the major contributor to physical activity in the UK, whereas boundaries defined by social homogeneity will be more relevant for smoking behaviour. The effect of different boundary definitions on residents' perceptions of various neighbourhood attributes, including neighbourhood quality, fear of crime and social networks, has recently been examined (Haynes et al., 2007). Whereas perceptions of neighbourhood quality varied quite considerably between neighbourhoods defined by census boundaries (enumeration district) and somewhat less so between neighbourhoods defined using automated zone design, fear of crime and social networks showed much less variation across neighbourhoods irrespective of how they were defined.
Greater attention to boundary definition is important for the study of health inequalities for two reasons. First, current estimates of small area inequalities in health may underestimate levels of inequality. If the boundaries define areas which are heterogeneous in terms of health and the determinants of health then analysis will not detect the full extent of small area differences. The reduction of inequalities in health between areas is a key UK government priority and specific targets to reduce inequalities in life expectancy by level of area deprivation by 2010 have been announced (Department of Health, 2005). It is therefore important that we know the full extent of inequalities. Second, analytic studies examining the relationship between small area characteristics and the health of residents may underestimate the effect attributable to the area. If the boundaries used do not reflect the boundaries experienced in residents' everyday lives then there will be some measurement error in the exposure. This will tend to bias results towards a null finding. On the other hand, administrative boundaries, though chosen for reasons of convenience, may actually do a good job of capturing the extent of variation in health and the determinants of health inequalities. Confirmation of this would enable researchers to place more confidence in the findings of the huge number of existing studies that have used administrative boundaries.
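The bias towards the null from exposure measurement error can be illustrated with a small simulation. The sketch below assumes classical, non-differential error (random noise added to the true exposure, unrelated to the outcome), which is one simple way of representing a mismatch between the boundaries used and those residents experience; the data, effect size and variable names are all made up for illustration.

```python
import random


def ols_slope(x, y):
    """Least-squares slope from regressing y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)
n = 20000
true_beta = 0.5  # assumed true effect of area exposure on the outcome

# True exposure (variance 1) and an outcome that depends on it.
true_exposure = [rng.gauss(0, 1) for _ in range(n)]
outcome = [true_beta * x + rng.gauss(0, 1) for x in true_exposure]

# Exposure as measured with the "wrong" boundaries: truth plus noise (variance 1).
measured_exposure = [x + rng.gauss(0, 1) for x in true_exposure]

slope_true = ols_slope(true_exposure, outcome)
slope_measured = ols_slope(measured_exposure, outcome)

# Under classical error the expected attenuation factor is
# var(true) / (var(true) + var(error)) = 1 / (1 + 1) = 0.5.
print(f"slope using true exposure:     {slope_true:.3f}")
print(f"slope using measured exposure: {slope_measured:.3f}")
```

In this setup the slope estimated from the mismeasured exposure is pulled towards zero by roughly the attenuation factor, while the slope estimated from the true exposure stays close to the assumed effect.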
This study uses three different ways of defining neighbourhood – administrative boundaries, mapping of the natural and urban landscape, and automated zone design to optimise social homogeneity – and assesses the extent to which health and behaviour vary between and within those neighbourhoods.
Data and methods
The three methods of defining neighbourhoods are illustrated using data from the London boroughs of Camden and Islington. This location was chosen because individual level data on health and health behaviour were available for a sufficiently large sample and because the authors have extensive knowledge of it, having lived and worked there for several years. The location makes a good case study of the methods because it has a range of socio-economically deprived and affluent neighbourhoods
Results
The demographic and health characteristics of the study sample are summarised in Table 1. Mean body mass index was 25.6 kg/m². Mean alcohol intake was higher for men than for women. Although alcohol intake was highly skewed, with several participants reporting zero intake in the last week, the findings were very similar when intake was analysed as heavy versus no/light drinking and so results for the continuous model are presented here. Similarly, the findings were essentially the same when
Discussion
This study set out to demonstrate that estimates of area inequalities in health are determined by how those area boundaries are defined. Using three different approaches to defining area boundaries, we obtained three different estimates of the extent of area inequalities in health. Although the statistical significance of the estimated variation between neighbourhoods depended on the definition of neighbourhood under consideration, the magnitude of the estimates was essentially very similar
References (47)
- et al. Zone design for environment and health studies using pre-aggregated data. Social Science & Medicine (2005)
- Estimating neighborhood health effects: the challenges of causal inference in a complex world. Social Science & Medicine (2004)
- et al. Analysing perceived limiting long term illness using UK census microdata. Social Science & Medicine (1996)
- et al. Modifiable neighbourhood units, zone design and residents' perceptions. Health and Place (2007)
- et al. Place effects on health: how can we conceptualise, operationalise and measure them? Social Science & Medicine (2002)
- Social determinants of health inequalities. Lancet (2005)
- The (mis)estimation of neighbourhood effects: causal inference for a practicable social epidemiology. Social Science & Medicine (2004)
- et al. Local neighbourhood and mental health: evidence from the UK. Social Science & Medicine (2005)
- et al. Neighbourhood influences on health in Montreal, Canada. Social Science & Medicine (2004)
- et al. Defining regions for locality health care planning: a multidimensional approach. Social Science & Medicine (2005)
- Characteristics of individuals and characteristics of areas: investigating their influence on health in the Whitehall II study. Health and Place
- The relevance of multilevel statistical methods for identifying causal neighbourhood effects. Social Science & Medicine
- Place and personal circumstances in a multilevel account of women's long-term illness. Social Science & Medicine
- Designing your own geographies
- Statistical and substantive inferences in public health: issues in the application of multilevel models. Annual Reviews of Public Health
- Place effects for areas defined by administrative boundaries. American Journal of Epidemiology
- Super-gentrification in Barnsbury, London: globalization and gentrifying global elites at the neighbourhood level. Transactions of the Institute of British Geographers
- Mapping residents' perceptions of neighborhood boundaries: a methodological note. American Journal of Community Psychology
- Is there a place for geography in the analysis of health inequality? Sociology of Health & Illness
- Area based measures of social and economic circumstances: cause specific mortality patterns depend on the choice of index. Journal of Epidemiology & Community Health
- Tackling health inequalities: Status report on the programme for action
- Investigating neighborhood and area effects on health. American Journal of Public Health
- Neighborhood environments and coronary heart disease: a multilevel analysis. American Journal of Epidemiology
☆ Mai Stafford is funded by an NIHR fellowship. The core Health Survey for England 1999 was funded by the Department of Health and the local boost by the Camden and Islington Primary Care Trust.