Social Science & Medicine

Volume 67, Issue 6, September 2008, Pages 891-899
Small area inequalities in health: Are we underestimating them?

https://doi.org/10.1016/j.socscimed.2008.05.028

Abstract

Spatially aggregated data are frequently used for official statistics and by researchers investigating the contextual determinants of health. Results of reporting and analysis vary according to the choice of areal unit. This is the well-known Modifiable Areal Unit Problem or MAUP. Its implication for the monitoring and understanding of area inequalities in health has received little empirical attention in the public health literature. Health differences will likely be smallest across arbitrarily chosen areas, whereas boundaries acknowledging the physical and social geography should indicate greater differences between areas. Here we use three methods to define area boundaries and compare the extent of health inequalities across each, drawing on data from the London boroughs of Camden and Islington.

Irrespective of the boundary definition used, between-area inequalities in obesity, alcohol intake, smoking, walking and self-rated health were small compared with inequalities between individuals. There was a tendency for slightly larger estimated inequalities across areas defined by socioeconomic homogeneity compared with other definitions, but differences between methods were very small in magnitude.

Existing studies predominantly use area boundaries that are based on administrative boundaries. Although these have little theoretical basis for the study of neighbourhood inequalities in health, our findings indicate that alternative definitions of the neighbourhood boundaries have no substantive effect on the estimates of those inequalities. Based on these findings, we can have greater confidence in the results of numerous studies which have used administrative boundaries to define the neighbourhood.

Introduction

Spatially aggregated data are frequently used in order to present results, both because individual level data are not available and to protect respondent confidentiality. Examples can be seen for many official statistics, including statistics for health, crime, employment and housing, which are commonly used by public service organizations for monitoring and planning purposes. Aggregated data are also being used by an increasing number of researchers investigating area differences in morbidity and mortality and the contextual determinants of health (Davey Smith et al., 2001, Diez-Roux et al., 1997, Duncan et al., 1995, Fryer et al., 1979, Kawachi et al., 1999, Krieger, 1992, Shelton et al., 2007). However, the geography of the areal unit has important implications for reporting and analysis. Results can be influenced by the number of areas used and the choice of boundaries which define those areas. This has been termed the modifiable areal unit problem (MAUP) and is a well-known phenomenon (Downey, 2006, Flowerdew et al., 2001, Openshaw, 1984, Unwin, 1996). However, its implication for the monitoring and understanding of area inequalities in health has received little empirical attention in the public health literature. Here we outline the problem and assess the utility of different methods of data aggregation for health inequalities research and practice. We aim to create new areas that better represent the area level variation in factors that might influence health. We then compare the extent of inequalities in health and health-related behaviour using these newly designed areas with standard administrative boundaries.

Numerous studies from several countries show that mortality and morbidity vary across small areas, reviewed by Pickett and Pearl (2001) and Riva, Gauvin, and Barnett (2007) for example. Multilevel techniques have been used to analyse individual and area level data to partition the variation in health into that which occurs between areas and that which occurs between individuals living in the same area (Goldstein, 1995). Using a multilevel approach based on administrative boundaries, many studies have found statistically significant variations in health across areas (Duncan et al., 1995, Gould and Jones, 1996, Shouls et al., 1996, Stafford et al., 2001, Wiggins et al., 2002). However, area variations in health are generally found to be small in magnitude compared to those found between individuals (Oakes, 2004, Pickett and Pearl, 2001). From this, some researchers have inferred that the important determinants of health operate at an individual level and that area level determinants and residential segregation are relatively less important. However, as estimates of between-area variation depend on the way that area boundaries are defined, the debate remains as to whether the effects are fully captured by the areas and aggregations used. There are two aspects to the modifiable areal unit problem: the scale problem (results are influenced by the number of areas used) and the aggregation or zoning problem (results are influenced by the choice of boundaries which define areas so that, for a fixed number of output zones, different results will be found with different sets of boundaries). The larger the number of areas, the greater the variation between them will be. The absolute magnitude of the variance cannot be compared across different numbers of areas, although the between-area variation as a proportion of the total variation (known as the intra-class correlation coefficient) and other statistics which correct for number of areas can be compared. 
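
The dependence of the intra-class correlation coefficient on the choice of boundaries can be illustrated with a minimal sketch. It uses the one-way analysis-of-variance estimator as a simpler stand-in for a fitted multilevel model, so it is not the method used in the studies cited above; the function name `anova_icc` and the toy data are assumptions made for illustration only.

```python
def anova_icc(groups):
    """Estimate the ICC from a mapping {area: [individual outcomes]}.

    Illustrative one-way ANOVA estimator, not a fitted multilevel model.
    Every area is assumed to contain at least one individual.
    """
    sizes = [len(values) for values in groups.values()]
    n_total, n_areas = sum(sizes), len(groups)
    if n_areas < 2 or n_total <= n_areas:
        raise ValueError("need at least two areas and some within-area replication")

    grand_mean = sum(sum(values) for values in groups.values()) / n_total
    ss_between = ss_within = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        ss_between += len(values) * (mean - grand_mean) ** 2
        ss_within += sum((y - mean) ** 2 for y in values)

    ms_between = ss_between / (n_areas - 1)
    ms_within = ss_within / (n_total - n_areas)
    # Effective area size, which allows for unequal numbers of residents per area.
    n0 = (n_total - sum(s * s for s in sizes) / n_total) / (n_areas - 1)
    # A negative estimate of the between-area variance is truncated at zero.
    var_between = max(0.0, (ms_between - ms_within) / n0)
    total = var_between + ms_within
    return var_between / total if total > 0 else 0.0


# The same six (hypothetical) individuals grouped by two different sets of boundaries.
zoning_a = {"north": [1, 2, 3], "south": [4, 5, 6]}
zoning_b = {"east": [1, 4, 5], "west": [2, 3, 6]}
icc_a = anova_icc(zoning_a)
icc_b = anova_icc(zoning_b)
```

With these toy values the first zoning gives an ICC of roughly 0.81 and the second gives 0, although the individuals and the number of areas are identical. This is the zoning problem in miniature.
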
More difficult to assess is the seriousness of the aggregation problem for inequalities research. As Haynes, Daras, Reading, and Jones (2007) point out, modifiable areal units present a problem only where those units are arbitrary. In relation to health inequalities monitoring and research, administrative boundaries are somewhat arbitrary, having been designed for various reasons but not specifically to assess the extent of health inequalities. Local electoral ward boundaries are subject to annual change in conjunction with local elections and are required to satisfy a number of conditions, including that they should fit within existing district or borough boundaries, and that the ratio of electors per local councillor should as far as possible be the same for all wards within a district. Having met these conditions, further advice is offered (The Electoral Commission, 2002) that boundaries should be easily identifiable, should reflect existing boundaries such as parish councils and should take into account local identity. Thus wards bear some relationship to perceived local geography, although it is clear that the legal requirements for use of wards in local elections are paramount. Even given these constraints, it may be the case that the boundaries used do not reflect homogeneous exposure to differing determinants of particular health outcomes. The challenge here is to create new areas that better represent the area level variation in factors that might influence health outcomes or behaviour using a more theoretical approach (Boyle and Willms, 1999, Curtis and Rees-Jones, 1998, Diez-Roux, 2001, Macintyre et al., 2002). If these alternatively defined areas are more homogeneous on the factors that determine health or if they correspond more closely to residents' experienced or perceived boundaries then they will yield larger health inequalities between areas compared with administrative units. Conversely, they may be equally or less homogeneous or not reflect residents' views and yield the same or smaller inequalities.

Here we focus on small residential areas – neighbourhoods – as providing one physical, social and cultural context for residents' lifestyle choices. “Neighbourhoods” have been variously defined and no single definition has been universally accepted, but the term generally refers to shared spaces possessing similar attributes within which people interact (Galster, 2001). Census boundaries, typically census output area, enumeration district and electoral ward in the UK and census tract and census block in the US, have often been used to define neighbourhoods for quantitative analysis. Whilst most studies have simply used administrative boundaries to define neighbourhoods, some have used alternative approaches.

One alternative approach is automated zone design. Very small areas, typically very small administrative units, are grouped together into larger zones according to set criteria to optimise homogeneity, population size, shape, or some other characteristics (Cockings and Martin, 2005, Haynes et al., 2007). Under automated zone design, a large number of component areas are randomly aggregated into a pre-determined number of larger zones. From this starting position, an iterative procedure is followed in which a component area lying on the boundary of a larger zone is randomly selected and then swapped between neighbouring zones. An objective function, such as homogeneity of the resident population, is re-calculated after each swap, and the swap is retained if the function results have improved. Multiple runs with different random starting positions are usually used in order to optimise the final results. For example, in order to test the association between neighbourhood deprivation and mortality, automated zone design might be employed to build up very small areas into larger zones with maximal internal homogeneity of deprivation. This would increase the statistical power to detect mortality differences across levels of deprivation in neighbourhoods defined by these larger zones.
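
The iterative procedure described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the algorithm of any cited study: component areas are cells on a square grid, the objective function is the within-zone sum of squares of a single hypothetical deprivation score, and the contiguity, shape and population-size checks that real zone design software applies after each swap are omitted. All function and variable names are invented for the example.

```python
import random


def neighbours(cell, size):
    """Rook-adjacent cells of `cell` on a size x size grid."""
    r, c = cell
    steps = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(a, b) for a, b in steps if 0 <= a < size and 0 <= b < size]


def within_zone_ss(zone_of, score):
    """Objective: total within-zone sum of squares (lower = more homogeneous)."""
    members = {}
    for cell, zone in zone_of.items():
        members.setdefault(zone, []).append(score[cell])
    total = 0.0
    for values in members.values():
        mean = sum(values) / len(values)
        total += sum((v - mean) ** 2 for v in values)
    return total


def random_start(size, n_zones, rng):
    """Grow n_zones zones outward from randomly chosen seed cells."""
    cells = [(r, c) for r in range(size) for c in range(size)]
    zone_of = {seed: z for z, seed in enumerate(rng.sample(cells, n_zones))}
    while len(zone_of) < len(cells):
        cell = rng.choice([c for c in cells if c not in zone_of])
        assigned = [n for n in neighbours(cell, size) if n in zone_of]
        if assigned:
            zone_of[cell] = zone_of[rng.choice(assigned)]
    return zone_of


def design_zones(score, size, n_zones, n_iter=2000, seed=0):
    """Randomly swap boundary cells between zones, keeping swaps that improve the objective."""
    rng = random.Random(seed)
    zone_of = random_start(size, n_zones, rng)
    start = best = within_zone_ss(zone_of, score)
    cells = list(zone_of)
    for _ in range(n_iter):
        cell = rng.choice(cells)
        old = zone_of[cell]
        others = sorted({zone_of[n] for n in neighbours(cell, size)} - {old})
        if not others:
            continue  # interior cell: it does not lie on a zone boundary
        if sum(1 for z in zone_of.values() if z == old) == 1:
            continue  # never empty a zone, so the number of zones stays fixed
        zone_of[cell] = rng.choice(others)
        trial = within_zone_ss(zone_of, score)
        if trial < best:
            best = trial          # retain the swap
        else:
            zone_of[cell] = old   # undo the swap
    return zone_of, start, best


# Hypothetical deprivation score that rises from west to east across a 6 x 6 grid.
SIZE = 6
score = {(r, c): float(c) for r in range(SIZE) for c in range(SIZE)}
zones, start_ss, final_ss = design_zones(score, SIZE, n_zones=3)
```

Because a swap is kept only when the objective improves, the final within-zone sum of squares can never exceed the starting value. Calling `design_zones` with several values of `seed` and keeping the partition with the lowest objective mirrors the multiple random starting positions mentioned above.
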

Others have defined neighbourhoods using a combination of local knowledge, maps and cluster analysis of very small area administrative data (Luginaah et al., 2001, Ross et al., 2004, Sampson et al., 1997). This approach aims to define neighbourhoods through local knowledge and mapping of natural boundaries (e.g. rivers and contours) and the man-made landscape (e.g. major roads). Qualitative work suggests that residents use such physical attributes to demarcate their neighbourhood boundaries (Lebel, Pampalon & Villeneuve, 2007) and so neighbourhoods defined in this way should have meaning for local residents.

A “one-size-fits-all” definition of the neighbourhood may be too simplistic. The most appropriate neighbourhood boundary may depend on the epidemiological outcome of interest. A resident is likely to assess one geographical area as providing (or not) opportunities for safe and pleasant walking and another as providing opportunities to buy cigarettes, for example. We hypothesise that boundaries defined by physical attributes will be more relevant for physical activity, especially walking, which is the major contribution to physical activity in the UK, whereas boundaries defined by social homogeneity will be more relevant for smoking behaviour. The effect of different boundary definitions on residents' perceptions of various neighbourhood attributes, including neighbourhood quality, fear of crime and social networks, has recently been examined (Haynes et al., 2007). Whereas perceptions of neighbourhood quality varied quite considerably between neighbourhoods defined by census boundaries (enumeration district) and somewhat less so between neighbourhoods defined using automated zone design, fear of crime and social networks showed much less variation across neighbourhoods irrespective of how they were defined.

Greater attention to boundary definition is important for the study of health inequalities for two reasons. First, current estimates of small area inequalities in health may under-estimate levels of inequality. If the boundaries define areas which are heterogeneous in terms of health and the determinants of health then analysis will not detect the full extent of small area differences. The reduction of inequalities in health between areas is a key UK government priority and specific targets to reduce inequalities in life expectancy by level of area deprivation by 2010 have been announced (Department of Health, 2005). It is therefore important that we know the full extent of inequalities. Second, analytic studies examining the relationship between small area characteristics and the health of residents may underestimate the effect attributable to the area. If the boundaries used do not reflect the boundaries experienced in residents' everyday living then there will be some measurement error in the exposure. This will tend to bias results towards a null finding. On the other hand, administrative boundaries, though chosen for reasons of convenience, may actually do a good job of capturing the extent of variation in health and the determinants of health inequalities. Confirmation of this would enable researchers to place more confidence in the findings of the huge number of existing studies that have used administrative boundaries.

This study uses three different ways of defining neighbourhood – administrative boundaries, mapping of the natural and urban landscape, and automated zone design to optimise social homogeneity – and assesses the extent to which health and behaviour vary between and within those neighbourhoods.

Section snippets

Data and methods

The three methods of defining neighbourhoods are illustrated using data from the London boroughs of Camden and Islington. This location was chosen because individual level data on health and health behaviour were available for a sufficiently large sample and because the authors have extensive knowledge of it, having lived and worked there for several years. The location makes a good case study of the methods because it has a range of socio-economically deprived and affluent neighbourhoods

Results

The demographic and health characteristics of the study sample are summarised in Table 1. Mean body mass index was 25.6 kg/m². Mean alcohol intake was higher for men than for women. Although alcohol intake was highly skewed, with several participants reporting zero intake in the last week, the findings were very similar when intake was analysed as heavy versus no/light drinking and so results for the continuous model are presented here. Similarly, the findings were essentially the same when

Discussion

This study set out to demonstrate that estimates of area inequalities in health are determined by how those area boundaries are defined. Using three different approaches to defining area boundaries, we obtained three different estimates of the extent of area inequalities in health. Although the statistical significance of the estimated variation between neighbourhoods depended on the definition of neighbourhood under consideration, the magnitude of the estimates was essentially very similar

References (47)

  • M. Stafford et al. Characteristics of individuals and characteristics of areas: investigating their influence on health in the Whitehall II study. Health and Place (2001)
  • S.V. Subramanian. The relevance of multilevel statistical methods for identifying causal neighbourhood effects. Social Science & Medicine (2004)
  • R.D. Wiggins et al. Place and personal circumstances in a multilevel account of women's long-term illness. Social Science & Medicine (2002)
  • S. Alvanides et al. Designing your own geographies
  • J.B. Bingenheimer et al. Statistical and substantive inferences in public health: issues in the application of multilevel models. Annual Reviews of Public Health (2004)
  • M.H. Boyle et al. Place effects for areas defined by administrative boundaries. American Journal of Epidemiology (1999)
  • T. Butler et al. Super-gentrification in Barnsbury, London: globalization and gentrifying global elites at the neighbourhood level. Transactions of the Institute of British Geographers (2006)
  • C.J. Coulton et al. Mapping residents' perceptions of neighborhood boundaries: a methodological note. American Journal of Community Psychology (2001)
  • S. Curtis et al. Is there a place for geography in the analysis of health inequality? Sociology of Health & Illness (1998)
  • G. Davey Smith et al. Area based measures of social and economic circumstances: cause specific mortality patterns depend on the choice of index. Journal of Epidemiology & Community Health (2001)
  • Department of Health. Tackling health inequalities: Status report on the programme for action (2005)
  • A.V. Diez-Roux. Investigating neighborhood and area effects on health. American Journal of Public Health (2001)
  • A.V. Diez-Roux et al. Neighborhood environments and coronary heart disease: a multilevel analysis. American Journal of Epidemiology (1997)
    Mai Stafford is funded by a NIHR fellowship. The core Health Survey for England 1999 was funded by the Department of Health and the local boost by the Camden and Islington Primary Care Trust.