2 The framework

This chapter introduces the notation and basic concepts that will be used throughout the book. The chapter is organized into five sections. Section 2.1 starts with a review of unidimensional poverty measurement, with particular attention to the well-known FGT measures (Foster, Greer, and Thorbecke 1984), because many methods presented in Chapter 3, as well as the Alkire and Foster (2007, 2011a) measures presented in Chapter 5, are based on FGT indices. Section 2.2 introduces the notation and basic concepts for multidimensional poverty measurement that will be used in subsequent chapters. Section 2.3 delves into the issue of indicators’ scales of measurement, an aspect that is often overlooked when discussing methods for multidimensional analysis and that is central to this book. Section 2.4 addresses comparability across people and dimensions. Finally, Section 2.5 presents in detail the different properties that have been proposed in axiomatic approaches to multidimensional poverty measurement. Such properties enable the analyst to understand the ethical principles embodied in a measure and to be aware of the direction of change the measure will exhibit under certain transformations.
2.1 Review of Unidimensional Measurement and FGT Measures

The measurement of multidimensional poverty builds upon a long tradition of unidimensional poverty measurement. Because both approaches are technically closely linked, the measurement of poverty in a unidimensional way can be seen as a special case of multidimensional poverty measurement. This section introduces the basic concepts of unidimensional poverty measurement through the lens of the multidimensional framework, and so serves as a springboard for the later work.

The measurement of poverty requires a reference population, such as all people in a country. We refer to the reference population under study as a society. We assume that any society consists of at least one observation or unit of analysis. This unit varies depending on the measurement exercise. For example, the unit of analysis is a child if one is measuring child poverty, it is an elderly person if one is measuring poverty among the elderly, and it is a person or (sometimes due to data constraints) the household for measures covering the whole population. For simplicity, unless otherwise indicated, we refer to the unit of analysis within a society as a person (see Chapter 6 and Chapter 7).

We denote the number of persons within a society by $n$, such that $n \in \mathbb{N}$, where $\mathbb{N}$ is the set of positive integers. Note that unless otherwise specified, $n$ refers to the total population of a society and not a sample of it. Assume that poverty is to be assessed using $d$ dimensions, such that $d \in \mathbb{N}$. We refer to the performance of a person in a dimension as an achievement in a very general way, and we assume that achievements in each dimension can be represented by a non-negative real-valued indicator. We denote the achievement of person $i$ in dimension $j$ by $x_{ij} \in \mathbb{R}_+$ for all $i = 1, \ldots, n$ and $j = 1, \ldots, d$, where $\mathbb{R}_+$ is the set of non-negative real numbers, which is a proper subset of the set of real numbers $\mathbb{R}$.
Subsequently, we denote the set of strictly positive real numbers by $\mathbb{R}_{++}$. Throughout this book, we allow the population size of a society to vary, which allows comparisons of societies with different populations. When we seek to permit comparability of poverty estimates across different populations, we assume $d$ to denote a fixed set (and number) of dimensions. The achievements of all persons within a society are denoted by an $n \times d$-dimensional achievement matrix $X$, which looks as follows:

$$X = \begin{bmatrix} x_{11} & \cdots & x_{1d} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nd} \end{bmatrix}$$

We denote the set of all possible matrices of size $n \times d$ by $\mathcal{X}_n$ and the set of all possible achievement matrices by $\mathcal{X}$, such that $\mathcal{X} = \bigcup_{n} \mathcal{X}_n$. If $X \in \mathcal{X}_n$, then matrix $X$ contains achievements for $n$ persons in $d$ dimensions. Unless specified otherwise, whenever we refer to matrix $X$, we assume $X \in \mathcal{X}$. The achievements of any person $i$ in all $d$ dimensions, which is row $i$ of matrix $X$, are represented by the $d$-dimensional vector $x_{i\cdot}$ for all $i = 1, \ldots, n$. The achievements in any dimension $j$ for all $n$ persons, which is column $j$ of matrix $X$, are represented by the $n$-dimensional vector $x_{\cdot j}$ for all $j = 1, \ldots, d$.

In the unidimensional context, the $d$ dimensions considered in matrix $X$, which are typically assumed to be cardinal, can be meaningfully combined into a well-defined overall achievement or resource variable for each person $i$, which is denoted by $x_i$. One possibility, from a welfarist approach, would be to construct each person’s welfare from her vector of achievements using a utility function $u$, so that $x_i = u(x_{i1}, \ldots, x_{id})$. Another possibility is that each dimension refers to a different source of income (labour income, rents, family allowances, etc.). Then, one can construct the total income level for each person as the sum of the income level obtained from each source, that is, $x_i = \sum_{j=1}^{d} x_{ij}$. Alternatively, each dimension can be measured in the quantity of a good or service that can be acquired in a market. Then, one can construct the total consumption expenditure level for each person as the sum of the quantities acquired valued at market prices, that is, $x_i = \sum_{j=1}^{d} p_j x_{ij}$, where $p_j$, the price of commodity $j$, is used as its weight.
In any of these three cases, the achievement matrix is reduced to a vector containing the welfare level or the resource variables of all persons. In other words, the distinctive feature of the unidimensional approach is not that it necessarily considers only one dimension, but rather that it maps multiple dimensions of poverty assessment into a single dimension using a common unit of account.
2.1.1 Identification of the Income Poor

Since Sen (1976), the measurement of poverty has been conceptualised as following two main steps: identification of who the poor are and aggregation of the information about poverty across society. In unidimensional space, the identification of who is poor is relatively straightforward: the poor are those whose overall achievement or resource variable falls below the poverty line $z_U$, where the subscript $U$ simply signals that this is a poverty line used in the unidimensional space. Analogous to the construction of the resource variable, the poverty line can be obtained by aggregating the minimum quantities or achievements $z_j$ considered necessary in each dimension. It is assumed that such quantities or levels are positive values, that is, $z_j \in \mathbb{R}_{++}$. These minimum levels are collected in the $d$-dimensional vector $z = (z_1, \ldots, z_d)$. If the overall achievement is the level of utility, a utility poverty line needs to be set as $z_U = u(z_1, \ldots, z_d)$. On the other hand, when the overall achievement is total income or total consumption expenditure, the poverty line is given by the estimated cost of the basic consumption basket, $z_U = \sum_{j=1}^{d} p_j z_j$, or some increment thereof.

Then, given the person’s overall resource value or utility value and the poverty line, we can define the identification function as follows: $\rho_U(x_i; z_U) = 1$ identifies person $i$ as poor if $x_i < z_U$, that is, whenever the resource or utility variable is below the poverty line, and $\rho_U(x_i; z_U) = 0$ identifies person $i$ as non-poor if $x_i \geq z_U$. We denote the number of unidimensionally poor persons in a society by $q_U$ and the set of poor persons in a society by $Z_U$, such that $Z_U = \{i \mid \rho_U(x_i; z_U) = 1\}$.
2.1.2 Aggregation of the Income Poor

In terms of aggregation, a variety of indices have been proposed. Among them, the Foster, Greer, and Thorbecke (1984), or FGT, family of indices has been the most widely used class of poverty measures among international organizations such as the World Bank and UN agencies, national governments, researchers, and practitioners. For simplicity, we assume the unidimensional variable $x_i$ to be income. Building on previous poverty indices, including Sen (1976) and Thon (1979), the FGT family of indices is based on the normalized income gap (called the ‘poverty gap’ in the unidimensional poverty literature), which is defined as follows. Given the income distribution $x$, one can obtain a censored income distribution $\tilde{x}$ by replacing the values above the poverty line by the poverty line value itself and leaving the other values unchanged. Formally, $\tilde{x}_i = x_i$ if $x_i < z_U$ and $\tilde{x}_i = z_U$ if $x_i \geq z_U$. Then, the normalized income gap is given by:

$$g_i = \frac{z_U - \tilde{x}_i}{z_U}. \qquad (2.1)$$

The normalized income gap of person $i$ is her income shortfall expressed as a share of the poverty line. The income gap of those who are non-poor is equal to 0. The individual income gaps can be collected in an $n$-dimensional vector $g^{\alpha} = (g_1^{\alpha}, \ldots, g_n^{\alpha})$. Each element $g_i^{\alpha}$ is the normalized poverty gap raised to the power $\alpha \geq 0$, and it can be interpreted as a measure of individual poverty, where $\alpha$ is a ‘poverty aversion’ parameter. The class of FGT measures is defined as $P_{\alpha} = \frac{1}{n} \sum_{i=1}^{n} g_i^{\alpha}$; thus $P_{\alpha}$ can be interpreted as the average poverty in the population. The FGT measures can also be expressed in a more synthetic way as $P_{\alpha} = \mu(g^{\alpha})$, where $\mu$ is the mean operator and thus $\mu(g^{\alpha})$ denotes the average or mean of the elements of vector $g^{\alpha}$. This presentation of the FGT indices is useful in understanding the AF class (Alkire and Foster 2011a).

Within the FGT measures, three measures, associated with three different values of the parameter $\alpha$, have been used most frequently. The deprivation vector $g^0$, for $\alpha = 0$, replaces each income below the poverty line with 1 and replaces non-poor incomes with 0.
Its associated poverty measure $P_0$ is called the headcount ratio, or the mean of the deprivation vector. It indicates the proportion of people who are poor, also frequently called the incidence of poverty. The normalized gap vector $g^1$, for $\alpha = 1$, replaces each poor person’s income with the normalized income gap and assigns 0 to the rest. Its associated measure $P_1$, the poverty gap measure, reflects the average depth of poverty across the society. The squared gap vector $g^2$, for $\alpha = 2$, replaces each poor person’s income with the squared normalized income gap and assigns 0 to the rest. Its associated measure, the squared gap or distribution-sensitive FGT, is $P_2$; it emphasizes the conditions of the poorest of the poor, as Box 2.1 explains.

The FGT measures satisfy a number of properties, including a subgroup decomposability property that views overall poverty as a population-share weighted average of poverty levels in the different population subgroups. As noted by Sen (1976), the headcount ratio violates two intuitive principles: (1) monotonicity: if a poor person’s resource level falls, poverty should rise, and yet the headcount ratio remains unchanged; (2) transfer: poverty should fall if two poor persons’ resource levels are brought closer together by a progressive transfer between them, and yet the headcount ratio may either remain unchanged or it can even go down. The poverty gap measure $P_1$ satisfies monotonicity, but not the transfer principle; the $P_2$ measure satisfies both monotonicity and the transfer principle.
Box 2.1. A numerical example of the FGT measuresA simple example  can clarify the method and these axioms, and will also prove useful in linking the Alkire and Foster methodology (fully described in Chapter 5) to its roots in the FGT class of poverty measures. Consider four persons whose incomes are summarized by vector and the poverty line is . The headcount ratio : Consider first the case of . Each gap is replaced by a value of 1 if the person is poor and by a value of 0 if non-poor. The deprivation vector is given by: , indicating that the second and third persons in this distribution are poor. The mean of this vector—the measure—is one half: , indicating that 50% of the population in this distribution is poor. Undoubtedly, it provides very useful information. However, as noted by Watts (1968) and Sen (1976), the headcount ratio does not provide information on the depth of poverty nor on its distribution among the poor. For example, if the third person became poorer, experiencing a decrease in her income so that the income distribution became , the measure would still be one half; that is, it violates monotonicity. Also, if there was a progressive transfer between the two poor persons, so that the distribution was , the measure would not change, violating the transfer principle. This has policy implications. If this was the official poverty measure, a government interested in maximising the impact of resources on poverty reduction would have an incentive to allocate resources to the least poor, that is, those who were closest to the poverty line, leaving the lives of the poorest of the poor unchanged. The poverty gap (or FGT-1): Here Each gap is raised to the power , giving the proportion in which each poor person falls short of the poverty line and if the person is non-poor. The normalized gap vector is given by . The measure is the mean of this vector. 
, indicates that the society would require an average of 20% of the poverty line for each person in the society to remove poverty. In fact, $4 is the overall amount needed in this case to lift both poor persons above the poverty line. Unlike the headcount ratio , the measure is sensitive to the depth of poverty and satisfies monotonicity. If the income of the third person decreased so we had the corresponding normalized gap vector would be , so . Clearly, . Indeed, all measures with satisfy monotonicity. However, a transfer to an extremely destitute person from a less poor person would not change , since the decrease in one gap would be exactly compensated by the increase in the other. By being sensitive to the depth of poverty (i.e. satisfying monotonicity), the measure does make policy makers want to decrease the average depth of poverty as well as reduce the headcount. But because of its insensitivity to the distribution among the poor, does not provide incentives to target the very poorest, whereas the FGT-2 measure does. The Squared Poverty Gap (or FGT-2): When we set , each normalized gap is squared or raised to the power . The squared gap vector in this case is given by: . By squaring the gaps, bigger gaps receive higher weight. Note for example that while the gap of the second person () is three times bigger than the gap of the third person (), the squared gap of the second person () is nine times bigger than the gap of the third person (). The mean of the vector—the measure—is . The measure is sensitive to the depth of poverty: if the income of the third person decreases one unit such that , the squared gap vector becomes , increasing the aggregate poverty level to ). It is also sensitive to the distribution among the poor: if there is a transfer of $1 from the third person to the second one, so becomes , the squared gap vector becomes , decreasing the aggregate poverty level to . 
Squaring the gaps has the effect of emphasising the poorest poor and providing incentives to policy makers to address their situation urgently. All $P_{\alpha}$ measures with $\alpha > 1$ satisfy the transfer property.
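The arithmetic in Box 2.1 can be checked with a few lines of code. The following is a minimal sketch, not a reference implementation: the function name `fgt` is our own, and the income vector is an assumption. Poor incomes of 2 and 4 with a poverty line of 5 are consistent with the figures quoted in the box (a poverty gap of 20% and a total shortfall of $4), whereas the two non-poor incomes (7 and 8) are placeholders that do not affect any FGT value.

```python
def fgt(incomes, z, alpha):
    """FGT index: mean of normalized income gaps raised to the power alpha."""
    if z <= 0 or alpha < 0:
        raise ValueError("z must be positive and alpha non-negative")
    total = 0.0
    for x in incomes:
        if x < z:                  # the person is identified as poor
            gap = (z - x) / z      # normalized income gap
            total += gap ** alpha  # alpha = 0 counts each poor person as 1
        # non-poor persons contribute 0 for every alpha
    return total / len(incomes)


# Hypothetical data: the non-poor incomes 7 and 8 are placeholders.
incomes = [7, 2, 4, 8]
z = 5

p0 = fgt(incomes, z, 0)  # headcount ratio
p1 = fgt(incomes, z, 1)  # poverty gap
p2 = fgt(incomes, z, 2)  # squared poverty gap

# Third person becomes poorer by one unit (monotonicity check).
poorer = [7, 2, 3, 8]
# One unit moves from the third to the second person (transfer check).
transfer = [7, 3, 3, 8]
```

Under these assumed incomes, the headcount ratio is the same for `incomes`, `poorer`, and `transfer`; the poverty gap rises for `poorer` but is unchanged for `transfer`; only the squared gap responds to both changes.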
2.2 Notation and Preliminaries for Multidimensional Poverty Measurement

We now extend the notation to the multidimensional context. We represent achievements as an $n \times d$-dimensional achievement matrix $X$, as in the unidimensional framework described in Section 2.1. We make two practical assumptions for convenience. We assume that the achievement of person $i$ in dimension $j$ can be represented by a non-negative real number, such that $x_{ij} \in \mathbb{R}_+$ for all $i = 1, \ldots, n$ and $j = 1, \ldots, d$. Also, we assume that higher achievements are preferred to lower ones.

In a multidimensional setting, in contrast to a unidimensional context, the considered achievements may not be combinable in a meaningful way into some overall variable. In fact, each dimension can be of a different nature. For example, one may consider a person’s income, level of schooling, health status, and occupation, which do not have any common unit of account. As in the unidimensional case, we allow the population size of a society to vary, and we assume $d$ to denote a fixed set (and number) of dimensions. We denote the set of all possible matrices of size $n \times d$ by $\mathcal{X}_n$ and the set of all possible achievement matrices by $\mathcal{X}$, such that $\mathcal{X} = \bigcup_{n} \mathcal{X}_n$. If $X \in \mathcal{X}_n$, then matrix $X$ contains achievements for $n$ persons and a fixed set of $d$ dimensions. Unless specified otherwise, whenever we refer to matrix $X$, we assume $X \in \mathcal{X}$. The achievements of any person $i$ in all $d$ dimensions, which is row $i$ of matrix $X$, are represented by the $d$-dimensional vector $x_{i\cdot}$ for all $i = 1, \ldots, n$. The achievements in any dimension $j$ for all $n$ persons, which is column $j$ of matrix $X$, are represented by the $n$-dimensional vector $x_{\cdot j}$ for all $j = 1, \ldots, d$.

In multidimensional analysis, each dimension may be assigned a weight or deprivation value based on its relative importance or priority. We denote the relative weight attached to dimension $j$ by $w_j$, such that $w_j > 0$ for all $j = 1, \ldots, d$. The weights attached to all $d$ dimensions are collected in a vector $w = (w_1, \ldots, w_d)$.
For convenience we may restrict the weights such that they sum to the total number of considered dimensions, that is, $\sum_{j=1}^{d} w_j = d$. Alternatively, weights may be normalized; in other words, the weights sum to one: $\sum_{j=1}^{d} w_j = 1$.
2.2.1 Identifying Deprivations

A common first step in multidimensional poverty assessment in several of the methodologies reviewed in Chapter 3, as well as in the Alkire and Foster (2007, 2011a) methodology, requires defining a threshold in each dimension. Such a threshold is the minimum level someone needs to achieve in that dimension in order to be non-deprived. It is called the dimensional deprivation cutoff. When a person’s achievement is strictly below the cutoff, she is considered deprived. We denote the deprivation cutoff in dimension $j$ by $z_j$; the deprivation cutoffs for all dimensions are collected in the $d$-dimensional vector $z = (z_1, \ldots, z_d)$. We denote all possible $d$-dimensional deprivation cutoff vectors by $z \in \mathbb{R}^d_{++}$. Any person $i$ is considered deprived in dimension $j$ if and only if $x_{ij} < z_j$.

For several measures reviewed in Chapter 3, and for the AF method, it will prove useful to express the data in terms of deprivations rather than achievements. From the achievement matrix $X$ and the vector of deprivation cutoffs $z$, we obtain a deprivation matrix $g^0$ (analogous to the deprivation vector in the unidimensional context) whose typical element $g^0_{ij} = 1$ whenever $x_{ij} < z_j$ and $g^0_{ij} = 0$ otherwise, for all $j = 1, \ldots, d$ and for all $i = 1, \ldots, n$. In other words, if person $i$ is deprived in dimension $j$, then the person is assigned a deprivation status value of 1, and 0 otherwise. Thus, matrix $g^0$ represents the deprivation status of all $n$ persons in all $d$ dimensions in matrix $X$. Vector $g^0_{i\cdot}$ represents the deprivation status of person $i$ in all dimensions, and vector $g^0_{\cdot j}$ represents the deprivation status of all persons in dimension $j$.

From the matrix $g^0$ one can construct a deprivation score $c_i$ for each person $i$ such that $c_i = \sum_{j=1}^{d} w_j g^0_{ij}$. In words, $c_i$ denotes the sum of weighted deprivations suffered by person $i$. In the particular case in which weights are equal and sum to the number of dimensions, the score is simply the number of deprivations or deprivation counts that the person experiences.
Whenever weights are unequal but sum to the number of dimensions, person $i$’s deprivation score $c_i$ is defined as the sum of her weighted deprivation counts. The deprivation scores are collected in an $n$-dimensional column vector $c = (c_1, \ldots, c_n)$.

On certain occasions, it will be useful to use the deprivation-cutoff-censored achievement matrix $\tilde{X}$, which is obtained from the corresponding achievement matrix $X$ in $\mathcal{X}$ by replacing the non-deprived achievements by the corresponding deprivation cutoff and leaving the rest unchanged. We denote the $ij$th element of $\tilde{X}$ by $\tilde{x}_{ij}$. Then, formally, $\tilde{x}_{ij} = x_{ij}$ if $x_{ij} < z_j$ and $\tilde{x}_{ij} = z_j$ otherwise. In this way, all achievements greater than or equal to the deprivation cutoffs are ignored in the censored achievement matrix.

When data are cardinally meaningful, $x_{ij} \in \mathbb{R}_+$ for all $i$ and all $j$, and $z_j \in \mathbb{R}_{++}$, in other words, when all the achievements take non-negative values and the deprivation cutoffs take strictly positive values, one can construct dimensional gaps or shortfalls from the censored achievement matrix as:

$$g_{ij} = \frac{z_j - \tilde{x}_{ij}}{z_j}. \qquad (2.2)$$

Each $g_{ij}$, or normalized gap, expresses the shortfall of person $i$ in dimension $j$ as a share of its deprivation cutoff. Naturally, the gaps of those whose achievement is at or above the corresponding dimensional deprivation cutoff are equal to 0. Generalizing the above, the individual normalized gaps can be collected in an $n \times d$-dimensional matrix $g^{\alpha}$, where each element $g^{\alpha}_{ij}$ is the normalized gap defined in (2.2) raised to the power $\alpha$; such normalized gaps can be interpreted as a measure of individual deprivation in dimension $j$. When $\alpha = 0$, we have the deprivation matrix $g^0$ already defined. When $\alpha = 1$, we have the matrix of normalized gaps $g^1$, and when $\alpha = 2$, we have the matrix of squared gaps $g^2$. Analogous to the FGT measures, $\alpha$ is a deprivation aversion parameter.
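The constructions of this section (deprivation matrix, deprivation scores, and normalized gaps) can be sketched in code. This is only an illustration under stated assumptions: the function names are our own, the three-person, three-dimension achievement matrix and the cutoffs are hypothetical, and all three indicators are assumed to be cardinal so that gaps are meaningful.

```python
def deprivation_matrix(X, z):
    """Entry is 1 where the achievement is strictly below the cutoff, else 0."""
    return [[1 if x < zj else 0 for x, zj in zip(row, z)] for row in X]


def deprivation_scores(g0, w):
    """Weighted sum of deprivations for each person (one score per row)."""
    return [sum(wj * g for wj, g in zip(w, row)) for row in g0]


def gap_matrix(X, z, alpha=1):
    """Normalized gaps raised to the power alpha; non-deprived entries are 0."""
    result = []
    for row in X:
        out = []
        for x, zj in zip(row, z):
            if x < zj:
                out.append(((zj - x) / zj) ** alpha)
            else:
                out.append(0.0)  # achievements at or above the cutoff are censored
        result.append(out)
    return result


# Hypothetical achievements: rows are persons, columns are dimensions.
X = [[4, 10, 1],
     [8, 6, 0],
     [2, 12, 1]]
z = [5, 8, 1]   # hypothetical deprivation cutoffs, all strictly positive
w = [1, 1, 1]   # equal weights summing to the number of dimensions

g0 = deprivation_matrix(X, z)
c = deprivation_scores(g0, w)
g1 = gap_matrix(X, z, alpha=1)
```

With equal weights summing to the number of dimensions, each entry of `c` is simply that person's count of deprivations, and `gap_matrix(X, z, alpha=0)` reproduces the deprivation matrix.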
2.2.2 Identification and Aggregation in the Multidimensional CaseSen’s (1976) steps of identification of the poor and aggregation also apply to the multidimensional case. It is clear that the identification of who is poor in the unidimensional case is relatively straightforward. The poverty line dichotomizes the population into the sets of poor and non-poor. In other words, in the unidimensional case, a person is poor if she is deprived. However, in the multidimensional context, the identification of the poor is more complex: the terms ‘deprived’ and ‘poor’ are no longer synonymous. A person who is deprived in any particular dimension may not necessarily be considered poor. An identification method, with an associated identification function, is used to define who is poor. We denote the identification function by , such that identifies person as poor and identifies person as non-poor. Analogous to the unidimensional case, we denote the number of multidimensionally poor people in a society by and the set of poor persons in a society by , such that . It could be the case that the identification method is based on some ‘exogenous’ variable, in that it is a variable not included in achievement matrix For example, the exogenous variable could be being beneficiary of some government programme or living in a specific geographic area. One may also define an identification method based on one particular dimension of matrix . One may consider the corresponding normative cutoff to identify the person as poor, in which case the function is , or one may consider a relative cutoff identifying as poor anyone who is below the median or mean value of the distribution, in which case the function is . Alternatively, identification may be based on the whole set of achievements not necessarily considering dimensional deprivation cutoffs but rather the relative position of each person on the aggregate distribution . 
There are many different ways of identifying the poor in the multidimensional context. A particularly prevalent set of methods consider the person’s vector of achievements and corresponding deprivation cutoffs, such that identifies person as poor and identifies person as non-poor. Within this specification of the identification function, at least two approaches can be followed. An approach closely approximating unidimensional poverty is the ‘aggregate achievement approach’, which consists of applying an aggregation function to the achievements across dimensions for each person to obtain an overall achievement value. The same aggregation function is also applied to the dimensional deprivation cutoffs to obtain an aggregate poverty line. As in the unidimensional case, a person is identified as poor when her overall achievement is below the aggregate poverty line. Another method, which we refer to as ‘censored achievement approach’, first applies deprivation cutoffs to identify whether a person is deprived or not in each dimension and then identifies a person considering only the deprived achievements. The ‘counting approach’ is one possible censored achievement approach, which identifies the poor according to the number (count) of deprivations they experience. Note that ‘number’ here has a broad meaning as dimensions may be weighted differently. Chapter 4 and the AF method (Ch 5-10) use a counting approach. When the scale of the variables allows, other identification methods could be developed using the information on the deprivation gaps.
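The contrast between the aggregate achievement approach and the counting approach can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the aggregation function is taken to be a weighted sum, the counting rule identifies a person as poor when her weighted deprivation count reaches a hypothetical threshold `k`, and the achievements and cutoffs are invented.

```python
def poor_by_aggregate(row, z, w):
    """Aggregate achievement approach, assuming a weighted-sum aggregator.

    The same aggregator is applied to achievements and to cutoffs.
    """
    overall = sum(wj * x for wj, x in zip(w, row))
    line = sum(wj * zj for wj, zj in zip(w, z))
    return overall < line


def poor_by_counting(row, z, w, k):
    """Counting approach: poor if the weighted deprivation count is at least k.

    The threshold k is a hypothetical parameter of this sketch.
    """
    score = sum(wj for x, zj, wj in zip(row, z, w) if x < zj)
    return score >= k


z = [5, 8]   # hypothetical deprivation cutoffs
w = [1, 1]   # equal weights

person_a = [20, 2]  # very high in dimension 1, deprived in dimension 2
person_b = [4, 8]   # slightly deprived in dimension 1 only

a_aggregate = poor_by_aggregate(person_a, z, w)
a_counting = poor_by_counting(person_a, z, w, k=1)
b_aggregate = poor_by_aggregate(person_b, z, w)
b_counting = poor_by_counting(person_b, z, w, k=2)
```

In this sketch, person A's high first achievement compensates for her deprivation in the second dimension under the aggregate approach, whereas the counting approach, which looks only at deprived achievements, still registers that deprivation.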
2.2.3 The Joint Distribution

Throughout this book we will frequently refer to the joint distribution in contrast to the marginal distribution, and we will also use the expression joint deprivations. The concept of a joint distribution comes from statistics, where it can be represented using a joint cumulative distribution function. The relevance of the joint distribution in multidimensional analysis was articulated by Atkinson and Bourguignon (1982), who observed that multidimensional analysis was intrinsically different because there could be identical dimensional marginal distributions but differing degrees of interdependence between dimensions.

In this book we treat the achievement matrix $X$ as a representation of the joint distribution of achievements. Each row contains the (vector of) achievements of a given person in the different dimensions, and each column contains the (vector of) achievements in a given dimension across the population. From that matrix, considered with deprivation cutoffs, it is possible to obtain the proportion of the population who are simultaneously deprived in different subsets of dimensions. In other words, it is possible to obtain the proportion of people who experience each possible profile of deprivations. This is visually clear in the deprivation matrix $g^0$, which represents the joint distribution of deprivations. The higher-order matrices $g^1$ and $g^2$ offer further information regarding the joint distribution of the depths of deprivations.

The importance of considering the joint distribution of achievements, which in turn enables us to look at joint deprivations, is best understood in contrast with the alternative of looking at the marginal distribution of achievements, and, thus, the marginal deprivations. The marginal distribution is the distribution in one specific dimension without reference to any other dimension. The marginal distribution of dimension $j$ is represented by the column vector $x_{\cdot j}$.
From the marginal distribution of each dimension, it is possible to obtain the proportion of the population deprived with respect to a particular deprivation cutoff. However, by looking at only the marginal distribution, one does not know who is simultaneously deprived in other dimensions. Table 2.1 illustrates the relevance of the joint distribution in the basic case of $n$ persons and two dimensions using a contingency table.
Table 2.1. Joint distribution of deprivation in two dimensions

We denote the number of people deprived and non-deprived in the first dimension by $n_{1+}$ and $n_{0+}$, respectively, whereas the number of people deprived and non-deprived in the second dimension are denoted by $n_{+1}$ and $n_{+0}$, respectively. These values correspond to the marginal distributions of both dimensions as depicted in the final row and final column of the table. They could equivalently be expressed as proportions of the total, in which case, for example, $n_{1+}/n$ would represent the proportion of people deprived (or the headcount ratio) in Dimension 1. The marginal distributions, however, do not provide information about the joint distribution of deprivations, which is described in the four internal cells of the table. In particular, the number of people deprived in both dimensions is denoted by $n_{11}$, the number of people deprived in the first but not the second dimension is denoted by $n_{10}$, and the number of people deprived in the second and not in the first dimension is denoted by $n_{01}$. We know that $n_{11}$ people are deprived in both dimensions and the sum $n_{11} + n_{10} + n_{01}$ is the number of people deprived in at least one dimension. These values correspond to the joint distribution of deprivations.

Consider now the case of four dimensions and four people, to see how valuable information can be added by the joint distribution. Table 2.2 presents the deprivation matrices of two hypothetical distributions, $X$ and $X'$. Such a matrix presents joint distributions of deprivations in a compact way and is used regularly throughout this book.
Table 2.2. Comparison of two joint distributions of deprivations in four dimensions
2.2.4 Marginal Methods

Some of the methods for multidimensional poverty assessment introduced in Chapter 3 can be called marginal methods because they do not use information contained in the joint distribution of achievements. In other words, they ignore all information on links across dimensions. Following Alkire and Foster (2011b), a marginal method assigns the same level of poverty to any two matrices that generate the same marginal distributions. In Table 2.2, a marginal method would assign the same poverty level to distribution $X$ (four deprivations are experienced by one person) and distribution $X'$ (each person experiences exactly one deprivation). That is, it would not be able to show whether the deprivations are spread evenly across the population or whether they are concentrated in an underclass of multiply deprived persons.

Such marginal methods can also be linked to the order of aggregation while constructing poverty indices (Pattanaik et al. 2012). Specifically, a measure can be obtained by first aggregating achievements or deprivations across people within each dimension (column-first) and then aggregating across dimensions, or it can be obtained by first aggregating achievements or deprivations across dimensions for each person (row-first) and then aggregating across people. Only measures that follow the second order of aggregation (i.e., first across dimensions for each person and then across persons) reflect the joint distribution of deprivations (Alkire 2011: 61, Figure 7). Measures that follow the first order of aggregation fall under marginal methods of poverty measurement.
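The difference between the two orders of aggregation can be made concrete with a short sketch. The two deprivation matrices below are hypothetical but follow the verbal description above: in the first, one person experiences all four deprivations; in the second, each person experiences exactly one. The function names and the row-first rule (share of persons with at least `k` deprivations) are assumptions chosen for illustration.

```python
def column_first(g0):
    """Aggregate across people within each dimension, then across dimensions."""
    n = len(g0)
    d = len(g0[0])
    dimension_headcounts = [sum(row[j] for row in g0) / n for j in range(d)]
    return sum(dimension_headcounts) / d


def row_first(g0, k):
    """Aggregate across dimensions for each person, then across people.

    Illustrative rule: share of persons with at least k deprivations.
    """
    counts = [sum(row) for row in g0]
    return sum(1 for c in counts if c >= k) / len(g0)


# Hypothetical deprivation matrices with identical marginal distributions.
concentrated = [[1, 1, 1, 1],
                [0, 0, 0, 0],
                [0, 0, 0, 0],
                [0, 0, 0, 0]]
spread = [[1, 0, 0, 0],
          [0, 1, 0, 0],
          [0, 0, 1, 0],
          [0, 0, 0, 1]]

marginal_concentrated = column_first(concentrated)
marginal_spread = column_first(spread)
joint_concentrated = row_first(concentrated, k=2)
joint_spread = row_first(spread, k=2)
```

The column-first measure cannot tell the two matrices apart, because each dimension has exactly one deprived person in both. The row-first measure differs between them, since it is computed from each person's deprivation profile.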
2.2.5 Useful Matrix and Vector Operations

Throughout the book, we use specific vector and matrix operations. This section introduces the technical notation covering vectors and matrices. We denote the transpose of any matrix $X$ by $X^{tr}$, where $X^{tr}$ has the rows of matrix $X$ converted into columns. Formally, if $X \in \mathcal{X}_n$ and the $ij$th element of $X$ is written $x_{ij}$, then $x^{tr}_{ji} = x_{ij}$, where $x^{tr}_{ji}$ is the $ji$th element of $X^{tr}$ for all $i = 1, \ldots, n$ and $j = 1, \ldots, d$. The same notation applies to a vector, with $x^{tr}$ being the transpose of $x$. Thus, if $x$ is a row vector, $x^{tr}$ is a column vector of the same dimension.

As stated in Section 2.1, the average or mean of the elements of any vector $x$ with $n$ elements is denoted by $\mu(x)$, where $\mu(x) = \frac{1}{n} \sum_{i=1}^{n} x_i$. Similarly, the average or mean of the elements of any matrix $X$ is denoted by $\mu(X)$, where $\mu(X) = \frac{1}{nd} \sum_{i=1}^{n} \sum_{j=1}^{d} x_{ij}$.

Later in the book we use a related expression, the so-called ‘generalized mean of order $\beta$’. Given any vector of achievements $y = (y_1, \ldots, y_d)$, the expression of the weighted generalized mean of order $\beta$ is given by

$$\mu_{\beta}(y; w) = \begin{cases} \left( \sum_{j=1}^{d} w_j y_j^{\beta} \right)^{1/\beta} & \text{for } \beta \neq 0 \\ \prod_{j=1}^{d} y_j^{w_j} & \text{for } \beta = 0 \end{cases} \qquad (2.3)$$

where $w_j > 0$ and $\sum_{j=1}^{d} w_j = 1$. When weights are equal, $w_j = 1/d$ for all $j$. Each generalized mean summarizes distribution $y$ into a single number and can be interpreted as a ‘summary’ measure of well- or ill-being, depending on the meaning of the arguments $y_j$. When $w_j = 1/d$ for all $j$, we write $\mu_{\beta}(y; w)$ simply as $\mu_{\beta}(y)$. When $\beta = 1$, $\mu_{\beta}(y)$ reduces to the arithmetic mean and is simply denoted by $\mu(y)$. When $\beta > 1$, more weight is placed on higher entries and $\mu_{\beta}$ is higher than the arithmetic mean, approaching the maximum entry as $\beta$ tends to $\infty$. For $\beta < 1$, more weight is placed on lower entries, and $\mu_{\beta}$ is lower than the arithmetic mean, approaching the minimum entry as $\beta$ tends to $-\infty$. The case of $\beta = 0$ is known as the geometric mean and $\beta = -1$ as the harmonic mean. Expression (2.3) is also known as a constant elasticity of substitution function, frequently used as a utility function in economics.

When generalized means are computed over achievements, it is natural to restrict the parameter to the range $\beta < 1$, giving a higher weight to lower achievements and penalizing for inequality (Atkinson 1970).
Likewise, when generalized means are computed over deprivations, it is natural to restrict the parameter to the range $\beta > 1$, giving a higher weight to higher deprivations and also penalizing for inequality. Box 2.2 contains an example of generalized means.
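A weighted generalized mean of this kind can be computed with the following sketch. The function name, the restriction to strictly positive entries (imposed here so that the geometric and harmonic cases are always defined), and the two-element example vector are assumptions of the sketch rather than part of the framework.

```python
import math


def generalized_mean(y, beta, w=None):
    """Weighted generalized mean of order beta for strictly positive entries."""
    d = len(y)
    if w is None:
        w = [1.0 / d] * d  # equal weights
    if abs(sum(w) - 1.0) > 1e-9 or any(wj <= 0 for wj in w):
        raise ValueError("weights must be positive and sum to one")
    if any(v <= 0 for v in y):
        raise ValueError("this sketch assumes strictly positive entries")
    if beta == 0:
        # geometric mean, computed as exp of the weighted mean of logs
        return math.exp(sum(wj * math.log(v) for wj, v in zip(w, y)))
    return sum(wj * v ** beta for wj, v in zip(w, y)) ** (1.0 / beta)


y = [2, 8]  # hypothetical achievements

arithmetic = generalized_mean(y, 1)
geometric = generalized_mean(y, 0)
harmonic = generalized_mean(y, -1)
high_order = generalized_mean(y, 50)    # close to the maximum entry
low_order = generalized_mean(y, -50)    # close to the minimum entry
```

For this unequal vector the harmonic mean lies below the geometric mean, which in turn lies below the arithmetic mean, and very high or very low orders approach the maximum and minimum entries respectively.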
Box 2.2. Example of generalized MeansConsider two distributions and with the following distribution of achievements in a particular dimension: and . We first show how to calculate for certain values of and then compare two distributions with a graph where ranges from to . In this example, we assume that all dimensions are equally weighted: . Arithmetic Mean:The arithmetic mean () of distribution is: . Geometric Mean:If , then is the ‘geometric mean’ and by the formula presented in (2.3) can be calculated as: . Harmonic Mean: If , then is the harmonic mean and can be calculated as: . The following graph depicts the values of the of and for different values of . Note that when , given that the two distributions have the same arithmetic mean. In both cases, when , the generalized means are strictly lower than the arithmetic mean, because the incomes are unequally distributed. Note moreover that for this range, because has a more unequal distribution. On the other hand, for , as the higher incomes receive a higher weight.
2.3 Scales of Measurement: Ordinal and Cardinal Data
An important element of the framework in multidimensional poverty measurement relates to the scales of measurement of the indicators used. Scales of measurement are key because they affect the kind of meaningful operations that can be performed with indicators. In fact, as we will observe, certain types of indicators may not allow a number of operations and thus cannot be used to generate certain poverty measures.
What does scale of measurement refer to exactly? Following Roberts (1979) and Sarle (1995), we define a scale of measurement to be a particular way of assigning numbers or symbols to assess certain aspects of the empirical world, such that the relationships of these numbers or symbols replicate or represent certain observed relations between the aspects being measured. There are different classifications of scales of measurement. In this book, we follow the classification introduced by Stevens (1946) and discussed in Roberts (1979). Stevens’ classification is consistent with Sen (1970, 1973), which analysed the implications of scales of measurement for welfare economics, distributional analysis, and poverty measurement, and it has largely stood the test of time.
Stevens’ (1946) classification relies on four key concepts: assignment rules, admissible transformations, permissible statistics, and meaningful statements. First, the defining feature of a scale is the rule or basic empirical operation that is followed for assigning numerals, as elaborated below. Second, each scale has an associated set of admissible mathematical transformations such that the scale is preserved. That is, if a scale is obtained from another under an admissible transformation, the rule under the transformed scale is the same as under the original one. Third, a permissible statistic refers to a statistical operation that when applied to a scale, produces the same result as when it is applied to the (admissibly) transformed scale. 
While the word ‘permissible’ may sound rather strong, it is justifiable under the premise that ‘one should only make assertions that are invariant under admissible transformations of scale’ (Marcus-Roberts and Roberts 1987, 384). Fourth, a statement is called meaningful if it remains unchanged when all scales in the statement are transformed by admissible transformations (Marcus-Roberts and Roberts 1987: 384). Stevens (1946) considered four basic empirical operations or rules that define four types of scales: equality, rank-order, equality of intervals, and equality of ratios. Following them, he defined four main types of scales: nominal, ordinal, interval, and ratio. Stevens' classification is not exhaustive. For example, it only applies to scales that take real values and which are regular. Also, note that alternative terms are sometimes used for some of Stevens’ types. For example, nominal scales are sometimes referred to as categorical scales. Table 2.3 lists the scale types mentioned above from ‘weakest’ to ‘strongest’ in the sense that interval and ratio scales contain much more information than ordinal or nominal scales. The column that presents the rule defining each scale type is cumulative in the sense that a rule listed for a particular scale must be applicable to the scales in rows preceding it. The column that lists the permissible statistics is also cumulative in the same sense. In contrast, the column that lists the admissible transformations goes from general to particular: the particular operation listed in a row is included in the operation listed above. We now introduce each scale ‘type’. The scale pertains to an indicator used to measure dimension . The term ‘indicator ’ denotes the indicator of dimension . Achievements in indicator across the population are represented by vector , where is the achievement of person in the indicator. 
Indicator is said to be nominal or categorical if the scale is based on mutually exclusive categories, not necessarily ordered. Nominal variables are frequently called categorical variables. The rule or basic empirical operation behind this type of scale is the determination of equality among observations. A nominal scale is ‘the most unrestricted assignment of numerals. The numerals are used only as labels or type numbers, and words or letters would serve as well’ (Stevens 1946: 678). That is, numbers assigned to the various achievement levels in this domain are simply placeholders. Stevens introduces two common types of nominal variables. One uses ‘numbering’ for identification, such as the identification number of each household in a survey or the line number of individuals living within a household. The other uses numbering for a classification, such that all members of a social group (ethnic, caste, religion, gender, or age) or geographical regions (rural/urban areas, states, or provinces) are assigned the same number. The first type of nominal variable is simply a particular case of the second. There is a wide range of admissible transformations for this type of scale. In fact, any transformation that substitutes or permutes values between groups, that is, any one-to-one substitution function such that for all , will leave the scale form invariant. Given that in a nominal variable, the different categories do not have an order, neither arithmetic operations nor logical operations (aside from equality) are applicable. In terms of relevant statistics, if the nominal variable is simply an identifier, then only the number of categories is a relevant statistic; if the nominal variable contains several cases in each category, then the mode and contingency methods can be implemented, as can hypotheses tests regarding the distribution of cases among the classes (Stevens 1946: 678–9). 
Indicator is said to be ordinal if the order matters but not the differences between values. The rule or basic empirical operation behind this type of scale is the determination of a rank order. Categories can be ordered in terms of ‘greater’, ‘less’, or ‘equal’ (or ‘better’, ‘worse’, ‘preferred’, ‘not preferred’). Admissible transformations consist of any order-preserving transformation, that is, any strictly monotonic increasing function such that for all , as these will leave the scale form invariant. Thus, admissible transformations include logarithmic operation, square root of the values (nonnegative), linear transformations, and adding a constant or multiplying by another (positive) constant. Examples of ordinal scales are preference orderings over various categories, or subjective rankings. Given that the true intervals between the scale points are unknown, arithmetic operations are meaningless (because results will change with a change of scale), but logical operations are possible. For example, we can assert that someone reporting a health level of four feels ‘better’ than someone reporting a health level of ‘three’, who in turn feels better than a ‘two’, but we cannot assert whether the difference between level three and four is the same as the difference between level two and three. Nevertheless, some statistics are applicable to ordinal variables, namely, the number of cases, contingency tables, the mode, median, and percentiles. Statistics such as mean and standard deviation cannot be used. Clearly, an ordinal variable is a nominal variable but the converse is not true. Ordinal and nominal (or categorical) variables are also sometimes referred to as qualitative variables.
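To see why the mean is not a permissible statistic for ordinal data while the median is, consider the following minimal sketch. The health ratings for the two groups and the two recodings are hypothetical; any strictly increasing recoding would be an admissible transformation of an ordinal scale.

```python
import math
import statistics

# Hypothetical self-reported health levels (1 = worst, 5 = best).
group_a = [1, 1, 5, 5, 5]
group_b = [3, 3, 4, 4, 4]

# Strictly increasing recodings of the same ordinal scale (assumed examples).
recodings = {
    "original": lambda v: v,
    "cube": lambda v: v ** 3,
    "log": lambda v: math.log(v),
}

results = {}
for name, recode in recodings.items():
    a = [recode(v) for v in group_a]
    b = [recode(v) for v in group_b]
    # Record whether group A ranks above group B by each statistic.
    mean_a_higher = statistics.mean(a) > statistics.mean(b)
    median_a_higher = statistics.median(a) > statistics.median(b)
    results[name] = (mean_a_higher, median_a_higher)
    print(name, "mean says A > B:", mean_a_higher,
          "| median says A > B:", median_a_higher)
```

In this example the comparison of means changes direction when the scale is cubed, whereas the comparison of medians is the same under every recoding, which is what invariance under admissible transformations requires.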
where is the z-score of weight-for-age, is the observed weight of child , is the median weight of children of the same sex and age as child in the reference population (healthy children), and is the standard deviation of the weight of children of that age in the reference population. The z-scores for weight-for-height () and height-for-age () are computed in an analogous way. Thus, for all and all , and . Thus, for example, suppose 14-month-old Anna weighs 8.3 kilograms. The median weight in the reference population of children of that age is 9.4 kilograms and the standard deviation is 1.* Thus, the z-score of Anna is $(8.3 - 9.4)/1 = -1.1$, meaning that Anna is about one standard deviation below the median weight of healthy children. It is considered that children with z-scores that are more than two standard deviations below the median of the reference population suffer moderate undernutrition, and, if their z-score is more than three standard deviations below, they suffer severe undernutrition (underweight, wasting, or stunting, correspondingly). Children with a z-score of weight-for-height more than two standard deviations above the median are considered to be overweight (WHO 1997). An alternative way to assess the nutritional status of children is to use percentiles rather than z-scores, but z-scores present a number of advantages. Most importantly, they can be used to compute summary statistics such as a mean and standard deviation, which cannot be meaningfully done with percentiles (O’Donnell et al. 2008). Note that if we take a linear transformation of the z-score for weight-for-age such that , where and , then . Note that the difference (or ) has the same implication as the difference . This equivalence would hold for any linear transformation, exhibiting the characteristics of an interval-scale indicator.
*These values were taken from WHO’s reference tables: http://www.who.int/childgrowth/standards/sft_wfa_girls_z_0_5.pdf
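The z-score calculation for Anna and the interval-scale property can be sketched as follows. The function names, the extra z-score values in `zs`, and the constants of the linear transformation are hypothetical; only Anna's weight, the reference median, and the standard deviation come from the text.

```python
def z_score(observed, ref_median, ref_sd):
    """Standardized distance of an observation from the reference median."""
    return (observed - ref_median) / ref_sd


def undernutrition_status(z):
    """Classify a z-score using the two and three standard deviation cutoffs."""
    if z < -3:
        return "severe"
    if z < -2:
        return "moderate"
    return "not undernourished"


# Anna: 8.3 kg, reference median 9.4 kg, reference standard deviation 1.
anna = z_score(8.3, 9.4, 1.0)
print(anna, undernutrition_status(anna))

# Interval-scale property: ratios of differences survive any linear
# transformation a + b * z with b > 0 (a and b are arbitrary here).
zs = [-3.2, -1.1, 0.4]          # hypothetical z-scores
ts = [10.0 + 2.5 * z for z in zs]
ratio_z = (zs[1] - zs[0]) / (zs[2] - zs[1])
ratio_t = (ts[1] - ts[0]) / (ts[2] - ts[1])
print(ratio_z, ratio_t)
```

The two ratios agree up to floating-point error, so statements comparing differences between z-scores remain meaningful after the transformation.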
2.4 Comparability across People and Dimensions
The last section established the scales of measurement by which we can rigorously compare achievement levels in one variable, and the mathematical and statistical operations that can be performed on that variable. The discussion enabled us to identify the scale of measurement of each single indicator. Yet multidimensional measures seek to compare people’s achievements or deprivations across indicators, in ways that respect the scale of measurement of each indicator. This is by no means elementary. As Sen (1970) pointed out, cardinally meaningful variables may not necessarily be cardinally comparable, whether across people or, in multidimensional measurement, across dimensions. This section scrutinizes how these comparisons can legitimately proceed. That is, it takes a step back from the material presented thus far, to make explicit assumptions that have usually been implicit in work on multidimensional poverty measurement.
2.5 Properties for Multidimensional Poverty Measures
In selecting one poverty measurement methodology from a set of options, a policy maker thinks through how a poverty measure should behave in different situations in order to be a ‘good’ measure of poverty and support policy goals. Then she asks which measure meets these requirements. For example, should the poverty measure increase or decrease if the achievement of a poor person rises while the achievements of other people remain unchanged? Should poverty comparisons change when achievements are expressed in different units of measurement? Should the measure of poverty in a more populous country with a larger number of poor people be higher than the poverty measure in a small country with a smaller number of, but proportionally more, poor people?
2.5.1 Invariance Properties
The first invariance principle is symmetry. Symmetry requires that each person in a society is treated anonymously so that only deprivations matter and not the identity of the person who is deprived. Hence this property is also often referred to as anonymity. As long as the deprivation profile of the entire society remains unchanged, swapping achievement vectors across people should not change overall poverty. This type of rearrangement can be obtained by pre-multiplying the achievement matrix by a permutation matrix of appropriate order.
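The rearrangement behind symmetry can be illustrated with a small numerical sketch. The achievement matrix, the deprivation cutoffs, and the simple measure `union_headcount` (the share of people deprived in at least one dimension) are all hypothetical and serve only to show that pre-multiplying by a permutation matrix reorders people without changing the measured level.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def union_headcount(achievements, cutoffs):
    """Share of people deprived in at least one dimension (illustrative only)."""
    poor = [any(x < z for x, z in zip(row, cutoffs)) for row in achievements]
    return sum(poor) / len(achievements)


# Hypothetical data: 3 people (rows) and 2 dimensions (columns).
X = [[4, 1], [9, 7], [2, 8]]
z = [5, 5]

# Permutation matrix: exactly one 1 in every row and every column.
P = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

X_perm = matmul(P, X)  # the same achievement vectors, assigned to different people
print(X_perm)
print(union_headcount(X, z), union_headcount(X_perm, z))
```

Because the permuted matrix contains the same rows in a different order, any measure satisfying symmetry returns the same value for `X` and `X_perm`.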
2.5.2 Dominance Properties
This section covers six principles, each of which has a stronger version and a weaker version. The stronger version requires that a poverty measure strictly moves in a particular direction, given certain transformations in the achievements of the poor. The weaker version does not require a poverty measure to move in a particular direction but ensures that the poverty measure does not move in the opposite (wrong) direction under certain transformations of the achievements. The first dominance principle, monotonicity, requires that if the achievement of a poor person in a deprived dimension increases while other achievements remain unchanged, then overall poverty should decrease. Normatively, this principle considers that improvements in deprived achievements of the poor are good and should be reflected by producing a reduction in poverty. Its weaker version, referred to as weak monotonicity, ensures that poverty should not increase if there is an increase in any person’s achievement in the society.
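The difference between the stronger and weaker versions of monotonicity can be seen in a unidimensional sketch using the FGT measures reviewed in section 2.1. This is an illustration under simplifying assumptions, not a multidimensional measure: the function `fgt`, the achievement values, and the cutoff are hypothetical.

```python
def fgt(achievements, cutoff, alpha):
    """Unidimensional FGT measure: mean of normalized gaps raised to alpha."""
    n = len(achievements)
    # Only people below the cutoff contribute; everyone else counts as zero.
    return sum(((cutoff - x) / cutoff) ** alpha
               for x in achievements if x < cutoff) / n


cutoff = 5.0
before = [2.0, 4.0, 10.0]
after = [3.0, 4.0, 10.0]  # the poorest person improves but remains poor

headcount_before, headcount_after = fgt(before, cutoff, 0), fgt(after, cutoff, 0)
gap_before, gap_after = fgt(before, cutoff, 1), fgt(after, cutoff, 1)

print("headcount ratio:", headcount_before, "->", headcount_after)
print("poverty gap:    ", gap_before, "->", gap_after)
```

In this example the headcount ratio does not rise, so the weaker requirement is met, but it also does not fall, whereas the poverty gap strictly decreases as the stronger requirement demands.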
2.5.3 Subgroup Properties
The next set of principles is concerned with the link between overall poverty and poverty in different subgroups of the population, and the link between overall poverty and dimensional deprivations. The first principle, subgroup consistency, ensures that the change in overall poverty is consistent with the change in subgroup poverty. For example, suppose the entire society is divided into two population subgroups: Group 1 and Group 2. Poverty in Group 1 remains unchanged while poverty in Group 2 decreases. One would expect overall poverty to decrease. If overall poverty did not reflect subgroup poverty, there would be an inconsistency, which would be conceptually and politically problematic. As a result, national poverty estimates would not reflect regional successes in poverty reduction. A related principle with a stronger requirement is population subgroup decomposability. This principle requires overall poverty to be equal to a weighted sum of subgroups’ poverty, noted as in section 2.2.2, where the weight attached to each subgroup’s poverty is the population share of that subgroup.
Population Subgroup Decomposability:
2.5.4 Technical Properties
Finally, we introduce certain technical principles, which ensure that the poverty measure is meaningful. These principles are non-triviality, normalization, and continuity. The non-triviality principle requires that a poverty measure takes at least two different values. This property may appear to be trivial by its name, but it is important: unless a measure takes two different values, it is not possible to distinguish a society with poverty from a society with no poverty. Note that when a measure satisfies the strong version of at least one of the dominance principles, this property is automatically satisfied (by definition, poverty will take at least two different values). However, when a measure only satisfies the weak version of all dominance principles, this property becomes necessary.
 Empirical applications may encounter negative or zero income values, which require special treatment for certain poverty measures to be implemented.
 In practical implementations of the unidimensional method, a fixed set and number of dimensions is rarely obtained. Survey-based consumption items or income sources often differ in number and content.
 A utility function is a (mathematical) instrument that intends to measure the level of satisfaction of a person with all possible sets of achievements (usually consumption baskets). Utility functions represent consumer preferences. The use of the utility framework for distributional analysis faces two well-known problems. First, in principle, utility functions are merely ordinal, that is, they indicate that a certain consumption basket (or achievement vector) is preferred to some other, without providing the magnitudes of the difference between two utility values. Second, in principle, the utility framework does not allow interpersonal comparability, in the sense that one cannot decide whether some utility loss of a given person (say a rich one) is less important than some utility gain of another person (say a poor one). As Sen observed, ‘...the attempt to handle social choice without using interpersonal comparability or cardinality had the natural consequence of the social welfare function being defined on the set of individual orderings. And this is precisely what makes this framework so unsuited to the analysis of distributional questions’ (Sen 1973, 12–13). In order to make this framework applicable to distributional analysis, one needs to broaden individual preferences to include interpersonally comparable cardinal welfare functions (Sen 1973, 15). One particular way in which this has been implemented is through the so-called utilitarianism approach, which defines the measure of social welfare as the sum of individual utilities; moreover, it is frequently assumed—as in the framework described above—that everyone has the same utility function.
 Alkire and Foster (2011b) provide further discussion on uni- vs. multidimensional approaches.
 The concept of the poverty line dates to the late 1800s. Booth (1894, 1903), Rowntree (1901), and Bowley and Burnett-Hurst (1915) wrote seminal studies based on surveys in some UK cities. As Rowntree wrote, the poverty line represented the ‘minimum necessaries for the maintenance of merely physical efficiency’ (i.e. nutritional requirements) in monetary terms, plus certain minimum sums for clothing, fuel, and household sundries according to the family size (Townsend 1954: 131).
 Axiomatic measures described in section 3.6.2 take this approach.
 The interpretation of the variable is different if total income or total expenditure is used, with the former reflecting ‘what could be’ and the latter reflecting ‘what is’ (Atkinson 1989 cited in Alkire and Foster 2011b: 292).
 See Foster and Sen (1997), Zheng (1997), and Foster (2006) for a review of unidimensional poverty indices and Foster, Seth, et al. (2013) for pedagogic coverage of poverty and other unidimensional measures, with tools for practical implementation.
 Ravallion (1992) offers an early guidebook on the wide range of possible uses of the FGT measures, and Foster, Greer, and Thorbecke (2010) provide a detailed retrospective of the use and extensions of this class of measures.
 An alternative way to define the normalized income deprivation gap not using the censored distribution is that for , and for .
 In the epidemiological literature there is a clear distinction between the terms incidence and prevalence. Incidence refers to the number or rate of people becoming ill during a period of time in a specified population, whereas prevalence refers to the number or proportion of people experiencing an illness in a particular point in time (regardless of the moment at which they became ill). In general usage, this distinction is usually ignored and the expression ‘poverty incidence’ or ‘incidence of poverty’ frequently refers to the proportion of poor people in a certain population at a certain point in time (which strictly speaking in epidemiological terms would be poverty prevalence), and not to the proportion of people who became poor over a certain time period (which strictly speaking in epidemiological terms would be incidence). The expression ‘poverty prevalence’ or ‘prevalence of poverty’ is also sometimes, although much less frequently, found but refers to the same concept as when incidence is used. In this book we follow the poverty literature and refer to poverty incidence as the poverty rate at a particular point in time.
 Note that the population subgroups are mutually exclusive and collectively exhaustive.
 Although we address the issue of scales of measurement later on in this chapter, it is worth anticipating that while all three mentioned members of the FGT family ($P_0$, $P_1$, and $P_2$) can be applied to cardinal variables (where distances between categories are meaningful), only the headcount ratio can be used with an ordinal variable (where distances between categories are meaningless).
 Alkire and Santos (2009).
 In empirical applications some indicators may not be restricted to the non-negative range, or be scored such that larger values are worse, or that the lowest attainable value is strictly positive. For example the z-scores of children’s nutritional indicators may take negative values; in a people-per-room indicator, larger values are worse. And the lowest possible Body Mass Index for human survival is strictly positive. Such indicators may require rescaling.
 For simplicity of presentation, in theoretical sections, we use the term dimension to refer to each variable; in empirical presentations often we use the term ‘indicator’ for the variables, while ‘dimension’ refers to groupings of indicators.
 Note that the prices used in the unidimensional case provide a particular weighting structure, where the weights do not necessarily sum to or 1.
 Alternative notations for the AF methodology are presented and elaborated in Chapter 5.
 This is an analogous construct to the income gaps in the FGT measures. An alternative way to define the deprivation gaps not using the censored distribution is that when and when .
 Note that this identification function differs from the one introduced in the unidimensional case in that it depends on the vector of achievements and the vector of dimensional deprivation cutoffs . In the unidimensional case, identification depends on the already-aggregated overall achievement or resource variable and the aggregate poverty line , which of course may depend upon the prices of certain commodities.
 Within the aggregate achievement approach, the intermediate criterion is operationalized by using the so-called ‘poverty frontier’, defined as the different combinations of the achievements that provide the same overall achievement as the aggregate poverty line. Duclos, Sahn, and Younger (2006a) further elaborate the poverty frontier; cf Atkinson (2003) and Bourguignon and Chakravarty (2003).
 In one of the measures in the AF class of measures, the Adjusted Headcount Ratio, this partial index is called the censored headcount ratio. See Section 5.5.3 for a detailed presentation.
 Given two random variables and , the joint distribution can be described with the bivariate cumulative distribution function: . In words, the joint distribution gives the proportion of the population with values of and lower than and correspondingly and simultaneously.
 The authors analyse inequality in the two-dimensional case. They introduce the transformation in which there is an increase in the correlation of the achievements, leaving the marginal distributions unchanged—something we discuss in section 2.5.2. They extend the conditions for second-order stochastic dominance, noting that such conditions depend on the joint distribution.
 Given any random variable , the marginal distribution can be described with the cumulative distribution function: .
 Only in the very particular case in which the two variables are statistically independent, can one obtain the joint distribution from the marginal ones. In such a case, the proportion of people deprived simultaneously in a number of variables can be obtained as the product of the proportions of people deprived in each variable. Although this is a topic for further empirical research, a priori, it seems unlikely that the independence condition will be satisfied, especially as the number of considered dimensions increases.
 Alkire and Foster (2011b). Similar examples on the relevance of considering the joint distribution in the measurement of multidimensional welfare and poverty can be found in Tsui (2002), Pattanaik, Reddy, and Xu (2012), and Seth (2009).
 Alkire and Santos (2009).
 Stevens’ work belongs to a branch of applied mathematics called measurement theory, which is useful in measurement and data analysis.
 ‘The criterion for the appropriateness of a statistic is invariance under the [admissible] transformations’ (Stevens 1946, 678).
 The notion of meaningfulness is alluded to in Stevens (1946) and used in Roberts (1979).
 An irregular scale does not always generate an acceptable scale from an admissible transformation (see Roberts and Franke 1976, cited in Marcus-Roberts and Roberts 1987: 384).
 Relatedly, Luce (1956) distinguished a weak order from a semi-order over the same set of elements. In a weak order the indifference relation is transitive, but in a semi-order it is not.
 Countably infinite means that the values of the discrete variable have one-to-one correspondence with the natural numbers.
 Note that other authors equate the distinction between qualitative/ordinal vs. quantitative/cardinal with the distinction between discrete vs. continuous variables (e.g. Bossert, Chakravarty, and D’Ambrosio 2013). In our definitions, cardinal variables can be either continuous or discrete, so the two pairs are not equivalent.
 Dichotomous variables can also be obtained from nominal ones. For example, given a nominal variable on age intervals, a dummy variable can be created for each age interval (‘belongs’ or ‘does not belong’ to that particular age range). More commonly, one can dichotomize variables with categorical responses into deprived and non-deprived states; for example, classifying ‘sources of water’ into two exhaustive groups reflecting ‘safe’ and ‘unsafe’ water.
 There is a very large literature on interpersonal comparisons and partial comparisons, stemming from Sen (1970). Basu (1980) raises comparability across dimensions in the context of government preferences and helpfully distinguishes comparability and measurability (ch. 6, 74–5).
 Sen (1979, 1985, 1992, 1997) has powerfully observed how the same level of resources may in fact be associated with different levels of well-being because of differences in people’s ability to convert resources into well-being.
 Earlier we defined a cardinal variable to be interval scale type if the rule or basic empirical operation behind its scale is the determination of equality of intervals or differences. Consider a variable having exactly two points, neither of which is a natural zero. In this case, they can be understood to be equally spaced along any scale, hence trivially satisfy this definition. If either of the points occurs at a natural zero then the dichotomous variable is ‘trivially ratio scale’.
 Watts (1968) offered an early intuitive (non-formal) justification for selecting the functional form of a poverty measure according to the properties it should satisfy.
 Within the poverty measurement literature, there are essentially two procedures for constructing measures in the axiomatic framework. In the first, known as characterization, a number of principles that are considered desirable are introduced and then the entire class of measures (one or many) that embody these principles is determined. This procedure entails a sufficiency condition, which shows that the measure satisfies these principles, and, simultaneously, a necessity condition, which shows that this is the only measure (or the family of measures) that satisfies the set of desirable principles. Studies that follow this procedure include Sen (1976), Tsui (2002), Chakravarty, Mukherjee, and Ranade (1998), Bossert, Chakravarty, and D’Ambrosio (2013), Chakravarty and Silber (2008), Hoy and Zheng (2011), and Porter and Quinn (2013). Second, studies may introduce a number of properties that are considered desirable and then propose a measure or family of measures satisfying these properties, without claiming it to be the only measure or family of measures to do so. Studies following this procedure include Bourguignon and Chakravarty (2003), Calvo and Dercon (2009), Foster (2009), Alkire and Foster (2011a), and Foster and Santos (2013).
 Other possible identification methods may violate some of the properties stated in this section. Future research may develop a set of properties for the identification function in the multidimensional context.
 This classification follows Foster (2006).
 This principle was first suggested by Dalton (1920) in the context of inequality measurement.
 In the context of welfare measurement, Foster and Sen (1997) referred to this as the ‘symmetry for population’. Chakravarty, Mukherjee, and Ranade (1998), Bourguignon and Chakravarty (2003), and Deutsch and Silber (2005) call it the ‘principle of population’. Bossert, Chakravarty, and D’Ambrosio (2013) introduce a separate principle called the ‘poverty Wicksell population principle’ to compare societies with different population sizes. This property requires that if a person is added to the society with the same level of poverty as the aggregate poverty of the society, overall poverty should not change.
 Most of the studies, such as Chakravarty, Mukherjee, and Ranade (1998), Bourguignon and Chakravarty (2003), and Deutsch and Silber (2005), have used the term ‘scale invariance’; whereas Tsui (2002) uses the term ‘ratio-scale invariance’.
 Both the scale invariance and the unit consistency principles refer to cases in which achievements are changed in a certain proportion (which may differ or not across achievements). A different principle known as ‘translation invariance’, popularized by Kolm (1976a,b), requires the poverty level to remain the same if each achievement and its corresponding deprivation cutoff are changed by adding the same constant for every person (although the constant added can differ across dimensions). Technically, if an achievement matrix is obtained from another achievement matrix so that , where and for all , and , then .
 Bourguignon and Chakravarty (2003) refer to the deprivation focus as ‘strong focus’ and the poverty focus as ‘weak focus’. Chakravarty, Mukherjee, and Ranade (1998) and Tsui (2002) only used the deprivation focus and did not consider the poverty focus.
 A nice theorem would be to prove that the only poverty measure invariant to admissible transformations of the nominal or ordinal variables is one based on dichotomous variables.
Alkire and Foster (2011a) distinguished the monotonicity principle from the weak monotonicity principle. Others, including Chakravarty, Mukherjee, and Ranade (1998), Tsui (2002), Bourguignon and Chakravarty (2003), and Deutsch and Silber (2005) imply weak monotonicity by their monotonicity principle. Bossert, Chakravarty, and D’Ambrosio (2013) did not introduce a weak monotonicity principle.
 See Fleurbaey (2006a) and Duclos et al. (2011) for discussion on axioms based on uniform majorization.
 Note that it is not possible for a multidimensional poverty measure to satisfy the deprivation focus principle and the transfer principle, simultaneously (Tsui 2002). For example, suppose the initial achievement matrix is and the deprivation cutoff vector is and both of them are identified as poor by some criteria. Consider the bistochastic matrix . Then, . The transfer principle now requires that , but by the deprivation focus principle, we should have .
 Rank association refers to the degree of agreement between two rankings. In the context of the properties discussed here, perfect rank association would occur if person having higher achievement than person in dimension , also has higher achievements in all the other dimensions. That is: for all .
 This transformation was motivated by Boland and Proschan (1988).
 Note that if , on the contrary, is obtained from , then it is called ‘basic rearrangement’ by Boland and Proschan (1988). In multidimensional poverty measurement, it is referred to as ‘basic rearrangement-increasing transfer’ by Tsui (2002), ‘correlation increasing switch’ by Bourguignon and Chakravarty (2003) and ‘correlation increasing arrangement’ by Deutsch and Silber (2008). In multidimensional welfare analysis, an analogous concept has been called ‘association increasing transfer’ (Seth 2013), and in multidimensional inequality analysis it has been called ‘correlation increasing transfer’ by Tsui (1999) and ‘unfair rearrangement principle’ by Decancq and Lugo (2012).
 In the multidimensional measurement literature, the substitutability or complementarity relationship between indicators is defined in terms of whether the second cross-partial derivative of the poverty measure with respect to any two dimensions is positive or negative. This requires the dimensions to be cardinal and the poverty measure to be twice differentiable. Practically, given two dimensions and , substitutability implies that poverty decreases less with an increase in achievement in dimension for people with higher achievements in dimension (Bourguignon and Chakravarty 2003, 35). Conversely, complementarity implies that poverty decreases more with an increase in achievement in dimension for people with higher achievements in dimension . If the dimensions are independent, the second cross-partial derivative is zero and poverty should not change under the described transformation. This corresponds to the Auspitz-Lieben-Edgeworth-Pareto (ALEP) definition and differs from Hicks’s definition, traditionally used in demand theory (which relates to the properties of the indifference contours) (Atkinson 2003, 55). See Kannai (1980) for critiques of the ALEP definition. For a critique of Bourguignon and Chakravarty’s (2003) association axiom, see Decancq (2012).
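 The sign test described above can be illustrated numerically. The sketch below is a hedged illustration only: the individual poverty function (a power of aggregated normalized gaps), its parameters `alpha` and `theta`, the cutoffs, and the evaluation point are all hypothetical choices made for this example, not a measure proposed in the text. The sign convention follows from the practical description: if poverty falls less in one dimension when the other achievement is higher, the cross-partial derivative is positive (substitutes), and if it falls more, the derivative is negative (complements).

```python
def poverty(x1, x2, z1=5.0, z2=5.0, alpha=2.0, theta=1.0):
    """Hypothetical individual poverty function of two achievements."""
    gap1 = max(0.0, 1.0 - x1 / z1)  # normalized gap in dimension 1
    gap2 = max(0.0, 1.0 - x2 / z2)  # normalized gap in dimension 2
    return (gap1 ** theta + gap2 ** theta) ** (alpha / theta)


def cross_partial(f, x1, x2, h=1e-3):
    """Central finite-difference estimate of the second cross-partial of f."""
    return (f(x1 + h, x2 + h) - f(x1 + h, x2 - h)
            - f(x1 - h, x2 + h) + f(x1 - h, x2 - h)) / (4.0 * h * h)


def classify(alpha, theta, x1=2.0, x2=3.0, tol=1e-6):
    """Classify the two dimensions at a point where both are deprived."""
    value = cross_partial(
        lambda a, b: poverty(a, b, alpha=alpha, theta=theta), x1, x2
    )
    if value > tol:
        return "substitutes"
    if value < -tol:
        return "complements"
    return "independent"


print(classify(alpha=2.0, theta=1.0))
print(classify(alpha=1.0, theta=2.0))
print(classify(alpha=1.0, theta=1.0))
```

 With `alpha` equal to `theta` the hypothetical function is additive in the two gaps, so the estimated cross-partial is zero up to rounding and the dimensions are classed as independent.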
 For various weak versions of the sensitivity to rearrangement properties in the poverty measurement literature, see Tsui (2002), Chakravarty (2009) (which contains a modified version of the properties in Bourguignon and Chakravarty (2003)), and Alkire and Foster (2011a). For different statements of the stronger versions of the property in the measurement of welfare and inequality, see Tsui (1995), Gajdos and Weymark (2005), Decancq and Lugo (2012), and Seth (2013).
 For a different statement of the strong dimensional transfer property using an association-increasing rearrangement, see Seth and Alkire (2013).
 The concept of subgroup consistency in poverty measurement was motivated by Foster and Shorrocks (1991).
 For a formal discussion of this inconsistency, see Alkire and Foster (2013).