Graduation Year


Document Type




Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Geography, Environment and Planning

Major Professor

Ambe Njoh, Ph.D.

Co-Major Professor

Ruiliang Pu, Ph.D.

Committee Member

Joni Downs (Firat), Ph.D.

Committee Member

Fenda Akiwumi, Ph.D.

Committee Member

Jeffrey Cunningham, Ph.D.


multi-index slum measurement, variable clustering, unsupervised learning, satellite imagery, analytic regionalization


Despite being an indicator of modernization and macro-economic growth, urbanization in regions such as sub-Saharan Africa is tightly interwoven with poverty and deprivation. This has manifested physically as slums, which represent the worst residential urban areas and are marked by a lack of access to good-quality housing and basic services. To combat the slum phenomenon effectively, local slum conditions must be captured in quantitative and spatial terms. However, there are significant hurdles to this. Slum detection and mapping require readily available and reliable data, as well as a proper conceptualization of measurement and scale. Using Bamenda, Cameroon, as a test case, this dissertation research was designed as a three-pronged attack on the slum mapping problem. The overall goal was to investigate locally optimized slum mapping strategies and methods that utilize high-resolution satellite image data, household survey data, simple machine learning and regionalization theory.

The first major objective of the study was to tackle a "measurement" problem. The aim was to explore a multi-index approach to measuring and mapping local slum conditions. The rationale was that prior sub-Saharan slum research too often used simplified measurement techniques, such as a single unweighted composite index, to represent diverse local slum conditions. In this study, six household indicators relevant to the United Nations criteria for defining slums were extracted from a 2013 Bamenda household survey data set and aggregated for 63 local statistical areas. The extracted variables were the percent of households having the following attributes: more than two residents per room, non-owner status, occupying a single room or studio, having no flush toilet, having no piped water, and having no drainage. Hierarchical variable clustering was used as a surrogate for exploratory factor analysis to reduce these six variables to fewer latent slum factors. Variables were grouped such that the most correlated variables fell in the same group while uncorrelated variables fell in separate groups. The membership of each group was then examined to see whether the group suggested a conceptually meaningful slum factor that could be quantified as a stand-alone binary ("high" or "low") slum index. Results showed that the slum indicators in the study area could be replaced by at least two meaningful and statistically uncorrelated latent factors. One factor reflected home occupancy conditions (tenancy status, overcrowding and living space conditions) and was quantified, using K-means clustering of units, as an "occupancy disadvantage index" (Occ_D). The other reflected the state of utilities access (piped water and flush toilet) and was quantified as a "utilities disadvantage index" (UT_D). Location attributes were used to examine and validate both indices. Independent t-tests showed that units with high Occ_D were on average closer to the nearest town markets and major roads than units with low Occ_D. This was consistent with theory, as typical slum residents (in this case overcrowded and non-owner households) are expected to favor accessibility to areas of high economic activity. However, the same did not hold for UT_D, which showed no such strong pattern.
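
The variable-grouping step described above can be illustrated with a minimal sketch. It assumes average-linkage agglomerative clustering with 1 - |r| (r being the Pearson correlation) as the distance between variables; the abstract does not state which linkage or distance was used, so both are assumptions. The function names and the toy values for six areas are hypothetical and are not taken from the Bamenda survey.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cluster_variables(columns, n_groups):
    """Group variables by average-linkage agglomerative clustering.

    columns: dict mapping variable name -> list of values (one per area).
    Distance between two variables is 1 - |r| (an assumed choice), so
    strongly correlated variables end up in the same group.
    """
    names = list(columns)
    dist = {(a, b): 1 - abs(pearson(columns[a], columns[b]))
            for a in names for b in names if a != b}
    groups = [[name] for name in names]
    while len(groups) > n_groups:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                pairs = [(a, b) for a in groups[i] for b in groups[j]]
                d = sum(dist[p] for p in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        groups[i] = groups[i] + groups[j]  # merge the two closest groups
        del groups[j]
    return groups

# Hypothetical percentages for six areas (illustration only).
toy = {
    "overcrowded":     [10, 20, 30, 40, 50, 60],
    "non_owner":       [12, 19, 33, 41, 48, 63],
    "single_room":     [8, 22, 29, 38, 52, 58],
    "no_flush_toilet": [30, 10, 40, 15, 35, 12],
    "no_piped_water":  [28, 12, 43, 14, 33, 10],
}
groups = cluster_variables(toy, 2)
print(groups)
```

With these toy values the occupancy-type variables and the utilities-type variables separate into two groups, mirroring the kind of structure reported above. Each group could then be turned into a binary index by clustering the areas on that group's variables alone.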

The second major objective was to tackle a "learning" problem. The purpose was to explore the potential of unsupervised machine learning to detect or "learn" slum conditions from image data. The rationale was that such an approach would be efficient and less reliant on prior knowledge and expertise. A 2012 GeoEye image scene of the study area was subjected to image classification, from which the following physical settlement attributes were quantified for each of the 63 statistical areas: percent roof area, percent open space area, percent bare soil, percent paved road surface, percent dirt road surface, and building shadow-to-roof area ratio. The shadow-to-roof ratio was an innovative measure used to capture the size and density attributes of buildings. In addition to the six image-derived variables, the mean slope of each area was calculated from a digital elevation data set. All seven attributes were subjected to principal component analysis, from which the first two components were extracted and used for hierarchical clustering of statistical areas to derive physical types. Results showed that area units could be optimally classified into four physical types, labeled generically as Categories 1-4, each with at least one defining physical characteristic. Kruskal-Wallis tests comparing physical types in terms of household and location attributes showed that at least two physical types differed in aggregated household slum conditions and location attributes. Category 4 areas, located on steep slopes and having a high shadow-to-roof ratio, had the highest concentration of non-owner households. They were also located close to the nearest town markets. They were thus the most likely candidates for slums in the city. Category 1 units, on the other hand, located at the outskirts and having abundant open space, were least likely to have slum conditions.
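
As an illustration of the Kruskal-Wallis comparison mentioned above, the following minimal sketch computes the H statistic from pooled ranks. It assigns average ranks to tied values but omits the tie correction factor, which is a simplification. The function name and the three groups of values are hypothetical and do not come from the study data.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a list of samples.

    Tied values receive the average of the ranks they span; the
    tie correction factor is deliberately left out (simplification).
    """
    pooled = sorted((value, gi) for gi, g in enumerate(groups) for value in g)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        i = j
    total = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

# Hypothetical percent non-owner households in three physical types.
type_a = [12, 15, 18]
type_b = [25, 22, 30]
type_c = [41, 38, 45]
h = kruskal_wallis_h([type_a, type_b, type_c])
print(h)
```

Under the null hypothesis, H is approximately chi-square distributed with k - 1 degrees of freedom for k groups, so with three groups a value above 5.991 would be significant at the 0.05 level. The approximation is rough for samples as small as these toy groups.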

The third major objective was to tackle the problem of "spatial scale". Neighborhoods, by their very nature of contiguity and homogeneity, represent an ideal scale for urban spatial analysis and mapping. Unfortunately, in most areas, neighborhoods are not objectively defined, and slum mapping often relies on the use of arbitrary spatial units which do not capture the true extent of the phenomenon. The objective was thus to explore the use of analytic regionalization to quantitatively derive the neighborhood unit for mapping slums. Analytic neighborhoods were created by spatially constrained clustering of statistical areas using the minimum spanning tree algorithm. Unlike previous studies that relied on socio-economic and/or demographic information, this study innovatively used multiple land cover and terrain attributes as neighborhood homogenizing factors. Five analytic neighborhoods (labeled Regions 1-5) were created this way and compared using Kruskal-Wallis tests for differences in household slum attributes. This was to determine the largest possible contiguous areas that could be labeled as slum or non-slum neighborhoods. The results revealed that at least two analytic regions were significantly different in terms of aggregated household indicators. Region 1 stood apart as having significantly higher concentrations of overcrowded and non-owner households. It could thus be viewed as the largest potential slum neighborhood in the city. In contrast, Region 3 (located at higher elevation and separated from the rest of the city by a steep escarpment) was generally associated with low levels of household slum attributes and could be considered the strongest model of a non-slum or formal neighborhood. Regions 1 and 3 also corresponded qualitatively to two locally recognized (vernacular) neighborhoods. These neighborhoods, "Sisia" (for Region 1) and "Up Station" (for Region 3), are commonly perceived by local residents as occupying opposite ends of the socio-economic spectrum.
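
The idea of spatially constrained clustering with a minimum spanning tree can be sketched as follows. This is a simplified illustration, not the procedure used in the dissertation: it builds the tree over contiguity edges weighted by attribute dissimilarity (Kruskal's algorithm) and then cuts the heaviest tree edges, whereas fuller regionalization methods typically choose cuts by within-region homogeneity. The function name, unit labels, attribute values and adjacency list are all hypothetical, and in practice attributes would be standardized first.

```python
def mst_regions(attrs, adjacency, n_regions):
    """Partition contiguous units into regions via a minimum spanning tree.

    attrs: dict unit -> tuple of attribute values (e.g. land cover, slope).
    adjacency: list of (u, v) pairs of units that share a boundary.
    Simplification: the n_regions - 1 heaviest tree edges are removed.
    """
    def dist(u, v):
        # Euclidean distance in attribute space.
        return sum((a - b) ** 2 for a, b in zip(attrs[u], attrs[v])) ** 0.5

    parent = {u: u for u in attrs}

    def find(u):
        # Union-find root lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    # Kruskal's algorithm restricted to contiguity edges.
    mst = []
    for w, u, v in sorted((dist(u, v), u, v) for u, v in adjacency):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            mst.append((w, u, v))

    # Drop the heaviest edges, then read off connected components.
    kept = sorted(mst)[: len(mst) - (n_regions - 1)]
    for u in parent:
        parent[u] = u
    for _, u, v in kept:
        parent[find(u)] = find(v)
    regions = {}
    for u in attrs:
        regions.setdefault(find(u), []).append(u)
    return sorted(sorted(r) for r in regions.values())

# Hypothetical units with (percent roof area, mean slope) attributes.
attrs = {"A": (60, 2), "B": (62, 3), "C": (58, 2),
         "D": (20, 15), "E": (22, 14), "F": (18, 16)}
adjacency = [("A", "B"), ("B", "C"), ("A", "C"),
             ("C", "D"), ("D", "E"), ("E", "F")]
regions = mst_regions(attrs, adjacency, 2)
print(regions)
```

Because only edges between adjacent units enter the tree, every resulting region is contiguous by construction, which is the property that makes the output usable as a neighborhood unit.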

The results obtained by successfully carrying out the three major objectives have major implications for future research and policy. The multi-index analysis of slum conditions affirms the notion that the slum phenomenon is diverse in the local context and that remediation efforts must be compartmentalized to be effective. The results of unsupervised, image-based slum mapping show that it is a tool with high potential for rapid slum assessment even when there is no supporting field data. Finally, the results of analytic regionalization showed that the true extent of contiguous slum neighborhoods can be delineated objectively using land cover and terrain attributes. This presents an opportunity for local planning and policy actors to consider redesigning the city's neighborhood districts as analytic units. Quantitatively derived neighborhoods are likely to be more useful in the long term, be it for spatial sampling, mapping or planning purposes.