Pareto or log-normal? A recursive-truncation approach to the distribution of (all) cities
2012
Abstract
Traditionally, it is assumed that the population size of cities in a country follows a Pareto distribution. This assumption is typically supported by finding evidence of Zipf's Law. Recent studies question this finding, highlighting that, while the Pareto distribution may fit reasonably well when the data are truncated to the upper tail, i.e., the largest cities of a country, the log-normal distribution may apply when all cities are considered. Moreover, conclusions may be sensitive to the choice of a particular truncation threshold, an issue so far overlooked in the literature. In this paper, we therefore reassess the city size distribution in relation to its sensitivity to the choice of truncation point. In particular, we look at US Census data and apply a recursive-truncation approach to estimate Zipf's Law, together with an alternative non-parametric test, considering each possible truncation point of the distribution of all cities. The results confirm that conclusions are sensitive to the truncation point. Moreover, repeating the analysis on simulated data confirms the difficulty of distinguishing a Pareto tail from the tail of a log-normal and, in turn, of identifying the city size distribution as a false or a weak Pareto law.
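
As an illustration only, the recursive-truncation idea described in the abstract might be sketched as follows. This is not the authors' code: the abstract does not state which estimator is used, so a plain OLS regression of log rank on log size is assumed here. The function names, the `min_cities` parameter, and the simulated log-normal parameters are all hypothetical, and the non-parametric test is not covered.

```python
import math
import random


def zipf_slope(sizes):
    """OLS slope of log(rank) on log(size); sizes must be sorted in descending order."""
    n = len(sizes)
    xs = [math.log(s) for s in sizes]
    ys = [math.log(r) for r in range(1, n + 1)]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def recursive_truncation(sizes, min_cities=10):
    """Re-estimate the Pareto exponent on the k largest cities for every k.

    Returns (k, exponent) pairs with exponent = -slope, so that Zipf's Law
    corresponds to an exponent close to 1. `min_cities` is an assumed lower
    bound on the truncated sample size.
    """
    ordered = sorted(sizes, reverse=True)
    return [
        (k, -zipf_slope(ordered[:k]))
        for k in range(min_cities, len(ordered) + 1)
    ]


# Simulated log-normal "city sizes"; the parameters are illustrative assumptions.
rng = random.Random(0)
lognormal_sizes = [rng.lognormvariate(7.0, 1.75) for _ in range(500)]
estimates = recursive_truncation(lognormal_sizes)

# Print the estimated exponent at a few truncation points.
for k, alpha in estimates[::98]:
    print(f"top {k:3d} cities: estimated exponent = {alpha:.3f}")
```

Plotting the estimated exponent against the truncation point `k` is one way to see how strongly a conclusion about Zipf's Law depends on where the sample is cut.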