Population Synthesis and models

The need for more detailed population analysis

Recent years have seen an increased interest in the distributional impacts of transport interventions; reducing inequalities of access to opportunities for work and study has become an important policy objective. These impacts can be estimated using classic models with an appropriate traveller segmentation in the demand models. Alternatively, one may adopt Agent or Activity Based Modelling approaches. Both classic and new models benefit from a more granular treatment of the population of interest, their preferences and interactions. A good way to achieve this is to create a synthetic population with a full set of characteristics of interest for a particular model.

A synthetic population would also be a very good starting point for more conventional models, as it allows a more granular treatment of the distributional impacts of any policy intervention. A synthetic population is, of course, an artificial construct; it would contain as many households and individuals as the most recent census but, to protect privacy, none of the simulated entities corresponds to a real one.

There are two main sources of data to build this synthetic population: a travel survey (perhaps up to a 2% sample of the population) and the most recent census data, used as a control to expand and distribute that sample. The survey will probably be an augmented version of a Household Travel Survey (HTS), collecting additional data on issues considered important in the model; for example, preferences for car ownership or Mobility as a Service, or how the use of the car is decided within the household.

The main objective is, therefore, to create a synthetic population representing everybody resident in the study area while retaining the attributes of interest for use in the model. If the model is used in forecasting mode (rather than as a policy testing tool for the present), this population needs to be synthesised for future years from the few properties that planners actually forecast, such as the number of people and households per zone and perhaps income. Other attributes, like the distribution of household sizes, age distribution, school and university attendance, multiple vehicle ownership, and propensity for remote work and procurement of services, will need to be estimated or assumed.

The synthesis procedure involves two main steps. First, a demographic distribution of households is estimated for each transport zone, and then a matching sample of households is drawn from a set of household records for which nearly complete census information is available.

The number of households in each cell of this distribution (each combination of categories, for example household size by income level) is estimated through an iterative multi-proportional fitting procedure. The procedure starts with an initial joint distribution available for (aggregate) census geographical units. It then cycles iteratively through a set of control totals, one for each category of each control variable.

A simple example

As an example, consider sample household data as shown above. There are three household sizes and two income levels. We know from, say, census data that there are 55 households (HH) with low income and 35 HH with high income in that zone, and that there are 20, 40 and 30 HH of each size, so both sets of targets add up to 90 HH. The sample data is shown in the 3x2 box labelled Sample and Targets. Applying a bi-proportional adjustment will, in this case, solve the population synthesis problem. First, adjustment factors are calculated to scale up the sample to the total number of households in the zone. Then an iterative process adjusts for income level and then for household size, repeating until convergence, at approximately iteration 7.
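The bi-proportional adjustment can be sketched in a few lines of Python. The target totals below are those quoted in the example; the seed table is a hypothetical sample invented for illustration (the original sample figures are not reproduced here), and the function name `ipf`, the tolerance and the iteration cap are assumptions. The number of iterations needed depends on the seed and the tolerance, so it will not necessarily match the figure quoted above.

```python
def ipf(seed, row_targets, col_targets, tol=1e-6, max_iter=100):
    """Bi-proportional fitting of a 2-D seed table.

    Rows are household sizes, columns are income levels.
    Returns the fitted table and the number of iterations used.
    """
    table = [[float(v) for v in row] for row in seed]
    for iteration in range(1, max_iter + 1):
        # Adjust for income level: scale each column to its target.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Adjust for household size: scale each row to its target.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [v * target / total for v in table[i]]
        # Rows now match exactly; stop once the columns match as well.
        max_error = max(
            abs(sum(row[j] for row in table) - target)
            for j, target in enumerate(col_targets)
        )
        if max_error < tol:
            return table, iteration
    return table, max_iter


# Hypothetical sample: 3 household sizes (rows) x 2 income levels (columns).
seed = [[3, 1], [4, 3], [2, 3]]
size_targets = [20, 40, 30]   # HH of each size (from the example)
income_targets = [55, 35]     # low and high income HH (from the example)

fitted, n_iter = ipf(seed, size_targets, income_targets)
for row in fitted:
    print([round(v, 2) for v in row])
print("iterations:", n_iter)
```

The explicit scaling to the zone total described in the text is absorbed here by the first pass, since scaling the columns to their targets already brings the table to 90 households.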

Modelling extensions

This approach is often extended to cover other dimensions like car ownership, number of students, etc. Additional household characteristics that may be used as controls in special cases include age and gender of the head of household, presence of children, and family vs. non-family household members. The adjustments will then be multi-proportional, and a requirement for the procedure to work is that the marginal control totals are consistent, that is, the totals for every control variable add up to the same number of households in the zone. When this holds, the iterative procedure will converge so that all control totals are satisfied and the correlation structure of the initial joint distribution is preserved. Control totals are taken from census tables for the base year. For the forecast years, they will come from demographic and land use forecasts, which may be less detailed.
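The consistency requirement is easy to check before fitting. The sketch below is only illustrative: the income and size controls are those of the simple example, while the car ownership control is invented and deliberately adds up to 95 rather than 90 households. Rescaling the offending control to a reference total is one possible reconciliation, assumed here for illustration; the function names are hypothetical.

```python
import math


def check_consistency(controls, rel_tol=1e-9):
    """Return (ok, totals): ok is True when every control sums to the same total."""
    totals = {name: sum(values) for name, values in controls.items()}
    reference = next(iter(totals.values()))
    ok = all(math.isclose(t, reference, rel_tol=rel_tol) for t in totals.values())
    return ok, totals


def rescale_to(values, grand_total):
    """Scale one set of control totals so that it sums to grand_total."""
    current = sum(values)
    return [v * grand_total / current for v in values]


controls = {
    "income": [55, 35],       # from the simple example
    "size": [20, 40, 30],     # from the simple example
    "cars": [30, 45, 20],     # invented, sums to 95 (inconsistent)
}

ok, totals = check_consistency(controls)
print(ok, totals)

# Assumed reconciliation: treat the household total by income as the reference.
controls["cars"] = rescale_to(controls["cars"], totals["income"])
ok_after, totals_after = check_consistency(controls)
print(ok_after, totals_after)
```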

It is also useful to note that the problem of zero cells or zero marginals, which affects trip matrix expansion and matrix estimation, also applies to population synthesisers. Similar corrections would need to be applied.
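The text does not specify which correction to use. One approach often seen in practice, assumed here purely for illustration, is to replace sampling zeros in the seed with a small positive value so that the fitting can still allocate households to a cell the sample happened to miss, while leaving structural zeros (combinations that cannot exist) untouched. The function name, the value of `epsilon` and the seed figures are all hypothetical.

```python
def patch_zero_cells(seed, structural_zeros=(), epsilon=0.01):
    """Replace sampling zeros with a small value; keep structural zeros at zero.

    seed: list of rows of sample counts.
    structural_zeros: (row, column) index pairs that must remain zero.
    """
    blocked = set(structural_zeros)
    patched = []
    for i, row in enumerate(seed):
        new_row = []
        for j, value in enumerate(row):
            if value == 0 and (i, j) not in blocked:
                new_row.append(epsilon)      # sampling zero: allow growth
            else:
                new_row.append(float(value))  # observed value or structural zero
        patched.append(new_row)
    return patched


# Hypothetical seed: the sample found no small high-income households
# (a sampling zero) and cell (2, 0) is declared impossible by assumption.
seed = [[4, 0], [5, 3], [0, 2]]
patched = patch_zero_cells(seed, structural_zeros=[(2, 0)])
print(patched)
```

The choice of `epsilon` matters: a multiplicative fitting procedure can never change a zero, but a patched cell may grow well beyond its initial value, so the result should be reviewed against the control totals.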

From a modelling perspective, the process of population synthesis needs a second phase. Here we need to identify person attributes within each household; again, we will be interested in retaining the person attribute marginal totals for each zone. This second phase typically includes three steps.

The first is to convert into integers the non-integer numbers of households per zone resulting from the first phase; fractions of households cannot be handled in agent or activity-based models. Second, a Monte Carlo procedure is typically employed to draw the correct number of households of each type from the HTS. Note that, as some of the desired data may not be available in the census or may not be accessible to the modeller, it is often necessary to sample from the HTS and from any activity diary dataset available. Third, the useful household and person variables are extracted from the drawn households and retained for use by the model system.
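These three steps can be sketched as follows. Everything specific here is an assumption made for illustration: the fractional cell counts, the tiny set of HTS records and their field names are invented, largest-remainder rounding is only one of several integerisation methods, and drawing with replacement using `random.choices` stands in for whatever Monte Carlo scheme a given model uses.

```python
import math
import random


def integerise(weights):
    """Largest-remainder rounding: the integers sum to the rounded total."""
    floors = [math.floor(w) for w in weights]
    shortfall = round(sum(weights)) - sum(floors)
    by_remainder = sorted(
        range(len(weights)), key=lambda k: weights[k] - floors[k], reverse=True
    )
    for k in by_remainder[:shortfall]:
        floors[k] += 1
    return floors


# Step 1: hypothetical fractional households per (size, income) type in a zone.
fitted = {
    ("1", "low"): 14.6, ("1", "high"): 5.4,
    ("2", "low"): 24.3, ("2", "high"): 15.7,
    ("3+", "low"): 16.1, ("3+", "high"): 13.9,
}
counts = dict(zip(fitted, integerise(list(fitted.values()))))

# Hypothetical HTS records: one or two surveyed households per type.
hts = [
    {"hh_id": 1, "type": ("1", "low"), "persons": [{"age": 71}]},
    {"hh_id": 2, "type": ("1", "high"), "persons": [{"age": 34}]},
    {"hh_id": 3, "type": ("2", "low"), "persons": [{"age": 45}, {"age": 43}]},
    {"hh_id": 4, "type": ("2", "high"), "persons": [{"age": 29}, {"age": 31}]},
    {"hh_id": 5, "type": ("3+", "low"), "persons": [{"age": 38}, {"age": 36}, {"age": 7}]},
    {"hh_id": 6, "type": ("3+", "high"), "persons": [{"age": 50}, {"age": 48}, {"age": 16}]},
    {"hh_id": 7, "type": ("2", "low"), "persons": [{"age": 66}, {"age": 64}]},
]

# Step 2: Monte Carlo draw, with replacement, of the required number per type.
rng = random.Random(42)  # fixed seed so the draw is reproducible
population = []
for hh_type, n in counts.items():
    candidates = [r for r in hts if r["type"] == hh_type]
    for record in rng.choices(candidates, k=n):
        # Step 3: keep only the variables the model needs, under a new id.
        population.append({
            "synth_id": len(population) + 1,
            "type": hh_type,
            "ages": [p["age"] for p in record["persons"]],
        })

print(counts)
print(len(population), "synthetic households")
```

Each synthetic household receives a new identifier instead of the survey one, in line with the point that no simulated entity should correspond to a real one.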

Optional steps

There is an optional fourth step used in some models to assign each household to a more precise location within its geographic unit. For example, for the detailed modelling of Demand Responsive Transit it is desirable to identify the coordinates of each household, each individual and each available unit (say e-scooter) to serve specific demands at particular times.

The final output from these processes is a synthetic population where each synthesised household and its members have many clearly defined characteristics of interest for use in the model system and, together, they match the estimated demographic distribution within each zone.

This synthetic population would be the basis for Agent and Activity Based Models; it will also facilitate a more detailed analysis of the distributional impacts of transport interventions using classic aggregate models.
