PhD research with a choice modelling context is carried out across different schools at the University of Leeds. For initial enquiries, relating to both topics and funding opportunities, please send an e-mail to General Enquiries from where your query will be directed to potential supervisors.
Specific funding opportunities are announced on this site as and when they arise.
Current potential topics, arranged by school of lead supervisor, include:
Institute for Transport Studies
Improving the understanding and prediction of human behaviour over time
Accurate understanding and forecasting of human behaviour is a core challenge in many disciplines. Predicting whether people will be infected by a disease, buy a product or choose to travel more sustainably is the focus of many researchers' efforts to develop accurate mathematical and statistical models to inform health, marketing and transport policies, to name a few. Such models are more likely to be able to capture causal mechanisms between outcomes if they make use of longitudinal data, i.e. data recording repeated observations (outcomes, choices) over time.
Models to understand and forecast human behaviour over time have been developed in multiple disciplines. In Mathematics, these models are structurally flexible and have strong forecasting accuracy, but they lack a foundation in behavioural theories. In Econometrics, by contrast, simpler models are used which do not fully exploit the richness of the data, but meaningful economic measures for policy making can be derived from their results. This project aims to apply some of the forecasting methods and flexible specifications of mathematical models to the framework typically used in transport research, with the aim of improving the quality of the models and the accuracy of predictions without renouncing the convenient foundation in Utility Theory. The results of this work will benefit not only transport research but also environmental and health research, by providing a better toolkit for understanding and predicting behaviours and outcomes in the long term.
While the advantages of panel data are well known to many scholars and practitioners, the richness of information they provide is rarely fully exploited. For example, transport researchers have used such data to investigate car ownership over the life course using mainly qualitative, descriptive or relatively simple statistical methods. One of the concepts that the literature mainly relies on is state dependence, i.e. the idea that current car ownership can be explained by car ownership in the past. While this is found to be a significant factor in modelling results, such an approach does not help us understand why the observed levels of car ownership have been maintained over the years. Indeed, the factors more likely to explain long-term car use (such as personal characteristics or attitudes) are generally only used to explain car ownership at each point in the panel, but not the overall trend. For example, a person’s income in a given year is used only to explain whether they own a car that year. More advanced methods providing alternative frameworks (e.g. Composite Marginal Likelihood) have been developed in the last decade, but their complexity and the steep learning curve involved in applying them have resulted in limited uptake.
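As a rough illustration of state dependence, the sketch below uses a binary logit in which last year's car ownership enters this year's utility. The functional form, the function name and all coefficient values are hypothetical and chosen only for illustration; they are not estimates from any study.

```python
import math


def ownership_probability(owned_last_year, income_k,
                          constant=-1.5, beta_state=1.5, beta_income=0.03):
    """Probability of owning a car this year under a simple binary logit.

    owned_last_year: 1 if the person owned a car in the previous year, else 0.
    income_k: income in the current year, in thousands (illustrative unit).
    All coefficients are made-up values, not estimates.
    """
    utility = constant + beta_state * owned_last_year + beta_income * income_k
    return 1.0 / (1.0 + math.exp(-utility))


# Same income, different ownership history.
p_previous_owner = ownership_probability(owned_last_year=1, income_k=30)
p_previous_non_owner = ownership_probability(owned_last_year=0, income_k=30)
print(p_previous_owner, p_previous_non_owner)
```

With a positive state-dependence coefficient, the previous owner has the higher ownership probability. As noted above, this describes persistence without explaining why it arises, and income enters only the current year's utility.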
Despite these limitations, the methods used in transport research are grounded in well-established behavioural frameworks (such as Random Utility Theory), ensuring the ability to derive economic measures highly valuable to policy makers, such as willingness to pay for goods and services. For example, a longitudinal dataset of rail journeys including ticket prices could be used to evaluate whether people would be willing to pay a given amount for a given trip in the future.
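To illustrate how such an economic measure can be obtained, the sketch below assumes a utility function that is linear in travel time and cost. Under that assumption, the willingness to pay for a one-minute time saving is the ratio of the time coefficient to the cost coefficient. The function name and coefficient values are hypothetical.

```python
def willingness_to_pay(beta_attribute, beta_cost):
    """Marginal willingness to pay under a linear-in-parameters utility.

    With V = beta_attribute * x + beta_cost * cost, keeping V constant after
    a one-unit reduction in x allows a cost increase of
    beta_attribute / beta_cost.
    """
    if beta_cost == 0:
        raise ValueError("beta_cost must be non-zero")
    return beta_attribute / beta_cost


# Hypothetical coefficients: utility per minute of travel and per pound spent.
beta_time = -0.05
beta_cost = -0.20

value_per_minute = willingness_to_pay(beta_time, beta_cost)
value_of_ten_minutes = 10 * value_per_minute
print(value_per_minute, value_of_ten_minutes)
```

With these illustrative values a traveller would pay roughly £0.25 per minute saved, i.e. about £2.50 for a ten-minute saving.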
The same is not true for other disciplines. In Mathematics, longitudinal data models are well developed. Unlike transport researchers, mathematicians do not limit themselves to model structures derived from recognised theories of behaviour, and instead produce flexible structures aimed at tackling the specific research question at hand. Such models are used, for example, for predicting disease outcomes and hospital admissions, and show good forecasting performance, but economic measures are not easily derivable from them.
On the other hand, forecasting from such models is believed to be more accurate. Not only is it common practice to consider exogenous factors (i.e. factors external to the elements captured in the model), but a range of further techniques are also used, such as regarding longitudinal data as functional data (a curve, rather than a data point) and using these curves to make predictions for the near future.
Against this backdrop, this project aims to incorporate some of the flexibility and forecasting tools of mathematical models into transport models, without deviating from the behavioural foundations of the latter, so that economic measures for policy-making can still be obtained. This will lead to the development of a better methodological toolkit for modelling panel data in transport and other fields where forecasting human behaviour is a priority, for example environment and health.
More specifically, the PhD candidate will first produce a detailed review of the existing literature on modelling panel and longitudinal data, in order to provide an overview of the different modelling methodologies across different fields, an effort which would benefit all researchers working with data collected over time. This first activity will also allow the candidate to identify the most desirable aspects of the different approaches and work on a framework to incorporate them. Two case studies will then be used to demonstrate the new methods, with different fields of application (e.g. transport and health), using open-access datasets which have been identified by the supervision team.
The supervision team
The supervision team is made up of two experts in behavioural modelling who bring the diversity of perspective needed to tackle this project. It includes an Economist working in the area of Choice Modelling, who has experience with modelling joint outcomes in panel data, and a Mathematician whose main area of work is longitudinal data analysis.
Chiara Calastri is a Lecturer in rail economics at the Institute for Transport Studies and a member of the Choice Modelling Centre at the University of Leeds. Her work focuses on human decision making, especially in the fields of travel behaviour, time use and social interactions. She has made contributions in the areas of modelling transport choices with complex data sources, including longitudinal and panel data.
Haiyan Liu is a University Academic Fellow at the Department of Statistics at the University of Leeds and an Alan Turing Research Fellow at the Alan Turing Institute. She focuses on developing statistical models for time series, functional, longitudinal, and survival data, with applications mainly in ecology, fintech, medical science, and psychology.
Modelling ‘difficult’ choices combining virtual reality, physiological and neurological data
Choice modellers aim to build mathematical models that predict and forecast which alternative(s) an individual will choose in a given scenario. The key aim of these models is to understand the main factors that lead different individuals to make different choices. Increased globalization and migration have, however, led to marked diversity among decision-makers, who differ not only in their sensitivity to the factors influencing choices (e.g. time and cost in the context of travel mode selection) but also in their decision rules: the way in which individuals think and make decisions may itself differ (decision-rule heterogeneity). For example, psychological theories have been put forward to explain how and why Asians and Westerners think differently and can make substantially different choices (e.g. Nisbett 2004). These differences are particularly prominent in the case of ‘difficult’ decisions, where the alternatives cannot be categorized as ‘right’ or ‘wrong’ and/or there is inherent uncertainty in the outcomes of a choice. For example, during the COVID-19 pandemic, significant heterogeneity has been observed around the world and among different ethnic groups in terms of risk-taking propensity, which in turn affected the shift in activity and travel patterns throughout this year (Dryhurst et al. 2020); country-level variations have also been observed regarding what is the ‘right’ decision in the so-called ‘moral machine’ choice setting, where an individual must choose who to save if an autonomous vehicle (AV) were hypothetically to crash (Awad et al. 2020).
To understand individual preferences, choice modellers have historically relied on stated preference (SP) surveys and responses to questionnaires on psychometric scales. But these are subject to hypothetical biases and measurement errors. These biases become more influential in the case of difficult, risky or moral decision-making, as decision-makers are unlikely to face the consequences of their choices and may additionally not wish to appear immoral. In our early research in this area, we have demonstrated how physiological data can be used to help understand decision-making. For example, Paschalidis et al. (2018) demonstrate that integrating physiological response data with driving choice data can better capture the effect of stress on driving decisions; Bogacz et al. (2019) demonstrate the use of electroencephalogram (EEG) data to understand risk perception when cycling in a virtual reality (VR) setting. A key advantage of physiological data is that it can sidestep hypothetical biases by providing outputs that cannot be controlled by the decision-maker (e.g. heart rate).
The aim of this project is to advance this further, both methodologically and empirically, by using physiological data combined with choice data in a wide range of VR representations of real-world scenarios to obtain more reliable forecasts of how individuals respond to risky or moral choice scenarios. This will allow us to better understand human reactions to risky situations with uncertain outcomes, which can be useful for artificial intelligence based policy planning. The resulting models can also help to better predict choices involving moral aspects, for example in the context of policies to promote the uptake of electric vehicles or to encourage carbon neutrality through carbon offsetting schemes.
Modelling transport and energy choices combining machine learning and econometric techniques using emerging big data sources
Transport choice models have historically relied on manually collected survey data which are expensive to obtain and generally have limited sample sizes and lower update frequencies. They are also prone to biases and reporting errors. This has led to a growing reliance on data from surveys presenting people with hypothetical choices, which are themselves subject to different biases.
On the other hand, over the last decade, passively collected big data sources have emerged as a very promising source of mobility and energy usage data for researchers and practitioners. These include GPS traces, mobile phone records, bank and loyalty card transactions and geo-coded social-media data. But the application of these data has been primarily limited to visualizations and the development of machine learning based predictions. The machine learning techniques for analyzing big data are, however, very often data-driven and lack behavioural underpinning, which calls into question their applicability in predicting behaviour in radically different future scenarios.
In our recent research, we have developed methodologies to enhance mode choice models with large-scale GPS data (Calastri et al. 2017) where we proved how the panel nature of the data can be utilized to capture additional behavioural complexity. Further, we have developed methodologies to successfully use mobile phone data in modelling trip generation (Bwambale et al. 2017) and route choice behavior (Bwambale 2018). This research proposes to build on these novel methodologies and combine mobile phone and GPS data with data from traditional sources (household surveys, census, roadside interviews and sensor counts) to further enhance this research direction which is not only academically novel and challenging but also capable of producing results that have real world benefits. In particular, combining the choice modelling and machine learning techniques will be investigated and solutions will be formulated. Further, the potential to apply these types of big data in jointly modelling activity durations and patterns beyond transport will be investigated which can play a crucial role in developing next generation energy consumption models.
The effective combination of machine learning and choice modelling will enable us to make better use of emerging data and to develop better and more comprehensive models, which can be updated more regularly and without new expensive data collection, allowing analysts to produce up-to-date results for the better evaluation of new transport and energy policies. Furthermore, the new methodologies will be immensely useful for smaller authorities or poorer countries, where data and resource limitations are major barriers to developing transport and energy models.
Demand for electric and hybrid cars - modeling vehicle replacement and type choice
Supervisor: Prof. Gerard de Jong
Electric and hybrid cars could contribute substantially to the required reduction in emissions and dependence on fossil fuels. The technology to do this is there. Crucial issues for the market penetration of this new technology are:
- How fast will consumers replace their current cars?
- How many of the replacement cars will be electric and hybrid cars?
Behavioural data on these important issues are largely missing. This project will develop a combined revealed/stated preference household survey which includes:
- Questions on actual attributes of the household, its members and its cars;
- Retrospective questions on the car ownership history of the household;
- Stated choice experiments on car type choice, including attributes that are especially relevant for electric and hybrid cars, such as fuelling range, top speed and luggage space.
As part of this project, the questionnaire will be used to interview several hundred UK households. The resulting data will then be used to estimate models for:
- The timing of vehicle replacement (hazard-based duration models or Markov models) and other changes in the household car ownership status (e.g. moving to more cars or fewer cars);
- Vehicle type choice (discrete choice models, including mixed logit), focussing on electric and hybrid cars;
- Vehicle use.
Finally, the estimated models will be used to carry out policy simulations, such as on the effect of measures to accelerate replacement (e.g. scrappage schemes), subsidies on the purchase of electric and hybrid cars, and emission taxes.
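As a minimal sketch of the idea behind a hazard-based duration model for replacement timing, the example below works in discrete time. The hazard in each year is the probability that a household replaces its car in that year, given that it has not yet done so. The discrete-time form, the function name and the hazard values are assumptions made for illustration only; the project may well use a continuous-time specification.

```python
def survival_curve(yearly_hazards):
    """Probability that the current car is still held at the end of each year.

    yearly_hazards[t] is the probability of replacing the car in year t,
    conditional on not having replaced it before year t.
    """
    still_held = 1.0
    curve = []
    for hazard in yearly_hazards:
        if not 0.0 <= hazard <= 1.0:
            raise ValueError("each hazard must lie between 0 and 1")
        still_held *= 1.0 - hazard
        curve.append(still_held)
    return curve


# Hypothetical hazards that rise as the car ages.
baseline = survival_curve([0.05, 0.10, 0.20, 0.30])

# A hypothetical scrappage scheme that raises the hazard in later years.
with_scheme = survival_curve([0.05, 0.10, 0.30, 0.45])
print(baseline, with_scheme)
```

Comparing the two curves shows how a policy that raises the replacement hazard reduces the share of households still holding their current car after a given number of years.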
Jong, G.C. de (1996); A disaggregate model system of vehicle holding duration, type choice and use; Transportation Research B, 30-4, pp 263-276.
Jong, G.C. de, J. Fox, A.J. Daly, M. Pieters and R. Smit (2004); Comparison of car ownership models; Transport Reviews, 24-4, pp 379-408.
Rashidi, T.H., K. Mohammadian and F. Koppelman (2009); A dynamic hazard-based structural equations model of vehicle ownership with endogenous residential and job location changes incorporating group decision making; Paper presented at the International Choice Modelling Conference 2009, Harrogate.
Are we modelling the wrong thing: differences between the psychologists' and the modellers' view of behaviour
Mathematical models representing human behaviour are used extensively in the field of transport and beyond. These models are used to analyse existing choices and forecast likely behaviour in a changing environment, e.g. the provision of new transport facilities, the introduction of new electricity pricing structures or the building of a new hospital.
To a large extent, these models are based on a compensatory approach, in which a person is assumed to make choices by trading off different attributes against one another. As an example, one mode of travel may be faster, but an alternative mode is cheaper; one train will get us to work on time, but the slightly later train is considerably less congested. The values of the different attributes of an alternative all affect that alternative’s probability of being chosen, where the negative effect of one attribute may be cancelled out by the more positive effect of another attribute.
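The trade-off can be made concrete with a small sketch. It assumes a utility that is linear in travel time and cost, and a multinomial logit form for the choice probabilities. Both are common choices but are assumptions here, and the coefficients and attribute values are hypothetical.

```python
import math


def linear_utility(time_min, cost_gbp, beta_time=-0.05, beta_cost=-0.20):
    """Compensatory utility: every attribute contributes additively."""
    return beta_time * time_min + beta_cost * cost_gbp


def logit_probabilities(utilities):
    """Multinomial logit choice probabilities from a dict of utilities."""
    max_utility = max(utilities.values())  # subtracted for numerical stability
    exp_utilities = {alt: math.exp(u - max_utility) for alt, u in utilities.items()}
    total = sum(exp_utilities.values())
    return {alt: value / total for alt, value in exp_utilities.items()}


# The train is 12 minutes faster, but the bus is 3 pounds cheaper.
utilities = {
    "train": linear_utility(time_min=30, cost_gbp=6.0),
    "bus": linear_utility(time_min=42, cost_gbp=3.0),
}
probabilities = logit_probabilities(utilities)
print(probabilities)
```

With these illustrative coefficients, the time advantage of the train is offset by the cost advantage of the bus, so the two alternatives end up with essentially equal choice probabilities.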
A different view of behaviour, however, exists in various strands of the mathematical psychology literature. Here, evidence suggests that some people do not in fact engage in compensatory evaluation of alternatives, but instead make use of various alternative heuristics to arrive at their choices. This could, for example, involve lexicographic behaviour, the existence of reference points or the presence of thresholds in sensitivities or tolerances.
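One such heuristic, lexicographic behaviour, can be sketched as follows: the decision-maker ranks the alternatives on the most important attribute alone and looks at the next attribute only to break ties. The function name, attribute names and values are hypothetical, and lower values are assumed to be better for every attribute.

```python
def lexicographic_choice(alternatives, attribute_order):
    """Choose by the first attribute; use later attributes only to break ties.

    alternatives: dict mapping alternative name to a dict of attribute values.
    attribute_order: attribute names from most to least important.
    Lower values are assumed to be better for every attribute.
    """
    candidates = list(alternatives)
    for attribute in attribute_order:
        best = min(alternatives[name][attribute] for name in candidates)
        candidates = [name for name in candidates
                      if alternatives[name][attribute] == best]
        if len(candidates) == 1:
            break
    return candidates[0]  # any remaining tie is broken by input order


alternatives = {
    "train": {"time_min": 30, "cost_gbp": 6.0},
    "bus": {"time_min": 42, "cost_gbp": 3.0},
    "coach": {"time_min": 30, "cost_gbp": 9.0},
}

time_first = lexicographic_choice(alternatives, ["time_min", "cost_gbp"])
cost_first = lexicographic_choice(alternatives, ["cost_gbp", "time_min"])
print(time_first, cost_first)
```

A traveller who puts time first picks the train, however large the cost saving offered by the bus, while one who puts cost first picks the bus. In neither case can an attribute lower in the ordering make up for a disadvantage on a higher one.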
The aim of this PhD project is to first revisit the limited amount of existing work contrasting and combining the often disparate methodologies from the fields of economics and mathematical psychology. In-depth studies will then be conducted to investigate under which circumstances the assumptions made in traditional approaches may not be justified. Ultimately, the aim is to expand the existing methodological framework to be able to adequately represent decision making processes that are well established in the mathematical psychology literature, but which are largely ignored in the modelling field. By better understanding and representing the underlying behavioural structures, the project will seek to enhance the predictive power of models used to plan the provision and usage of scarce services and resources (such as healthcare, energy and transportation).
While the topic is concerned with the interface between psychology, economics and mathematics, the proposed research will be highly methodological in nature, and a strong quantitative background will be expected from the student. Some programming skills will also be desirable.
References – suggested reading
Batley, R. and Daly, A. (2006) On the equivalence between elimination-by-aspects and generalised extreme value models of choice behaviour. Journal of Mathematical Psychology, 50 (5), pp456-467.
Batley, R. and Toner, J. (2003) Elimination-by-aspects and advanced logit models of stated preferences for alternative-fuel vehicles. Proceedings of the European Transport Conference, Strasbourg, October 2003.
Hess, S., Rose, J.M. and Polak, J.W. (2009), Non-trading, lexicographic and inconsistent behaviour in stated preference data, Transportation Research Part B, forthcoming.
Simon, H.A. (1959) Theories of decision-making in economics and behavioral science. American Economic Review, 49 (1), pp253-283.
Train, K.E. (2003) Discrete choice methods with Simulation, Cambridge University Press, Cambridge, MA.
Tversky, A. (1972) Elimination by aspects: a theory of choice. Psychological Review, 79 (4), pp281-299.
Developments in experimental design for stated choice surveys and alternative preference elicitation procedures
The analysis of travel behaviour requires as its main input data on travel decisions (choices) made by individual respondents. However, in many situations, data on real world choices is either not available or is not suitable for the purposes of the proposed analysis. As a result, an increasing number of studies rely on data collected through surveys which present respondents with hypothetical choice scenarios. Data from such stated preference (SP) surveys are used not only in academic work but also form the backbone of many studies advising policy makers in scenarios as wide ranging as the building of new roads, the introduction of road pricing or the investment in new rolling stock.
The majority of work using SP methods now makes use of stated choice (SC) surveys, in which respondents are asked to choose their most preferred option amongst a set of mutually exclusive alternatives. Approaches such as ranking or rating exercises have been largely discredited in a transport context, as have transfer price methods, which aim to directly obtain the willingness by respondents to pay for developments or improvements. However, outside a transport environment, these methods are experiencing a renaissance, and new developments, such as best-worst, a halfway measure between choice and full ranking, are gaining in popularity. At the same time, in transport and elsewhere, researchers are constantly devising new methods to improve the efficiency of the various available survey techniques. The net outcome is that there is substantial confusion at the user end, with practitioners often unsure which approach would be most applicable in their given context.
The aim of this PhD project would be to conduct an in-depth comparison of the different available methods, highlighting which approaches are most appropriate in which context. Additionally, the work would look at the potential for combining various existing methods. Finally, where appropriate, further methodological developments would be made.
Louviere, J.J., Hensher, D.A. and Swait, J.D. (2000) Stated Choice Methods: Analysis and Application. Cambridge University Press.
PTRC (2000) Stated Preference Modelling Techniques. A compilation of major papers from PTRC’s meeting and conference material. Edited by J de D Ortuzar. PTRC, London.
Mode and shipment size choice models on data for individual shipments
Supervisor: Prof. Gerard de Jong
Mode choice in freight transport is usually studied in isolation. However, mode and shipment size are closely linked decisions. Large shipment sizes usually coincide with higher market shares for non-road transport, whereas there is a high correlation between road transport and small shipment sizes. Decisions on shipment size (or delivery frequency) need to be studied taking a logistics approach (e.g. reducing inventories by more frequent, just-in-time deliveries) that encompasses the more limited transport costs approach.
The Swedish 2004-2005 Commodity Flow Survey (CFS) is a unique data source in Europe. It contains details of more than 2.5 million individual shipments to or from a company in Sweden, with information on origin, destination, modes used, weight and value of the shipment, sector of the sending firm, commodity type, access to rail tracks and quays, etc. Whilst the US Commodity Flow Survey has been analysed several times, its Swedish counterpart has barely been used for model estimation so far. Using this Swedish CFS, mode and shipment size choice at the individual shipment level can be explained from characteristics of the shipper, the shipment and transport time and cost on the networks.
Earlier work at ITS Leeds used the CFS 2001 to estimate mode and shipment size models. Multinomial and nested logit models were estimated on the CFS 2004-2005 in a Master's thesis project at Delft University of Technology in The Netherlands.
This PhD project will extend the models estimated so far on the CFS 2004-2005 in many ways:
- estimation of different models for different commodity classes (observed heterogeneity)
- estimation of models with different transport and logistics costs functions
- estimation of mixed logit models following the random coefficients specification to account for unobserved heterogeneity
- estimation of models where shipment size is treated as a continuous variable instead of discrete size classes, simultaneously with (discrete) mode choice.
Furthermore the project will look into the implications of these modelling options for the value of time and freight demand elasticities – the model outputs that are typically used to evaluate transport policies.
Jong, G.C. de and M.E. Ben-Akiva (2007) “A micro-simulation model of shipment size and transport chain choice”, Special issue on freight transport of Transportation Research B, 41, pp. 950-965.
School of Earth and Environment
Public preferences for invasive versus non-native species: a transnational study
Across the European Union, some 12,000 non-native species have become established, some of which can have significant impacts on native biodiversity, agriculture, forestry, fisheries and infrastructure. For example, the economic cost to Europe of invasive alien species has been estimated at €12.5 to 20 billion per year, and governments across the continent spend many billions on prevention, control and eradication programmes.
However, many invasive species, such as the grey squirrel in the UK, or any of the 13 non-native parrot species now found in Europe, can attract strong public support and concern, representing a complex socio-environmental conflict. Policy calls for culling and control are often strongly opposed even though the ecological and economic cases can be compelling. There is therefore a strong case to examine the general public’s preferences for invasive species in order to inform and set policy responses to the species. One common way of quantifying preferences in such complex situations is to design and deliver choice experiments.
The aim of this PhD will be to use choice modelling to quantify public preferences for invasive species across different countries in Europe. In particular, the project would investigate: (1) Across multiple European countries, do the general public prefer native or non-native species? (2) How do these preferences vary according to the characteristics of the species concerned? (3) Invasive species are often localised in occurrence. How might public preferences vary according to people’s individual experiences of the species concerned?
Academic Unit of Health Economics
Patient choice in healthcare - do we really know what is best for us?
In order to facilitate shared decision making, patients in England are being offered more information and choice regarding their healthcare. This includes more information on treatments and on hospital and health care professional (HCP) performance, and more choice over treatments and over which hospitals and HCPs will provide healthcare. In making choices, particularly over treatments, patients often have to read and interpret relevant information, interpret the associated risks and accurately imagine treatment benefits and side effects such that their choices are consistent with their current and future preferences. Whether or not patients make the ‘correct’ or optimal decisions depends on their ability to interpret information and predict the benefits and harms of treatments.
This increased choice has significant implications, not only for healthcare resource management, but also for patient satisfaction and outcomes and potentially for issues such as treatment adherence. Using stated and/or revealed preference analysis methods, the proposed PhD research will explore the extent to which individuals are able to make optimal choices about healthcare and determine the impact on this ability of factors such as educational level, mood, experience and attitudes to risk. The research will provide important insights into the way information and choices are presented to individuals and potentially has significant policy implications.