Motivation

  • Driving is a complex task that humans perform on a daily basis
  • It involves
    • motor control (e.g., steering)
    • path finding (e.g., home to work)
    • attentional processes and object recognition
  • Difficult to investigate these aspects on the real road
    • \(\Rightarrow\) Driving simulators!

Driving simulators

  • Driving simulators offer a variety of benefits:
    • controlled and standardized environment
    • ease of collecting high quality data (distance to other cars, reaction time, etc.)
    • can simulate dangerous situations without harming participants
    • provide a means to test novel user-interfaces and assistance systems
    • overall, have a high cost-effectiveness (Winter et al., 2009)

Driving simulators

  • Driving simulators are machines of extreme technical complexity (Schöner & Morys, 2015)
  • Various aspects influence the validity of driving simulators
    • Physical validity: degree to which the simulator reproduces physical reality \(\checkmark\)
    • Behavioural validity: degree to which driving behaviour in the simulator corresponds to behaviour on the real road \(\checkmark\)
    • Do the computer-simulated drivers behave like real humans? Current question!

Modeling drivers

  • There exist different approaches to modeling traffic (van Wageningen-Kessels et al., 2014)
    • Macroscopic traffic simulations
    • Microscopic traffic simulations
      • Specify certain parameters for the drivers (e.g., desired velocity)

  • How do we calibrate the parameter values and model differences among drivers?

Modeling drivers: Empirical basis

  • The driver model assumes six distinct driver profiles, based on studies (total \(N = 10{,}557\)) conducted between 1972 and 1992 in Germany and Switzerland (Hürlimann, 1996)

  • 3 categories (Speed, Safety, Consideration) with 9 parameters each
  • Calibration based on data from 266 Daimler employees

Overview: Studies

  • Try to answer two broad questions
    • (a) Can we recover the driver profiles based on simulated data? How similar are the profiles to each other?
      \(\Rightarrow\) Simulation-based study

    • (b) Are the new driver profiles an improvement? In general, what do participants think about the traffic simulation?
      \(\Rightarrow\) Experimental study

Simulation-based study

Setup

  • For every driver profile: 50 data sets (75.9 km) for low, medium, and high traffic density

  • A data set consists of measurements of
    • Speed
    • Headway
    • Acceleration
    • Number of lane changes
    • Total amount of time spent in each lane

  • This is a classification problem with several features
    \(\Rightarrow\) Random forests!

Example: Speed

Classification Trees

  • To illustrate: short botanical excursion to the Iris data set (Fisher, 1936)

Classification Trees

data(iris)

head(iris, 10)
##    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1           5.1         3.5          1.4         0.2  setosa
## 2           4.9         3.0          1.4         0.2  setosa
## 3           4.7         3.2          1.3         0.2  setosa
## 4           4.6         3.1          1.5         0.2  setosa
## 5           5.0         3.6          1.4         0.2  setosa
## 6           5.4         3.9          1.7         0.4  setosa
## 7           4.6         3.4          1.4         0.3  setosa
## 8           5.0         3.4          1.5         0.2  setosa
## 9           4.4         2.9          1.4         0.2  setosa
## 10          4.9         3.1          1.5         0.1  setosa
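
A classification tree for these data can be grown in a few lines. The sketch below assumes the rpart package (shipped with standard R installations); the slides do not say which implementation produced the tree shown in the talk, so this is only an illustration.

```r
# Illustrative sketch: the package choice (rpart) is an assumption,
# not necessarily what produced the tree shown in the talk.
library(rpart)

data(iris)

# Grow a classification tree predicting species from the four measurements
tree <- rpart(Species ~ ., data = iris, method = "class")

# Text representation of the splits
print(tree)

# Predicted class for every flower and the resulting training accuracy
pred <- predict(tree, newdata = iris, type = "class")
train_accuracy <- mean(as.character(pred) == as.character(iris$Species))
train_accuracy
```

Training accuracy is optimistic because the tree is evaluated on the data it was fitted to; cross-validation gives a fairer estimate.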

Random Forests

  • Random forests
    • aggregate the predictions of many classification trees (e.g., Strobl, Malley, & Tutz, 2009)
    • also allow for clustering (Liaw & Wiener, 2002)
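
Both uses can be sketched on the Iris data. The example assumes the randomForest package is installed; the number of trees and the seed are arbitrary choices for illustration, and the multidimensional-scaling step is one possible way to visualise the proximities, not necessarily the one used for the clustering plots in this talk.

```r
# Illustrative sketch on iris; assumes the randomForest package is installed.
library(randomForest)

data(iris)
set.seed(1)  # arbitrary seed, for reproducibility only

# Fit a forest; proximity = TRUE stores how often two observations
# end up in the same terminal node across trees
rf <- randomForest(Species ~ ., data = iris, ntree = 500, proximity = TRUE)

# Out-of-bag confusion matrix
rf$confusion

# 1 - proximity acts as a dissimilarity, which classical
# multidimensional scaling maps to two dimensions
coords <- cmdscale(1 - rf$proximity, k = 2)
plot(coords, col = as.integer(iris$Species), pch = 19,
     xlab = "Dimension 1", ylab = "Dimension 2")
```

Observations that frequently share a terminal node lie close together in the plot, so well-separated classes show up as distinct clusters.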

Prediction results

  • Leave-one-out cross-validation
  • Prediction accuracy decreased with increasing traffic density
    • Low density: 89% (11%)
    • Medium density: 84% (14%)
    • High density: 64% (18%)
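
Leave-one-out cross-validation can be sketched as follows. To keep the example short and fast it uses a single rpart tree on the Iris data rather than a random forest on the simulated driving data, so the resulting number is unrelated to the accuracies above.

```r
# Illustrative sketch of leave-one-out cross-validation on iris,
# using a single classification tree (rpart) as a stand-in classifier.
library(rpart)

data(iris)

n <- nrow(iris)
correct <- logical(n)

for (i in seq_len(n)) {
  # Fit on all observations except the i-th ...
  fit <- rpart(Species ~ ., data = iris[-i, ], method = "class")
  # ... and predict the held-out observation
  pred <- predict(fit, newdata = iris[i, , drop = FALSE], type = "class")
  correct[i] <- as.character(pred) == as.character(iris$Species[i])
}

# Proportion of held-out observations classified correctly
loo_accuracy <- mean(correct)
loo_accuracy
```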

Clustering: Low traffic density

Clustering: Medium traffic density

Clustering: High traffic density

Conclusion

  • Could distinguish the driver profiles reasonably well under low and medium traffic density
    • except for the aggressive and sporty profiles
  • Discrimination much worse under high traffic density
    • Not unreasonable since some features become less distinguishing (e.g., desired velocity)
    • If discriminability is desirable in these settings, the driving model requires extension

  • Concerned solely with "internal consistency"
    • Given some simulated data and features, can we recover the driver profiles?
    • Does not say anything about whether humans would distinguish the profiles
      [or whether they find them realistic at all!]

Experimental Study

Setup

  • A previous version of the driving simulation did not model individual differences
    • I call this version the "old" version

  • Experiment focused on the aggressive and calm driver profiles
    • Are they "better" than the previous version of drivers?

  • \(32^{\star}\) participants took part in a driving simulator study at the FKFS in Stuttgart

Setup

  • Quantitative part
    • Instruction: the other cars are driven either by a human or by a computer
    • Two questions after each scenario
      • Was the driver a human or a computer? (0, 1)
      • How realistic was the driving behaviour? (1 - 7)
      • [Simulator sickness]

  • Qualitative part:
    • free driving
    • subjects were instructed to think aloud and comment on what's happening

Hypotheses

  • In comparison to the old driver profiles, the behaviour of aggressive and calm drivers should be
    • judged more frequently as being human (0, 1)
    • judged as being more realistic (1 - 7)

  • These are directed hypotheses
  • Note also that they apply only to overtaking on a highway

Raw data: Human or Computer

Raw data: Realism judgements

Analysis

  • Used Bayesian logistic and ordinal regression models to analyze the data
  • Computed Bayes factors as a continuous measure of evidence (e.g., Morey, Romeijn, & Rouder, 2013)

\[ \underbrace{\frac{p(\mathcal{M}_1 \mid y)}{p(\mathcal{M}_0 \mid y)}}_{\text{Posterior Odds}} = \underbrace{\frac{p(y \mid \mathcal{M}_1)}{p(y \mid \mathcal{M}_0)}}_{\text{Bayes Factor}} \times \underbrace{\frac{p(\mathcal{M}_1)}{p(\mathcal{M}_0)}}_{\text{Prior Odds}} \]

  • Requires computing the marginal likelihood \(p(y \mid \mathcal{M}) = \int_{\Theta} p(y \mid \mathcal{M}, \theta) \, p(\theta \mid \mathcal{M}) \, \mathrm{d}\theta\)
  • Can all be done in brms!
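
For intuition, the marginal likelihood and the resulting Bayes factor can be computed by hand in a toy case. The sketch below is not the regression model analysed in the study: the data (k out of n judgements), the point null at 0.5, and the uniform prior under the alternative are all hypothetical choices made so that the integral is one-dimensional.

```r
# Toy example with hypothetical data: k "human" judgements out of n trials
k <- 14
n <- 20

# M0: probability fixed at 0.5, so the marginal likelihood is just the likelihood
ml0 <- dbinom(k, n, prob = 0.5)

# M1: theta ~ Uniform(0, 1); integrate likelihood times prior over theta
integrand <- function(theta) dbinom(k, n, prob = theta) * dunif(theta)
ml1 <- integrate(integrand, lower = 0, upper = 1)$value

# Bayes factor in favour of M1 over M0
bf10 <- ml1 / ml0

# Posterior odds = Bayes factor x prior odds (here: equal prior odds)
prior_odds <- 1
posterior_odds <- bf10 * prior_odds
bf10
```

With a uniform prior the integral has the closed form 1 / (n + 1), which offers a quick check on the numerical result. For regression models the integral is high-dimensional and has to be approximated.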

Model comparison: Results

  • Used weakly informative Cauchy priors for the predictors (Gelman et al., 2008)
  • Compared to the old version of the driver profiles, and specific to the situation we tested, …
    • strong evidence that aggressive drivers are more frequently perceived as human, \(BF_{+0}^a = 22.43\)
    • anecdotal evidence that calm drivers are more frequently perceived as human, \(BF_{+0}^c = 1.15\)
  • Anecdotal evidence for aggressive drivers being perceived as more realistic, \(BF_{+0}^a = 2.03\)
  • Anecdotal evidence for calm drivers being perceived as more realistic, \(BF_{+0}^c = 1.17\)
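
  • As a reading aid, and assuming equal prior odds for the two models (an assumption not stated on the slides), a Bayes factor translates into a posterior model probability:

\[ p(\mathcal{M}_+ \mid y) = \frac{BF_{+0}}{1 + BF_{+0}}, \qquad \frac{22.43}{1 + 22.43} \approx 0.96, \qquad \frac{1.15}{1 + 1.15} \approx 0.53 \]

  • Under that assumption, a Bayes factor near 1 leaves the two models almost equally probable, which is why such values count as anecdotal evidence only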

Qualitative Part

  • 15 minutes of free driving, think aloud protocol
  • Points of improvement
    • cars drove in their lanes too perfectly
    • headway to the participant and each other was unrealistically large
    • no critical occurrences
    • drivers used the indicator too frequently
    • trucks rarely overtook each other
    • the pull towards the right lane was too strong, which resulted in unrealistic overtaking

Conclusion

  • The aggressive driver profile is perceived as more human-like than the old profiles
    • Most likely due to the increased speed of the aggressive driver ("computer would follow norms")

  • Some Issues
    • Experimental design incorporated no ground truth
    • Comparisons are very situation-specific
    • Participants received very little information about the other drivers
    • Realism is certainly not unidimensional, high variance in its interpretation
    • What, in a modern car, is the difference between a human and a computer driver?

  • Different experimental designs are probably more appealing
    • Videos of certain situations shown with all driver profiles
    • Participants have to rate which profile is most likely given the behaviour displayed

Materials

  • All materials available from https://osf.io/dw5hr/
    • Includes all data, analysis code, slides, thesis manuscript
    • Also includes an appendix on "Some Issues in Psychological Science"
      • Written for applied researchers; provides
        • an overview of the replication crisis
        • an introduction to Bayesian statistics
        • three practical recommendations for applied research

Thanks for your attention!

Appendix

Agent Simulation: Implementation (Detail)

Which parameters did you use in the random forest?

  • Random forests have two main tuning parameters
    • number of trees (selected 2000)
    • number of predictors randomly sampled at each split (default: \(\sqrt{p}\))
  • Random forests are generally robust with respect to these settings (Liaw & Wiener, 2002)
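
The two settings can be illustrated on the Iris data. The sketch assumes the randomForest package; only the number of trees (2000) comes from the slide, while the data set, the seed, and the sensitivity check over all possible values of mtry are illustrative additions.

```r
# Illustrative sketch on iris; assumes the randomForest package is installed.
library(randomForest)

data(iris)
set.seed(1)  # arbitrary seed, for reproducibility only

p <- ncol(iris) - 1             # number of predictors (4 for iris)
default_mtry <- floor(sqrt(p))  # default for classification forests

# ntree = 2000 as on the slide; mtry left at its default
rf <- randomForest(Species ~ ., data = iris, ntree = 2000)
rf$mtry == default_mtry

# Out-of-bag error for each possible mtry, to gauge sensitivity
oob <- sapply(seq_len(p), function(m) {
  fit <- randomForest(Species ~ ., data = iris, ntree = 2000, mtry = m)
  fit$err.rate[fit$ntree, "OOB"]
})
round(oob, 3)
```

If the out-of-bag errors barely differ across values of mtry, the default is a reasonable choice for the data at hand.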