This Website is not fully compatible with Internet Explorer.
For a more complete and secure browsing experience please consider using Microsoft Edge, Firefox, or Chrome

Monte Carlo simulation

4. The Vampire Problem: A Numerical Solution

In the previous section, the vampire problem was solved analytically using Bayes' theorem. For this simple example, the calculation can be done directly. However, many realistic uncertainty quantification problems are too complex to solve by hand. In those cases, numerical methods become useful.

This section solves the same problem using Monte Carlo simulation. Instead of calculating the result directly from equations, we generate many synthetic observations from the assumed probabilistic model and then count the outcomes of interest.

 

 

The model

The simulation starts by generating an imagined population. The number of simulated people is defined by n. In this example, n <- 1e7 means that ten million samples are generated. R uses the <- operator for assignment.

Each simulated person is assigned as either a vampire or a mortal. The probability of being a vampire is set to 1%, so most simulated people are mortal.

The test result is then generated conditionally. If the simulated person is a vampire, the probability of a positive test result is 90%. If the simulated person is mortal, the probability of a positive test result is 8%.

In R, this can be represented compactly as:

library(tidyverse)

n <- 1e7
df <-
  tibble(vampire = rbinom(n, 1, p_vampire) == 1) |>
  mutate(pos = rbinom(n, 1, if_else(vampire, p_pos_if_vampire, p_pos_if_mortal)) == 1)

Here, tibble from the tidyverse library creates a data structure for holding the simulation results. The function rbinom is used to generate yes-or-no outcomes. The first use of rbinom creates n Monte Carlo samples that are vampires with probability p_vampire.

The pipe operator |> passes the output of the preceding operation as input to the next operation. This makes it possible to create a data processing pipeline by applying one operation after another.

The second use of rbinom generates the test result. The probability that a simulated test result is positive depends on the vampire status that was simulated in the previous line. This is handled using if_else: if the simulated person is a vampire, the probability p_pos_if_vampire is used; otherwise, the probability p_pos_if_mortal is used.

In R, numbers are treated as vectors. A scalar value is simply a vector of length one. This is why the same functions can conveniently work with both single values and many simulated samples.

The simulation produces samples from the joint probability distribution of two quantities:

  • whether the person is a vampire;
  • whether the test result is positive.

A glimpse of the raw results

The result of the probabilistic simulation is a table of simulated samples. Each row represents one simulated person, with two logical values: whether that person is a vampire, and whether their test result is positive.

For example, if we look only at the simulated vampires, the table might begin as follows:

# A tibble: 99,988 × 2 vampire pos <lgl> <lgl> 1 TRUE TRUE 2 TRUE FALSE 3 TRUE TRUE 4 TRUE TRUE 5 TRUE TRUE 6 TRUE TRUE 7 TRUE TRUE 8 TRUE FALSE 9 TRUE TRUE 10 TRUE TRUE 11 TRUE FALSE 12 TRUE TRUE 13 TRUE TRUE 14 TRUE TRUE 15 TRUE TRUE # ℹ 99,973 more rows

This is what we expect from the model: most vampires have a positive test result, because P(positive | vampire) was set to 90%. However, not all vampires test positive, because the test is not perfect.

If we look only at the simulated mortals, the table might begin as follows:

# A tibble: 9,900,012 × 2 vampire pos <lgl> <lgl> 1 FALSE FALSE 2 FALSE FALSE 3 FALSE FALSE 4 FALSE FALSE 5 FALSE FALSE 6 FALSE FALSE 7 FALSE FALSE 8 FALSE FALSE 9 FALSE FALSE 10 FALSE FALSE 11 FALSE FALSE 12 FALSE FALSE 13 FALSE FALSE 14 FALSE FALSE 15 FALSE FALSE # ℹ 9,899,997 more rows

Most mortals have a negative test result, because P(positive | mortal) was set to only 8%. However, because mortals are much more common than vampires, even a relatively small false positive probability can still create many positive test results among mortals.

Posterior probabilities

Once we have generated enough samples, we can estimate the conditional probability by grouping the results by test outcome and counting how often the person is actually a vampire in each group.

df |> group_by(pos) |> summarize(p_vampire = mean(vampire))

The group_by command groups the results by the test result. The vampire status is then summarised separately for each group. Because TRUE is treated as 1 and FALSE as 0 when taking a mean, mean(vampire) gives the proportion of vampires in each group.

The result is approximately:

# A tibble: 2 × 2 pos p_vampire <lgl> <dbl> 1 FALSE 0.00111 2 TRUE 0.102

This means that, among simulated people with a positive test result, only about 10.2% are actually vampires. The numerical result P(vampire | positive) is therefore very close to the analytical result from Bayes' theorem.

Results

The numerical simulation confirms the earlier analytical calculation. Even though the test identifies vampires correctly with high probability, a positive test result is still weak evidence because vampires are rare in the population.

The important lesson is that P(positive | vampire) is not the same as P(vampire | positive). A test can be reliable in one direction, while still giving a low probability that a positive result identifies the rare condition of interest.

Even though the test itself seems fairly reliable, it is arguably useless for separating vampires from mortals. The reason is not that the test has no information, but that the event being tested for is rare and the false positives among mortals dominate the positive test results.

Don't drive a stake through somebody's heart based on the result of this test.

This idea is directly relevant to engineering reliability assessment. When looking for rare events, such as rare defects or uncommon failure modes, the base rate matters. A model or test may appear reliable, but false positives can still dominate the result if the event is sufficiently rare.