I might be one of the few quants who suck at math. Or maybe there’s loads of us and we’re just hiding in shame.
So I always try to find intuitive ways to understand everything from probabilities to options.
One of the most important and fun ways to solve a lot of math problems is simulation. I have lately been in love with the concept of simulation as it takes us on a joyride of game-like methods.
While I was thinking about this, I remembered some exercises I did back in the day to drive home intuition about unrelated things. Let’s discuss a couple of them here today:
Estimating the value of pi using simulation.
It’s impossible to think of pi without thinking of a circle: both its area and its perimeter have a factor of pi. If we assume a circle with diameter d, a square it can be inscribed in will have a side of length d as well.
Now the area of such a circle = pi * (d/2)^2 = pi * d^2 / 4.
And the area of such a square = d^2.
So we can see that pi/4 is simply the area of the circle divided by the area of the square, right?
Next, we have to find a way to calculate the area of the circle without using pi since that’s what we’re trying to estimate in this exercise.
To do this, we’re going to do something that blew my mind when I first read about it or did it, and hopefully it blows yours too.
We’re going to imagine a square dartboard (I know), and we’re going to throw some darts at it at random. If we throw enough darts to cover the entire surface area of this square-ass dartboard, the fraction of darts landing inside the circle should approximate the area of the circle divided by the area of the square, assuming each dart is a point thick, shouldn’t it?
If this didn’t blow your mind, I apologize. It did mine.
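For the record, the dart argument boils down to one line of algebra. This is only a sketch: N (the number of darts thrown) and N_in (the number landing inside the circle) are labels introduced here for illustration.

```latex
\[
\frac{N_{\text{in}}}{N}
  \approx \frac{\text{area of circle}}{\text{area of square}}
  = \frac{\pi (d/2)^2}{d^2}
  = \frac{\pi}{4}
\quad\Longrightarrow\quad
\hat{\pi} = 4 \cdot \frac{N_{\text{in}}}{N}
\]
```

Note that d cancels out, so the size of the dartboard doesn’t matter.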
Code is in R and pretty much self-explanatory, but you’re free to go through the full markdown linked at the end of the page.
library(dplyr)   # mutate(), summarise(), %>%
library(tibble)  # tibble()

num_iterations <- 1000000
# Throw darts uniformly at a 10 x 10 square centred on the origin
x <- runif(num_iterations, min = -5, max = 5)
y <- runif(num_iterations, min = -5, max = 5)
df <- tibble(x, y)
df %>%
  # A dart is inside the inscribed circle (radius 5) if it lands within 5 of the origin
  mutate(distance_from_origin = sqrt(x^2 + y^2),
         is_inside_circle = distance_from_origin <= 5) %>%
  summarise(proportion = sum(is_inside_circle) / num_iterations,
            area_of_square = 10 * 10,
            area_of_circle = proportion * area_of_square,
            pi = area_of_circle * 4 / area_of_square)
# A tibble: 1 × 4
  proportion area_of_square area_of_circle    pi
       <dbl>          <dbl>          <dbl> <dbl>
1      0.785            100           78.5  3.14
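If you want to see how the number of darts affects the estimate, here’s a minimal base R sketch. The helper `estimate_pi()` and its `half_side` argument are made up for this illustration, and the seed is arbitrary.

```r
# Sketch: how the pi estimate changes with the number of darts.
# estimate_pi() is a hypothetical helper, not part of any package.
estimate_pi <- function(num_darts, half_side = 5) {
  x <- runif(num_darts, min = -half_side, max = half_side)
  y <- runif(num_darts, min = -half_side, max = half_side)
  inside <- x^2 + y^2 <= half_side^2  # compare squared distances, no sqrt needed
  4 * mean(inside)                    # the fraction inside approximates pi / 4
}

set.seed(42)  # arbitrary seed so the run is repeatable
dart_counts <- c(100, 10000, 1000000)
estimates <- sapply(dart_counts, estimate_pi)
# `pi` here is R's built-in constant, used only to measure the error
data.frame(darts = dart_counts, estimate = estimates, error = abs(estimates - pi))
```

The error shrinks slowly, roughly in proportion to 1 / sqrt(number of darts), so each extra correct digit costs about 100 times more darts.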
Probability Questions
If Bayes’ Theorem makes your head go bonkers, welcome to the club. I see a lot of probability questions on Twitter and they’re almost always solved using Bayes’ Theorem. Quite a few of them (at least the easiest ones) can be solved by just simulating the question in a program.
Like so:
num_iterations <- 1000000
# Heads is 1, tails is 0
toss1 <- sample(c(0, 1), num_iterations, replace = TRUE)
toss2 <- sample(c(0, 1), num_iterations, replace = TRUE)
toss <- tibble(toss1, toss2)
toss %>%
  mutate(total_toss = toss1 + toss2) %>%
  # A "win" here is both tosses landing heads, i.e. a total of 2
  summarise(total_tosses = n(),
            total_wins = sum(total_toss == 2),
            probability = total_wins / total_tosses)
# A tibble: 1 × 3
  total_tosses total_wins probability
         <int>      <int>       <dbl>
1      1000000     249350       0.249
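For a question this small you can sanity-check the simulated number by listing every outcome. A minimal base R sketch, assuming two fair coins so that all four outcomes are equally likely (the variable names are just for illustration):

```r
# Sketch: enumerate every equally likely outcome of two fair coin tosses.
# Heads is 1, tails is 0, matching the simulation above.
outcomes <- expand.grid(toss1 = c(0, 1), toss2 = c(0, 1))
outcomes$both_heads <- outcomes$toss1 + outcomes$toss2 == 2
exact_probability <- mean(outcomes$both_heads)
exact_probability  # 0.25
```

One outcome out of four is a win, which is what the simulated 0.249 is hovering around.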
Monty Hall Problem
Or for the Indian context, the Aman Verma paradox.
Now I obviously can’t explain it as well as Michael explains this (and everything else in the world tbh), but I can run trials, or… simulations.
If you don’t know the rules of the game (and didn’t watch Michael being awesome as usual), they’re as follows:
You’re on a game show and Monty Hall / Aman Verma asks you to choose one out of three doors. Behind one of the doors is a car or whatever, and behind the other two doors is shit or whatever. Once you’ve picked a door, Monty / Aman then opens one of the other doors, which has shit behind it (duh, why would he open the one with the car in it that’d be so dumb lol).
Now comes the real question: You’re offered a choice to switch from your currently picked door to the other closed door. Would you or wouldn’t you switch?
The answer feels paradoxical because statistically (as shown below via simulations) you should switch: switching wins about two times out of three, while sticking wins only about one time in three. But odds only play out over a loooong enough horizon and the one time you’re allowed to play the game may or may not be the winner. Anyhoo, here’s the simulation for the Monty Hall Problem:
num_iterations <- 1000000
doors <- c('A', 'B', 'C')
door_picked <- sample(doors, num_iterations, replace = TRUE)
car_behind <- sample(doors, num_iterations, replace = TRUE)
mhall <- tibble(picked = door_picked, winner = car_behind)
mhall %>%
  mutate(picked_is_winner = picked == winner,
         # If the pick is the winner, the host opens either of the other two doors at random.
         # Drawing num_iterations values gives each row its own draw; sample(..., 1) would
         # draw once and reuse that same door for every row.
         opened_door = case_when(picked_is_winner & picked == 'A' ~ sample(c('B', 'C'), num_iterations, replace = TRUE),
                                 picked_is_winner & picked == 'B' ~ sample(c('A', 'C'), num_iterations, replace = TRUE),
                                 picked_is_winner & picked == 'C' ~ sample(c('B', 'A'), num_iterations, replace = TRUE),
                                 # Otherwise the host must open the one door that is neither picked nor the winner
                                 picked %in% c('A', 'B') & winner %in% c('A', 'B') ~ 'C',
                                 picked %in% c('A', 'C') & winner %in% c('A', 'C') ~ 'B',
                                 picked %in% c('C', 'B') & winner %in% c('C', 'B') ~ 'A'),
         # Switching means taking the door that is neither picked nor opened
         switched_door = case_when(picked %in% c('A', 'B') & opened_door %in% c('A', 'B') ~ 'C',
                                   picked %in% c('A', 'C') & opened_door %in% c('A', 'C') ~ 'B',
                                   picked %in% c('C', 'B') & opened_door %in% c('C', 'B') ~ 'A'),
         switched_winner = switched_door == winner,
         original_winner = picked_is_winner) %>%
  summarise(prob_win_sticking_with_original = sum(original_winner) / num_iterations,
            prob_win_switching = sum(switched_winner) / num_iterations)
# A tibble: 1 × 2
  prob_win_sticking_with_original prob_win_switching
                            <dbl>              <dbl>
1                           0.333              0.667
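The same numbers fall out if you just list the cases instead of sampling them. A minimal base R sketch, assuming (like the simulation) that the pick and the car are each equally likely to be any of the three doors, and that the host always opens a losing door you didn’t pick. The column names are made up for illustration.

```r
# Sketch: enumerate all 9 equally likely (picked, winner) pairs.
doors <- c('A', 'B', 'C')
cases <- expand.grid(picked = doors, winner = doors, stringsAsFactors = FALSE)
# Sticking wins only when the first pick was already the car.
cases$stick_wins <- cases$picked == cases$winner
# The host always removes a losing door, so the other closed door hides the car
# whenever the first pick was wrong.
cases$switch_wins <- cases$picked != cases$winner
prob_stick <- mean(cases$stick_wins)
prob_switch <- mean(cases$switch_wins)
c(stick = prob_stick, switch = prob_switch)
```

Sticking wins in 3 of the 9 cases and switching wins in the other 6, which is the 1/3 versus 2/3 split the simulation lands on.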
Until next time.