Mambo_5_Bayesian
Can we use Bayesian statistics to estimate when Mambo No. 5 was released, based on the names of the women who appear in the song?
Bayesian inference
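We can answer the above question using Bayes' formula:

$$P(t \mid N) = \frac{P(N \mid t)\,P(t)}{P(N)}$$

where the posterior distribution $P(t \mid N)$ is the probability of release year $t$ given the set of names $N$ in the song, $P(N \mid t)$ is the likelihood of those names given the release year, $P(t)$ is the prior over release years, and $P(N)$ is the normalizing evidence.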
We can use the frequency of baby names for each year to calculate these probabilities. However, it is important to note that while Mambo No. 5 was released in 1999, it would be wrong to assume that we only need to look at popular baby names in 1999. Lou Bega wasn't meeting newborn infants but was most likely referring to 20–35-year-old women, so we should be looking at the popularity of names given to girls born roughly between 1964 and 1979. Let us describe the age of the women as $a$.
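The birth-year window follows from simple arithmetic on the release year and the assumed age range. A minimal sketch, where `birth_years` is a hypothetical helper name and integer ages 20 through 35 are assumed:

```python
RELEASE_YEAR = 1999      # release year stated in the text
AGES = range(20, 36)     # assumed integer ages, 20..35 inclusive

def birth_years(release_year, ages):
    """Return the sorted birth years implied by a release year and a set of ages."""
    return sorted(release_year - a for a in ages)

years = birth_years(RELEASE_YEAR, AGES)
print(years[0], years[-1])  # prints: 1964 1979
```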
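For simplicity, I will assume that all the women are the same age $a$, somewhere between 20 and 35 years old. The probability of a name $n_i$ given the release year $t$ and that age is then the frequency $f$ of the name among girls born in year $t - a$:

$$P(n_i \mid t, a) = f(n_i,\, t - a)$$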
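For all the names $N = \{n_1, \dots, n_k\}$, treating them as independent:

$$P(N \mid t, a) = \prod_{i=1}^{k} P(n_i \mid t, a)$$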
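As a rough illustration of the likelihood for all the names, the sketch below multiplies per-name frequencies for the implied birth year, working in log space to avoid underflow. The table `FREQ`, its numbers, the `FLOOR` value, and the function name are all made up for this example. Treating the names as independent is an assumption of the sketch.

```python
import math

# Hypothetical toy table: FREQ[birth_year][name] = fraction of girls given that name.
# The numbers are invented for illustration, not real baby-name statistics.
FREQ = {
    1970: {"Monica": 0.004, "Erica": 0.002, "Rita": 0.001},
    1990: {"Monica": 0.002, "Erica": 0.004, "Rita": 0.0002},
}
FLOOR = 1e-9  # assumed floor so a name missing from a year does not give log(0)

def log_likelihood(names, release_year, age, freq=FREQ):
    """log P(names | release_year, age), assuming the names are independent."""
    birth_year = release_year - age
    table = freq.get(birth_year, {})
    return sum(math.log(table.get(name, FLOOR)) for name in names)

names = ["Monica", "Rita"]
ll_1995 = log_likelihood(names, 1995, 25)  # women born in 1970
ll_2015 = log_likelihood(names, 2015, 25)  # women born in 1990
```

With this toy table, `ll_1995` is larger than `ll_2015`, because both names are more frequent in the 1970 row than in the 1990 row.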
Assuming a uniform probability distribution for the ages, we are saying that a woman is as likely to be 20 as 30, or any other age $a$ in the range $A = \{20, \dots, 35\}$:

$$P(a) = \frac{1}{|A|}$$
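And we can marginalize over the ages using the law of total probability, which gives the likelihood of the names $N$ for a release year $t$:

$$P(N \mid t) = \sum_{a} P(N \mid t, a)\,P(a)$$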
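The marginalization over ages can be sketched as a log-sum-exp over the candidate ages with a uniform age prior. The function name and the `log_lik` callback signature are assumptions of this example, and the stand-in `toy_log_lik` returns invented values.

```python
import math

def marginal_log_likelihood(log_lik, names, release_year, ages):
    """log P(names | release_year) = log sum_a P(names | release_year, a) P(a),
    with a uniform P(a) over the given ages."""
    ages = list(ages)
    log_p_age = -math.log(len(ages))  # uniform prior over ages
    terms = [log_lik(names, release_year, a) + log_p_age for a in ages]
    m = max(terms)
    if m == float("-inf"):            # every age has zero likelihood
        return m
    # log-sum-exp: subtract the max before exponentiating for numerical stability
    return m + math.log(sum(math.exp(x - m) for x in terms))

# Hypothetical stand-in likelihood: 0.1 at age 20, 0.3 at every other age.
def toy_log_lik(names, release_year, age):
    return math.log(0.1 if age == 20 else 0.3)

result = marginal_log_likelihood(toy_log_lik, ["Monica"], 1999, [20, 21])
```

Here `result` equals `log(0.5 * 0.1 + 0.5 * 0.3)`, the average of the two per-age likelihoods.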
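and our posterior distribution over the release year $t$ is then:

$$P(t \mid N) = \frac{P(N \mid t)\,P(t)}{\sum_{t'} P(N \mid t')\,P(t')}$$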
where we assume that the prior
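To make the last step concrete, the sketch below normalizes per-year log-likelihoods into a posterior over candidate release years. It defaults to a uniform prior over the candidate years, which is an assumption of this example. The function name and the toy likelihood values are invented.

```python
import math

def posterior(log_likelihoods, log_prior=None):
    """Turn {year: log P(names | year)} into {year: P(year | names)}."""
    years = sorted(log_likelihoods)
    if log_prior is None:
        # Assumed uniform prior: a constant that cancels in the normalization.
        log_prior = {t: 0.0 for t in years}
    unnorm = {t: log_likelihoods[t] + log_prior[t] for t in years}
    m = max(unnorm.values())
    weights = {t: math.exp(v - m) for t, v in unnorm.items()}
    z = sum(weights.values())  # plays the role of the evidence P(names)
    return {t: w / z for t, w in weights.items()}

# Made-up likelihood values for three candidate years, for illustration only.
toy = {1989: math.log(1e-12), 1999: math.log(5e-12), 2009: math.log(4e-12)}
post = posterior(toy)
best_year = max(post, key=post.get)
```

Only the ratios between the likelihoods matter here, so the tiny absolute values are harmless once the maximum is subtracted in log space.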