Mambo_5_Bayesian

Can we use Bayesian statistics to estimate when Mambo Number 5 appeared based on the names of the women that appear in the song?

Bayesian inference

We can answer this question using Bayes' theorem:

$$\Pr(YR \mid N) = \frac{\Pr(N \mid YR)\,\Pr(YR)}{\Pr(N)},$$

where the posterior distribution $\Pr(YR \mid N)$ describes the probability of the release year ($YR$) given the nine names $N = \{\text{Angela}, \text{Pamela}, \text{Sandra}, \text{Rita}, \text{Monica}, \text{Erica}, \text{Tina}, \text{Mary}, \text{Jessica}\}$ that appear in Mambo No. 5. On the right-hand side, the likelihood function $\Pr(N \mid YR)$ describes the probability of observing the names conditioned on $YR$. Finally, $\Pr(YR)$ is our prior probability distribution, and $\Pr(N)$ is the marginal probability of the names, which does not depend on $YR$ and acts only as a normalizing constant.

We can use the frequency of baby names for each year to calculate these probabilities. However, it is important to note that while Mambo No. 5 was released in 1999, it would be wrong to assume that we only need to look at popular baby names in 1999. Lou Bega wasn't meeting newborn infants but was most likely referring to 20–35-year-old women, so we should be looking at the popularity of names roughly between 1960 and 1980. Let us describe the age of the women as $\alpha$, such that we can back-calculate the birth year ($YR - \alpha$). For example, if we assume that the release year is 1980 and the women are 20 years old, we would be looking at the popularity of girls' names in 1960.
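The back-calculation above can be sketched in a few lines of Python. The function name `candidate_birth_years` and its parameters are hypothetical, chosen only for illustration; the 20 to 35 age range is the assumption stated in the text.

```python
def candidate_birth_years(release_year, min_age=20, max_age=35):
    """Return the birth years implied by a release year and an inclusive age range."""
    # One birth year per assumed age: birth_year = release_year - age.
    return [release_year - age for age in range(min_age, max_age + 1)]


# For a 1999 release, 20-year-olds were born in 1979 and 35-year-olds in 1964.
years_1999 = candidate_birth_years(1999)
print(years_1999[0], years_1999[-1], len(years_1999))  # 1979 1964 16
```

Note that the inclusive range 20 to 35 gives 16 candidate ages, and therefore 16 birth years per release year.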

For simplicity, I will assume that all the women are the same age, somewhere between 20 and 35 years old. The probability of a name given the release year and a specific age is then the frequency of that name among girls born in the corresponding birth year:

$$\Pr(N = N_i \mid \alpha = \alpha_j, YR) = \Pr(N_i, YR - \alpha_j).$$

For all nine names, treating them as independent of one another:

$$\Pr(N \mid \alpha = \alpha_j, YR) = \prod_{i=1}^{9} \Pr(N_i, YR - \alpha_j).$$
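As a minimal sketch of this product, assume the name frequencies are available as a dictionary `name_freq` mapping `(name, birth_year)` to a relative frequency. That data layout, the function name `likelihood_given_age`, and the choice to treat a missing entry as frequency zero are all assumptions made for illustration. The constant 0.01 in the usage lines is a placeholder, not real baby-name data.

```python
import math

NAMES = ["Angela", "Pamela", "Sandra", "Rita", "Monica",
         "Erica", "Tina", "Mary", "Jessica"]


def likelihood_given_age(name_freq, release_year, age, names=NAMES):
    """Product over all names of each name's frequency in the implied birth year."""
    birth_year = release_year - age
    # Assumption: a (name, year) pair absent from the table has frequency 0.
    return math.prod(name_freq.get((name, birth_year), 0.0) for name in names)


# Placeholder table: every name has frequency 0.01 in 1960 (illustrative only).
placeholder = {(name, 1960): 0.01 for name in NAMES}
print(likelihood_given_age(placeholder, release_year=1980, age=20))
```

With real frequencies the product of nine small numbers is tiny but still well within floating-point range; summing log-frequencies is a common alternative if many more names were involved.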

Assuming a uniform probability distribution for the ages, we are saying that every age from 20 to 35 is equally likely. Counting both endpoints, that is 16 possible ages:

$$\Pr(\alpha = \alpha_j) = \frac{1}{35 - 20 + 1} = \frac{1}{16}.$$

And we can marginalize over the ages using the law of total probability:

$$\Pr(N \mid YR) = \frac{1}{16} \sum_{\alpha_j = \alpha_{\min}}^{\alpha_{\max}} \prod_{i=1}^{9} \Pr(N_i, YR - \alpha_j),$$
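The marginalization over ages can be sketched as an average of the per-age products. As before, the `name_freq` dictionary keyed by `(name, birth_year)`, the function name `marginal_likelihood`, and the zero default for missing entries are assumptions for illustration. The two-name table in the usage lines is a toy example, not real data.

```python
import math


def marginal_likelihood(name_freq, release_year, names, min_age=20, max_age=35):
    """Average the per-age name likelihood over a uniform, inclusive age range."""
    ages = range(min_age, max_age + 1)  # 16 ages for the default 20 to 35
    total = 0.0
    for age in ages:
        birth_year = release_year - age
        # Assumption: a (name, year) pair absent from the table has frequency 0.
        total += math.prod(name_freq.get((name, birth_year), 0.0) for name in names)
    return total / len(ages)


# Toy table: two made-up names, each with frequency 0.5 among 1970 births only.
toy = {("A", 1970): 0.5, ("B", 1970): 0.5}
print(marginal_likelihood(toy, release_year=1999, names=["A", "B"]))  # 0.015625
```

Only age 29 maps a 1999 release to 1970, so the sum has a single nonzero term, 0.25, which is then divided by the 16 ages.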

and our posterior distribution is then:

$$\Pr(YR \mid N) \propto \frac{1}{16} \Pr(YR) \sum_{\alpha_j = \alpha_{\min}}^{\alpha_{\max}} \prod_{i=1}^{9} \Pr(N_i, YR - \alpha_j),$$

where we assume that the prior $\Pr(YR)$ is a uniform distribution ranging from 1980 to 2005.
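Putting the pieces together, the posterior can be evaluated on a grid of candidate release years and normalized so that it sums to one. This is a sketch under stated assumptions rather than the implementation behind the analysis: the `name_freq` dictionary keyed by `(name, birth_year)`, the function name `posterior_over_years`, and the zero default for missing entries are hypothetical, and the usage lines run on a toy table rather than real baby-name data.

```python
import math


def posterior_over_years(name_freq, names, years=range(1980, 2006),
                         min_age=20, max_age=35):
    """Posterior over release years with uniform priors on year and on age."""
    ages = range(min_age, max_age + 1)
    prior = 1.0 / len(years)  # uniform prior over 1980..2005 by default
    unnormalized = {}
    for year in years:
        # Likelihood marginalized over ages (zero frequency if the entry is missing).
        likelihood = sum(
            math.prod(name_freq.get((name, year - age), 0.0) for name in names)
            for age in ages
        ) / len(ages)
        unnormalized[year] = prior * likelihood
    evidence = sum(unnormalized.values())  # plays the role of Pr(N)
    if evidence == 0.0:
        raise ValueError("All candidate years have zero likelihood.")
    return {year: value / evidence for year, value in unnormalized.items()}


# Toy table: two made-up names, each with frequency 0.5 among 1970 births only.
toy = {("A", 1970): 0.5, ("B", 1970): 0.5}
posterior = posterior_over_years(toy, names=["A", "B"])
best_year = max(posterior, key=posterior.get)
print(best_year, posterior[best_year])
```

Because both priors are uniform, they cancel in the normalization, and the posterior is simply the marginal likelihood rescaled to sum to one over the candidate years.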