In this work, we examine the socio-economic correlations present among users of a mobile phone network in Mexico. First, we find that the income distribution of a subset of users (those for whom a large bank in Mexico provided income information) closely, though not exactly, follows the income distribution of the Mexican population as a whole.
We also show that the mobile phone network exhibits strong socio-economic homophily: users who are linked in the network are more likely to have similar incomes. The main contribution of this work is a methodology, based on Bayesian statistics, that leverages this homophily to infer the socio-economic status of a large subset of users in the network for whom we have no banking information. Our proposed algorithm achieves an accuracy of 0.71 in a two-class classification problem (low and high income), significantly outperforming a simpler method based on a frequentist approach. Finally, we extend the two-class classification problem to multiple classes by using the Dirichlet distribution.
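
To illustrate the kind of inference summarized above, the following is a minimal sketch and not the algorithm of this work. It assumes that each user without banking information is classified from the income classes of their labeled contacts alone, with a symmetric Dirichlet prior over classes (which reduces to a Beta prior in the two-class case). The function names, the `prior` parameter, and the class labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def posterior_class_probs(neighbor_labels, classes, prior=1.0):
    """Posterior mean of a Dirichlet over income classes.

    Assumption (illustrative only): the income classes of a user's labeled
    contacts are treated as multinomial observations, and `prior` is the
    symmetric Dirichlet pseudo-count added to every class.
    """
    counts = Counter(neighbor_labels)
    # Posterior parameters: prior pseudo-count plus observed neighbor counts.
    alphas = {c: prior + counts.get(c, 0) for c in classes}
    total = sum(alphas.values())
    # The posterior mean of each class is its parameter divided by the sum.
    return {c: a / total for c, a in alphas.items()}


def classify(neighbor_labels, classes, prior=1.0):
    """Assign the class with the highest posterior mean probability."""
    probs = posterior_class_probs(neighbor_labels, classes, prior)
    # Ties are broken in favor of the class listed first in `classes`.
    return max(classes, key=lambda c: probs[c])


if __name__ == "__main__":
    two_class = ["low", "high"]
    print(posterior_class_probs(["high", "high", "low"], two_class))
    print(classify(["high", "high", "low"], two_class))
```

With two classes the same code corresponds to a Beta posterior, and passing a longer `classes` list gives the multi-class case. A frequentist counterpart under the same assumptions would use the raw neighbor proportions, that is, `prior=0`, which is undefined for users with no labeled contacts.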