Importance of the population size.
If you take 50 people at random, what is the chance that 11 of them together make an excellent soccer team? Nil, nada, zip, none whatsoever.
Let’s enlarge the sample to, say, 5000. Perhaps you will find 11 of them who can decently handle a soccer ball with their feet. But a world top team? No way.
So how many people do you need before you get a team that has more than a glimmer of a chance of winning the world soccer championship? Right! You need the population of one of the bigger countries.
What? You mean that the bigger the country, the better its chance of winning the world soccer championship?
Or, put another way: is the soccer quality of a country determined by its number of inhabitants?
Let us do the test.
We need two things:
First: the soccer quality of each country. One of the best available proxies is the number of FIFA points, which is freely available on the FIFA website.
Second: the number of inhabitants. Various sources on the internet provide that.
So let’s look at the chart for the 100 highest-ranked countries on the FIFA ranking list (I left out China, because it is definitely an outlier!):
I used a logarithmic transformation of both the FIFA points and the number of inhabitants to get a more even distribution.
It is clear that there is a positive correlation, although the scatter around the regression line is considerable. An R-squared of 0.22 means that only 22% of the variance of the (log-transformed) FIFA points is explained by the number of inhabitants.
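For readers who want to redo this kind of chart themselves, here is a minimal sketch of a log-log regression and its R-squared in plain Python. The function name `log_log_r_squared` and the numbers are my own made-up illustration, not the FIFA or population figures behind the chart, and I am assuming a base-10 logarithm (the base does not change R-squared).

```python
import math

def log_log_r_squared(x_values, y_values):
    """Fit log10(y) = intercept + slope * log10(x) by ordinary least squares.

    Returns (intercept, slope, r_squared).
    """
    lx = [math.log10(v) for v in x_values]
    ly = [math.log10(v) for v in y_values]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in lx)
    syy = sum((y - mean_y) ** 2 for y in ly)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With a single predictor, R-squared is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared

# Made-up illustration values, NOT real FIFA or population data.
population = [1e6, 5e6, 2e7, 6e7, 2e8]
points = [300, 450, 400, 900, 700]

intercept, slope, r2 = log_log_r_squared(population, points)
print(f"slope={slope:.3f}  R^2={r2:.3f}  unexplained={1 - r2:.3f}")
```

The "unexplained" share is simply 1 minus R-squared, which is how an R-squared of 0.22 translates into 78% of the variance left over.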
So: is there something else? Perhaps the number of soccer players? Because if nobody plays soccer, it does not help to have a billion inhabitants (a bit like China?). We find these numbers on the FIFA website.
The following chart does indeed show a higher R-squared:
Problem: still only 32% of the variance is explained. Where is the remaining 68%?
We could suspect that unorganized soccer players can never reach the quality of organized ones, the ones that play in clubs, in regular competitions. So let’s try another number we find on the FIFA website: registered players.
And yes, our R-squared goes in the right direction: 0.37
We could proceed with variables like gross national product, average temperature, average adult male size, or whatever metric we can find about the countries.
And for a far better model we would definitely need a more advanced analysis, like a multiple regression, but for this blog post I think we have gone far enough.
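Just to give an idea of what such a multiple regression could look like, here is a rough sketch in plain Python that fits one response on several predictors at once by solving the normal equations. The helper names (`solve_linear_system`, `multiple_regression`) and all numbers are hypothetical; I did not run this on the real data, and a statistics package would be the more sensible tool for a serious analysis.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def multiple_regression(predictors, response):
    """Least-squares fit of response on several predictors plus an intercept.

    predictors: one row per country, one column per variable.
    Returns (coefficients, r_squared), with the intercept first.
    """
    rows = [[1.0] + list(row) for row in predictors]
    k = len(rows[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(response) / len(response)
    ss_res = sum((y - f) ** 2 for y, f in zip(response, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in response)
    return coef, 1.0 - ss_res / ss_tot

# Made-up log10 values for illustration only, NOT real data.
# Columns: log10(inhabitants), log10(registered players)
X = [[6.0, 4.0], [6.7, 4.9], [7.3, 5.1], [7.8, 6.0], [8.3, 5.8], [7.0, 5.5]]
y = [2.5, 2.7, 2.6, 3.0, 2.8, 2.9]  # log10(FIFA points)

coef, r2_multi = multiple_regression(X, y)
print("coefficients (intercept first):", [round(c, 3) for c in coef])
print("R^2:", round(r2_multi, 3))
```

Keep in mind that adding predictors never lowers R-squared on the same data, so a higher number by itself does not prove the extra variable matters.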
The conclusion is obvious: larger countries are definitely at an advantage, but the popularity of soccer and how well it is organized can seriously raise or lower the expected soccer quality of a country.
The top five countries in the FIFA ranking (Brazil, Spain, Portugal, The Netherlands and Italy) are clearly way above the regression line in the third chart, which means that they make good use of whatever lies behind the remaining 63% of unexplained variance in FIFA points.
In South Africa (the one isolated point in the lower right of the chart), on the other hand, there is a big problem: although they have a relatively large number of registered players (they are in the top 10), they find themselves only in 83rd place in the FIFA ranking.
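"Above" and "below" the regression line can be made precise with residuals: the observed value minus the value the line predicts. Here is a small sketch, again in plain Python, with hypothetical country labels A to E and made-up numbers rather than the real FIFA data. A positive residual means a country does better than its number of registered players would suggest, a negative one means worse.

```python
import math

def log_log_residuals(names, x_values, y_values):
    """Return {name: residual} for a least-squares fit of log10(y) on log10(x)."""
    lx = [math.log10(v) for v in x_values]
    ly = [math.log10(v) for v in y_values]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed log-points minus log-points predicted by the line.
    return {
        name: y - (intercept + slope * x)
        for name, x, y in zip(names, lx, ly)
    }

# Hypothetical labels and made-up numbers, NOT real countries or FIFA data.
names = ["A", "B", "C", "D", "E"]
registered_players = [1e4, 5e4, 2e5, 1e6, 1.5e6]
fifa_points = [300, 500, 450, 1200, 350]

residuals = log_log_residuals(names, registered_players, fifa_points)
for name, res in sorted(residuals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {res:+.3f}")
```

In this toy example, E plays the role of the isolated point in the lower right: many registered players, few points, and therefore the most negative residual.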
And there are 2 more things:
- The FIFA ranking does not say anything about the average soccer quality of a country. It is all about the top players, who sit at the very extremes of the quality distribution and are hence much more erratic than an average, and much more difficult to “catch” in a model. Pure chance probably plays an important role.
- Not only is chance in having the best soccer players important, but so is chance during the matches themselves: the result of a soccer game is almost unpredictable, except when one of the top countries plays against one of the weaker ones. Otherwise how could it be that 8 of the 32 countries playing in the world championship finals are ranked below 32nd place in the FIFA ranking, when we would expect only the best 32 countries?