Waiting for Kickoff: A Historical Look at European Soccer

By Carlos Pena-Lobel

Inspired by the return of European soccer, I wanted to fulfill a promise I made almost a year ago in my post, Why Leicester City Shouldn’t Be atop the Table: A Historical Look at Parity in the EPL.  I suggest you give it a glance, because I am going to use many of the same graphs and visualizations and I spent more words explaining them there than I will here.  Back then I was looking at why what Leicester City was accomplishing (and did accomplish) was unprecedented in English top flight soccer.  Basically I found that the top teams were doing better than they had in the past, and the worst teams were doing even worse.  The disparity was growing.  I also found that mobility was down, meaning the teams at the bottom stayed at the bottom and the teams at the top stayed at the top.

Here I look to expand that to the other 5 most successful European leagues in terms of Champions League performance (La Liga, Bundesliga, Primeira Liga, Ligue 1, and Serie A) to see if this is an isolated event or a growing trend.  After this I test to see if (spoiler alert) this growing trend affects CL performance.  Finally I look for comparisons to Chelsea’s fall or Leicester’s rise to see how truly special these events are and what other similar events can tell us.

Parity within each League

First I look at the distribution of points in each of the 6 leagues over time.  This ignores which team finishes 1st from year to year and only looks at the separation between table places.  So it is theoretically possible for a league to be labeled as having increased inequality even though the last place team and the first place team switch every year, as long as the gap between them widens.  While this clearly isn’t the case, it’s good to think about a metric’s strengths and weaknesses.

As you can see below, in every league except Italy’s there is a statistically significant increase in Gini.  Thus there is rather strong evidence to suggest that, as a whole, European soccer is becoming more unequal: the winners are winning by ever larger margins.  This might be explained by money and TV contracts, or it could be explained by a change of tactics.  For example, a shift towards a more aggressive style could account for some of this, because there is a significant decrease in the number of ties in 3 of the 6 leagues.  Yet ties can also simply signal an even matchup between 2 teams, so a decrease in ties fits the theory of growing inequality either way.

[Figure: Gini.png]

Beyond looking at p-values, it is also important to look at the magnitude of the increase.  It is fully possible to have a statistically significant increase that in the grand scheme of things is negligible.  In this case, each year correlates with an increase of 0.5%-1.0% in standardized gini.  This means that at the current rate, the scale will be broken in 66 years.  Since our y axis is percentage of maximum gini, it is impossible to go over 100%, so mathematically this will never happen: the growth rate will be forced to slow drastically, which only makes the linear growth we are seeing now even more astounding.

Also, a quick note about standardized gini: it is the league’s gini coefficient divided by the largest possible gini coefficient for a league of that size.  As I covered in my previous post, that maximum is limited because every team is required to play every other team.  In this post the metric has also been updated to account for league size.  A quick graph (which I am not going to explain, read the previous post) is below:
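
Setting the graph aside, the standardized gini itself can be sketched in a few lines of Python.  This is only my rough reconstruction, not the notebook's code.  In particular, I am assuming that the "largest possible gini" comes from a strict hierarchy in which every team beats every team below it home and away, with 3 points per win.  The function names and the sample table are made up for illustration.

```python
def gini(points):
    """Gini coefficient of a list of non-negative point totals."""
    xs = sorted(points)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def most_unequal_table(n_teams, points_per_win=3):
    """ASSUMPTION: the most unequal double round-robin is a strict
    hierarchy, so the team ranked k (0 = best) wins 2 * (n - 1 - k) games."""
    return [points_per_win * 2 * (n_teams - 1 - k) for k in range(n_teams)]


def standardized_gini(points):
    """League gini as a fraction of the assumed maximum for that size."""
    return gini(points) / gini(most_unequal_table(len(points)))


# Made-up 6-team table, purely for illustration.
made_up_table = [30, 22, 18, 14, 9, 5]
print(round(standardized_gini(made_up_table), 3))
```

Dividing by the hierarchy's gini is what lets an 18-team league and a 20-team league share one y axis, since the raw maximum differs with league size.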


Does Intra-League Parity Affect CL Performance?

So how does this tie into the ultimate goal of winning the Champions League?  To test this I used a metric called Championship points, an idea I believe I borrowed from a 538 post long ago.  Succinctly, Championship points give you 1 point for winning, ½ for reaching the final, ¼ for reaching the semis, etc.  The results since the inception of the CL are below:

Basically there are 5 classes of countries:

The Dominant – Spain

The Elite – England

The Competitive – Germany, Italy

The Occasional Surprises – France, Portugal

The Thanks for Showing Up – Everyone else.
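
For anyone who wants to tally Championship points themselves, here is a minimal sketch of the halving rule described above.  The round names, the function names, and the choice to keep halving down to the round of 16 are my assumptions about what "etc." covers.  The sample rows are illustrative input, not the data behind this post.

```python
from collections import defaultdict

# Rounds ordered from best finish to worst; each step back halves the credit.
# ASSUMPTION: the halving continues through the round of 16.
ROUNDS = ["winner", "final", "semi", "quarter", "round_of_16"]


def championship_points(best_round):
    """Credit for one team, based on the furthest round it reached."""
    return 0.5 ** ROUNDS.index(best_round)


def points_by_country(results):
    """Sum the credit per country from (country, best_round) pairs."""
    totals = defaultdict(float)
    for country, best_round in results:
        totals[country] += championship_points(best_round)
    return dict(totals)


# Illustrative last four of a single season.
sample = [
    ("Spain", "winner"),
    ("Spain", "final"),
    ("Germany", "semi"),
    ("England", "semi"),
]
print(points_by_country(sample))
```

Each team is credited once, for the furthest round it reached, so a champion counts as 1 rather than 1 plus the credit for every earlier round.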

Graphically these different levels of dominance are reinforced:

First I want to compliment Spain’s recent dominance.  Spain’s last “bad” year was 2009, in which they “only” had 1 semi-finalist.  In fact this past year, with the exception of Valencia losing in the CL group stage and Liverpool beating Villarreal in the Europa semis, Spanish teams didn’t lose to anybody other than another Spanish team.  In the Europa League, Bilbao eliminated Valencia and Sevilla eliminated Bilbao.  In the CL, Atletico Madrid beat Barca and later fell to Real.

Even after all of that dominance, Spain still hasn’t reached the height of the 07-08 English dominance, although this metric doesn’t include Europa performance (which it arguably should).

While it would be interesting, and potentially rule-changing, if there were a decent correlation between domestic parity/success and European success, I found none.  And trust me, I even tried to p-hack as detailed in 538 (for fun of course, HSAC has the highest journalistic standards) and couldn’t get below a p-value of .12.  Thus I can conclusively say that I can’t conclusively say anything.
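
If you want to run this kind of check yourself, one simple option is a permutation test on the correlation.  To be clear, this is not necessarily the test I ran for this post.  It is a stand-in sketch using only the standard library, and the two lists at the bottom are made-up placeholder numbers rather than real gini or Championship point values.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists (neither constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided p-value: how often does shuffling ys give a correlation
    at least as strong as the observed one?"""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Placeholder numbers only: one gini value and one CL tally per season.
made_up_gini = [0.41, 0.44, 0.43, 0.47, 0.46, 0.50, 0.49, 0.52]
made_up_cl_points = [0.50, 0.25, 1.00, 0.13, 0.75, 0.25, 1.50, 0.38]
print(permutation_p_value(made_up_gini, made_up_cl_points))
```

A value like .12 from a test of this kind means shuffled data matches or beats the observed correlation about 12% of the time, which is not rare enough to claim a real relationship.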

Looking at the Makeup of a League

Since there was nothing to be found linking pure gini to CL performance, it is worth digging into what causes these gini changes.  Gini doesn’t tell us where the equalities and inequalities lie, so for that I turn to a visualization comparing how teams throughout the league table are doing.  Shown below is the aggregated data for the last 10 years.

[Figure: Table_Position.png]

Note: all y values are as a percentage of possible points to account for league size.  Similarly all x values are weighted by league size so that we can compare each league as if there were 20 teams in it.
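
The two adjustments in that note can be sketched as follows.  Both formulas are my assumptions about one reasonable way to do it: 3 points per win over a double round-robin for the y values, and a straight-line stretch that sends 1st to 1 and last place to 20 for the x values.  The function names are made up.

```python
def share_of_possible_points(points, n_teams, points_per_win=3):
    """Points as a fraction of the maximum a team could earn.
    ASSUMPTION: double round-robin, so each team plays 2 * (n - 1) games."""
    games = 2 * (n_teams - 1)
    return points / (points_per_win * games)


def position_on_20_team_scale(position, n_teams):
    """ASSUMPTION: stretch positions linearly so that 1st stays at 1
    and last place lands on 20, whatever the league size."""
    return 1 + (position - 1) * 19 / (n_teams - 1)


# Last place in an 18-team league lines up with 20th in a 20-team league.
print(position_on_20_team_scale(18, 18))
```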

The first thing that jumps off the page to me is the dominance of Spain’s top 2 teams, which have been head and shoulders above the competition.  While this has primarily been Barcelona and Real Madrid, Atletico Madrid also makes this list.  Whether you buy the Arsenal fan’s argument that this is because of the lack of depth or the Madrid fan’s argument that they are simply the best teams in the world is up to you.

The next thing (which I highlighted in the top right) is the depth of decent teams in the EPL.  This is what the pundits always love to suggest: that the EPL always has a number of teams competing for the title or doing well.  This primarily includes the Tottenhams, the Evertons, the Liverpools, etc.

Finally, what personally surprised me the most was the French curve.  I always think about PSG’s recent dominance, most recently winning the league by 30+ points.  Thus I was shocked to see that Ligue 1’s best teams are comparatively a liability, and in fact it is the teams just above the relegation zone that provide the league’s comparative advantage.

Leicester’s and Chelsea’s Historical Comparisons

Finally let’s end with something else.  How crazy was Leicester City’s winning the EPL, and is there any precedent for that?  And conversely, is there any precedent for Chelsea’s utter collapse?  Once again following work I did in my previous post, I graphed each team’s position in the previous season (x axis) against its position in the current season (y axis), with lower being better on both axes.  If a team was relegated or promoted, I assigned it a position 1 place below the bottom of the top flight league.  Thus there are many observations with an x-value of 19 or 21, which signify a team that was promoted to a top flight division with 18 or 20 teams respectively.  Similarly there are very few x-values at 18th or 20th because these teams would usually be relegated to the lower league.
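
As a minimal sketch of how those (previous, current) pairs can be built from two consecutive final tables: the function name and the 4-team tables below are hypothetical.  I am also assuming that relegated teams get the same "1 place below the table" treatment on the y axis that promoted teams get on the x axis.

```python
def season_transitions(prev_table, cur_table):
    """Map each team to (previous position, current position).
    Tables are lists of team names ordered from 1st place to last."""
    prev_pos = {team: i for i, team in enumerate(prev_table, start=1)}
    cur_pos = {team: i for i, team in enumerate(cur_table, start=1)}
    # Teams missing from a table sit one place below its last spot,
    # e.g. 21 for a promoted team entering a 20-team league.
    outside_prev = len(prev_table) + 1
    outside_cur = len(cur_table) + 1
    return {
        team: (prev_pos.get(team, outside_prev), cur_pos.get(team, outside_cur))
        for team in set(prev_pos) | set(cur_pos)
    }


# Hypothetical 4-team league: D is relegated, E is promoted and wins it all.
previous = ["A", "B", "C", "D"]
current = ["E", "A", "C", "B"]
print(sorted(season_transitions(previous, current).items()))
```

In this toy league E shows up at x = 5, which is the 4-team equivalent of the 19s and 21s described above.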

The highest masses appear near the bottom left and top right signifying that on a time scale of 1 year the best teams stay good and the worst teams stay bad.  Below is the graph for all European teams in the 6 major leagues with the exception of the year of the Italian Cheating Scandal.  If you want to see league by league breakdowns check the github with the notebook I was working with (bottom of the page).

This allows us to view Leicester’s accomplishments in their historical context.  While Leicester’s title is amazing in its own right, I would be remiss if I didn’t point out that in Germany FC Kaiserslautern went from the 2nd tier in 1996 to winning the Bundesliga in 1997.  This is the record for the lowest placed team to win a major league trophy the following season.  However, the year before they were relegated, FC Kaiserslautern was the 4th place team.  In some sense they were already a top flight team who followed up a bad season with a good one, one year removed.

After this record come 3 teams that made the jump from 14th to 1st: Montpellier, Atlético Madrid and Leicester City.  Could Montpellier, whose title came only 5 years ago, offer an example of what we might expect for Leicester City?  After winning the club’s first title in history, the team dropped to 9th and then 15th.  Similarly Atlético Madrid fell to 5th, 7th, and 13th before being relegated in 19th place.  So while our fingers are crossed for some more magic, usually there is a reversion to the mean.

Yet there is still hope, as AS Monaco’s turnaround from relegated to 2nd was orchestrated by a familiar name, Claudio Ranieri.  And Monaco has seemed to enjoy success since his turnaround, placing 3rd the next 2 seasons (bringing us to the present day).  A fuller list with more of these huge turnarounds is below (note that Ranieri is the only coach to appear on here twice):

On the flip side, let’s investigate the teams that had a dramatic fall from grace like Chelsea.  Our first comparison is Milan, who had a similar tumble after winning Serie A.  They fell to 11th and 10th before winning the league again, just 3 years after their previous league title.  The next comparison is the Montpellier team we looked at above, who went from 14th to 1st to 9th.  This dropoff was likely due to the fact that all along they were only a middle table side who had 1 fantastic year.

Similarly VfL Wolfsburg isn’t an apt comparison either, because their results went 15, 15, 5, 1, 8, 15, 8, 11.  This wasn’t some perennial powerhouse that had a sudden slip; it was more of a 1 or 2 year wonder who likely overperformed.

Yet there may be an apt comparison in the form of Schalke 04, who were a top 4 team before the fall and promptly returned to the top 4 just a year after falling to 14th.  A more complete list is below:

However, no matter how many stats you look at or performances you analyze, the matches still occur on the pitch, which is what makes the game so exciting.  I eagerly await what is hopefully another great year of soccer, and getting to enjoy maybe the best player of all time.  Still, if stats can tell you anything: if you ever see 5000-1 odds again, look up, because it’s happened at least 4ish times in fewer than 20,000 tries.  Or put simply, take the bet.

Scripts (and data) for this post can be found here.


