A Camp Nou presentation: Automating insight from data visualisations
Football data visualisations are so widely used nowadays that discussing their importance seems naive. When executed well, they are able to convey an incredible amount of information very quickly.
Movements, passing directions, shots, defensive actions and virtually every other type of event that can be recorded in a football match can be represented on a graph, from which experienced analysts and casual fans alike can easily extract insight.
One of the simplest methods to extract insight from a certain type of visualisation is to execute it over two different datasets (think of events by two different players, or by a team when playing in two different formations) and compare the outputs.
Experienced analysts or content producers can use these comparisons to support a point they are trying to make and will have a pre-existing idea about the two visualisations they put side by side.
Our aim at Twenty3, however, was to simplify this action by automating the process used to discover pairs of significantly different visualisations.
Having worked on Story Mining, the beta version of Insight, we knew that ‘all’ we needed to do was to come up with a similarity metric that quantifies how different two instances of the same visualisation are.
Finding the right metrics to use with heatmaps and passing sonars was the content of the poster we presented at the 2019 edition of the Barça Sports Analytics Summit at Camp Nou in November. We used those visualisations to find pairs of players who were quite similar to each other in terms of their movements and passing intentions.
N.B.: What follows are intuitive explanations of how those metrics work. Please don’t hesitate to reach out with questions if you’re interested in the technical details (you can find me here), or read the full paper for a more detailed explanation.
As a mathematical function, a heatmap applied to a football pitch represents how likely it is to find a player touching the ball in each point of the pitch: the function will take greater values in the places where they participate the most in the game (where the heatmap is brighter), and smaller values where they spend the least amount of time (where the heatmap is dimmer).
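To make this concrete, here is a minimal Python sketch of a heatmap as a function on a grid: it bins touch coordinates into cells and scales the counts so that they sum to 1. The function names, the 105 x 68 metre pitch, the grid size and the touches themselves are illustrative assumptions, not Twenty3's implementation.

```python
def touch_counts(touches, rows, cols, pitch_length=105.0, pitch_width=68.0):
    """Count touches per grid cell; touches are (x, y) pairs in metres.

    The pitch dimensions are an assumption made for this illustration.
    """
    grid = [[0] * cols for _ in range(rows)]
    for x, y in touches:
        # Clamp so that touches exactly on a boundary line land in an edge cell.
        col = min(cols - 1, max(0, int(x / pitch_length * cols)))
        row = min(rows - 1, max(0, int(y / pitch_width * rows)))
        grid[row][col] += 1
    return grid


def to_density(counts):
    """Scale a grid of counts so that its values sum to 1."""
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("no touches to normalise")
    return [[value / total for value in row] for row in counts]


# Made-up touches: one deep on one flank, three high up on the other.
touches = [(10.0, 10.0), (100.0, 60.0), (100.0, 60.0), (90.0, 50.0)]
counts = touch_counts(touches, rows=2, cols=2)
density = to_density(counts)  # brighter cells hold larger values
```

A real heatmap would use a much finer grid and usually some smoothing, but the idea is the same: larger values where the player is more involved.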
To compare two such functions, we use the earth mover’s distance, which quantifies the effort it takes to reshape and move one of the heatmaps so that it becomes the other one. The earth mover’s distance between two heatmaps will be large if they are dissimilar in shape and/or their respective brightest areas are far apart.
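For readers who want to experiment, the sketch below computes an earth mover's distance between two small grids of touch counts using only the Python standard library. It treats the problem as a minimum-cost flow and solves it with successive shortest paths. Several things here are assumptions made for illustration: the function name, the use of integer counts as input, and the ground distance (Euclidean distance between cells, measured in cell widths). It is far too slow for fine grids, where a dedicated optimal-transport solver would be the sensible choice.

```python
import math


def earth_movers_distance(heat_a, heat_b):
    """EMD between two equally sized grids of non-negative integer counts.

    Both grids are treated as if normalised to total mass 1.
    """
    rows, cols = len(heat_a), len(heat_a[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    total_a, total_b = sum(map(sum, heat_a)), sum(map(sum, heat_b))
    if total_a == 0 or total_b == 0:
        raise ValueError("both heatmaps need at least one touch")
    # Cross-multiplying gives both grids the same integer mass, so the flow
    # stays in exact integers; we divide by that mass at the end.
    mass = total_a * total_b
    n = len(cells)
    source, sink = 2 * n, 2 * n + 1
    graph = [[] for _ in range(2 * n + 2)]

    def add_edge(u, v, capacity, cost):
        # Edge layout: [target, residual capacity, cost, index of reverse edge].
        graph[u].append([v, capacity, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for i, (ri, ci) in enumerate(cells):
        add_edge(source, i, heat_a[ri][ci] * total_b, 0.0)
        add_edge(n + i, sink, heat_b[ri][ci] * total_a, 0.0)
        for j, (rj, cj) in enumerate(cells):
            add_edge(i, n + j, mass, math.hypot(ri - rj, ci - cj))

    total_cost = 0.0
    while True:
        # Bellman-Ford: cheapest path from source to sink in the residual graph.
        dist = [math.inf] * len(graph)
        parent = [None] * len(graph)
        dist[source] = 0.0
        for _ in range(len(graph) - 1):
            changed = False
            for u in range(len(graph)):
                if dist[u] == math.inf:
                    continue
                for k, (v, capacity, cost, _rev) in enumerate(graph[u]):
                    if capacity > 0 and dist[u] + cost < dist[v] - 1e-12:
                        dist[v], parent[v], changed = dist[u] + cost, (u, k), True
            if not changed:
                break
        if parent[sink] is None:
            break  # all mass has been moved
        # Find the bottleneck along the path, then push that much flow.
        flow, v = mass, sink
        while v != source:
            u, k = parent[v]
            flow, v = min(flow, graph[u][k][1]), u
        v = sink
        while v != source:
            u, k = parent[v]
            graph[u][k][1] -= flow
            graph[v][graph[u][k][3]][1] += flow
            total_cost += flow * graph[u][k][2]
            v = u
    return total_cost / mass


# Made-up counts: the same shape at opposite ends of a 1 x 4 strip.
left = [[4, 0, 0, 0]]
right = [[0, 0, 0, 2]]
far_apart = earth_movers_distance(left, right)  # 3.0: all mass travels 3 cells
identical = earth_movers_distance(left, left)   # 0.0: nothing needs to move
```

The two example grids have different numbers of touches but the same shape, so the distance depends only on how far the mass has to travel.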
To illustrate how this metric works, we can compare Roberto Firmino’s heatmap over the 2019/20 Premier League season with the rest of the top ten strikers by minutes played, and we find that Neal Maupay’s movements appear most similar to those of Firmino, with Anthony Martial proving the most different to them both.
Likewise, a passing sonar (or, more precisely, the outer boundary of a passing sonar) is a 16-dimensional vector, with each component representing the percentage of all attempted passes by a player that go in a specific direction. To compare two passing sonars, we measure how different those two vectors are, component by component, penalising big differences in components that are further apart.
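The exact formula is in the full paper and is not reproduced here. As a hedged illustration of the idea, the sketch below builds a 16-bin sonar from pass angles and compares two sonars with a circular earth mover's distance, a standard metric in which moving passes between directions that are further apart costs more. Choosing this metric is an assumption for illustration and may differ from the one in the paper; the function names and example angles are also made up.

```python
import math
from statistics import median

BINS = 16  # number of directions in the sonar, as in the text


def passing_sonar(pass_angles, bins=BINS):
    """Share of passes per direction bin; angles are in radians."""
    if not pass_angles:
        raise ValueError("no passes to build a sonar from")
    counts = [0] * bins
    for angle in pass_angles:
        # The final % bins guards against rounding pushing an angle past the last bin.
        index = int((angle % (2 * math.pi)) / (2 * math.pi) * bins) % bins
        counts[index] += 1
    return [count / len(pass_angles) for count in counts]


def circular_emd(sonar_a, sonar_b):
    """Earth mover's distance on a circle, with adjacent bins one unit apart."""
    running, cumulative = 0.0, []
    for a, b in zip(sonar_a, sonar_b):
        running += a - b
        cumulative.append(running)
    # Subtracting the median picks the cheapest place to "cut" the circle open.
    offset = median(cumulative)
    return sum(abs(value - offset) for value in cumulative)


# Made-up players who each pass in a single direction.
forward = passing_sonar([0.1] * 4)                 # every pass falls in bin 0
sideways = passing_sonar([math.pi / 2 + 0.1] * 4)  # every pass falls in bin 4
backward = passing_sonar([math.pi + 0.1] * 4)      # every pass falls in bin 8

near = circular_emd(forward, sideways)  # 4.0: a quarter turn apart
far = circular_emd(forward, backward)   # 8.0: opposite directions
```

Because the bins wrap around, bin 15 sits next to bin 0, so two players whose passes fall just either side of the same direction are treated as close rather than far apart.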
As an example, we look at Trent Alexander-Arnold, whose passing is quite unique for a right-back given his verticality, and we compare the Liverpool player to the other top ten right-backs (by minutes played) this season. We see that Ricardo Pereira’s passing sonar is the most similar to Alexander-Arnold’s, with Kyle Walker’s the most different to the pair.
Since then, we have designed similar metrics to compare average position graphs, passing networks and event by zone maps and integrated all of these into our Content Toolbox as part of our Insight product.
Applying the same principle we have used here to recognise when the heatmaps or passing sonars of two players are very similar or very different, we can detect when any of these visualisations for a given player or team changes significantly over time, or depending on the formation, on whether they are playing home or away, or on whether it is the first or the second half.
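In code, this kind of detection can be sketched as a loop over contexts. The example below is a minimal illustration, assuming one visualisation per context, any distance function of the kind described above, and a hand-picked threshold for what counts as "significant". The function names, the threshold, the simple L1 stand-in distance and the numbers are all hypothetical.

```python
from itertools import combinations


def flag_changes(by_context, distance, threshold):
    """Return (distance, context_a, context_b) for pairs above the threshold.

    Results are sorted with the largest distance first.
    """
    flagged = []
    for (name_a, vis_a), (name_b, vis_b) in combinations(by_context.items(), 2):
        d = distance(vis_a, vis_b)
        if d > threshold:
            flagged.append((d, name_a, name_b))
    return sorted(flagged, key=lambda item: item[0], reverse=True)


def l1_distance(u, v):
    """Stand-in metric for this example: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


# Made-up four-direction sonars for one team in three formations.
sonars = {
    "4-3-3": [0.5, 0.25, 0.25, 0.0],
    "4-2-3-1": [0.5, 0.25, 0.25, 0.0],
    "3-5-2": [0.0, 0.25, 0.25, 0.5],
}
flagged = flag_changes(sonars, l1_distance, threshold=0.5)
```

Here the two back-four formations produce identical sonars and are not flagged, while each of them is flagged against the 3-5-2.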
We expect to be doing the same for many other visualisations in our portfolio soon, so stay tuned!
If you’d like to learn more about our Insight tool, the Twenty3 Content Toolbox or our other products and services, don’t hesitate to get in touch. You can find Daniel on Twitter.