Jesse Davis and his research team are using artificial intelligence to reshape how professional soccer clubs make decisions. Their counterintuitive, data-backed tactical findings and open-source tools are changing how clubs evaluate rosters, judge the efficiency of strategies, and identify hidden patterns on the pitch. Their latest work, which analyzes well over a million on-ball events and models in-game choices at scale, underscores how machine learning can turn granular match data into clear competitive insights for top-flight teams.

AI Integration

The group’s approach combines large-scale data collection with modern machine-learning techniques to test specific, repeatable questions about what works during a match. To evaluate one increasingly visible tactic—intentionally sending the ball out of play near the opponent’s goal line to force a throw-in—they assembled a training set with more than 1.4 million passes and roughly 60,000 throw-ins, including sequences drawn from the 2022 World Cup. Using tree ensemble models, which blend multiple decision trees to capture complex interactions, the researchers simulated the game states that follow such a choice and quantified the downstream consequences.
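
To make the idea of a tree ensemble concrete, here is a minimal sketch in plain Python. It is not the researchers' model: it bags one-split trees ("stumps") on synthetic data, and the single feature (distance to the opponent's goal line) and the labeling rule are assumptions invented for illustration. Because each tree here has only one split, this toy cannot capture the feature interactions that deeper trees in a real ensemble can.

```python
import random
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], int]        # (features, label in {0, 1})
Stump = Tuple[int, float, float, float]  # (feature, threshold, p_left, p_right)

def fit_stump(rows: List[Row]) -> Stump:
    """Pick the one-feature split that minimizes squared error on `rows`."""
    best_err, best = float("inf"), None
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            thr = x[f]
            left = [y for xs, y in rows if xs[f] <= thr]
            right = [y for xs, y in rows if xs[f] > thr]
            if not right:
                continue  # this threshold leaves one side empty
            p_l, p_r = sum(left) / len(left), sum(right) / len(right)
            err = sum((y - p_l) ** 2 for y in left) + sum((y - p_r) ** 2 for y in right)
            if err < best_err:
                best_err, best = err, (f, thr, p_l, p_r)
    if best is None:  # no usable split: fall back to the base rate
        base = sum(y for _, y in rows) / len(rows)
        return (0, float("inf"), base, base)
    return best

def fit_ensemble(rows: List[Row], n_trees: int = 25, seed: int = 0) -> List[Stump]:
    """Bagging: fit each stump on a bootstrap resample of the rows."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]

def predict(ensemble: List[Stump], x: Sequence[float]) -> float:
    """Average the per-tree probabilities."""
    votes = [p_l if x[f] <= thr else p_r for f, thr, p_l, p_r in ensemble]
    return sum(votes) / len(votes)

# Synthetic data: one feature (distance in meters); label 1 means
# "a shot followed soon after". The rule below is made up for the demo.
rows = [([float(d)], 1 if d <= 30 else 0) for d in range(5, 105, 5)]
model = fit_ensemble(rows)
near, far = predict(model, [10.0]), predict(model, [90.0])
```

On this toy data the ensemble assigns a higher probability to the near-goal case than to the distant one. A production analysis would rely on a library implementation with deeper trees and many contextual features.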

The conclusion, presented in a 2024 paper titled “Boot it,” is simple to state and striking in its implications: when a team in the middle third plays the ball out of bounds on the opponent’s side, the sequence that follows can leave that team within 10 actions (passes, dribbles, and similar events) of a shot on goal. In a sport that can log 1,500 or more actions per match with relatively few scoring chances, shifting the distribution of those actions in your favor is a meaningful gain. The statistical case, as Davis explains, is not about surrendering possession for its own sake; it is about manufacturing a recoverable situation that tilts field position, and the decisions that follow from it, toward a better outcome.
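
The "within 10 actions of a shot" framing can be turned into a label on an event sequence. The sketch below is one plausible way to do that, not the paper's procedure; the `(team, action_type)` tuple format and the action names are hypothetical simplifications.

```python
from typing import List, Tuple

# Hypothetical minimal event: (team, action_type).
Action = Tuple[str, str]

def shot_within_horizon(actions: List[Action], horizon: int = 10) -> List[bool]:
    """For each action, does the acting team shoot within the next `horizon` actions?"""
    labels = []
    for i, (team, _) in enumerate(actions):
        window = actions[i + 1 : i + 1 + horizon]
        labels.append(any(t == team and kind == "shot" for t, kind in window))
    return labels

# Invented sequence: team A plays the ball out, wins it back, then shoots.
sequence = [
    ("A", "pass"),
    ("A", "out_of_play"),
    ("B", "throw_in"),
    ("A", "interception"),
    ("A", "pass"),
    ("A", "shot"),
]
labels = shot_within_horizon(sequence)
# → [True, True, False, True, True, False]
```

Labels like these give a model a concrete target: the deliberate out-of-play action by team A is marked positive because team A's shot arrives four actions later.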

By embedding these choices within a simulated environment, the lab shows how learning algorithms can weigh outcomes that human intuition might dismiss as counterproductive. The models allow analysts to examine what happens not just on the next touch but across linked sequences, tracing how a seemingly conservative move can, probabilistically, seed a more advantageous chain of play. That is the practical value of machine learning in this setting: it aggregates rare or overlooked events and evaluates them consistently, turning anecdote into tested probability.

Technology Use Case

The core of the lab’s work is translating raw match data into standardized, analyzable units. Building the “Boot it” dataset required stitching together event streams—passes, throw-ins, and other discrete actions—into a form suitable for training and validation. Tree ensemble models were then used to emulate how different choices propagate through a possession, generating a robust view of the expected pathways from one touch to the next. Because these models can represent nonlinear relationships and interactions among variables, they are well suited for the fluid, context-dependent nature of soccer, where space, pressure, and positioning change from second to second.

This technical framework also underpins the lab’s broader contributions to day-to-day club operations. Teams use the group’s methods to judge roster construction, analyze how efficiently particular strategies are executed, and surface tactical motifs that are otherwise hard to spot at full speed. The lab’s outputs are not limited to private briefings. Davis releases much of the work as open-source analytics resources, giving practitioners shared building blocks and promoting common standards—an especially important step as clubs collect ever more detailed in-game feeds.

Standardization is a recurring theme for the team. By focusing on consistent definitions and data schemas for on-field events, they aim to simplify how video and event logs are parsed so that insights are portable across matches, competitions, and teams. The academic setting, Davis notes, offers the freedom to pursue these infrastructure questions alongside immediate match-analysis tasks. That combination—foundational data work plus targeted tactical studies—has helped the lab earn a reputation for influence among decision-makers inside professional clubs.
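
As an illustration of what a shared event schema might look like, the sketch below normalizes a provider-specific record into one common shape. The field names, the 0 to 100 coordinate convention, and the alias table are all assumptions made for this example, not the lab's published format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    match_id: str
    team: str
    action: str    # shared vocabulary, e.g. "pass", "throw_in", "shot"
    x: float       # 0-100 along the pitch, toward the opponent's goal
    y: float       # 0-100 across the pitch
    success: bool

# Hypothetical mapping from provider-specific labels to the shared vocabulary.
ACTION_ALIASES = {
    "Pass": "pass",
    "pass": "pass",
    "ThrowIn": "throw_in",
    "throw-in": "throw_in",
    "Shot": "shot",
}

def normalize(raw: dict, pitch_length: float, pitch_width: float) -> Event:
    """Convert one raw record (coordinates in meters) into the shared schema."""
    return Event(
        match_id=str(raw["match_id"]),
        team=str(raw["team"]),
        action=ACTION_ALIASES[raw["type"]],  # raises KeyError on unknown labels
        x=100.0 * raw["x"] / pitch_length,
        y=100.0 * raw["y"] / pitch_width,
        success=bool(raw["success"]),
    )

raw_record = {"match_id": 1, "team": "A", "type": "ThrowIn",
              "x": 52.5, "y": 0.0, "success": True}
event = normalize(raw_record, pitch_length=105.0, pitch_width=68.0)
```

Rescaling coordinates and collapsing label variants at ingestion time is what lets the same downstream analysis run on matches recorded by different providers or on pitches of different sizes.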

Industry Response

Within the recruitment and analytics community, the lab is viewed as a driver of practical, field-ready innovation. Hugo Rios-Neto, who leads data recruitment at Belgium’s Royal Sporting Club Anderlecht, describes Davis’s unit as a leading force in soccer analytics. That assessment reflects both the specificity of the group’s studies—such as testing the value of a deliberate throw-in scenario—and the accessibility of its outputs, which enable clubs to fold the findings into their own processes. It also reflects a broader industry shift: more organizations are hiring dedicated data teams to safeguard competitive advantages, and they look to external research for validated approaches they can adapt in-house.

In this environment, reproducible methods matter. The lab’s emphasis on transparent tooling and clear, testable claims helps practitioners compare strategies without relying on hunches. For analysts working under match-day time pressure, that means they can query models for likely outcomes and stress-test alternatives rather than debate hypotheticals. The “Boot it” study, for example, formalizes a late-stage possession choice that analysts have noticed informally in recent seasons, grounding it in numbers rather than anecdotes.

Research Roots

Davis, 45, did not begin his career in soccer. He grew up in Wisconsin, gravitating to basketball and American football, and only encountered soccer as a serious spectator sport during the 2002 World Cup, when Brazil’s run captivated global audiences. His academic path took shape at the University of Wisconsin–Madison, where his computer science doctoral work applied AI to medical texts, partnering with radiologists on analyzing mammography reports. That early focus on pattern recognition in high-stakes, data-rich environments would later inform his sports research.

In October 2010, Davis joined KU Leuven as a computer science professor working at the intersection of AI and health, with studies on monitoring athletic performance. His team examined how to combine heart rate and other physiological indicators to detect overtraining and delved into the biomechanics of running. The pivot toward the tactical and technical dimensions of soccer began when he recruited Jan Van Haaren, an engineering student concentrating on AI and an avid soccer observer. Van Haaren’s questions about passing, shooting, and the mechanisms of ball progression matched the moment, as the sport’s infrastructure for event logging and video analysis was just maturing.

From there, the research increasingly targeted soccer’s unique analytical challenges. Compared with baseball or basketball, where isolating discrete actions such as a pitch or a jump shot is relatively straightforward, soccer presents a continuous, interdependent flow. Davis recognized that machine learning’s ability to model complex, dynamic systems made it a natural fit for this setting. The resulting tools treat the match as a sequence of linked probabilities, where the value of a single decision depends on its context and its ripple effects several actions later.
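
One simple way to picture "a sequence of linked probabilities" is a small chain of game states with transition probabilities between them. The sketch below is a toy under stated assumptions: the three states and every probability are invented, and the lab's actual models are learned from data rather than written by hand.

```python
from typing import Dict

Transitions = Dict[str, Dict[str, float]]

def prob_shot_within(transitions: Transitions, start: str, steps: int) -> float:
    """Probability of reaching the 'shot' state from `start` in at most `steps` actions."""
    if start == "shot":
        return 1.0
    if steps == 0 or start not in transitions:
        return 0.0  # out of actions, or an absorbing state such as "lost"
    return sum(
        p * prob_shot_within(transitions, nxt, steps - 1)
        for nxt, p in transitions[start].items()
    )

# Invented numbers: each row sums to 1; "lost" and "shot" end the sequence.
toy_chain: Transitions = {
    "midfield": {"midfield": 0.6, "final_third": 0.2, "lost": 0.2},
    "final_third": {"final_third": 0.5, "shot": 0.2, "lost": 0.3},
}

from_midfield = prob_shot_within(toy_chain, "midfield", 10)
from_final_third = prob_shot_within(toy_chain, "final_third", 10)
decision_value = from_final_third - from_midfield
```

In this framing, the value of a decision is the change in the probability of a good outcome several actions later, which is why an action that looks costly on the next touch can still come out ahead.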

Market Impact

The immediate beneficiaries of this work are the clubs that fold the findings into scouting, training, and match preparation. Better roster evaluation emerges when performance is measured within consistent, context-aware frameworks. Strategic planning becomes more grounded when teams can test whether a favored approach actually converts to more productive action chains. And targeted pattern discovery—surfacing, for instance, the payoff from conceding a throw-in in a specific zone—gives coaches and players a way to operationalize insights without guesswork.

Equally important is the lab’s open posture. By sharing tools and emphasizing standardized data, Davis’s group lowers the barrier for analysts across the sport to replicate and critique results, a prerequisite for lasting adoption. The effect is cumulative: more consistent data yields more reliable models, which in turn produce clearer guidance for in-game choices. In a low-scoring sport where marginal gains can decide outcomes, the capacity to quantify and simulate those margins is the essence of competitive advantage.

As professional clubs deepen their internal analytics capabilities, the blend of applied studies like “Boot it” and foundational work on in-game data standards positions Davis’s lab at a productive intersection: close enough to the touchline to answer urgent tactical questions, and sufficiently independent to build the shared infrastructure that makes those answers durable. The result is a model for how academic AI research can inform real-time decision-making at the highest levels of soccer without sacrificing rigor or accessibility.