Data Optimization for Fantasy Sports Analytics

Problem Overview

Fantasy sports represent a rich and exciting world of modeling and analytic possibilities. With the advent of modern computer vision and statistics tracking, and the sporting community’s general embrace of a “data-centric” view of the game, there is a wealth of information available about each player, their performances, and various metadata.

The key is to find an application for this data that is interesting, but also challenging enough to steer us away from the traditional “we’ve seen this before” blog post on sports analytics. As such, we have chosen a particularly niche case relevant to fantasy sports: how to draft your fantasy team. The model we develop will be solved using a D-Wave quantum annealer. This post applies the model to fantasy hockey, but it could be adapted to any fantasy sport (with enough subject matter expertise to change the scoring and other factors). We note, however, that this model formulation works best for “long term” fantasy teams where you use the same players week to week.

The D-Wave Quantum Annealer

The quantum annealer is good at solving exactly one kind of problem: discrete optimization problems. Specifically, it can solve linear optimization problems and Quadratic Unconstrained Binary Optimization (QUBO) problems, which will be our focus. A QUBO takes the following form:

$$E(x) = \sum_{i} a_i x_i + \sum_{i<j} b_{i,j} x_i x_j \qquad (1)$$

Here $E$ is the function to be optimized, $a_i$ is the ith linear constant, $x_i$ is the ith binary variable, and $b_{i,j}$ are the quadratic constants, or coupling terms. The goal of any QUBO is to make $E$ either as small or as big as possible (minimization or maximization). The equation above is deceptively simple, and it represents an NP-hard combinatorial optimization problem. In this context, an NP-hard problem means that we (typically) cannot be sure we have found the smallest or largest value of $E$ without evaluating every possible combination of our binary variables $x$, which, as you might imagine, could represent a serious time commitment.
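To make the “every possible combination” point concrete, here is a minimal brute-force sketch. The function names, the dictionary layout for the coupling terms, and the coefficient values are all made up for illustration; nothing here comes from the D-Wave tooling. It evaluates the energy for all 2^n binary assignments, which is exactly the cost that grows out of control as n increases.

```python
from itertools import product


def qubo_energy(x, linear, quadratic):
    """Linear terms plus pairwise coupling terms for a binary tuple x."""
    energy = sum(a * xi for a, xi in zip(linear, x))
    energy += sum(b * x[i] * x[j] for (i, j), b in quadratic.items())
    return energy


def brute_force_minimum(linear, quadratic):
    """Check all 2^n assignments; only practical for very small n."""
    n = len(linear)
    return min(product((0, 1), repeat=n),
               key=lambda x: qubo_energy(x, linear, quadratic))


# Made-up coefficients for a three-variable toy problem.
linear = [-1.0, -1.0, -2.0]
quadratic = {(0, 1): 2.0, (0, 2): 0.5, (1, 2): 3.0}

best = brute_force_minimum(linear, quadratic)
print(best, qubo_energy(best, linear, quadratic))
```

Three variables mean eight candidates; a pool of a few hundred players would mean more candidates than could ever be enumerated, which is why heuristic and annealing approaches are attractive.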

Despite their deceptively simple form and rapidly growing solution space, these equations have applications in nearly every industry, from finance to logistics and machine learning. Due to the sheer number of potential applications, and an ever-increasing demand to find good solutions quickly and efficiently, the quantum annealer was invented.

In short, the quantum annealer does not search the entire solution space. Instead, it has two non-classical advantages available to it. The first is that a quantum annealer can tunnel through thin (but high) potential barriers, which allows it to explore the search space more effectively. The second, and more important, is that a quantum annealer operates at temperatures near absolute zero, where thermal motion is almost entirely suppressed.

From this natural property of the universe, the quantum annealer gains another advantage: a lazy universe. At temperatures near absolute zero, anything from molecules and electrons to the Josephson junctions inside your quantum annealer wants to exist only in its ground state, that is, its lowest energy state. Our system will be naturally drawn to the lowest energy states, as there is an energetic advantage: a natural tendency to find a stable low-energy configuration. Indeed, when it comes to quantum annealing, we no longer rely only on an algorithmic advantage that leads us to local minima or maxima, as in classical computing; we also have an energetic advantage rooted in fundamental properties of the universe.

Markowitz Portfolio Optimization

The world of fantasy sports shares a lot in common with the world of finance: predicting long-term returns, extrapolating the performance of an asset into the future, and assembling the perfect portfolio of those assets.

In finance, these assets are stocks and bonds, and their returns are in cold hard soulless cash. In fantasy sports our assets are the players themselves and their returns are their fantasy points. In finance, portfolio optimization is about assembling a collection of different investments that you hope will perform well long-term by balancing high and low risk investments in order to maximize your potential return. The same is true in fantasy sports, except now the portfolio is your team roster.

Let’s introduce a classic of finance, Markowitz Portfolio Optimization, and how it relates to fantasy sports. Markowitz Portfolio Optimization is typically best suited for long-term investments. In its most basic form, a Markowitz Portfolio Optimization problem is defined as follows:

$$\max_{x} \; r^{T} x - \gamma \, x^{T} Q x$$

Here $r$ is our expected returns vector, $x$ is the variable we’re optimizing (our list of assets), $Q$ is the covariance matrix of our returns, and $\gamma$ is our risk tolerance parameter, which can take values above 0 (but typically below 1). Values of $\gamma$ closer to zero indicate a higher risk tolerance in the portfolio, and larger values indicate a more risk-averse portfolio.

As written above, our goal is to maximize our returns while minimizing the risk. In classic portfolio optimization, x can take continuous values as we can buy fractional shares, or only invest part of our budget in an asset. In fantasy sports however, we cannot take fractional amounts of players (however, we concede that would result in an interesting twist on fantasy sports).

Indeed, we can either choose a player for our team, or not. As such, we have a binary optimization problem, meaning that each entry of our target variable $x$ must be either zero or one: 1 indicates we’ve chosen a player, 0 indicates we haven’t. This is defined explicitly below:

$$\max_{x} \; r^{T} x - \gamma \, x^{T} Q x \quad \text{subject to} \quad x_i \in \{0, 1\}$$

We now have a discrete optimization problem: we must choose a discrete number of things in order to maximize our expected returns. In its current form, this isn’t very interesting. Surely, we could just choose only the star players and dominate our fantasy leagues. However, the real challenge of fantasy sports comes in the form of rules surrounding how we must choose our team, that is, how our choices are constrained.
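Before any constraints are added, the binary objective can be sketched in a few lines. The numbers below are invented for illustration (three hypothetical players, with the first two high-scoring but volatile and positively correlated), and `markowitz_objective` is just a name chosen here. The point is only to show how the risk parameter shifts the preference from a high-variance pair to a steadier one.

```python
def markowitz_objective(x, returns, cov, gamma):
    """Expected return minus gamma times the risk term, for a 0/1 vector x."""
    n = len(x)
    expected = sum(returns[i] * x[i] for i in range(n))
    risk = sum(cov[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return expected - gamma * risk


# Invented data: expected points and covariance for three players.
returns = [10.0, 8.0, 6.0]
cov = [[9.0, 2.0, 0.0],
       [2.0, 4.0, 0.0],
       [0.0, 0.0, 1.0]]

star_heavy = (1, 1, 0)  # the two highest expected scorers
balanced = (0, 1, 1)    # lower expected points, much lower variance

for gamma in (0.1, 1.0):
    print(gamma,
          markowitz_objective(star_heavy, returns, cov, gamma),
          markowitz_objective(balanced, returns, cov, gamma))
```

With a small risk parameter the star-heavy pair scores higher; with a large one the balanced pair wins, which is the trade-off the parameter is meant to control.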

Markowitz for Fantasy Hockey

With an example specific to hockey, let’s consider the following common constraints:

Budget constraints: Each player has an associated cost, and we only have so much “money” with which to draft them to our team.
Position constraints:
    We must have at least a certain number of goalies.
    We must have at least a certain number of forwards.
    We must have at least a certain number of defensemen.
Team size constraints: Our team must consist of some precise number of players.

The above constraints are the essence of fantasy sports. We can ruin the fun of these rules by formalizing them into our optimization problem below:

$$\begin{aligned}
\max_{x} \quad & r^{T} x - \gamma \, x^{T} Q x \\
\text{subject to} \quad & \sum_{i \in \{G\}} x_i \geq N_G, \\
& \sum_{i \in \{D\}} x_i \geq N_D, \\
& \sum_{i \in \{F\}} x_i \geq N_F, \\
& \sum_{i \in \{T\}} x_i = N_T, \\
& \sum_{i \in \{T\}} v_i x_i \leq V_{max}, \\
& x_i \in \{0, 1\}
\end{aligned}$$

In the above, {G} is the set of goalies and $N_G$ is the minimum number of goalies we are required to pick, {D} is the set of defensemen and $N_D$ is the minimum number of defensemen we must pick, {F} is the set of forwards and $N_F$ is the minimum number of forwards we must draft, {T} is the set of all players under consideration and $N_T$ is the number of players we are required to draft, $v_i$ is the value associated with the ith player, and $V_{max}$ is the maximum budget for drafting our team.

That’s a perhaps daunting number of variables to introduce all at once, so let’s walk through this a little. Start with the greater-than-or-equal-to constraints for each position. For example, the constraint

$$\sum_{i \in \{G\}} x_i \geq N_G$$

for goalies says that we must pick at least $N_G$ goalies. As each $x_i$ is a binary variable, this is a simple sum of 0s and 1s: if we have chosen a player, $x_i = 1$; otherwise it is zero. The constraint is similar for each position, giving a series of inequality constraints. The second constraint of interest is:

$$\sum_{i \in \{T\}} v_i x_i \leq V_{max}$$

This represents our cost constraint. It says that, summed over the set {T}, the values of the players we have chosen must be less than or equal to our budget. Finally, our sole equality constraint enforces that we must have a team of size $N_T$. In order to have an acceptable answer, known as a feasible solution in the optimization world, all these constraints must be satisfied simultaneously.
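As a sanity check on what “feasible” means here, the constraints can be written as a plain function that accepts or rejects a candidate roster. This is a minimal sketch: the `Player` class, the function name, and the five-player pool are hypothetical, and the limits are tiny so the example is easy to follow by hand.

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    position: str  # "G", "D" or "F"
    cost: int


def is_feasible(picks, pool, min_by_position, team_size, budget):
    """True only if every constraint holds for the 0/1 pick vector."""
    chosen = [p for p, x in zip(pool, picks) if x == 1]
    if len(chosen) != team_size:                 # team size (equality)
        return False
    if sum(p.cost for p in chosen) > budget:     # budget (upper bound)
        return False
    for position, minimum in min_by_position.items():  # position minimums
        if sum(p.position == position for p in chosen) < minimum:
            return False
    return True


# Hypothetical pool and limits.
pool = [Player("goalie_a", "G", 3), Player("defense_a", "D", 2),
        Player("forward_a", "F", 4), Player("forward_b", "F", 1),
        Player("defense_b", "D", 1)]
limits = dict(min_by_position={"G": 1, "D": 1, "F": 1}, team_size=3, budget=6)

print(is_feasible((1, 1, 0, 1, 0), pool, **limits))  # within every limit
print(is_feasible((1, 1, 1, 0, 0), pool, **limits))  # over budget
print(is_feasible((0, 1, 0, 1, 1), pool, **limits))  # no goalie
```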

Preparing for D-Wave

At this stage, with the appropriate data, this model would be ready for ingestion into a classical optimization tool like CVXPY. However, we want to run this on a D-Wave quantum annealer. In its current form, that would be impossible, as the quantum annealer cannot directly optimize a problem with explicit constraints.

Specifically, we need to rewrite this as the one type of problem the annealer can handle: something that looks like equation 1. The first step is to write out the sums implied by the matrix notation explicitly:

$$\max_{x} \; \sum_{i=1}^{N} r_i x_i - \gamma \sum_{i=1}^{N} \sum_{j=1}^{N} q_{i,j} x_i x_j$$

Here we have directly specified that there are $N$ players, and $q_{i,j}$ is element $i,j$ of our covariance matrix.

Now, to coerce this into a form resembling equation 1 we will need to:

Multiply by 1
Add zero

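Specifically, we will remove the constraints and express them as the following penalty terms in our objective function:

$$\lambda_1 \sum_{\alpha \in \{G, D, F\}} \left( \sum_{i=1}^{N} C_{\alpha,i} x_i - N_\alpha \right)^2 + \lambda_2 \left( \sum_{i=1}^{N} v_i x_i - V_{max} \right)^2$$

Here $\lambda_1$ and $\lambda_2$ are free parameters set by us, known as Lagrange multipliers, and the new parameters $C_{\alpha,i}$ represent the position “cost” of a player. For example, the table below outlines the values of this parameter for each class of player: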

Constant	Value for Forwards	Value for Defense	Value for Goalies
$C_{G,i}$	0	0	1
$C_{F,i}$	1	0	0
$C_{D,i}$	0	1	0
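The table translates directly into indicator lists, one per position. In this small sketch the function names and the four-player example are hypothetical; it builds the indicators from a list of position labels and uses them to count how many players of each position a pick vector contains.

```python
POSITIONS = ("G", "F", "D")


def position_indicators(positions):
    """For each position, a 0/1 list marking which players play it."""
    return {alpha: [1 if p == alpha else 0 for p in positions]
            for alpha in POSITIONS}


def position_counts(x, indicators):
    """Number of selected players at each position for a 0/1 vector x."""
    return {alpha: sum(c * xi for c, xi in zip(row, x))
            for alpha, row in indicators.items()}


# Hypothetical pool: two forwards, one defenseman, one goalie.
indicators = position_indicators(["F", "F", "D", "G"])
counts = position_counts((1, 0, 1, 1), indicators)
print(indicators)
print(counts)
```

Each player has a 1 in exactly one row, so summing a row against the pick vector gives the count for that position.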

With these new constants, the first sum represents our previous positional constraints, and the second sum represents the value constraint. The terms are squared so that they can never be negative, which is what the optimization requires. Of course, it’s not immediately obvious how the above expression is zero like we have claimed. The truth is it’s not, except at a feasible solution that satisfies our constraints. For example, let’s focus on the value constraint. If we have completely used our budget, then the following is true:

$$\sum_{i=1}^{N} v_i x_i - V_{max} = 0$$

This of course is a soft constraint controlled by the Lagrange multiplier, but the idea is the same for the player constraints: if we have drafted a correctly sized team, these sums evaluate to zero. Our optimization problem then becomes (by subtracting the penalty terms from the maximization objective):

$$\max_{x} \; \sum_{i=1}^{N} r_i x_i - \gamma \sum_{i=1}^{N} \sum_{j=1}^{N} q_{i,j} x_i x_j - \lambda_1 \sum_{\alpha \in \{G, D, F\}} \left( \sum_{i=1}^{N} C_{\alpha,i} x_i - N_\alpha \right)^2 - \lambda_2 \left( \sum_{i=1}^{N} v_i x_i - V_{max} \right)^2$$

We subtract these terms because whenever one of them is nonzero, it makes the objective function smaller and therefore works against our maximization. We should also note that we will have to set and tune the values of our Lagrange multipliers. The larger the value of the Lagrange multiplier, the “more we care” about the constraint.
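The behaviour of the penalty is easy to check numerically. The sketch below assumes the penalty is a squared positional mismatch per position plus a squared budget mismatch, each scaled by its own multiplier; the function name, the four-player pool, and the multiplier values are illustrative choices only.

```python
def penalty(x, indicators, required, costs, budget, lam1, lam2):
    """Squared positional mismatches plus squared budget mismatch."""
    position_term = sum(
        (sum(c * xi for c, xi in zip(indicators[a], x)) - required[a]) ** 2
        for a in required)
    spend = sum(v * xi for v, xi in zip(costs, x))
    return lam1 * position_term + lam2 * (spend - budget) ** 2


# Hypothetical pool: two forwards, one defenseman, one goalie.
indicators = {"G": [0, 0, 0, 1], "F": [1, 1, 0, 0], "D": [0, 0, 1, 0]}
required = {"G": 1, "F": 1, "D": 1}
costs = [4, 1, 2, 3]
setup = dict(indicators=indicators, required=required, costs=costs,
             budget=9, lam1=10.0, lam2=1.0)

print(penalty((1, 0, 1, 1), **setup))  # right positions, budget fully used
print(penalty((1, 1, 1, 1), **setup))  # one extra forward, over budget
print(penalty((0, 1, 1, 1), **setup))  # right positions, under budget
```

Because the budget term is squared, a roster that spends less than the cap is penalized too, which is one practical reason to keep the budget multiplier modest relative to the positional one.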

For example, we may set $\lambda_2$, the Lagrange multiplier for our budget constraint, to a smaller number than $\lambda_1$, the Lagrange multiplier for the player constraints, as going under budget is fine, but having an incorrect number of players is a mistake we cannot afford to make. After some tedious algebra and removing some constants, we can rewrite the objective in a form friendlier for D-Wave as follows:

$$\max_{x} \; \sum_{i=1}^{N} \left( r_i + 2 \lambda_1 \sum_{\alpha} N_\alpha C_{\alpha,i} + 2 \lambda_2 V_{max} v_i \right) x_i - \sum_{i=1}^{N} \sum_{j=1}^{N} \left( \gamma q_{i,j} + \lambda_1 \sum_{\alpha} C_{\alpha,i} C_{\alpha,j} + \lambda_2 v_i v_j \right) x_i x_j$$

This may not look like we’ve made a lot of progress towards making this work on the D-Wave quantum annealer. However, this is exactly what we need. Comparing this to equation 1, the terms in the parentheses are the $a_i$ and $b_{i,j}$ terms we were looking for.
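The “tedious algebra” can be checked numerically rather than by hand. The sketch below assumes squared penalties for the positional counts and the budget, expands them into linear and pairwise coefficients, and confirms by brute force that the expanded form differs from the penalized objective only by a constant. All names and the three-player data are hypothetical.

```python
from itertools import product


def qubo_coefficients(r, Q, C, required, v, budget, gamma, lam1, lam2):
    """Linear terms a, pairwise terms b, and the constant that gets dropped."""
    n = len(r)
    a = [r[i]
         + 2 * lam1 * sum(required[k] * C[k][i] for k in C)
         + 2 * lam2 * budget * v[i]
         for i in range(n)]
    b = [[gamma * Q[i][j]
          + lam1 * sum(C[k][i] * C[k][j] for k in C)
          + lam2 * v[i] * v[j]
          for j in range(n)] for i in range(n)]
    constant = lam1 * sum(m ** 2 for m in required.values()) + lam2 * budget ** 2
    return a, b, constant


def expanded_objective(x, a, b):
    """Linear part minus the full double sum of pairwise terms."""
    n = len(x)
    return (sum(a[i] * x[i] for i in range(n))
            - sum(b[i][j] * x[i] * x[j] for i in range(n) for j in range(n)))


def penalized_objective(x, r, Q, C, required, v, budget, gamma, lam1, lam2):
    """Return minus risk minus the two squared penalty terms."""
    n = len(x)
    value = sum(r[i] * x[i] for i in range(n))
    value -= gamma * sum(Q[i][j] * x[i] * x[j]
                         for i in range(n) for j in range(n))
    value -= lam1 * sum((sum(C[k][i] * x[i] for i in range(n)) - required[k]) ** 2
                        for k in C)
    value -= lam2 * (sum(v[i] * x[i] for i in range(n)) - budget) ** 2
    return value


# Hypothetical data: one forward, one defenseman, one goalie.
r = [5.0, 3.0, 4.0]
Q = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 3.0]]
C = {"G": [0, 0, 1], "F": [1, 0, 0], "D": [0, 1, 0]}
required = {"G": 1, "F": 1, "D": 1}
v = [4, 2, 3]
budget = 9
params = dict(gamma=0.5, lam1=10.0, lam2=1.0)

a, b, constant = qubo_coefficients(r, Q, C, required, v, budget, **params)
candidates = list(product((0, 1), repeat=len(r)))
max_gap = max(
    abs(expanded_objective(x, a, b) - constant
        - penalized_objective(x, r, Q, C, required, v, budget, **params))
    for x in candidates)
best = max(candidates, key=lambda x: expanded_objective(x, a, b))
print(max_gap, best)
```

The double sum here includes the diagonal terms. Since a binary variable squared equals itself, those diagonal entries could equally be folded into the linear coefficients before handing the problem to a solver.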

Drafting a Winning Team

We will look at the Sportsnet 2021-2022 fantasy hockey pool and scoring system. This is a relatively simple scoring system where goals are worth three points, assists are worth two points, and for goalies a win is worth four points, an overtime win/loss is worth two points, and a shutout is worth two additional points.
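The scoring rules above are simple enough to write down as two small functions. This is only a sketch of the rules as described in this post: the function and parameter names are made up, and since the post does not say how an overtime win combines with a regular win, overtime results are simply counted as their own two-point statistic, which is an assumption.

```python
def skater_points(goals, assists):
    """Three points per goal, two per assist."""
    return 3 * goals + 2 * assists


def goalie_points(wins, overtime_results, shutouts):
    """Four per win, two per overtime result, two extra per shutout."""
    return 4 * wins + 2 * overtime_results + 2 * shutouts


print(skater_points(goals=2, assists=1))
print(goalie_points(wins=1, overtime_results=1, shutouts=1))
```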

In this version, every participant can draft any player they would like, however, each player has an associated cost of 1-4, and your entire team cannot exceed a value of 30. You are also required to draft six forwards, four defensemen, and two goalies to comprise your team.

We calculate our returns vector and covariance matrix from publicly available NHL player statistics. The “mean returns” of each player make up the returns vector, and we use a classical covariance indexed by game played for each player. It should be noted that this is a clear place for improvement; we could align our covariance differently to account for lines, teams, subject matter expertise and so forth. Aligning by date should be considered a baseline model.
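A minimal sketch of that calculation, in plain Python, might look like the following. It assumes each player has a list of fantasy points aligned entry by entry across players (same length, same ordering) and uses the sample covariance with an n - 1 denominator; the function names and the three short series are invented for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def sample_covariance(xs, ys):
    """Sample covariance (n - 1 denominator) of two aligned series."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def returns_and_covariance(points_by_player):
    """Mean points per player and the pairwise covariance matrix."""
    names = list(points_by_player)
    r = [mean(points_by_player[n]) for n in names]
    Q = [[sample_covariance(points_by_player[a], points_by_player[b])
          for b in names] for a in names]
    return names, r, Q


# Invented fantasy points over three aligned games.
points = {"player_a": [2, 4, 6], "player_b": [1, 1, 1], "player_c": [6, 4, 2]}
names, r, Q = returns_and_covariance(points)
print(r)
print(Q)
```

In this toy data the first and third players move in opposite directions, so their covariance is negative, which is the kind of pairing the risk term rewards.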

We also note that the model formulation described here would work best for a long-term head-to-head style league, contrary to the Sportsnet fantasy pool, which drafts new players weekly. We also note that this model would be significantly improved in these short-term versions by accounting for injuries and the number of games played in the upcoming week. However, we get some interesting results concerning how long it takes to find a solution when comparing classical and quantum solvers.

Results

We performed this optimization using our model above, with both classical and quantum solvers, to compare speed and effectiveness. To gauge the effectiveness of our model, we also used a “time traveler” metric: the perfect team for a given week, found retrospectively. That is to say, if we could see into the future, we would draft that team for the best possible score. One should never expect to draft the best possible team, but this “future” team makes an excellent baseline for our model’s performance. The closer we get to perfect, the better the model. For this model, we tried two different values of the risk tolerance parameter and included ten weeks of data for each optimization. The scores below show the results of the Markowitz optimal team using unseen data for the next week, simulating an actual contest scenario. These results are shown in Figure 1.
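The “time traveler” comparison can be sketched as a small brute-force search over realized scores. To keep the example short it enforces only team size and budget and leaves out the position minimums, which a full version would need; the function names and the four-player data are hypothetical.

```python
from itertools import combinations


def perfect_score(points, costs, team_size, budget):
    """Best achievable total, given the points each player actually scored."""
    return max(
        sum(points[i] for i in team)
        for team in combinations(range(len(points)), team_size)
        if sum(costs[i] for i in team) <= budget)


def fraction_of_perfect(team, points, costs, team_size, budget):
    """Realized score of a drafted team relative to the hindsight optimum."""
    realized = sum(points[i] for i in team)
    return realized / perfect_score(points, costs, team_size, budget)


# Invented realized points and costs for one week.
points = [10, 8, 7, 3]
costs = [4, 3, 2, 1]

best_possible = perfect_score(points, costs, team_size=2, budget=5)
share = fraction_of_perfect((0, 3), points, costs, team_size=2, budget=5)
print(best_possible, round(share, 3))
```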

Figure 1: Week-to-week performance of the model specified above, comparing classical and quantum solvers. In the second panel, the time comparison excludes problem setup, server communication, and latency; it reports raw CPU and QPU time.

From Figure 1 above, we note that our optimization model, on average, does not perform close to the “perfect” score, with one exception where it obtained 70% of a perfect score. The classical solver achieves 49% of the perfect score, and the quantum solver achieves 51%. It should be noted that subject matter experts (i.e., the winners each week on Sportsnet) typically achieve 70 to 85% of the greatest possible score. This poor performance in terms of winning is not necessarily surprising given the nature of the model used to optimize our teams, which is ripe for improvement, as discussed in the preceding section.

The solution time, however, is the key result in the figure above. Solving this system using a quantum annealer is, on average, nearly 270 times faster at finding a solution! If we had a lot of teams we wanted to manage, or we wanted to search a larger space, the speed advantage of the quantum annealer becomes apparent. It is also noteworthy that the quantum computer achieves slightly better performance on average in terms of points earned each week, and it will be interesting to incorporate further model developments and tune this model specifically to the Sportsnet contest. As with all problems in optimization, the best model will be obtained by consulting with subject matter experts to craft and tune an objective function specific to the application.

While the jury is still out on the effectiveness of this optimization tactic for fantasy hockey performance, results are expected to be above average. The optimization problem was formulated using Markowitz Portfolio Optimization to run on a D-Wave quantum annealer, showing that with a little bit of math, and some data extraction expertise, we can draft a competitive fantasy hockey team without needing to do hours of sports research. Instead, we get to do hours of optimization research.

Noteworthy Improvements

Certainly, this is not the end of the story. There are many possibilities to improve this optimization and achieve even better results. Any of the following could very likely increase performance: adding terms that reward drafting players on the same line, who tend to share in points for goals and assists; tuning the risk tolerance parameter to this scoring system; tracking injuries and maximizing games played week-to-week; starting with a strong initial guess at the solution, perhaps last week’s best performing line; or adding custom constraints or objectives to the objective function.

As well, it may be interesting to have a supplementary model that tries to predict individual player performances week-to-week, which could also help our system draft better teams in the short term.

Optimization Beyond Fantasy

Whether you’re on the field or cheering from the stands (or in this case, from behind a computer screen), the sports industry can benefit from data analytics. A team that can measure and understand itself through its own data will have the competitive edge. The application of data optimization extends way beyond fantasy sports, helping collect player data that simply cannot be collected outside of a laboratory setting, such as electromyography, physiological chemistry, or detailed whole-body motion capture videography for biometric analytics.

Artificial intelligence (AI) and computer vision open entirely new levels of opportunity for teams to assess each athlete’s performance through gameplay analytics, automating the analysis of huge quantities of photos and videos. Using this fundamental understanding of team performance, scenarios can be simulated, and plays can be optimized based on information from real-world game data. This can help inform coaching decisions for improved gameplay.

What’s more, through pricing optimization, sports teams can deploy predictive ticket pricing or apparel manufacturing that considers inventory restrictions, pricing constraints, and nuances in customer demand. There is ample opportunity to deploy data analytics solutions to derive insights from data within the sports industry and beyond. Get in touch with Mosaic to get started.

Published by Sel Gerosa on March 16, 2022
