As with all interesting ideas, the notion of viewing a basketball game as a network of passes is not new. Even as I was thinking several months ago about attempting it as a project, Alan Reifman of Texas Tech was iterating through a second attempt to do the same. His extended abstract, “Network Analysis of Basketball Passing Patterns II” (PDF), was presented at NetSci 2006 last May as part of the student poster work. This kind of thing has roots further back, with Yaneer Bar-Yam talking about the complexity of team play in basketball (2000). Passing was also the focus of a 1979 study of soccer by Peter Gould and Anthony Gatrell, using the 1977 Liverpool-Manchester games, and of work presented at last year’s Social Network Analysis Sunbelt Conference (Lee/Borgatti/Molina/Merelo Guervos). Although I’ll have to account for these efforts before I can do anything with my own data, I have immersed myself this semester in trying to understand the process on my own.
Methodology
My hope for this first part of the project was to come away both with key insights from the network analysis of the game of basketball and also with a process for mining information useful to coaches. I envisioned analyzing the entire 2005-06 season for the IU women’s basketball program in the first month and then applying the new techniques to the upcoming season, which starts in earnest next month. The many hours it took to mine data from just a single game, however, brought me back down to earth, and my sights are set much lower.
In watching game tapes of last year’s “best win” and “worst loss,” I quickly came to realize that I could keep track of only a very limited amount of information as I watched. The videotape was grainy, particularly on a non-digital screen, and the camera shots and post-production editing masked some of the information I wanted to track. I also had to come up with a way to translate what was happening on the court into data points of a network. The end result was a process that included five passes through the same 40-minute game to piece together the data I needed:
- Possessions — My definition is smaller than an actual possession, which indicates continuous control of the ball. It was more useful to look at the possession chunks that start and stop in certain ways. As a result, there will never be a shot contained within a play network.
- Range — The baseline-to-baseline court was divided into six areas, numbered from 6 down to 1 at the basket. The free-throw line and a line roughly a meter behind a pro 3-point arc are the dividing points.
- Zone — Same thing as the last pass, except working from sideline to sideline. The zones are identified as Top, Middle and Bottom, with the orientation flip-flopping on possessions so the teams are aligned.
- Players — The sometimes hard-to-read jersey numbers identify the player, and thus any meta information associated with that person (such as statistics, experience and position … although that would need to be mined from other sources at a later time).
- Time — This data mining pass had to be scrapped since I lacked a meaningful and efficient way to keep track of the time. This part might become trivial if I were to digitize the video footage or leverage other indicators of time in a custom application.
Ball movement, not strict passing, is technically what is being captured here. While I did record whether or not a ball moved due to a dribble or a pass, in this initial analysis I didn’t care how the ball moved across the court – just that it did move, where and by whom.
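A minimal sketch of how one coded ball movement might be stored, assuming one record per movement. The class name, field names and the sample values are hypothetical; the post does not say how the data was actually recorded.

```python
from dataclasses import dataclass

RANGES = range(1, 7)                  # 6 = far baseline, 1 = at the basket
ZONES = ("Top", "Middle", "Bottom")   # sideline-to-sideline bands


@dataclass(frozen=True)
class BallMove:
    """One coded movement of the ball (field names are illustrative)."""
    chunk: int        # id of the possession chunk this move belongs to
    team: str         # team with the ball, e.g. "IU"
    jersey: int       # jersey number of the player moving the ball
    from_range: int
    from_zone: str
    to_range: int
    to_zone: str
    by_pass: bool     # True for a pass, False for a dribble

    def __post_init__(self):
        # Reject codes that fall outside the 6 x 3 court grid.
        for r in (self.from_range, self.to_range):
            if r not in RANGES:
                raise ValueError(f"range must be 1-6, got {r}")
        for z in (self.from_zone, self.to_zone):
            if z not in ZONES:
                raise ValueError(f"unknown zone: {z!r}")


# Made-up example: IU's #3 passes from range 3, Top to range 2, Middle.
move = BallMove(chunk=1, team="IU", jersey=3,
                from_range=3, from_zone="Top",
                to_range=2, to_zone="Middle", by_pass=True)
```

Keeping the pass/dribble flag on the record means the initial analysis can ignore it while a later one can still filter on it.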
Visualizing the Network
My coding scheme differs from Reifman’s in its emphasis on physical location. I wanted to know if certain outcomes, teams or situations displayed signature patterns in the networks of ball movement. The result was a two-dimensional data collection that yields three distinct kinds of networks: Zone, Range and Location.
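One way the three networks could be derived from the same two-dimensional data is to count directed edges three times, keyed by zone, by range, and by the (range, zone) pair. This is only a sketch under assumptions: the function name and the sample moves are invented, and dropping moves that stay within the same node is a choice made here, not something the post specifies.

```python
from collections import Counter


def build_networks(moves):
    """Count directed edges for the zone, range and location networks.

    Each move is ((from_range, from_zone), (to_range, to_zone)).
    """
    zone_net, range_net, location_net = Counter(), Counter(), Counter()
    for (r1, z1), (r2, z2) in moves:
        if z1 != z2:                      # assumption: ignore self-loops
            zone_net[(z1, z2)] += 1
        if r1 != r2:
            range_net[(r1, r2)] += 1
        if (r1, z1) != (r2, z2):
            location_net[((r1, z1), (r2, z2))] += 1
    return zone_net, range_net, location_net


# Made-up sample of four ball movements.
sample = [
    ((6, "Middle"), (5, "Middle")),
    ((5, "Middle"), (3, "Top")),
    ((3, "Top"), (2, "Middle")),
    ((2, "Middle"), (2, "Top")),
]
zone_net, range_net, location_net = build_networks(sample)
```

The edge counts can then be handed to any graph-drawing tool as weighted, directed edges.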
Zone Network: Overall, the middle of the court is used heavily as an intermediary between the two sidelines. This makes sense given the need for perimeter ball movement to get around defenders. Visually, the ball also appears to come in from the top to the middle more frequently than by any other path, probably indicating a preference for passing from the top side when setting up a shot inside.
Range Network: The rules dictate this network in that there will never be passes from the closer ranges (1-3) back to the distant ranges (4-6); doing so results in a turnover. However, there is a noticeable avoidance of the “4” range before crossing over midcourt, as well as a tendency to keep the ball close to the basket.
Location Network: When the range and zone are combined, a grid of 18 areas (6×3) is formed over the basketball court, indicating the physical location of the ball as it moves around. The patterns here show ball movement to be primarily a perimeter activity, with another visual hint of a top-to-middle emphasis that might be related to shot attempts close to the basket.
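The range-network rule described above (no movement from ranges 1-3 back to ranges 4-6) also suggests a cheap sanity check on hand-coded data: any such move inside a chunk is either a turnover or a coding mistake. A small sketch, with a hypothetical function name and made-up moves:

```python
def crosses_back(from_range, to_range):
    """True if the ball goes from ranges 1-3 back to ranges 4-6."""
    return from_range <= 3 and to_range >= 4


# Made-up (from_range, to_range) pairs from one coded chunk.
coded = [(6, 5), (5, 3), (3, 2), (2, 4), (1, 1)]

# Moves worth re-checking against the tape.
suspect = [m for m in coded if crosses_back(*m)]
```

Here only the move from range 2 to range 4 would be flagged for a second look.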
There are also some differences surfacing when comparing different cross-sections of the composite data, such as the networks of each game and separation by home and visitors (IU was one of each in this sample).
[Figure panels: “best win” | “worst loss”; Home | Visitor]
Future Work
The individual players are not yet included in this analysis. (They were, but I found a flaw in my query logic that failed to distinguish between an IU jersey #3 and #3 jerseys on Bowling Green or Purdue.) I also want to compare apples to apples by normalizing the data, so I can get a good visual comparison between chunks leading to scores and those not leading to points.
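Both points can be sketched briefly. Keying players by team and jersey together avoids the #3 collision, and dividing each edge count by the network total is one simple normalization (an assumption here; other schemes are possible). All names and counts below are invented for illustration.

```python
from collections import Counter

# Key each player by (team, jersey) so IU's #3 and Purdue's #3 stay distinct.
touches = Counter()
for team, jersey in [("IU", 3), ("Purdue", 3), ("IU", 3)]:
    touches[(team, jersey)] += 1


def normalize(edge_counts):
    """Convert raw edge counts to each edge's share of all movements."""
    total = sum(edge_counts.values())
    if total == 0:
        return {}
    return {edge: n / total for edge, n in edge_counts.items()}


# Made-up zone-network counts for chunks that did and did not end in points.
scoring = Counter({("Top", "Middle"): 6, ("Middle", "Bottom"): 2})
non_scoring = Counter({("Top", "Middle"): 9, ("Middle", "Bottom"): 27})

scoring_share = normalize(scoring)
non_scoring_share = normalize(non_scoring)
```

With shares instead of raw counts, a network built from a few scoring chunks can be drawn on the same scale as one built from many non-scoring chunks.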
If this initial analysis should lead anywhere, there are some opportunities to include other similar games in the study. Indiana played Purdue two other times during the course of last season. The Hoosiers also played Minnesota three times, all in close proximity to the Purdue games. For this season, there is a rematch with Bowling Green on December 6 and then back-to-back games against Purdue on January 14 and 21. Adding to the intrigue of the in-state rivalry is the fact that last season’s coach, Sharon Versyp, resigned in April to take the job at Purdue. Versyp was replaced by Felisha Legette-Jack.
2 replies on “Hoops Network”
In basketball, passes are probably not as important as in soccer. The network will be completely connected, and besides, not many passes are needed to go from the start of a play to a score. So tackling it by zones is probably the right approach.
Do you have more information on that 1977 game by Liverpool? I didn’t know it.
PS. Thanks for the reference
The full citation for the Gould article is:
I also found this mention, with some more details, in Connections:
Peter Gould died in 2000, I believe.