Where I Stand

A couple of events over the past few weeks have had me thinking about how my work is interpreted and used in various hockey-related discussions. I figure this blog is the best platform to summarize my thoughts on the matter and to clarify any issues that arise in the future.

“Analytics Guy”

This always makes me cringe, for the simple fact that “analytics” is used so loosely, especially in hockey, that I often have to re-familiarize myself with the concept. To me, analytics is the process of collecting raw data, refining it, applying different models, finding correlations and, ultimately, looking for some sort of pattern to base a decision on. I do some of this, but leave the hard work to people who know what they’re doing.

My approach is to start with a question and then find the data that’s already been scraped from NHL.com and aggregated in an easy-to-use format. Thanks to websites like War on Ice, Hockey Analysis, Behind the Net and Natural Stat Trick, all I have to do is find the metric that’s been derived from the analytics (e.g., Corsi or Fenwick) and apply it to whatever question or topic I have. I do look for patterns. I do look for correlations. But the bulk of the work is done by real analytics-type people with backgrounds in computational science and statistics. Once I have the data and have run my analysis, I try to explain in 750 words why my topic matters, what I found and what I think the next steps are.

“Corsi Guy”

Now this one is relatively new.

Last week when Mark Fayne was put on waivers to be sent to the AHL, I openly questioned how Eric Gryba was any better than Fayne. Without a doubt Fayne has struggled mightily this season, even getting benched at times and healthy scratched. But I still consider him ahead of guys like Gryba and others for the simple fact that he’s a proven player and has more experience playing against top competition. Gryba has not looked good to me at all, and does not appear to have the ability to move up and down the lineup like Fayne would. For what it’s worth, my own analysis found that Fayne wasn’t shooting at a frequency that McLellan expects from his defencemen, and this might be why he’s been waived.

Now I do look to shots and shot attempt data mainly because it’s a good indicator of possession and has been reviewed and analyzed by some very bright people (Arctic Ice Hockey, Pension Plan Puppets, SB Nation to name a few). It’s not perfect and can’t answer every question, but I have my reasons for using it.
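For anyone new to the terms, here is a minimal sketch of how these shot-attempt metrics are commonly defined: Corsi counts every shot attempt (on goal, missed or blocked), Fenwick leaves out the blocked ones, and each is usually quoted as a team's share of the total. The function names and the game counts below are made up for illustration and are not taken from any of the sites mentioned.

```python
def corsi(on_goal, missed, blocked):
    """All shot attempts: shots on goal, missed shots and blocked shots."""
    return on_goal + missed + blocked


def fenwick(on_goal, missed):
    """Unblocked shot attempts only: shots on goal plus missed shots."""
    return on_goal + missed


def share(attempts_for, attempts_against):
    """Percentage of the combined attempts that belong to one team."""
    total = attempts_for + attempts_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100.0 * attempts_for / total


# Hypothetical 5-on-5 counts for a single game (invented numbers).
# "blocked" means attempts by that side that the opponent blocked.
ours = {"on_goal": 30, "missed": 12, "blocked": 14}
theirs = {"on_goal": 25, "missed": 10, "blocked": 9}

corsi_for = corsi(**ours)
corsi_against = corsi(**theirs)
fenwick_for = fenwick(ours["on_goal"], ours["missed"])
fenwick_against = fenwick(theirs["on_goal"], theirs["missed"])

cf_pct = share(corsi_for, corsi_against)
ff_pct = share(fenwick_for, fenwick_against)

print(f"Corsi For %: {cf_pct:.1f}")
print(f"Fenwick For %: {ff_pct:.1f}")
```

A share above 50 means the team generated more attempts than it allowed in that sample; as noted above, that is a starting point for questions rather than a final answer.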

First off, shots and shot attempts tend to be the best metric for the question I have or the topic I’m exploring. My thoughts aren’t overly complicated, so I can typically track down the exact data set I need rather quickly, without having to use any modelling to test correlations. If I can’t find the data set, I ask around. That’s how I found things like Ryan Stimson’s passing project or Corey Sznajder’s Zone Entry project.

Quick note: What I stress to anyone who’s looking into any sort of analytics, whether it be hockey or business, is to approach the data with specific questions. And be ready for continuous analysis and discussion. Analytics does not provide any sort of final answer. In my opinion, the best analytics articles are the ones that leave you with more questions.

I also like the shot data because it’s readily available to anyone and everyone. Using a data source that’s used by many other people gives my work some credibility and also makes my work verifiable.

Having said all of that, I’ve always remained open to new metrics that have some thought and explanation behind them. Hockey analytics is only in its infancy, so I expect people to keep collecting and aggregating data, which can only push the discussion along. Examples include dCorsi, Dangerous Fenwick, xGoals and the results from manual data collection projects.

So should using metrics such as Corsi or sharing the work of others who use Corsi make me a “Corsi Guy”? Hardly.

“Analytics Community”

This one sounds all warm and fuzzy, but it’s been used as a way to put down a whole group of people when really the target might be one or two.

Another problem with this phrase is the generalization of the intended participants. There are some in this community who are the actual statisticians who parse through and test the data. There are some who do the aggregation (e.g., War on Ice). There are the visualization people. And then there are those who have an understanding of the data and just like reading articles about it. So when someone says “Analytics Community”, I really have no idea who they are referring to and tend to ignore the rest of their issue.

And finally, there is a lot (a LOT) of disagreement among fans when it comes to the application of analytics to hockey. Player A might look great to one person using one metric, while Player B might look better using another metric. But when someone says “Analytics Community”, it sounds like everyone is on the same page and has come to the same results.

“Edmonton Media”

There are a few local media types, ones that work full time for one of the major outlets, that tend to stir the pot to draw extra attention to their work. We know this is part of the game when it comes to covering sports in Edmonton. A lot of it is what I refer to as scripted ignorance. For instance, taking a shot at the “analytics community” is a good way to get under the skin of a lot of people and draw attention to themselves. It’s usually the same three or four local reporters that tend to do this. This doesn’t bother me because statistical analysis has been done for a long, long time. It’s a way for fans to get into the game and it helps to add to the discussion. Plus, the beauty of modern communication technology is that individual fans create their own little ecosystem and control what information they receive, create and share.

What does bother me is how the rest of the folks who cover the game get lumped in with the few ignorant ones. Outside of our Oilers bubble, “Edmonton Media” does not have a good reputation, which isn’t fair to the individuals who actually do make a conscious effort to expand their scope to include analytics. And the reputation of Edmonton being a tough place to play is warranted, but has been driven by a lot of the garbage content produced by the few.

 

As always, feedback is appreciated.


6 thoughts on “Where I Stand”

  1. I always enjoy your posts. I am not a ‘math guy’ but I find some of the different methods available to be useful as I wander the desert. Thank you for your hard work.

  2. Good morning and thanks for the piece but I for one look at hockey analytics for what they are, minimal. The team that totally dominates are very obvious but when you get to 48.5 v 51.5 and use this to determined the better team, this is flawed in style of play dictating shot blocking teams as to those that don’t want their D going down to block. Some teams force all the traffic and shots to the outside resulting in more shots of lower quality. etc.

    When it comes to individual players corsi is a joke, any player in the league that comes in and is pasted to the 4th line with a majority of zone starts in the Dzone with 3rd paring D will end up with a lower corsi than playing with Hall and Dri. Team sports makes it impossible to pin point a players ability through shot differential. As far as Fayne to Gryba, I think you are as far off as you could imagine. First off, TMc knows what he is doing. Fayne has more experience but so does Kevin Low, guy looks as done as Low, let it go. Gryba is seldom out of position V Fanye. Gryba provides high quality PK, Fayne is awful. Gryba clears the front, Fayne could play with eggs in his pockets and not break one. Gryba forces wingers to the wood, Fayne forces them to the front of the net. Analytics are only as good as the quality of data entered. throwing useless information into a calculation=false results.

  3. Good morning and thanks for the piece but I for one look at hockey analytics for what they are, minimal. The team that totally dominates are very obvious but when you get to 48.5 v 51.5 as a corsi number and use this to determined the better team, this is flawed in style of play dictating shot blocking teams as to those that don’t want their D going down to block. Some teams force all the traffic and shots to the outside resulting in more shots of lower quality. etc.

    When it comes to individual players corsi is a joke, any player in the league that comes in and is pasted to the 4th line with a majority of zone starts in the Dzone with 3rd paring D will end up with a lower corsi than playing with Hall and Dri. Team sports makes it impossible to pin point a players ability through shot differential. As far as Fayne to Gryba, I think you are as far off as you could imagine. First off, TMc knows what he is doing. Fayne has more experience but so does Kevin Low, guy looks as done as Low, let it go. Gryba is seldom out of position V Fanye. Gryba provides high quality PK, Fayne is awful. Gryba clears the front, Fayne could play with eggs in his pockets and not break one. Gryba forces wingers to the wood, Fayne forces them to the front of the net. Analytics are only as good as the quality of data entered. throwing useless information into a calculation=false results.

  4. Pingback: Edmonton blog roundup: Dec. 14, 2015 | Seen and Heard in Edmonton
