Showing posts with label play by play. Show all posts

Monday, September 3, 2007

Bruins by the Numbers: Stanford

All right, so I generated my first statistics for the 2007 season. Most of them came out right, but I think there are still a few bugs to work out. Anyway, you can see them on the right-hand side of the blog. I have them up for all the PAC-10 teams, including the Bruins.

The first thing that jumped out at me was the number of big plays (i.e., gains of 20+ yards). UCLA had 9 such plays against Stanford. That's almost triple last season's average of 3.2 big plays. Obviously, those long strikes downfield translated into a lot of yards and some touchdowns as well. The flip side of that number is Stanford's 4 big plays. That's a lot, and I hope that number comes way down after we play BYU this Saturday.

The play calling numbers looked good. It is hard to see if a pattern is developing after just one game, but the mix looked about right. Lots of runs this time, but Stanford was giving up big yards on the ground and it helped establish some nice play-action passing in the second half.

Looking at the official NCAA statistics, the Bruin offense ranks 1st in the PAC-10 and 4th in the country. I'm sure those numbers will come down a bit as we take on stiffer competition than Stanford, but not a bad start for this squad! Those numbers were bolstered by Kahlil Bell's career-best 198 yards.

Probably the best number from the game up in Palo Alto was the red zone scoring. After that first misfire, the Bruins were a perfect 4 touchdowns on 4 attempts. Considering how much UCLA struggled last season in red zone efficiency, Saturday's matchup was a big step in the right direction.

On the defensive side of the ball, it was Kyle Bosworth's day. The backup defensive end had two sacks to lead the team. The Bruins had four sacks in all, with Trey Brown and Tom Blake adding to the total. It was good to see the linebackers leading in the tackling column, with Reggie Carter topping the list at 10 stops. When your safeties are your leading tacklers, you know your opponents are getting behind your guys. Carter, Whittington and Taylor were the top three tacklers this week.

Monday, June 25, 2007

Final PAC-10 Stats

I'm finally wrapping up generating statistics for all of UCLA's opponents for 2007. I just put up Washington, Washington State, Oregon and Oregon State. I have almost all of the games, except for WSU, which is missing a lot of data. Hopefully this next year I can get more game data for all of the schools.

I'm also looking to complete all of the different statistical queries and calculations so I can focus on getting those reports ready for the upcoming season. For Husky, Coug, Duck, and Beaver fans visiting the board for the first time, you may want to check out the stats pages for the other PAC-10 teams. You can find those along the right-hand side of the blog under "Bruin Roar Football Statistics". I also plan to put up a PAC-10 report that shows how all of the teams compare to each other. That should be the final piece of the puzzle, and I think it will make all of the effort really pay off.

Enjoy!

Sunday, June 10, 2007

Cal and Stanford Statistics

I've loaded up new statistics for the California Golden Bears and the Stanford Cardinal for 2006. For Bear and Cardinal fans who are new to this blog, you can check out stats for the other teams along the right-hand side of the website under "Bruin Roar Statistics". I'm hoping to finish loading data for the rest of the PAC-10 teams, Notre Dame, Utah, and BYU before the end of the month.

On a side note, since we are talking about Cal and Stanford, this year is the 25th anniversary of "The Play". It is, hands down, the greatest and most exciting finish to a college football game ever.

I love this video, from an ESPN Classic broadcast, because it shows the Stanford field goal before the kickoff and the aftermath on the field after the amazing kickoff return. I had forgotten that John Elway put together an amazing drive, starting deep in Cardinal territory with less than a minute left, to get into field goal position. They kick the field goal, with 4 seconds left on the clock, to take the lead and presumably win the game. That's when all the fun begins.

There are so many cool things about the video. Joe Starkey, the famous broadcaster, is awesome in his description of the events as they unfold. After Stanford scores, he gives a prediction that Cal "pretty much has to run it back to save the game". After a bunch of laterals, including a miraculous over-the-shoulder pass by Mariet Ford, Kevin Moen runs the ball into the end zone for the touchdown to win the game. The play is immortalized with a picture of Moen crashing into a Stanford band member in the end zone after he scored.

I can't describe the play any better than Starkey did in his original broadcast: "Oh my God, the most amazing, sensational, dramatic, heart-rending... exciting, thrilling finish in the history of college football! California has won... the Big Game... over Stanford."

Monday, May 28, 2007

Arizona and ASU Statistics

I just uploaded new 2006 statistics for Arizona and Arizona State. I wasn't able to get data for all of their games, and I think both teams are missing one or two. Anyway, it is still very interesting to see the information for the desert schools.

For Wildcat and Sun Devil fans coming to this blog for the first time, you may want to check out the statistics for USC and UCLA as well. Eventually all of UCLA's 2007 opponents, including the entire PAC-10, BYU, Utah, and Notre Dame, will be available. You can always find the latest Bruin Roar Football Stats on the right-hand side of the blog.

Thursday, May 24, 2007

Statistics Feedback from SC and UCLA fans

After visiting a couple of Trojan and Bruin message boards this week, I received a lot of good feedback from football fans of both schools. Here are some of the suggested additions to the Bruin Roar Football Statistics data:
  • "Efficiency numbers" and "Red Zone efficiency" from localbruin on BruinGold.
  • "Play calling by distance and down together" from tommytrojan1122 on WeAreSC.
  • "Points Per Possession/Minute" from WestsideUSCFan on WeAreSC.
  • "1st-10th play, etc performance" from Anonymous on BruinRoar.
I've already added some of those to the latest version of the reports. I'm going to see if I can add the others in the next couple of months. I came up with a few more ideas myself:
  • Special teams breakdowns for punts, kick-off returns, field goals, etc.
  • Play calling sequences for first 3 plays in a drive.
  • Scoring plays by distance.
  • Turnover breakdown on fumbles and interceptions.
  • Breakdown of drive length by number of plays, distance, and time-of-possession.
If you are itching for a statistic, even an ad-hoc query for something you are interested in knowing, drop me a comment on the blog and I'll see if I can help you out.

Monday, May 21, 2007

Lies, Damned Lies, and Statistics

I've always loved numbers and statistics, especially when you apply them to football. There is something gratifying about being able to distill a football game down into a nice little table of data. Maybe I'm weird like that, but I know there are others out there who feel the same way.

One problem I've had over the years is that there just isn't a lot of good statistical data out there for college football. It is either high-level stuff like total yards and touchdowns, or it is meaningless splits that break down the yards into ridiculous categories like turf type (My team wins more games on Bermuda!). After years of searching for something better, I decided it was time to create something myself.

Over the last month, I have been working on a computer program that collects play-by-play data for games, parses out the details, and then saves that information into a database. I can then slice and dice the numbers to my heart's content to find out all kinds of interesting statistics about games, teams, and entire conferences. It is still in the early phases, but I thought I would share some of the results of my work with everyone.

I'm working on generating some reports from the data and I'll be refining those over the summer. I'm looking for feedback on what types of statistics and queries you would like to see in the reports, so after you view them drop me a comment. I'll also be posting around on some message boards looking for feedback. I'll post blog entries as I load more teams, but you can always find the latest updates on the right-hand side of the website under the new Bruin Roar Football Statistics section.

For now, I've loaded all the games for UCLA from 2005 and 2006. I've also loaded in data for USC from 2006 as a comparison. My plan is to load all of the games from last season, for all of the teams we play in 2007, into the database. If I get a positive response from readers, then I'll continue to post new reports, for that week's game, during the season. I think it will definitely be a big resource for you armchair analysts out there.

Technical Details

For those of you interested in how the program works, please read on. If you find such computer-speak boring then you may want to check out now.

The program is written primarily in Java. I use the Apache Commons HttpClient for retrieving the web pages with the play-by-play data. I then do a screen-scrape of the page and pull out just the play descriptions. Parsing the details of the play data isn't technically difficult, but it did take up the most effort.

There are lots of subtle differences in the way plays are described, so finding every possible combination, and reliably extracting that information, has proven to be challenging. I also have to validate the results, as sometimes the original data is just flat out wrong. I try to fix what I can, but it is hard to catch everything as there are over 2,000 plays in a season for a typical team. The good news is that the data is probably 95% correct, so a few misclassified plays one way or the other won't impact the overall numbers much.
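
To give a feel for the parsing step, here is a minimal sketch rather than the actual program. It assumes descriptions shaped like "Bell rush for 12 yards" or "Olson pass complete to Cowan for 25 yards"; real feeds vary a lot more than this. The class name PlayParser, the Play type, and both patterns are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: the description formats matched here are assumed for
// illustration and are not taken from a real play-by-play feed.
class PlayParser {

    /** Minimal parsed result: play type and yards gained (negative for a loss). */
    static final class Play {
        final String type; // "RUSH", "PASS", or "UNKNOWN"
        final int yards;

        Play(String type, int yards) {
            this.type = type;
            this.yards = yards;
        }

        /** A "big play" as defined on this blog: a gain of 20 or more yards. */
        boolean isBigPlay() {
            return yards >= 20;
        }
    }

    // In both patterns, group 1 is the optional loss marker and group 2 is the yardage.
    private static final Pattern RUSH =
            Pattern.compile("(?i)\\brush for (a loss of )?(\\d+) yards?\\b");
    private static final Pattern PASS =
            Pattern.compile("(?i)\\bpass complete to .+? for (a loss of )?(\\d+) yards?\\b");

    /** Classifies one description; anything unrecognized comes back as UNKNOWN. */
    static Play parse(String description) {
        Matcher rush = RUSH.matcher(description);
        if (rush.find()) {
            return new Play("RUSH", signedYards(rush));
        }
        Matcher pass = PASS.matcher(description);
        if (pass.find()) {
            return new Play("PASS", signedYards(pass));
        }
        return new Play("UNKNOWN", 0);
    }

    private static int signedYards(Matcher m) {
        int yards = Integer.parseInt(m.group(2));
        return m.group(1) == null ? yards : -yards;
    }
}
```

Returning an explicit UNKNOWN type, instead of guessing, is one way to collect the descriptions that still need a new pattern or a manual fix.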

Once I have the data parsed, I store it in a pretty simple object structure in memory. I use Velocity to write the object data out into different file formats. The main format is a set of SQL statements for inserting the information into a local MySQL database. I put everything into a single de-normalized table, just to make the report generation as quick and simple as possible.
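
Here is a rough sketch of what turning one parsed play into an INSERT statement could look like. To keep it self-contained it uses plain String.format as a stand-in for a Velocity template, and the table name plays, its columns, and the PlayRow class are all assumptions for illustration, not the real schema.

```java
import java.util.Locale;

// Hypothetical sketch: String.format stands in for a Velocity template, and the
// "plays" table and its columns are assumed, not the real schema.
class PlaySqlWriter {

    /** One row of the single de-normalized table (columns are illustrative). */
    static final class PlayRow {
        final String team;
        final String opponent;
        final int season;
        final int down;
        final int distance;
        final String playType;
        final int yards;

        PlayRow(String team, String opponent, int season, int down,
                int distance, String playType, int yards) {
            this.team = team;
            this.opponent = opponent;
            this.season = season;
            this.down = down;
            this.distance = distance;
            this.playType = playType;
            this.yards = yards;
        }
    }

    /** Wraps a value in single quotes, escaping backslashes and quotes for MySQL. */
    static String quote(String value) {
        return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'";
    }

    /** Renders one row as a single INSERT statement. */
    static String toInsert(PlayRow p) {
        return String.format(Locale.ROOT,
                "INSERT INTO plays (team, opponent, season, down, distance, play_type, yards) "
                        + "VALUES (%s, %s, %d, %d, %d, %s, %d);",
                quote(p.team), quote(p.opponent), p.season, p.down,
                p.distance, quote(p.playType), p.yards);
    }
}
```

The escaping matters because player names such as O'Neal would otherwise break the generated statement. If the program inserted rows directly over JDBC instead of writing a script file, a PreparedStatement would be the safer choice.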

To create the reports, I use a local instance of Tomcat running some JSP pages. I have another Java program that loops through all the teams and games, passes those as parameters to the JSP pages, and then saves off the HTML generated. Finally, I run a script that FTPs all the HTML documents to the web server and, voilà, you have the reports.
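
The export loop can be sketched along these lines. This is a simplified illustration under stated assumptions: the base URL, the JSP page name teamReport.jsp, the team and season parameter names, and the output file naming are all hypothetical, and it passes only a team and a season rather than individual games.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Locale;

// Hypothetical sketch: the URL, JSP name, and parameter names are assumed.
class ReportExporter {

    // Assumed address of the local Tomcat instance and report page.
    static final String BASE = "http://localhost:8080/stats/teamReport.jsp";

    /** Builds the report URL, encoding the team name for use in a query string. */
    static String reportUrl(String team, int season) {
        try {
            return BASE + "?team=" + URLEncoder.encode(team, "UTF-8") + "&season=" + season;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Derives a file name such as "arizona-state-2006.html" from team and season. */
    static String outputFileName(String team, int season) {
        String slug = team.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]+", "-");
        return slug + "-" + season + ".html";
    }

    /** Fetches the rendered page and saves the HTML, overwriting any older copy. */
    static void export(String team, int season, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        Path target = outputDir.resolve(outputFileName(team, season));
        try (InputStream in = URI.create(reportUrl(team, season)).toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    /** Exports one season's report for each team in turn. */
    static void exportAll(String[] teams, int season, Path outputDir) throws IOException {
        for (String team : teams) {
            export(team, season, outputDir);
        }
    }
}
```

Because the pages are saved as static HTML, the public web server never needs Tomcat or the database, only the uploaded files.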

To run the whole thing for one team, for one season, takes less than 5 minutes. There are still a few manual processes in there, but I'm trying to automate the entire thing. I'm still tracking down bugs and refining the program, but I'm pretty happy with the way it works.

Enjoy!