  • Locked thread
swickles
Aug 21, 2006

I guess that I don't need that though
Now you're just some QB that I used to know
Many of you are avid readers of the Football Arthouse Farthouse. If not, it's a great thread where people contribute drawings, photoshops, music, and even writing of all kinds. It's basically where Febreeze's https://www.thedrawplay.com originated, and it became a place for Toaster Beef to post his silly comics. Ehud eventually launched his own blog as well.

Lately, a few people like CzarStark have been posting more technical and science/math oriented writings. While writing itself is indeed a creative endeavor, I feel like there should be a place where the science, math, and various other aspects of football can be discussed. CzarStark's article was a really good start. I also usually do some explanation of injuries in a separate thread, so this just feels like a unifying home.

What else can be posted in here? Anything really! As long as it's in the spirit of asking or answering some question about football, it's welcome, as it will start discussion that will hopefully lead to greater understanding of what's happening. People need to learn and this is a great place to do it. Why is everyone making a big deal and poking holes in the Wells Report? It's because they used the ideal gas law and not the Beattie–Bridgeman model, or the van der Waals equation. Why do some people come back from an ACL tear like Adrian Peterson and others end up all RG3'd? Lots of reasons, but usually there is a correlation between how many ligaments are destroyed along with the ACL and rehab success. Who shot JR? It was Kristin, obviously. You got questions, we got answers!

I don't want this to be like reddit where you have to submit proof that you are a doctor, lawyer, dinosaur, or student in x, y, or z. However, if you do make a claim others might find dubious or odd, don't be surprised if your future answers are met with skepticism.

Resources for stats and stuff:
Pro Football Reference: http://www.pro-football-reference.com/
Armchair Analysis: http://www.armchairanalysis.com/
College Football stats: http://www.cfbstats.com/
More college football stats: http://www.sports-reference.com/cfb/
Is Tom Brady a Cheater: https://www.istombradyacheater.com


The first thing I would like to talk about is related to CzarStark's article. There is a concept in statistics, particularly epidemiology and biostats, called power: roughly, the chance that a study detects an effect that is really there. A study has to have a certain number of people in it for its results to be trustworthy, so it's basically tied to sample size. If you are tracking a rare disease or whatever, you will need a huge study for it to be appropriately powered, and the required size varies depending on whether you want to detect small changes or large ones. Baseball uses sabermetrics effectively because of the sample size and repeatability. Let's say you have a player in his 5th season whose on-base percentage is .400. Since this number is generated by 5 years of data, it's a huge data set. Assuming an average of 4 plate appearances a game over 5 seasons at 162 games a season, that's a number taken from 3,240 data points. Also, since he is going to come to the plate about the same number of times next season, you can generally assume that he is going to be around that number again. It's the number of data points that gives you this reliability.
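A rough sketch of why the required study size swings so much with the size of the effect you are chasing. It uses the textbook normal-approximation formula for comparing two proportions; the function name, the 5% significance level, and the 80% power target are assumptions picked for illustration, not anything taken from CzarStark's article.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate observations needed PER GROUP to tell p1 from p2.

    Normal-approximation formula for a two-sided test of two proportions.
    alpha and power are conventional defaults, assumed here for illustration.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = standard_normal.inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Telling a .400 OBP hitter from a .390 hitter vs. from a .350 hitter.
small_gap = sample_size_two_proportions(0.400, 0.390)
large_gap = sample_size_two_proportions(0.400, 0.350)
print(small_gap, large_gap)
```

The gap enters the formula squared, so shrinking it from 50 points of OBP to 10 multiplies the required plate appearances by roughly 25. That is the sample-size problem in a nutshell.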

The problem in football is that we often take very unique situations and generalize them to develop the set of numbers and sample size we want. So while 4th and 3 or less might be a good data set, it doesn't take into account things like weather, time in the game, play styles, and, with football, the ultimate randomizer: injuries. In this case I am not talking about the type of injury where the LG comes out for the game. I am talking about the RT who just got his foot stepped on and is otherwise fine, but won't be able to generate the same amount of power with it for a few snaps. This type of "injury" is more likely to be present later in the game than earlier. It's the nature of the game. Essentially, the point I am trying to make is that plays in football are more than down and distance, score, and time. We also overgeneralize plays, and we end up with "the math says go for it on every 4th down!" when that's clearly not the case. I understand plenty of coaches are overly conservative; I just don't think the solution is to abandon that conservatism over numbers that aren't exactly representative of the situation at hand.

swickles fucked around with this message at 02:23 on Sep 1, 2015


pangstrom
Jan 25, 2003

Wedge Regret
I have a bunch of questions I've wanted to try to answer piling up, right now all I can offer is something about the Draft I linked to in TFF last year.
http://www.bradybutterfield.com/nfl/?p=1

pangstrom
Jan 25, 2003

Wedge Regret

swickles posted:

The problem in football is that we often take very unique situations and generalize them to develop the set of numbers and sample size we want. So while 4th and 3 or less might be a good data set, it doesn't take into account things like weather, time in the game, play styles, and, with football, the ultimate randomizer: injuries. In this case I am not talking about the type of injury where the LG comes out for the game. I am talking about the RT who just got his foot stepped on and is otherwise fine, but won't be able to generate the same amount of power with it for a few snaps. This type of "injury" is more likely to be present later in the game than earlier. It's the nature of the game. Essentially, the point I am trying to make is that plays in football are more than down and distance, score, and time. We also overgeneralize plays, and we end up with "the math says go for it on every 4th down!" when that's clearly not the case. I understand plenty of coaches are overly conservative; I just don't think the solution is to abandon that conservatism over numbers that aren't exactly representative of the situation at hand.
Not disagreeing so much as taking this a bit further: one-off reasons can be on both sides of the ball (say, the star DE is dinged up or the defense is gassed or whatever). Math can give you a "baseline" from which to deviate based on whatever the circumstances are, though. If your win probability goes up (say) 15% by going for it in a certain situation, well, that's a big number and probably overrules most normal reasons to deviate.

Beer4TheBeerGod
Aug 23, 2004
Exciting Lemon
Useful resources for stats that probably don't matter:

Pro Football Reference: http://www.pro-football-reference.com/
Armchair Analysis: http://www.armchairanalysis.com/

Metapod
Mar 18, 2012
gently caress yo numbers go deep

thanks for making this thread

Mel Mudkiper
Jan 19, 2012

At this point, Mudman abruptly ends the conversation. He usually insists on the last word.
I petition we make this the official image of the Smarthouse

GNU Order
Feb 28, 2011

That's a paddlin'

Just gonna ground floor this thread by saying that wins are the only stat that matter

CzarStark
Dec 23, 2007

by R. Guyovich
Great OP, swickles! I didn't know the concept of larger sample size combined with difficulty of obtaining a reliable result had a name, but Power is definitely the problem with football stats.

GNU Order posted:

Just gonna ground floor this thread by saying that wins are the only stat that matter

My article just updated with a few words saying exactly this! in my opinion anyway

A few more observations from my data set that might exhaust the interesting things that it tells me:
-QB Cap hit is correlated with yards, touchdowns, and 1st downs all at about the 0.60 level, but with wins only at the 0.45 level.
-Interceptions are positively correlated with completions of 20+ yards (0.5759) and total yards (0.5222).
-Sacks are strongly correlated with total yards (0.9545)
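For anyone who wants to reproduce numbers like the ones in the list above, here is a minimal sketch of the Pearson correlation coefficient using only the standard library. The cap-hit and win figures are made up for illustration; they are not CzarStark's data set, and nothing here confirms how his correlations were actually computed.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical QB cap hits (in $ millions) and team wins, invented for the example.
cap_hit_millions = [2.0, 5.5, 9.0, 14.0, 18.5, 20.0]
wins = [4, 9, 6, 8, 12, 7]
r = pearson_r(cap_hit_millions, wins)
print(round(r, 3))
```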

With completions and yards so highly correlated (0.9861) I wanted to take a closer look at those as well. If the intercept of a linear fit of completions vs. attempts is fixed to 0, we get CMP = 0.6226*ATT with a correlation of 0.985. This means that a QB in the NFL has about a 62% completion rate with very little variance, regardless of any other factors.
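A zero-intercept fit like the one described above has a closed form: the slope is sum(x*y) / sum(x*x). The sketch below assumes that is the kind of fit meant; the attempt and completion totals are hypothetical placeholders, not the real data set.

```python
def fit_through_origin(xs, ys):
    """Least-squares slope for y = slope * x with the intercept fixed at 0."""
    sum_xx = sum(x * x for x in xs)
    if sum_xx == 0:
        raise ValueError("xs must contain at least one non-zero value")
    return sum(x * y for x, y in zip(xs, ys)) / sum_xx


# Hypothetical season totals for three QBs (invented for the example).
attempts = [100, 200, 300]
completions = [60, 125, 190]
slope = fit_through_origin(attempts, completions)
print(f"CMP = {slope:.4f} * ATT")
```

Because the squared x values weight the sum, high-attempt seasons dominate this slope, so it behaves more like a league-wide completion rate than an average of each QB's own rate.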

swickles
Aug 21, 2006

I guess that I don't need that though
Now you're just some QB that I used to know

CzarStark posted:

Great OP, swickles! I didn't know the concept of larger sample size combined with difficulty of obtaining a reliable result had a name, but Power is definitely the problem with football stats.

Like I said, I have only learned about and used it in the context of medical research, so I wasn't sure if it was called that in other fields as well.

Also, based on what you said about the QB completion percentage being uniform, I wonder whether it is possible to quantify a set number of yards that are reached in a game. I don't mean an average, I mean like a number that 95 or 99 percent of all teams get in every game. For example, in basketball a team is going to score at least 80 points a huge amount of the time. As a result, you can get the "looter in a riot" phenomenon, where an average player can put up 20 points a game, simply because points basically have to be scored in the game. Like even Ryan Lindley could put up 250 yards in a game. I just wonder if you can remove the mathematical outliers (like that Dolphins Steelers game in 2007 that was played in a monsoon with turf thrown down 20 minutes before the game). I think you can see a "bare minimum" number of yards that would be earned in almost every game. I think with that number you might be able to make some judgements about replacement QBs being good, average, or the looter in a riot. If it really was a looter in a riot, you could sign a guy off the street to do it and save some cap money.
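The "bare minimum" idea is really a low percentile of per-game yardage, not an average. Here is a sketch using a nearest-rank rule; the function name is an assumption and the 20 game totals are invented, since no per-game data has been posted in the thread.

```python
def percentile_floor(yards_per_game, percent_cleared):
    """Largest observed total that at least `percent_cleared` percent of games reach.

    Nearest-rank rule with integer arithmetic; `percent_cleared` is a whole
    number from 1 to 100.
    """
    if not yards_per_game:
        raise ValueError("need at least one game")
    if not 1 <= percent_cleared <= 100:
        raise ValueError("percent_cleared must be between 1 and 100")
    ranked = sorted(yards_per_game, reverse=True)
    # Ceiling of percent_cleared * n / 100 without floating point.
    games_needed = -(-percent_cleared * len(ranked) // 100)
    return ranked[games_needed - 1]


# Hypothetical passing-yard totals for 20 team-games (invented for the example).
game_yards = [310, 245, 198, 402, 275, 330, 221, 187, 265, 299,
              350, 240, 96, 284, 315, 260, 233, 371, 208, 290]
print(percentile_floor(game_yards, 95))   # floor cleared in 95% of these games
print(percentile_floor(game_yards, 100))  # the single worst game, outlier included
```

Comparing the 95% floor with the 100% floor is a cheap way to see how much a single monsoon game drags the minimum down.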

GonadTheBallbarian
Jul 23, 2007


How many of you know R and/or chart things with it?

pillsburysoldier
Feb 11, 2008

Yo, peep that shit

WugLyfe posted:

How many of you know R and/or chart things with it?

I'm teaching it to myself. Slowly.

Is this where I can post physics things?

pangstrom
Jan 25, 2003

Wedge Regret

WugLyfe posted:

How many of you know R and/or chart things with it?
I know R and it's my go-to, though I'm forced to use Excel a fair amount.

I don't want to get in a console-war style debate BUT: If you are starting fresh I would recommend Python, though. That's what I'm picking up slowly. (R is just as good, maybe slightly better, for analysis purposes, but Python is useful for all sorts of things)

related thread for this type of thing: http://forums.somethingawful.com/showthread.php?threadid=3359430

pangstrom fucked around with this message at 01:09 on Sep 1, 2015

Ehud
Sep 19, 2003

football.

I want you guys to have a place to put Effort Posts™ too, so if any of you nerds want to write something smart for FART, just let me know.

CzarStark's post on FART was very well received.

GonadTheBallbarian
Jul 23, 2007


pangstrom posted:

I know R and it's my go-to, though I'm forced to use Excel a fair amount.

I don't want to get in a console-war style debate BUT: If you are starting fresh I would recommend Python, though. That's what I'm picking up slowly. (R is just as good, maybe slightly better, for analysis purposes, but Python is useful for all sorts of things)

related thread for this type of thing: http://forums.somethingawful.com/showthread.php?threadid=3359430

Awesome! Thanks for this. I tried doing the coursera program, but that was... lovely.

swickles
Aug 21, 2006

I guess that I don't need that though
Now you're just some QB that I used to know

Beer4TheBeerGod posted:

Useful resources for stats that probably don't matter:

Pro Football Reference: http://www.pro-football-reference.com/
Armchair Analysis: http://www.armchairanalysis.com/

Thanks, I will toss in a resources section to the OP! List your links if you got em!

Ehud
Sep 19, 2003

football.

ESPN just introduced something called the Football Power Index. It's uh...Well here, you read about it:

Explanation on what it is:
http://espn.go.com/nfl/story/_/id/13539793/espn-nfl-football-power-index-debuts

How it was developed:
http://espn.go.com/nfl/story/_/id/13539941/how-espn-nfl-football-power-index-was-developed-implemented

And of course, the initial rankings:
http://espn.go.com/nfl/story/_/id/13550051/espn-nfl-football-power-index-rankings-august-31

axeil
Feb 14, 2006

WugLyfe posted:

How many of you know R and/or chart things with it?

R is the devil and I never want to touch it again after using it for my thesis.

SAS on the other hand... :getin:


Also DVOA is poo poo and is bad.

For a little more content, I was pondering if I could examine Pythagorean wins and see what level of correlation they have from year to year. Or more interestingly, if you look at a team's Pythag wins during the year, how predictive is it of the rest of the year? If at week 9 a team has 6 Pythag wins, how likely are they to get to 10 actual wins? Stuff like that.

I think only stats at the team or league level are really ever going to be usable because of the power issue football stats have. Maybe extending the season out to 18 games actually would be good because it'd give us 2 more games of observations. Probably not enough to make things anywhere close to meaningful though.

axeil fucked around with this message at 04:40 on Sep 1, 2015

swickles
Aug 21, 2006

I guess that I don't need that though
Now you're just some QB that I used to know
My stats training is limited to what I have learned in medicine and in earning a physics degree. Can you explain to me what Pythagorean wins means?

pangstrom
Jan 25, 2003

Wedge Regret

swickles posted:

My stats training is limited to what I have learned in medicine and in earning a physics degree. Can you explain to me what Pythagorean wins means?
How many wins a team "should" have earned, just based on points scored vs. points allowed. It's meant to be a slightly better measure of how good a team is than just going off their record.

axeil
Feb 14, 2006

swickles posted:

My stats training is limited to what I have learned in medicine and in earning a physics degree. Can you explain to me what Pythagorean wins means?

pangstrom posted:

How many wins a team "should" have earned, just based on points scored vs. points allowed. It's meant to be a slightly better measure of how good a team is than just going off their record.

Yeah pretty much this. It's to let you compare performance between two teams, one of whom is blowing people out and another one that's winning squeakers. Over time you'd expect the team that blows people out to be the better one than the one constantly winning close games, which is why the NCAA computers used to use margin of victory back in the BCS era. Unfortunately this caused coaches to run up the score against Podunk State while the metric was actually just trying to measure the difference between an Ohio State team that constantly wins by 14 and an Auburn team that keeps winning games by 3 or 4 points.

Here's the actual formula that FO came up with when they did some research into it (hey, not everything they do is lovely)

Wins = ((points scored)^2.37 / ((points allowed)^2.37 + (points scored)^2.37)) * 16
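The formula above drops straight into a few lines of Python. This is only a sketch: the function and parameter names are made up here, and the `games` argument is an addition so the same expression can be applied to a partial season (say, through week 9) and compared against the actual record.

```python
def pythagorean_wins(points_scored, points_allowed, games=16, exponent=2.37):
    """Expected wins from points scored and allowed, per the formula above.

    exponent=2.37 is the value quoted in the post; games=16 matches the
    regular season length it assumes.
    """
    scored = points_scored ** exponent
    allowed = points_allowed ** exponent
    if scored + allowed == 0:
        raise ValueError("need at least some points scored or allowed")
    return games * scored / (scored + allowed)


# Hypothetical team: 400 points scored, 300 allowed over a full season.
print(round(pythagorean_wins(400, 300), 2))
# Same idea over the first 8 games of a season, with made-up partial totals.
print(round(pythagorean_wins(210, 160, games=8), 2))
```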

Here's a whole shitload of words about how they came up with it from Wikipedia. It was originally created for baseball but the principle applies for all sports. https://en.wikipedia.org/wiki/Pythagorean_expectation

Wikipedia posted:

More simply, the Pythagorean formula with exponent 2 follows immediately from two assumptions: that baseball teams win in proportion to their "quality", and that their "quality" is measured by the ratio of their runs scored to their runs allowed. For example, if Team A has scored 50 runs and allowed 40, its quality measure would be 50/40 or 1.25. The quality measure for its (collective) opponent team B, in the games played against A, would be 40/50 (since runs scored by A are runs allowed by B, and vice versa), or 0.8. If each team wins in proportion to its quality, A's probability of winning would be 1.25 / (1.25 + 0.8), which equals 50^2 / (50^2 + 40^2), the Pythagorean formula. The same relationship is true for any number of runs scored and allowed, as can be seen by writing the "quality" probability as [50/40] / [ 50/40 + 40/50], and clearing fractions.

The assumption that one measure of the quality of a team is given by the ratio of its runs scored to allowed is both natural and plausible; this is the formula by which individual victories (games) are determined. [There are other natural and plausible candidates for team quality measures, which, assuming a "quality" model, lead to corresponding winning percentage expectation formulas that are roughly as accurate as the Pythagorean ones.] The assumption that baseball teams win in proportion to their quality is not natural, but is plausible. It is not natural because the degree to which sports contestants win in proportion to their quality is dependent on the role that chance plays in the sport. If chance plays a very large role, then even a team with much higher quality than its opponents will win only a little more often than it loses. If chance plays very little role, then a team with only slightly higher quality than its opponents will win much more often than it loses. The latter is more the case in basketball, for various reasons, including that many more points are scored than in baseball (giving the team with higher quality more opportunities to demonstrate that quality, with correspondingly fewer opportunities for chance or luck to allow the lower-quality team to win.)
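The "clearing fractions" step in the quoted passage can be written out in one line. Here RS and RA stand for runs scored and runs allowed; the symbols are shorthand introduced for this sketch and are not in the Wikipedia text.

```latex
\frac{RS/RA}{RS/RA + RA/RS}
  = \frac{RS/RA}{\dfrac{RS^{2} + RA^{2}}{RS \cdot RA}}
  = \frac{RS}{RA} \cdot \frac{RS \cdot RA}{RS^{2} + RA^{2}}
  = \frac{RS^{2}}{RS^{2} + RA^{2}}
```

With RS = 50 and RA = 40 this gives 2500 / 4100, or about 0.61, matching 1.25 / (1.25 + 0.8) from the example.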


axeil fucked around with this message at 05:38 on Sep 1, 2015

got any sevens
Feb 9, 2013

by Cyrano4747

swickles posted:


The problem in football is that we often take very unique situations and generalize them to develop the set of numbers and sample size we want. So while 4th and 3 or less might be a good data set, it doesn't take into account things like weather, time in the game, play styles, and with football the ultimate randomizer: injuries. In this case I am not talking about the type of injury where the LG comes out for the game. I am talking about the RT who just got his foot stepped on and is otherwise fine, but won't be able to generate the same amount of power with it for a few snaps. This type of "injury" is more likely to be present later in the game than earlier.

Does this mean making a good play later in the game, despite your tiredness (though maybe it equals out with the other team's?), is more impressive? :biotruths: (J/k dude, I know most of this board shits on the 'clutch' arguments but I think there is a small place for that info, if it tends to be a trend with certain players (Eli, etc)).

swickles
Aug 21, 2006

I guess that I don't need that though
Now you're just some QB that I used to know

axeil posted:

Yeah pretty much this. It's to let you compare wins between two teams, one of whom is blowing people out and another one that's winning squeakers. Over time you'd expect the team that blows people out to be the better one than the one constantly winning close games, which is why the NCAA computers used to use margin of victory back in the BCS era. Unfortunately this caused coaches to run up the score against Podunk State while the metric was actually just trying to measure the difference between an Ohio State team that constantly wins by 14 and an Auburn team that keeps winning games by 3 or 4 points.

See, that's funny, because the first thing I think about is Ohio State vs. Florida in 2006. Ohio State blew out most teams, their closest game being against Michigan where they won by a field goal. Florida squeaked out win after win and even had a loss to Auburn, and most people expected Ohio State to walk all over Florida. The final result was the exact opposite, with Florida dominating in every aspect of the game. A lot of that has to do with strength of schedule. College is unique simply because of the number of teams and the disparity of schedules. It's a whole other problem that is difficult to deal with.

axeil
Feb 14, 2006

swickles posted:

See, that's funny, because the first thing I think about is Ohio State vs. Florida in 2006. Ohio State blew out most teams, their closest game being against Michigan where they won by a field goal. Florida squeaked out win after win and even had a loss to Auburn, and most people expected Ohio State to walk all over Florida. The final result was the exact opposite, with Florida dominating in every aspect of the game. A lot of that has to do with strength of schedule. College is unique simply because of the number of teams and the disparity of schedules. It's a whole other problem that is difficult to deal with.

Yeah, I think college especially has issues because the schedule strength is vastly different. The NFL is a bit easier since, within a division at least, every team plays almost the same schedule, with the only difference being your 2 "where you finished last year" conference games.

But you're right, you'd need to come up with some way to weight for schedule quality if you wanted to use it for any kind of projection forward. Beating the 2008 Lions by 50 shouldn't count as much as beating the 2007 Pats by 10.


Oh god am I starting to create a regression model for NFL wins? This is a dark, dangerous and ultimately unlikely to succeed road...

Grittybeard
Mar 29, 2010

Bad, very bad!

axeil posted:

NCAA computers used to use margin of victory back in the BCS era. Unfortunately this caused coaches to run up the score against Podunk State while the metric was actually just trying to measure the difference between an Ohio State team that constantly wins by 14 and an Auburn team that keeps winning games by 3 or 4 points.

I always liked the idea of capping the margin of victory used in the calculation for things like this. I think the general idea was to cap it at 24 points (three TDs + 2 point conversions). Sure, you'd get some assholes trying to score a TD with 2 seconds left instead of kneeling the clock out against their conference rival to win by 10 instead of 3, but Juggernaut U has no reason to beat up on Podunk State any more than they already are.
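The capping idea is a one-line clamp. In this sketch the function name is made up, and the 24-point default comes from the "three TDs + 2 point conversions" figure in the post; it does not describe how any actual BCS computer worked.

```python
def capped_margin(points_for, points_against, cap=24):
    """Margin of victory (negative for a loss), clamped to +/- cap points."""
    margin = points_for - points_against
    return max(-cap, min(cap, margin))


# Juggernaut U 63, Podunk State 0: only 24 of the 63 points count.
print(capped_margin(63, 0))
# A 3-point rivalry win is untouched by the cap.
print(capped_margin(17, 14))
```

Anything past the cap is invisible to the ranking, so there is nothing to gain from a 60-point blowout, while the late touchdown that turns a 3-point win into a 10-point win still registers, exactly as the post worries.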

Is there some obvious reason that would have been a terrible idea? Not that it really matters anymore I guess, just curious.

Spoeank
Jul 16, 2003

That's a nice set of 11 dynasty points there, it would be a shame if 3 rings were to happen with it
It's called the Pythagorean Expectation because it was developed in baseball by Bill James and the original formula was:

Win% = (runs scored)^2 / ((runs scored)^2 + (runs allowed)^2)

It was very crude and got its name from its resemblance to the Pythagorean Theorem. The exponent was fine-tuned and refined for each sport. You'll hear second-order/third-order, which were refinements of the formula.



If you're interested in the Smarthouse and like baseball, too, check this (free) course out:
https://www.edx.org/course/sabermetrics-101-introduction-baseball-bux-sabr101x-0
You learn some sports stats stuff as well as SQL and R basics.

Bip Roberts
Mar 29, 2005
Pythagorean is also way more useful for baseball than other sports since things like garbage time don't really exist and it's built up from a billion games and there is decent league parity.

Chichevache
Feb 17, 2010

One of the funniest posters in GIP.

Just not intentionally.
Has anyone quantified eliteness yet? If so, is Joe Flacco?

Forever_Peace
May 7, 2007

Shoe do do do do do do do
Shoe do do do do do do yeah
Shoe do do do do do do do
Shoe do do do do do do yeah
This thread is relevant to my interests and I may or may not write something in the coming weeks/months. I've done some stuff for the fantasy football thread but now that I have nfldb setup for python queries I might as well do actual football stuff too.

Ehud
Sep 19, 2003

football.

Forever_Peace posted:

This thread is relevant to my interests and I may or may not write something in the coming weeks/months. I've done some stuff for the fantasy football thread but now that I have nfldb setup for python queries I might as well do actual football stuff too.

I was just looking at this recently. I was trying to think of a game we could play and was looking for ways to automate the statistics. This would be helpful in automating something like 1000 yards or bust.

CzarStark
Dec 23, 2007

by R. Guyovich

swickles posted:

Like I said, I have only learned about and used it in the context of medical research, so I wasn't sure if it was called that in other fields as well.

Also, based on what you said about the QB completion percentage being uniform, I wonder whether it is possible to quantify a set number of yards that are reached in a game. I don't mean an average, I mean like a number that 95 or 99 percent of all teams get in every game. For example, in basketball a team is going to score at least 80 points a huge amount of the time. As a result, you can get the "looter in a riot" phenomenon, where an average player can put up 20 points a game, simply because points basically have to be scored in the game. Like even Ryan Lindley could put up 250 yards in a game. I just wonder if you can remove the mathematical outliers (like that Dolphins Steelers game in 2007 that was played in a monsoon with turf thrown down 20 minutes before the game). I think you can see a "bare minimum" number of yards that would be earned in almost every game. I think with that number you might be able to make some judgements about replacement QBs being good, average, or the looter in a riot. If it really was a looter in a riot, you could sign a guy off the street to do it and save some cap money.

Unfortunately I haven't found a good repository for individual games yet, so I can't say anything about that. Looking only at full-season per-game stats though you would be REALLY surprised at how bad the bottom half of QBs in the league are:

Those QBs at the bottom aren't even outliers in terms of number of games played. 2012 Brady Quinn threw for 114 ypg over 10 games, 2012 CKaep threw for 140 ypg over 13 games (half of which are in SF, so you can't blame weather for reducing his number of attempts), etc. I'm sure you could come up with a confidence interval, but I'm also sure it will be a number of yards so low it won't be informative or useful. The outlier at the top of the chart is 2013 Peyton btw.

Also, since attempts and yards are correlated at 0.974, these are just QBs that for some reason simply don't throw the ball much. Doing the same zero-intercept linear fit on yards vs. attempts as I mentioned in my last post gives YDS = 7.21*ATT with r = 0.9733, so again we see that, over a long enough span of time, the average pass attempt is worth about 7.2 yards with surprisingly little variation.

Before anyone asks, yes, even Alex Smith. His last 3 regular season YPA numbers are 8, 6.5, and 7. What's the average of those 3 numbers? 7.1667 yards per attempt.

I seriously love it when the math works out. :)

Ehud posted:

ESPN just introduced something called the Football Power Index. It's uh...Well here, you read about it:

Explanation on what it is:
http://espn.go.com/nfl/story/_/id/13539793/espn-nfl-football-power-index-debuts

How it was developed:
http://espn.go.com/nfl/story/_/id/13539941/how-espn-nfl-football-power-index-was-developed-implemented

And of course, the initial rankings:
http://espn.go.com/nfl/story/_/id/13550051/espn-nfl-football-power-index-rankings-august-31

Oh god. I feel like they waited until after my gritty expose was published to make this public :argh:

Chichevache posted:

Has anyone quantified eliteness yet? If so, is Joe Flacco?

I bet if I looked at Postseason FARTS he'd be off the charts because of its emphasis on Just Winning Games.

Forever_Peace posted:

This thread is relevant to my interests and I may or may not write something in the coming weeks/months. I've done some stuff for the fantasy football thread but now that I have nfldb setup for python queries I might as well do actual football stuff too.

OK, this looks amazing and exactly what I was looking for. Thanks for sharing this, I have a few ideas that this will make much easier.

Alaois
Feb 7, 2012

Chichevache posted:

Has anyone quantified eliteness yet? If so, is Joe Flacco?

no, but yes

seiferguy
Jun 9, 2005

FLAWED
INTUITION



Toilet Rascal

WugLyfe posted:

How many of you know R and/or chart things with it?

I've used R a couple times in my life and every time it's been to create Chernoff faces based on sabermetrics.

Blotto Skorzany
Nov 7, 2008

He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip
R is wonky but fine, Python with scipy/numpy are probably great, Haskell has a higher learning curve and a weird community but is great once you're dialed in

Forever_Peace
May 7, 2007

Shoe do do do do do do do
Shoe do do do do do do yeah
Shoe do do do do do do do
Shoe do do do do do do yeah

Ehud posted:

I was just looking at this recently. I was trying to think of a game we could play and was looking for ways to automate the statistics. This would be helpful in automating something like 1000 yards or bust.

Drop a line if you want a hand with anything.

SurgicalOntologist (another psych goon) is doing some pretty great stuff with it over in the fantasy football threads too. He's currently working on an add-on to nfldb that scrapes weekly player projections and archives them (I just asked him to scrape vegas lines for something I am interested in pursuing). He is a better programmer than I am but I like getting the practice.

Also pretty handy with R. SAS can eat a bag of dicks.

edit: here is my recommended stats program loadout from the psych thread.

Forever_Peace
May 7, 2007

Shoe do do do do do do do
Shoe do do do do do do yeah
Shoe do do do do do do do
Shoe do do do do do do yeah
First project idea: abuse kernel density estimation to make up some shady bullshit, er, settle once and for all whether Flacco is an elite quarterback (by auto-discretizing quarterback tiers).
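One way the tier idea could work, sketched with nothing but the standard library: estimate a smooth density over a single QB metric with Gaussian kernels, then treat the dips (local minima) in that density as tier boundaries. The function names, the bandwidth, and the yards-per-attempt values are all assumptions invented for this example; it is not the method Forever_Peace actually used, and the real answer depends heavily on the bandwidth you pick.

```python
from math import exp, pi, sqrt


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of `samples` with Gaussian kernels."""
    if not samples or bandwidth <= 0:
        raise ValueError("need samples and a positive bandwidth")
    scale = 1.0 / (len(samples) * bandwidth * sqrt(2 * pi))

    def density(x):
        return scale * sum(exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def tier_boundaries(samples, bandwidth, steps=200):
    """Grid points where the estimated density has a strict local minimum."""
    density = gaussian_kde(samples, bandwidth)
    low, high = min(samples), max(samples)
    grid = [low + (high - low) * i / steps for i in range(steps + 1)]
    heights = [density(x) for x in grid]
    return [
        grid[i]
        for i in range(1, steps)
        if heights[i] < heights[i - 1] and heights[i] < heights[i + 1]
    ]


# Hypothetical yards-per-attempt values forming two obvious clumps.
ypa = [5.8, 5.9, 6.0, 6.1, 6.2, 7.8, 7.9, 8.0, 8.1, 8.2]
boundaries = tier_boundaries(ypa, bandwidth=0.2)
print(boundaries)
```

A wider bandwidth smooths the dips away and leaves one big tier; a narrower one splits every QB into his own tier. That is exactly the knob you would abuse to get the Flacco answer you want.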

axeil
Feb 14, 2006

Forever_Peace posted:

Drop a line if you want a hand with anything.

SurgicalOntologist (another psych goon) is doing some pretty great stuff with it over in the fantasy football threads too. He's currently working on an add-on to nfldb that scrapes weekly player projections and archives them (I just asked him to scrape vegas lines for something I am interested in pursuing). He is a better programmer than I am but I like getting the practice.

Also pretty handy with R. SAS can eat a bag of dicks.

edit: here is my recommended stats program loadout from the psych thread.

What do you have against SAS? :smith:

GonadTheBallbarian
Jul 23, 2007


seiferguy posted:

I've used R a couple times in my life and every time it's been to create Chernoff faces based on sabermetrics.

This should happen with QBR somehow

Forever_Peace
May 7, 2007

Shoe do do do do do do do
Shoe do do do do do do yeah
Shoe do do do do do do do
Shoe do do do do do do yeah

axeil posted:

What do you have against SAS? :smith:

1) They charge like ten grand per year for a single user license, and server licenses run into the hundreds of thousands. To protect their ability to charge that much, they sued a competing alternative (WPS) for duplicating their FUNCTIONALITY despite it having no access to the source code. (they lost because lol)

Was gonna make a list but then I decided that was reason enough.

axeil
Feb 14, 2006

Forever_Peace posted:

1) They charge like ten grand per year for a single user license, and server licenses run into the hundreds of thousands. To protect their ability to charge that much, they sued a competing alternative (WPS) for duplicating their FUNCTIONALITY despite it having no access to the source code. (they lost because lol)

Was gonna make a list but then I decided that was reason enough.

Eh. It's great for huge (1MM+ observations) datasets, which is what I typically work with so I like it. I don't give a poo poo about open source, I just want software that works.


GonadTheBallbarian
Jul 23, 2007


axeil posted:

...I just want software that works.

Yeah, that's my primary mover too. I went to R to make heatmaps for work, but tbh I'd like to get better/learn more from someone who doesn't just post a huge wall of text on a 1990s website.
