Wednesday, July 25, 2012

Online resources for statisticians

My students often look up statistical methods on Wikipedia. Sometimes they admit this with a hint of embarrassment in their voices. They are right to be cautious when using Wikipedia (not all pages are well-written) and I'm therefore pleased when they ask me if there are other good online resources for statisticians.

I usually tell them that Wikipedia actually is very useful, especially for looking up properties of various distributions, such as density functions, moments and relationships between distributions. I wouldn't cite the Wikipedia page on, say, the beta distribution in a paper, but if I need to check what the mode of said distribution is, it is the first place that I look. While not as exhaustive as the classic Johnson & Kotz books, the Wikipedia pages on distributions tend to contain quite a bit of surprisingly accurate information. That being said, there are misprints to be found, just as with any textbook (the difference being that you can fix those misprints - I've done so myself on a few occasions).
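As an illustration of the kind of fact one might look up, here is a minimal sketch of the mode of the beta distribution. It assumes both shape parameters exceed 1, where the mode is (α − 1)/(α + β − 2). The function name `beta_mode` is my own choice for this example and does not come from any library.

```python
def beta_mode(alpha: float, beta: float) -> float:
    """Mode of a Beta(alpha, beta) distribution, valid for alpha, beta > 1."""
    # For other parameter values the density has no unique interior mode,
    # so this sketch refuses to return a value.
    if alpha <= 1 or beta <= 1:
        raise ValueError("this formula requires alpha > 1 and beta > 1")
    return (alpha - 1) / (alpha + beta - 2)


# Example: the mode of Beta(2, 5).
print(beta_mode(2, 5))
```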

Another often-linked online resource is Wolfram MathWorld. While I've used it in the past when looking up topics in mathematics, I'm more than a little reluctant to use it after I happened to stumble upon their description of significance tests:

A test for determining the probability that a given result could not have occurred by chance (its significance).

...which is a gross misinterpretation of hypothesis testing and p-values (a topic which I've treated before on this blog).
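To make the distinction concrete, here is a small sketch of what a p-value actually is: the probability, computed under the assumption that the null hypothesis is true, of a result at least as extreme as the one observed. The coin-flipping setup, the numbers and the function name `binomial_two_sided_p_value` are assumptions chosen purely for illustration.

```python
from math import comb


def binomial_two_sided_p_value(heads: int, flips: int) -> float:
    """Exact two-sided p-value for H0: the coin is fair (p = 0.5)."""
    # Probability of every possible number of heads, ASSUMING H0 is true.
    probs = [comb(flips, k) / 2**flips for k in range(flips + 1)]
    observed = probs[heads]
    # "At least as extreme" here means: no more probable under H0 than
    # the observed outcome (small tolerance guards against rounding).
    return sum(p for p in probs if p <= observed * (1 + 1e-12))


# Hypothetical data: 60 heads in 100 flips.
p_value = binomial_two_sided_p_value(60, 100)
print(f"p-value: {p_value:.4f}")
```

Every probability in this calculation is conditional on the null hypothesis. The result therefore says nothing directly about the probability that the null hypothesis is true, or that the result "could not have occurred by chance".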

The one resource that I really recommend though is Cross Validated, a question-and-answer site for all things statistics. There are some real gems among the best questions and answers that make worthwhile reading for any statistician. It is also the place to go if you have a statistics question that you are unable to find the answer to, regardless of whether it's about how to use the t-test or about the finer aspects of Le Cam theory. I strongly encourage all statisticians to add a visit to Cross Validated to their daily schedules. Putting my time where my mouth is, I've been actively participating there myself for the last few months.

Finally, Google and Google Scholar are the statistician's best friends. They are extremely useful for finding articles, lecture notes and anything else that has ended up online. It's surprising how often the answer to a question that someone asks you is "let me google that for you".

For questions on R or more mathematical topics, Stack Overflow and the Mathematics Stack Exchange site are the equivalents of Cross Validated.

My German colleagues keep insisting that German Wikipedia is far superior when it comes to statistics. While I can read German fairly well (in a fit of mathematical pretentiousness I once forced myself to read Kolmogorov's Grundbegriffe), I still haven't mustered the courage to venture beyond the English version.