I don’t often participate in the UK’s national lottery. (I mean the gambling one, not random acts of government.) To win the jackpot, you have to match the six winning numbers, drawn from 1 to 49. Since every six-number combination is equally probable, you could pick any of them and still have as good a chance of winning as anyone else (i.e. virtually zero).
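Those odds are easy to check: choosing 6 numbers out of 49, where the order doesn’t matter, gives the binomial coefficient C(49, 6):

```python
from math import comb

# Number of ways to choose 6 numbers from 49 (order irrelevant)
combinations = comb(49, 6)
print(combinations)  # 13983816 -- about 1 in 14 million per ticket
```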
But my theory is that if you are going to win, it’s better to win with a combination that nobody else picked, so that the jackpot isn’t split and is yours alone. Humans are very bad at choosing random numbers (if the sequence 1, 2, 3, 4, 5, 6 ever comes up, the prize will have to be split among thousands of winners), so the best strategy is to pick genuinely at random. But since I am also likely to be very bad at choosing random numbers, I’d let the Lottery computer do it, by clicking their ‘Lucky Dip’ button.
I have no idea how their computer generates its randomness: whether it’s physical or algorithmic. It’s quite simple to design a hardware random source for a computer, based on electrical or thermal noise, or even radioactive decay. Some PC processors now have a genuine hardware random-number generator built in (Intel calls the instruction RDRAND). On Linux, the kernel mixes sources like this into an entropy pool, which you can read as bytes from /dev/random.
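A home-made ‘Lucky Dip’ is only a few lines in Python, provided we draw from the operating system’s entropy pool (on Linux, the same kernel source that feeds /dev/random) rather than from the default pseudo-random generator. This is my own sketch, not how the Lottery actually does it:

```python
import random

# SystemRandom draws from os.urandom(), i.e. the kernel's entropy
# pool, rather than Python's deterministic default generator.
_rng = random.SystemRandom()

def lucky_dip():
    """Pick six distinct numbers from 1..49, sorted for readability."""
    return sorted(_rng.sample(range(1, 50), 6))

print(lucky_dip())
```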
But pseudo-random is OK too. That’s when an algorithm is used to generate a sequence that appears to be random. Obviously it’s not, because it comes from a deterministic formula, but a good formula generates a sequence that passes all the statistical tests for randomness.
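A minimal sketch of such a formula is the classic linear congruential generator (the constants below are the widely used ‘Numerical Recipes’ ones). It is purely deterministic: feed it the same seed and you get exactly the same ‘random’ sequence every time.

```python
from itertools import islice

def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Linear congruential generator: x -> (a*x + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

# Same seed, same sequence -- pseudo-random, not random.
print(list(islice(lcg(1), 3)))
```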
That’s the bit that humans can’t do. When we try, we always get the distribution wrong, and a statistical analysis will reveal that the data has been made up. This has been used to detect various frauds, from the financial to the scientific. In the former, for example, when someone invents fictitious transactions for some purpose, the digits of the amounts always follow an unconscious pattern that genuine figures don’t. (Hint: use a computer to make up your numbers.)
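The best-known such test for financial figures is Benford’s law: in genuine accounting data the leading digit d turns up with probability log10(1 + 1/d), so amounts starting with a 1 make up about 30% of the total and those starting with a 9 under 5%. Invented amounts tend towards uniform leading digits and fail the comparison at once. A sketch of the check (the sample data is hypothetical):

```python
import math
from collections import Counter

# Expected leading-digit frequencies under Benford's law
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(n):
    """First decimal digit of a positive integer."""
    while n >= 10:
        n //= 10
    return n

def digit_frequencies(amounts):
    """Observed leading-digit frequencies, to compare against benford."""
    counts = Counter(leading_digit(a) for a in amounts)
    total = len(amounts)
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

print(round(benford[1], 3))  # 0.301 -- a 1 leads ~30% of the time
```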
One well-known scientific fraud was committed by Cyril Burt, who claimed to have studied separated twins to decide how much of ‘intelligence’ was genetic. His data showed that the answer was “most of it”, which suited Burt’s political views: he believed that social class was a natural outcome of genetic superiority. But his published data failed the statistical tests, and to this day nobody knows if he ever did any real research at all. He is even suspected of fabricating imaginary research assistants.
Perhaps an even more important scientific con to be detected from the statistics of the results was the work of Gregor Mendel. Mendel was a priest and physics teacher at St. Thomas’s Abbey in Brno, but his fame rests on a series of plant-breeding experiments in the 1850s and ’60s, from which he deduced the basic laws of genetics. Even though it was to be another hundred years before Crick and Watson worked out the structure of DNA and the mechanism of inheritance (by quite literally taking Rosalind Franklin’s unpublished X-ray data; clearly, genetics is a dirty business), Mendel got the principles absolutely right.
It was his experimental data that later studies identified as dodgy. There were two issues with it. First, unlike the made-up accounts or the twin-study data, the problem was that his results were too clean: the counts sat far closer to the predicted 3:1 ratios than genuinely random breeding experiments would ever produce (this is essentially R. A. Fisher’s famous critique). And second, the particular characteristics of his plants that he had chosen ‘randomly’ to breed and cross-breed just ‘happened’ to be the ones determined by only a single gene each, which meant that their presence or absence was always easily detectable. It looks as if he knew the answer he was going to get before he actually did any experiments.
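The ‘too clean’ complaint is easy to see in a simulation. For a trait with an expected 3:1 dominant-to-recessive ratio, counting dominants among, say, 1,000 offspring is a binomial experiment with p = 0.75: the expected count is 750, but honest runs scatter around it with a standard deviation of sqrt(1000 × 0.75 × 0.25) ≈ 13.7. Reported counts that repeatedly land within a plant or two of the exact ratio are the statistical red flag. (The numbers here are illustrative, not Mendel’s actual counts.)

```python
import random
import statistics

random.seed(0)  # reproducible illustration

N, P = 1000, 0.75  # offspring per experiment, P(dominant phenotype)

# Simulate 200 honest experiments, counting dominant-phenotype plants
counts = [sum(random.random() < P for _ in range(N)) for _ in range(200)]

# Honest data scatters with a spread near sqrt(N*P*(1-P)) ~= 13.7;
# results hugging exactly 750 every time would show far less.
print(round(statistics.mean(counts)), round(statistics.pstdev(counts), 1))
```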
There are still people who want to give Mendel the benefit of the doubt. After all, he was a priest (and they don’t lie, obviously), and he certainly was a significant and original scientist whose work forms the whole basis of today’s genetics. And though it is very improbable that honest experiments would produce data as suspiciously tidy as his, it is still possible. And we might win the next Lottery jackpot.