The danger of small patterns

As I’ve probably said before, I work as a researcher. When you’re doing difficult or expensive research, you don’t usually have the time or money to do a whole lot of replications. That goes doubly if you’re working with patients or patient samples. But since science is all about finding patterns, how can you find patterns in a small dataset?

There are statistical tools that can help with this, but even before you get to the hypothesis testing phase, you need to know which direction your hypothesis will go in. For that, we tend to look at the small patterns which aren’t yet statistically significant and try to see what they mean. The danger comes when new data doesn’t arrive in a reasonable amount of time: you want to work on your project, but you have nothing new to work on. So you go back to whatever you have, the “small patterns,” and start extrapolating from there. “If this pattern holds, what could it mean for this disease?”

Then you can start getting attached to a hypothesis that has no data to back it. When you do get data, you may start to interpret it in light of the small pattern you already detected, a pattern which may not even hold. That’s the problem with small patterns: you get to thinking they mean more than they do.

The human brain is a pattern-matching machine. Our first calendars came about from noticing that the seasons of a year came in patterns, and that certain stars in the sky could be seen during the hot season while others could be seen during the colder one. But people also thought they detected patterns about how certain things happened on earth when certain stars were seen in the sky. The first pattern, between the stars and the seasons, held true: there is a correlation between which stars you can see and the season in your local area. But the second pattern, between the stars and events on earth, was false. Yet both patterns were studied and believed for thousands of years.

I hope I don’t get attached to bad patterns for quite so long as that, but it’s hard to avoid. When you’ve got all the time in the world and not enough data, you get attached to these small patterns that you think you detect. And that attachment can persist even after new data shows the pattern isn’t real.
