How To Uncover Something Beautiful That Explains Everything: A Research Process Primer

I think we’ve all gotten a kick out of those memes that show several pictures of a given occupation, with captions like, “What my mom thinks I do,” and “What my friends think I do,” and then finally, “What I actually do.”

Of course there’s one for research, which I feel obliged to post here:


A lot of you might be interested in some of the topics that I’ve brought up, or that you’ve been seeing on television, or maybe that you’ve just had rolling around in your brain these last forty years. Maybe you’re thinking about retiring and becoming a researcher (that’s sort of what I did, from one perspective), but aren’t really sure what’s inside the black box of research. I thought I would give you a quick rundown of the process, so you can quickly begin your illustrious new career as a data analyst genius.

First, researchers can come at analysis a couple of different ways:

  1. You can prove something you already think is true (I think we all know that guy).
  2. You can try to better understand how a phenomenon you don’t really understand (your coworker’s absence from her work station) is related to characteristics you can readily observe and collect data on (height, weight, milliliters of coffee she drinks per hour, number of times she uses the bathroom per hour).

The second is a better approach, imho. You get to start with a question that interests you about some phenomenon. Then you construct your basic model that explains it. In the example mischievously given above, let’s say that, even though nobody is saying you’re a gossip, you did perhaps casually mention to your boss that your coworker is often not in her cubie. Let’s also say that your boss shrugs and says, “Probably because she drinks so much coffee!”

There you have it. Your golden relationship. (!!) Now you can retire and join the world of research! (I jest)

So maybe you decide:

Whether or not at desk is determined by whether or not drank coffee.

But that doesn’t really seem to work, because she is sometimes drinking coffee at her desk. So that relationship doesn’t explain much. So then you think:

Time spent away from desk is determined by whether or not she drank coffee today.

While this seems better, you also consider the fact that drinking one cup versus two cups is probably going to change the amount of time she’s away, and therefore you have to account for the actual amount of coffee she drinks (your coworker the test subject won’t mind if you measure her coffee, as long as she doesn’t know). So now you’re at:

The amount of time spent away from desk in an eight-hour shift is determined by the amount of coffee she drank during that time period.
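If you like to see ideas written down as code, here is a minimal sketch of that last relationship as a straight-line fit: minutes away from desk explained by milliliters of coffee. The numbers are invented purely for illustration, and the names `coffee_ml`, `minutes_away`, and `fit_line` are my own assumptions, not anything from a real study.

```python
# Hypothetical observations for one coworker over five shifts.
# These values are made up for illustration only.
coffee_ml = [0, 250, 500, 750, 1000]     # coffee drunk during the shift
minutes_away = [30, 40, 50, 60, 70]      # minutes away from desk in the shift


def fit_line(x, y):
    """Ordinary least squares for one predictor: y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariation of x and y, and variation of x, around their means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(coffee_ml, minutes_away)

# Predicted minutes away for a (hypothetical) 600 ml shift.
predicted_600 = intercept + slope * 600
print(f"minutes_away = {intercept:.2f} + {slope:.4f} * coffee_ml")
```

The slope is the interesting part: it says how many extra minutes away you would expect per extra milliliter of coffee, if the relationship really were a straight line.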

So now you’ve got a pretty simply structured central idea/relationship. You want to start collecting data. This is a fluid process.

First of all, when you’re installing the hidden camera inside her cubicle to monitor the time she spends away from her desk, you may notice a confidential note from her doctor referring to test results being ready for pickup. Little messages like these are crucial! Your brain lights up! Maybe, you think, she has weak bowels. That explains EVERYTHING!

So you might want to ask her about medical conditions related to bowels. Also, while collecting initial data, you’ll want to include some little extras: information that isn’t exactly in the relationship, but could also add fullness to the results. This could be demographic data like age and height, or contextual data like position in company or whether or not she has a family member also on staff. It is even better if you can ask direct questions about the phenomenon, but maybe couched in a different question, like, “Rate your overall satisfaction with your cubicle environment,” or something, rather than, “Why do you spend all day away from your desk?” From experience I’ll tell you, that question will get you nowhere good.
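To make the “little extras” concrete, here is a rough sketch of what one row of collected data might look like. The `Observation` class and every field name in it are hypothetical choices of mine; the point is only that the two core variables are required, while the extras are optional and can be filled in as you get them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """One employee's data for one eight-hour shift (hypothetical layout)."""
    # The two variables in the central relationship.
    employee_id: str
    coffee_ml: float
    minutes_away: float
    # Demographic extras.
    age: Optional[int] = None
    height_cm: Optional[float] = None
    # Contextual extras.
    position: Optional[str] = None
    family_on_staff: Optional[bool] = None
    # Indirect question: "Rate your overall satisfaction with your
    # cubicle environment" (assumed here to be a 1-5 scale).
    cubicle_satisfaction: Optional[int] = None


# An invented example row; extras not yet collected simply stay None.
example = Observation(
    employee_id="E001",
    coffee_ml=750,
    minutes_away=95,
    position="manager",
    cubicle_satisfaction=4,
)
print(example)
```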

After you’ve gotten your first data in (and in a project like this, you will actually get data for everyone in the company), you start to use fancy computer software to tell you things that are so complex on so many dimensions that your little hamster-like brain cannot compute them on its own. The man behind the curtain inside your PC, because the best software spits out Mac products, might actually say something you totally did not expect.

And that’s where the interplay between facts and creative thinking becomes so important. Your little guru in the machine will say something like,

“the more time spent at desk is associated with a higher amount of coffee.”

And you’ll think, I’m so stupid. How can I be so stupid? Why did I quit my job?

Don’t despair.

This is the moment where you will uncover something beautiful that explains everything! Don’t worry. (It may take a while.)

From here, add in other characteristics that you so wisely got data for. One by one, let the little software test the relationships, adding dimension to the overall picture. It may appear, in the end, something like this:

While people who drink tons of coffee can sit for hours on end in their cubicle, data shows that people who have responsibility over two or more other employees spend up to 60% of an eight-hour shift away from their desks.
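Here is a small sketch of what “testing the relationships one by one” could look like, assuming a plain Pearson correlation as a stand-in for whatever your fancy software actually does. The data is invented for six imaginary employees, and the names `pearson`, `candidates`, and `direct_reports` are my own assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: minutes away from desk per shift, one value per employee.
minutes_away = [40, 35, 30, 25, 290, 280]

# Hypothetical candidate characteristics, one value per employee.
candidates = {
    "coffee_ml": [200, 400, 600, 800, 300, 500],
    "direct_reports": [0, 0, 1, 0, 2, 3],
}

# Test each characteristic against time away, one at a time.
results = {name: pearson(values, minutes_away)
           for name, values in candidates.items()}

# The characteristic with the strongest association, in either direction.
strongest = max(results, key=lambda name: abs(results[name]))

for name, r in results.items():
    print(f"{name}: r = {r:+.2f}")
print("strongest association:", strongest)
```

In this made-up data, coffee comes out weakly negative and the number of direct reports strongly positive, which mirrors the story above. A correlation only checks one characteristic at a time; real software would usually fit them together, so treat this as a first look rather than the final word.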

While initially dismayed by reality, you’re nonetheless elated to be able to tell your boss that said subject’s coffee addiction has nothing to do with her absence from the desk. To boot, you can even suggest the company consider ways to improve efficiency for managers in your department. (This is all as you humbly ask for your job back at the close of your research)

For a more scientific explanation of the cognitive process of research, go ahead and take a look at the article “Research as a Cognitive Process: Implications for Data Analysis,” by MIT professor Lotte Bailyn, from 1977 (the most amazing things were born in the late 1970s).