Death to Spreadsheets

While the debate rages about a reproducibility crisis in academia, there’s a much more common one right under our noses

Keith McNulty

There was a time when the only way I knew how to do analysis was to use a spreadsheet. I spent over 15 years building highly complex models and analytics on those screens with the rectangular boxes that so many have come to rely on.

Then, in early 2016, it all changed for me. The fuss had started about R and Python, and I asked a few people who were ‘in the know’ which software they believed to be the best for conducting the widest range of analytics. The unanimous answer I received pointed to R (probably because of the people I spoke to). So I decided to learn this thing and find out whether all the hype was worth it.

Six months later, after a lot of late nights and weekends, and after taking every chance I could find to solve work-related problems using R, I could not bear the idea of working in a spreadsheet again. I’m serious. Today it is a chore to open one, and the only reason I do so is that I communicate with some people who are in the same position as I was 7 years ago.

I look at functions like VLOOKUP and I compare them to dplyr::left_join(). It’s like I am standing on a street littered with trash, and on one side is an unfortunate man who has a small litter picker and is picking the pieces up one by one, and on the other side is someone carrying the biggest, most powerful litter vacuum you’ve ever seen.
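
To make the comparison concrete, here is a minimal sketch of the kind of lookup that VLOOKUP handles one column and one formula at a time, written as a single dplyr::left_join() call. The two tables, their column names and their values are invented purely for illustration; they do not come from any real dataset.

```r
library(dplyr)

# Hypothetical example data: every name and value here is made up.
employees <- data.frame(
  employee_id = c(1, 2, 3, 4),
  name        = c("Ana", "Ben", "Caro", "Dev"),
  dept_id     = c(10, 20, 10, 30)
)

departments <- data.frame(
  dept_id   = c(10, 20),
  dept_name = c("Finance", "Engineering"),
  location  = c("London", "Dublin")
)

# One call matches every employee row to its department on dept_id and
# brings across all department columns at once. Rows with no match
# (dept_id 30 here) are kept, with NA in the joined columns.
joined <- left_join(employees, departments, by = "dept_id")

print(joined)
```

In a spreadsheet, the same result would typically need a separate lookup formula for each column being pulled in, copied down every row, whereas here the join is a single readable line that can be rerun on new data.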

So I believe spreadsheets are not good for us. They trap us in this narrow set of views and options, use up all our computer’s memory on their pretty look and feel, decide to sulk and storm off to the other room when there’s too much data, and all because (like our kids with Snapchat) we have become addicted to instant visual-analytic gratification.

There’s another reason spreadsheets have caused harm over the years — they have created a corporate reproducibility crisis.

What is a reproducibility crisis?

An important part of the scientific method is the requirement to reproduce results in order to validate them. Most academics agree that, at least…

Keith McNulty

Pure and Applied Mathematician. LinkedIn Top Voice in Tech. Expert and Author in Data Science and Statistics. Find me on LinkedIn, Twitter or keithmcnulty.org