In praise of the spreadsheet

By devin, 28 October, 2015

Spreadsheets are amazing. They are arguably a Turing-complete programming language (only stretching the definition slightly). There are definitely as many spreadsheets as people in the world, and I'd even guess there are over a trillion actively used spreadsheets out there somewhere.

IT people don't respect spreadsheets that much. There's good reason for this. With default settings, they aren't version controlled, they easily conceal drastic errors, and they allow any kind of input so it's hard to reason about them with a programmer's mind. In short, they aren't a database.

Often people migrate to Microsoft Access to try to establish control over mission-critical spreadsheets. The other alternative is implementing some other database application, or buying a solution that does what you're currently doing with a spreadsheet "better". I experimented at my last job with trying to create database applications to replace Access databases.

The problem, again and again, is that when you move to a database application from a spreadsheet, you lose flexibility. Sometimes this isn't a problem: maybe you were losing data because people kept entering "07/09/2015" instead of the properly formatted "09/07/2015". Maybe you don't want flexibility... right now.

But the reality of business (at least small business) is that you need that flexibility.

  1. You'll never be able to describe exactly what you want from your database application
  2. Your IT people will never be able to describe exactly what others want from the database application they're creating
  3. Your marketing/business people will never be able to describe exactly whaht they want from a database application
  4. Worst of all, none of you will ever anticipate the big change to the structure of the data that's coming in 3 months!

The problem with database applications is that they specify a rigid schema AND a rigid UI. Ruby on Rails and Django are easy examples of this; to update the database model, realistically, you need to update HTML views/controllers too (not to mention tests and documentation). 

I see this all the time. Full of hubris, people set out to design the perfect database application. "You won't need your spreadsheet after I'm done!", they say to the marketing people. The marketing people look helplessly on as they're told their spreadsheet is inferior to the new, perfectly designed product. 

This is an old story, but things haven't changed. People still make database apps, and IT people in particular still see them as superior to spreadsheets.

But spreadsheets are flexible!

If you need to change your data model, Excel can do that instantly, and you'll never need to touch your views or controllers because Excel is by default flexible enough to handle the changes. Sure, it's not version controlled but you can work around that with Track Changes, backups, and scripts that export data to a csv.

Further, I would argue that for many spreadsheets if you just let the marketing guy be the programmer, you don't need tests. Sure, the marketing guy might break his spreadsheet. But that might happen to your database application with the same frequency. The marketing guy can have the whole spreadsheet in his head, whereas a database application might have 1 person who is forced to use it instead of a spreadsheet, 3 people involved in specifying it, 3 in design, and 5 more in development. When you have 12 people with input into a database application's design and implementation, it's definitely going to be complex enough to need tests. The marketing guy can collapse that whole pipeline into one brain, drastically simplifying the whole application into... a spreadsheet.

OK, maybe spreadsheets aren't perfect

I think the arguments against spreadsheets are actually fairly valid in a lot of cases. Important data shouldn't remain in a spreadsheet forever. But I think it can be very tempting, specifically for bored IT people, to want to convert them all to databases. This will improve the overall code and data integrity, I don't doubt it. But I think it's important to acknowledge and appreciate the value of the flexibility that is lost.

I think the real solution should be to create database applications that allow the marketing guy the freedom to be flexible. That may mean making bad design decisions, or opening up your data model. It may mean abandoning treasured programming design principles. In the end, it's about making the database application serve the user, rather than forcing the user to serve the database application.

That's an old refrain, and I want no misunderstandings. I'm not talking about continunig to work on the UI of a database application until everything is possible. What I'm really talking about is changing the UI to allow combinations the programmer never thought of.

Programming languages do this - the designer of C had no idea at all of what would be possible with C, he just exposed the power of the computer. A typical database application does not. Take ruby on rails for example. You have to explicitly add each database table to the app, and by default there will be 7 whitelisted controller actions, corresponding to Create, Read, Update, and Delete (CRUD). Rails is amazing because it aims to reduce boilerplate code, but the downfall is that any action not corresponding to one of the CRUD actions involves lots of code. That means it's work to add it, which is a disincentive for programmers to add it.

Why not add an Excel-like interface to your next database app? You could create a workspace view that exposes all the tables in the app as different sheets, with extra sheets for common joins. Then you can have a "new sheet" option which lets the marketing guy access, do calculations on, and combine all of the data that's there. He's a genius! He's a brilliant human who is capable of understanding Excel. Give that power to him, rather than belittling him and giving him a user manual for the UI you made up.

Note: this post was inspired by cubesviewer, an amazing github project that lets users combine and recombine OLAP cube data.

Plain text

  • No HTML tags allowed.
  • Web page addresses and email addresses turn into links automatically.
  • Lines and paragraphs break automatically.