The Future of Excel

Tuesday, February 12, 2013

Why do people keep using Excel to build giant, error-prone spreadsheets?

Excel is a programming language. It lets users marshal their computers to crunch data. And it does this with a versatility that's unmatched by pre-designed programs: a spreadsheet can handle everything from planning the week's groceries to managing a billion dollars in assets.

Excel is also the easiest programming language to learn. It's so easy to learn, most Excel users don't even realize that they're programming. A power user can have a spreadsheet filled with formulas like

"=IF(A1 > A2, C1, IF(A1 < A2, C2, IF(B1 > B2, C3, IF(B1 < B2, C4, C5))))"

If you ask her if she can program, though, she'll say "no, I'm just using Excel."

This ease-of-use is a double-edged sword. It lets users build more and more complex models, making it natural for small projects to grow into gigantic, mission-critical spreadsheets. One-off trading prototypes regularly turn into important backtests, and then into enormous production models that can barely run.

At each stage, it's easier to add to the spreadsheet than to start over, and taking the time to develop a proper program is an unthinkable delay. By the end of the process no one knows how the final spreadsheet works, even if it manages enough resources to bankrupt the company.

Any good developer can tell you that critical software needs to be carefully designed and tested, not hacked together. But spreadsheets are so easy to use that the people using them to write critical software aren't developers. They don't know how to engineer software.

Excel is like a user-friendly nuclear power plant. An amateur operator might be able to keep it running most of the time, and most of his mistakes won't be serious. But the wrong mistake can blow up your business.

Moving past Excel

The only way to prevent people from running important systems with giant Excel spreadsheets is to present a better alternative. You can't just tell people to stop using Excel and learn to program, because they need to get their jobs done. And you can't build closed software for them, because they'll need to do things that your software wasn't designed for.

One possible successor for Excel is Excel with VBA. Macro-recording and a wealth of online examples make VBA accessible, and because it's a traditional programming language, it complements Excel's capabilities in a few important ways.

Unfortunately, VBA lacks basic programming features, like libraries and exceptions. This means it's a pain to work with and is borderline unusable for major projects. As a result, it's only used as a last resort - when something has to be done in a spreadsheet, and is impossible to do in Excel proper.

Another potential solution is adding Python to Excel. We're working on this at DataNitro. Some of our users have taught themselves to program with our software, so we know the learning curve is gentle. They've also leveraged Python's libraries to build everything from stat-arb models to Twitter clients inside spreadsheets, so there's a significant improvement in power over Excel and VBA.

Whatever the next step is for Excel power users, it needs to make it easier than ever to build sophisticated models. Today's business need for computational complexity is only going to grow. And it's unrealistic to expect everyone to learn software engineering, or to rely on pre-built software.

One way forward is giving Excel users easy access to open-source libraries, databases, and other modern programming tools. It'll help them abstract away some of the complexity of giant models, and increase their usability. This won't prevent people from building spreadsheets that take hours to run. But it does increase the chances that something useful will happen when they do.