Sandcastles

I am not going to talk about literal sandcastles; I will not speak wistfully of salty breezes, gritty sand, the perfect mix of wet and dry required to drizzle sand into funky pillars. Nor of the inevitable tide that washes it away.

I’m going to talk about programming.

Software development is a curious thing; I sometimes say that it is “fractalized logic” but unless you’re familiar with both fractals and programming you probably don’t know what that means. In simpler but less pithy terms it means that we repeat the same logical patterns at different “scales.” Think about how you learned to add two large numbers by hand: you wrote them down, and you added the ones’ place numbers. If the result is 10 or more, carry a 1 to the next column, the tens’ place numbers. Add those, carry the result. Repeat until all the digits are done. With programming, we wrap this all up into a bundle; we write code to do this kind of thing once and then we make it possible to reference it elsewhere in our code by just saying “add these two large numbers together.”1
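For readers who want to see the bundle itself, here is a minimal sketch in Python of that column-by-column addition. The function name `add_decimal` and the choice to hold numbers as strings of digits are my assumptions for illustration; real languages do this work with built-in integer types.

```python
def add_decimal(a: str, b: str) -> str:
    """Add two non-negative integers written as decimal digit strings.

    Hypothetical helper for illustration: it works right to left,
    one column at a time, exactly like addition on paper.
    """
    result = []
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    while i >= 0 or j >= 0 or carry:
        digit_a = int(a[i]) if i >= 0 else 0   # treat a missing column as 0
        digit_b = int(b[j]) if j >= 0 else 0
        total = digit_a + digit_b + carry
        result.append(str(total % 10))         # the digit written in this column
        carry = total // 10                    # the 1 carried to the next column
        i -= 1
        j -= 1
    return "".join(reversed(result)) or "0"


print(add_decimal("999", "1"))  # prints 1000
```

Once this exists, the rest of the program can simply say `add_decimal(x, y)` and stop thinking about columns and carries.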

Next, we build something that multiplies two large numbers together. This is more complicated, because we have to multiply each of the digits in one number with each of the digits in the other number. You probably learned to do it as I did, where I wrote the two numbers, one on top of the other, and then multiplied each of the top number’s digits with a single digit of the bottom number, writing each result underneath, and then—finally—adding all those intermediate results together to get a final answer. If we are writing software to do this kind of thing, we have something to help us: we already made something that does the adding part, so we can just use that.
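As a rough sketch of that layering, the Python below builds long multiplication on top of an addition routine. All three function names are hypothetical, and numbers are assumed to be non-negative and written as strings of decimal digits. The point is only that `multiply_decimal` never adds columns itself; it hands that job to `add_decimal`.

```python
def add_decimal(a: str, b: str) -> str:
    """Column-by-column addition of two decimal digit strings (the bottom layer)."""
    result, carry = [], 0
    i, j = len(a) - 1, len(b) - 1
    while i >= 0 or j >= 0 or carry:
        total = (int(a[i]) if i >= 0 else 0) + (int(b[j]) if j >= 0 else 0) + carry
        result.append(str(total % 10))
        carry = total // 10
        i -= 1
        j -= 1
    return "".join(reversed(result)) or "0"


def multiply_by_digit(a: str, d: int) -> str:
    """Multiply a digit string by a single digit 0-9: one row of the paper method."""
    result, carry = [], 0
    for ch in reversed(a):
        total = int(ch) * d + carry
        result.append(str(total % 10))
        carry = total // 10
    if carry:
        result.append(str(carry))
    return "".join(reversed(result))


def multiply_decimal(a: str, b: str) -> str:
    """Long multiplication: one shifted row per digit of b, summed with add_decimal."""
    total = "0"
    for shift, ch in enumerate(reversed(b)):
        partial = multiply_by_digit(a, int(ch)) + "0" * shift  # shift the row left
        total = add_decimal(total, partial)                    # reuse the adding part
    return total.lstrip("0") or "0"


print(multiply_decimal("12", "34"))  # prints 408
```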

Having built addition2 and multiplication, we move on to division. From that we can tackle even more complicated mathematical processes like logarithms and trigonometric functions.3 With these tools we can stack up even more complicated things like automated root finding, statistical analysis, and so on. A teetering tower of mathematical terror.

At the bottom of the stack is the humble addition.

I describe this so that I can put forth a hypothetical situation.4 Suppose there is a mistake in the addition; perhaps when you add two numbers that both end in three zeroes, the result is off by one. In human terms this doesn’t seem huge; it’s a mistake, to be sure, but it occurs only once in every million pairs of numbers, and the answer is off by no more than one one-thousandth of the total. And if all we ever used our addition for was, well, basic addition, maybe it indeed would not be a huge problem.

But that humble addition was used as the foundation for an entire elaborate structure of mathematical tools. Mistakes in addition could ripple up into mistakes in multiplication, division, logarithms, and everything else. Each layer assumes the layer underneath it works properly.

This is what programming is like.
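To make the ripple concrete, here is a small sketch in Python. Everything in it is an assumption for illustration: `buggy_add` simulates the hypothetical off-by-one mistake, and `multiply` and `power` are deliberately naive (repeated addition and repeated multiplication) rather than the paper methods, so that each layer visibly rests on the one beneath it.

```python
def ends_in_three_zeroes(n: int) -> bool:
    """True for 1000, 2000, 15000, ... (zero itself is not counted)."""
    return n != 0 and n % 1000 == 0


def correct_add(a: int, b: int) -> int:
    return a + b


def buggy_add(a: int, b: int) -> int:
    """Simulated mistake: off by one when both inputs end in three zeroes."""
    result = a + b
    if ends_in_three_zeroes(a) and ends_in_three_zeroes(b):
        result += 1
    return result


def multiply(a: int, b: int, add) -> int:
    """Multiplication as repeated addition, trusting whatever `add` it is given."""
    total = 0
    for _ in range(b):
        total = add(total, a)
    return total


def power(base: int, exponent: int, add) -> int:
    """Exponentiation as repeated multiplication, one more layer up."""
    result = 1
    for _ in range(exponent):
        result = multiply(result, base, add)
    return result


print(multiply(1000, 3, correct_add))  # prints 3000
print(multiply(1000, 3, buggy_add))    # prints 3001
print(power(1000, 3, correct_add))     # prints 1000000000
print(power(1000, 3, buggy_add))       # prints 1000001000
```

The addition is wrong by one, the multiplication inherits that error, and by the time it reaches `power` the result is off by a thousand. Neither `multiply` nor `power` contains a mistake of its own.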

Sandcastles Everywhere

A non-trivial fraction of the world uses the web every day.5 The most obvious technologies that make up web sites are HTML, CSS, and JavaScript. But these are actually the products of other programs.6 These programs run on other computers, and might be written in PHP, Ruby, Python, C#, or Java.7 These programming languages are themselves built using another programming language (usually C or C++). The web pages are delivered to a web browser on the recipient’s computer (another program, typically written in C++) through the HTTP protocol (a strict set of rules for how to ask for things and process the answers). HTTP is done through a TCP connection, which relies on IP data packets being delivered from the user’s computer to the host server; these rely on a working internet (BGP, route announcements, and autonomous networks, oh my!) which presumes some fraction of working data lines between its various nodes (fiber, fiber, and more fiber), all of which depend on a functioning electrical power grid and a base level of manufacturing capability.

And you thought it was just a web page.
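To give a feel for just one of those layers, here is a hedged sketch in Python of the "strict set of rules" part of HTTP: how a simple request is worded and how the first line of an answer is read. The function names are hypothetical, `example.com` is a placeholder host, and nothing is sent over a network; a real browser handles far more than this (headers, redirects, encryption, and so on).

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Word a minimal HTTP/1.1 GET request the way the protocol expects."""
    lines = [
        f"GET {path} HTTP/1.1",   # what we are asking for
        f"Host: {host}",          # which site we are asking
        "Connection: close",      # ask the server to hang up when done
        "",                       # a blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes):
    """Read the first line of a response: protocol version, status code, reason."""
    first_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = first_line.split(" ", 2)
    return version, int(code), reason


request = build_get_request("example.com")
print(request.decode("ascii"))

# A canned response, standing in for what a server might send back.
canned = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(parse_status_line(canned))  # prints ('HTTP/1.1', 200, 'OK')
```

These bytes would then be handed to a TCP connection, which is the next layer down in the stack described above.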

All of our wonderful technology is inherently interdependent and there’s no way we can eliminate that. In fact, we shouldn’t; we don’t need to be “reinventing the wheel” whenever we build a web site,8 because that’s just a foolish waste of time. More to the point, since we’re human, every time we build something, we run the risk of making mistakes. There are huge benefits to re-using something already built, tested, and proven to work.

But there are risks, too. As a platform architect for quite a few different projects, I’ve had to make this judgment call many times. Do the benefits of re-using this code built by somebody else outweigh the risks? I consider the following factors:9

  1. How long would it take me to build this code myself, compared to finding, evaluating, and integrating some other block of code into my project?
  2. If I build it myself, how much time do I expect to spend fixing problems that are found later?
  3. If I choose to use someone else’s code, do I have a reasonable expectation that they will fix problems? (If not, answer #2!)
  4. Is there an “industry-standard” block of code that lots of other people use? (If so, it means when I need help, other people might already be familiar with the code and can help me; otherwise they have to look at all of my code, which makes getting or hiring help harder.)
  5. If I commit to using someone else’s code, how difficult will it be for me to switch to something different later? (This is especially important if I can’t directly use the code, but have to connect through a service that hides the code from me. Services are run by businesses, and businesses fail; if that happens, I don’t have the code, and all I can do is switch to a different service.)

All of these factors are weighted. I wish I could say this is a mathematical process, but it isn’t. Despite all the software development that has happened over numerous decades, there aren’t enough data points to reliably weight each of these factors correctly for each specific combination of developer and candidate code block.

All of which goes to show that software development is hard, and not just because writing code is hard10 but also because choosing which code needs to be written is hard. Sometimes you bet and lose, such as deciding “We’re going to build a Twitter app!” only to have Twitter change the rules after the fact and cap you at 100,000 users, thus permanently limiting your maximum revenue to unsustainable levels. Sometimes you bet and win, such as deciding years ago that Amazon Web Services were here to stay, and reaping benefits years ahead of competitors who stuck it out with dedicated server farms.

Yeah, this is hard stuff. And thousands and thousands of people like me make those decisions every day, to keep building more stuff. We build sandcastles, on shifting foundations. If we do it right, you don’t see the sand, just the castle.

1 Even some of the earliest computers I’ve written code for—such as the 6502 CPU—could add two decimal digits in a single step. We poor humans can rarely pull that off. Today’s computers can do about 19 digits in a single step. Progress!

2 And, for the sake of argument, subtraction; it’s almost identical to addition, as far as math for computers is concerned.

3 These aren’t actually computed exactly; instead they are calculated much as you would do by hand, using an approximation technique carried out to enough digits that any further work would yield more precision than the computer will actually use.

4 This specific case is exceptionally, exceptionally unlikely, because addition is extremely well understood. But the analogy helps explain what I’m talking about.

5 Currently-promulgated numbers suggest about 40% of the world’s seven billion—so about three billion—use the internet every day. While this isn’t strictly equivalent to web usage, the lines between “web” and “non-web” internet use have become substantially blurred over the last few years as mobile apps have exploded in popularity. Internet usage sources: 1 2

6 Ignoring for the moment things like EmberJS or AngularJS, which use a different model but don’t invalidate my core argument in any way.

7 This is not an exhaustive list. There are lots of choices for this kind of stuff, and most of the time, users of a web site will never notice which one is in use.

8 Or any other kind of programming project.

9 This is not an exhaustive list. I’m trying to stick to the point.

10 Look at that list of things involved with showing a web page to the user; I left out the things the server code has to deal with, like persistent data storage, cache management, web server software itself, host operating systems, server CPU variations, and so on that you do not want to think about because it will make you despair of ever understanding how computers were ever made to work.

Photo Credits: hand in the sand: Robert Fischer, Damien Jones