Back in the mid-1990s, I did a lot of web work for traditional
media. That often meant figuring out what the client was already doing
on the web, and how it was going, so I’d find the techies in the
company, and ask them what they were doing, and how it was going. Then
I’d tell management what I’d learned. This always struck me as a waste
of my time and their money; I was like an overpaid bike messenger,
moving information from one part of the firm to another. I didn’t
understand the job I was doing until one meeting at a magazine company.
The thing that made this meeting unusual was that one of their
programmers had been invited to attend, so management could explain
their web strategy to him. After the executives thanked me for
explaining what I’d learned from log files given me by their own
employees just days before, the programmer leaned forward and said “You
know, we have all that information downstairs, but nobody’s ever asked
us for it.”
I remember thinking “Oh, finally!” I figured the executives would be
relieved this information was in-house, delighted that their own people
were on it, maybe even mad at me for charging an exorbitant markup on
local knowledge. Then I saw the look on their faces as they considered
the programmer’s offer. The look wasn’t delight, or even relief, but
contempt. The situation suddenly came clear: I was getting paid to save
management from the distasteful act of listening to their own employees.
In the early days of print, you had to understand the tech to run the
organization. (Ben Franklin, the man who made America a media hothouse,
called himself Printer.) But in the 19th century, the printing press
became domesticated. Printers were no longer senior figures — they
became blue-collar workers. And the executive suite no longer interacted
with them much, except during contract negotiations.
This might have been nothing more than a previously hard job becoming
easier, Hallelujah. But most print companies took it further. Talking
to the people who understood the technology became demeaning, something
to be avoided. Information was to move from management to workers, not
vice-versa (a pattern that later came to other kinds of media businesses
as well). By the time the web came around and understanding the
technology mattered again, many media executives hadn’t just lost the
habit of talking with their own technically adept employees, they’d
actively suppressed it.
I’d long forgotten about that meeting and those looks of contempt (I
stopped building websites before most people started) until the launch
of Healthcare.gov.
* * *
For the first couple of weeks after the launch, I assumed any
difficulties in the Federal insurance market were caused by unexpected
early interest, and that once the initial crush ebbed, all would be
well. The sinking feeling that all would not be well started with
this disillusioning paragraph about what had happened when a staff
member at the Centers for Medicare
& Medicaid Services, the department responsible for Healthcare.gov,
warned about difficulties with the site back in March. In response, his
superiors told him…
[...] in effect, that failure was not an option,
according to people who have spoken with him. Nor was rolling out the
system in stages or on a smaller scale, as companies like Google
typically do so that problems can more easily and quietly be fixed.
Former government officials say the White House, which was calling the
shots, feared that any backtracking would further embolden Republican
critics who were trying to repeal the health care law.
The idea that “failure is not an option” is a fantasy version of how
non-engineers should motivate engineers. That sentiment was invented by a
screenwriter, riffing on an after-the-fact observation about Apollo 13;
no one said it at the time.
(If you ever say it, wash your mouth out with soap. If anyone ever says
it to you, run.) Even NASA’s vaunted moonshot, so often referred to as
the best of government innovation, tested with dozens of unmanned
missions first, several of which failed outright.
Failure is always an option. Engineers work as hard as they
do because they understand the risk of failure. And whatever it
might have meant in its screenplay version, here that sentiment means
the opposite; the unnamed executives were saying “Addressing the
possibility of failure is not an option.”
* * *
The management question, when trying anything new, is “When does
reality trump planning?” For the officials overseeing Healthcare.gov,
the preferred answer was “Never.” Every time there was a chance to
create some sort of public experimentation, or even just some clarity
about its methods and goals, the imperative was to deny the opposition
anything to criticize.
At the time, this probably seemed like a way of avoiding early
failures. But the project’s managers weren’t avoiding those failures.
They were saving them up. The actual site is worse—far worse—for not
having early and aggressive testing. Even accepting the crassest
possible political rationale for denying opponents a target, avoiding
all public review before launch has given those opponents more to
complain about than any amount of ongoing trial and error would have.
In his most recent press conference about the problems with the site, the President ruefully compared his campaigns’ use of technology with Healthcare.gov:
And I think it’s fair to say that we have a pretty good
track record of working with folks on technology and IT from our
campaign, where, both in 2008 and 2012, we did a pretty darn good job on
that. [...] If you’re doing it at the federal government level, you
know, you’re going through, you know, 40 pages of specs and this and
that and the other and there’s all kinds of law involved. And it makes
it more difficult — it’s part of the reason why chronically federal IT
programs are over budget, behind schedule.
It’s certainly true that Federal IT is chronically challenged by its
own processes. But the problem with Healthcare.gov was not timeline or
budget. The problem was that the site did not work, and the
administration decided to launch it anyway.
This is not just a hiring problem, or a procurement problem. This is a
management problem, and a cultural problem. The preferred method for
implementing large technology projects in Washington is to write the
plans up front, break them into increasingly detailed specifications,
then build what the specifications call for. It’s often called the
waterfall method, because on a timeline the project cascades from
planning, at the top left of the chart, down to implementation, on the
bottom right.
Like all organizational models, waterfall is mainly a theory of
collaboration. By putting the most serious planning at the beginning,
with subsequent work derived from the plan, the waterfall method amounts
to a pledge by all parties not to learn anything while doing the actual
work. Instead, waterfall insists that the participants will understand
best how things should work before accumulating any real-world
experience, and that planners will always know more than workers.
This is a perfect fit for a culture that communicates in the deontic
language of legislation. It is also a dreadful way to make new
technology. If there is no room for learning by doing, early mistakes
will resist correction. If the people with real technical knowledge
can’t deliver bad news up the chain, potential failures get embedded
rather than uprooted as the work goes on.
At the same press conference, the President also noted the degree to which he had been kept in the dark:
OK. On the website, I was not informed directly that the
website would not be working the way it was supposed to. Had I been
informed, I wouldn’t be going out saying “Boy, this is going to be
great.” You know, I’m accused of a lot of things, but I don’t think I’m
stupid enough to go around saying, this is going to be like shopping on
Amazon or Travelocity, a week before the website opens, if I thought
that it wasn’t going to work.
Healthcare.gov is a half-billion-dollar site that was
unable to complete even a thousand enrollments a day
at launch, and for weeks afterwards. As we now know, programmers,
stakeholders, and testers all expressed reservations about
Healthcare.gov’s ability to do what it was supposed to do. Yet no one
who understood the problems was able to tell the President. Worse, every
senior political figure—every one—who could have bridged the gap
between knowledgeable employees and the President decided not to.
And so it was that, even on launch day, the President was allowed to
make things worse for himself and his signature program by bragging
about the already-failing site and inviting people to log in and use
something that mostly wouldn’t work. Whatever happens to government
procurement or hiring (and we should all hope those things get better), a
culture that prefers deluding the boss over delivering bad news isn’t
well equipped to try new things.
* * *
With a site this complex, things were never going to work perfectly
the first day, whatever management thought they were procuring. Yet none
of the engineers with a grasp of this particular reality could
successfully convince the political appointees to adopt the obvious
response: “Since the site won’t work for everyone anyway, let’s decide
what tests to run on the initial uses we can support, and use what we
learn to improve.”
In this context, testing does not just mean “Checking to see what
works and what doesn’t.” Even the Healthcare.gov team did some testing;
it was late and desultory, but at least it was there. (The testers
recommended delaying launch until the problems were fixed. This did not
happen.) Testing means seeing what works and what doesn’t, and acting on
that knowledge, even if that means contradicting management’s deeply
held assumptions or goals. In well-run organizations, information runs
from the top down and from the bottom up.
One of the great descriptions of what real testing looks like comes from Valve Software, in a piece detailing
the making of its game Half-Life. After designing a game that was only sort of good, the team at Valve revamped its process, including constant testing:
This [testing] was also a sure way to settle any design
arguments. It became obvious that any personal opinion you had given
really didn’t mean anything, at least not until the next test. Just
because you were sure something was going to be fun didn’t make it so;
the testers could still show up and demonstrate just how wrong you
really were.
“Any personal opinion you had given really didn’t mean anything.” So
it is in the government; an insistence that something must work is
worthless if it actually doesn’t.
An effective test is an exercise in humility; it’s only useful in a
culture where desirability is not confused with likelihood. For a test
to change things, everyone has to understand that their opinion, and
their boss’s opinion, matters less than what actually works and what
doesn’t. (An organization that isn’t learning from its users has decided
it doesn’t want to learn from its users.)
Given examples of technological success from commercial firms, a
common response is that the government has special constraints, and thus
cannot develop projects piecemeal, test with citizens, or learn from
its mistakes in public. I was up at the Kennedy School a month after the
launch, talking about technical leadership and Healthcare.gov, when one
of the audience members made just this point, proposing that the
difficult launch was unavoidable, because the government simply couldn’t
have tested bits of the project over time.
That observation illustrates the gulf between planning and reality in
political circles. It is hard for policy people to imagine that
Healthcare.gov could have had a phased rollout, even while it is having one.
At launch, on October 1, only a tiny fraction of potential users
could actually try the service. They generated concrete errors. Those
errors were handed to a team whose job was to improve the site, already
public but only partially working. The resulting improvements are
incremental, and are being put in place over a period of months. That is a
phased rollout, just one conducted in the worst possible way.
The vision of “technology” as something you can buy according to a
plan, then have delivered as if it were coming off a truck, flatters and
relieves managers who have no idea of, and no interest in, how this stuff
works, but it’s also a breeding ground for disaster. The mismatch
between technical competence and executive authority is at least as bad
in government now as it was in media companies in the 1990s, but with
much more at stake.
* * *
Tom Steinberg, in his remembrance of his brilliant colleague Chris Lightfoot, said this about Lightfoot’s view of government and technology:
[W]hat he fundamentally had right was the understanding
that you could no longer run a country properly if the elites don’t
understand technology in the same way they grasp economics or ideology
or propaganda. His analysis and predictions about what would happen if
elites couldn’t learn were savage and depressingly accurate.
Now, and from now on, government will interact with its citizens via
the internet, in increasingly important ways. This is a non-partisan
issue; whichever party is in the White House will build and launch new
forms of public service online. Unfortunately for us, the last new
technology the government adopted for interacting with citizens was the
fax; our senior political figures have little habit of talking to their
own technically adept employees.
If I had to design a litmus test for whether our political class
grasps the internet, I would look for just one signal: Can anyone with
authority over a new project articulate the tradeoff between features,
quality, and time?
When a project cannot meet all three goals—a situation Healthcare.gov
was clearly in by March—something will give. If you want certain
features at a certain level of quality, you’d better be able to move the
deadline. If you want overall quality by a certain deadline, you’d
better be able to delay or drop features. And if you have a fixed
feature list and deadline, quality will suffer.
Intoning “Failure is not an option” will be at best useless, and at
worst harmful. There is no “Suddenly Go Faster” button, no way you can
throw in money or additional developers as a late-stage accelerant;
money is not directly tradable for either quality or speed, and
adding more programmers to a late project makes it later. You can slip deadlines, reduce features, or, as a last resort, just launch and see what breaks.
Denying this tradeoff doesn’t prevent it from happening. If no one
with authority over the project understands that, the tradeoff is likely
to mean sacrificing quality by default. That just happened to this
administration’s signature policy goal. It will happen again, as long as
politicians are allowed to imagine that if you just plan hard enough,
you can ignore reality. It will happen again, as long as department
heads imagine that complex technology can be procured like pencils. It
will happen again as long as management regards listening to the people
who understand the technology as a distasteful act.