#11 - Upgrade the System

Upgrading software and upgrading society

Hey all,

The last 6 days have been a whirlwind - and it’s been hard to focus on software.

From the outpouring of outrage over the needless killing of George Floyd, to a video of Amy Cooper weaponizing a false plea for help against a black man just trying to watch some birds - the country reached a boiling point. With the shooting of Ahmaud Arbery fresh in our minds, we’re seeing protest marches (many peaceful and some less so) flood the nation.

Meanwhile, astronauts are being launched into space and the coronavirus still lurks - omnipresent and indifferent to human emotions.

Try to focus on software with all of that going on.

It was hard for me to pick a software topic today that wouldn’t seem callous and out of touch with the greater narrative.

Still, I wanted to write something that would be both technically and culturally relevant. I settled on a topic about performing system upgrades. We’ll talk about two - a software system upgrade and a societal upgrade.

A Software System Upgrade

Near the end of my first year at my first job, I was working on a project to help upgrade our payments system. This upgrade involved transitioning from one set of data models to another - models that were not backwards compatible.

As I dug deeper into the problem, I realized there were all sorts of complications to worry about.

  • What happens with the data that currently lives in the old system? In order for us to move to the new system, don’t we have to migrate the old data over as well?

  • What happens if something goes wrong with the new system? Can we just move back onto the old system?

I drafted answers for these problems and realized that something wasn’t quite right. I could “backfill” all the old data with a script, but what if the script ran too early? Until the “cutover” to the new system, data would continue to flow through the old system. If the script and cutover weren’t timed perfectly, there would be a gap of data in the new system.

That seemed like a rather precarious operation - timed incorrectly, we’d have a mess on our hands.

Stumped, I presented my conundrum to a senior teammate. He smiled, looked out the window, and pointed. “So, how would you replace the Bay Bridge?”

Replacing a Bridge

Small bridges can be torn down and replaced in a weekend, but there’s no way you can do that for a large bridge, such as the one that connects Oakland and San Francisco.

So what to do? Here’s a snapshot of what they actually did:

When replacing the eastern section of the Bay Bridge, engineers built a new bridge in parallel to the old one.

In the image above, you can see how at that precise point in time - there are effectively two fully functioning bridges in existence. That’s how they replaced the Bay Bridge.

Replacing Data Models

With a simple question (and a tiny bit of explanation), my senior teammate had shown me the way. It was clear what I needed to do:

  • Create a point in time where the old data models and the new data models were both fully operational. This would allow us to easily switch back and forth.

  • Only once we were confident that the new data models behaved flawlessly would we remove the old ones.

In practice, there are quite a few more phases:

  • Double Write: For a period of time, perform “double writes”. This is when you mirror traffic from the old system into the new one. This way, both systems are up to date.

  • Backfill: Run a one-time backfill script to migrate historical data into the new system. This might not be a perfect process (e.g. the data models don’t map 1-1), but it is often necessary. Beware of the load this can put on your infrastructure.

  • Shadow Read: For a period of time, perform “audit” or “shadow” reads. During this phase, you will still be servicing “read” requests from your old data models, but you will also be running a background read to make sure that the new system is behaving properly.

  • Switch the Read: Once the Shadow Read reveals no anomalous behavior, you can switch the read such that production traffic is getting serviced by the new data models.

  • Demolish the Old: Now that both read and write traffic are being serviced by the new system, it’s time to stop the double write. Once that process has ended, all the old code can be cleaned up and deleted from the codebase.
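To make the phases above a little more concrete, here is a minimal Python sketch. It is not the payments system from the story: the in-memory dictionaries, the `MigratingStore` class, the `Phase` names, and the field names (`id`, `amt`, `payment_id`, `amount_cents`) are all made up for illustration, and a real migration would involve databases, feature flags, and monitoring.

```python
from enum import Enum


class Phase(Enum):
    OLD_ONLY = 1      # before the migration starts
    DOUBLE_WRITE = 2  # writes go to both systems
    SHADOW_READ = 3   # reads served from old, compared against new
    NEW_READ = 4      # reads served from new, still double writing
    NEW_ONLY = 5      # old system demolished


def to_new_model(old_record):
    """Hypothetical mapping from the old data model to the new one."""
    return {"payment_id": old_record["id"], "amount_cents": old_record["amt"]}


class MigratingStore:
    """Toy store that walks through the migration phases (illustrative only)."""

    def __init__(self):
        self.phase = Phase.OLD_ONLY
        self.old = {}         # stands in for the old system's storage
        self.new = {}         # stands in for the new system's storage
        self.mismatches = []  # keys where the shadow read disagreed

    def write(self, record):
        # Keep writing to the old system until it is demolished.
        if self.phase != Phase.NEW_ONLY:
            self.old[record["id"]] = record
        # Mirror every write into the new system once double writes begin.
        if self.phase != Phase.OLD_ONLY:
            self.new[record["id"]] = to_new_model(record)

    def backfill(self):
        # Copy historical data; never overwrite what a double write already stored.
        for key, record in self.old.items():
            self.new.setdefault(key, to_new_model(record))

    def read(self, key):
        # After the switch, production reads come from the new system.
        if self.phase in (Phase.NEW_READ, Phase.NEW_ONLY):
            return self.new.get(key)
        old_record = self.old.get(key)
        expected = to_new_model(old_record) if old_record is not None else None
        # Shadow read: compare in the background, but still answer from the old data.
        if self.phase == Phase.SHADOW_READ and self.new.get(key) != expected:
            self.mismatches.append(key)
        return expected
```

In this sketch you would advance `store.phase` one step at a time, call `store.backfill()` after double writes have begun, and only move to `Phase.NEW_READ` once `store.mismatches` stays empty. Because the double write starts before the backfill runs, the timing gap I was worried about earlier doesn't appear: any record the backfill misses has already been mirrored.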

Upgrades and Society Today

As I look around the world today, I see a society that’s in need of a variety of upgrades. Some of them are infrastructural - like our pandemic response and legacy COBOL unemployment benefit mainframes. Here, we can use the bridge metaphor to guide us.

But in other aspects of our lives, the upgrades we need are to the software that runs in our heads. We need to learn how coronaviruses spread, and the actions we can take to reduce their viral coefficient. We need to become aware of the racism that lurks and operates in our own minds, as well as the minds of our friends and family. And we need to understand the actions we can take that start making a difference.

Marches and peaceful protests are good. They spread awareness and force people to notice. The very reason that I’m writing this message right now is a testament to their success. They have forced the narrative.

But it would be a shame if we didn’t take it further. When I ponder what actions I can take, I remember the ending of Malcolm X’s autobiography. Early in the book, a white college student approaches Malcolm and asks him what she can do - and he tells her “Nothing”. She leaves crying.

But towards the end, Malcolm has changed. Reflecting on that memory, he says:

I regret that I told her that. I wish that now I knew her name, or where I could telephone her, or write to her, and tell her what I tell white people now… Where the really sincere white people have got to do their “proving” of themselves is not among the black victims, but out on the battle lines of where America’s racism really is—and that’s in their own home communities; America’s racism is among their own fellow whites.

He says, “Work in conjunction with us—each of us working among our own kind”. And “[w]e will meanwhile be working among our own kind, in our own black communities—showing and teaching black men in ways that only other black men can—that the black man has got to help himself”.

✊🏿

Conclusion

When it comes to software upgrades, I can give you some tangible advice. When it comes to societal upgrades, there is no clear-cut plan to execute. Nonetheless, it’s imperative we try.

The policy changes we must demand are crucial. Here, I defer to those with deep knowledge:

But when it comes to fighting racism in our own communities, this is something we can start today. Learn history and enrich your education with the works of minority voices. Understand the centuries of racism that have led us to this point and the stories of those who had to live that experience.

And then start conversations. With friends. With family. “I want to talk about race”. That’s the work we need to do next.

A bridge is a pretty special piece of infrastructure. It connects two areas together that previously would have been inaccessible. When you learn about Race and Society, and initiate conversations in your own community - you become a bridge, a pretty special person in the fabric of our society.

The societal upgrade can start with you. That’s a pretty powerful thought.

Until next time,

Phil ✊