Welcome!
Last week was Software Mentor’s first week. At the time, I had no audience. This week, after answering questions on Reddit, we’re now at 60+ subscribers! Infinite growth 🎉! Thank you for signing up and I’m excited to share my lessons with you all!
Housekeeping
But before we begin with this week’s newsletter, I have some housekeeping. Software Mentor is meant to serve readers and I want to get a better sense of who you are. I’ve put together a short informational survey that will help me tailor my writing to your interests 👉 https://forms.gle/Y7g8gJYyvGuzScQ3A
Fill it out if you have time. Thank you! 🙏
Also, come find me on Twitter.
Transactional: A Topic both Timely and Technical
Times are crazy and I wanted to write about a topic that would be both timely and technical. Alas, I couldn't find something that would span both, but we'll settle for a word that has a double meaning - one that is technical, and one that serves as a commentary on the times.
That word is "Transactional". The technical story first.
My First Outage: A Transactional Story
At my first job (Junior Software Engineer 👶 at Symphony Commerce), in my first month, we had an outage. Symphony was an ecommerce shop and we powered, among many other things, subscriptions for some of our clients.
Every day, we sent out recurring orders - a process that included payments and shipments. Usually smooth, but not today.
Subscriptions were not going out. For some reason, we couldn’t send out shipments. To make matters worse, we were repeatedly charging customers every hour. Instead of being charged once for their shipment, customers were getting charged every hour for items that weren’t even getting shipped. This was bad.
The oncall engineers mitigated the issue by refunding the bad charges, but couldn’t identify the root cause. They did, however, deduce that the culprit was likely one of a few code changes. And mine was one of them 😰
I remember working with the oncall response team to reason about my change and we agreed it was unlikely to have caused this outage. Eventually, they found out what happened. It was a combination of two bugs - the first was an error introduced in the subscription creation code. The second was the accidental addition of a @Transactional annotation that occurred during the resolution of a git merge conflict.
I could understand the first bug. But the second bug? I had no idea where to start. Transactional? Never heard of it.
What is @Transactional?
The @ sign is Java's annotation syntax, which isn’t important for this story. The important thing is to understand what Transactional means. At the time I had no idea, and felt extremely out of my depth. Imposter syndrome meter at a full 10/10.
I have, since then, come to understand what Transactional means. As a reader of this newsletter, I want to make sure you never run into a situation where you’re hit with that concept and don’t know where to start. So let’s dive in.
In this context, Transactional refers to a database transaction - it’s a feature included in many popular relational databases like MySQL and Postgres.
Starting a database transaction:
To invoke this feature, you simply type a command: START TRANSACTION (or BEGIN in Postgres). Then you make your changes. And when you’re done, you type COMMIT if you’re happy, or ROLLBACK if you’re not.
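If you'd like to try those commands without setting up MySQL or Postgres, here's a minimal sketch using Python's built-in sqlite3 module. A few assumptions to flag: SQLite spells the opening command BEGIN (it doesn't accept START TRANSACTION), and the notes table is made up purely for illustration.

```python
import sqlite3

# isolation_level=None switches off the sqlite3 module's implicit transaction
# handling, so the BEGIN / COMMIT / ROLLBACK statements below are the only
# transaction commands sent to the database.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE notes (body TEXT)")  # hypothetical table

# Happy path: make a change, then COMMIT to keep it.
conn.execute("BEGIN")
conn.execute("INSERT INTO notes (body) VALUES ('keep me')")
conn.execute("COMMIT")

# Unhappy path: make a change, then ROLLBACK to throw it away.
conn.execute("BEGIN")
conn.execute("INSERT INTO notes (body) VALUES ('discard me')")
conn.execute("ROLLBACK")

rows = [body for (body,) in conn.execute("SELECT body FROM notes")]
print(rows)  # ['keep me']
```

Only the committed row survives. The rolled-back insert is gone as if it never happened.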
Given how intimidating the language of database transactions might be, it’s extremely reassuring that using them is a piece of cake 🍰
When to use them:
Now, let’s understand why a transaction is useful. Typically, textbooks say there are 4 properties of transactions that are important, which they present in a helpful mnemonic of “A.C.I.D.”. But really, only two properties matter, and the other two were added to make a psychedelic backronym (ok, I don’t have a citation to prove it was a backronym, but it was the 70s 🤔).
Let’s focus on the two useful features (the A and I in ACID), and then briefly examine the two tertiary features that are extremely dull (the C and D):
Useful features:
Atomicity (All or nothing): Imagine you have 3 statements and you want to execute them together. If any of them fail, you don't want the others to succeed. They're the three musketeers - all for one and one for all. Using the transaction feature of your database allows you to get this all or nothing behavior.
Isolation (The illusion of sequential execution): Imagine you are the only user of a database. You can do whatever you want to the data because you're the only one there. But imagine now it gets shared between two people. An interesting question arises when both people try to read and write the same data - at the same time. What happens? Who wins? Transactions help out here. They can make it seem like these users aren't contending for the data at all. It will seem as if one of them actually goes first, and then the other. (This illusion is typically maintained by locking data. I’m not going to get into the details, but this is an EXTREMELY useful property in a world of concurrent users - one that comes at a cost in performance.)
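To make isolation a little more concrete, here's a rough sketch with two connections to the same SQLite file, standing in for two users. It only shows one facet of isolation (a write that hasn't been committed yet is invisible to everyone else), and how strictly a real database enforces this depends on the database and its configured isolation level. The payments table and its amount_cents column are made up for the example.

```python
import os
import sqlite3
import tempfile


def count_payments(conn):
    # fetchall() reads the full result, so no read lock lingers afterwards.
    return conn.execute("SELECT COUNT(*) FROM payments").fetchall()[0][0]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    # Two separate connections play the role of two concurrent users.
    writer = sqlite3.connect(path, isolation_level=None)
    reader = sqlite3.connect(path, isolation_level=None)

    writer.execute("CREATE TABLE payments (amount_cents INTEGER)")

    writer.execute("BEGIN")
    writer.execute("INSERT INTO payments (amount_cents) VALUES (500)")
    # The writer has not committed yet, so the reader still sees the old state.
    seen_before_commit = count_payments(reader)

    writer.execute("COMMIT")
    # Once the transaction commits, the new row becomes visible.
    seen_after_commit = count_payments(reader)

    writer.close()
    reader.close()

print(seen_before_commit, seen_after_commit)  # 0 1
```

From the reader's point of view, the payment either doesn't exist yet or exists completely. There is no in-between state to trip over.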
Tertiary (Bullshit) features:
Consistency - This is a bullshit feature because it’s a feature that is ALWAYS there. It’s not a special feature of database transactions. Consistency here means the constraints are respected, but relational databases ALWAYS respect their constraints - otherwise, they wouldn’t be called constraints. Note: Another annoying thing about this term “Consistent” is that it means different things in the context of relational databases vs in the context of distributed databases. The number of people that tell me MySQL is “consistent” when we’re comparing it to DynamoDB is too many 🤦‍♂️ Sadly, I can’t entirely blame them: the term consistent is not consistent across domains. Double facepalm, cry 🤦‍♂️🤦‍♂️😭
Durability - This is a bullshit feature because it’s obvious. It just means your data is saved (usually to disk, but really, any non-volatile location). Again, it’s not a feature of transactions in particular, but a feature of databases in general. IT IS NOT SPECIAL. WHY DO WE NEED TO DEDICATE BRAIN CYCLES ON THIS???
I’m getting worked up thinking about how much unnecessary complexity went into that backronym. It makes the topic of transactions seem larger than it is - a huge pet peeve of mine. But I digress. Back to understanding when to use them.
Here’s an example:
Let's say you want to do the following two insertions (in MySQL). A shipment, and a payment:
INSERT INTO payments (...) VALUES (...);
INSERT INTO shipments (...) VALUES (...);
But you want them to happen together - or not happen at all. So you can “wrap” these in a transaction like so:
START TRANSACTION;
-- your application makes a successful payment
INSERT INTO payments (...) VALUES (...);
-- your application makes a successful shipment
INSERT INTO shipments (...) VALUES (...);
COMMIT;
If something goes wrong, you can abort the transaction at any time with the ROLLBACK command:
START TRANSACTION;
-- your application makes a payment
INSERT INTO payments (...) VALUES (...);
-- SHIT something goes wrong!
ROLLBACK;
That’s how they’re used! By wrapping these statements in a transaction, you get this all or nothing behavior AND you get some benefits with preventing weird concurrency bugs! It’s great!
Well, it’s great when you desire this behavior.
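Here's roughly what that wrapping looks like from application code. This is a sketch under assumptions, not a recipe: it uses Python's sqlite3 module instead of MySQL (so the opening command is BEGIN), the column names are invented, and record_order is a hypothetical helper. The second order fails on purpose by violating a NOT NULL constraint on the shipment.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # we issue BEGIN/COMMIT ourselves
# Hypothetical schemas; the original example elides the columns.
conn.execute("CREATE TABLE payments (order_id INTEGER NOT NULL, amount_cents INTEGER NOT NULL)")
conn.execute("CREATE TABLE shipments (order_id INTEGER NOT NULL, address TEXT NOT NULL)")


def record_order(order_id, amount_cents, address):
    """Insert a payment and a shipment together, or neither of them."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "INSERT INTO payments (order_id, amount_cents) VALUES (?, ?)",
            (order_id, amount_cents),
        )
        conn.execute(
            "INSERT INTO shipments (order_id, address) VALUES (?, ?)",
            (order_id, address),
        )
    except sqlite3.Error:
        # Something went wrong: undo everything done since BEGIN.
        conn.execute("ROLLBACK")
        return False
    conn.execute("COMMIT")
    return True


def count(table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchall()[0][0]


first = record_order(1, 500, "1 Main St")  # both inserts succeed
second = record_order(2, 700, None)        # shipment insert fails, payment is rolled back
print(first, second, count("payments"), count("shipments"))  # True False 1 1
```

Note that the payment row for the second order did get inserted at first. The ROLLBACK is what removes it, which is exactly the behavior that matters in the story below.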
Back to our story
In our story, we ADDED a transaction by accident. Very likely, the engineer handling the merge conflict thought there was a missing annotation because transactions are usually good. Transactions have all these benefits, so we should be using them more, right? Well, it just depends. Do you need these benefits? Does your code expect your database to perform this way?
In our case, the code wasn’t expecting the failed payment to get rolled back. Which is why it kept charging customers.
Alone, the addition of the @Transactional annotation wouldn’t have caused a mess. But in combination with the other bug, it did.
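I don't have the original code, so treat the following as a speculative reconstruction of how two bugs like these could interact, not as what Symphony's system actually looked like. Everything in it is an assumption: the table layout, the function names, the "already paid" check, and the idea that the charge is an external side effect a ROLLBACK can't undo. The charges list stands in for a payment processor.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE payments (subscription_id INTEGER NOT NULL)")

charges = []  # stand-in for an external payment processor; ROLLBACK cannot undo these


def create_shipment(subscription_id):
    # Stand-in for the first bug: creating the shipment always fails.
    raise RuntimeError("shipment creation is broken")


def already_paid(subscription_id):
    rows = conn.execute(
        "SELECT COUNT(*) FROM payments WHERE subscription_id = ?", (subscription_id,)
    ).fetchall()
    return rows[0][0] > 0


def hourly_run(subscription_id, transactional):
    """One pass of a hypothetical recurring-order job."""
    if transactional:
        conn.execute("BEGIN")
    try:
        if not already_paid(subscription_id):
            charges.append(subscription_id)  # the customer's card is charged here
            conn.execute("INSERT INTO payments (subscription_id) VALUES (?)", (subscription_id,))
        create_shipment(subscription_id)
        if transactional:
            conn.execute("COMMIT")
    except RuntimeError:
        if transactional:
            # Second bug: the payment record is rolled back along with the failed shipment.
            conn.execute("ROLLBACK")


# Without the transaction: the payment record survives, so later runs skip the charge.
for _ in range(3):
    hourly_run(1, transactional=False)
charges_without_transaction = len(charges)

charges.clear()
conn.execute("DELETE FROM payments")

# With the transaction: every run forgets the payment and charges again.
for _ in range(3):
    hourly_run(1, transactional=True)
charges_with_transaction = len(charges)

print(charges_without_transaction, charges_with_transaction)  # 1 3
```

In this sketch the broken shipment code alone costs the customer one charge for a shipment that never arrives. Add the transaction, and the database forgets the payment every hour while the customer's card does not.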
And that concludes our story.
For those who never learned about database transactions, I hope that plugs a conceptual hole in your lattice of mental models. For those who have, I hope this was a refresher.
Transactions and Today
Now, let’s jump back to the present.
I wanted to take a short moment and talk about transactional relationships. Not “acidic” database transactions, but just good old fashioned economics.
Much of our lives are spent in transactional relationships - I give work to an employer and they compensate me. I pay Netflix and they give me a buffet of entertainment. There's a give and take to everything. That is the natural, long term, sustainable, steady-state of things.
But right now, we’re facing turbulent times. Unemployment, what the Fed is and is not doing, the coronavirus - there’s just so much uncertainty (in the negative direction).
I started this newsletter because I realized that now, more so than maybe at any other point in my life, is the time for a different approach. We need to give value, without any expectation of a return. We might need to reinvent the rules of employment - and fast. To do that, people need new skills - and fast.
That’s what this newsletter is all about: how I can share my experiences so that others might benefit from them. It’s my attempt at “giving value”.
So - here’s what I ask of you. Share what you know. Learn what you can. And help others along the way. It’s time to think big.
Alright, my friends, as they might have said in the 70s (when they were tripping on acid and decided to overcomplicate database transactions with a bloated backronym): Peace ☮️ and Love ❤️
Until next week,
Phil
#2 - Transactional Thoughts
great post! I like the way you differentiate what's really important about ACID versus what's simply there to flesh out the academic theory (or to fill in a wanted backronym). Oftentimes explanations of programming concepts can be very abstract which is probably correct from an academic perspective, but not too helpful from an industry perspective, so thanks for writing this up. looking forward to the next one