Looking back on Stripe’s payment API migration


This 2020 post walks through the first and second iterations of Stripe’s payment API. The original API was designed for credit card charges and was delightfully simple. You could move money with a single curl command! However, as Stripe added support for payment methods like Bitcoin, the flow became more complex and often required customers to perform a delicate choreography of webhooks and API calls. Stripe set out to improve the situation with its v2 PaymentIntent API, a years-long but triumphantly successful migration.

The Stripe blog post itself is more concerned with history than technical details. Actually, I don’t think it explains the workings of the APIs (or their differences) very clearly, so I’ll get to that later. The migration story itself goes like this:

  • v1 API was designed for credit/debit cards. Cards turn out to be the easiest payment flow, because charges (1) settle immediately and (2) do not require explicit action by the cardholder.
    • In fact, US debit/credit cards were the only flow with both those properties, out of the many domestic and global payout rails Stripe eventually added. 
  • The API began to evolve when Stripe added support for Bitcoin and ACH. Bitcoin is the opposite of credit cards: funds settle asynchronously and require secondary interaction from the bitcoin holder. This, too, is an unusual combination!
    • Most payout rails aren’t like credit cards or bitcoin.
    • So, the abstractions Stripe developed were highly path-dependent and based on “outlier” examples. 

Stripe spent months designing a new flow from scratch and then years rolling it out. The new flow was much more ergonomic for most payment flows. The only shortcoming was that it was a little more complex for the good old “just let me charge a credit card” flow, so they found a way to simplify that part too, and all was well™. It’s a great story!

What was less clear to me, from the post, was the real difference between the two flows, so I spent some time trying to work it out in more detail. 

The Bad Old Days

Let’s say you’re operating an online store, accepting payment via Stripe. You want to do the following:

  1. Get the user’s payout information.
  2. Charge their account.
  3. Wait for the charge to settle in Stripe’s bank account.
  4. Give your customers what they paid for. 

Let’s look at how steps 1, 2 and 3 work for credit cards vs. bitcoin.

| Step | Card | Bitcoin |
|---|---|---|
| Get payout information | Users upload their credit card info to Stripe using the stripe.js integration. Stripe sends back an encrypted Token. That way our store’s server never needs to see the sensitive credit card information. | Users provide their public bitcoin address. |
| Charge account | Our server asks Stripe to pull funds from the card details represented by Token. This action is represented by a Charge object. | Stripe cannot directly initiate a transfer from the user’s bitcoin address. Instead, stripe.js asks the customer to initiate a transfer to a bitcoin address held by Stripe. The action/flow of requesting an action from the customer was represented by the BitcoinReceiver object (later known as Source). Once the funds arrive in the escrow account, Stripe notifies the storefront account via a webhook, polling, etc., and the storefront can finally initiate the Charge that will move funds virtually from the escrow account to the store’s Stripe balance. If the store doesn’t initiate the Charge in time, then the transaction will be reversed, and funds will return to the customer’s account. |
| Funds settle | Card Charge events settle immediately. From the store’s perspective, a single synchronous request completes the transaction and confirms that it succeeded. | The final Charge settles immediately, since it’s a virtual transfer. |

The card flow is elegant and synchronous. The bitcoin flow is horrifying. If anything goes wrong (for instance, the store’s backend goes down and misses the webhook for BitcoinReceiver success) then the transaction will fail, even though (from the customer’s perspective) they successfully initiated a transfer from their bitcoin account. The store must track separate, asynchronous state machines for BitcoinReceiver and for Charge. The store needs to ensure that both the BitcoinReceiver and Charge flows are unique for each purchase to avoid double charges.  
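To make that bookkeeping concrete, here is a minimal sketch of what the store has to track per purchase in the old bitcoin flow. It is a toy in-memory model, not Stripe’s API: ReceiverState, ChargeState, Order, on_receiver_filled and on_deadline are all hypothetical names I made up, and the refund deadline is modeled as a plain function call.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ReceiverState(Enum):
    PENDING = auto()    # waiting for the customer to send bitcoin
    FILLED = auto()     # funds arrived at the Stripe-held address
    REFUNDED = auto()   # store never charged in time; funds went back


class ChargeState(Enum):
    NOT_CREATED = auto()
    SUCCEEDED = auto()


@dataclass
class Order:
    """One purchase, as the store's backend has to see it (hypothetical)."""
    order_id: str
    receiver: ReceiverState = ReceiverState.PENDING
    charge: ChargeState = ChargeState.NOT_CREATED


def on_receiver_filled(order: Order) -> None:
    """Webhook handler: the store itself must react by creating the Charge."""
    if order.receiver is ReceiverState.REFUNDED:
        return  # too late, the funds already went back to the customer
    order.receiver = ReceiverState.FILLED
    # Guard against duplicate webhook deliveries to avoid a double charge.
    if order.charge is ChargeState.NOT_CREATED:
        order.charge = ChargeState.SUCCEEDED


def on_deadline(order: Order) -> None:
    """Stand-in for the transfer being reversed when no Charge was created."""
    if order.charge is ChargeState.NOT_CREATED:
        order.receiver = ReceiverState.REFUNDED


happy = Order("order-1")
on_receiver_filled(happy)
on_receiver_filled(happy)  # a duplicate delivery changes nothing
on_deadline(happy)

missed = Order("order-2")  # the backend was down and never saw the webhook
on_deadline(missed)

print(happy.receiver.name, happy.charge.name)
print(missed.receiver.name, missed.charge.name)
```

The second order is the failure mode described above: the customer did their part, but because the store never handled the webhook, no Charge exists and the funds go back.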

I feel the need to repeat: the bitcoin flow is horrifying. It seems insane to model a payment as an asynchronous, multi-step process where part of the orchestration is farmed out to the store’s own server. However, I think we can piece together how Stripe got here:

  1. We (Stripe) have a nice Charge flow that allowed a store to initiate a payment from its server.
  2. The Charge() function requires a “root password” like a credit card number or ACH routing number. Bitcoin users don’t like sharing their private keys, so that basically blocks us from running Charge().
  3. Okay, how do we get around this? Well, if we could get customers to send funds in advance to a bitcoin account that we control, then we do have the “admin perms” to that account, so now Charge() will work.
    1. To quote the blog post: “As Bitcoin didn’t fit into our abstractions, we had to introduce a new BitcoinReceiver API to facilitate the client-side action we needed the customer to take in the online payment flow”.
  4. Of course, Charge() needs to be initiated by the store, so we’ll have to let them know when we have control of the funds (via webhook, etc) so they can finally call the function.
    1. Oops! Looks like sometimes stores don’t call the final Charge(). Well, we’d better return the customers’ bitcoin after a while, then!

I suspect that people were willing to go along with this because Bitcoin was already widely seen as very complex and distributed and so it just made sense that Stripe’s Bitcoin integration would likewise feel complex and distributed. Plus, Bitcoin was sort of a weird edge case, so it didn’t make sense to mess too much with Charge().

Of course, as Stripe began to support payout methods like Alipay and WeChat, required customer interaction shifted from a weird edge case to a core flow. BitcoinReceiver was already well-established, so Stripe just abstracted it a bit further to a Source object that could collect funds from different kinds of customer accounts. Going forward, if the customer wouldn’t give you the root password to their account, you would just have them transfer funds to Stripe in advance, and then call Charge() once those funds settled.

Still, it’s a completely insane flow, or in the formal terms of Stripe’s blog post, “a conversion nightmare.”

The New, Glorious v2 API

So, why do storefronts actually need to initiate that intermediate Charge() action at all? Couldn’t Stripe itself auto-trigger Charge() as soon as the Source account is funded? The blog post doesn’t address this question directly. 

In fact, we don’t really learn about any of the technical tradeoffs Stripe considered while designing v2. Instead, Stripe lists meta-lessons from their brainstorming process like “Close laptops” and “Use colors and shapes.” I love it.

However, I think the core reason why Stripe couldn’t auto-trigger Charge() is that it couldn’t trust the customer’s side of the flow to decide how much money to send. The BitcoinReceiver / Source flows were fundamentally triggered by untrusted frontend code. Only the server-side Charge() was in the control of the storefront.

So, we need a new abstraction that allows stores to declare the charges upfront. Sort of a combination of Charge/Source that is created on the store’s server at the beginning of the payments flow. We need a unified state machine that waits for any customer-initiated transfers to settle, checks that they sent enough money, then auto-triggers the corresponding charge.

We need PaymentIntent.

As Stripe explains, this object provides a single asynchronous state machine that covers all aspects of a payment lifecycle. It can block on customer action, but then proceed automatically to initiating the charge if everything is in order. Stores still need to listen for webhooks when the final payment has succeeded, but not for intermediate states.
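Here is how I picture that unified state machine, as a minimal sketch. Everything in it is my own assumption: ToyPaymentIntent, its methods, and the four status names are a simplification for illustration, not Stripe’s actual object or status list. The point is that the amount is fixed by the store’s server up front, and the object checks the settled funds against it without the store having to step in.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Status(Enum):
    REQUIRES_ACTION = auto()  # blocked on the customer (e.g. sending funds)
    PROCESSING = auto()       # customer acted; waiting for funds to settle
    SUCCEEDED = auto()
    FAILED = auto()           # simplification: settled funds fell short


@dataclass
class ToyPaymentIntent:
    amount_due: int  # declared by the store's server, not by frontend code
    status: Status = Status.REQUIRES_ACTION
    amount_received: int = 0

    def customer_sent(self, amount: int) -> None:
        """Frontend event: the customer started a transfer."""
        if self.status is Status.REQUIRES_ACTION:
            self.amount_received = amount
            self.status = Status.PROCESSING

    def funds_settled(self) -> None:
        """Settlement event: verify the amount, then finish automatically."""
        if self.status is not Status.PROCESSING:
            return
        if self.amount_received >= self.amount_due:
            self.status = Status.SUCCEEDED
        else:
            self.status = Status.FAILED


ok = ToyPaymentIntent(amount_due=1000)
ok.customer_sent(1000)
ok.funds_settled()

short = ToyPaymentIntent(amount_due=1000)
short.customer_sent(400)
short.funds_settled()

print(ok.status.name, short.status.name)
```

In this toy version the store only needs to care about the final status, which matches the “listen for the final webhook, ignore the intermediate ones” shape described above.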

It’s kind of brilliant, honestly. The new flow is not only better for the new payment methods introduced since bitcoin — it’s a better flow for bitcoin, too! 

The only problem is that it is slightly less ergonomic for credit card transactions, which were previously synchronous. To get around this, Stripe has added some syntactic sugar to the PaymentIntent API that basically lets you make it synchronous by erroring out if any asynchronous actions are required. This is the cherry on top for a very, very impressive API migration.
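The “error out instead of waiting” idea is simple enough to sketch in a few lines. Again, this is a toy under my own assumptions, not Stripe’s interface: charge_in_one_call, CustomerActionRequired and the needs_customer_action flag are hypothetical stand-ins for whatever the real API does internally.

```python
class CustomerActionRequired(Exception):
    """Raised when a payment cannot finish within a single request."""


def charge_in_one_call(amount: int, needs_customer_action: bool) -> dict:
    """Hypothetical helper: finish the payment now or fail loudly."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    if needs_customer_action:
        # Rather than parking the payment in a waiting state, give up now.
        raise CustomerActionRequired("payment method needs a customer step")
    return {"amount": amount, "status": "succeeded"}


card = charge_in_one_call(1000, needs_customer_action=False)

try:
    charge_in_one_call(1000, needs_customer_action=True)
    outcome = "succeeded"
except CustomerActionRequired:
    outcome = "rejected"

print(card["status"], outcome)
```

A store that only ever takes simple card payments gets the old one-request experience back, at the cost of turning away any payment that would have needed a second step.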