Optimizing iWittr.com to reduce Google Cloud & Vercel costs


Last month I decided to do a quick fun project as an excuse to try out AI coding tools, called iWittr.com. It’s a fan site for the Kermode & Mayo podcast, which I’ve been listening to for over 10 years.

I might do another post about that experience, but this one is about reducing its costs. I found that when it became popular (it’s been mentioned twice so far on their podcast), I began to get alerts from Google that I’d used half of my monthly budget (€20) in two days.

TLDR

  • Firestore is too bloody opaque to properly understand your site’s usage
  • Check your NextJS build logs to ensure that pages you think are cached on the Edge are not accidentally prevented from being cached
  • Close the Firestore console when you’re not using it
  • If you’re generating a huge HTML page that causes thousands of reads, in development mode just fetch a few records.
  • If you have a small data set, move it to a client-side Worker and query it there, rather than repeatedly fetching the same few hundred or few thousand records from the server
  • If you have a page that is expensive to load, make sure that any <Link> tags to it have prefetch={false} set

For context, the tech stack is

  • Firebase Firestore for storing data
  • Google Cloud Storage for files
  • NextJS for development
  • Vercel for deployment (UI and functions)

Debugging the costs

Finding the reason for the costs was initially quite difficult. Google’s Billing page lists them under App Engine, which I didn’t think I was using – I had nothing deployed on Google infrastructure. However, it seems they bundle Firebase-related costs under App Engine, which is good to know.

This is where it gets difficult – Firestore will just tell you the total number of reads you are doing, but not the collections you are reading most from, forcing me to try to guess where I was being wasteful.

Optimizing the Map

My first guess was the Map page. I knew that this was the primary page that everyone, both logged-in users and casual browsers, would go to, and it was reading (in the backend) over a thousand records every time they moved the map. Of course I was grouping these together to send far less to the client, but the reads were still happening.

I decided instead to do all the map-based filtering and grouping on the client, in a Worker thread. It turns out that if I just store the entire set of map data (a list of towns and the number of people checked into them) in a JSON file on Google Cloud Storage, it’s under 200k in size. Every time a new user checks in and adds themselves to the map, I now

  • Download the previous JSON file from storage
  • Update this list with the new town information and upload it again (a sketch of this follows the list)
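
A minimal sketch of that check-in path, assuming a bucket name, file path, and record shape of my own choosing (the real ones will differ):

```typescript
import { Storage } from "@google-cloud/storage";

type TownCount = { town: string; count: number };

// Uses application-default credentials; bucket and path names are illustrative.
const storage = new Storage();
const mapFile = storage.bucket("iwittr-map").file("map/towns.json");

export async function recordCheckIn(town: string): Promise<void> {
  // 1. Download the previous JSON file from storage.
  const [contents] = await mapFile.download();
  const towns: TownCount[] = JSON.parse(contents.toString());

  // 2. Update the list with the new town information.
  const existing = towns.find((t) => t.town === town);
  if (existing) existing.count += 1;
  else towns.push({ town, count: 1 });

  // 3. Upload it again so every client fetches one small, fresh file.
  await mapFile.save(JSON.stringify(towns), { contentType: "application/json" });
}
```

With the site’s low write rate a simple read-modify-write like this is fine; heavy concurrent check-ins would need some guard against one update overwriting another.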

The client app previously used to hit the /api/marker endpoint every time the map moved, passing it the bounds of the visible map (in longitude and latitude). This resulted in thousands of reads, plus CPU time spent in the cloud grouping the data into as few visible map markers as possible.

The client app now does a single read from the server, asking for the URL of the most recent JSON upload. It downloads that file in the Worker thread and stores it in local storage. The UI thread sends the map bounds to the Worker thread, which reads from memory, does all the filtering and grouping, and returns in just a few milliseconds.
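
A minimal sketch of what the Worker side might look like, with made-up message names (“init”, “bounds”, “persist”, “markers”) and an assumed shape for each town record; the real code differs:

```typescript
// map.worker.ts: a sketch of the Worker that holds the town list in memory.
type Town = { town: string; count: number; lat: number; lng: number };
type Bounds = { north: number; south: number; east: number; west: number };

let towns: Town[] = [];

self.onmessage = async (event: MessageEvent) => {
  const { type, payload } = event.data;

  if (type === "init") {
    // Start from any cached copy the UI thread found in LocalStorage...
    if (payload.cached) towns = payload.cached as Town[];
    // ...then download the latest JSON file (a single fetch of under 200k).
    const res = await fetch(payload.url);
    towns = await res.json();
    // Workers can't touch LocalStorage, so ask the UI thread to persist it.
    self.postMessage({ type: "persist", payload: towns });
    return;
  }

  if (type === "bounds") {
    // Filter (and, in the real code, group) entirely in memory, with no server reads.
    const { north, south, east, west } = payload as Bounds;
    const visible = towns.filter(
      (t) => t.lat <= north && t.lat >= south && t.lng <= east && t.lng >= west
    );
    self.postMessage({ type: "markers", payload: visible });
  }
};
```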

There are two really cool things about this.

  • The first is that it’s much faster, basically instant, and works offline.
  • The second is that the refactoring from running on the API to running in the Worker took less than 10 minutes, as I simply told Claude Code to do it for me. It
    • Created the Worker file
    • Instantiated it in the React component for the Map
    • Moved all the logic from the API route to the Worker
    • Changed all calls in the React component from going to the API to speaking to the Worker
    • Wrote the LocalStorage code in the React component and had the Worker tell it to do the data storage, as the Worker does not have access to LocalStorage (both sides of this are sketched after the list)
    • When the Map page loads it first populates the map from LocalStorage, then gets the latest JSON file from the internet, if available, and updates the map with the latest data.
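
For illustration, the React side of that conversation might look roughly like this; the hook name, storage key, and /api/map-url endpoint are all my own inventions:

```typescript
"use client";
// useMapWorker.ts: a sketch of the React side of the Worker conversation.
import { useEffect, useRef } from "react";

const MAP_KEY = "iwittr-map-data"; // hypothetical LocalStorage key

export function useMapWorker(onMarkers: (markers: unknown[]) => void) {
  const workerRef = useRef<Worker | null>(null);

  useEffect(() => {
    const worker = new Worker(new URL("./map.worker.ts", import.meta.url));
    workerRef.current = worker;

    worker.onmessage = (event) => {
      if (event.data.type === "persist") {
        // The Worker has no LocalStorage access, so it asks us to save for it.
        localStorage.setItem(MAP_KEY, JSON.stringify(event.data.payload));
      } else if (event.data.type === "markers") {
        onMarkers(event.data.payload);
      }
    };

    // Populate from LocalStorage first, then hand the Worker the latest JSON URL.
    const cached = localStorage.getItem(MAP_KEY);
    fetch("/api/map-url") // assumed endpoint: the single read returning the newest upload's URL
      .then((res) => res.json())
      .then(({ url }) => {
        worker.postMessage({
          type: "init",
          payload: { cached: cached ? JSON.parse(cached) : null, url },
        });
      });

    return () => worker.terminate();
  }, [onMarkers]);

  // Call this from the map's "move" handler; markers come back via onMarkers.
  return (bounds: unknown) =>
    workerRef.current?.postMessage({ type: "bounds", payload: bounds });
}
```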

This approach only works, of course, because the data set is small, and I don’t expect it to grow to a huge size.

Result: a small reduction in the number of reads! Somehow, I’d guessed incorrectly.

Optimizing the Wiki

There was an old, defunct wiki containing loads of great user-created content about the podcast, now graciously preserved on Archive.org. I had taken all that data and rebuilt a new wiki from scratch, and decided that I wanted one huge page that lists all of the 1300+ articles. I’d told Vercel to cache this on the edge using the revalidate configuration.

However, it turns out that at some point I’d added a check for the user cookie (to determine whether or not they have Editor permissions), and this was causing it never to be cached!
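
My rough reconstruction of the problem, with a hypothetical cookie name; the key point is that reading request cookies in a server component makes the whole route dynamic:

```typescript
// app/wiki/page.tsx (before the fix)
import { cookies } from "next/headers";

export const revalidate = 3600; // what I thought kept the page cached on the Edge

export default async function WikiIndexPage() {
  // Reading request cookies opts the whole route into dynamic rendering,
  // so the revalidate setting above silently stops applying.
  const cookieStore = await cookies();
  const isEditor = cookieStore.get("editor-session") !== undefined; // hypothetical cookie name
  // ...build the huge 1300+ article page, doing all the Firestore reads...
  return <main>{isEditor && <a href="/wiki/edit">Edit</a>}</main>;
}
```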

Even worse, the home page used the <Link> component to link to all the other pages, and by default NextJS prefetches linked pages, so regardless of whether or not someone visited the wiki, it would do its 1300+ reads.

Finally, in development there is no Edge caching of course, and when I was working on the code, every time I was saving a file it was reloading the data.

The Fix

I told Claude Code to refactor all user-related code into a client component that checks in the browser whether or not the user is logged in, allowing the page to be cached properly. I also added a simple check so that, in development mode, the page only loads 20 articles instead of 1300+. Finally, I changed the link to <Link prefetch={false} ... >, as it’s a huge HTML page and most people won’t navigate to it.
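
A sketch of the shape of the fix, assuming the App Router and using illustrative names (EditorControls, fetchArticles) rather than the real ones:

```typescript
// app/wiki/page.tsx (after the fix): no cookie access here, so the page
// can be statically cached on the Edge again.
import { EditorControls } from "./EditorControls"; // a "use client" component that checks login in the browser

export const revalidate = 3600;

export default async function WikiIndexPage() {
  // In development, only load a handful of articles instead of 1300+.
  const limit = process.env.NODE_ENV === "development" ? 20 : undefined;
  const articles = await fetchArticles(limit);
  return (
    <main>
      {/* The editor check now runs in the browser, after the cached HTML loads. */}
      <EditorControls />
      <ul>
        {articles.map((a) => (
          <li key={a.slug}>
            <a href={`/wiki/${a.slug}`}>{a.title}</a>
          </li>
        ))}
      </ul>
    </main>
  );
}

// Hypothetical stand-in for the real Firestore query, honouring the limit.
async function fetchArticles(limit?: number): Promise<{ slug: string; title: string }[]> {
  return [];
}
```

The home page’s link to this index then becomes <Link prefetch={false} href="/wiki">, so visitors who never open the page no longer trigger its 1300+ reads.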

Avoiding the Firestore Console

After making these changes, the read count for Firestore dropped a lot, but there were still hundreds of thousands of unexplained reads. It turns out that if you leave the Firestore Console (their web dashboard) open on an active Collection, it reads from the collection over and over again.

Solution: I closed the Firestore console, and now only open it when I need it. With this, the number of reads finally became reasonable.

Results

The total number of reads per day has dropped from 9 million to under 0.5 million, an improvement of around 20x. The map is far faster, and the Wiki loads much more quickly. All in all, a good morning’s work.

Wishlist

  • Google’s Firestore should give a breakdown per Collection, or even by Index, for reads, to make debugging easier.
  • Vercel should let you know if a previously statically cached page has become uncached, as happened to me accidentally


Published by Shane O'Sullivan

I am a software engineer and manager from Ireland. I spent 7 years working in Ireland from 2003 – 2010, then ten years in Silicon Valley from 2010 to 2020. In California I spent about 6.5 years at Facebook Engineering, for the last three of which I was an engineering manager in the Ads organisation focusing on customer facing products for creating and managing ads. At Stripe I built the Developer Productivity organisation, with teams that were responsible for the use of the Ruby language, testing infrastructure, documentation, developer tooling (e.g. IDE integrations) and more. At Promise, I was Head of Engineering from 2018 – 2020, responsible for building the first few iterations of our products, hiring for all product roles, meeting with clients and investors, and anything else needed to get a tiny startup bootstrapped and successful. Now I’m back in Ireland, working on various projects while advising the great people at Stainless (stainless.com). This blog contains my various musings on all things technical/interesting on the interweb and beyond.
