Using (almost) pure CSS to make fancy scroll-driven image sequence animations

We’ve all seen it. Some new Apple product gets announced and you go to the product page to check it out. As you scroll, you get blasted in the face by a flashy, highly complex scroll-driven animation. Apple isn’t the only company that uses this effect, but they really love it in particular. I was able to find one on their site in under a minute on the AirPods Pro 2 page:

A screen recording showing an animation of a pair of AirPods twisting and flying out towards the screen as you scroll.

If you’re a web dev like me, you’ve probably wondered how that effect works at least once. Maybe you’ve looked it up and were slightly underwhelmed like I was to discover that it’s all pretty much smoke and mirrors; these effects are usually just a pre-baked video split into individual images for each frame, and then JavaScript checks your scroll position to determine which frame to show, usually by rendering to a canvas.

The most disappointing part of this is the performance implications: anyone who knows a thing or two about video compression will be able to tell you that the file size difference between a nice 5-second mp4 file and dozens of individual PNGs is eye-popping to say the least.

My partner Jess recently had me make some updates to their site including, you guessed it, adding a scroll-driven image sequence animation to the homepage. I was excited to finally have a chance to take a crack at this effect, and I think it turned out great!

A screen recording of my partner Jess' updated website, showing an animation of a phone with the Crate and Barrel app on it twisting as you scroll.

I said the magic behind how these image sequence animations work seemed a little underwhelming, but actually trying to implement it myself was an interesting challenge and I learned a ton as I continued to refine it and improve its performance. After a lot of experimenting, I landed on an implementation which I haven’t really seen anyone else try before that is faster, more efficient, and can be implemented in almost pure CSS. I’m pretty excited to share it!

Image sequences, the Apple way

First, let’s talk about how most sites implement this effect. Pretty much every implementation I’ve seen, including the AirPods one shown above, takes a fairly brute force approach:

  1. Preload every single individual frame image file you can up front, showing some loading state or placeholder in the meantime.
  2. Once the images are loaded, start watching for scroll events. As the scroll position changes, calculate which frame of the animation should be shown for that position and render that image to a canvas.
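The frame lookup in step 2 boils down to a little arithmetic. Here’s a minimal sketch of it; `frameIndexForScroll` and its parameter names are hypothetical and not taken from Apple’s code, and it assumes the animation plays over a fixed scroll distance in pixels.

```javascript
// Hypothetical helper: map a scroll offset to a zero-based frame index.
// Assumes scrollRange (the distance the animation plays over) is > 0.
function frameIndexForScroll(scrollY, scrollRange, frameCount) {
  // Normalize the scroll offset to 0-1 and clamp overscroll
  const progress = Math.min(Math.max(scrollY / scrollRange, 0), 1);
  // Pick the frame closest to that point in the animation
  return Math.round(progress * (frameCount - 1));
}

// Browser usage sketch, assuming `frames` is an array of preloaded
// Image objects and `ctx` is a 2D canvas context:
//
//   window.addEventListener("scroll", () => {
//     const index = frameIndexForScroll(window.scrollY, 1200, frames.length);
//     ctx.drawImage(frames[index], 0, 0);
//   }, { passive: true });
```

The interesting part is how little logic is involved; most of the cost of this approach comes from loading and holding all of those frames.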

This definitely works! But in my opinion, it’s flawed.

Loading 65 individual image files at the same time is not great for performance.

I was shocked to discover that the AirPods animation is made up of 65 PNGs weighing in at a total of 15.2MB. That’s so much data! If they converted the frames to a more efficient format like WebP, that alone could bring the total weight down to 1.7MB, a massive nearly 90% reduction. However, there are still additional issues with this approach to consider.

On top of download size, if you’re serving the image assets over HTTP/1, browsers limit how many requests they can run in parallel (typically around six connections per origin), meaning you are all but guaranteed to run into a bottleneck that will cause the frames to take much longer to download and may start impacting how other parts of the site load as well.

Apple serves their images over HTTP/2 so that is less of a concern, but I have seen this mistake happen in the wild. I once saw a site where even on a fiber internet connection, you would still get stuck with a loading progress bar where you had to wait for maybe 30+ seconds before being able to interact with the site at all, just so they could have a fancy scroll-driven image sequence animation. There is no world where that is worth the wait, no matter how cool the animation is, never ever ever. Completely unacceptable.

Finally, my last point of contention is that all of this relies on quite a bit of JavaScript to manage loading the images, tracking your scroll position and calculating the current frame to display, and rendering that frame to a canvas. JavaScript comes with added potential points of failure and is much more expensive per byte than HTML or CSS due to added overhead required for parsing and execution. JavaScript can’t always be avoided, but I’d love to avoid relying on it as much as possible!

I did observe that Apple uses one optimization technique worth noting: if you apply network throttling in your dev tools, the page still takes way too long to load in my opinion, but you’ll be able to see that they try to prioritize loading frames at major checkpoints in the animation first so they can start serving a degraded version of the animation as soon as possible. For instance, you can see that they prioritize loading the first frame, then the last frame, then the frame at 50%, then at 25% and 75%, and so on, slowly increasing the resolution so that you can start playing an extremely choppy and kinda ugly but still functional version of the animation sooner while you wait for the rest of the frames to come in.
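One way to produce that kind of load order is a breadth-first bisection of the frame range. This is only a sketch of the idea as I understand it from watching the network tab; `progressiveLoadOrder` is a hypothetical name, and Apple’s actual ordering logic may differ.

```javascript
// Hypothetical helper: returns zero-based frame indices in the order
// they should be requested: first, last, 50%, 25%, 75%, and so on.
function progressiveLoadOrder(frameCount) {
  if (frameCount <= 0) return [];
  if (frameCount === 1) return [0];

  const order = [0, frameCount - 1];
  // Each pass splits every remaining gap in half, so the animation's
  // "resolution" roughly doubles with every completed pass.
  let ranges = [[0, frameCount - 1]];
  while (ranges.length > 0) {
    const next = [];
    for (const [start, end] of ranges) {
      if (end - start < 2) continue; // no unloaded frame between these two
      const mid = Math.floor((start + end) / 2);
      order.push(mid);
      next.push([start, mid], [mid, end]);
    }
    ranges = next;
  }
  return order;
}
```

While frames are still arriving, the renderer can fall back to the nearest frame that has already loaded, which is what produces the choppy but functional early version of the animation.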

What if we used spritesheets?

I studied game development in college where spritesheets are a very common concept, but for those unfamiliar, a spritesheet is an asset optimization technique commonly used in game development for 2D games where you pack a bunch of images into a single big image file that can be loaded all at once.

The frames from the AirPods Pro animation shown above packed into a spritesheet. Note that this asset has been scaled down so its file size is not representative.

Spritesheets have a few notable benefits for our purposes.

First, when you pack multiple images into one, the final image can usually be compressed a little smaller than the total combined size of the individual files. In the case of the AirPods Pro animation, converting each frame to WebP resulted in a total combined size of 1.7MB, but packing all of those into a WebP spritesheet image brings the size down further to 1.5MB. That’s a relatively modest improvement of roughly 10%, but still significant especially at scale.

Second, we’re able to load all of the frames for the animation in a single network request, so concerns about HTTP/1 request bottlenecks are significantly lessened. Being a single image also means you could even add a preload <link> tag to the <head> to get things loading as soon as possible.

Playing through a spritesheet with CSS

Using a spritesheet means we can use some tricks to play through the animation with almost pure CSS. This only requires two elements: an <img> tag with the spritesheet loaded, and a wrapper element with overflow: clip which will serve as a window into the current frame we want to display from the spritesheet animation. At that point, all we have to do is shift the <img>’s position around so the current desired frame is visible in the parent’s window, and we have a little animation going!

Here’s the HTML and CSS for my implementation:

<!-- It's important that an aspect-ratio is defined on the img-sequence element which matches the
      exact aspect ratio of one cell in the spritesheet grid. -->
<img-sequence style="--sprite-count: 65; --column-count: 11; --row-count: 6; aspect-ratio: 1440/810">
  <img src="/img/my-spritesheet.webp" alt="" fetchpriority="high" class="spritesheet">
</img-sequence>
<style>
  img-sequence {
    /* The wrapper img-sequence tag will serve as a window into a single frame of the spritesheet,
        so we need the child <img> tag to be able to be absolutely positioned relative to this tag
        and to clip any overflow. */
    position: relative;
    overflow: clip;
    display: block;
  }

  img-sequence .spritesheet {
    display: block;

    position: absolute;
    top: 0;
    left: 0;
    /* Size the spritesheet so a single cell fits in the dimensions of the parent element */
    width: calc(100% * var(--column-count));
    height: calc(100% * var(--row-count));

    /* Progress in the animation is driven by a --progress CSS variable
       which should be a number from 0-1.
       We will use that to calculate the index of the current cell in the spritesheet grid
       which best matches that percentage of the way into the animation. */
    --cell: round(
      clamp(0, var(--progress), 1) * (var(--sprite-count) - 1) + 1,
      1
    );
    /* Now we can derive the row and column numbers for the current cell in the spritesheet grid. */
    --row: round(up, calc(var(--cell) / var(--column-count)), 1);
    --column: calc(var(--cell) - (var(--row) - 1) * var(--column-count));

    /* Translate the image to the appropriate position to display the current row and column inside the
        bounds of the parent img-sequence element.
        Using translate3d for hardware acceleration. */
    transform: translate3d(
      calc(-100% * (var(--column, 1) - 1) / var(--column-count, 1)),
      calc((var(--row, 1) - 1) * -100% / var(--row-count, 1)),
      0
    );
  }
</style>
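If the `round()` math is hard to follow, it can help to run the same calculation outside of CSS. This is a JavaScript mirror of the custom properties above, intended only for checking the numbers; `spriteOffset` is a hypothetical helper, and it assumes frames are packed into the grid left to right, top to bottom.

```javascript
// Mirrors the --cell, --row, --column and translate3d() math from the CSS.
// translateX / translateY are percentages of the spritesheet's own size.
function spriteOffset(progress, spriteCount, columnCount, rowCount) {
  const clamped = Math.min(Math.max(progress, 0), 1);
  // 1-based index of the cell closest to this point in the animation
  const cell = Math.round(clamped * (spriteCount - 1) + 1);
  // round(up, ...) in CSS corresponds to Math.ceil here
  const row = Math.ceil(cell / columnCount);
  const column = cell - (row - 1) * columnCount;
  return {
    cell,
    row,
    column,
    translateX: (-100 * (column - 1)) / columnCount,
    translateY: (-100 * (row - 1)) / rowCount,
  };
}

// With the 65-frame, 11x6 grid from the markup above:
// halfway through lands on cell 33, which is row 3, column 11.
const halfway = spriteOffset(0.5, 65, 11, 6);
```

Because the image is 11 cells wide, shifting it left by one cell means translating by 1/11 of its own width, which is why the offsets are divided by the column and row counts.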

All we need to do at this point is set a --progress variable on the <img-sequence> element, and it will shift the image around to show the frame that is that percentage of the way into the animation. I am currently using some JavaScript to calculate the scroll percentage which drives this progress value, so that is where the “(almost)” caveat in the title comes in.

Here’s a rough example of how you would make the animation play with JavaScript:

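// Assumes the page contains a single <img-sequence> element
const imageSequence = document.querySelector("img-sequence");

window.addEventListener("scroll", () => {
  // Play through the animation from start to finish as the
  // user scrolls down by half of the window's height.
  // Values above 1 are fine; the CSS clamps --progress to 0-1.
  const progress = window.scrollY / (window.innerHeight / 2);
  imageSequence.style.setProperty("--progress", String(progress));
}, {
  // Don't forget to make your scroll listeners passive!
  passive: true
});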

However, in the near future it should theoretically be possible to make this work with pure CSS using @property and scroll timelines! There is a polyfill for scroll timelines which I attempted to use, but I was fighting with too many major inconsistencies in behavior between the real version running in Chrome and the polyfilled version running in Firefox and Safari. I’m going to wait for things to get a little more baked before I try that again.

Regardless, here’s a potential vision for how the image sequence could be played with a scroll timeline:

/* Register --progress variable's type as a number so
   the browser knows how to transition the value */
@property --progress {
  syntax: "<number>";
  inherits: true;
  initial-value: 0;
}

/* The scroll timeline will play through this
   animation to transition progress from 0 to 1 */
@keyframes imgSequenceProgress {
  from {
    --progress: 0;
  }

  to {
    --progress: 1;
  }
}

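img-sequence {
  animation-name: imgSequenceProgress;
  animation-duration: 1ms;
  animation-direction: alternate;
  /* Advance --progress evenly with scroll distance
     instead of applying the default "ease" curve */
  animation-timing-function: linear;
  /* Play through the animation based on the
     scroll position of the element */
  animation-timeline: scroll(block nearest);
}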
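Until scroll timelines are available everywhere, one option is to feature-detect them and only attach a JavaScript scroll listener where they’re missing. Here’s a minimal sketch, assuming the standard `CSS.supports()` API; `supportsScrollTimeline` is a hypothetical helper name, and the `css` parameter exists only so the check can be exercised outside a browser.

```javascript
// Returns true when the browser understands `animation-timeline: scroll()`.
// `css` defaults to the global CSS object, which only exists in browsers.
function supportsScrollTimeline(css = globalThis.CSS) {
  return Boolean(
    css &&
      typeof css.supports === "function" &&
      css.supports("animation-timeline", "scroll()")
  );
}

// Usage sketch: fall back to a scroll listener only when needed.
//
//   if (!supportsScrollTimeline()) {
//     window.addEventListener("scroll", updateProgress, { passive: true });
//   }
//
// where `updateProgress` is whatever function sets --progress,
// like the listener shown earlier.
```

This keeps the pure CSS path as the default and treats the JavaScript version as the fallback rather than the other way around.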

Isn’t that awesome? CSS is awesome.

I will always relish an opportunity to keep my reliance on JavaScript to a minimum. I expect this approach will also provide some significant low-level performance benefits in terms of both memory usage and CPU usage compared to rendering to a canvas, but I haven’t spent enough time profiling to be able to confidently speak to that.

Downsides

Like everything, this approach still has tradeoffs worth acknowledging.

First, it relies heavily on the round() CSS function, which has okay but not great browser support as of writing this. However, it is in all major modern browsers, and the progressive enhancement story isn’t terrible; if a browser doesn’t support round(), the animation will just be stuck on the first frame as if it were a static image.

The bigger flaw is that this simple implementation does not provide a good experience while you wait for the spritesheet image to load, and a 1.5MB image is definitely going to take some time to load for a lot of people. The one upside of using individual frames is that you can load and display the first frame a lot quicker while you wait for the rest to come in, but with a spritesheet, it’s all or nothing.

I ended up writing a simple img-sequence web component which is initially hidden and fades in once the spritesheet image finishes loading in an attempt to smooth things out a little bit, but it still leaves me wanting. I could probably adjust the component further to display a small placeholder frame while we wait for the larger spritesheet to finish loading, but it hasn’t felt necessary yet thankfully.

Overall, even with these downsides, I am still extremely pleased with how well this works and would highly recommend it to others. CSS is becoming extremely powerful, and it feels like this is only just beginning!


Addenda

Some additional content which I couldn’t find room for in the main article, but still wanted to touch on.

A1: Spritesheet asset prep tips and tools

I’ve learned a lot in the process of building and continuing to refine this project, so I wanted to share some of the tips I’ve come up with and helpful scripts I’ve written along the way.

1. You may not need as many frames as you think

This was a game-changing realization for me. Jess’ animation was initially delivered to me as 127 frames; they had just exported a 4.2-second-long clip at 30fps, which seems like a natural thing to do. However, it’s important to remember that when an animation is scroll-driven, it’s not going to play the same way.

I’m going to drop a classic programmer “it depends™” here, because every animation is probably going to need a slightly different treatment.

  1. How small/subtle are the movements in your animation?

    The less movement there is between each individual frame, the more frames you can probably afford to cut down.

  2. Consider your pixel:frame ratio.

    For Jess’ site, we wanted the animation to play over the course of scrolling a very small distance; I ended up measuring it to be from 0 to approximately 140 pixels down the page. Nobody can scroll less than one pixel at a time, and I would argue almost nobody is going to painstakingly scroll down your page one single pixel at a time. In my opinion, if you have more than one frame for every 2 pixels of scrollable area, you have too many frames.

I found that cutting the frame count down by half from 127 to 64 still felt just as smooth as before, but with obviously massive improvements in file size.

I probably could have gone even further; the AirPods product page I’ve been referencing is way more aggressive than I would ever dare, as they’re stretching 65 frames over a whopping 1200 pixels for a ~18.5 pixel:frame ratio. When I pay attention, especially when scrolling slowly, I can see the jitter, but it’s not as bad as I would have thought.
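The pixel:frame rule of thumb is easy to turn into a quick calculation. These helpers are hypothetical and just restate the arithmetic from this section; the 2 pixel minimum is my own opinion from above, not a hard limit.

```javascript
// How many pixels of scrolling each frame gets on average
function pixelsPerFrame(scrollDistancePx, frameCount) {
  return scrollDistancePx / frameCount;
}

// The most frames worth shipping for a given scroll distance,
// assuming each frame should get at least `minPixelsPerFrame` pixels
function maxUsefulFrames(scrollDistancePx, minPixelsPerFrame = 2) {
  return Math.max(1, Math.floor(scrollDistancePx / minPixelsPerFrame));
}

// Keep every `step`-th frame, always starting with the first one
function thinFrames(frames, step = 2) {
  return frames.filter((_, index) => index % step === 0);
}

// Keeping every other frame of a 127-frame export leaves 64 frames,
// which over 140 pixels of scrolling is about 2.2 pixels per frame.
const kept = thinFrames(Array.from({ length: 127 }, (_, i) => i));
const ratio = pixelsPerFrame(140, kept.length);
```

By the same math, a 140 pixel scroll distance tops out at 70 frames under the 2 pixel rule, so 64 frames sits just inside the budget.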

2. Trimming white space around frames

Every frame image that I received had some extra transparent padding around it. WebP is good at compressing these blank pixels so that they had basically no effect on file size, but it did make it difficult for me to position the image sequence exactly how I wanted on the page.

I ended up writing a script which takes the path to a directory of frame images and uses sharp to loop over every image, determine the closest dimensions you can crop in to without cutting off content on any frame, and then applies that crop.

Image cropping script source
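The core of that idea is finding the smallest box that contains the visible content of every frame. Here’s a minimal sketch of just that step, not the actual script; `sharedCropBox` is a hypothetical name, and it assumes you have already measured each frame’s non-transparent bounds with your image library of choice. The `{ left, top, width, height }` shape is the same one sharp’s `extract()` accepts.

```javascript
// Given one content bounding box per frame, returns the smallest region
// that can be cropped from every frame without cutting off any content.
function sharedCropBox(contentBoxes) {
  if (contentBoxes.length === 0) return null;

  let left = Infinity;
  let top = Infinity;
  let right = -Infinity;
  let bottom = -Infinity;

  for (const box of contentBoxes) {
    left = Math.min(left, box.left);
    top = Math.min(top, box.top);
    right = Math.max(right, box.left + box.width);
    bottom = Math.max(bottom, box.top + box.height);
  }

  return { left, top, width: right - left, height: bottom - top };
}
```

Applying the same box to every frame matters: cropping each frame to its own bounds would make the frames different sizes and the subject would appear to jump around.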

3. Building the spritesheet

I also wrote a script which takes the path to a directory of frame images and uses sharp to resize and pack them into a single WebP spritesheet image file. It also outputs a helpful metadata file containing all of the relevant information which is needed to make the animation work.

Spritesheet generation script source
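To give a sense of what the layout step involves, here’s a sketch of one way to plan the grid. It is not the actual script: `planSpritesheet` and `cellPosition` are hypothetical names, and the strategy of fitting as many columns as possible within WebP’s maximum dimension of 16383 pixels is an assumption, although it does happen to reproduce the 11×6 and 63×2 grids that appear in this article’s markup.

```javascript
// WebP images cannot exceed 16383 pixels in either dimension
const WEBP_MAX_DIMENSION = 16383;

// Decide how many columns and rows the grid needs.
// Note: this sketch does not check that the sheet height also fits.
function planSpritesheet(frameCount, cellWidth, cellHeight, maxSize = WEBP_MAX_DIMENSION) {
  const columnsThatFit = Math.floor(maxSize / cellWidth);
  const columnCount = Math.max(1, Math.min(frameCount, columnsThatFit));
  const rowCount = Math.ceil(frameCount / columnCount);
  return {
    columnCount,
    rowCount,
    sheetWidth: columnCount * cellWidth,
    sheetHeight: rowCount * cellHeight,
  };
}

// Pixel offset of a zero-based frame within the sheet, filling the
// grid left to right, then top to bottom (the order the CSS expects)
function cellPosition(frameIndex, columnCount, cellWidth, cellHeight) {
  return {
    left: (frameIndex % columnCount) * cellWidth,
    top: Math.floor(frameIndex / columnCount) * cellHeight,
  };
}
```

The column and row counts, the frame count, and the cell aspect ratio are exactly the values the CSS needs, which is why it’s handy to write them out as metadata alongside the image.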

4. My <scroll-progress-region> web component

I glossed over how I’m tracking scroll position. I created a <scroll-progress-region> web component which tracks its scroll position on the page and provides a --scroll-pct CSS variable which its children can use. This is awesome because I’m then able to stay in CSS land to translate that --scroll-pct into my image sequence’s --progress variable.

I spent a decent amount of time toying around and doing math to figure out the exact formula I wanted for that translation; this is where it ended up:

--progress: calc((var(--scroll-pct) * -1.05) + 1.05);

<scroll-progress-region> component source
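Plugging a few values into that formula shows what it does. This sketch assumes --scroll-pct runs from 0 to 1 across the region; `progressFromScrollPct` and `clampedProgress` are hypothetical helpers that mirror the `calc()` expression and the `clamp()` in the spritesheet CSS.

```javascript
// Same arithmetic as: calc((var(--scroll-pct) * -1.05) + 1.05)
function progressFromScrollPct(scrollPct) {
  return scrollPct * -1.05 + 1.05;
}

// The spritesheet CSS clamps --progress to the 0-1 range before using it
function clampedProgress(scrollPct) {
  return Math.min(Math.max(progressFromScrollPct(scrollPct), 0), 1);
}

// scrollPct 0    -> raw 1.05, clamped to 1 (last frame)
// scrollPct 0.5  -> roughly 0.525
// scrollPct 1    -> 0 (first frame)
const samples = [0, 0.5, 1].map(clampedProgress);
```

In other words, progress runs in the opposite direction to --scroll-pct, and the extra 0.05 means the animation holds on its final frame for roughly the first 5% of the region before it starts to move.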

5. My <img-sequence> WebC component

Jess’ site is written using 11ty and WebC, so I wrote an img-sequence WebC component which takes the path to a spritesheet file, loads everything in, and writes the HTML for the image sequence exactly how I need it.

<img-sequence @spritesheet="src/assets/path/to/spritesheet.webp"></img-sequence>

<!-- output -->
<img-sequence style="--spritesheet-url: url('/img/jTOLRJLwjR-16128.webp'); --sprite-count: 64; --column-count: 63; --row-count: 2; aspect-ratio: 256/510">
  <img
    src="/img/jTOLRJLwjR-16128.webp"
    alt=""
    fetchpriority="high"
    class="spritesheet"
  >
</img-sequence>

WebC is pretty niche, but this may still serve as a good example that you can transfer to whatever your site is built with!

WebC <img-sequence> component source

A2: So… why can’t you use video files again?

I glossed over this question earlier, but I wanted to spend a little more time talking through why image sequences are the de facto way to implement these animations for anyone curious enough. It definitely wasn’t obvious to me at first!

All of this image stuff seems like a lot of effort, and if serving an MP4 file would be so much more efficient than a bunch of individual image frames, why not do that?

From my research, it seems the answer is that you technically can use video. The simplest implementation of this would be something like:

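// Assumes the page contains a single <video> element
const video = document.querySelector("video");
window.addEventListener("scroll", () => {
  // The duration is NaN until the video's metadata has loaded
  if (!Number.isFinite(video.duration)) return;
  // Sets the video's time to a percent of the duration relative
  // to the percent of the window height that the user has scrolled
  video.currentTime = video.duration * window.scrollY / window.innerHeight;
}, { passive: true });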

This will work… okay. However, there are some tradeoffs to this approach worth considering:

1. Transparency

Let’s get the big one out of the way first: the most common and well-supported video format, h.264/mp4, doesn’t support transparency. That’s a pretty common requirement for these types of effects, so that alone may be a showstopper.

More modern formats like WebM and the h.265 codec do support transparency, and browser support for those formats appears to be pretty good at this point, but this article covering how to wrangle all of the browsers into supporting video transparency does not make me feel good.

2. Rapidly scrubbing back and forth through a video is not the best way to use it

DISCLAIMER: I am not a video encoding expert and will probably get specifics wrong on this point. But I think I’m mostly right here.

This is an over-simplification, but video codecs like h.264 use techniques to save space by only encoding data describing the changes between each frame instead of having to encode the full image of every individual frame. This is great because it means you can skip encoding a lot of redundant information, and it’s still extremely fast to render during forward playback because advancing to the next frame is as simple as applying that frame’s changes to the current frame that you already have loaded.

However, now let’s imagine that you’re not performing forward playback. If you drop into a random point in the video, you now have to go back and look at the previous frames leading up to the current one to construct what the current frame should look like. That’s a lot more work!

This means that although rapidly scrubbing through a video’s timeline may work, the user’s device is going to be putting in a ton more work decoding frames from scratch up to 60 times per second as the user scrolls.
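To put a rough number on that, here’s a deliberately simplified model. It assumes the video has a full keyframe every `keyframeInterval` frames and that every other frame depends only on the frames before it; real encoders are more complicated than this, and `framesToDecode` is a hypothetical helper, not part of any video API.

```javascript
// Simplified model: to display `targetFrame` from a cold start, the decoder
// has to begin at the most recent keyframe and decode forward from there.
function framesToDecode(targetFrame, keyframeInterval) {
  return (targetFrame % keyframeInterval) + 1;
}

// With a keyframe every 30 frames:
// frame 30 is itself a keyframe, so only 1 frame needs decoding,
// but frame 59 sits at the end of a group and needs all 30.
const cheapSeek = framesToDecode(30, 30);
const expensiveSeek = framesToDecode(59, 30);
```

Under this model the cost of a seek depends entirely on where it lands, which is a very different profile from an image sequence where every frame costs the same to show.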

I haven’t done enough profiling to confidently say that this is a significant issue, but it’s something that I do think is worth considering.