Having fun with modern web APIs


One of the many reasons why I wanted to start blogging again was to explore modern browser APIs. Technically I work as a full stack developer for my day job, but I usually spend most of my time on the backend, making sure Apache Superset can integrate properly and securely with databases, semantic layers, and APIs. And I'm not complaining, because when I do have to work on the frontend I realize how much it has changed since I first learned HTML in 1995 — with layers and layers of abstractions, dependencies, and transformations.

Because of that, I wanted an excuse to build a website where I could try to make frontend development fun. An epiphany I had about web development last year is that no one wants APIs: we want synchronized state. I might think I want to write an elegant RESTful API, using proper methods and statuses. But what I really want, at the end of the day, is to make sure that my blog and the people visiting it see the same data: when a new entry is created, guests should see it in their browsers. If they like or reply to a post, that reaction should be stored in my server database.

This brings in the first modern browser feature I wanted to learn: WebAssembly (wasm):

a portable binary-code format [...] for executable programs [.]

These days it's possible to compile programs into wasm, allowing them to run at almost native speed in the browser. One of the programs that has been compiled this way is SQLite, a lightweight database. Using sql-wasm.js, it's possible to have a database that runs in the browser itself.
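To make this concrete, here is a minimal sketch of what running SQLite in the page looks like with sql.js (the library that ships sql-wasm.js). The file path, table schema, and function names are my own illustrative choices, not the blog's actual code:

```javascript
// Sketch: an in-browser SQLite database via sql.js.
// initSqlJs is provided by the sql.js script; the path given to
// locateFile (where sql-wasm.wasm is served from) is an assumption.
async function openDatabase() {
  const SQL = await initSqlJs({
    locateFile: (file) => `/js/${file}`,
  });
  const db = new SQL.Database();
  db.run(
    "CREATE TABLE IF NOT EXISTS entries (id TEXT PRIMARY KEY, title TEXT, body TEXT)"
  );
  return db;
}

// Queries run entirely in the page — no server round-trip:
function searchEntries(db, term) {
  const stmt = db.prepare("SELECT id, title FROM entries WHERE body LIKE ?");
  stmt.bind([`%${term}%`]);
  const rows = [];
  while (stmt.step()) rows.push(stmt.getAsObject());
  stmt.free();
  return rows;
}
```

Once the database object exists, everything the page shows can be answered from it with ordinary SQL.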

When someone visits my blog, their browser receives a copy of all the entries in my database that they would have access to normally — public entries that are not in draft mode, and any unlisted entries that match the current URL. And as long as Javascript is enabled, all the entries they see will come from that database: when they scroll the main feed, when they search for something, when they follow an internal link. (The site still works without Javascript!)

If the guest performs a reaction, either by liking or replying to a post, that reaction is stored as a new entry in the browser database, with its author set to the guest (guests need to be logged in to react).

But if this database is running on the guest browser, how can it be persisted?

There are two different ways. First, it's persisted in the browser itself, in persistent storage called IndexedDB. This means the guest can close their browser or shut off their computer; when they visit my site again their browser will load the database from IndexedDB, and my server will send them any new entries since they last visited.
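The browser-side persistence can be sketched like this. `db.export()` is sql.js's real API and returns the whole database as a `Uint8Array`; the IndexedDB database and store names are illustrative assumptions:

```javascript
// Sketch: snapshot the sql.js database into IndexedDB.
// Names ("blog", "snapshots", "latest") are assumptions.
function openStore() {
  return new Promise((resolve, reject) => {
    const req = indexedDB.open("blog", 1);
    req.onupgradeneeded = () => req.result.createObjectStore("snapshots");
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

async function persist(db) {
  const idb = await openStore();
  const bytes = db.export(); // Uint8Array snapshot of the whole database
  return new Promise((resolve, reject) => {
    const tx = idb.transaction("snapshots", "readwrite");
    tx.objectStore("snapshots").put(bytes, "latest");
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}
```

On the next visit, reading the bytes back and passing them to `new SQL.Database(bytes)` restores the exact same database.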

Second, the changes are also persisted on the backend; otherwise no one would ever see likes and replies from guests. So the same process that sends the entries to the guest also receives their modifications to the entries, with one condition: the author of those entries has to match the logged-in user. Otherwise, a malicious user would be able to modify my entries, or someone else's.

This bidirectional synchronization happens over another technology that was new to me: WebSockets. A WebSocket is a connection between the browser and the web server that is different from the ones a browser normally makes. In a traditional exchange the browser opens a connection to the web server, sends a request, receives a response, and closes the connection. A WebSocket, on the other hand, provides full-duplex communication over a connection that can remain open for a long time.
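The sync channel can be sketched in a few lines. The endpoint URL and message shapes are hypothetical; the point is that one long-lived socket carries changes in both directions:

```javascript
// Sketch: a bidirectional sync channel over a WebSocket.
// "wss://example.com/sync" and the JSON message format are assumptions.
function connectSync(applyRemoteChange) {
  const ws = new WebSocket("wss://example.com/sync");

  // Server pushes new or updated entries: write them to the local database.
  ws.onmessage = (event) => {
    applyRemoteChange(JSON.parse(event.data));
  };

  return {
    // Local writes (likes, replies) flow back over the same connection.
    push(change) {
      ws.send(JSON.stringify(change));
    },
  };
}
```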

That is the basic architecture of my site: the frontend has a SQLite database that is kept in sync with the backend. The beauty of this architecture is that the frontend code doesn't have to deal with network connections. There are no endpoints for it to connect to; no serialization of objects using JSON; no retries with exponential backoff. All the frontend has to do is write to a database.

A guest liked a post? Write to the database. They changed their mind and unliked it? Write again to the database, marking the initial entry as deleted. The guest searched for something? Run a query on the database.

The network is out of the picture. This not only makes frontend development simpler, but also means that my website is offline-first. Once a guest has loaded the blog for the first time, they can turn off their internet or go into the woods, and read every post that I've ever published. They can write long elaborate replies, and when they connect their computer back to the internet their database and mine will sync. (To prevent collisions during the sync, my blog uses a simple conflict resolution algorithm called hybrid logical clocks.)
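Hybrid logical clocks are simple enough to sketch directly. Each timestamp combines physical time with a logical counter, so writes can be totally ordered even when wall clocks drift or several writes land in the same millisecond. This is a minimal illustrative implementation, not the blog's actual one:

```javascript
// Sketch of a hybrid logical clock. The injectable `now` makes it testable.
function createHLC(now = () => Date.now()) {
  let millis = 0;
  let counter = 0;
  return {
    // Timestamp a local event.
    tick() {
      const wall = now();
      if (wall > millis) { millis = wall; counter = 0; }
      else counter += 1;
      return { millis, counter };
    },
    // Merge a timestamp received from a remote peer, so our next
    // timestamp is guaranteed to sort after everything seen so far.
    receive(remote) {
      const wall = now();
      if (wall > millis && wall > remote.millis) { millis = wall; counter = 0; }
      else if (remote.millis > millis) { millis = remote.millis; counter = remote.counter + 1; }
      else if (remote.millis === millis) { counter = Math.max(counter, remote.counter) + 1; }
      else counter += 1;
      return { millis, counter };
    },
  };
}
```

Comparing two timestamps is then just lexicographic comparison on `(millis, counter)`, which gives every write a consistent order on both sides of the sync.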

In order to make my blog truly offline-first I rely on other modern APIs. My website runs service workers, which are special Javascript scripts that intercept network requests, cache resources, and serve the cached content. They run in the background, even when the browser tabs are closed! Together with a manifest.json file, they allow my site to be installed as a Progressive Web App (PWA) and work offline.
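A cache-first strategy inside the service worker can be sketched as follows. The cache name is an assumption, and the real worker likely distinguishes between asset types:

```javascript
// Sketch: cache-first fetch handling for a service worker.
const CACHE = "blog-static-v1"; // cache name is illustrative

async function cacheFirst(request) {
  // Serve from the cache when possible; fall back to the network
  // and remember the response for next time.
  const cached = await caches.match(request);
  if (cached) return cached;
  const response = await fetch(request);
  if (response.ok) {
    const cache = await caches.open(CACHE);
    cache.put(request, response.clone());
  }
  return response;
}

// Inside the worker, every request is routed through the cache:
// self.addEventListener("fetch", (e) => e.respondWith(cacheFirst(e.request)));
```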

There are more APIs that improve the PWA experience, making my site work and behave like a native application:

  • The Notification API allows me to receive notifications when someone leaves a reaction to any of my entries. It shows up on my phone just like any other notification, and when I click it, it opens the corresponding page.

  • The Badging API allows those notifications to be shown as a badge over the app icon, with the count of unread notifications.

  • The Media Session API allows me and guests to interact with music playing on my blog using native controls. On my phone, a playing song shows up as a media notification, and I can play, pause, or adjust the volume. On macOS I can use the media keys, and see the name of the song in the top right corner. This is important to me because I want to use my blog to share the music I make.
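The three APIs above are small enough to sketch together. The titles, handlers, and function names are illustrative, but the calls themselves (`showNotification`, `setAppBadge`, `MediaMetadata`, `setActionHandler`) are the standard APIs:

```javascript
// Sketch: Notification + Badging APIs for reactions.
// `registration` is a ServiceWorkerRegistration with permission granted.
function notifyReaction(registration, entryTitle, unreadCount) {
  registration.showNotification("New reaction", { body: entryTitle });
  // Mirror the unread count on the installed app's icon, where supported.
  if ("setAppBadge" in navigator) navigator.setAppBadge(unreadCount);
}

// Sketch: Media Session API, surfacing a playing song in the OS controls.
function setupMediaSession(audio, title) {
  navigator.mediaSession.metadata = new MediaMetadata({ title });
  navigator.mediaSession.setActionHandler("play", () => audio.play());
  navigator.mediaSession.setActionHandler("pause", () => audio.pause());
}
```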

And speaking of music, this brings me to one more API, one of my favorites: the Web Audio API. When I post a song like this one, the audio can be played using a native control styled with CSS. Above the control is a visualizer — press play and you should see a spectrogram of the song, animated in real time. Click the spectrogram and it will cycle through all the visualizers available (as of right now there are three: the spectrogram, a waveform, and an abstract space-travel visualizer).

To generate the visualizer I use the Web Audio API to analyze the song in real time, including spectral analysis to compute the song's spectrum. That information is then passed to the visualizer, so that it can choose how to render it. The data is the same, but different visualizers render it differently.
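The analysis side can be sketched with an `AnalyserNode`, which runs an FFT on the playing audio; each animation frame then reads the current spectrum and waveform into typed arrays. The FFT size and function name are my assumptions:

```javascript
// Sketch: real-time spectral analysis with the Web Audio API.
function createAnalyser(audioElement) {
  const ctx = new AudioContext();
  const source = ctx.createMediaElementSource(audioElement);
  const analyser = ctx.createAnalyser();
  analyser.fftSize = 256; // yields 128 frequency bins (illustrative choice)
  source.connect(analyser);
  analyser.connect(ctx.destination); // keep the audio audible

  const spectrum = new Uint8Array(analyser.frequencyBinCount);
  const waveform = new Uint8Array(analyser.fftSize);

  // Call once per animation frame to get fresh data for the visualizer.
  return function sample() {
    analyser.getByteFrequencyData(spectrum);
    analyser.getByteTimeDomainData(waveform);
    return { spectrum, waveform };
  };
}
```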

Now, the visualizer could be implemented in many different ways. The easiest one would probably be a <canvas> element, using Javascript to draw over it. And that's what I did, but with a twist. The visualizer is actually an embedded mini computer.

Remember when I talked about WebAssembly, at the beginning of the post? Well, the visualizer runs a mini virtual computer called uxn, implemented in wasm.

Now, uxn deserves a whole post of its own, because the concept and the rationale behind it are really cool and important from a sustainability point of view. But all you need to know is that it's a virtual computer, embedded into the entry. My site uses Javascript to write into the computer's memory the values it computes in real time from the audio stream: the overall loudness, the waveform data, the spectrum, and a frame counter.
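Conceptually, the Javascript side acts as an Audio "device" the uxn program can read from. The port numbers here mirror the `|a0 @Audio` layout declared in the uxntal listing; how the wasm core actually exposes device reads and writes is an assumption about the emulator glue:

```javascript
// Sketch: a uxn Audio device backed by the Web Audio analyser data.
// Ports follow the |a0 layout: fft-idx=0xa2, fft-val=0xa3,
// wave-idx=0xa6, wave-val=0xa7, level=0xa8.
function makeAudioDevice() {
  const data = {
    level: 0,
    spectrum: new Uint8Array(128),
    waveform: new Uint8Array(128),
  };
  let fftIdx = 0;
  let waveIdx = 0;
  return {
    // Called once per animation frame with fresh analyser data.
    update(level, spectrum, waveform) {
      data.level = level;
      data.spectrum = spectrum;
      data.waveform = waveform;
    },
    // uxn writes the index it wants to sample (DEO).
    deo(port, value) {
      if (port === 0xa2) fftIdx = value;
      if (port === 0xa6) waveIdx = value;
    },
    // uxn reads the sample back (DEI).
    dei(port) {
      if (port === 0xa3) return data.spectrum[fftIdx];
      if (port === 0xa7) return data.waveform[waveIdx];
      if (port === 0xa8) return data.level;
      return 0;
    },
  };
}
```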

Armed with this information, each visualizer is a program written in an Assembly-like language called uxntal. This is what the waveform visualizer looks like:

( Waveform visualizer )
( Draws the audio waveform across the screen )

|00 @System &vector $2 &pad $6 &r $2 &g $2 &b $2
|20 @Screen &vector $2 &width $2 &height $2 &auto $1 &pad $1 &x $2 &y $2 &addr $2 &pixel $1 &sprite $1
|a0 @Audio &fft-count $2 &fft-idx $1 &fft-val $1 &wave-count $2 &wave-idx $1 &wave-val $1 &level $1

( constants )
%SCREEN-WIDTH { #0100 }
%CENTER-Y { #24 }
%SAMPLE-COUNT { #80 }

( variables )
|b0 @i $1

|0100
    ;on-frame .Screen/vector DEO2
BRK

@on-frame ( -> )
    ( clear screen )
    #0000 .Screen/x DEO2
    #0000 .Screen/y DEO2
    #80 .Screen/pixel DEO

    ( draw center line at y=36 )
    #00 CENTER-Y .Screen/y DEO2
    [ LIT2 01 -Screen/auto ] DEO
    #0000 .Screen/x DEO2
    LIT2r 0000
    &center
        #01 .Screen/pixel DEO
        INC2r GTH2kr SCREEN-WIDTH LTH2 ?&center
    POP2r
    #00 .Screen/auto DEO

    ( draw waveform - 128 samples stretched to 256 pixels )
    #00 .i STZ

    &loop
        ( x = sample * 2 )
        .i LDZ #00 SWP #10 SFT2 .Screen/x DEO2

        ( get waveform sample )
        .i LDZ .Audio/wave-idx DEO
        .Audio/wave-val DEI

        ( y = 4 + val/4 maps 0-255 to y=4-67 )
        #02 SFT #04 ADD
        #00 SWP .Screen/y DEO2

        ( draw pixel )
        #02 .Screen/pixel DEO

        ( draw second pixel at x+1 for thickness )
        .Screen/x DEI2 INC2 .Screen/x DEO2
        #02 .Screen/pixel DEO

        ( next sample )
        .i LDZ INC .i STZ
        .i LDZ SAMPLE-COUNT LTH ?&loop

BRK

Now, isn't this amazing?

When you play one of my songs on my blog, what you're seeing is a mini virtual computer embedded into the page. Running custom Assembly code. Rendering data that is being processed by Javascript, at 60 frames per second. And you can pause the music at any time by pressing a key on your keyboard.

I love how these APIs all work so well together.

Now, I don't think web development needs to be this complicated for it to be fun again. At the end of the day what matters is content and connections; protocols and APIs are always going to be secondary. But for me, being able to learn new technologies and bring all of them together like this — WebAssembly, Web Audio, Media Session — that makes it much more joyful.