Cracking the Pre-CS5 Binary FLA — Ed Moore


Reverse-Engineering Adobe Flash's Forgotten Source Format

Or: how I learned to stop worrying and love MFC's CArchive.


When Adobe officially sunset Flash Player on December 31, 2020, the world lost a plugin. What got less attention was that the source files had already been stranded a decade earlier.

Every version of Flash Professional up to CS4 (released September 2008) saved its source documents, the .fla files, in a proprietary binary format built on Microsoft's OLE2 Compound Document container. That format was in continuous use from around 1996 through 2010, meaning something like 15 years of Flash source material sits inside those containers. Game jams, art portfolios, commercial animations, student projects, educational material, millions of files.

Flash Professional CS5 (April 2010) changed direction. CS5 introduced XFL, a clean zip-packaged XML format documented in an Adobe white paper and publicly understood. Everything since, including CS5.5, CS6, Creative Cloud, and Animate CC, uses XFL. Current Animate CC can still import an old binary FLA through a one-way migration, but only as long as you have Adobe's current subscription, the current version works on your platform, and the binary FLA hasn't tripped one of the regression bugs that periodically reappear (Animate 22.x and 23.0.2 both broke binary import at various points, each time quietly fixed in a point release).

If you don't have Adobe's tooling, because you're on Linux, or building an open-source tool, or preserving files ten years after Adobe's servers stopped hosting old installers, the pre-CS5 binary FLA is a hard wall. The JPEXS Free Flash Decompiler, the de facto standard for Flash reverse engineering, puts it bluntly in their docs: "no one knows the meaning of every byte."

This post is about figuring out what those bytes mean.


Why this was hard

The binary FLA is genuinely undocumented anywhere public. Here's what you can find, and what you can't.

What you can find, with enough digging:

  • Flash files use Microsoft's OLE2 / Compound File Binary as their outer container. That part has a 250-page spec.
  • The streams inside the container are named things like Contents, Page 1..N, Symbol N, Media N. You can list them in 3 lines of olefile.
  • Flash's internal C++ classes (CPicPage, CPicLayer, CPicFrame, CPicShape, CMediaSound) are baked into stream headers as ASCII class-name strings. You can grep them out of any FLA.

What nobody had written down:

  • How those classes are serialized. What schema fields each carries. The rules for little-endian length-prefixed strings. Where fill colors live. How shape geometry is encoded. What the coordinate unit is. Any of it.

JPEXS, to their enormous credit, can write a binary FLA (their SWF decompiler can export to the old format for round-tripping into legacy authoring tools). But their writer generates a skeletal output designed to reopen in Flash and fill itself in; it's not a specification. And reading binary FLA was explicitly not supported because, again, nobody had figured out the shape geometry.

I looked at every related project I could find, and want to call them all out specifically because each one did an enormous amount of work in its own corner of the ecosystem and this post absolutely builds on top of them:

Project | What it does | Helps with binary FLA?
--- | --- | ---
JPEXS FFDec | Full SWF decompile + SWF→XFL export, export-only binary-FLA writer | No for reading. Single best Flash tool in existence.
lifeart/fla-viewer | Browser-based FLA renderer | No. XFL (CS5+) only.
Ruffle | Rust-based SWF runtime, actively maintained | No. SWF only, but plays it beautifully.
swftools / swfmill / flasm | SWF round-trip tools | No. SWF only.
Kaitai Struct format gallery | Declarative binary parsers | Has SWF, not FLA.
jmendeth's "Reverse-engineering Flash" | ABC bytecode disassembly | No. SWF-internal only, but a great read.
Sothink SWF Decompiler | Commercial SWF tool | SWF→FLA conversion only.
Library of Congress FDD entry | Format registry | Describes FLA at a "this is an OLE2 file" level. No byte details.

Every major tool is a SWF tool. Binary FLA reading is an open problem.


Attempt 1: Read the bytes and hope

Quick disclaimer before going any further: I'm a frontend developer by trade, not a reverse engineer. I did start my career as a Flash developer back in the early versions (MX era), which is mostly why this problem caught my attention in the first place, but low-level binary format work is not something I normally do. Everything that follows was driven in tandem with Anthropic's Claude Code acting as a pair-programmer, which is worth stating up front because it shaped how the investigation proceeded. More on that below.

The most obvious starting point: dump streams with olefile, look at the bytes, see what falls out.

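import olefile

# Print every stream's path, size, and first 32 bytes as hex.
with olefile.OleFileIO('example.fla') as ole:
    for stream in ole.listdir(streams=True):
        name = '/'.join(stream)              # listdir() returns path components
        data = ole.openstream(name).read()
        print(name, len(data), data[:32].hex())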

You immediately see:

  • Contents has lots of ASCII strings: CMediaSound, CQTAudioSettings, filenames like "sound.wav", publish-settings keys.
  • Each Symbol N stream starts with 01 FF FF 01 00 08 00 "CPicPage" 02 00 FF FF 01 00 09 00 "CPicLayer" 02 00 FF FF 01 00 09 00 "CPicFrame" 02 00 FF FF 01 00 09 00 "CPicShape". The class names are right there as plain ASCII.
  • After the class declarations, the bytes become opaque binary.

The recurring FF FF 01 00 <nameLen> 00 <className> pattern is recognizable as a class-tagged serialization scheme, specifically matching Microsoft Foundation Classes' CArchive::WriteObject output. Flash was built with MFC; this is MFC's format.

That lets you decode:

  • Class-definition tag: FFFF followed by u16 schema, u16 nameLen, nameLen ASCII bytes.
  • Back-reference tag: a little-endian u16 with bit 15 set (on disk, the bytes <idx> 80) means "another instance of previously-declared class idx".
  • Null object: 0000.

And length-prefixed Unicode strings in the FLA follow a distinctive FF FE FF <u8 charLen> <charLen * 2 bytes of UTF-16LE> pattern, which is MFC CStringW's internal wire format.
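A minimal sketch of reading one such string, assuming only the short form shown above (a one-byte character count). The function name read_mfc_wstring is made up for illustration, and strings too long for a one-byte count are deliberately not handled.

```python
def read_mfc_wstring(buf: bytes, pos: int = 0):
    """Decode FF FE FF <u8 charLen> <UTF-16LE> at pos; return (text, next_pos)."""
    if buf[pos:pos + 3] != b"\xff\xfe\xff":
        raise ValueError(f"no string marker at offset {pos}")
    char_len = buf[pos + 3]          # length is in characters, not bytes
    start = pos + 4
    end = start + 2 * char_len       # two bytes per UTF-16 code unit
    if end > len(buf):
        raise ValueError("string runs past the end of the buffer")
    return buf[start:end].decode("utf-16-le"), end


sample = b"\xff\xfe\xff\x07" + "Layer 1".encode("utf-16-le") + b"\x00"
text, next_pos = read_mfc_wstring(sample)
```

Returning the next offset alongside the text keeps the function usable inside a larger hand-rolled stream parser.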

Good start, but all you've got so far is the class hierarchy. The actual data inside the classes, the shape geometry, fill colors, matrices, is still opaque bytes.


Attempt 2: Pretend it's SWF and hope

A reasonable hypothesis: Flash stores the shape geometry the same way its compiled output does. SWF shape records (DefineShape, DefineShape2, DefineShape3) use a bit-packed edge encoding: one bit per record for "edge vs. style change," then variable-width nBits deltas, terminated by a six-zero-bit end marker.

So I wrote a bit reader. MSB-first, the way SWF works. I tried it at every plausible offset inside a shape body, every bit shift. The output was uniformly garbage: deltas of around −60,000 pixels, records that never terminated, impossible style indices.
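For reference, an MSB-first bit reader of the kind described here can be sketched as follows. This is an illustration, not the code used in the investigation; the class and method names are assumptions. Tracking an absolute bit offset is what makes "every bit shift" cheap to try.

```python
class BitReader:
    """MSB-first bit reader, the bit order SWF shape records use."""

    def __init__(self, data: bytes, bit_pos: int = 0):
        self.data = data
        self.bit_pos = bit_pos       # absolute bit offset into data

    def read_ubits(self, n: int) -> int:
        """Read n bits as an unsigned integer."""
        value = 0
        for _ in range(n):
            byte = self.data[self.bit_pos >> 3]
            bit = (byte >> (7 - (self.bit_pos & 7))) & 1   # highest bit first
            value = (value << 1) | bit
            self.bit_pos += 1
        return value

    def read_sbits(self, n: int) -> int:
        """Read n bits as a two's-complement signed integer."""
        value = self.read_ubits(n)
        if n and value & (1 << (n - 1)):
            value -= 1 << n
        return value


reader = BitReader(bytes([0b10111100]))
fields = [reader.read_ubits(1), reader.read_ubits(3), reader.read_sbits(4)]
```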

As a sanity check, I extracted actual bytes from a DefineShape tag inside a known SWF and searched for them in the corresponding FLA. Zero matches. Not even the 8-byte FillStyleCount + FillStyle + LineStyleCount + NumBits prefix. Flash's internal FLA format is not a SWF record. It's something else.

This was the point where I seriously considered giving up.


Attempt 3: Find the XFL equivalent and diff

Clever idea: use JPEXS to generate an XFL project from a SWF, find a simple shape in the XFL XML where I know the twip coordinates exactly, and search for those coordinates as little-endian s16 or s32 in the binary FLA. Something has to match.

The XFL shape was a roughly 53×14-pixel rectangle with corners at twips (492, −166), (492, 106), (−578, 106), (−578, −166). Four edges, trivial shape. I searched the FLA stream for all four coordinates encoded as s16_le, s16_be, s32_le, s32_be, 16.16 fixed-point, IEEE 754 float, and every other encoding I could think of.

Zero full-tuple matches. Individual coordinates appeared as byte substrings (because they're small numbers, appearing in many contexts), but never as the adjacent four-coordinate sequence that would indicate the shape itself.
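The full-tuple search can be sketched like this. It is a simplified reconstruction rather than the script actually used: the function names are hypothetical, and only the encodings named above are tried, each as one contiguous run of x, y values.

```python
import struct


def candidate_encodings(points):
    """Yield (label, bytes) for a run of adjacent (x, y) points in several encodings."""
    flat = [v for point in points for v in point]
    n = len(flat)
    yield "s16_le", struct.pack(f"<{n}h", *flat)
    yield "s16_be", struct.pack(f">{n}h", *flat)
    yield "s32_le", struct.pack(f"<{n}i", *flat)
    yield "s32_be", struct.pack(f">{n}i", *flat)
    yield "fixed16.16_le", struct.pack(f"<{n}i", *(v << 16 for v in flat))
    yield "float32_le", struct.pack(f"<{n}f", *map(float, flat))


def find_tuple(haystack: bytes, points):
    """Return every (encoding label, offset) where the whole tuple appears."""
    hits = []
    for label, needle in candidate_encodings(points):
        offset = haystack.find(needle)
        while offset != -1:
            hits.append((label, offset))
            offset = haystack.find(needle, offset + 1)
    return hits


rect = [(492, -166), (492, 106), (-578, 106), (-578, -166)]
```

Searching for the whole run at once is what separates a real hit from the constant single-coordinate noise: an empty result from find_tuple(stream_bytes, rect) is the "zero full-tuple matches" outcome.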

Combined with the SWF failure, this all but proved the coordinates were either bit-packed (with dynamically sized fields per record) or stored in some offset/scaled encoding I wasn't seeing. Staring at the bytes wasn't going to solve this.


Attempt 4: Give up on inference and read the source code

Or rather, read the compiled source code. Flash Professional 8 (2005) is on archive.org. Flash 8's flash.exe is a 17 MB Win32 PE executable. NSA's Ghidra decompiler is free and runs on macOS. Open-source C++ decompiler, closed-source Windows binary: that's a well-worn road.

The setup took an afternoon:

  1. Download the Flash Professional 8 installer from archive.org.
  2. Unpack with 7-Zip (InstallShield SFX → Data1.cab → flash.exe).
  3. Install Ghidra 12.0.x + OpenJDK 21.
  4. Load flash.exe into a Ghidra project, run auto-analysis (a 25-minute coffee break).

Then the fun starts.

The shape of the codebase

A plain strings flash.exe | grep ^C[A-Z] shows over 390 class names. Filtering down to the ones relevant to graphics:

CPicObj               base class for everything drawable
├── CPicPage          scene / page
├── CPicLayer         timeline layer
├── CPicFrame         timeline frame (inherits from CPicShape!)
├── CPicShape         vector shape
├── CPicShapeObj      library-symbol shape wrapper
├── CPicSymbol        library item
├── CPicSprite        movie clip
├── CPicButton        button
├── CPicText          text field
├── CPicBitmap        bitmap
└── CPicVideo, CPicSwf, CPicFont, CPicOle, ...

Plus 79 classes in a parallel MFI* hierarchy (Macromedia Flash Importer), which turns out to be a public plugin-importer SDK that was never fully released. Its class names (MFIShapeModule, MFIFillStyleDefinition, MFIShapeEdgePath, MFICubic, MFIContourShape) mirror the internal structure and confirmed the data model.

Every CPic* class has a CRuntimeClass struct in the binary's .data section. The struct layout is standard MFC:

struct CRuntimeClass {
    const char* m_lpszClassName;          // → "CPicShape"
    int         m_nObjectSize;            // sizeof
    UINT        m_wSchema;                // DECLARE_SERIAL version
    CObject*  (*m_pfnCreateObject)();
    CRuntimeClass* m_pBaseClass;
    ...
};

A small Python script using pefile walks the .data section looking for pointers into the string table.

Each hit is a CRuntimeClass for a specific class. Walk the whole section and you get a table like:

class | size | schema | CreateObject VA | base class
--- | --- | --- | --- | ---
CPicObj | 116 | 1 | 0x9033b0 | CObject
CPicShape | 300 | 1 | 0x910c60 | CPicObj
CPicFrame | 672 | 1 | 0x8faa70 | CPicShape
... | ... | ... | ... | ...

The inheritance graph lets you predict serialization structure: CPicFrame's Serialize calls CPicShape's, which calls CPicObj's, each layering its fields on top of the previous one.
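The CRuntimeClass walk described above can be sketched with the standard library alone. This is not the original script: it assumes a 32-bit image already laid out as in memory (with pefile, pe.get_memory_mapped_image() should provide that), and the function name, the C[A-Z] name filter, and the size sanity bound are illustrative choices.

```python
import re
import struct

# A class name as MFC would store it: "C" + capital letter, NUL-terminated.
CLASS_NAME = re.compile(rb"C[A-Z][A-Za-z0-9_]*\x00")


def scan_runtime_classes(image: bytes, image_base: int, start: int, end: int):
    """Scan image[start:end] for records shaped like the first five CRuntimeClass fields."""
    found = []
    for off in range(start, min(end, len(image) - 19), 4):   # pointers are 4-byte aligned
        name_va, size, schema, create_va, base_va = struct.unpack_from("<5I", image, off)
        name_off = name_va - image_base                       # VA -> offset in the image
        if not 0 <= name_off < len(image):
            continue
        match = CLASS_NAME.match(image, name_off)
        if match is None or not 0 < size < 0x10000:           # crude plausibility filter
            continue
        found.append({
            "va": image_base + off,
            "name": match.group()[:-1].decode("ascii"),
            "size": size,
            "schema": schema,
            "create_object_va": create_va,
            "base_class_va": base_va,
        })
    return found
```

Resolving each base_class_va against the va of another hit is what turns the flat list into the inheritance graph.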


The Rosetta Stone: CArchive::ReadObject

Ghidra's RTTI-aware analyzer automatically labels many standard MFC symbols, including the crucial one: CArchive::ReadObject at VA 0x00ee3e6c. Its decompilation is a short, beautiful thing:

CObject* CArchive::ReadObject(CArchive *this, CRuntimeClass *pClass)
{
    if (!(this->m_nMode & 1)) { /* throw */ }

    CRuntimeClass *rc = ReadClass(this, pClass, ...);
    CObject *pOb;
    if (rc == NULL) {
        // back-reference to previously-read object
        pOb = LookupObject(...);
    } else {
        pOb = rc->CreateObject();
        MapObject(this, pOb);
        // *** THE KEY LINE ***
        (**(code **)(*(int *)pOb + 8))(this);
    }
    return pOb;
}

That last line is a virtual-method call at vtable[+0x08], which is slot 2 of the primary vtable. For any MFC-derived class, slot 2 is Serialize.

With that, the whole format unlocks. Every CPic* class's Serialize method can be found in its vtable, decompiled individually, and its field reads turned into a Python parser. The chain is:

  1. Read the class tag.
  2. If it's a new class, register it; call CreateObject to make an instance.
  3. Call instance->Serialize(archive) via vtable slot 2.
  4. That reads some fields for the class plus its base class's Serialize, recursively, bottoming out at CObject::Serialize (a no-op).
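The four steps above can be modelled in a few lines of Python. This is a toy, not the decoder: Archive, Base, and Derived are hypothetical names, the long-form 0x7FFF tag is left out, and the slot numbering (classes and objects sharing one load map, slot 0 reserved) is an assumption based on stock MFC behaviour rather than something spelled out in this post.

```python
import struct


class Archive:
    """Toy stand-in for CArchive in load mode."""

    def __init__(self, data: bytes, factories: dict):
        self.data, self.pos = data, 0
        self.factories = factories     # class name -> Python class (plays CreateObject)
        self.load_map = [None]         # slot 0 reserved; shared by classes and objects (assumed)

    def read(self, fmt: str):
        (value,) = struct.unpack_from("<" + fmt, self.data, self.pos)
        self.pos += struct.calcsize("<" + fmt)
        return value

    def read_object(self):
        tag = self.read("H")                        # step 1: read the class tag
        if tag == 0x0000:
            return None
        if tag == 0xFFFF:                           # new class: register its name
            _schema, name_len = self.read("H"), self.read("H")
            name = self.data[self.pos:self.pos + name_len].decode("ascii")
            self.pos += name_len
            self.load_map.append(name)
        elif tag & 0x8000:                          # reference to an earlier class
            name = self.load_map[tag & 0x7FFF]
        else:                                       # reference to an earlier object
            return self.load_map[tag]
        obj = self.factories[name]()                # step 2: CreateObject
        self.load_map.append(obj)                   # MapObject
        obj.serialize(self)                         # step 3: the vtable slot 2 call
        return obj


class Base:
    def serialize(self, ar):
        self.flags = ar.read("B")


class Derived(Base):
    def serialize(self, ar):
        super().serialize(ar)                       # step 4: base-class fields first
        self.extra = ar.read("H")


stream = (b"\xff\xff\x01\x00\x07\x00Derived" + b"\x07\x34\x12"   # new class + fields
          + b"\x01\x80" + b"\x09\x78\x56"                        # class back-reference
          + b"\x00\x00")                                         # null object
ar = Archive(stream, {"Derived": Derived})
first, second, third = ar.read_object(), ar.read_object(), ar.read_object()
```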

The FLA file, from bytes to shapes

Armed with that dispatch model, I worked through each class's Serialize method. Here's the full structure.

The container

[OLE2 compound file]
├── Contents              document metadata + sound library
├── Page 1..N             per-scene state
├── Symbol N              one library item per stream
└── Media N               raw audio PCM / compressed bitmap data

The wire protocol inside a stream

Each stream starts with a one-byte 0x01 root-object tag, then a standard MFC class-tag sequence:

  • 0x0000: null pointer / end of list
  • 0xFFFF <u16 schema> <u16 nameLen> <ASCII nameLen bytes>: new class
  • 0x8000 | classIndex: back-reference to an earlier class
  • 0x7FFF followed by extended u32 index: long-form back-reference

Length-prefixed UTF-16 strings appear as FF FE FF <u8 charLen> <2 * charLen bytes of UTF-16LE>. All multi-byte integers are little-endian.

CPicObj::Serialize, the base every other Pic class starts with

u8   schema
u8   flags
┌── children list ─────────────
│   loop:
│       class_tag = read()
│       if tag == NULL: break
│       child = ReadObject(this_archive)   // recursive
│       append child to linked list
└──
if schema >= 1: 2 × s32 point   (often INT_MIN, INT_MIN = uninitialized)
if schema >= 3: u8 extra_flags
if schema >= 4: u8 extra2

The children loop is the key recursion point. When reading a CPicPage stream, its children are CPicLayer instances. CPicLayer's children are CPicFrame instances. CPicFrame's children are CPicShape instances. At each level, the parent's Serialize recurses through the same MFC reader machinery.
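As a sketch, the base-class layout above translates to Python roughly as follows. The read_child callback stands in for the recursive ReadObject call; both function names are assumptions for illustration, and mapping the INT_MIN pair to None is a presentation choice rather than part of the format.

```python
import struct

INT_MIN = -2**31


def parse_picobj(buf: bytes, pos: int, read_child):
    """Parse the CPicObj base fields at pos; return (fields, next_pos).

    read_child(buf, pos) -> (child_or_None, next_pos) plays the role of ReadObject.
    """
    schema, flags = struct.unpack_from("<BB", buf, pos)
    pos += 2
    children = []
    while True:
        child, pos = read_child(buf, pos)
        if child is None:                 # the 0x0000 tag ends the children list
            break
        children.append(child)
    fields = {"schema": schema, "flags": flags, "children": children}
    if schema >= 1:
        x, y = struct.unpack_from("<ii", buf, pos)
        pos += 8
        # INT_MIN, INT_MIN is the "uninitialized" sentinel
        fields["point"] = None if (x, y) == (INT_MIN, INT_MIN) else (x, y)
    if schema >= 3:
        fields["extra_flags"] = buf[pos]
        pos += 1
    if schema >= 4:
        fields["extra2"] = buf[pos]
        pos += 1
    return fields, pos


def null_only_child_reader(buf: bytes, pos: int):
    """Hypothetical reader that only understands the NULL tag (a leaf object)."""
    (tag,) = struct.unpack_from("<H", buf, pos)
    if tag != 0:
        raise NotImplementedError("nested objects need a full class-tag reader")
    return None, pos + 2


leaf = bytes.fromhex("0200" "0000" "00000080" "00000080")
leaf_fields, leaf_end = parse_picobj(leaf, 0, null_only_child_reader)
```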

CPicShape::Serialize

CPicObj::Serialize(archive)          // base fields including nested children
u8  shape_schema
6 × u32 matrix                       // see next section
shape_data                           // see section after

The matrix, a beautiful mess

Flash stores a 2D affine matrix as six u32 values:

u32 a    16.16 fixed-point (1.0 == 0x00010000)
u32 b    16.16 fixed-point
u32 c    16.16 fixed-point
u32 d    16.16 fixed-point
u32 tx   integer twips (!)
u32 ty   integer twips

a, b, c, d are unitless scale/rotation coefficients in 16.16 fixed-point. But tx and ty are plain integer twips (1/20 of a pixel). Two different unit conventions packed into the same six-word struct. This is a legacy of Flash's internal rendering pipeline, where the upper-left 2×2 gets applied to vectors and the lower 1×2 gets applied to absolute pixel positions.
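A small sketch of decoding the six words into usable numbers. One assumption is labelled in the code: the words are read as signed 32-bit values so that negative coefficients (rotations, flips) and negative offsets come out correctly, even though the layout above lists them as u32. The function name is illustrative.

```python
import struct

FIXED_ONE = 1 << 16      # 16.16 fixed-point: 1.0 == 0x00010000
TWIPS_PER_PX = 20        # tx and ty are integer twips


def decode_matrix(buf: bytes, pos: int = 0):
    """Read six little-endian 32-bit words; return (a, b, c, d, tx_px, ty_px)."""
    # ASSUMPTION: interpret the words as signed so negative values survive.
    a, b, c, d, tx, ty = struct.unpack_from("<6i", buf, pos)
    return (a / FIXED_ONE, b / FIXED_ONE, c / FIXED_ONE, d / FIXED_ONE,
            tx / TWIPS_PER_PX, ty / TWIPS_PER_PX)


raw = struct.pack("<6i", 0x00010000, 0, 0, 0x00008000, 200, -40)
matrix = decode_matrix(raw)
```

The six returned values line up with the arguments of an SVG matrix(a b c d e f) transform once the translation is in pixels.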

The shape data, where the real work was

Buried inside the function I eventually named ReadShapeData (originally FUN_00f3da60 in Ghidra), the geometry block has this structure:

u8   shape_data_schema       (0 is legacy, 5 is modern)
u32  edge_count_hint         (approximate; informational only)
u16  fill_style_count
per fill:
    if schema < 3: legacy solid, u32 color + u16 flags
    else:          modern fill style (variable layout, see below)

u16  line_style_count
per line style:
    u32  stroke_color
    u16  flags
    inline_fill  (4 bytes bit-packed compact color)
    if caps_flag: u8 × 4 caps + u16 miter + full fill style

Fill styles in the modern format are:

u32  color
u8   subtype_flags
u8   more_flags
switch subtype_flags & 0x70:
    0x00: SOLID (done, just the color)
    0x10: GRADIENT
          24-byte matrix
          u8 num_stops (capped at 15)
          if caps_flag: u16 grad_hints + u8 grad_type
          per stop: u8 position(0-255) + u32 color
    0x40: BITMAP
          24-byte matrix
          u32 bitmap_id

Then the actual edge stream. For shape_data_schema >= 2 it runs until a zero-byte terminator:

loop:
    u8 edge_flags
    if edge_flags == 0: break

    if edge_flags & 0x40:
        if edge_flags & 0x80: read 3 × u8  style_change values
        else:                 read 3 × u16 style_change values
        (interpret as: fill0_idx, fill1_idx, line_idx, 1-based)

    delta1 = read_coord_delta(edge_flags      & 3)   // "move" from previous position
    delta2 = read_coord_delta((edge_flags>>2) & 3)   // control offset
    delta3 = read_coord_delta((edge_flags>>4) & 3)   // endpoint offset

    from   = prev_endpoint + delta1
    ctrl   = from          + delta2
    to     = from          + delta3
    prev_endpoint = to

    if (edge_flags & 0x0C) == 0:
        emit straight edge (control = midpoint(from, to))
    else:
        emit quadratic Bezier (from, ctrl, to)

Each coordinate delta uses one of four encodings, selected by two bits of edge_flags:

type 0 (0 bytes):   (0, 0)
type 1 (4 bytes):   (s16, s16)            fine precision, small range
type 2 (8 bytes):   (s32, s32)            full range
type 3 (4 bytes):   (s16 << 7, s16 << 7)  coarse precision, wider range

That left-shift is the crucial detail. Flash's internal coordinate system uses ultra-twips: 1 pixel = 2560 units (= 20 twips × 128). Type 3 stores coordinates as twips and shifts left by 7 at read time to produce ultra-twips; types 1 and 2 store ultra-twips directly.

Once you know this, rendering to SVG is just a coordinate divide: px = ultra_twips / 2560.
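Putting the edge loop and the four delta encodings together, a reader can be sketched as below. It follows the pseudocode above step by step, with two assumptions of its own: the pen starts at (0, 0), and the most recent style-change triple carries forward to later edges. The function names are illustrative.

```python
import struct


def read_coord_delta(buf: bytes, pos: int, kind: int):
    """Return ((dx, dy) in ultra-twips, next_pos) for one of the four encodings."""
    if kind == 0:
        return (0, 0), pos
    if kind == 2:
        dx, dy = struct.unpack_from("<ii", buf, pos)
        return (dx, dy), pos + 8
    dx, dy = struct.unpack_from("<hh", buf, pos)
    if kind == 3:                          # stored as twips: scale up by 128
        dx, dy = dx << 7, dy << 7
    return (dx, dy), pos + 4


def parse_edges(buf: bytes, pos: int = 0):
    """Read edge records until the zero terminator; return (edges, next_pos)."""
    edges, styles, prev = [], None, (0, 0)     # ASSUMPTION: pen starts at the origin
    while True:
        flags = buf[pos]
        pos += 1
        if flags == 0:
            break
        if flags & 0x40:                       # style change: fill0, fill1, line (1-based)
            fmt = "<3B" if flags & 0x80 else "<3H"
            styles = struct.unpack_from(fmt, buf, pos)
            pos += struct.calcsize(fmt)
        d1, pos = read_coord_delta(buf, pos, flags & 3)
        d2, pos = read_coord_delta(buf, pos, (flags >> 2) & 3)
        d3, pos = read_coord_delta(buf, pos, (flags >> 4) & 3)
        start = (prev[0] + d1[0], prev[1] + d1[1])
        end = (start[0] + d3[0], start[1] + d3[1])
        if (flags & 0x0C) == 0:                # no control delta: straight edge
            ctrl = ((start[0] + end[0]) // 2, (start[1] + end[1]) // 2)
        else:
            ctrl = (start[0] + d2[0], start[1] + d2[1])
        edges.append({"from": start, "ctrl": ctrl, "to": end, "styles": styles})
        prev = end
    return edges, pos


# One straight edge: move by (2560, 0), then an endpoint offset of (20, 0) twips.
demo = bytes([0x31]) + struct.pack("<hh", 2560, 0) + struct.pack("<hh", 20, 0) + b"\x00"
demo_edges, demo_end = parse_edges(demo)
```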

Getting to "all edges are quadratic Beziers"

A small implementation detail that saved enormous complexity: Flash's internal representation treats every edge as a quadratic Bezier. A "straight" edge just has its control point set to the midpoint of from and to, which makes the Bezier degenerate into the straight line you'd expect. This means the shape-writing code downstream has to handle exactly one edge type. Clever.
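Because every edge is a quadratic, SVG emission reduces to one path command per edge plus the divide by 2560. A minimal sketch, assuming edges arrive as (from, ctrl, to) tuples in ultra-twips; the function name and the subpath handling are illustrative rather than taken from the decoder:

```python
ULTRA_TWIPS_PER_PX = 2560    # 20 twips x 128


def edges_to_svg_path(edges):
    """Build SVG path data from (from, ctrl, to) tuples given in ultra-twips."""
    def px(value):
        return f"{value / ULTRA_TWIPS_PER_PX:g}"

    parts, cursor = [], None
    for start, ctrl, end in edges:
        if start != cursor:                  # pen jumped: begin a new subpath
            parts.append(f"M{px(start[0])},{px(start[1])}")
        parts.append(f"Q{px(ctrl[0])},{px(ctrl[1])} {px(end[0])},{px(end[1])}")
        cursor = end
    return " ".join(parts)


path = edges_to_svg_path([
    ((0, 0), (1280, 0), (2560, 0)),          # straight: control point at the midpoint
    ((2560, 0), (2560, 1280), (2560, 2560)),
])
```

A straight edge goes through the same Q command as a curve, which is the "exactly one edge type" payoff in practice.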


The recovery scanner, or, what you do when the boss fight is too long

I spent a weekend writing field-by-field decoders for CPicObj, CPicShape, CPicFrame, and CPicPage. Most classes decode cleanly all the way through. But CPicFrame has a lot of trailing fields: frame labels, timeline tween data, sound cue pointers, layer transforms, child sprite references, gated by a schema value I observed as high as 23. Many of those fields call out to helper functions for variable-length sub-blocks.

In a sufficiently-complex timeline-heavy symbol, the decoder stops on an unread helper block, which means we miss the children list continuation for the parent CPicLayer, which means we miss every subsequent frame in that layer, which is most of the shape content.

Writing perfect decoders for twenty-plus schema-gated fields per class was going to take a week. I took a shortcut.

The CPicShape body has a distinctive, well-defined 10-byte header right after the two schema/flags bytes: the NULL-terminated child list (00 00) followed by two INT_MIN values that represent the "uninitialized" sentinel for the shape's registration point. The INT_MIN pair is rare enough that it's a usable signature.

So I added a second pass: after the structured parser finishes (or fails), walk the entire stream looking for 00 00 00 00 00 80 00 00 00 80. For each hit, back up two bytes to the plausible schema/flags bytes and attempt CPicShape::Serialize from there. If it produces a shape with at least three edges, keep it.

This one-page addition raised the decoder's yield from 81% to 95% across my test corpus. The remaining 5% are timeline-container symbols that genuinely contain no shape data; they're pure composition metadata that references other symbols.
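The second pass can be sketched as a plain byte search. The try_parse_shape callback and the shape["edges"] key are hypothetical stand-ins for the real CPicShape decoder and its output; only the signature bytes, the two-byte back-up, and the three-edge threshold come from the description above.

```python
import struct

# NULL child-list terminator followed by two little-endian INT_MIN values.
SIGNATURE = bytes.fromhex("0000" "00000080" "00000080")


def scan_for_shapes(stream: bytes, try_parse_shape, min_edges: int = 3):
    """Yield (offset, shape) for each signature hit that parses into a plausible shape."""
    pos = stream.find(SIGNATURE)
    while pos != -1:
        start = pos - 2                       # back up over the schema/flags bytes
        if start >= 0:
            try:
                shape = try_parse_shape(stream, start)
            except (struct.error, IndexError, ValueError):
                shape = None                  # false positive: the parse fell apart
            if shape is not None and len(shape["edges"]) >= min_edges:
                yield start, shape
        pos = stream.find(SIGNATURE, pos + 1)
```

The edge-count threshold is what keeps the false-positive rate tolerable: a random signature hit rarely decodes into three or more well-formed edges.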


A couple of bonus formats

While we're here, two more Flash formats that were mysterious and are now documented.

Lossless bitmaps in Media N streams

Importing a PNG or GIF into a Flash library stores it as a CPicBitmap whose Media N stream uses this layout (ported from JPEXS's LosslessImageBinDataReader, their one piece of binary-FLA knowledge):

u8   = 0x03                   signature 1
u8   = 0x05                   signature 2
u16  rowSize
u16  width
u16  height
u32  frameLeft               (twips)
u32  frameRight
u32  frameTop
u32  frameBottom
u8   flags                   bit 0 = has alpha
u8   variant                 1 = chunked zlib
loop:
    u16 chunkLen
    if chunkLen == 0: break
    chunkLen bytes           concatenate into one zlib stream

After zlib-inflating the concatenated stream you get raw pixels as u8 A, u8 B, u8 G, u8 R per pixel, with 1-based premultiplied alpha. If 0 < A < 255, subtract 1 from A and scale RGB by 256 / A_new to un-premultiply. That edge case took a while to notice.
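A sketch of the container half of that layout: header, chunk loop, and inflate. The function name and the returned dictionary keys are assumptions, and the alpha un-premultiply step is intentionally left out here.

```python
import struct
import zlib


def read_lossless_bitmap(data: bytes):
    """Parse a lossless-bitmap Media N stream; return (header, raw_pixel_bytes)."""
    if data[:2] != b"\x03\x05":
        raise ValueError("not a lossless bitmap stream")
    row_size, width, height = struct.unpack_from("<HHH", data, 2)
    left, right, top, bottom = struct.unpack_from("<IIII", data, 8)
    flags, variant = data[24], data[25]
    if variant != 1:
        raise NotImplementedError(f"variant {variant} is not covered by this sketch")
    pos, chunks = 26, []
    while True:
        (chunk_len,) = struct.unpack_from("<H", data, pos)
        pos += 2
        if chunk_len == 0:                    # zero-length chunk ends the list
            break
        chunks.append(data[pos:pos + chunk_len])
        pos += chunk_len
    header = {
        "row_size": row_size, "width": width, "height": height,
        "frame_twips": (left, right, top, bottom),
        "has_alpha": bool(flags & 1),
    }
    # The chunks are pieces of ONE zlib stream, so join before inflating.
    return header, zlib.decompress(b"".join(chunks))
```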

Imported audio

CMediaSound records in the Contents stream describe each audio clip in the library, with metadata including an unusual rate-tag byte:

rateTag = 0x0a  →  22050 Hz
rateTag = 0x0e  →  44100 Hz (mono)
rateTag = 0x0f  →  44100 Hz (stereo)

The actual sample bytes live in a separate Media N stream. For .wav imports it's raw 16-bit PCM that you can wrap with a standard WAV header. For .mp3 imports it's raw MP3 frames you can save directly. The channel count, for PCM, is inferred from stream_size / (2 × sample_count), either 1 or 2.
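For the PCM case, wrapping the bytes needs nothing beyond Python's wave module. A minimal sketch, assuming sample_count is the per-channel sample count taken from the CMediaSound record; the function name is made up, and only the three rate tags listed above are mapped.

```python
import io
import wave

RATE_TAGS = {0x0A: 22050, 0x0E: 44100, 0x0F: 44100}


def pcm_to_wav(pcm: bytes, rate_tag: int, sample_count: int) -> bytes:
    """Wrap raw 16-bit PCM from a Media N stream in a WAV container."""
    channels = len(pcm) // (2 * sample_count)      # 2 bytes per 16-bit sample
    if channels not in (1, 2):
        raise ValueError(f"implausible channel count: {channels}")
    out = io.BytesIO()
    with wave.open(out, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(RATE_TAGS[rate_tag])
        wav.writeframes(pcm)
    return out.getvalue()


wav_bytes = pcm_to_wav(bytes(8), 0x0F, 2)          # two stereo frames of silence
```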


The false starts worth remembering

A few dead ends that looked plausible but didn't pan out:

Assuming Flash's internal representation matches SWF's. It does not. They're cousins, not twins. SWF's bit-packed shape records are a compiled representation designed for small file size; the FLA representation is a slightly-higher-level editable form with style changes as tagged records rather than bit-packed flags.

Trying to brute-force offsets with every byte-level reader. You could in principle write a parser that tries every plausible coord encoding at every byte offset and keeps whichever produces sensible output. In practice the false-positive rate is catastrophic. Flash streams are mostly small integers and the "this looks like a coordinate" heuristic finds matches constantly.

Assuming the binary FLA is just a BLF (Binary Linear Format) or a variant of a known container. There are several OLE2-based Microsoft formats (.doc, .xls, .msi) and they do share the container, but the contents are completely independent per application. Flash built its own serialization on top of OLE2 and reused nothing from Office's schemas.

Expecting JPEXS to have documented the internals. JPEXS is a phenomenal project and almost every Flash tool ultimately depends on it, but their binary-FLA support is export-only (SWF→FLA for round-tripping) and is explicitly described in their docs as "hard because no one knows the meaning of every byte." They're right. Reading the format is a separate problem from writing it; a writer only has to produce bytes Flash will accept, not decode bytes Flash already wrote.


A note on tooling

Most of the actual work on this project, from driving Ghidra and reading the decompiled C to writing the Python decoder and iterating on the recovery scanner, was done with Anthropic's Claude Code acting as a pair-programmer. The approach was: I would frame the problem, pick the next sub-goal, and review intermediate results; Claude Code would handle most of the boilerplate (Ghidra scripts, pefile walkers, bit readers, SVG emission) and carry a running mental model of what had been tried.

There's no sense pretending otherwise. Large parts of this post were also drafted with its help. The format insights, the hypotheses, the strategic pivots, and the verification of each claim came from the combined investigation; calling the whole thing either "AI-written" or "human-written" misses what actually happened, which was a collaboration.

What I will say is that the kind of work this represents, closing in on a proprietary format by mixing static analysis, byte-level inference, and iterative test/verify, used to take weeks. It now takes days, and the documentation emerges alongside the code. That's a real change in what one-person reverse engineering projects can realistically produce.


What this took, in practice

  • ~5 hours of initial byte-level inference (fruitless on geometry, useful for the class-tag protocol).
  • ~1 hour of Ghidra setup (downloading tools, installing Java, running auto-analysis).
  • ~8 hours of staring at decompiled C (identifying each Serialize method, walking through its logic, turning it into Python).
  • ~3 hours of iteration on edge cases. The fill-style caps_flag parameter confusion cost a full debugging session by itself.
  • ~1 hour on the recovery scanner that got the last 14% of content.

That's not including side quests: the MFI importer SDK discovery, audio format decoding, the lossless bitmap format reverse, the couple of decoders I threw away before landing on the final architecture.

Serious reverse engineering is strangely humbling: most of the time is spent understanding the shape of the problem, which data lives where, which methods call which, rather than parsing individual bytes. The bytes themselves are usually straightforward once you've found the right function and read it carefully.


What's public now

The result of all this, beyond one working decoder, is a documentation artifact: a format specification for the pre-CS5 binary FLA that covers:

  • OLE2 container + stream conventions (solved before, now confirmed)
  • MFC class-tag protocol (new documentation at this level of detail)
  • CPicObj, CPicShape, CPicFrame, CPicPage field-by-field serialization
  • Shape geometry: header, fill styles, line styles, coordinate delta encodings, edge records
  • The ultra-twip coordinate system
  • Lossless bitmap container
  • Audio sample rate tags and channel-count inference
  • The signature-based recovery approach for resyncing past undocumented tail fields

Every CPic* class's CRuntimeClass struct in Flash 8's binary is enumerated with size, schema, and base class. Every serialization function's virtual address is pinned in a lookup table. The Python decoder is compact (~650 LOC for the core, plus a small SVG emitter) and depends only on olefile plus the standard library.

For format-preservation purposes this should meaningfully reduce the effort required to recover old FLA content. It's not a complete solution (CPicFrame's full timeline-tail fields and CPicText's multiple-inheritance vtable are not fully reversed), but it covers the visually renderable portion of typical FLA files.


A caveat on coverage

Worth stating clearly: this is not a 100% solution. The decoder extracts roughly 95% of the renderable vector content from the FLAs I've tested it on. The remaining few percent are timeline and animation containers (CPicSprite state machines, empty CPicText fields, placeholder CPicFrame nodes) that don't carry shape data of their own. They describe how to arrange other symbols into animations rather than drawing anything themselves, and a full extractor for them would need to interpret Flash's timeline model (keyframes, tweens, scripted frame actions), which is a different problem from shape geometry.

For static asset recovery (vector art, fills, strokes, gradients, bitmaps, audio) the 95% figure is effectively complete. For full animation fidelity it's not, and whoever wants that level should treat this as a starting point rather than a finished job.

Why bother

Because proprietary file formats die with their creators.

Flash as a platform had a 25-year run, spawned an entire culture of web animation and indie games, and is the primary reason two generations of people became interested in graphics programming at all. The content built with it is of real cultural and historical value, and most of that content exists nowhere else: the source files were never shared, the developers have moved on, the platforms that hosted the compiled SWFs have shut down.

Reverse-engineering legacy formats is slow, unglamorous, and unpaid. It is also, in many cases, the only way cultural content survives the end of its enabling technology.

The .fla format is now a little less opaque than it was. If you have old FLA files you thought were stranded, on Linux, on modern macOS without Adobe tooling, in an archive somewhere, they can now be read.

One less dead format.


The decoder source, format documentation, and a pile of Ghidra scripts that found the key functions are available at github.com/eddiemoore/fla-decoder. The decoder renders to SVG by default, but the intermediate data model is a straightforward tree of Python dicts suitable for adapting to any target runtime.