Writing an MP4 Muxer for Fun and Profit

obsproject.com

95 points by skrrtww 2 years ago · 30 comments

Andrews54757 2 years ago

Having worked with some MP4 demuxing for my extension [1], I feel the pain. Lots of times I would play the video only to find inexplicable issues such as drifting audio. I highly recommend using an mp4 inspector tool, such as mp4box [2], to debug these issues.

1: https://github.com/Andrews54757/FastStream

2: https://gpac.github.io/mp4box.js/test/filereader.html

somat 2 years ago

Nice, when playing around one weekend trying to see if I could use ipfs as a transport layer for streaming video I got hung up because most video formats I tried behaved very poorly with inconsistent streams where you may not have the beginning. I ended up on mpeg-ts as the best behaving of the bunch. It felt a little weird, as I was sort of expecting something more modern to have better performance, but seeing as my goal was not to evaluate video formats but just ship them around I just accepted it and moved on.

Thinking back on it now, I just did a little trial and error until I found something that worked, but what would I search for if I was trying to find data on how... ?streamable? an encoding is?

If curious, I got my proof of concept working, but it was unpleasantly slow. I blindly chunked the incoming stream into megabyte-sized chunks, registered the chunks on ipfs, then used ipfs pubsub to announce each chunk to any watchers. The watcher would watch the pubsub channel for announcements, download each chunk, reassemble the chunks in order, and play them. One neat side effect I found: once the stream was done, if I had stored all the ipfs addresses, I could generate a whole ipfs file structure you could use to download the stream at a later date.
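For anyone wanting to try the same approach, here is a minimal Python sketch of the chunk-and-reassemble part only. It makes no IPFS calls: the SHA-256 digest is just a stand-in for a content address (it is not a real IPFS CID), and `chunk_stream` and `Reassembler` are hypothetical names, not anything from the original proof of concept.

```python
import hashlib
import io


def chunk_stream(stream, chunk_size=1024 * 1024):
    """Yield (sequence number, stand-in address, bytes) for each fixed-size chunk."""
    seq = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        # Stand-in for a content address; NOT a real IPFS CID.
        yield seq, hashlib.sha256(data).hexdigest(), data
        seq += 1


class Reassembler:
    """Buffer chunks that arrive out of order and release them in sequence."""

    def __init__(self):
        self.next_seq = 0
        self.pending = {}

    def push(self, seq, data):
        """Store one chunk and return whatever bytes are now playable in order."""
        self.pending[seq] = data
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return b"".join(ready)


# Demo: split 2560 bytes into 1000-byte chunks and deliver them out of order.
source = bytes(range(256)) * 10
chunks = list(chunk_stream(io.BytesIO(source), chunk_size=1000))
manifest = [address for _, address, _ in chunks]  # enough to rebuild the stream later

watcher = Reassembler()
played = b""
for index in (2, 0, 1):
    seq, _, data = chunks[index]
    played += watcher.push(seq, data)
```

The `manifest` list plays the role of the stored addresses mentioned above: with the ordered list of chunk addresses you can fetch and concatenate the whole stream after the fact.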

CrendKing 2 years ago

Can someone explain how an existing media player understands the new mdat format without modification? I assume that if it finds a completed moov at the end of the file, it recognizes the file as an unfragmented mp4. It should then try to find a list of recognized codecs directly inside the mdat (like in the first picture), but instead it will find another moov, a bunch of moofs, and sub-mdats, all of which are clearly not proper for an unfragmented mp4. Why doesn't the player report this as an "unrecognizable, badly formatted" mp4 file?

  • der_rod 2 years ago

    The mdat box does not have a defined structure, and the specification actually states that attempting to define a structure is almost certainly a mistake. In order to find the data the player is looking for it has to read the moov box, which contains the byte offsets and sizes of "chunks" of data. Since there is no requirement for chunks to be contiguous, or even in the same file, we can simply skip over the fragmentation-related boxes within the data box.

  • Andrews54757 2 years ago

    The moov contains a list of byte offsets which the player can use to directly access media data. You can skip the moofs and other headers inside by using gaps in the offsets.
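To make the two answers above concrete, here is a small Python sketch under simplified assumptions. It walks top-level boxes using the standard size/type header, then reads samples purely by byte offset. The `chunk_table` list is a hypothetical stand-in for the offset and size tables a real `moov` would carry, and the `moof` boxes and sample bytes are dummies, not valid media.

```python
import struct


def iter_boxes(buf, start=0, end=None):
    """Yield (type, payload_start, payload_end) for each box in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            size = struct.unpack_from(">Q", buf, pos + 8)[0]
            header = 16
        elif size == 0:  # box runs to the end of the enclosing range
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError(f"bad box size at offset {pos}")
        yield box_type.decode("ascii"), pos + header, pos + size
        pos += size


def make_box(box_type, payload):
    """Build a box with a 32-bit size header (test helper only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic file: the mdat payload holds dummy moof boxes between the samples.
ftyp = make_box(b"ftyp", b"isom" + b"\x00" * 4)
mdat = make_box(
    b"mdat",
    make_box(b"moof", b"\x00" * 4) + b"SAMPLE-A"
    + make_box(b"moof", b"\x00" * 4) + b"SAMPLE-B",
)
moov = make_box(b"moov", b"")  # empty placeholder; a real moov holds the tables
data = ftyp + mdat + moov

# Hypothetical stand-in for the moov's chunk offset/size tables.
chunk_table = [(data.index(b"SAMPLE-A"), 8), (data.index(b"SAMPLE-B"), 8)]

top_level = [box_type for box_type, _, _ in iter_boxes(data)]
samples = [data[offset:offset + size] for offset, size in chunk_table]
```

A reader that only follows `chunk_table` never looks at the bytes between the samples, so the `moof` boxes embedded in the `mdat` payload are simply gaps from its point of view.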

convivialdingo 2 years ago

This is awesome work. I’ve coded some extensions for mp4 livestreaming to handle dozens of real-time streams and I’d love to try out the multi stream mux / demux…

Retr0id 2 years ago

> It kind of hurts that several days of work and research can be summed up in a couple paragraphs, but that's what the "pain" part in the subtitle is for.

Having recently written my own fragmented-MP4 remuxing library, I felt this pain too, and my soon-to-be-published writeup has very similar things to say about the ISO's paywalling practices.

I think one of the hardest parts of ISO-BMFF, aside from spec availability, is that it's pretty hard to implement "cleanly", making existing code confusing to use as reference. (My own implementation is certainly not clean either)

  • Lammy 2 years ago

    I feel zero shame for torrenting ISO standards PDFs.

    • Retr0id 2 years ago

      Me neither, but I couldn't actually find any torrents for the mp4-related specs (I did find what I needed with some google-fu, though)

  • der_rod 2 years ago

    > Having recently written my own fragmented-MP4 remuxing library, I felt this pain too, and my soon-to-be-published writeup has very similar things to say about the ISO's paywalling practices.

    Would be curious to hear what goals you had with writing a muxer yourself as well, given that most people just use LibAV/GStreamer/GPAC and call it a day.

    > I think one of the hardest parts of ISO-BMFF, aside from spec availability, is that it's pretty hard to implement "cleanly", making existing code confusing to use as reference. (My own implementation is certainly not clean either)

    I certainly wouldn't call the OBS implementation "clean" either. It's very much inspired by the FFmpeg/LibAV implementation since that one is fairly straightforward (not a lot of abstraction), and gets the job done (and also is GPL/LGPL so not a huge concern looking at it).

    • Retr0id 2 years ago

      The short answer is, it's for an exploit. It involves some slightly less-well-trodden boxes, and adding specially crafted metadata to live-generated videos in real-time, which existing libraries couldn't help me with much (and I did spend some time fighting a few libraries, but couldn't make them do precisely what I wanted).

      "Library" is perhaps an overstatement, it does the things I need and not much more.

mastax 2 years ago

It looks like GStreamer has supported this for a few years: https://gstreamer.freedesktop.org/documentation/isomp4/GstBa...

I always forget about GStreamer but I think I have a perfect application for it. Hopefully it’s easier to use as a library than MediaFoundation or FFMpeg.

akira2501 2 years ago

MP4. The answer to the question of "Is there a way to make RIFF and AVI even worse somehow?" It makes you genuinely pine for MPEG2 Transport Streams. ISO 13818 for life.

donpark 2 years ago

Great work!

Would love to see MP4 Hybrid supported in popular packages like mp4-muxer [1] and mp4box [2] someday.

1: https://github.com/Vanilagy/mp4-muxer 2: https://github.com/gpac/mp4box.js

Lammy 2 years ago

> moof (Movie Fragment Box)

Very cute easter egg. Moof is what dogcows say: http://clarus.chez-alice.fr/history.php

ogurechny 2 years ago

So this is a “soft” sequential access limitation (we can tolerate some random writes to data as long as it is small enough and short enough). I wonder if there are formats that result in finished indexed multimedia file with “hard” sequential access, when nothing can be overwritten.

  • TD-Linux 2 years ago

    Digital video tape formats (e.g. DV, HDV) are an example. Other containers that operate in this mode are TS and Ogg (and optionally, MKV). Any sort of live streaming format generally is, too.

cornstalks 2 years ago

(context, this is talking about fragmented MP4 downsides)

> 2. They are slow to access on HDD or network drives, as each fragment's header needs to be read to get the complete metadata of the file and start playback

Huh? That's not right. The whole point of fragmented MP4 is that you can access any fragment without having to read the headers of the other fragments. That's why adaptive streaming is built around fragmented MP4.

  • ogurechny 2 years ago

    To figure out the total length of the media streams, you need either external index metadata (as in web streaming) or a remux of the file that adds an index. The whole point of the article is removing the need to remux the file after recording; otherwise you can use existing solutions just fine.

    • cornstalks 2 years ago

      You can write a sidx for your index. And it doesn't require a whole remux.

      • der_rod 2 years ago

        Unfortunately, some of the most popular/problematic software (default Windows video player and explorer) does not support `sidx` boxes.
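For context on the `sidx` suggestion, here is a rough Python sketch of how a single index box can give the total duration without visiting every fragment. The field layout follows my understanding of the segment index box in ISO/IEC 14496-12 and should be checked against the spec before relying on it; `parse_sidx` is a hypothetical helper and the payload is synthetic.

```python
import struct


def parse_sidx(payload):
    """Parse a sidx payload (bytes after the box header).

    Returns (timescale, [(referenced_size, subsegment_duration), ...]).
    """
    version = payload[0]  # the first byte of the version/flags word
    _reference_id, timescale = struct.unpack_from(">II", payload, 4)
    pos = 12
    # earliest_presentation_time and first_offset are 32-bit in v0, 64-bit otherwise
    pos += 8 if version == 0 else 16
    _reserved, count = struct.unpack_from(">HH", payload, pos)
    pos += 4
    refs = []
    for _ in range(count):
        type_and_size, duration, _sap = struct.unpack_from(">III", payload, pos)
        pos += 12
        refs.append((type_and_size & 0x7FFFFFFF, duration))  # drop the type bit
    return timescale, refs


# Synthetic version-0 payload: timescale 1000, two subsegments (2.0 s and 2.5 s).
payload = struct.pack(">IIIIIHH", 0, 1, 1000, 0, 0, 0, 2)
payload += struct.pack(">III", 5000, 2000, 0x90000000)
payload += struct.pack(">III", 6000, 2500, 0x90000000)

timescale, refs = parse_sidx(payload)
total_seconds = sum(duration for _, duration in refs) / timescale
byte_sizes = [size for size, _ in refs]
```

The same entries also give the byte size of each subsegment, which is what lets a player seek without scanning fragment headers, provided the player reads `sidx` at all.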

karolist 2 years ago

> Except there is no profit, only pain

I have 20 years of professional experience, and if someone asked what IT boils down to, my conclusion would be: pain.

The pain is what filters who can succeed from who fails. Can you endure hunting a bug for 7 hours in your chair? Can you fix problem after problem to get a system running? Everything that can fail will fail, and you have to deal with it.

  • 38 2 years ago

    Or when you encounter (another) corner case that requires a top down rewrite of your code. If you start to question your sanity or the meaning of life, you're probably on the right track.

  • lylejantzi3rd 2 years ago

    I came to the same conclusion in business. Is what you're doing 100% pure pain? Good. It means you're providing value.

dylan604 2 years ago

"The new MP4 output now also supports multiple video tracks"

MP4 has been able to have multiple video streams for quite some time. One of the very first advanced MP4 authoring tools I saw in the early 00s allowed for this, and we used it to make a few advanced files to demo the "new" MP4 format. Much like multi-angle DVDs, this was a niche feature that did not gain much traction. I could see why someone not around at that time might think this is a new feature, but it's not.
