My video streaming wishlist for the next 3 to 5 years
I work in the video streaming space and am curious about others' thoughts. Here's my wishlist:
1. WebTransport and WebCodecs become the primary means for client-to-server real-time video delivery (e.g. for compositing, off-device analysis)
2. No more vendor lock-in with WebRTC (WHIP and WHEP might help here). Build a solution once on the client, and if I don't like my provider, just change the endpoint URL.
3. Google MediaPipe or a high-level API in the browser to run AI models easily for audio / video. Right now it seems like most solutions for simple things like blurring are just minor abstractions on top of MediaStreamTrackProcessor.
4. An optimized headless browser for cloud rendering. There are too many terrible solutions at the moment using CEF and Chrome that then rely on FFmpeg or GStreamer, Xvfb, and PulseAudio.
5. Plug-and-play pipelines in the cloud for video processing (like Zapier for video). I can plug in any processing I want between the source and sink without a convoluted mess of trying to push audio and video around to different apps, either within a network or across the internet.

Hey! Just a rando here, but I would be interested in hearing your opinion as to where PeerTube does well with this wishlist and where it needs improvement. https://framablog.org/2023/11/28/peertube-v6-is-out-and-powe...

If I understand correctly from checking out the websites, it looks like PeerTube is focused on video-on-demand (VOD) and tackling the problem of content delivery and storage. From a technology standpoint I actually feel pretty good about where VOD is currently, since HLS / LL-HLS (HLS being the more common delivery mechanism I see, with low latency increasingly supported) is pretty easy to set up with any video provider (CDNs, cloud providers, video platforms, e.g. Mux, LiveKit egress). I do see that PeerTube is using WebRTC (I'm guessing for the P2P aspect). So I'll tackle the question in terms of the three areas that might need advancing in that space:
1. Peer to peer should be broken out as its own standard in the browser. Right now, to implement peer to peer you're stuck with WebRTC, which possibly brings in a ton of features you don't need. Also, if you're trying to use it outside of a browser you're pretty much stuck with libwebrtc, which is the de facto WebRTC implementation and is effectively an entire networking + application + AV codec layer.
2. Storage! This probably needs to be redesigned for handling video in the future. No idea how, but the issue I see is that for VOD you will usually want to upload a master. The master file is then transcoded to multiple resolutions / formats and such so that folks can view according to their bandwidth. This ends up eating a lot of space. Maybe we need better ways of storing content?
3. Adaptive bit rate / WebRTC delivery.
This is the part that might kinda bite PeerTube, if I had to guess. WebRTC is designed for speed over quality. When ABR kicks in, the video bitrate drops and you start getting crappier video. In my experience, when people are viewing VOD content they usually want things as crisp and clear as possible. This isn't so much of an issue on a conference call or live stream (usually video degrades before audio). For a peer-to-peer streaming network I'd have to imagine that you'd want more control over how content is delivered, and I'm not sure how PeerTube handles that. Those are just my thoughts off the cuff; I'll give it more thought.

Genuinely appreciate the time and thoughts. Thank you.

I see 1 and 2 as going in opposite directions. WebTransport + WebCodecs enables the shipping of binary blobs for each individual service. WHIP + WHEP might see enough demand (OBS input) that locked-down services have to offer it. What cloud rendering are you trying to do? My hope/goal is to drop the browser dependency completely.

Yeah, I agree 1 and 2 tackle very different problems. For additional context, here's what I've observed.

For 1: I work in the live streaming space and see cases where an input stream may need to be composited, transformed, mixed in, or something, and an entire WebRTC stream needs to be set up (usually going through some third-party service, e.g. Twilio). OBS is normally the go-to for a single broadcast show; however, in the cases I work with, it's a lot of shows (independent streams) that need to be managed and that generally have rules around how the compositing works. Hiring a person to manage OBS, vMix, etc. is wasteful and expensive in this case (and not necessary). WebTransport helps by being able to deliver audio / video (or any data) to an app that can take care of that.
If anything, I see this technology as making it easier to build apps for AV and hopefully getting rid of some of the reliance on existing broadcast software, which is really intended for the traditional studio setup.

For 2: It definitely seems like adoption is growing. Cloudflare supports WHIP/WHEP (in beta), LiveKit supports WHIP ingress, and Dolby.io supports WHIP/WHEP (although that's not a surprise, since Sergio Garcia Murillo works there after the Millicast acquisition and he's at the forefront of some of those standards).

Regarding 4, cloud rendering: this is mainly a use case I've had to deal with more and more lately, where folks are trying to get off OBS and vMix for compositing shows. An example use case is joining streams from multiple sources at scale (usually for large-scale LED wall installations and such) and dynamically adding content / graphics based on interaction, data, etc. It's much easier to program in the rules and have them be reusable / configurable vs. requiring someone to set up OBS scenes (at least at scale). I agree, I don't like the browser dependency (primarily since the devex is quite terrible for all of this). However, it seems to come from a place of the lowest common denominator. It's easy for someone to whip up a Figma design of what they want, have that translated to HTML, use a motion library for animating, and stick it into a video feed (especially if you're trying to stick with an existing brand where you may already have a website and web apps with all of that ready to go).
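
To make the "just change the endpoint URL" point from wishlist item 2 a bit more concrete, here's a rough sketch of what a WHIP publish request looks like at the HTTP level. As far as I understand the spec, the client POSTs its SDP offer with Content-Type application/sdp (usually with a bearer token) and gets back a 201 with the SDP answer plus a Location header for the session. Everything named here is made up for illustration: build_whip_request is a hypothetical helper, the provider URLs and token are placeholders, and the SDP string is a stub (a real offer would come from the WebRTC stack). Nothing is actually sent over the network.

```python
from typing import Optional
from urllib.request import Request


def build_whip_request(endpoint_url: str, sdp_offer: str,
                       bearer_token: Optional[str] = None) -> Request:
    """Build (but do not send) the HTTP POST that would start a WHIP session.

    Hypothetical helper for illustration only; not part of any library.
    """
    # WHIP carries the SDP offer as the raw request body.
    headers = {"Content-Type": "application/sdp"}
    if bearer_token:
        headers["Authorization"] = f"Bearer {bearer_token}"
    return Request(
        endpoint_url,
        data=sdp_offer.encode("utf-8"),
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    # Stub offer; a real one is generated by the client's WebRTC stack.
    stub_offer = "v=0\r\n"

    # Placeholder endpoints for two imaginary providers.
    providers = [
        "https://provider-a.example/whip/my-stream",
        "https://provider-b.example/whip/my-stream",
    ]
    for url in providers:
        req = build_whip_request(url, stub_offer, bearer_token="placeholder-token")
        # Same method, headers, and body each time; only the URL differs.
        print(req.get_method(), req.full_url, req.header_items())
```

The point is that the request shape stays the same across providers, so in principle the only thing that changes on the client is the URL (and the token). In practice providers may still differ in things like supported codecs or auth setup, so treat this as the happy path.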