Questions about SRT vs MOQT come up often when engineers evaluate low latency video transport options. As a Lead Real-Time Video Architect working on MOQ development at Red5, I ran into this question during architecture discussions and realized others will likely face the same comparison soon. This article is written from my perspective and has also been reviewed and verified by my teammate Paul Gregoire, Red5 Solutions Architect.
If you want a quick summary, read the Key Takeaways section. If you want a deeper technical comparison, continue through the rest of the blog.
Key Takeaways
SRT is a well-established, reliable protocol for video contribution, offering strict end-to-end latency controls. However, its payload-agnostic approach to packet dropping can introduce playback instability under severe congestion, and its single-stream architecture can still create Head-of-Line blocking conditions, limiting its applicability to modern SVC workflows.
MOQT, which pairs flexible streaming formats with independent QUIC streams, provides equivalent latency controls while enabling granular, payload-aware data handling and discard strategies. Using parallel streams, isolated packet loss recovery, and priority-based delivery, it can safely drop late data and natively support SVC adaptation. The protocol’s architecture is highly optimized for resilient, low-latency, bandwidth-efficient media distribution.
Baseline for Comparison
When comparing Secure Reliable Transport (SRT) and Media over QUIC Transport (MOQT), it is important to establish equivalent architectural layers. Technically, SRT is a payload-agnostic transport protocol. However, in standard broadcast and streaming workflows, it is predominantly used to carry multiplexed MPEG-TS (Transport Stream) payloads.
MOQT is an end-to-end media transport protocol designed to operate in conjunction with a wide range of application-layer streaming formats (current drafts define the MOQT Streaming Format (MSF) and CMAF MSF (CMSF)). The streaming format and related container structure provide the necessary media metadata, including timestamps. Therefore, a functional comparison for video delivery is best framed as SRT + MPEG-TS versus MOQT + MSF/CMSF.
Note: SRT and MOQT occupy different use-case spaces, overlapping primarily on the contribution side – SRT is generally not considered for distribution at scale to end consumers.
| Category | SRT + MPEG-TS | MOQT + MSF/CMSF |
| --- | --- | --- |
| Primary role | Primarily used for contribution workflows between encoders, transcoders, and broadcast infrastructure | Designed as an end-to-end media transport protocol that can support both contribution and scalable distribution architectures |
| Latency comparison | ~125 ms or more depending on configured latency buffer and network stability | ~200 ms to 500 ms depending on streaming format configuration and congestion conditions |
| Transport architecture | Single UDP connection carrying a multiplexed MPEG-TS stream | Built on QUIC with multiple independent streams |
| Payload awareness | Payload agnostic. The transport layer does not understand media structure or importance of packets | Works together with streaming formats that provide timestamps and metadata, allowing payload aware handling |
| Packet scheduling | Primarily FIFO scheduling over a single connection | Priority based scheduling across multiple streams |
| Congestion handling | Cannot distinguish between critical media and enhancement data without application level logic | Relays can prioritize or discard lower priority streams when bandwidth is constrained |
| Packet loss recovery | Uses ARQ retransmissions but all packets share a single transport pipeline | Uses QUIC ARQ with independent recovery per stream |
| Head-of-Line blocking | Possible. Lost packets delay subsequent packets until retransmission or drop | Avoided because audio, video, and other components can travel on separate QUIC streams |
| Handling late data | Too-Late Packet Drop (TLPKTDROP) discards packets that exceed the latency buffer | Application evaluates presentation timestamps and instructs transport to discard late frames cleanly |
| Data discard method | Packet level dropping which can lead to partial media frames and possible decoder artifacts | Semantic media unit dropping using STOP_SENDING and RESET_STREAM |
| SVC support | Limited effectiveness because base and enhancement layers share the same transport queue | Native support through independent streams for different SVC layers |
| ABR adaptation | Typically handled outside the SRT stream or by switching feeds | Client can switch Tracks and relays can drop lower priority layers |
| Behavior during congestion | Entire stream affected by packet loss or retransmission delays | Selective degradation such as dropping enhancement layers while preserving base playback |
| Transcoding workflow | Requires full ingestion and demultiplexing of the entire MPEG-TS stream before processing | Transcoders can selectively ingest only required tracks or layers |
| Ecosystem maturity | Very mature with broad support across encoders, FFmpeg workflows, and broadcast systems | Emerging ecosystem but designed for modern scalable media architectures |
SRT + MPEG-TS vs MOQT + MSF/CMSF Streaming Comparison Table
Packet Scheduling and Multiplexing
Under network congestion, transport protocols must manage how data is queued and transmitted through restricted bandwidth.
- SRT Scheduling: SRT processes packets in a primarily First-In, First-Out (FIFO) sequence over a single UDP connection. Because it does not natively inspect the payload, it cannot differentiate between critical media (like base video layers or audio) and supplemental media (like video enhancement layers) without custom application layer multiplexing.
- MOQT Scheduling: MOQT incorporates a prioritization model utilizing various priority parameters per stream. This allows the sender and intermediate relays to identify the relative importance of different media components. Under bandwidth constraints, an MOQT relay can use this logic to selectively delay or drop lower-priority streams to ensure the timely delivery of higher-priority streams.
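The priority model above can be sketched as a simple byte-budget scheduler. This is a toy illustration of the idea, not the MOQT wire protocol or any real relay implementation; the stream names, priorities, and sizes are made up.

```python
import heapq

# Illustrative sketch of priority-based scheduling across streams.
# Lower numbers mean higher priority (audio before base video before
# enhancement layers). Names and values are hypothetical.

def schedule(pending, budget_bytes):
    """Pick which queued stream chunks fit a byte budget, highest
    priority first; the rest can be delayed or dropped by a relay."""
    # pending: list of (priority, stream_id, size_bytes)
    heap = list(pending)
    heapq.heapify(heap)
    sent, deferred = [], []
    for _ in range(len(heap)):
        priority, stream_id, size = heapq.heappop(heap)
        if size <= budget_bytes:
            budget_bytes -= size
            sent.append(stream_id)
        else:
            deferred.append(stream_id)  # relay may delay or drop this
    return sent, deferred

sent, deferred = schedule(
    [(0, "audio", 200), (1, "video-base", 900), (2, "video-enh", 1500)],
    budget_bytes=1200,
)
# audio and video-base fit the budget; the enhancement layer is deferred
```

A FIFO scheduler over a single connection has no equivalent decision point: whatever was queued first is sent first, regardless of importance.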
Packet Loss Recovery and Head-of-Line Blocking
Both SRT and MOQT (via QUIC) use Automatic Repeat reQuest (ARQ) to recover lost packets. When a packet is dropped by the network, the receiver asks the sender to retransmit it. The operational difference lies in how this recovery impacts the rest of the data in transit.
- SRT (Head-of-Line Blocking): SRT transmits its payload sequentially over a single connection. If a packet is lost, the receiver must hold all subsequent packets in a buffer until the lost packet is retransmitted and successfully arrives; this is Head-of-Line (HoL) Blocking. Because all media (audio, video base layer, video enhancement layer) shares this single pipeline, a single lost network packet stalls the delivery of the entire transport stream, increasing latency across the board. The stall is bounded, however: if the retransmission delay exceeds the configured latency buffer, SRT drops the packet entirely, allowing the stream to continue rather than waiting indefinitely.
- MOQT (Independent Stream Recovery): MOQT relies on QUIC’s multiplexed stream architecture. Because different media components (e.g., audio and video) are mapped to independent QUIC streams, packet loss is isolated. If a packet containing video data is lost, only the specific video stream waits for a retransmission. The audio stream, operating on a parallel QUIC stream, continues to deliver data to the application without interruption. This prevents a single network packet loss from stalling the entire media presentation.
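The contrast can be modeled with a few lines of code. This is a deliberately simplified simulation of receiver-side delivery order, not SRT or QUIC code: on each stream, delivery stops at the first lost packet until it is retransmitted, and other streams are unaffected.

```python
# Toy model contrasting a single multiplexed pipeline with independent
# streams. A "lost" packet blocks everything queued behind it on the
# same stream; stream names and sequence numbers are illustrative.

def deliverable(packets, lost):
    """Return packets the receiver can hand to the application:
    on each stream, delivery stops at the first lost packet."""
    blocked = set()
    out = []
    for stream, seq in packets:        # packets in arrival order
        if stream in blocked:
            continue
        if (stream, seq) in lost:
            blocked.add(stream)        # stream waits for retransmission
            continue
        out.append((stream, seq))
    return out

# Single pipeline: everything shares one stream, one loss stalls it all.
mux = [("mux", i) for i in range(1, 6)]
single = deliverable(mux, lost={("mux", 2)})   # only packet 1 gets through

# Independent streams: losing a video packet stalls only video.
packets = [("a", 1), ("v", 1), ("a", 2), ("v", 2), ("a", 3)]
multi = deliverable(packets, lost={("v", 1)})  # audio keeps flowing
```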
Handling Late Data under Congestion
When network delays cause data to arrive past its intended playback deadline, the two protocols use different mechanisms to discard that data.
- SRT (Network-Level Dropping): SRT utilizes a configured latency buffer. If a packet cannot be delivered within this timeframe, SRT’s Too-Late Packet Drop (TLPKTDROP) mechanism discards it. Because SRT drops data at the network packet level without payload awareness, this can result in the delivery of partial media frames. In an MPEG-TS workflow, this fragmentation can lead to decoder errors or visual artifacts, potentially persisting until the next keyframe.
- MOQT (Application-Aware Dropping): MOQT relies on a feedback loop between the application’s Streaming Format and the QUIC transport layer. The application layer evaluates the Presentation Timestamp (PTS); if a frame exceeds its playback deadline, it instructs the transport layer to issue a QUIC STOP_SENDING frame. MOQT then discards the complete, semantic media unit via a RESET_STREAM operation. This preserves the structural integrity of the remaining video streams and avoids corrupting the decoder.
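The application-side decision described above boils down to comparing a frame's PTS against the playback clock. Below is a minimal sketch of that check; the `stream.stop_sending` call is a hypothetical stand-in, since real MOQT stacks expose STOP_SENDING and RESET_STREAM through their own QUIC APIs.

```python
# Sketch of an application-level late-data decision: a frame past its
# presentation deadline is abandoned as a whole media unit rather than
# delivered partially. Function and parameter names are illustrative.

def should_drop(pts_ms, playback_clock_ms, max_lateness_ms=0):
    """True if the frame's presentation time has already passed."""
    return pts_ms + max_lateness_ms < playback_clock_ms

def on_frame_header(stream, pts_ms, clock_ms):
    if should_drop(pts_ms, clock_ms):
        # Abandon the complete media unit: the receiver signals
        # STOP_SENDING and the sender answers with RESET_STREAM.
        stream.stop_sending(error_code=0)  # hypothetical stream API
        return None
    return stream
```

Because the drop happens at the media-unit level, the decoder never sees a truncated frame, unlike packet-level TLPKTDROP.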
Support for Video Adaptation (ABR and SVC)
Modern video delivery relies on Adaptive Bitrate (ABR) and Scalable Video Coding (SVC) to adjust to changing network conditions.
- SRT and SVC: Because SRT typically carries a single, multiplexed MPEG-TS stream, all SVC layers (base resolution and enhancement details) share the same transport queue. If the network drops a base layer packet while successfully delivering an enhancement layer packet, the enhancement data cannot be decoded, limiting the practical effectiveness of SVC over a standard SRT link.
- MOQT and SVC/ABR Integration: MOQT maps SVC layers to independent QUIC streams (Subgroups), facilitating two types of adaptation:
- Layer Dropping (SVC): During transient network drops, MOQT relays autonomously discard low-priority enhancement Subgroups. The player experiences a temporary reduction in quality while maintaining uninterrupted playback.
- Track Switching (ABR): For sustained changes in network capacity, the client can issue a SUBSCRIBE command for a lower-bitrate MOQT Track. MOQT processes these switches at defined Group boundaries (Keyframes), providing clean transitions between quality tiers.
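A relay's layer-dropping decision can be sketched as walking the SVC layers in priority order and forwarding only what the estimated bandwidth allows. The layer names and bitrates below are invented for illustration; a real relay would use its own bandwidth estimator and the priorities carried by MOQT.

```python
# Toy relay decision: forward SVC subgroups in priority order and drop
# the enhancement layers that do not fit the available bandwidth.

LAYERS = [
    # (priority, name, kbps) — lower priority value = more important
    (0, "audio",        128),
    (1, "video-base",  1200),
    (2, "video-enh-1", 1800),
    (3, "video-enh-2", 2500),
]

def forwarded_layers(available_kbps):
    keep = []
    for priority, name, kbps in sorted(LAYERS):
        if kbps <= available_kbps:
            available_kbps -= kbps
            keep.append(name)
        else:
            break  # everything at lower priority is dropped too
    return keep

forwarded_layers(6000)  # all four layers fit
forwarded_layers(2000)  # audio + video-base only: degraded but playable
```

The key property is that degradation is selective: the base layer and audio survive congestion that would stall a single multiplexed pipeline.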
Integration with Transcoding Workflows
The integration of these protocols into transcoding pipelines involves a tradeoff between current ecosystem support and architectural efficiency.
- SRT Ecosystem Maturity: SRT combined with MPEG-TS is a mature, widely adopted standard. It possesses extensive support across legacy hardware encoders, software transcoders (e.g., FFmpeg), and existing cloud broadcast infrastructure.
- MOQT Processing Efficiency: SRT’s monolithic payload requires transcoders to ingest, demultiplex, and decode the entire Transport Stream before processing. By contrast, MOQT’s architecture separates media into independent Tracks and Subgroups. This allows a modern transcoder to selectively ingest only the required streams (e.g., processing a 1080p base layer while actively ignoring higher-resolution streams), offering a more compute-efficient pipeline as the software ecosystem matures.
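Selective ingest amounts to subscribing only to the tracks a pipeline needs. The sketch below is purely illustrative; the track names are hypothetical, and a real transcoder would issue MOQT SUBSCRIBE requests per track rather than filter a list.

```python
# Sketch of selective ingest: a transcoder requests only the tracks it
# needs, so unwanted streams are never transferred or demultiplexed.

AVAILABLE_TRACKS = ["audio", "video-1080p-base", "video-4k-enh"]

def tracks_to_subscribe(needed):
    """Subset of published tracks this pipeline will actually ingest."""
    return [t for t in AVAILABLE_TRACKS if t in needed]

tracks_to_subscribe({"audio", "video-1080p-base"})
# the 4K enhancement track is never requested, saving transfer and compute
```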
Conclusion
In summary, the SRT vs MOQT comparison highlights the difference between a mature contribution protocol built around a single transport stream and a newer architecture designed for multiplexed, media aware delivery. SRT remains widely used and reliable, while MOQT introduces transport level capabilities that align better with modern scalable video workflows and adaptive streaming models.
If you want to explore how MOQ compares with other real time delivery approaches, read our related blog on MOQ vs WebRTC.
Try Red5 For Free
🔥 Looking for a fully managed, globally distributed streaming PaaS solution? Start using Red5 Cloud today! No credit card required. Free 50 GB of streaming each month.
Looking for a server software designed for ultra-low latency streaming at scale? Start Red5 Pro 30-day trial today!
Not sure what solution would solve your streaming challenges best? Watch a short YouTube video explaining the difference between the two solutions, or reach out to our team to discuss your case.
Giovanni Marzot
Lead Real-Time Video Architect at Red5
Giovanni Marzot is the Lead Real-Time Video Architect at Red5, bringing decades of experience in large-scale system engineering, network architecture, and real-time media delivery. His background spans technical leadership, DevOps, and cloud infrastructure, with deep expertise in system integration, network engineering, and secure distributed architectures. Giovanni has extensive experience working with technologies such as C, C++, C#, Java, Python, Bash, and low-level networking protocols including HTTP and TCP/IP, as well as security standards such as SSL/TLS and X.509. He has worked extensively with cloud platforms including AWS and Azure, and has led the design and deployment of complex production systems across Linux and Windows environments. At Red5, Giovanni focuses on advancing real-time video architectures and exploring next-generation media delivery technologies such as Media over QUIC, bringing practical large-scale WebRTC and media transport experience to help shape the future of low-latency streaming systems.
