Streaming lossless VNC to a web browser with Quite OK image format
kasmweb.com
This has been a dream of mine ever since I first saw QOI. The encoding and decoding times and relatively high compression were perfect for this use case; it was just a matter of implementation. Because the VNC protocol with tight encoding breaks frame changes into hundreds of tiny images, it allows us to encode and decode in a threaded manner. This means high FPS and low latency on a truly lossless remote Linux desktop in your web browser. The easiest way to try this out is an all-in-one Docker container: https://github.com/linuxserver/docker-kasm . To try it ephemerally on a Docker host, simply run:
docker run --rm -it --privileged -p 3000:3000 -p 443:443 linuxserver/kasm bash
I had spent a long time trying to simplify Linux desktop application delivery with linuxserver/webtop and all the derivative dedicated app images, but the speed and quality were always lacking as it was using XRDP in tandem with Guacamole. The difference with this new KasmVNC https://github.com/kasmtech/KasmVNC implementation is night and day. Depending on client hardware it will deliver 60fps at 1080p and 40-60fps at 1440p for both the JPEG and QOI rendering modes. A quick video can be seen here https://youtu.be/VkzG5BU2gjo .
It makes more sense just to use a normal video codec. They can handle all sorts of situations efficiently that this can't e.g. scrolling, zooming and embedded videos.
That is how NX4, Chrome, RustDesk, Zoom, etc. all do it.
Yes, video codecs are great, but in almost every case they need a GPU. This functions with no special hardware. The only lossless option would be AV1, and that needs an Intel Arc card or a 4000 series Nvidia card.
You also have an inherent latency issue as you have to buffer 2-5 frames at 16ms a pop server side to encode the data.
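To put numbers on that claim, here is a small worked calculation. It only multiplies the figures from the comment (2-5 buffered frames at 60 fps); the function names are made up for illustration, and the model ignores encode, network and decode time.

```python
# Delay added by holding frames back before encoding.
# The 2-5 frame range is the commenter's figure, not a measurement.

def frame_time_ms(fps: float) -> float:
    """Duration of one frame in milliseconds."""
    return 1000.0 / fps


def buffering_delay_ms(frames_buffered: int, fps: float = 60.0) -> float:
    """Extra latency from buffering this many frames before encoding."""
    return frames_buffered * frame_time_ms(fps)


if __name__ == "__main__":
    for n in (2, 5):
        print(f"{n} frames at 60 fps: {buffering_delay_ms(n):.1f} ms")
    # 2 frames at 60 fps: 33.3 ms
    # 5 frames at 60 fps: 83.3 ms
```

So the quoted buffering range works out to roughly 33-83 ms at 60 fps, before any other stage of the pipeline is counted.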
> You also have an inherent latency issue as you have to buffer 2-5 frames at 16ms a pop server side to encode the data.
Consider that Oculus Link uses video compression and doesn't introduce the extra latency you're describing. You only need to buffer frames if you want the best compression ratio, and you can always configure the encoder not to do lookahead. It's also better to choose CBR over VBR to avoid the second pass of a frame, at the cost of reducing quality/bitrate a bit. In practice it can work really well because even 20mbit/s is sufficient to send high res text.
I wish remote desktop applications would copy the Oculus Link architecture. You can easily get only a few frames of latency (sub 100ms) provided you composite on the GPU, use hardware encode/decode, and slice the video stream (i.e. send 1/4 of the screen while encoding the next 1/4). Slicing cuts down on decode latency and smears the expensive work across the entire refresh cycle instead of doing it all at once in a non-pipelined fashion (which introduces bubbles into scheduling).
When doing streaming for realtime interaction you don't use B-frames. You typically just have P-frames and slices of I-frames. Max one frame of latency and pretty consistent data rates.
And just about every phone, tablet, laptop, desktop made in the past decade has some form of H.264 fixed-function hardware decoding.
Probably a naïve question, but why does it need to be run as privileged?
That is only the all in one container solution geared to home server users.
The standard install uses no priv containers. https://www.kasmweb.com/downloads
I only mention the Linuxserver container because most Linux/Docker users do not want to pollute their base OS with stuff just to try it out.
This is running a full desktop as a service system that uses docker in docker to provide multiple full remote desktops each with login etc.
The GitHub repo mentions they use Docker in Docker and links to the docker.com article from 2013 where the -privileged flag was introduced (with one dash?). So maybe the container you download is actually just a "wrapper" to set up a "real" Kasm setup? But that's just my guess!
Lossless encoding can and will consume all the bandwidth available to a client and is designed to be run on local networks.
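For a rough sense of scale, the worst case is easy to estimate: every pixel changes on every frame and nothing is compressed. The sketch below assumes 24-bit RGB (3 bytes per pixel) and a 1 Gbit/s link; both are assumptions for illustration, not figures from KasmVNC, and real traffic is lower because VNC only sends regions that changed.

```python
# Worst-case raw bitrate of a framebuffer stream and the compression
# ratio needed to fit it into a given link. Assumes 3 bytes per pixel.

def raw_bitrate_bps(width: int, height: int, fps: int,
                    bytes_per_pixel: int = 3) -> int:
    """Bits per second if every pixel is sent uncompressed each frame."""
    return width * height * bytes_per_pixel * 8 * fps


def min_compression_ratio(width: int, height: int, fps: int,
                          link_bps: float, bytes_per_pixel: int = 3) -> float:
    """Compression ratio needed for the worst case to fit the link."""
    return raw_bitrate_bps(width, height, fps, bytes_per_pixel) / link_bps


if __name__ == "__main__":
    raw = raw_bitrate_bps(1920, 1080, 60)
    ratio = min_compression_ratio(1920, 1080, 60, link_bps=1e9)
    print(f"1080p60 raw: {raw / 1e9:.2f} Gbit/s")   # 1080p60 raw: 2.99 Gbit/s
    print(f"needed on 1 Gbit/s: {ratio:.2f}x")      # needed on 1 Gbit/s: 2.99x
```

Under those assumptions a fully changing 1080p60 desktop needs about 3:1 lossless compression just to fit a gigabit link, which is why this mode is aimed at local networks.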
VNC is unfortunately inherently inefficient because it is just a framebuffer protocol, instead of RDP which passes through the graphics primitives to be rendered. The former will always involve encoding/decoding overhead at the server and client.
RDP doesn't pass through graphics primitives. You're probably thinking of X11.
Sending graphics primitives turned out to be the worst way to do remote desktop. All modern solutions just use video codecs. NX (the best solution on Linux) even switched from X11 forwarding to a video codec in NX4.
VNC is inefficient because it is ancient and uses extremely inefficient methods to encode the graphics. GIF is really inefficient too but you wouldn't say that means the idea of encoding animated images as bitmaps is a bad one.
The original incarnations of RDP, going back 20 years and more, preferred graphics when it could. Back then, it was more bandwidth efficient, and most programs actually used the GDI primitives to draw directly onto the screen, so it actually worked.
But IIRC, by 2007, it switched completely to sending tiles just like VNC.
However, if you read the protocol descriptions, you get the wrong idea that primitives are still used.
Similarly, most X11 clients have been doing everything client side for ages, but many people still believe that primitives dominate.
VNC, the protocol, isn't at all inefficient. There are some encoding methods which are inefficient, but I'd say that more "modern" encoding methods such as "tight" aren't all that bad, depending on how they're implemented. There has also been a recent addition of h264 to some implementations.
An example of a truly inefficient encoding method is ZRLE. It is a tiled zlib and run-length encoded format that can't be split up into multiple jobs because future computation depends on past computation.
They got things right with the tight encoding method of which there are two variants: zlib (lossless) and jpeg. With zlib, you can have up to 4 separate zlib streams, which means that you can utilise 4 CPU cores for encoding in parallel. The jpeg method has no such limitations.
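A minimal sketch of the idea, not the actual tight wire format: rectangles are assigned round-robin to four persistent zlib streams, and each stream is compressed on its own thread. The round-robin assignment and the function names are assumptions for illustration; real tight encoding also carries stream ids, length headers and filters that are left out here. CPython's zlib module generally releases the GIL while compressing, so the threads can run on separate cores.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

NUM_STREAMS = 4  # the comment above says tight allows up to 4 zlib streams


def encode_rects(rects, compressors):
    """Compress rect i on stream i % NUM_STREAMS, one thread per stream."""
    def run_stream(stream_id):
        comp = compressors[stream_id]
        out = []
        for idx in range(stream_id, len(rects), NUM_STREAMS):
            # Z_SYNC_FLUSH emits everything pending so this rect can be
            # sent now, while the stream keeps its history for later rects.
            data = comp.compress(rects[idx]) + comp.flush(zlib.Z_SYNC_FLUSH)
            out.append((idx, data))
        return out

    with ThreadPoolExecutor(max_workers=NUM_STREAMS) as pool:
        results = list(pool.map(run_stream, range(NUM_STREAMS)))

    encoded = [None] * len(rects)
    for chunk in results:
        for idx, data in chunk:
            encoded[idx] = data
    return encoded


def decode_rects(encoded, decompressors):
    """Decode in index order, which preserves per-stream ordering."""
    return [decompressors[i % NUM_STREAMS].decompress(data)
            for i, data in enumerate(encoded)]


if __name__ == "__main__":
    # Ten fake 16x16 RGB rectangles, each filled with a single byte value.
    rects = [bytes([i]) * (16 * 16 * 3) for i in range(10)]
    comps = [zlib.compressobj() for _ in range(NUM_STREAMS)]
    decomps = [zlib.decompressobj() for _ in range(NUM_STREAMS)]
    encoded = encode_rects(rects, comps)
    assert decode_rects(encoded, decomps) == rects
```

Within one stream the chunks depend on each other and must be decoded in order, which is the same constraint that makes a single-stream format like ZRLE hard to parallelise; having four independent streams is what allows four jobs at once.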
In the past, maybe from the Windows XP to perhaps the Windows 7 era, it did; I'm not sure exactly when they stopped using primitives. But it hasn't for a long time. It's just (lossy) compressed bitmap streaming nowadays.
RDP on Windows from Windows now uses h264 by default.
Interesting. Here is some background: https://techcommunity.microsoft.com/t5/security-compliance-a...
I guess “by default” means if both client and server have the required hardware support and there aren’t too many concurrent sessions.
I am pretty sure that no RDP solution for any Linux desktop system uses graphics primitives. For that to happen, all the toolkits would have to implement some way of passing their scene graph and their primitives to the compositor and the compositor would have to know what to do with that info.
RDP also supports plain framebuffers and that's what we use.
Anyhow, with the advent of specialised hardware for video encoding, passing framebuffers isn't necessarily such a bad option. You just have to make sure that the CPU doesn't touch the framebuffer before encoding and after decoding.
Ah the good 'ole days of X11
This depends on what your use case is. I think this might be related to gaming uses (e.g. your gaming PC is in another room and you wanna play a game and stream it to your TV or something), although it doesn't seem to be their main priority.
Sad that in 2022 an "entire gigabit connection" is still considered a lot of bandwidth, the statement would have sounded the same 10 years ago and only slightly out of the ordinary 20 years ago.
A gigabit internet connection would've been a lot more than slightly out of the ordinary in the consumer space 20 years ago.
Shrug. Why spend more when 100mbit is also enough for most purposes.
Yes, the ISPs have successfully shifted a lot of the usage away from a peer-to-peer internet to TV-like use cases, and a lot of people don't even miss the old capabilities.
People used to do a lot of file sharing over the internet, for example, without Dropbox-like corporate choke points. And of course this article's subject, remote desktop usage, has been important for individual freedom of computing, to be able to e.g. use your home desktop/IT infra from work or while travelling.
Because some ISPs (like the monopoly over here) give you only about 3% of your download speed in upload. No, that figure is not off by an order of magnitude - I have 900Mbps download and only 25Mbps upload. That's less than 3%. And they don't offer 25Mbps upload for any cheaper.
Sometimes you gotta take what you can get.
While that is true, I really enjoy my 600/600 connection, because most downloads basically finish instantly (same with uploads).
However, I agree with your point, that it's not that useful when I only use the full connection a few minutes a month. Still, it was only an extra 5 € from 300/300, so I figured why not.
I just got this up and running and the performance is amazing. I've never felt a VNC session feel so fast. Very impressed. Kudos to you for this!
Edit: I wish I could install this on my mac so I could VNC into it efficiently as well!
Screen sharing on the Mac has always been very impressive in performance to me, especially considering the graphics involved.
While their product (Kasm Workspaces) doesn't seem to be open source, their VNC server is! https://github.com/kasmtech/KasmVNC (GPLv2)
I still have a dream of a DIY thin client setup for remote development. I've spent a couple of weekends here and there over the past years trying out different servers and clients, but always struggled with anything non-VNC. Either on the server (linux) or client (macOS) side.
What is the state of the art for remote desktopping from Linux to macOS/Windows, over "almost LAN" conditions (symmetric 1 Gbit ethernet, < 10 ms RTT)? Enterprise VDI solutions I've used at work have worked better than anything I've managed to put together, especially if you consider stuff like streaming audio from the host...
Nomachine NX has been mentioned elsewhere in this thread, it works really well for me. Usually the latency is low enough that I forget I'm on a remote desktop.
Spice is good for that
I needed this for my VR VNC, I hope I can set it up for that!
Can this "tech" be used in the development of remote desktops like AnyDesk/RustDesk?
KasmVNC is open source https://github.com/kasmtech/KasmVNC