Show HN: Real-Time 3D Gaussian Splatting in WebGL

antimatter15.com

309 points by antimatter15 2 years ago · 63 comments

naavis 2 years ago

This is really cool! The control scheme is confusing though. Instead of the typical WASD for moving and using the mouse to look around, dragging the mouse moves forwards and backwards and orbits around some point, A and D strafe, while W and S look up and down.

EDIT: Looks like a full list of controls is in the readme: https://github.com/antimatter15/splat#controls

  • antimatter15 (OP) 2 years ago

    Author here. I'm sorry about the camera controls! Happy to accept pull requests that replace them with something more sensible.

    The original idea was to be able to navigate around with just arrow keys (conceptually by turning yourself around in place and being able to walk back and forward).

    • crubier 2 years ago

      This is insanely cool!

      If you integrate this with ThreeJS you'd have a lot of control options for free!

      Whilst you're here, I have a question for you: it seems like you don't render real Gaussians (I see sharp edges in many cases). Is this a bug on my side, or is it an optimization made to be able to run fast? I created an issue to discuss, if you prefer: https://github.com/antimatter15/splat/issues/2

    • boppo1 2 years ago

      If you do an update, consider this a vote for WASD + mouselook. It's a ubiquitous scheme among everyone with an interest in real time computer graphics

    • naavis 2 years ago

      No need to apologize, it's a minor thing! Anyway really neat stuff and I love seeing it here.

  • alanbernstein 2 years ago

    It's very similar to the N64 FPS controls (i.e. Goldeneye): arrow keys (joystick) for the "primary movements" of forward/backward and yaw, with which you can move and look anywhere in a 2D space. Then, WASD (C buttons) for the "secondary movements" of strafe and pitch.

    • gorkish 2 years ago

      It's pretty telling that you had to reach back 26 years to find a control scheme that could be used as an analogy. I don't even know where to start with the mouse controls. Up/down translation lock after right click; reverse yaw. Thing is a test of patience!

      • moffkalast 2 years ago

        It actually seems like the FreeCAD control scheme almost verbatim, I always hated that thing and its insistence to not provide any way to orbit around the up vector.

        Like, are there people whose head does not rotate around their neck on axis but ends up sideways and rolled when they turn or something, to whom this makes perfect sense? I can't see any other explanation.

  • wingerlang 2 years ago

    FWIW, OP, I liked the control scheme a lot (using mouse only).

  • gorkish 2 years ago

    Being brutally honest here, but I just can't get over the control scheme enough to even appreciate the rendering demo. It is unusably unintuitive and awful.

Lichtso 2 years ago

Really cool, I am also working on a port of gaussian-splatting [0] but to WebGPU.

Like all the other implementations I have seen so far, this also makes the same mistake when projecting the ellipsoids in a perspective: First you calculate the covariance in 3D and then project that to 2D [1]. This approach only works with parallel / orthographic projections and applying it to perspectives leads to incorrect results. That is because perspective projections have three additional effects:

- Parallax movements (that is, the view plane moves parallel to the ellipsoids) change the shape of the projected ellipse. E.g. a sphere only appears circular when it is in the center of the view; once it moves to the edges it becomes stretched into an ellipse. This effect is manually counterbalanced by this matrix, I believe [2].

- Rotating an ellipse can change the position it appears at, or in other words creates additional translation. This effect is zero if the ellipse has one of its three axes pointing straight at the view (parallel to the normal of the view plane). But, if it is rotated 45°, then the tip of the ellipse that is closer to the view plane becomes larger through the perspective while the other end becomes smaller. Put together, this slightly shifts the center of the appearance away from the projected center of the ellipsoid.

- Conic sections can result not only in ellipses but also in parabolas and hyperbolas. This, however, is an edge case that only happens when the ellipsoid intersects the view plane, and it can probably be ignored, as one would clip away such ellipsoids anyway.

The last two effects are not accounted for in these calculations in any of the implementations I have seen so far. What would be correct to do instead? Do not calculate the 3D covariance. Instead calculate the bounding cone around the ellipsoid which has its vertex at the camera position (perspective origin). Then intersect that with the view plane and the resulting conic section is guaranteed to be the correct contour of the perspective projection of the ellipsoid.

[0]: https://github.com/graphdeco-inria/gaussian-splatting

[1]: https://github.com/antimatter15/splat/blob/3695c57e8828fedc2...

[2]: https://github.com/antimatter15/splat/blob/3695c57e8828fedc2...
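
To illustrate the second and third effects, here is a minimal sketch, not taken from any of the linked repositories. Instead of building the bounding cone explicitly, it uses the equivalent dual-quadric form from projective geometry. Take an ellipsoid with center c and shape matrix S in camera coordinates, meaning the points x with (x - c)^T S^-1 (x - c) = 1. Under a normalized pinhole camera at the origin, its outline is the conic whose dual is C* = S - c c^T. The function names, the focal length of 1 and the camera looking down +z are all assumptions made for the example.

```python
"""Exact outline center of a perspectively projected ellipsoid (sketch)."""

Vec3 = tuple[float, float, float]
Mat3 = list[list[float]]


def dual_conic(center: Vec3, shape: Mat3) -> Mat3:
    """Dual conic C* = S - c c^T of the outline, for a camera P = [I | 0]."""
    return [[shape[i][j] - center[i] * center[j] for j in range(3)]
            for i in range(3)]


def outline_center(center: Vec3, shape: Mat3) -> tuple[float, float]:
    """Center of the exact outline ellipse in normalized image coordinates."""
    c_star = dual_conic(center, shape)
    w = c_star[2][2]  # S_zz - c_z^2
    if w >= 0.0:
        # The ellipsoid touches or crosses the plane z = 0 through the camera:
        # the outline is a parabola or hyperbola (or does not exist).
        raise ValueError("outline is not an ellipse")
    # The center of a conic is the pole of the line at infinity, C* (0, 0, 1)^T.
    return (c_star[0][2] / w, c_star[1][2] / w)


def projected_center(center: Vec3) -> tuple[float, float]:
    """Plain pinhole projection of the ellipsoid's center."""
    x, y, z = center
    return (x / z, y / z)


if __name__ == "__main__":
    # Unit sphere off to the side: outline center vs. projected center.
    sphere = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(outline_center((2.0, 0.0, 5.0), sphere), projected_center((2.0, 0.0, 5.0)))

    # Ellipsoid with semi-axes 2, 0.5, 0.5, rotated 45 degrees about the y axis
    # and sitting on the optical axis.
    tilted = [[2.125, 0.0, -1.875], [0.0, 0.25, 0.0], [-1.875, 0.0, 2.125]]
    print(outline_center((0.0, 0.0, 5.0), tilted), projected_center((0.0, 0.0, 5.0)))
```

In both demo cases the outline center differs from the projected center, which is the shift the linearized covariance projection cannot reproduce. The `ValueError` branch corresponds to the parabola/hyperbola edge case.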

  • porphyra 2 years ago

    In general, a Gaussian is no longer a true Gaussian after camera projection since the pinhole camera projection function is nonlinear (due to dividing by z). However, if the Gaussian is small relative to the size of the image, you can approximate it by linearizing the projection function. Therefore the Gaussian splatting paper uses the Jacobian of the projection function as described in equation 5 of the paper [0]. In practice, this approximation is extremely good. This Jacobian is the matrix you mentioned in the third link and it is mathematically sound and not "manually counter balanced". For a derivation, see [1].

    [0] https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/3d_...

    [1] https://math.stackexchange.com/a/4716514/43771
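
    As a rough sketch of that linearization (not the code from either repository): for a pinhole projection (x, y, z) -> (fx * x / z, fy * y / z), the Jacobian at the Gaussian's mean is a 2x3 matrix J, and the 2D covariance is approximated as J Σ J^T. The covariance is assumed to be already expressed in camera coordinates, so the view rotation W from the paper's J W Σ W^T J^T has been applied beforehand. The function names and the focal lengths `fx`, `fy` are placeholders.

```python
"""Linearized (Jacobian) projection of a 3D covariance to 2D (sketch)."""

Vec3 = tuple[float, float, float]
Matrix = list[list[float]]


def projection_jacobian(mean_cam: Vec3, fx: float, fy: float) -> Matrix:
    """2x3 Jacobian of (x, y, z) -> (fx * x / z, fy * y / z) at the mean."""
    x, y, z = mean_cam
    return [
        [fx / z, 0.0, -fx * x / (z * z)],
        [0.0, fy / z, -fy * y / (z * z)],
    ]


def project_covariance(mean_cam: Vec3, cov_cam: Matrix,
                       fx: float, fy: float) -> Matrix:
    """Approximate 2D covariance J * cov * J^T (cov given in camera space)."""
    j = projection_jacobian(mean_cam, fx, fy)
    # (J * cov) is 2x3.
    j_cov = [[sum(j[r][k] * cov_cam[k][c] for k in range(3)) for c in range(3)]
             for r in range(2)]
    # (J * cov) * J^T is 2x2.
    return [[sum(j_cov[r][k] * j[c][k] for k in range(3)) for c in range(2)]
            for r in range(2)]


if __name__ == "__main__":
    isotropic = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.01]]
    print(project_covariance((0.0, 0.0, 4.0), isotropic, 100.0, 100.0))
    print(project_covariance((2.0, 0.0, 4.0), isotropic, 100.0, 100.0))
```

    For an isotropic Gaussian, the on-axis footprint is circular, while the off-axis one is stretched along the offset direction. That is the edge-of-view stretching discussed above, and it falls out of the Jacobian rather than being a manual correction.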

    • Lichtso 2 years ago

      I read the paper and I am aware that the gaussian projection is an approximation anyway (hence I spoke about ellipsoids, not gaussians). Still, one could at least aim to get the iso contour right and yes using the Jacobian matrix is not unsound, just incomplete. As I said, this approach can not produce the distinctive "wiggle" that you get from rotating an ellipsoid while staring dead center at it.

      • porphyra 2 years ago

        True, it is an approximation after all. But it is a useful approximation since the main advantage of Gaussian splatting is the speed.

  • contravariant 2 years ago

    Yeah I think you're right, they're pretending the projection is a linear transformation (in cartesian coordinates) and using it to transform the Gaussian.

    Or viewed alternatively they're approximating the projection by assuming all of the Gaussian is at a fixed depth, which I suppose works if it is far enough away.

    A projective transformation of a Gaussian seems somewhat annoying, though I assume someone will have done it before. Seems like it should be possible to do it with projective coordinates but the final projection to cartesian coordinates is tricky.

    For what it's worth, projecting a contour is also wrong, the whole density changes which also affects the contours.

  • ImHereToVote 2 years ago

    Hi. I'm not very familiar with the Gaussian splat technique, but aren't they essentially quads with some intrinsic data in the vertices? I thought projecting quads was already a solved problem. Could you elaborate on how this differs from a simple array of quads? Thank you.

  • jimmySixDOF 2 years ago

    If you can implement the intersecting bounding cone idea without impacting frame rates that's going to be even smoother on WebGPU but it would be interesting to see the difference apples to apples with this type of implementation.

  • m1sta_ 2 years ago

    Dynamic? I've. video?

bluescrn 2 years ago

When you zoom out there's lots of visible polygon edges that don't look like they should really be there, as if it's trying to draw soft 'blobs' but the texture coords aren't quite right? Is that a bug or an intentional part of the technique?

  • KaiserPro 2 years ago

    Intentional.

    Basically it's a semi-dense point cloud [1], but instead of a point, there is a blob which has been coloured, angled and scaled to match the input picture. This means they are optimised to be viewed from a certain distance.

    Think of it like a 3d vector drawing, if you zoom in too much, or pull one part away, it all starts to look a bit funky.

    [1] https://www.researchgate.net/publication/326621750/figure/fi...

jansan 2 years ago

So far I have only seen Gaussian splatting used on photographic data. Would it make sense to use it for other graphics data, too? Or in other words, does it have the potential to be used in games?

  • Lichtso 2 years ago

    Depends, radiance field approaches (like gaussian splatting) are basically 3D photos. They do only capture color at geometry (position and direction), but have no concept of surfaces, materials and light transport in general (emission, absorption, transmission, reflection, scattering, etc.). In other words, they can only do static scenes (no animations) with pre-baked lighting.

    The industry seems to be trying to move away from this with things like PBR (physical based rendering) and ray / path tracing which enables far better dynamic lighting.

    Also, they are extremely space inefficient at the moment. A scene that would take a good traditional rendering engine a few dozen GB would take TB instead. Though, that might improve in the future with more optimization.

    One exception to the above, where gaussian splatting might be interesting to see is procedural / generated content (possibly even animated). Especially for volumetric effects which currently use particle systems, like smoke, fire, clouds, flowing water, etc.

    • poslathian 2 years ago

      I thought I understood that the specular highlights and view-dependent color problems you mention are massively improved by adding spherical harmonics into each ellipse?
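
      To make the spherical harmonics idea concrete, here is a minimal sketch of a view-dependent color built from real spherical harmonics up to degree 1. Each splat stores one RGB triple per basis function, and the color is their weighted sum for the current viewing direction. The coefficient ordering, the sign convention and the function name are assumptions for the example; real implementations differ in these details and typically use higher degrees.

```python
"""View-dependent color from degree-0/1 real spherical harmonics (sketch)."""
import math

SH_C0 = 0.5 / math.sqrt(math.pi)            # constant basis function Y_0^0
SH_C1 = math.sqrt(3.0 / (4.0 * math.pi))    # scale of the three linear terms

Vec3 = tuple[float, float, float]


def sh_color(coeffs: list[Vec3], direction: Vec3) -> Vec3:
    """Evaluate RGB color for a unit viewing direction.

    coeffs holds four RGB triples: the constant term followed by the
    coefficients of the y, z and x linear terms (assumed ordering).
    """
    x, y, z = direction
    basis = (SH_C0, SH_C1 * y, SH_C1 * z, SH_C1 * x)
    r, g, b = (sum(w * c[channel] for w, c in zip(basis, coeffs))
               for channel in range(3))
    return (r, g, b)


if __name__ == "__main__":
    # A splat that is grey on average but redder when seen from +x.
    coeffs = [(1.5, 1.5, 1.5), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.4, 0.0, 0.0)]
    print(sh_color(coeffs, (1.0, 0.0, 0.0)))
    print(sh_color(coeffs, (-1.0, 0.0, 0.0)))
```

      With only the constant term, the color is the same from every direction. The linear terms add a smooth variation across viewing directions, and higher degrees are needed for sharper highlights.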

  • Solvency 2 years ago

    Sure, why not? It's just a fancy point cloud. I can easily imagine an open world Minecraft-esque game that uses this for its base engine instead of voxels.

gsuuon 2 years ago

Would this technique work for video? The readme of the inria work[1] seems to imply a model is trained per static scene, does that rule out video?

[1] https://github.com/graphdeco-inria/gaussian-splatting

andrewstuart 2 years ago

What am I looking at?

  • KaiserPro 2 years ago

    Gaussian splatting is a fancy word for pointcloud but with coloured shapes instead of points.

    It's been around for ages, but it was never used because if you have a million points in a point cloud, you'd need to artistically manipulate a million points.

    It's like 3D hair: it's pretty simple, just render a billion hairs, but in practice it's hard to make it look good.

    Here we tell a machine learning model to adjust the angle, colour, shape and size of a million primitives (i.e. a square, circle, triangle etc.) so that it looks like the photos we provide.

    • Geee 2 years ago

      It's a little bit more than that. Gaussians are view-dependent, which means that they can capture the full radiance field of the scene, rather than just the color and geometry of the objects. All the light bouncing around from different objects can be reproduced, including reflections etc.

      See the reflections here: https://www.youtube.com/watch?v=mD0oBE9LJTQ

      This is also pretty good, but more subtle: https://www.youtube.com/watch?v=tJTbEoxxj0U

      • ath92 2 years ago

        This implementation does not support view-dependence though (mentioned in the readme)

      • KaiserPro 2 years ago

        > Gaussians are view-dependent,

        indeed, but that's just adding view dependent points.

    • dclowd9901 2 years ago

      My initial understanding is these scenes can’t be made dynamic (animated, physically responsive). Is that correct?

  • bestest 2 years ago

    basically this: https://github.com/graphdeco-inria/gaussian-splatting — a somewhat different approach at rendering 3d scenes.

lwansbrough 2 years ago

Does this use the method proposed by Kerbl and Kopanas at SIGGRAPH 2023?

https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/

  • porphyra 2 years ago

    Yes, but this is just the splatting/rendering part and not the optimization part that generates the reconstruction in the first place.

gabereiser 2 years ago

This is beyond cool. Point clouds are one thing but this… this is amazing. Kudos and great job. It even runs on my work Lenovo at 60fps.

  • esperent 2 years ago

    It runs on my mid range phone at 36fps. Did not expect that.

    Lots of artefacts though, especially if I move the camera.

crubier 2 years ago

Wow this is insanely cool.

If you make it work within ThreeJS, you're going to leave a trace in the history of 3D on the web with that stuff!

klysm 2 years ago

I've never experienced this set of mouse controls for a 3D view ever before and was highly confused for a bit.

adfm 2 years ago

Very impressive! Curious what the frame rate would be like for stereoscopic rendering of the same scene on the same hardware. Are there optimizations to be had past the halfway mark?

  • Yenrabbit 2 years ago

    Definitely. One of the time-consuming parts of rendering is sorting the Gaussians by distance to the camera, which for two nearby cameras could be optimized. This also goes for adjacent frames: assuming smooth motion, I'm pretty sure there is some speedup to be had by assuming the previous sort will be close to correct rather than starting from scratch each frame.
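
    A minimal sketch of the frame-to-frame idea, using a plain insertion sort as a stand-in for whatever a real renderer would use: starting from the previous frame's order, the work is proportional to the number of splats plus the number of pairs that swapped places, so a nearly correct order is cheap to repair. The function name and the returned shift count are made up for the example.

```python
"""Re-sorting splats by depth, starting from last frame's order (sketch)."""


def resort_by_depth(order: list[int], depths: list[float]) -> int:
    """Insertion-sort `order` in place so depths[order[i]] is ascending.

    Returns the number of element shifts, which equals the number of
    out-of-order pairs in the starting order.
    """
    shifts = 0
    for i in range(1, len(order)):
        idx = order[i]
        j = i - 1
        # Move larger-depth entries one slot right until idx fits.
        while j >= 0 and depths[order[j]] > depths[idx]:
            order[j + 1] = order[j]
            j -= 1
            shifts += 1
        order[j + 1] = idx
    return shifts


if __name__ == "__main__":
    order = list(range(5))
    # First frame: nothing to reuse, the order is arbitrary.
    print(resort_by_depth(order, [5.0, 1.0, 3.0, 2.0, 4.0]), order)
    # Next frame: the camera moved slightly, two splats swapped depth order.
    print(resort_by_depth(order, [5.0, 1.0, 2.9, 3.0, 4.0]), order)
```

    The same reasoning applies to a stereo pair: the second eye can start from the first eye's order instead of an unsorted list.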

matt3210 2 years ago

Fancy! I like that on mobile I can drag to move around!

cchance 2 years ago

Gotta try it on a PC later. For some reason, on iOS the cloudiness feels more like NeRF than Gaussian splatting to me.

smusamashah 2 years ago

Is it possible to increase the number of points (resolution) with some setting? I want to see more refined view on a higher end machine.

crtasm 2 years ago

Click through to the github for a list of the controls (I didn't think to try spacebar!) and links to other example scenes.

tmilard 2 years ago

Gaussian Splatting is the new sensation of the summer in the 3D scanning field. Will it live up to expectations?

teucris 2 years ago

Wow. I was literally just working on my own implementation. You beat me to it! Great work!

jheriko 2 years ago

why the hell is it that anyone who makes these "clever" demos provides the world's shittiest camera that adds unwelcome rolls.

late 90s bedroom me is shaking his head.

q_andrew 2 years ago

Can't wait to pull this up on my desktop tomorrow.

msk-lywenn 2 years ago

Runs fine on my 2016 iPhone SE. kudos

agys 2 years ago

Last sentence of the readme…!
