Zip-NeRF: Anti-aliased grid-based neural radiance fields [video]
Anyone know how many input images were used to train the network for the house scene? I don't see that information anywhere.
Project page: https://jonbarron.info/zipnerf/
Looks nice! NeRFs have come a long way.
The only artifact I can see is that straight lines, e.g. moldings, become wavy. Not jittery, but consistently wavy. Maybe add some line detection to force-straighten them? Perhaps the same would be true for ellipses in images with large circles; these interior videos had more straight lines, though.
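Something like this, as a very rough sketch of the detection half (OpenCV's Canny edge detector plus a probabilistic Hough transform; the file names and thresholds here are made up):

    import cv2
    import numpy as np

    # Hypothetical input: one rendered frame from the fly-through.
    frame = cv2.imread("rendered_frame.png")
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Edges first, then a probabilistic Hough transform to pick out
    # long, near-straight segments such as moldings.
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=100, maxLineGap=10)

    # Each detected segment could act as a straightness constraint
    # for snapping nearby wavy geometry back into line.
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            cv2.line(frame, (x1, y1), (x2, y2), (0, 255, 0), 1)
    cv2.imwrite("frame_with_lines.png", frame)

The straightening itself would be the hard part, since you'd have to push the correction back into the reconstruction rather than just touching up the 2D frame.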
Hoping for the day this technology makes it into gaming. Looks incredible.
The problem for gaming is that these scenes are completely static.
A lot of game assets are static, and people are working on animated NeRFs too. Collisions and physics will also be necessary.
Example of character animation:
https://zju3dv.github.io/animatable_nerf/
Example of a static world already running in Unreal Engine in real time:
https://www.youtube.com/watch?v=GjpzMDur7UY
Diffusion models for NeRF asset generation:
https://sirwyver.github.io/DiffRF/
https://dreamfusion3d.github.io/
Physics:
https://www.youtube.com/watch?v=Md0PM-wv_Xg
https://www.youtube.com/watch?v=Eklh1pIAri0
There is a lot more research being done on this; it's hard to keep up with everything happening in AI.
True, but even for static areas and backgrounds, I feel like this level of detail and fidelity blows away what's used in current AAA games.
I think a bigger problem is how much compute NeRF models need. From the paper:
"Our model, Mip-NeRF 360, and our 'mip-NeRF 360 + iNGP' baseline were all trained on 8 NVIDIA Tesla V100-SXM2-16GB GPUs."
Even with all that, it takes almost an hour to train. A developer would need to do this for every scene, which may make it infeasible for indies, the ones who would benefit most from NeRF in gaming.
The better question is how many person-hours it would take to model a scene like that with alternative approaches, and whether a NeRF can be used to create outputs that could be used in a game engine.
That's basically just FMV games all over again.
My thoughts exactly. This is the future: both realistic and lazy (as a scene can simply be described in a prompt).
To clarify for people who don't follow NeRF techniques, this research is not prompt based. The algorithm is capturing the 3D scene from real-life images. There is some super promising work on mixing NeRF-based techniques with various generative models to create 3D objects from prompts, but it doesn't seem close to creating anything at this kind of scale/detail yet. I do agree this is a future possibility, though.
I will admit I stand corrected about not being close: https://twitter.com/_akhaliq/status/1648848468234911754
Can someone explain what this is in layperson's terms? I watched the video and read the paper summary, but I don't know what I'm looking at.
Techniques like NeRF let you take a bunch of photos of a real 3D scene and then generate images/video of the scene from arbitrary viewpoints; NeRF infers the 3D structure using machine learning. So what you're seeing is the camera smoothly flying around rooms, with the video generated (in near real time, I think) by an "AI" that was trained on pictures of the rooms.
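If it helps to see the mechanics, here's a toy sketch of the core volume-rendering idea behind NeRF in plain numpy. The radiance_field function is a made-up stand-in for the trained network:

    import numpy as np

    def radiance_field(points, view_dir):
        # Stand-in for the trained network: maps 3D points (and a view
        # direction) to a per-point RGB color and volume density.
        density = np.exp(-np.linalg.norm(points, axis=-1))  # made-up field
        color = np.clip(points * 0.5 + 0.5, 0.0, 1.0)       # made-up colors
        return color, density

    def render_ray(origin, direction, near=0.1, far=4.0, n_samples=64):
        # Sample points along the camera ray between the near and far planes.
        t = np.linspace(near, far, n_samples)
        points = origin + t[:, None] * direction
        color, density = radiance_field(points, direction)

        # Volume-rendering quadrature: alpha-composite the samples front to
        # back, weighting each one by the transmittance accumulated so far.
        delta = (far - near) / n_samples
        alpha = 1.0 - np.exp(-density * delta)
        transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
        weights = alpha * transmittance
        return (weights[:, None] * color).sum(axis=0)

    # One pixel = one ray; a full image is just a grid of rays.
    print(render_ray(np.zeros(3), np.array([0.0, 0.0, 1.0])))

In a real NeRF, radiance_field is a neural network whose weights are optimized so that rendered rays reproduce the input photos; once trained, novel viewpoints just mean shooting new rays.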
Thanks. Pretty cool!
The depth map from this is super sharp.
This will be quite neat for real estate