Zero-Shot Text-Guided Object Generation with Dream Fields

2 min read Original article ↗

Compositional generation

The compositional nature of language allows users to combine concepts in novel ways and control generation. A template prompt describing a primary object (an armchair or a teapot) is stylized with 16 materials: avocado, glacier, orchid, pikachu, brain coral, gourd, peach, rubik's cube, doughnut, hibiscus, peacock, sardines, fossil, lotus root, pig, or strawberry. These prompt templates are sourced from DALL-E.

an archair in the shape of a ____.
an archair imitating a ____.

a teapot in the shape of a ____.
a teapot imitating a ____.

Related publications

Overview of DietNeRF's semantic consistency loss.

Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis

Ajay Jain, Matthew Tancik, Pieter Abbeel ICCV 2021 International Conference on Computer Vision

DietNeRF regularizes Neural Radiance Fields with a CLIP-based loss to improve 3D reconstruction. Given only a few images of an object or scene, we reconstruct its 3D structure & render novel views using prior knowledge contained in large image encoders.

mip-NeRF integrated positional encoding

Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields

Jonathan T. Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, Pratul Srinivasan ICCV 2021 International Conference on Computer Vision

NeRF is aliased, but we can anti-alias it by casting cones and prefiltering the positional encoding function. Dream Fields combine mip-NeRF's integrated positional encoding with Fourier features.

Citation

Ajay Jain, Ben Mildenhall, Jonathan T. Barron, Pieter Abbeel, Ben Poole. Zero-Shot Text-Guided Object Generation with Dream Fields. arXiv, 2021.

@article{jain2021dreamfields,
  author = {Jain, Ajay and Mildenhall, Ben and Barron, Jonathan T. and Abbeel, Pieter and Poole, Ben},
  title = {Zero-Shot Text-Guided Object Generation with Dream Fields},
  joural = {CVPR},
  year = {2022},
}