TRELLIS.2: state-of-the-art large 3D generative model (4B)
github.com
https://microsoft.github.io/TRELLIS.2/

I'm surprised at the lukewarm reception. Admittedly I don't follow the image-to-3D space that closely, but last time I checked in, the gloopy, fuzzy outputs did not impress me. I want to highlight what I believe is the coolest innovation: their novel O-Voxel data structure. I'm still trying to wrap my head around how they figured out the conversion from voxel-space to mesh-space. Those two worlds don't work well together. A 2D analogy is that they figured out an efficient, bidirectional, one-shot method of converting PNGs into SVGs, without iteration. Crazy.

As an old guy: "System: The code is currently tested only on Linux." :)

TRELLIS 1 had a massive impact on the research in this area, not least because it's actually open (full dataset, training and inference). Research like SynCity or PhysX-3D (not the NVIDIA one) wouldn't have been possible without it. Excited for the follow-ups to this new generation.

State of the art for open source.

This is a nice improvement, but far out, I cannot wait for a Sparc3D equivalent for local use. It's a step change in quality. I really hope Hunyuan3D-3 is the one to level up to that quality now.

To me it seems TRELLIS.2 has higher quality, and it also generates PBR materials and textures.

Where do they get the training data from?

> All our model, code, and dataset will be publicly released to facilitate reproduction and further research.

Will take some time to publish. The TRELLIS 1 dataset is public here: https://github.com/microsoft/TRELLIS/blob/main/DATASET.md

You can play with a demo here: https://huggingface.co/spaces/microsoft/TRELLIS.2
(requires a Hugging Face account)

The results from arbitrary pictures are not nearly as good as what's shown in the posting. So either the demo is running a gimped version of the model or the examples are _very_ handpicked.

Needs a 24GB GPU to run.

If it takes 60 seconds on a GPU, I can leave it running overnight on a CPU. (And going off previous experience, it won't even be that slow; I'm just being conservative.)

You can chat with vlm.run to generate these assets from image generation without needing a GPU.
Modi: https://chat.vlm.run/c/bdbaf1dc-b3c2-4b8a-ad17-6e26d87475fd
Musk: https://chat.vlm.run/c/894f44ce-c366-4c93-b348-de6eebfb9f03
Ronaldo: https://chat.vlm.run/c/db5eff27-8c52-4e0b-8e4e-157df6bef278
Gomez: https://chat.vlm.run/c/02b92f23-a6fd-4380-b7a9-52ae7463c9c1
Jackie Chan: https://chat.vlm.run/c/62ca5502-2066-4d5f-b994-c8845b2e72c9
Messi: https://chat.vlm.run/c/5bfda3c2-a2bf-4ca2-a874-78f1ce66edf1
Taylor Swift: https://chat.vlm.run/c/d411c16a-e2b1-490c-85cc-9f86f2564a59
Obama: https://chat.vlm.run/c/c55fe1d6-853b-4113-bdaf-d828bee0da53
Trump: https://chat.vlm.run/c/84b57509-ed38-454a-9b17-85c03a247993
Kohli: https://chat.vlm.run/c/da8ffe8a-0818-4820-bdeb-b82d965ae3a4
Lamar: https://chat.vlm.run/c/ef6488c3-b0f0-4651-944e-7d0c4f98bdd1
MrBeast: https://chat.vlm.run/c/c24d1cac-feb0-4a53-b1a2-2bbb62a7b7fb

Got a good laugh out of these, thanks.

Was fun generating them as well.

These are garbage compared to what's shown in the posting.

Project website gives a nice look: https://microsoft.github.io/TRELLIS.2/

Thanks, we'll put that link in the toptext as well.