Abstract

Recent advancements in open-world 3D object generation have been remarkable, with image-to-3D methods offering superior fine-grained control over their text-to-3D counterparts. However, most existing models fall short in simultaneously providing rapid generation speeds and high fidelity to input images, two features essential for practical applications. In this paper, we present One-2-3-45++, an innovative method that transforms a single image into a detailed 3D textured mesh in approximately one minute. Our approach aims to fully harness the extensive knowledge embedded in 2D diffusion models and priors from valuable yet limited 3D data. This is achieved by initially fine-tuning a 2D diffusion model for consistent multi-view image generation, followed by elevating these images to 3D with the aid of multi-view conditioned 3D native diffusion models. Extensive experimental evaluations demonstrate that our method can produce high-quality, diverse 3D assets that closely mirror the original input image.

Project Page: https://sudo-ai-3d.github.io/One2345plus_page/

Paper: https://arxiv.org/pdf/2311.07885.pdf

Code: https://github.com/SUDO-AI-3D/One2345plus

Demo: https://www.sudo.ai/3dgen
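
At a high level, the abstract describes a two-stage pipeline: a fine-tuned 2D diffusion model first produces consistent multi-view images of the object, and a multi-view-conditioned, 3D-native diffusion model then lifts those views to a textured mesh. The sketch below is only an illustrative outline of that data flow under assumed interfaces; every name in it (ViewImage, TexturedMesh, MultiViewDiffusion, NativeDiffusion3D, image_to_textured_mesh, the num_views default) is a placeholder, not the actual One-2-3-45++ API. See the linked repo for the real implementation.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class ViewImage:
    """One generated view of the object from a fixed camera pose (placeholder type)."""
    pose_index: int
    pixels: bytes


@dataclass
class TexturedMesh:
    """Pipeline output: geometry plus a texture map (placeholder type)."""
    vertices: list
    faces: list
    texture: bytes


class MultiViewDiffusion(Protocol):
    """Stage 1: a 2D diffusion model fine-tuned for consistent multi-view generation."""
    def generate_views(self, image_path: str, num_views: int) -> List[ViewImage]: ...


class NativeDiffusion3D(Protocol):
    """Stage 2: a 3D-native diffusion model conditioned on the multi-view images."""
    def lift_to_mesh(self, views: List[ViewImage]) -> TexturedMesh: ...


def image_to_textured_mesh(image_path: str,
                           stage1: MultiViewDiffusion,
                           stage2: NativeDiffusion3D,
                           num_views: int = 6) -> TexturedMesh:
    """Single input image -> textured 3D mesh, mirroring the abstract's two stages."""
    # Stage 1: elevate the single image to a set of pose-consistent views.
    views = stage1.generate_views(image_path, num_views=num_views)
    # Stage 2: lift those views to 3D with the multi-view-conditioned model.
    return stage2.lift_to_mesh(views)
```

In practice the two stages would be concrete wrappers around the released checkpoints; the split shown here just makes explicit the flow the abstract describes (image → consistent views → textured mesh).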

  • kryllic · 1 year ago

    Let’s see the wireframe lol, if it requires a ton of retopology I don’t see this being too significant of a time saver. Really cool tech tho

    • tal@lemmy.today · 1 year ago

      I don’t know about immediate applicability for generating images, but I think that in the long run, models are going to have to be based on 3d representations. That’s what humans do, after all.

      Doing 2d is enormously limiting: you're not making effective use of a lot of training data, since you can't combine much knowledge about an object from different orientations.