Sunday, June 8, 2025
News PouroverAI

Zhejiang University Researchers Propose UrbanGIRAFFE to Tackle Controllable 3D Aware Image Synthesis for Challenging Urban Scenes

November 20, 2023
in Data Science & ML
Reading Time: 4 mins read


UrbanGIRAFFE, proposed by researchers from Zhejiang University, is a photorealistic image synthesis approach with controllable camera pose and scene contents. To address the difficulty of generating urban scenes that support free camera viewpoint control and scene editing, the model adopts a compositional and controllable strategy built on a coarse 3D panoptic prior, which includes the layout distribution of uncountable stuff and countable objects. The approach breaks the scene down into stuff, objects, and sky, enabling diverse controllability such as large camera movement, stuff editing, and object manipulation.

Prior methods for conditional image synthesis have excelled, particularly those that use Generative Adversarial Networks (GANs) to generate photorealistic images. Existing approaches condition image synthesis on semantic segmentation maps or layouts, but they focus predominantly on object-centric scenes and neglect complex, unaligned urban scenes. UrbanGIRAFFE, a dedicated 3D-aware generative model for urban scenes, addresses these limitations, offering diverse controllability for large camera movements, stuff editing, and object manipulation.

GANs have proven effective at generating controllable, photorealistic images in conditional image synthesis. However, existing methods are limited to object-centric scenes and struggle with urban scenes, which hinders free camera viewpoint control and scene editing. UrbanGIRAFFE breaks scenes down into stuff, objects, and sky, and uses semantic voxel grids and object layouts as priors to enable diverse controllability, including large camera movements and scene manipulations.
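To make the stuff, objects, and sky decomposition concrete, here is a minimal sketch of how samples from separate components could be merged along one camera ray. It assumes standard volume-rendering alpha compositing with the sky treated as a background color behind everything else; the article does not spell out the compositing rule, so the function name `composite_ray`, the sample layout, and the `far` bound are all hypothetical.

```python
import math
from typing import List, Tuple

RGB = Tuple[float, float, float]
# One ray sample: (depth along the ray, density, color). Hypothetical layout.
Sample = Tuple[float, float, RGB]


def composite_ray(stuff: List[Sample], objects: List[Sample],
                  sky_rgb: RGB, far: float = 100.0) -> RGB:
    """Alpha-composite stuff and object samples front to back, then add sky.

    Assumption: plain volume-rendering weights, not necessarily the exact
    rule used by UrbanGIRAFFE.
    """
    # Merge both components and order them by depth along the ray.
    samples = sorted(stuff + objects, key=lambda s: s[0])
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed
    for i, (t, sigma, rgb) in enumerate(samples):
        # Segment length to the next sample (or to the far bound).
        t_next = samples[i + 1][0] if i + 1 < len(samples) else far
        alpha = 1.0 - math.exp(-sigma * (t_next - t))
        weight = transmittance * alpha
        for k in range(3):
            color[k] += weight * rgb[k]
        transmittance *= 1.0 - alpha
    # Whatever light is left over reaches the sky background.
    return tuple(c + transmittance * s for c, s in zip(color, sky_rgb))
```

Because each component contributes its own samples, editing one of them (moving an object, changing a stuff class) changes only that component's samples while the rest of the ray is composited as before.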

UrbanGIRAFFE dissects urban scenes into uncountable stuff, countable objects, and the sky, employing prior distributions for stuff and objects to disentangle complex urban environments. The model features a conditioned stuff generator that uses semantic voxel grids as a stuff prior to integrate coarse semantic and geometric information. An object layout prior facilitates learning an object generator from cluttered scenes. The model is trained end-to-end with adversarial and reconstruction losses, and it uses ray-voxel and ray-box intersection strategies to optimize sampling locations, reducing the number of sampling points required.
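The ray-box intersection idea can be illustrated with the classic slab method: find where a ray enters and leaves an axis-aligned box, then place sample points only inside that interval. This is a sketch under assumptions, not the authors' code; the function names and the axis-aligned simplification are hypothetical (an oriented object box would first need the ray transformed into the box's local frame).

```python
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def ray_box_interval(origin: Sequence[float], direction: Sequence[float],
                     box_min: Sequence[float], box_max: Sequence[float]
                     ) -> Optional[Tuple[float, float]]:
    """Slab method: return (t_near, t_far) where origin + t * direction lies
    inside the axis-aligned box, or None if the ray misses it."""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if abs(d) < 1e-12:
            # Ray is parallel to this pair of slabs: it must start between them.
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near, t_far


def sample_in_box(origin: Sequence[float], direction: Sequence[float],
                  box_min: Sequence[float], box_max: Sequence[float],
                  n_samples: int) -> List[Vec3]:
    """Place n_samples points at the midpoints of equal segments of the
    part of the ray that lies inside the box (empty list on a miss)."""
    interval = ray_box_interval(origin, direction, box_min, box_max)
    if interval is None:
        return []
    t_near, t_far = interval
    step = (t_far - t_near) / n_samples
    ts = [t_near + (i + 0.5) * step for i in range(n_samples)]
    return [tuple(o + t * d for o, d in zip(origin, direction)) for t in ts]
```

For example, `sample_in_box((0, 0, -5), (0, 0, 1), (-1, -1, -1), (1, 1, 1), 4)` spends all four samples inside the box, whereas uniform sampling along the whole ray would waste most of them on empty space.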

In a comprehensive evaluation, UrbanGIRAFFE surpasses various 2D and 3D baselines on synthetic and real-world datasets, showing better controllability and fidelity. Qualitative results on the KITTI-360 dataset show that UrbanGIRAFFE outperforms GIRAFFE in background modeling, which enables better stuff editing and camera viewpoint control. Ablation studies on KITTI-360 confirm the contribution of UrbanGIRAFFE’s architectural components, including the reconstruction loss, the object discriminator, and the object modeling. Using a moving-average model during inference further improves the quality of generated images.
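A moving-average model keeps a smoothed copy of the generator weights that is updated after each training step and used in place of the raw weights at inference. The sketch below assumes an exponential moving average, which is a common choice for GANs; the article does not say which averaging scheme or decay value is used, so the class name `WeightEMA` and the `decay` default are hypothetical, and plain floats stand in for weight tensors.

```python
from typing import Dict


class WeightEMA:
    """Exponential moving average over named parameters (assumed scheme)."""

    def __init__(self, params: Dict[str, float], decay: float = 0.999) -> None:
        self.decay = decay
        # The averaged ("shadow") weights start as a copy of the live weights.
        self.shadow = dict(params)

    def update(self, params: Dict[str, float]) -> None:
        """Blend the current training weights into the averaged weights."""
        d = self.decay
        for name, value in params.items():
            self.shadow[name] = d * self.shadow[name] + (1.0 - d) * value
```

In a training loop, `update` would be called once per optimizer step, and images would be generated from `shadow` rather than from the live weights.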

UrbanGIRAFFE addresses the complex task of controllable 3D-aware image synthesis for urban scenes, offering control over camera viewpoint, semantic layout, and object interactions. Using a 3D panoptic prior, the model disentangles scenes into stuff, objects, and sky, which enables compositional generative modeling. The approach advances 3D-aware generative models for intricate, unbounded scenes. The researchers also emphasize the reconstruction loss, which helps maintain fidelity and produce diverse results, especially for infrequently encountered semantic classes.

Future work for UrbanGIRAFFE includes incorporating a semantic voxel generator for novel scene sampling, which would let the method generate more diverse urban scenes. The researchers also plan to explore lighting control by disentangling light from ambient color, aiming for finer-grained control over the appearance of generated scenes. Using a moving-average model during inference is another way to improve the quality of generated images.

Check out the Paper, GitHub, and Project. All credit for this research goes to the researchers of this project.

Hello, my name is Adnan Hassan. I am a consulting intern at Marktechpost and soon to be a management trainee at American Express. I am currently pursuing a dual degree at the Indian Institute of Technology, Kharagpur. I am passionate about technology and want to create new products that make a difference.



Tags: Aware, Challenging, Controllable, Image, Propose, Researchers, scenes, synthesis, Tackle, University, urban, UrbanGIRAFFE, Zhejiang
Copyright © 2023 PouroverAI News.