Saturday, June 28, 2025
News PouroverAI

Meet CommonCanvas: An Open Diffusion Model That Has Been Trained Using Creative-Commons Images

November 1, 2023
in AI Technology


Artificial Intelligence Advancements in Text-to-Image Generation

Artificial intelligence has made significant progress in text-to-image generation in recent years. This technology has various applications, including content creation, aiding the visually impaired, and storytelling. However, researchers have faced two major obstacles: a lack of high-quality data and copyright issues related to internet-scraped datasets.

In a recent study, a team of researchers proposed building an image dataset from Creative Commons (CC) licensed photos and using it to train open diffusion models that are competitive with Stable Diffusion 2 (SD2). To achieve this, they needed to overcome two major challenges:

Absence of Captions

Although high-resolution CC photos are openly licensed, they often lack the textual descriptions (captions) needed to train text-to-image generative models. Without captions, a model has no text to associate with each image and cannot learn to produce visuals from textual input.

Scarcity of CC Photos

CC photos are a valuable resource, but they are scarce compared to larger web-scraped datasets like LAION. This scarcity raises the question of whether there is enough data to train high-quality models.

To address the first challenge, the team used transfer learning: a pre-trained model that generates text from images produced synthetic captions for a carefully selected collection of CC photos. The result is a dataset of photos paired with synthetic captions, which can be used to train generative models that translate words into visuals.
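The captioning step can be sketched roughly as follows. This is a minimal illustration, not the researchers' pipeline: `CaptionedImage`, `caption_images`, and the stub `fake_captioner` are hypothetical names, and in practice the callable would wrap a pre-trained image-captioning model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class CaptionedImage:
    """One training example: an image path plus its synthetic caption."""
    image_path: str
    caption: str


def caption_images(
    image_paths: Iterable[str],
    captioner: Callable[[str], str],
) -> List[CaptionedImage]:
    """Pair each uncaptioned image with a caption produced by `captioner`.

    Images whose caption comes back empty are skipped, since a
    text-to-image model cannot learn from an example with no text.
    """
    dataset = []
    for path in image_paths:
        caption = captioner(path).strip()
        if caption:
            dataset.append(CaptionedImage(path, caption))
    return dataset


def fake_captioner(path: str) -> str:
    """Stand-in for a pre-trained captioning model (assumption for this demo)."""
    canned = {"dog.jpg": "a dog running on a beach", "blank.jpg": ""}
    return canned.get(path, "a photo")


pairs = caption_images(["dog.jpg", "blank.jpg", "tree.jpg"], fake_captioner)
for pair in pairs:
    print(pair.image_path, "->", pair.caption)
```

Passing the captioner in as a plain callable keeps the pairing logic independent of any particular captioning model.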

To tackle the second challenge, the team developed a compute- and data-efficient training recipe. This recipe aims to match the quality of current SD2 models with significantly less data: it requires only about 3% of the data originally used to train SD2, or roughly 70 million examples. This suggests that there are enough CC photos available to train high-quality models.
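As a rough check on the scale involved, the two figures quoted above (about 70 million examples, about 3% of SD2's training data) imply an SD2 training set on the order of a few billion examples. The snippet below only restates that arithmetic; both inputs are rounded values from this article, so the result is an estimate rather than a reported number.

```python
# Rounded figures from the article; treat the result as an estimate.
cc_examples = 70_000_000   # approximate number of CC training examples
fraction_of_sd2 = 0.03     # "around 3%" of the data used to train SD2

implied_sd2_examples = cc_examples / fraction_of_sd2
print(f"Implied SD2 training set: ~{implied_sd2_examples / 1e9:.1f} billion examples")
```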

The team trained several text-to-image models using this data and the efficient training recipe. Together, these models form the CommonCanvas family, and they can generate visual outputs of similar quality to SD2.

The largest model in the CommonCanvas family, trained on a CC dataset less than 3% the size of the LAION dataset, achieves performance comparable to SD2 in human evaluations. Despite the smaller dataset and the use of synthetic captions, the method generates high-quality results.

The team summarized their primary contributions as follows:

  • They used transfer learning to produce high-quality synthetic captions for Creative Commons (CC) photos that originally had none.
  • They provided a dataset called CommonCatalog, consisting of approximately 70 million CC photos released under an open license.
  • They used CommonCatalog to train a series of Latent Diffusion Models (LDMs). Collectively known as CommonCanvas, these models perform competitively with the SD2-base baseline, both qualitatively and quantitatively.
  • They applied several training optimizations that made SD2-base training almost three times faster.
  • To encourage collaboration and further research, the team made the trained CommonCanvas models, the CC photos, the synthetic captions, and the CommonCatalog dataset freely available on GitHub.

For more information, please refer to the paper. All credit for this research goes to the researchers involved in this project.

About the Author

Tanya Malhotra is a final year undergraduate student at the University of Petroleum & Energy Studies, Dehradun. She is pursuing a BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning. Tanya is a Data Science enthusiast with strong analytical and critical thinking skills. She has a keen interest in acquiring new skills, leading groups, and managing work in an organized manner.

Copyright © 2023 PouroverAI News.