Sunday, June 8, 2025
News PouroverAI

Automatic Labeling With GroundingDino | by Lihi Gur Arie, PhD | Feb, 2024

February 6, 2024
in AI Technology
Reading Time: 4 mins read



Prompt Engineering
The GroundingDino model encodes text prompts into a learned latent space. Changing the prompt changes the resulting text features, which in turn affects the detector's performance. To improve prediction accuracy, experiment with several prompts and keep the one that yields the best results. While writing this article, I tried several prompts before finding a suitable one, and some of them led to unexpected outcomes.
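The prompt comparison described above can be organized as a small loop. The following is a minimal sketch, not part of the GroundingDino repository: `compare_prompts` and `fake_detect` are hypothetical names, the confidence scores are invented, and the detector is passed in as a plain function so the example runs without the model. The number of boxes and their mean score are only rough signals; the annotated images still need to be inspected to judge which prompt is best.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical detector signature: takes a text prompt and returns the
# confidence score of every box detected in one image.
Detector = Callable[[str], List[float]]


def compare_prompts(detect: Detector, prompts: Sequence[str]) -> Dict[str, dict]:
    """Run the detector once per prompt and summarize what came back."""
    summary = {}
    for prompt in prompts:
        scores = detect(prompt)
        summary[prompt] = {
            "num_boxes": len(scores),
            "mean_score": sum(scores) / len(scores) if scores else 0.0,
        }
    return summary


def fake_detect(prompt: str) -> List[float]:
    """Stand-in detector with made-up scores (assumption, not model output)."""
    canned = {
        "tomato": [0.80, 0.60, 0.40],
        "ripened tomato": [0.70, 0.50],
        "red fruit": [],
    }
    return canned.get(prompt, [])


results = compare_prompts(fake_detect, ["tomato", "ripened tomato", "red fruit"])
for prompt, stats in results.items():
    print(f"{prompt!r}: {stats['num_boxes']} boxes, mean score {stats['mean_score']:.2f}")
```

In practice, `fake_detect` would be replaced by a wrapper around the real model that returns the confidence of each predicted box.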

Getting Started
To begin, we will clone the GroundingDino repository from GitHub, set up the environment by installing the necessary dependencies, and download the pre-trained model weights.

Clone the repository:
```
!git clone https://github.com/IDEA-Research/GroundingDINO.git
```

Install the dependencies:
```
%cd GroundingDINO/
!pip install -r requirements.txt
!pip install -q -e .
```

Download the pre-trained model weights:
```
!wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
```

Inference on an image
We will now explore the object detection algorithm by applying it to a single image of tomatoes. Our initial objective is to detect all the tomatoes in the image using the text prompt “tomato”. To pass several category names in one prompt, separate them with a dot (e.g., “tomato.red”).

```bash
python3 demo/inference_on_a_image.py \
  --config_file 'groundingdino/config/GroundingDINO_SwinT_OGC.py' \
  --checkpoint_path 'groundingdino_swint_ogc.pth' \
  --image_path 'tomatoes_dataset/tomatoes1.jpg' \
  --text_prompt 'tomato' \
  --box_threshold 0.35 \
  --text_threshold 0.01 \
  --output_dir 'outputs'
```

Annotations with the ‘tomato’ prompt:

Image by Markus Spiske.
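When the category names live in a Python list, the dot-separated prompt can be assembled with a small helper. This is a sketch under assumptions: `build_text_prompt` is a hypothetical name, the spaces around the dot and the lowercasing are tidy-up choices rather than documented requirements, and “leaf” is only an illustrative second category.

```python
from typing import Iterable


def build_text_prompt(categories: Iterable[str], separator: str = " . ") -> str:
    """Join category names into a single dot-separated prompt string."""
    # Drop empty entries and normalize whitespace and case (assumed convention).
    cleaned = [name.strip().lower() for name in categories if name.strip()]
    return separator.join(cleaned)


prompt = build_text_prompt(["Tomato", " leaf "])
print(prompt)
```

The resulting string can then be passed as the value of `--text_prompt`.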

The GroundingDino model not only detects objects as categories, such as “tomato”, but also comprehends the input text, a task known as Referring Expression Comprehension (REC). Let’s change the text prompt from “tomato” to “ripened tomato” and observe the outcome:

```bash
python3 demo/inference_on_a_image.py \
  --config_file 'groundingdino/config/GroundingDINO_SwinT_OGC.py' \
  --checkpoint_path 'groundingdino_swint_ogc.pth' \
  --image_path 'tomatoes_dataset/tomatoes1.jpg' \
  --text_prompt 'ripened tomato' \
  --box_threshold 0.35 \
  --text_threshold 0.01 \
  --output_dir 'outputs'
```

Annotations with the ‘ripened tomato’ prompt:

Image by Markus Spiske.

Remarkably, the model can understand the text and differentiate between a ‘tomato’ and a ‘ripened tomato’. It even tags partially ripened tomatoes that aren’t fully red. If our task requires tagging only fully ripened red tomatoes, we can adjust the box_threshold from the default 0.35 to 0.5:

```bash
python3 demo/inference_on_a_image.py \
  --config_file 'groundingdino/config/GroundingDINO_SwinT_OGC.py' \
  --checkpoint_path 'groundingdino_swint_ogc.pth' \
  --image_path 'tomatoes_dataset/tomatoes1.jpg' \
  --text_prompt 'ripened tomato' \
  --box_threshold 0.5 \
  --text_threshold 0.01 \
  --output_dir 'outputs'
```

Annotations with the ‘ripened tomato’ prompt, with box_threshold = 0.5:

Image by Markus Spiske.
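The effect of raising box_threshold can be illustrated with a plain confidence filter. This is a simplified sketch, not the repository's code: `keep_confident` is a hypothetical helper, the scores are invented, and the strict “greater than” comparison is an assumption.

```python
from typing import List, Tuple

# (label, confidence) pairs; the values are invented for illustration.
Detection = Tuple[str, float]


def keep_confident(detections: List[Detection], box_threshold: float) -> List[Detection]:
    """Keep only the detections whose confidence exceeds the threshold."""
    return [d for d in detections if d[1] > box_threshold]


detections = [
    ("ripened tomato", 0.72),
    ("ripened tomato", 0.48),
    ("ripened tomato", 0.38),
]

print(len(keep_confident(detections, 0.35)))  # all three boxes pass
print(len(keep_confident(detections, 0.5)))   # only the 0.72 box passes
```

A higher threshold trades recall for precision: fewer boxes survive, but the ones that do are those the model is most confident about.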

Generation of tagged dataset
While GroundingDino offers remarkable capabilities, it is a large and slow model. If real-time object detection is required, it is recommended to use a faster model like YOLO. However, training YOLO and similar models requires a significant amount of tagged data, which can be expensive and time-consuming to produce. Fortunately, if your data is not unique, that is, it shows common objects the model is likely to recognize, you can use GroundingDino to tag it. For more information on efficient YOLO training, refer to my previous article [4].

The GroundingDino repository includes a script that annotates image datasets in the COCO format, which is suitable for YOLOX, among others.

```python
from demo.create_coco_dataset import main

main(
    image_directory='tomatoes_dataset',
    text_prompt='tomato',
    box_threshold=0.35,
    text_threshold=0.01,
    export_dataset=True,
    view_dataset=False,
    export_annotated_images=True,
    weights_path='groundingdino_swint_ogc.pth',
    config_path='groundingdino/config/GroundingDINO_SwinT_OGC.py',
    subsample=None,
)
```

Parameters:
– export_dataset: If set to True, the COCO format annotations will be saved in a directory named ‘coco_dataset’.
– view_dataset: If set to True, the annotated dataset will be displayed for visualization in the FiftyOne app.
– export_annotated_images: If set to True, the annotated images will be stored in a directory named ‘images_with_bounding_boxes’.
– subsample (int): If specified, only this number of images from the dataset will be annotated.
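To make the exported format more concrete: a COCO annotation stores each bounding box as [x_min, y_min, width, height] in pixels. The sketch below shows what such an entry looks like, assuming the detector returns boxes as normalized (center_x, center_y, width, height) values. The repository script handles the export itself; `cxcywh_norm_to_coco`, the 640x480 image size, and the ids are hypothetical and serve only as illustration.

```python
from typing import List


def cxcywh_norm_to_coco(box: List[float], img_w: int, img_h: int) -> List[float]:
    """Convert a normalized center-format box to COCO [x_min, y_min, w, h] pixels."""
    cx, cy, w, h = box
    box_w = w * img_w
    box_h = h * img_h
    x_min = cx * img_w - box_w / 2
    y_min = cy * img_h - box_h / 2
    return [x_min, y_min, box_w, box_h]


# Assumed input: a box centered in a 640x480 image (illustrative values).
coco_box = cxcywh_norm_to_coco([0.5, 0.5, 0.25, 0.5], img_w=640, img_h=480)

# Minimal COCO-style annotation entry; the ids are placeholders.
annotation = {
    "image_id": 1,
    "category_id": 1,
    "bbox": coco_box,
    "area": coco_box[2] * coco_box[3],
    "iscrowd": 0,
}
print(annotation)
```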

Different YOLO algorithms require different annotation formats. If you plan to train YOLOv5 or YOLOv8, you will need to export your dataset in the YOLOv5 format. Although the export type is hard-coded in the main script, you can easily change it by adjusting the dataset_type argument in create_coco_dataset.main, from fo.types.COCODetectionDataset to fo.types.YOLOv5Dataset (line 72). To maintain organization, we will also change the output directory name from ‘coco_dataset’ to ‘yolov5_dataset’. After making these changes, run create_coco_dataset.main again.

```python
if export_dataset:
    dataset.export('yolov5_dataset', dataset_type=fo.types.YOLOv5Dataset)
```
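For orientation, the YOLOv5 format stores one text file per image, with one line per object: the class index followed by the normalized box center, width, and height. The sketch below converts a COCO-style pixel box into such a line. The export above already does this for you; `coco_to_yolo_line` and the 640x480 image size are hypothetical and shown only to clarify how the two formats differ.

```python
from typing import List


def coco_to_yolo_line(class_id: int, bbox: List[float], img_w: int, img_h: int) -> str:
    """Turn a COCO [x_min, y_min, w, h] pixel box into a YOLOv5 label line."""
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_w
    y_center = (y_min + h / 2) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {w / img_w:.6f} {h / img_h:.6f}"


# Assumed example box in a 640x480 image (illustrative values).
line = coco_to_yolo_line(0, [240.0, 120.0, 160.0, 240.0], img_w=640, img_h=480)
print(line)
```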

GroundingDino offers a significant advancement in object detection annotations through the use of text prompts. In this tutorial, we have explored how to utilize the model for automated labeling of images or entire datasets. However, it is crucial to manually review and verify these annotations before utilizing them in training subsequent models.
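For a large dataset, a first review pass over a random sample of the annotated images can reveal systematic errors, such as a prompt that consistently tags the wrong object. The sketch below picks a reproducible sample. It is only an aid and does not replace a full review; `sample_for_review` and the file names are hypothetical.

```python
import random
from typing import List, Sequence


def sample_for_review(image_names: Sequence[str], k: int, seed: int = 0) -> List[str]:
    """Pick up to k images for manual inspection, reproducibly."""
    rng = random.Random(seed)        # fixed seed gives the same sample on every run
    k = min(k, len(image_names))     # never ask for more images than exist
    return sorted(rng.sample(list(image_names), k))


# Hypothetical file names standing in for the annotated images.
image_names = [f"tomatoes{i}.jpg" for i in range(1, 11)]
to_review = sample_for_review(image_names, k=3)
print(to_review)
```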

A user-friendly Jupyter notebook containing the complete code is included for your convenience.

Want to learn more?
[1] Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection, 2023.
[2] DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection, 2022.
[3] An Open and Comprehensive Pipeline for Unified Object Grounding and Detection, 2023.
[4] The practical guide for Object Detection with YOLOv5 algorithm, by Dr. Lihi Gur Arie.


