Customized backend with GCP Deployment and Data Versioning using GCS Integration
Table of Contents
– Introduction
– Overview
– Goal
– Why semiautomatic?
– Entering Label Studio
– 1 frontend + 2 backends
– Implementation (Local)
– Install git and docker & download backend code
– Set up frontend to get access token
– Set up backend containers
– Connect containers
– Happy labeling!
– GCP Deployment
– Select project/Create new project and set up billing account
– Create VM instance
– Set up VM environment
– Follow previous section & set up everything on VM
– GCS Integration
– Set up GCS buckets
– Create & set up service account key
– Rebuild backend containers
– SDK upload images from source bucket
– Set up Target Storage
– Acknowledgement
– References
Creating training data for image segmentation tasks can be challenging, especially for individuals and small teams. In this post, I will discuss a solution used in my capstone project where a team of 9 people successfully labeled 400+ images within a week.
Thanks to the Politecnico di Milano Gianfranco Ferré Research Center, we had access to thousands of fashion runway show images from Gianfranco Ferré’s archival database. To help manage and analyze the database, I implemented image segmentation for smarter cataloging and research. Segmenting runway show photos also lays the foundation for generating informative textual descriptions, which can improve search and support text-to-image generative AI approaches. This post details how to create your own backend with Label Studio, host it on Google Cloud Platform for collaboration, and use Google Cloud Storage buckets for data versioning.
Goal
Segment and identify the names and typologies of fashion clothing items in runway show images.
Why semiautomatic?
A trained segmentation model cannot perfectly recognize every piece of clothing in the runway show images due to the variety of styles and preferences of different fashion designers. By using a semiautomatic approach, where a model provides predictions that can be corrected by humans, we can achieve accurate labeling.
Entering Label Studio
Label Studio offers an open-source, customizable, and free community version for data labeling. By creating a custom backend, you can connect the Label Studio frontend to a trained segmentation model for labelers to improve upon auto-predictions. The interface of Label Studio resembles Photoshop and provides helpful segmentation tools like brush, eraser, and Magic Wand.
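To make the idea of auto-predictions concrete, here is a minimal sketch of how a model output could be packaged as a Label Studio pre-annotation. It is not the backend code from this project: it uses polygon results instead of brush masks (brush masks need an extra encoding step), and the tag names "label" and "image", the class name "jacket", and the model version string are hypothetical and would have to match your own labeling configuration.

```python
from typing import Dict, Sequence, Tuple


def polygon_to_prediction(
    pixel_points: Sequence[Tuple[float, float]],
    image_width: int,
    image_height: int,
    label: str,
    score: float,
    from_name: str = "label",   # hypothetical name of the control tag in the labeling config
    to_name: str = "image",     # hypothetical name of the image tag in the labeling config
    model_version: str = "demo-v0",  # hypothetical version string
) -> Dict:
    """Wrap one predicted polygon in the prediction structure Label Studio reads."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")

    # Label Studio stores polygon coordinates as percentages of the image size,
    # so pixel coordinates are rescaled to the 0-100 range.
    points = [
        [100.0 * x / image_width, 100.0 * y / image_height]
        for x, y in pixel_points
    ]

    return {
        "model_version": model_version,
        "score": score,
        "result": [
            {
                "from_name": from_name,
                "to_name": to_name,
                "type": "polygonlabels",
                "original_width": image_width,
                "original_height": image_height,
                "value": {"points": points, "polygonlabels": [label]},
            }
        ],
    }


if __name__ == "__main__":
    # A triangle predicted on a 200x400 image, labeled with a made-up class.
    prediction = polygon_to_prediction(
        [(0, 0), (50, 0), (50, 100)], 200, 400, label="jacket", score=0.9
    )
    print(prediction["result"][0]["value"]["points"])
```

A labeler then sees this polygon as a starting point and corrects it in the interface rather than drawing it from scratch.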
1 frontend + 2 backends
Connecting two backends to the frontend, one for segmentation prediction and the other for faster labeling, creates a seamless labeling process.
Implementation (Local)
To run the app locally, install git and docker, download the backend code, set up the frontend to get an access token, set up the backend containers, and connect the containers to the frontend.
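Before connecting the containers, it can help to confirm that each one is actually answering HTTP requests. The sketch below is one way to do that with the Python standard library only. The service names, ports, and the /health path in the usage example are assumptions, not values taken from this project, so replace them with whatever your containers expose.

```python
import http.client
import urllib.error
import urllib.request
from typing import Dict


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET request to `url` succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Covers refused connections, timeouts, HTTP error codes, and malformed URLs.
        return False


def check_backends(backends: Dict[str, str], timeout: float = 3.0) -> Dict[str, bool]:
    """Check every named service and report which ones respond."""
    return {name: is_reachable(url, timeout) for name, url in backends.items()}


if __name__ == "__main__":
    # Hypothetical addresses: adjust hosts, ports, and paths to your own setup.
    services = {
        "frontend": "http://localhost:8080",
        "segmentation-backend": "http://localhost:9090/health",
        "labeling-backend": "http://localhost:9091/health",
    }
    for name, ok in check_backends(services).items():
        print(f"{name}: {'up' if ok else 'not reachable'}")
```

If a backend shows as not reachable here, the frontend will not be able to connect to it either, so this is a quick first thing to check.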
GCP Deployment
For group collaboration, deploy the app on Google Cloud Platform: select or create a project and set up a billing account, create a VM instance, configure the VM environment, and then repeat the local setup steps on the VM.
GCS Integration
Integrate Google Cloud Storage buckets for data versioning by setting up the buckets, creating and setting up a service account key, rebuilding the backend containers, uploading images from a source bucket using the SDK, and setting up Target Storage.
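As a rough illustration of the upload step, the sketch below turns a list of object names from a source bucket into Label Studio import tasks that point at gs:// URIs. It covers only the task-building part: listing the bucket and sending the tasks through the SDK are left out, and the bucket name, file names, and the "image" data key are hypothetical. It also assumes the project has a GCS storage connection that lets Label Studio resolve gs:// links.

```python
import json
from typing import Dict, Iterable, List

# File extensions treated as images in this sketch (an assumption, extend as needed).
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def build_tasks(
    bucket: str,
    object_names: Iterable[str],
    data_key: str = "image",  # hypothetical key; must match the variable in the labeling config
) -> List[Dict]:
    """Create one Label Studio task per image object, in sorted order."""
    tasks = []
    for name in sorted(object_names):
        if not name.lower().endswith(IMAGE_EXTENSIONS):
            continue  # skip non-image objects such as text files
        tasks.append({"data": {data_key: f"gs://{bucket}/{name}"}})
    return tasks


if __name__ == "__main__":
    # Made-up bucket and object names for illustration.
    tasks = build_tasks(
        "my-source-bucket",
        ["show_1990/look_02.jpg", "show_1990/look_01.jpg", "readme.txt"],
    )
    print(json.dumps(tasks, indent=2))
```

Keeping the images in a source bucket and writing finished annotations to a separate target bucket is what makes the versioning possible: the raw data stays untouched while the labels accumulate elsewhere.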
Acknowledgement
Special thanks to the Politecnico di Milano Gianfranco Ferré Research Center for providing access to the fashion runway show images.
References
The code for this project is available in the GitHub repo.