Sunday, May 18, 2025
News PouroverAI
Visit PourOver.AI
No Result
View All Result
  • Home
  • AI Tech
  • Business
  • Blockchain
  • Data Science & ML
  • Cloud & Programming
  • Automation
  • Front-Tech
  • Marketing
  • Home
  • AI Tech
  • Business
  • Blockchain
  • Data Science & ML
  • Cloud & Programming
  • Automation
  • Front-Tech
  • Marketing
News PouroverAI
No Result
View All Result

Unsupervised speech-to-speech translation from monolingual data – Google Research Blog

December 1, 2023
in AI Technology
Reading Time: 2 mins read
0 0
A A
0
Share on FacebookShare on Twitter



Posted by Eliya Nachmani, Research Scientist, and Michelle Tadmor Ramanovich, Software Engineer, Google Research

Speech-to-speech translation (S2ST) is a type of machine translation that converts spoken language from one language to another. This technology has the potential to break down language barriers and facilitate communication between people from different cultures and backgrounds. Previously, we introduced Translatotron 1 and Translatotron 2, the first models that were able to directly translate speech between two languages. However, they were trained in supervised settings with parallel speech data.

The scarcity of parallel speech data is a major challenge in this field, so much that most public datasets are semi- or fully-synthesized from text. This poses additional challenges in learning translation and reconstruction of speech attributes that are not represented in the text and are thus not reflected in the synthesized training data.

Here we present Translatotron 3, a novel unsupervised speech-to-speech translation architecture. In Translatotron 3, we show that it is possible to learn a speech-to-speech translation task from monolingual data alone. This method opens the door not only to translation between more language pairs but also towards translation of non-textual speech attributes such as pauses, speaking rates, and speaker identity. Our method does not include any direct supervision to target languages, making it the right direction for preserving paralinguistic characteristics (e.g., tone, emotion) of the source speech across translation.

To enable speech-to-speech translation, we use back-translation, a technique from unsupervised machine translation (UMT) where a synthetic translation of the source language is used to translate texts without bilingual text datasets. Experimental results in speech-to-speech translation tasks between Spanish and English show that Translatotron 3 outperforms a baseline cascade system.

Translatotron 3 addresses the problem of unsupervised S2ST, eliminating the requirement for bilingual speech datasets. Its design incorporates three key aspects: pre-training the entire model as a masked autoencoder with SpecAugment, unsupervised embedding mapping based on multilingual unsupervised embeddings (MUSE), and a reconstruction loss based on back-translation. The model is trained using a combination of unsupervised MUSE embedding loss, reconstruction loss, and S2S back-translation loss.

Translatotron 3 employs a shared encoder to encode both the source and target languages and has separate decoders for each language. The training methodology consists of two parts: auto-encoding with reconstruction and a back-translation term. The network is trained to auto-encode the input to a multilingual embedding space using the MUSE loss and reconstruction loss, and then further trained to translate the input spectrogram using the back-translation loss.

To evaluate the performance of Translatotron 3, we conducted experiments on English and Spanish using various datasets. Translatotron 3 outperformed the baseline cascade system in translation quality, speaker similarity, and speech quality. It achieved speech naturalness similar to that of the ground truth audio samples.

Overall, Translatotron 3 demonstrates the potential of unsupervised speech-to-speech translation and paves the way for more efficient and accurate translation between languages, while preserving important speech attributes.



Source link

Tags: BlogdataGooglemonolingualResearchspeechtospeechTranslationUnsupervised
Previous Post

What does a cloud engineer do ? | Vskills.in

Next Post

Brainstorming with a bot | ScienceDaily

Related Posts

How insurance companies can use synthetic data to fight bias
AI Technology

How insurance companies can use synthetic data to fight bias

June 10, 2024
From Low-Level to High-Level Tasks: Scaling Fine-Tuning with the ANDROIDCONTROL Dataset
AI Technology

From Low-Level to High-Level Tasks: Scaling Fine-Tuning with the ANDROIDCONTROL Dataset

June 10, 2024
How Game Theory Can Make AI More Reliable
AI Technology

How Game Theory Can Make AI More Reliable

June 9, 2024
Decoding Decoder-Only Transformers: Insights from Google DeepMind’s Paper
AI Technology

Decoding Decoder-Only Transformers: Insights from Google DeepMind’s Paper

June 9, 2024
Buffer of Thoughts (BoT): A Novel Thought-Augmented Reasoning AI Approach for Enhancing Accuracy, Efficiency, and Robustness of LLMs
AI Technology

Buffer of Thoughts (BoT): A Novel Thought-Augmented Reasoning AI Approach for Enhancing Accuracy, Efficiency, and Robustness of LLMs

June 9, 2024
Deciphering Doubt: Navigating Uncertainty in LLM Responses
AI Technology

Deciphering Doubt: Navigating Uncertainty in LLM Responses

June 9, 2024
Next Post
Brainstorming with a bot | ScienceDaily

Brainstorming with a bot | ScienceDaily

Proposed EV tax credits would make cars with China-made batteries costlier

Proposed EV tax credits would make cars with China-made batteries costlier

Driving Product Impact with Actionable Analyses | by Dennis Meisner | Dec, 2023

Driving Product Impact with Actionable Analyses | by Dennis Meisner | Dec, 2023

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

  • Trending
  • Comments
  • Latest
Is C.AI Down? Here Is What To Do Now

Is C.AI Down? Here Is What To Do Now

January 10, 2024
23 Plagiarism Facts and Statistics to Analyze Latest Trends

23 Plagiarism Facts and Statistics to Analyze Latest Trends

June 4, 2024
Porfo: Revolutionizing the Crypto Wallet Landscape

Porfo: Revolutionizing the Crypto Wallet Landscape

October 9, 2023
A Complete Guide to BERT with Code | by Bradney Smith | May, 2024

A Complete Guide to BERT with Code | by Bradney Smith | May, 2024

May 19, 2024
Part 1: ABAP RESTful Application Programming Model (RAP) – Introduction

Part 1: ABAP RESTful Application Programming Model (RAP) – Introduction

November 20, 2023
A faster, better way to prevent an AI chatbot from giving toxic responses | MIT News

A faster, better way to prevent an AI chatbot from giving toxic responses | MIT News

April 10, 2024
Can You Guess What Percentage Of Their Wealth The Rich Keep In Cash?

Can You Guess What Percentage Of Their Wealth The Rich Keep In Cash?

June 10, 2024
AI Compared: Which Assistant Is the Best?

AI Compared: Which Assistant Is the Best?

June 10, 2024
How insurance companies can use synthetic data to fight bias

How insurance companies can use synthetic data to fight bias

June 10, 2024
5 SLA metrics you should be monitoring

5 SLA metrics you should be monitoring

June 10, 2024
From Low-Level to High-Level Tasks: Scaling Fine-Tuning with the ANDROIDCONTROL Dataset

From Low-Level to High-Level Tasks: Scaling Fine-Tuning with the ANDROIDCONTROL Dataset

June 10, 2024
UGRO Capital: Targeting to hit milestone of Rs 20,000 cr loan book in 8-10 quarters: Shachindra Nath

UGRO Capital: Targeting to hit milestone of Rs 20,000 cr loan book in 8-10 quarters: Shachindra Nath

June 10, 2024
Facebook Twitter LinkedIn Pinterest RSS
News PouroverAI

The latest news and updates about the AI Technology and Latest Tech Updates around the world... PouroverAI keeps you in the loop.

CATEGORIES

  • AI Technology
  • Automation
  • Blockchain
  • Business
  • Cloud & Programming
  • Data Science & ML
  • Digital Marketing
  • Front-Tech
  • Uncategorized

SITEMAP

  • Disclaimer
  • Privacy Policy
  • DMCA
  • Cookie Privacy Policy
  • Terms and Conditions
  • Contact us

Copyright © 2023 PouroverAI News.
PouroverAI News

No Result
View All Result
  • Home
  • AI Tech
  • Business
  • Blockchain
  • Data Science & ML
  • Cloud & Programming
  • Automation
  • Front-Tech
  • Marketing

Copyright © 2023 PouroverAI News.
PouroverAI News

Welcome Back!

Login to your account below

Forgotten Password? Sign Up

Create New Account!

Fill the forms bellow to register

All fields are required. Log In

Retrieve your password

Please enter your username or email address to reset your password.

Log In