A platform for computer vision accessibility technology – Google Research Blog

November 21, 2023

Posted by Dave Hawkey, Software Engineer, Google Research

Two years ago, we introduced Project Guideline, a collaboration between Google Research and Guiding Eyes for the Blind that aims to empower people who are blind or have low vision to walk, jog, and run independently. Using just a Google Pixel phone and headphones, Project Guideline relies on on-device machine learning (ML) to guide users along outdoor paths marked with a painted line. The technology has been extensively tested worldwide and was showcased at the opening ceremony of the Tokyo 2020 Paralympic Games.

Since the initial announcement, our team has been working to enhance Project Guideline by incorporating new features like obstacle detection and advanced path planning. These additions ensure the safe and reliable navigation of users in more complex scenarios, including sharp turns and interactions with nearby pedestrians. The earlier version of the project employed a simple frame-by-frame image segmentation technique to detect the position of the path line within the image frame. While this was effective in orienting the user to the line, it provided limited information about the surrounding environment. Therefore, we needed to improve the navigation signals by developing a better understanding and mapping of the user’s surroundings.

To address these challenges, we created a versatile platform that can be utilized for various spatially-aware applications within the accessibility domain and beyond. Today, we are excited to announce the open-source release of Project Guideline, making it accessible to everyone for further improvement and the development of new accessibility experiences. The release includes the source code for the core platform, an Android application, pre-trained ML models, and a 3D simulation framework.

In terms of system design, the primary use case is an Android application. However, we wanted the core logic to be executable, testable, and debuggable in diverse environments in a reproducible way. We therefore designed and built the system in C++, which allows close integration with MediaPipe and other core libraries while still integrating with Android through the Android NDK.

Behind the scenes, Project Guideline employs ARCore to estimate the user’s position and orientation as they navigate the course. A segmentation model, built on the DeepLabV3+ framework, processes each camera frame to generate a binary mask of the guideline. These segmented guideline points are then projected onto a world-space ground plane using the camera pose and lens parameters provided by ARCore. By aggregating the world-space points from multiple frames, we create a virtual mapping of the real-world guideline, allowing for the refinement of the estimated line as the user progresses along the path.
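To make the geometry concrete, here is a minimal C++ sketch of the kind of projection involved: a pixel that the segmentation mask marked as part of the guideline is unprojected through a pinhole camera model and intersected with a flat ground plane. The struct layout, row-major rotation, and fixed plane height are illustrative assumptions, not the production code; in the real system the pose and lens parameters come from ARCore.

```cpp
// Minimal sketch (not the production Project Guideline code): unproject a
// "guideline" pixel through a pinhole camera model and intersect the ray
// with a flat ground plane. Struct layout and plane height are assumptions.
#include <array>
#include <cmath>
#include <optional>

struct Vec3 { double x, y, z; };

// Pinhole intrinsics: focal lengths and principal point, in pixels.
struct Intrinsics { double fx, fy, cx, cy; };

// Camera pose (camera-to-world): row-major 3x3 rotation plus translation.
struct Pose {
  std::array<double, 9> r;
  Vec3 t;
};

static Vec3 Rotate(const std::array<double, 9>& r, const Vec3& v) {
  return {r[0] * v.x + r[1] * v.y + r[2] * v.z,
          r[3] * v.x + r[4] * v.y + r[5] * v.z,
          r[6] * v.x + r[7] * v.y + r[8] * v.z};
}

// Returns the world-space point where the ray through pixel (u, v) meets the
// horizontal plane y = ground_y, or nullopt if the ray never reaches it.
std::optional<Vec3> ProjectPixelToGround(double u, double v,
                                         const Intrinsics& k, const Pose& pose,
                                         double ground_y = 0.0) {
  Vec3 dir_cam{(u - k.cx) / k.fx, (v - k.cy) / k.fy, 1.0};  // camera-space ray
  Vec3 dir = Rotate(pose.r, dir_cam);  // ray direction in world space
  const Vec3& origin = pose.t;         // camera position in world space
  if (std::abs(dir.y) < 1e-9) return std::nullopt;  // ray parallel to plane
  double s = (ground_y - origin.y) / dir.y;
  if (s <= 0.0) return std::nullopt;   // intersection lies behind the camera
  return Vec3{origin.x + s * dir.x, ground_y, origin.z + s * dir.z};
}
```

Aggregating the points produced this way over many frames is what yields the stateful, world-space line map described above.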

To guide the user, the system utilizes a control system that dynamically selects a target point on the line based on the user’s current position, velocity, and direction. An audio feedback signal is then provided to the user to adjust their heading and align with the upcoming line segment. By utilizing the runner’s velocity vector instead of the camera orientation, we eliminate noise caused by irregular camera movements during running. This approach enables us to navigate the user back to the line even when it is out of the camera’s view, such as when they have overshot a turn. This is possible because ARCore continues to track the camera’s pose, which can be compared to the stateful line map inferred from previous camera images.
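As a rough illustration of this control loop, the sketch below picks a look-ahead target on the mapped line and computes a signed heading error from the runner's velocity vector rather than the camera orientation. The polyline representation, look-ahead distance, and sign convention (negative means "steer left") are assumptions made for this example, not the production logic.

```cpp
// Illustrative control-loop sketch, not the production logic: choose a
// look-ahead target on the mapped line and derive a signed heading error
// from the runner's velocity vector. All parameters here are assumptions.
#include <cmath>
#include <vector>

constexpr double kPi = 3.14159265358979323846;

struct Point2 { double x, z; };  // ground-plane coordinates, in metres

// First polyline vertex at least `lookahead` metres from the runner; falls
// back to the final vertex near the end of the line (line must be non-empty).
Point2 SelectTargetPoint(const std::vector<Point2>& line, Point2 runner,
                         double lookahead) {
  for (const Point2& p : line) {
    if (std::hypot(p.x - runner.x, p.z - runner.z) >= lookahead) return p;
  }
  return line.back();
}

// Signed angle (radians) from the runner's velocity direction to the
// direction of the target point, wrapped into (-pi, pi]. The audio feedback
// can be driven directly from this value.
double HeadingError(Point2 runner, Point2 velocity, Point2 target) {
  double heading = std::atan2(velocity.z, velocity.x);
  double desired = std::atan2(target.z - runner.z, target.x - runner.x);
  double err = desired - heading;
  while (err > kPi) err -= 2 * kPi;
  while (err <= -kPi) err += 2 * kPi;
  return err;
}
```

Because the error is computed against the line map rather than the current camera frame, it remains well defined even when the line has left the camera's field of view.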

Project Guideline also incorporates obstacle detection and avoidance features. We employ an ML model to estimate depth from single images, which is trained using the SANPO dataset consisting of outdoor imagery. The depth maps are converted into 3D point clouds and used to detect obstacles along the user’s path, alerting them through an audio signal.
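The sketch below illustrates the general idea under assumed camera intrinsics and thresholds: each depth pixel is back-projected into a camera-space 3D point, and points falling inside a corridor directly ahead of the user are treated as obstacles. It is not the actual Project Guideline pipeline, which operates on the output of the SANPO-trained model.

```cpp
// Hedged sketch of the general idea, not the actual Project Guideline
// pipeline: back-project a per-pixel metric depth map into a camera-space
// point cloud and flag points inside a corridor directly ahead of the user.
// The intrinsics struct and corridor thresholds are assumptions.
#include <vector>

struct Point3 { double x, y, z; };
struct CameraK { double fx, fy, cx, cy; };  // pinhole intrinsics, in pixels

// depth[v * width + u] holds the metric depth (metres) estimated by the
// monocular depth model for pixel (u, v).
std::vector<Point3> DepthToPointCloud(const std::vector<float>& depth,
                                      int width, int height,
                                      const CameraK& k) {
  std::vector<Point3> cloud;
  cloud.reserve(depth.size());
  for (int v = 0; v < height; ++v) {
    for (int u = 0; u < width; ++u) {
      double d = depth[v * width + u];
      if (d <= 0.0) continue;  // skip pixels with no valid depth estimate
      cloud.push_back({(u - k.cx) * d / k.fx, (v - k.cy) * d / k.fy, d});
    }
  }
  return cloud;
}

// Simple obstacle test: does any point lie within a corridor of the given
// half-width and look-ahead range in front of the camera?
bool ObstacleAhead(const std::vector<Point3>& cloud, double half_width,
                   double max_range) {
  for (const Point3& p : cloud) {
    if (p.z > 0.3 && p.z < max_range &&
        p.x > -half_width && p.x < half_width) {
      return true;  // something is in the path; trigger the audio alert
    }
  }
  return false;
}
```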

To provide navigational sounds and cues, we implemented a low-latency audio system based on the AAudio API. Project Guideline offers several sound packs, including a spatial sound implementation using the Resonance Audio API. These sound packs were developed by a team of sound researchers and engineers at Google, utilizing panning, pitch, and spatialization techniques to guide the user along the line. For instance, if a user veers to the right, they may hear a beeping sound in their left ear to indicate that the line is on the left, with the frequency increasing for larger course corrections. If the user veers further, a high-pitched warning sound may indicate that the edge of the path is approaching. Additionally, a clear “stop” audio cue is always available in case the user deviates too far from the line or if any anomalies occur.
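A toy mapping along these lines might look like the following, where a signed heading error is turned into a stereo pan and a beep frequency, with a "stop" cue once the error grows too large. The constants and thresholds are invented for illustration; the actual sound packs were designed by Google's sound researchers and rendered through AAudio and Resonance Audio.

```cpp
// Toy mapping for illustration only: turn a signed heading error into a
// stereo pan and a beep frequency, plus a "stop" flag for large deviations.
// All constants here are invented; they are not the real sound-pack tuning.
#include <algorithm>
#include <cmath>

struct AudioCue {
  double pan;        // -1.0 = fully left ear, +1.0 = fully right ear
  double frequency;  // beep frequency in Hz
  bool stop;         // true when the user has drifted too far off the line
};

AudioCue CueForHeadingError(double error_radians) {
  const double kMaxError = 0.6;   // beyond this, play the clear "stop" cue
  const double kBaseHz = 440.0;   // gentle correction
  const double kMaxHz = 1200.0;   // urgent correction
  double magnitude = std::min(std::abs(error_radians), kMaxError);
  AudioCue cue;
  cue.stop = std::abs(error_radians) >= kMaxError;
  // Pan toward the side the runner should steer to: with the convention that
  // a negative error means "steer left", veering right produces a beep in
  // the left ear, as described above.
  cue.pan = error_radians < 0.0 ? -1.0 : 1.0;
  // Pitch rises with the size of the required correction.
  cue.frequency = kBaseHz + (kMaxHz - kBaseHz) * (magnitude / kMaxError);
  return cue;
}
```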

Project Guideline has been built specifically for Google Pixel phones with the Google Tensor chip. This chip allows optimized ML models to run on-device with higher performance and lower power consumption, ensuring real-time navigation instructions reach the user with minimal delay. On a Pixel 8, running the depth model on the Tensor Processing Unit (TPU) instead of the CPU yields a 28x latency improvement, and a 9x improvement compared to the GPU.

To facilitate testing and prototyping, Project Guideline includes a simulator that allows for rapid evaluation of the system in a virtual environment. This simulator replicates the full Project Guideline experience, from the ML models to the audio feedback system, without requiring physical hardware or a real-world setup.

Looking to the future, we are excited to collaborate with WearWorks, an early adopter of Project Guideline. WearWorks will integrate their patented haptic navigation experience, providing haptic feedback in addition to sound to guide runners. Their expertise in haptics has already empowered the first blind marathon runner to complete the NYC Marathon without sighted assistance. We believe that such integrations will lead to new innovations and contribute to a more accessible world.

Furthermore, our team is actively working on eliminating the need for a painted line altogether. We aim to leverage the latest advancements in mobile ML technology, such as the ARCore Scene Semantics API, which can identify sidewalks, buildings, and other objects in outdoor scenes. By expanding the capabilities of Project Guideline, we hope to encourage the accessibility community to explore new use cases and continue improving this technology.

We would like to express our gratitude to the many individuals involved in the development of Project Guideline and its underlying technologies. Special thanks to our partners at Guiding Eyes for the Blind and Achilles International, as well as all the team members, contributors, and leaders who have played a role in making this project a reality.


