Scaling transformers for graph-structured data – Google Research Blog

Posted by Ameya Velingker, Research Scientist, Google Research, and Balaji Venkatachalam, Software Engineer, Google

Graphs, in which objects and their relations are represented as nodes (or vertices) and edges (or links) between pairs of nodes, are ubiquitous in computing and machine learning (ML). For example, social networks, road networks, and molecular structure and interactions are all domains in which the underlying datasets have a natural graph structure. ML can be used to learn the properties of nodes, edges, or entire graphs.

Graph neural networks (GNNs) are a common approach to learning on graphs. They operate on graph data by applying an optimizable transformation to node, edge, and global attributes. The most typical class of GNNs operates via a message-passing framework, in which each layer aggregates the representation of a node with those of its immediate neighbors.

Recently, graph transformer models have emerged as a popular alternative to message-passing GNNs. These models build on the success of Transformer architectures in natural language processing (NLP), adapting them to graph-structured data. The attention mechanism in a graph transformer can be modeled by an interaction graph, in which edges represent pairs of nodes that attend to each other. Unlike message-passing architectures, graph transformers have an interaction graph that is separate from the input graph. The typical interaction graph is a complete graph, which models direct interactions between all pairs of nodes. However, this creates computational and memory bottlenecks that restrict graph transformers to datasets with small graphs.

Making graph transformers scalable has been a significant research direction. One solution is to use a sparse interaction graph with fewer edges. Many sparse and efficient transformers have been proposed for sequences, but they do not generally extend to graphs in a principled manner.
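To make the idea of an interaction graph concrete, here is a minimal sketch of single-head attention in which node i attends only to the nodes j for which (i, j) is an edge of the interaction graph. It is an illustration under simplifying assumptions (plain Python lists, no learned projections, no multi-head logic), not the implementation used in any particular graph transformer; the function names `sparse_attention` and `complete_interaction_graph` are hypothetical.

```python
import math


def sparse_attention(queries, keys, values, edges):
    """Attention restricted to an interaction graph.

    `edges` is an iterable of directed pairs (i, j) meaning
    "node i attends to node j". All names here are illustrative.
    """
    n = len(queries)
    dim = len(queries[0])
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)

    out = []
    for i in range(n):
        row = [0.0] * len(values[0])
        if neighbors[i]:
            # Scaled dot-product scores over the neighbors of i only.
            scores = [
                sum(q * k for q, k in zip(queries[i], keys[j])) / math.sqrt(dim)
                for j in neighbors[i]
            ]
            # Numerically stable softmax.
            top = max(scores)
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            for w, j in zip(weights, neighbors[i]):
                for c, v in enumerate(values[j]):
                    row[c] += (w / total) * v
        out.append(row)
    return out


def complete_interaction_graph(n):
    """The dense case: every node attends to every node (n * n pairs)."""
    return [(i, j) for i in range(n) for j in range(n)]
```

With the complete interaction graph, the work per layer grows with the n² attended pairs; with a sparse edge list it grows with the number of edges, which is the motivation for sparsifying the interaction graph.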
In “Exphormer: Sparse Transformers for Graphs”, presented at ICML 2023, we address the scalability challenge by introducing a sparse attention framework for transformers designed specifically for graph data. The Exphormer framework makes use of expander graphs, which are sparse yet well-connected graphs. Expander graphs have applications in various areas, such as algorithms, pseudorandomness, complexity theory, and error-correcting codes.

Exphormer replaces the dense, fully connected interaction graph of a standard Transformer with a sparse one that combines three components: the edges of the input graph, the edges of a sparse d-regular expander graph, and virtual nodes. The resulting graph has good connectivity properties and retains the inductive bias of the input dataset graph while remaining sparse.

Each component of the interaction graph serves a specific purpose. Edges from the input graph retain the inductive bias of the input graph structure. Expander edges provide good global connectivity and random walk mixing properties. Virtual nodes serve as global “memory sinks” that can communicate directly with every node. The degree of the expander graph and the number of virtual nodes are hyperparameters that can be tuned to improve quality metrics.

Exphormer is as expressive as the dense transformer and obeys universal approximation properties. In experimental results, Exphormer achieved state-of-the-art performance on various datasets and allowed graph transformer architectures to scale well beyond the usual graph size limits.
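The three components described above can be sketched as a small edge-set construction. This is a hedged illustration, not the authors' code: the expander part is approximated by a symmetrized union of random permutations, which is one common way to sample a roughly d-regular graph that is well connected with high probability, and whether it matches the paper's exact construction is an assumption. The names `random_expander_edges` and `build_interaction_graph` and their parameters are hypothetical.

```python
import random


def random_expander_edges(n, degree, seed=0):
    """Assumed construction: union of degree // 2 random permutations.

    Each permutation pi contributes undirected edges {i, pi(i)}.
    Self-loops and repeated edges are dropped, so some nodes may end up
    with fewer than `degree` neighbors.
    """
    rng = random.Random(seed)
    edges = set()
    for _ in range(degree // 2):
        perm = list(range(n))
        rng.shuffle(perm)
        for i, j in enumerate(perm):
            if i != j:
                edges.add((i, j))
                edges.add((j, i))
    return edges


def build_interaction_graph(n, input_edges, degree=4, num_virtual=1, seed=0):
    """Combine input-graph edges, expander edges, and virtual nodes.

    Real nodes are numbered 0..n-1; virtual nodes are n..n+num_virtual-1.
    Returns a set of directed pairs (i, j): "node i attends to node j".
    """
    edges = set()
    # 1. Input-graph edges keep the inductive bias of the dataset graph.
    for i, j in input_edges:
        edges.add((i, j))
        edges.add((j, i))
    # 2. Expander edges add sparse global connectivity.
    edges |= random_expander_edges(n, degree, seed)
    # 3. Each virtual node is connected to every real node.
    for v in range(n, n + num_virtual):
        for i in range(n):
            edges.add((v, i))
            edges.add((i, v))
    return edges
```

For a fixed expander degree and a fixed number of virtual nodes, the number of edges in this sketch grows linearly with the number of nodes and input edges, rather than quadratically as in the complete interaction graph.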
