Posted by Jagan Sankaranarayanan, Senior Staff Software Engineer, and Indrajit Roy, Head of Napa Product, Google
Google Ads infrastructure relies on an internal data warehouse called Napa. Napa stores tables of ads-performance records keyed by customer and campaign identifiers. These tables power critical dashboards that advertising clients use to measure campaign performance. The challenge lies in efficiently retrieving this data for reporting queries: the data is heavily skewed, yet queries must meet strict latency requirements.
In our paper “Progressive Partitioning for Parallelized Query Execution in Napa”, presented at VLDB 2023, we describe how Napa addresses this challenge. We introduce a progressive query partitioning algorithm that parallelizes query execution effectively, even in the presence of complex data skews, allowing reporting queries to be answered within a few milliseconds while meeting strict latency targets. With Napa, Google Ads infrastructure is able to serve billions of queries every day.
One of the main challenges in query processing is determining how to parallelize the query effectively. Napa’s parallelization technique divides a query into sections of roughly equal work that are distributed across available machines, reducing query latency. Estimating the number of records associated with a given key, however, is inherently imprecise: inspecting every record would cost as much as answering the query itself. Unequal distribution of work among machines leads to runtime skews and poor performance, yet each machine also needs enough work to avoid underutilizing the infrastructure. On top of this, partitioning itself must fit within each query’s stringent latency budget.
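To see why equal sections matter, note that a fanned-out query finishes only when its slowest shard does. A toy model in Python (the function and numbers are illustrative, not Napa's actual cost model) makes the effect of skew concrete:

```python
def fanout_latency(partition_sizes, cost_per_record=1.0):
    """Latency of a parallel fan-out query: the query completes only
    when the slowest (largest) partition has been scanned."""
    return max(partition_sizes) * cost_per_record

# Same 9M records split across three machines:
skewed = fanout_latency([7_000_000, 1_000_000, 1_000_000])  # 7,000,000 units
even   = fanout_latency([3_000_000, 3_000_000, 3_000_000])  # 3,000,000 units
```

With the skewed split, two machines sit idle while one straggler more than doubles the query's latency, which is why Napa invests in balancing the partitions up front.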
To address these challenges, we have developed a progressive partitioning algorithm. This algorithm minimizes the amount of metadata needed and focuses on efficiently partitioning the data based on the skewed part of the key space. It works within the allotted time, ensuring that partitioning takes no longer than tens of milliseconds. The algorithm determines the best possible partitioning that considers query latency expectations.
To manage this data deluge, Napa uses log-structured merge (LSM) forests to organize table updates. The LSM structure lets us apply table updates separately from query serving: each batch of ingested data (a delta) becomes visible atomically once it is fully prepared for querying.
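As a toy illustration of this decoupling (a minimal sketch, not Napa's actual design), the class below keeps each ingested delta as an immutable sorted batch and merges the batches only at read time, so ingestion never blocks query serving:

```python
import heapq


class LsmTable:
    """Toy LSM-style table: writes land in immutable sorted deltas,
    and reads merge all deltas on the fly."""

    def __init__(self):
        self.deltas = []  # each delta is a sorted list of (key, value) pairs

    def ingest(self, batch):
        # A delta becomes visible atomically: it is appended only
        # once the whole batch has been sorted and prepared.
        self.deltas.append(sorted(batch))

    def scan(self):
        # Serve a query by merging the pre-sorted deltas.
        return list(heapq.merge(*self.deltas))
```

For example, ingesting `[(3, "c"), (1, "a")]` and then `[(2, "b")]` yields a merged scan of `[(1, "a"), (2, "b"), (3, "c")]` without ever rewriting the earlier delta.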
The data partitioning problem in Napa involves a massive table represented as an LSM forest. We use a tree-traversal algorithm to quickly split the trees into two roughly equal parts. To avoid visiting every node of a tree, we introduce the notion of “good enough” partitioning: the algorithm progressively reduces the error estimate of the partitioning and stops once the two pieces are approximately equal.
Our progressive partitioning algorithm makes a series of moves, each reducing the error estimate, to cut the trees into near-equal pieces. The moves are guided by statistics stored with each node of the tree. Progressive partitioning suits our use case well: the longer the algorithm runs, the more equal the pieces become, and it yields a usable partitioning even if stopped at any point.
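As a rough illustration of this anytime behavior, here is a minimal Python sketch (not Napa's actual implementation): a binary tree stands in for the LSM forest, each node stores the record count of its subtree as its statistic, and the search descends toward the median key, tightening the balance error at every step and stopping as soon as the split is "good enough."

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int                       # pivot: records with smaller keys lie left
    count: int                     # records in this subtree (stored statistic)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def progressive_split(root: Node, tolerance: float) -> tuple[int, int]:
    """Anytime search for a cut key yielding two near-equal halves.

    Each descent step shrinks the subtree under consideration, so the
    balance error only improves the longer the loop runs, and any
    intermediate (key, error) pair is a usable partitioning.
    """
    total = root.count
    left_acc = 0                   # records known to fall left of the cut
    node = root
    best_key, best_err = root.key, total
    while node is not None:
        lc = node.left.count if node.left else 0
        left_size = left_acc + lc  # cutting at node.key puts these on the left
        err = abs(2 * left_size - total)
        if err < best_err:
            best_key, best_err = node.key, err
        if best_err <= tolerance * total:
            break                  # "good enough" partitioning: stop early
        if 2 * left_size >= total:
            node = node.left       # the median lies inside the left subtree
        else:
            left_acc = left_size   # left subtree falls wholly left of the cut
            node = node.right
    return best_key, best_err
```

Because the loop never visits more nodes than the tree's height, the partitioning cost stays within a tens-of-milliseconds budget even for very large tables, and a tighter `tolerance` simply buys more descent steps.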
In conclusion, Napa’s progressive partitioning algorithm optimizes database queries efficiently, enabling Google Ads to serve client reporting queries billions of times each day.