Intro to Temporal Architecture and Essential Metrics

October 26, 2023



Managing your own Temporal cluster can be challenging. With multiple core services to run, a separate persistence service, and a long list of metrics to watch, it is a significant undertaking for any team. This post starts a new series that walks through the work involved in hosting Temporal and makes it more approachable. Running your own cluster should not be taken lightly: unless your organization has a well-established operations department with the resources to monitor and administer it, you may find yourself regretting not using Temporal Cloud instead. If you do have to run your own cluster because of data ownership regulations, uptime guarantees, or other organizational requirements, we hope this series helps you uncover and address the common challenges of self-hosting Temporal. In this article, we begin by reviewing Temporal's overall service architecture and covering some important cluster concepts and metrics. Being familiar with Temporal's subsystems, and monitoring their associated logs and metrics, is essential to administering them effectively.

Architecture of Temporal:

A Temporal Cluster consists of four core services – Frontend, Matching, History, and Worker – along with a database. Each service can be scaled independently to meet the specific performance needs of your cluster, and each service's responsibilities determine its hardware requirements and scaling triggers. In practice, the ultimate bottleneck when scaling Temporal is usually the underlying persistence technology used in the deployment.

Frontend Service:

The frontend service is a stateless service that serves as the client-facing gRPC API of the Temporal cluster. It acts as a pass-through to the other services and handles rate-limiting, authorization, validation, and routing of inbound calls. Inbound calls include communication from the Temporal Web UI, the tctl CLI tool, worker processes, and Temporal SDK connections. The nodes hosting this service usually benefit from additional compute resources.
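
To make the Frontend's role concrete, here is a minimal sketch using the Temporal Go SDK of a client dialing the Frontend's gRPC endpoint and starting a workflow. The address, namespace, task queue, and workflow name are placeholders for your own deployment, not values from this article.

```go
package main

import (
	"context"
	"log"

	"go.temporal.io/sdk/client"
)

func main() {
	// Every SDK connection terminates at the Frontend service, which applies
	// rate limiting, authorization, and validation before routing the call on.
	c, err := client.Dial(client.Options{
		HostPort:  "temporal-frontend.example.internal:7233", // placeholder address
		Namespace: "default",
	})
	if err != nil {
		log.Fatalf("unable to reach the Frontend service: %v", err)
	}
	defer c.Close()

	// Starting a workflow is just another inbound Frontend call; the Frontend
	// forwards it to the History and Matching services on your behalf.
	run, err := c.ExecuteWorkflow(context.Background(), client.StartWorkflowOptions{
		ID:        "example-workflow-id", // placeholder
		TaskQueue: "example-task-queue",  // placeholder
	}, "ExampleWorkflow") // hypothetical workflow type, referenced by name
	if err != nil {
		log.Fatalf("unable to start workflow: %v", err)
	}
	log.Println("started workflow run:", run.GetRunID())
}
```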

Matching Service:

The matching service is responsible for efficiently dispatching tasks to workers by managing and caching operations on task queues. The task queues are distributed among the shards of the matching service. One important optimization provided by the matching service is called “synchronous matching”. If a long-polling worker is waiting for a new task at the same time that the matching host receives a new task, the task is immediately dispatched to the worker. This avoids the need to persist the task, reducing the load on the persistence service and lowering task latency. Administrators should monitor the “sync match rate” of their cluster and scale workers as needed to keep the rate as high as possible. The hosts of this service typically benefit from additional memory.
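
The sketch below, again using the Go SDK with placeholder names, shows the worker side of this interaction: a worker process keeps long-poll requests open against a task queue, which is what gives the Matching service the opportunity to sync-match new tasks instead of persisting them first. The poller counts are illustrative tuning knobs, not recommendations.

```go
package main

import (
	"log"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
	"go.temporal.io/sdk/workflow"
)

// ExampleWorkflow is a trivial placeholder workflow so the sample compiles.
func ExampleWorkflow(ctx workflow.Context) error {
	return nil
}

func main() {
	c, err := client.Dial(client.Options{
		HostPort: "temporal-frontend.example.internal:7233", // placeholder address
	})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	// Each poller holds a long-poll open against the Matching service. Having
	// pollers already waiting when tasks arrive is what keeps the sync match
	// rate high.
	w := worker.New(c, "example-task-queue", worker.Options{
		MaxConcurrentWorkflowTaskPollers: 8, // illustrative values; tune for your load
		MaxConcurrentActivityTaskPollers: 8,
	})
	w.RegisterWorkflow(ExampleWorkflow)

	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatal(err)
	}
}
```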

History Service:

The history service is responsible for efficiently writing workflow execution data to the persistence service and acting as a cache for workflow history. Workflow executions are distributed among the shards of the history service. When the history service progresses a workflow by saving its updated history to persistence, it also enqueues a task on the relevant task queue; a worker can then poll for more work and continue progressing the workflow using the new history. This service is particularly memory intensive, and its hosts typically benefit from additional memory.

Worker Service:

The cluster's worker hosts are responsible for running Temporal's internal background workflows, which support operations such as tctl batch operations, archival, and inter-cluster replication. The scaling requirements of this service are not well documented, but a balance of compute and memory is suggested. Starting with two hosts is a reasonable default for most clusters, though the performance characteristics will vary depending on which internal background workflows your cluster actually uses.

Importance of History Shards:

History shards are a critical cluster configuration parameter: they determine the maximum number of concurrent persistence operations the cluster can perform, and the shard count cannot be changed after the cluster is created. It is therefore important to select a number of history shards that can handle the theoretical maximum concurrent load your cluster may experience. While it may be tempting to choose a very large number, doing so comes with costs: more shards increase CPU and memory consumption on History hosts and put more pressure on the persistence service. Temporal recommends 512 shards for small production clusters, and even large production clusters rarely exceed 4096. Whether a specific shard count will hold up under a high-load scenario can only be determined through testing.
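
As a rough mental model (not Temporal's actual implementation), each execution is routed to a shard by hashing its namespace and workflow ID modulo the configured shard count, which is why the count has to be sized up front:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// numHistoryShards mirrors the cluster setting; 512 is the commonly suggested
// starting point for small production clusters.
const numHistoryShards = 512

// historyShard is a conceptual stand-in for the server's own routing: the shard
// is a pure function of namespace + workflow ID and the shard count, so changing
// the count would remap every execution, which is why it is fixed at creation.
func historyShard(namespaceID, workflowID string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(namespaceID + "_" + workflowID))
	return h.Sum32() % numHistoryShards
}

func main() {
	fmt.Println(historyShard("default", "order-12345")) // same execution, same shard, every time
}
```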

Importance of Retention:

Every Temporal workflow keeps a history of the events it processes in the persistence database. As that data grows, it degrades performance; scaling the database helps, but at some point data must be removed or archived rather than scaling storage indefinitely. To keep active workflows fast, Temporal deletes the history of closed workflows after a configurable retention period. The minimum retention is one day, but if your use case allows, you can raise it to keep closed-workflow data much longer, potentially months. Although event data eventually has to be removed from the primary persistence database, you can configure Temporal to preserve history in an archival storage layer such as S3. This lets Temporal retain workflow data indefinitely without impacting cluster performance, and archived histories can still be queried through the Temporal CLI and Web UI, albeit more slowly than from the primary persistence service.
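
As an illustration, the hedged Go SDK sketch below registers a namespace with a 30-day retention period. The namespace name is made up, and the exact request fields can vary between API versions, so treat this as a sketch rather than the canonical way to configure retention.

```go
package main

import (
	"context"
	"log"
	"time"

	"go.temporal.io/api/workflowservice/v1"
	"go.temporal.io/sdk/client"
	"google.golang.org/protobuf/types/known/durationpb"
)

func main() {
	c, err := client.Dial(client.Options{
		HostPort: "temporal-frontend.example.internal:7233", // placeholder address
	})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	// Closed-workflow histories in this namespace are kept in primary
	// persistence for 30 days before being deleted (and, if archival is
	// configured, preserved in the archive before that happens).
	_, err = c.WorkflowService().RegisterNamespace(context.Background(),
		&workflowservice.RegisterNamespaceRequest{
			Namespace:                        "reports", // hypothetical namespace
			WorkflowExecutionRetentionPeriod: durationpb.New(30 * 24 * time.Hour),
		})
	if err != nil {
		log.Fatal(err)
	}
}
```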

Essential Metrics and Monitoring:

Temporal exports a variety of Prometheus metrics for monitoring the four services and the database. These fall into two categories: metrics exported by the cluster services themselves, and metrics exported by the clients and worker processes running the Temporal SDK for your chosen language. Cluster metrics are tagged with dimensions such as type, operation, and namespace so that activity from different services, namespaces, and operations can be told apart. Monitoring them is crucial to keeping the cluster healthy; the most important signals include service availability and performance and persistence-service availability and performance, all of which can be monitored with PromQL queries in tools like Grafana.
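
As a starting point, the snippet below keeps a couple of illustrative PromQL expressions as Go string constants, in the spirit of the availability and performance checks mentioned above. The metric and label names are assumptions based on common Temporal server dashboards and may differ with your server version and metrics prefix, so verify them against what your cluster actually exports.

```go
// Package monitoring holds example PromQL expressions that could be templated
// into Grafana dashboards or alert rules. Metric and label names below are
// assumptions; confirm them against your cluster's exported metrics.
package monitoring

const (
	// Approximate Frontend availability: the share of requests that did not error.
	FrontendAvailability = `1 - (
	  sum(rate(service_errors{type="frontend"}[5m]))
	  /
	  sum(rate(service_requests{type="frontend"}[5m]))
	)`

	// 95th-percentile persistence latency per operation, an early warning sign
	// that the database is becoming the bottleneck.
	PersistenceLatencyP95 = `histogram_quantile(0.95,
	  sum(rate(persistence_latency_bucket[5m])) by (le, operation))`
)
```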

By understanding the architecture, concepts, and metrics of Temporal, you can effectively manage your own cluster and address common challenges that arise in self-hosting Temporal.



Source link

Tags: Architecture, Essential, Intro, Metrics, Temporal