Pinterest created its next-generation asynchronous compute platform, Pacer, to replace its older solution, Pinlater, which the company had outgrown, leading to scalability and reliability issues. The new architecture leverages Kubernetes to schedule workers that execute jobs and Apache Helix for cluster management.
The company previously created Pinlater, an asynchronous workflow execution platform, and open-sourced it a few years ago. Pinlater has been used in production at Pinterest for many years and has supported many important functional areas. The company already operates several Pinlater clusters on AWS EC2, processing millions of jobs per minute.
Qi Li and Zhihuang Chen, software engineers at Pinterest, summarize the challenges with Pinlater that motivated the team to build the new platform:
With Pinterest growing over the past few years and traffic to Pinlater increasing, we discovered many limitations of Pinlater, including scalability bottlenecks, hardware performance, lack of isolation and usability. We have also encountered new challenges with the platform, including those that have impacted the throughput and reliability of our data storage.
The team concluded that it was impossible to solve all the identified problems within the current architecture and instead decided to invest in creating a next-generation platform, drawing on its experience using and operating Pinlater.
Pacer, the new architecture, includes a stateless Thrift API service (Pinlater-compatible), a datastore (MySQL), a stateful dequeue broker service, and a worker pool running on Kubernetes. Apache Helix with Zookeeper is used to manage the assignment of job queue partitions to dequeue brokers.

Pacer High Level Architecture (Source: Pinterest Technical Blog)
The dequeue broker is a stateful service responsible for prefetching job queue data from the datastore and caching it to reduce latency and separate enqueue and dequeue workloads. Each dequeue broker is assigned a set of job queue partitions so that jobs in a given partition are fetched exclusively by a single broker, thus avoiding any contention. Each job queue has a dedicated set of pods provisioned in Kubernetes to eliminate the impact of uneven resource consumption across different job types.
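The prefetch-and-buffer behavior described above can be sketched in a few lines of Python; the class, partition names, and in-memory "store" below are illustrative stand-ins, not Pacer's actual implementation:

```python
import collections

# Hypothetical in-memory stand-in for the MySQL job store: one FIFO deque per partition.
job_store = {
    "queue_a.p0": collections.deque(["job1", "job2"]),
    "queue_a.p1": collections.deque(["job3"]),
}

class DequeueBroker:
    """Sketch of a broker that exclusively owns a set of queue partitions
    and prefetches jobs from the store into a local buffer."""

    def __init__(self, assigned_partitions):
        self.assigned_partitions = assigned_partitions
        self.buffer = collections.deque()

    def prefetch(self, batch_size=2):
        # Only the owning broker reads these partitions, so there is no
        # lock contention with other brokers over the same rows.
        for partition in self.assigned_partitions:
            store = job_store[partition]
            for _ in range(min(batch_size, len(store))):
                self.buffer.append(store.popleft())

    def dequeue(self):
        # Workers pull from the warm local buffer instead of the datastore.
        return self.buffer.popleft() if self.buffer else None

broker = DequeueBroker(["queue_a.p0", "queue_a.p1"])
broker.prefetch()
print(broker.dequeue())  # -> "job1"
```

Because each partition has exactly one owner, a queue configured with a single partition also yields in-order consumption, which is the basis for the FIFO mode mentioned below.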
The new execution and queuing model alleviates problems encountered with Pinlater, such as the need to scan all partitions and lock contention when fetching data from hot partitions. Additionally, it supports in-order (FIFO) job execution when a job queue is configured with a single partition.
The new design requires exclusive assignment of queue partitions to dequeue broker instances, similar to the topic partitioning task of a Kafka consumer. The team chose to use Apache Helix to help with that functionality. Apache Helix provides a common cluster management framework and is used to track partition tasks across a set of dequeue brokers that form a cluster. Helix uses Apache Zookeeper to communicate resource configurations between Helix Controllers and Helix Agents embedded in dequeue broker instances.

Coordinate Dequeue Broker with Apache Helix and Zookeeper (Source: Pinterest Technical Blog)
The Helix controller monitors dequeue broker instances joining and leaving the cluster, as well as any changes to the configured job queues, and on any change recalculates the ideal distribution of queue partitions across the brokers. Once the latest partition allocation is saved in Zookeeper, individual broker instances update their internal state and fetch data for the queue partitions they are responsible for.
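The controller's rebalancing role can be illustrated with a toy round-robin assignment recomputed on every membership change; Helix's actual rebalancer and state model are considerably more sophisticated, and all names below are hypothetical:

```python
def assign_partitions(partitions, brokers):
    """Toy round-robin partition assignment, recomputed whenever the
    set of live brokers changes (the job a Helix controller performs)."""
    assignment = {b: [] for b in brokers}
    for i, partition in enumerate(sorted(partitions)):
        assignment[brokers[i % len(brokers)]].append(partition)
    return assignment

partitions = ["q.p0", "q.p1", "q.p2", "q.p3"]

# Two brokers alive: partitions split evenly between them.
print(assign_partitions(partitions, ["broker1", "broker2"]))

# broker2 leaves the cluster; the controller recomputes so that
# every partition still has exactly one exclusive owner.
print(assign_partitions(partitions, ["broker1"]))
```

In the real system the recomputed mapping is persisted in Zookeeper, and each broker reacts to the change notification by picking up or dropping partitions accordingly.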