Creating with Cat Media - Business Growth Technology Blog

Transforming Data Processing: An Insight into HubSpot's Hublet Architecture

Written by Juan A. Zabala | Apr 7, 2023 8:40:00 AM

In an era where data security and efficient data processing are paramount, architectural innovation is a necessity. HubSpot's "Hublet" architecture is a case in point. It is more than a sophisticated approach to data management: it has been running in a production environment for over one year.

Each Hublet is essentially a standalone copy of the entire HubSpot platform, serving a unique subset of customers. Each one is hosted in a single AWS region, with a secondary region for backup replication, which provides data redundancy and availability. Hublets are named with a geographic identifier followed by an incrementing number, such as na1 or eu1.

Routing of external traffic is primarily achieved through Hublet-specific DNS records. For instance, an EU customer accesses the product via app-eu1.hubspot.com, which subsequently makes API calls to api-eu1.hubspot.com. Each Hublet operates within its own AWS account and VPC, with databases locked at the network level to prevent accidental cross-Hublet traffic. This ensures that each Hublet operates independently, without dependence on others.
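The naming convention and the Hublet-specific hostnames can be illustrated with a short sketch. The function and type names below are hypothetical and exist only for illustration; only the name pattern (geographic identifier plus number) and the two EU hostnames come from the description above.

```typescript
// Hypothetical helpers: only the naming pattern and the example hostnames
// (app-eu1.hubspot.com, api-eu1.hubspot.com) come from the article.
interface HubletName {
  region: string; // geographic identifier, e.g. "na" or "eu"
  index: number;  // incrementing number, e.g. 1
}

// Split a Hublet name such as "eu1" into its two parts.
// Returns null when the name does not match "<letters><digits>".
function parseHubletName(name: string): HubletName | null {
  const match = /^([a-z]+)(\d+)$/.exec(name);
  if (match === null) return null;
  return { region: match[1], index: Number(match[2]) };
}

// Build the Hublet-specific hostnames following the article's EU example.
function hubletHosts(name: string): { app: string; api: string } {
  if (parseHubletName(name) === null) {
    throw new Error(`not a valid Hublet name: ${name}`);
  }
  return { app: `app-${name}.hubspot.com`, api: `api-${name}.hubspot.com` };
}

console.log(parseHubletName("eu1")); // region "eu", index 1
console.log(hubletHosts("eu1").app); // app-eu1.hubspot.com
```

Because each Hublet has its own DNS records, a client that knows its Hublet name can reach the right deployment without any shared lookup service.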

 

AI representation of two connected data centers, US and EU. Learn more about the HubSpot data center in the EU.

Data security is a top priority within the Hublet architecture. Secrets such as encryption/decryption keys and third-party API keys are unique to each Hublet and stored exclusively within it. Combined with the per-Hublet isolation described above, this gives customers granular control over where their data is processed and stored, and it improves performance by bringing data closer to the customer.

Learn more about HubSpot's Journey to Multi-Region.

A significant challenge of this architecture was routing API calls to the correct Hublet without compromising performance, reliability, or data localization. To address this, HubSpot implemented a solution using Cloudflare Workers. They transformed api.hubapi.com into a thin facade backed by a Cloudflare Worker. This worker determines the appropriate Hublet to forward the request to, based on the request data itself.

Cloudflare Workers provide a serverless execution environment for building new applications or augmenting existing ones without configuring or maintaining infrastructure. In HubSpot's case, the Worker intercepts every API call to api.hubapi.com, examines the request, and decides which Hublet should receive it, for example api-na1.hubapi.com or api-eu1.hubapi.com. It bases that decision on information embedded directly in the API keys and OAuth tokens, so routing requires no additional network calls that could affect performance or reliability.

Cloudflare Workers are highly scalable and execute code at the edge of Cloudflare's network, close to users. HubSpot uses them to route API traffic across its Hublets. Here's how:

 

  • Creation of a Facade: HubSpot turned api.hubapi.com into a thin facade backed by a Cloudflare Worker. The facade does not process API requests itself; it decides where to forward them based on the request data.

  • Routing Decisions: The Worker intercepts each API call to api.hubapi.com, examines the request, and decides which Hublet should receive it, for example api-na1.hubapi.com or api-eu1.hubapi.com.

  • Embedding Hublet Information: The Worker makes this decision using information embedded directly in the API keys and OAuth tokens.

  • Performance and Reliability: The decision is made at the edge of Cloudflare's network, which keeps latency low. Because it relies solely on data in the request itself, it needs no additional network calls that could affect performance or reliability.
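The routing steps above can be sketched as a few pure functions. This is a minimal illustration, not HubSpot's implementation. The credential layout "<kind>-<hublet>-<secret>", the Bearer header handling, the example path, and all function names are assumptions, since the article does not describe how the Hublet is encoded in keys and tokens.

```typescript
// Hublets this sketch knows about; the two names come from the article.
const KNOWN_HUBLETS = new Set(["na1", "eu1"]);

// ASSUMPTION: a made-up credential layout "<kind>-<hublet>-<secret>",
// e.g. "key-eu1-abc123". The real encoding is not described in the article.
function hubletFromCredential(credential: string): string | null {
  const parts = credential.split("-");
  if (parts.length < 3) return null;
  const hublet = parts[1];
  return KNOWN_HUBLETS.has(hublet) ? hublet : null;
}

// Pull the credential out of an "Authorization: Bearer <token>" header value.
function bearerToken(authorization: string | null): string | null {
  if (authorization === null) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(authorization.trim());
  return match === null ? null : match[1];
}

// Decide where a request to the facade should be forwarded. Returns the
// rewritten URL, or null when no Hublet can be determined. Only data carried
// by the request is used, so no network lookup happens here.
function routeToHublet(requestUrl: string, authorization: string | null): string | null {
  const token = bearerToken(authorization);
  const hublet = token === null ? null : hubletFromCredential(token);
  if (hublet === null) return null;
  const url = new URL(requestUrl);
  url.hostname = `api-${hublet}.hubapi.com`; // path and query are preserved
  return url.toString();
}

// Hypothetical request to the facade, carrying a credential for eu1.
console.log(routeToHublet("https://api.hubapi.com/v1/items?limit=10", "Bearer key-eu1-abc123"));
```

In an actual Worker, logic like this would presumably sit inside the fetch handler, which would then forward the original request to the returned URL. Keeping the decision a pure function of the request makes the "no additional network calls" property easy to verify.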

 

This approach has been thoroughly tested and has run successfully in production for over a year. It lets HubSpot route API traffic to the correct Hublet without sacrificing performance, reliability, or data localization. This use of Cloudflare Workers has improved the scalability and robustness of the platform while providing a better experience for customers.

 

Intrigued by the mechanics behind HubSpot's Hublet architecture? Dive deeper into the technical details here.