
How People Data Labs made their builds 25% faster at a third of the cost with Depot

Products Used
Remote container builds
GitHub Actions Runners
Depot Cache

People Data Labs is the leading provider of high-quality, compliant data on people and companies. Through its network of proprietary sources and the public web, PDL standardizes its dataset into a single source of truth available via API or data sharing. Companies building data-driven products in HR, Investment Research, Sales, and Marketing turn to PDL for accurate, reliable, and easy-to-use data at scale.

By adopting Depot, People Data Labs cut their CI spend by 68%. We spoke with Site Reliability Engineer Chris Carlson to learn how the team replaced their self-hosted runners with Depot, removed 3,000 lines of code, made their builds 25% faster at roughly a third of the cost, and shifted from infrastructure maintenance to innovation.

The challenge

The main thing that brought us to Depot was their GitHub Actions runners. We were hosting our own in AWS ECS. It was really expensive, and every little GitHub glitch resulted in something going wrong on our end. I do not envy you guys for taking that on, but I appreciate it.

Prior to adopting Depot, the People Data Labs team had been facing multiple challenges around GitHub Actions runners and Docker builds.

Expensive, glitchy GitHub Actions runners were a primary issue for the team. “We wanted to move our GHA runners to Fargate,” Chris said, “so that we weren’t hosting and managing the hardware. So that we could simplify our infrastructure and make it easier to understand and manage.”

Caching presented another problem for the team. Because they lacked visibility into their cache, they couldn’t easily identify and remedy sub-optimal caching practices. They had also run up against cache size constraints: their runners ran in AWS while the official GHA cache is hosted in Azure, so cache traffic had to cross clouds and suffered increased network latency.

“We were doing a lot of back and forth that way, and our caching was severely limited as a result,” they said. In practice, the painfully slow transit between AWS and Azure limited the team to a cache of about 1GB, far too small for them to take full advantage of caching.

The People Data Labs team was facing other problems as well:

  • Runners running out of disk space: The team launched jobs using ECS with the EC2 launch type. Those instances were shared with other GitHub Actions runners, so jobs competed for disk space, memory, and other resources. Troubleshooting these issues and testing fixes wasted time.
  • OOM issues: Because the EC2 instances were shared, the team hit frequent and mysterious out-of-memory (OOM) errors, which wasted even more time.
  • Build platform emulation: The team develops mostly on M1 Macs, which are Arm-based, and runs amd64 in production, so local builds needed emulation.

Chris told us that because their Kubernetes operator was not responsive enough, they “ended up writing a new operator based on ECS from scratch.” While that made managing runners and interactions with GitHub easier, “it was also more expensive, because we had to keep more standby hardware, and it was slow to respond. We were basically throwing a lot of money at it just to make sure we had the capacity for the level of responsiveness that we needed.”

The solution

Pulling up a container build, looking at its logs, seeing the steps that it took, what was cached, what wasn’t, how much time it took – that was a view that we could have reconstructed, but it would have taken a lot of time and effort. Having it suddenly available meant that we had a lot more insight into what was and wasn’t working the way we wanted and expected it to.

After dealing with the frustrations of managing their homegrown solution, Chris told us that the People Data Labs team “stumbled on Depot, and saw that we could just cut over and not have these problems anymore.”

Some of People Data Labs’ builds also needed access to resources inside their own AWS VPC. We worked with them to implement a VPC-peering solution that lets them use Depot-hosted runners while still reaching their private resources.

Chris said, “Once we figured out how to get connected with VPC-peering, the actual implementation was extremely fast. It’s probably the fastest service I’ve ever set up here.”

As Chris and the team moved their GitHub Actions workflows over to Depot, they quickly realized the following benefits:

  • Faster jobs and builds: The team has been able to shave 25% off deployment times thanks to Depot.
  • Increased build, job, and cache visibility: Depot dashboards like the Cache Explorer give the team more visibility into job and build failures, so the Platform team spends less time troubleshooting. That visibility also helps them support other teams, specifically by answering why something isn’t working, and it points to additional optimizations that could further improve their cache performance.
  • Unlimited caching: Depot offers unlimited GitHub Actions caching. People Data Labs are now able to have caches in excess of 100GB, and retrieval times are very fast. This came with a mental shift from “don’t cache because it’s too slow” to “cache everything.”
  • Moving from maintenance to innovation: Because Depot takes care of so much of what the People Data Labs team was trying to solve with their homegrown solution, they’ve been able to shift from maintaining that infrastructure to implementing additional improvements.
  • Flexibility in runner sizing: The team can now tune runner sizes per job without deploying any infrastructure. They moved from two runner types on their self-hosted setup to six Depot runner types, which lets them right-size the runner for each job.
  • Build platform emulation: The team now delegates their local builds to Depot rather than emulating amd64 on their own machines, so performance is limited only by bandwidth.

Chris and the team were particularly happy about what Depot’s caching strategy unlocked for them. “When we moved over to Depot, we suddenly had this nice fast cache, and that opened up whole worlds to us,” they said. “We’re doing so much more caching than we ever did before, and as a result of that, our builds are much faster.”

The measurable impact

Even with our usage beyond our initial estimates, our costs are still below what they were prior to adopting Depot. So, we’re saving money, and seeing better performance, and not managing all that infrastructure.

With Depot, Chris and their team have been able to build faster, decrease CI spend, and remove unnecessary infrastructure. Their adoption has resulted in these measurable impacts:

  • 25% faster deployment times.
  • 68% reduction in CI spend.
  • 3,000 fewer lines of code to maintain. The team was able to decommission their homegrown GHA runners, scaling controller, and Bazel remote-cache services.

Beyond the directly measurable cost and speed savings, Chris highlighted the time they and the team get back simply because Depot doesn’t need to be managed. “I haven’t looked at your documentation in months,” they said, “because it’s just working.”

They also mentioned that no longer needing to troubleshoot has left the team free to work on their products. “We aren’t able to quantify this,” Chris said, “but it’s huge.”

Looking ahead

As People Data Labs continues to grow, they plan to run more of their workloads on Depot’s container builds and GitHub Actions runners as they bring those workloads online to support their growing software systems.
