🚀 Inference case study

How Inference ships 4x faster using Depot GitHub Actions runners

Products Used
GitHub Actions runners
Remote container builds
Depot Cache

Inference.net provides AI infrastructure that helps companies train and host frontier AI systems at scale. They offer custom model training, dedicated GPU hosting with OpenAI compatibility, and specialized inference services for both real-time and batch workloads, processing trillions of tokens weekly for their customers.

When something breaks for a customer, the Inference team needs to ship a fix in minutes. But their unreliable CI was a bottleneck. With Depot’s GitHub Actions runners and container builds, they ship 4x faster and much more reliably. We spoke with Senior Engineer Harry Bairstow and Founding Engineer Francesco Virga to learn how they use Depot to solve one of the most frustrating engineering problems: slow, unreliable builds.

The challenge

"There's been a number of times where we've had mission critical bug fixes get completely blocked. There's nothing worse than fixing a bug quickly and wanting to tell the customer, 'Hey, 2 minutes after you noticed this bug, we fixed it'. But then we wait another 30 minutes just to get builds out."

Francesco Virga, Founding Engineer

Inference moves fast. They iterate with customers to develop efficient AI models and solutions. They build momentum and trust, and regularly ship fixes within minutes of identifying an issue. But slow builds and unreliable GitHub Actions runners were causing friction in the system, killing that momentum.

During one customer deployment, their GitHub Actions runner just hung. With no visibility into the issue, no way to cancel the run, and no way to start a new instance due to concurrency limits, they had to wait it out. For over 2 hours. Only to have the same commit build successfully the following morning.

They’d tried other solutions, from large CI providers to hosting their own GitHub Actions runners on site. As Harry recounted, “We'd had our own runners in-house. We then moved on to Raspberry Pis and small boxes and then even to the point of having a full K8s cluster dedicated to it. They all had their own issues.” Ideally, they wanted to avoid forcing the engineering team to learn a new system and build replacements for existing GitHub Actions.

The solution

"We connected [Depot GitHub Actions runners] and our builds just started running. We no longer had random tasks that never got picked up, or tasks living for hours without being able to kill them."

Harry Bairstow, Senior Engineer

Harry and Francesco were willing to make a big switch, and they “were trying out lots of products at the same time” when they heard about Depot.

They set up a time to meet with Depot’s co-founders, and Harry decided to give Depot a trial run beforehand. He swapped to Depot runners in Inference’s GitHub Actions workflows and saw immediate improvements in both build stability and execution time. “...it just made GitHub Actions seem to work, at a much higher rate than we could ever get it to do,” he recalled.

Once Harry and Francesco connected with the Depot team for support, Depot’s engineers jumped in to help them optimize even further, including switching to Depot for container builds and downsizing some runners to save costs.

The benefits Inference gets with Depot include:

  • Improved GitHub Actions reliability: Builds are stable. Engineers can count on GitHub Actions running reliably.
  • Faster builds: Fresh builds dropped from 2.5 minutes to ~56 seconds. Cached builds went from 2 minutes to 30 seconds.
  • Fast, reliable caching: Depot Cache works out of the box with our GitHub Actions runners; Inference notes that cache is key to their fast builds.
  • Support that’s “right there”: Depot support and engineering are always available for questions and troubleshooting.

Inference engineers benefit from Depot’s runners being a drop-in replacement for GitHub-hosted runners. They can keep using their preferred third-party GitHub Actions and also have reliable runners. “We got the best of both worlds. We didn't have to move out of that ecosystem, but we actually get runners that work consistently,” said Francesco.

The measurable impact

"We can now rely on the fact that when we push code our Actions will always run. GitHub being down is unrelated. At least we know that the actual Action runners will work."

Harry Bairstow, Senior Engineer

Stable builds were the critical requirement for Inference, and Depot delivered. Faster builds are a bonus.

Before Depot, it would routinely take up to an hour to push product updates through their entire deployment pipeline. Now end-to-end deployments reliably take about 5 minutes.

The speed and reliability still surprise them. “I'll push code and then... I still have this kind of knee-jerk reaction. I need to go check: how far is it along?” Francesco says. “I go look and it's already live and it's just super exciting to see that happening.”

Using Depot has resulted in these measurable impacts for Inference:

  • Zero CI infrastructure failures (since migrating to Depot)
  • Up to 12x faster end-to-end deployments
  • 4x faster cached builds
  • 2.7x faster fresh builds
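
For readers who want to check the multipliers, the short sketch below recomputes them from the timings quoted earlier in this case study. It assumes "up to an hour" means 60 minutes and takes the approximate "~56 seconds" figure as exactly 56; the `speedup` helper is a hypothetical name used only for illustration.

```python
def speedup(before_seconds: float, after_seconds: float) -> float:
    """Return how many times faster the 'after' duration is than 'before'."""
    return before_seconds / after_seconds


# Timings quoted in the case study, converted to seconds.
fresh = speedup(2.5 * 60, 56)      # fresh builds: 2.5 minutes -> ~56 seconds
cached = speedup(2 * 60, 30)       # cached builds: 2 minutes -> 30 seconds
deploy = speedup(60 * 60, 5 * 60)  # deployments: up to 1 hour -> about 5 minutes

print(f"fresh builds: {fresh:.1f}x")
print(f"cached builds: {cached:.1f}x")
print(f"end-to-end deployments: {deploy:.1f}x")
```

Rounded to one decimal place, the three ratios line up with the 2.7x, 4x, and 12x figures listed above.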

Looking ahead

Inference is just getting started with Depot. They’ve built a reliable CI foundation to maintain their rapid deployment cadence without worrying about stability. Next up: trying out Depot CI, Depot's drop-in replacement for GitHub Actions.
