
Ganesh Datta
Host, CTO & Co-founder of Cortex

Randy Shoup
SVP of Engineering at Thrive Market
February 26, 2026
In This Episode
In this episode of Braintrust, Cortex co-founder and CTO Ganesh Datta sits down with Randy Shoup, SVP of Engineering at Thrive Market. Randy shares lessons from his leadership roles across multiple companies and explains how measurement and transparency can help teams build stronger engineering cultures.
Randy and Ganesh chat about how fear can block progress, why recovery speed matters more than trying to prevent every failure, and how teams improve through steady, incremental gains. They also discuss a few practical ways to build trust around metrics so organizations can use visibility for learning instead of punishment.
You’ll learn
Randy says teams are much more likely to care about reliability and delivery when they can clearly see their current state.
Randy argues that metrics like deployment frequency, change failure rate, and MTTR should never be used to stack rank individual engineers.
Leaders can reduce anxiety by being direct about why they're introducing metrics and by proving over time that the data is used to help teams improve.
Randy says that resilience depends on how fast teams recover when failures happen, not on the unrealistic goal of eliminating all failures.
Sustained improvement comes from celebrating progress, sharing what works, and raising standards over time.
Quotes
"The sole goal of the team is to mitigate the failure, whatever it is, and restore service as quickly and as fully as possible."
Randy Shoup
SVP of Engineering at Thrive Market
"As a service provider, I should prioritize recovering quickly when I do fail, as opposed to trying to prevent all failures."
Randy Shoup
SVP of Engineering at Thrive Market
"The unit of production of value is the team."
Randy Shoup
SVP of Engineering at Thrive Market
"A lot of times, incidents are unplanned investments."
Randy Shoup
SVP of Engineering at Thrive Market
"How do I get people to care about X? Measuring X and being transparent about X. That is the way to do it."
Randy Shoup
SVP of Engineering at Thrive Market
Timestamps
(02:45) Three types of organizational culture and why fear blocks transparency.
(08:15) Using DORA metrics to build trust and improve delivery at scale.
(11:24) Why comparing a team to its past self works better than comparing teams to each other.
(18:03) Treating incidents as unplanned investments and capturing the learning return.
(26:28) Measurement and transparency as the first step toward a reliability culture.
(35:13) Making MTTR visible and putting service owners on call.