Canva was built on the idea that you don't need to be a professional designer to create something great. Tyler Davis, a software engineer at Canva, brought that same thinking to operational excellence at IDPCON 2025. "We want to democratize operational excellence as well," he said. "We don't want it to be a centralized problem. We want teams to be able to do this in a distributed fashion."
With 240 million monthly active users and 35 billion designs created globally, Canva has plenty of operational surface area to manage. For years, that work fell on a centralized reliability team conducting manual reviews across the org, a process Tyler said simply wasn't going to scale. Canva launched Cortex two weeks before IDPCON, so Tyler and his team were still in the middle of implementing the platform when he spoke, which gave him a firsthand view of what the rollout was like.
Instead of walking through the migration, Tyler used his time at IDPCON to address a more fundamental question: what does it actually take for operational excellence to stick? His answer was less about tooling than about human behavior. More specifically, Tyler explored the two conditions that determine whether engineering teams stay on top of reliability work or let it quietly accumulate.
The two ingredients for operational excellence: awareness and incentive
Tyler's framework for operational excellence comes down to two things: knowing a problem is coming, and having a reason to fix it before it arrives.
To illustrate awareness, he pointed to his houseplant, which he doesn't water nearly enough. It just sits there, quietly declining, because it doesn't make itself known. His dog Colby, on the other hand, has never missed a meal. Colby sits. He makes eye contact. He gets dinner. "I have had zero incidents in the Colby feeding schedule," Tyler said.
The point of the comparison is that problems that don't surface themselves get ignored. Scorecards change that by making a known issue as visible as its priority warrants. The squeaky wheel gets the grease, so Tyler says it's important to make things squeakier.
The incentive side is trickier. The deeper problem, as Tyler sees it, is that operational work is simply hard to talk about.
"Avoiding a problem is hard to talk about. It's easy to talk about launching feature X, Y, and Z. It's a lot harder to talk about avoiding an incident or doing some operational work which improves something's reliability." — Tyler Davis, Software Engineer, Canva
Scorecards give engineers something concrete to bring into a performance review. Raising the profile service from bronze to silver is a specific, visible outcome for work that would otherwise leave no artifact to point to, precisely because no incident occurred.
Start with your lowest standard, not your highest
Before implementing Cortex, Tyler said one of Canva's goals was to avoid putting engineers on call too often. In response, they developed a standard where no engineer would be on call more than 25% of the time. When they rolled it out across their full catalog, however, too many teams fell short of it.
It would have been easy to remove the standard altogether, but Tyler said Canva instead relaxed the limit to 50% of the time. By any measure, this was a bad target, but Tyler says it was exactly the triage signal they needed.
"Your lowest standard is the thing that tells you who's worst off, who needs the most help." — Tyler Davis, Software Engineer, Canva
By identifying who was failing even the floor, the team knew exactly where to focus first. They got everyone to 50% or below, tightened the standard back to 25%, and kept going.
At the time, all of this was manual, which made it slower to identify who needed help and harder to track progress. "Scorecards are a natural fit to this problem," Tyler said, "and if we had them back then it would have been a lot easier." The principle holds for any standard you're rolling out against services that already exist: your lowest bar is a diagnostic that tells you who's worst off and where to start.
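The triage idea is easy to sketch in code. The Python below is only an illustration under stated assumptions: the team names and rotation sizes are hypothetical, and it assumes each team shares a single on-call rotation evenly, so an engineer's on-call fraction is one divided by the rotation size. It is not Canva's tooling or a Cortex API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Team:
    name: str           # hypothetical team name
    rotation_size: int  # engineers sharing one on-call rotation (assumed model)


def on_call_fraction(team: Team) -> float:
    """Fraction of time each engineer is on call, assuming an evenly shared rotation."""
    if team.rotation_size < 1:
        raise ValueError("rotation_size must be at least 1")
    return 1.0 / team.rotation_size


def failing(teams: list[Team], max_fraction: float) -> list[Team]:
    """Teams whose engineers exceed max_fraction, worst off first."""
    over = [t for t in teams if on_call_fraction(t) > max_fraction]
    return sorted(over, key=on_call_fraction, reverse=True)


# Hypothetical catalog, for illustration only.
teams = [Team("payments", 1), Team("search", 2), Team("editor", 3), Team("export", 5)]

floor = failing(teams, 0.50)   # the relaxed standard: who is worst off?
target = failing(teams, 0.25)  # the original standard: who still has work to do?

print([t.name for t in floor])
print([t.name for t in target])
```

Run against the relaxed 50% limit, the list is short enough to act on. Run against the 25% standard, it shows how much of the catalog still needs attention once the floor is cleared.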
Don't be too draconian
Tyler closed with Armageddon. The movie, not actual coding Armageddon. NASA has strict standards for who can go to space, and the oil drillers recruited to save the planet fail the medical exam. Facing limited options, NASA clears them anyway with a promise to get in shape on the way back.
"We can't get too draconian with the scorecards," Tyler said. Standards matter, but rigid enforcement without judgment creates its own problems. Tyler urged the audience to know when to use the exemption and be willing to use it.
The underlying concern is about trust. Scorecards only work as an incentive structure if engineers believe the system is reasonable. A team that gets blocked or penalized because of a requirement they genuinely can't meet right now isn't going to engage more deeply with operational excellence. They're just going to route around the system entirely. An exemption, used with judgment, isn't a concession that the standard is wrong. It's what keeps the scorecard from becoming an obstacle instead of a signal.
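One way to picture an exemption is as a third rule status alongside pass and fail. The sketch below is an assumption-heavy illustration, not how Cortex models exemptions: the rule names, the `Exemption` type, and the idea of an expiry date are all hypothetical, with the expiry date chosen to mirror the "get in shape on the way back" promise.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Exemption:
    rule: str      # hypothetical rule identifier
    reason: str    # why the team cannot meet the rule right now
    expires: date  # assumed: exemptions are temporary, not permanent waivers


def evaluate(results: dict[str, bool], exemptions: list[Exemption], today: date) -> dict[str, str]:
    """Label each rule 'pass', 'exempt', or 'fail'; expired exemptions no longer apply."""
    active = {e.rule for e in exemptions if e.expires >= today}
    status = {}
    for rule, passed in results.items():
        if passed:
            status[rule] = "pass"
        elif rule in active:
            status[rule] = "exempt"
        else:
            status[rule] = "fail"
    return status


# Illustrative data only.
today = date(2025, 1, 15)
results = {"has_on_call_schedule": True, "has_slos": False, "has_alerting": False}
exemptions = [
    Exemption("has_slos", "new service, SLOs planned after launch", date(2025, 3, 1)),
    Exemption("has_alerting", "waiver from an earlier migration", date(2024, 12, 1)),
]

status = evaluate(results, exemptions, today)
print(status)
```

Keeping "exempt" distinct from "pass" means the gap stays visible on the scorecard while the team is not penalized for it, which matches the signal-over-obstacle framing above.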
What Canva built and what's next
Two weeks post-launch, Canva already had three scorecards in production, built hand-in-hand with domain teams:
A reliability Scorecard converted directly from the manual Confluence-based launch checklist the reliability team had maintained, covering on-call schedules, SLOs, alerting, and incident response training
An engineering quality Scorecard focused on test coverage, flake rates, and code-level health
An entity setup Scorecard to measure the quality of the catalog itself, on the principle that better catalog data makes every other Scorecard more valuable
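To make the checklist-to-Scorecard conversion concrete, here is a minimal Python sketch of how checks like those in the reliability Scorecard could roll up into levels. The four checks come from the list above and bronze and silver are mentioned in the talk, but the gold level, the check names, and the assignment of checks to levels are assumptions made for illustration. This is not Cortex's rule syntax.

```python
# Ordered ladder of levels; a service must clear every lower level first.
# Which check belongs to which level is an assumption for this sketch.
LEVELS = [
    ("bronze", ["has_on_call_schedule"]),
    ("silver", ["has_slos", "has_alerting"]),
    ("gold", ["incident_training_done"]),
]


def level_for(service: dict[str, bool]) -> str:
    """Return the highest level whose checks, and all lower levels' checks, pass."""
    achieved = "none"
    for name, checks in LEVELS:
        if all(service.get(check, False) for check in checks):
            achieved = name
        else:
            break  # a failed level blocks every level above it
    return achieved


# Hypothetical state of a service before and after some operational work.
profile_service = {
    "has_on_call_schedule": True,
    "has_slos": False,
    "has_alerting": True,
    "incident_training_done": False,
}
before = level_for(profile_service)

profile_service["has_slos"] = True  # the team defines SLOs
after = level_for(profile_service)

print(before, "->", after)
```

The before-and-after pair is the kind of artifact an engineer could point to in a review: the level changed because specific checks started passing.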
Beyond Scorecards, Canva is using Cortex Workflows as the front end for its component scaffolding system, has ported Backstage plugins covering cloud costs and compliance, and built an adapter framework so other teams across the org can sync data into Cortex without touching the API directly.
Looking ahead, Tyler said there's a long list of teams at Canva already interested in building scorecards, and the focus is on rolling them out carefully rather than quickly. Cloud cost and compliance are the next natural candidates, given that Canva already has entity-level plugins tracking both. He's also excited about entity-level workflows, or the ability to take targeted actions against a specific service directly from Cortex, rather than having to go elsewhere to act on what the catalog is telling you.
"Just remember awareness and incentive," Tyler said as he closed his talk. "If you need help remembering, AI is a really easy acronym."
Watch Tyler's full session and other talks from IDPCON 2025 on demand here.


