Establish best practices and keep teams accountable

Scorecards allow your team to define standards like production readiness and development quality, and enforce them without building scripts and maintaining spreadsheets.

“Scorecards have made it incredibly easy to track the status of migrations across different services and teams.

We now have real data on which services are at risk and no longer need to manually check with teams, run scripts, or dig through several tools to find the right data. No one has to go in and update anything manually - it’s all automated and synced with Cortex.”

Rafael Garcia - Cofounder & CTO, Clever

Scorecard Examples

Operational Maturity

Are services meeting SLOs? Are on-call metrics looking healthy? Are post-mortem tickets closed promptly? Are there too many customer-facing incidents?
Sample Scorecard Rules
oncall.analysis.meanSecondsToResolve < 3600
Make sure that issues are resolved in a reasonable amount of time. If they’re not, you can dig into the root cause.
10
oncall.analysis.offHourInterruptions < 3
Paging engineers outside working hours leads to alert fatigue and low morale. Catching services that cause a high number of off-hours interruptions helps improve developer happiness.
30
JIRA: post mortem tickets opened in the last 6 months that are still open
Developers constantly creating action items for services and not actually closing them is an organizational risk. Either the team is not prioritizing incident-related issues, or the team is not equipped with the right resources.
10
jira.issues("labels=customer and created > startOfMonth(-3)") < 2
A reliable service should not be a source of frequent customer-facing incidents.
20
jira.issues("labels=compliance") < 3
Make sure there are no outstanding compliance/legal issues affecting the service.
10
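
As a rough illustration of how threshold rules like the ones above could roll up into a single score, here is a minimal Python sketch. It is not Cortex's implementation: the `Rule` class, the metric keys, the sample values, and the reading of the number shown with each rule as a point weight are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A hypothetical scorecard rule: a name, a predicate, and a point weight."""
    name: str
    check: Callable[[Dict[str, float]], bool]
    points: int


# Hypothetical rules mirroring three of the thresholds listed above.
RULES: List[Rule] = [
    Rule("mean time to resolve under 1 hour",
         lambda m: m["mean_seconds_to_resolve"] < 3600, 10),
    Rule("fewer than 3 off-hours interruptions",
         lambda m: m["off_hour_interruptions"] < 3, 30),
    Rule("fewer than 2 customer-facing incidents",
         lambda m: m["customer_incidents"] < 2, 20),
]


def score(metrics: Dict[str, float], rules: List[Rule] = RULES) -> float:
    """Return the fraction of available points earned by passing rules."""
    total = sum(r.points for r in rules)
    earned = sum(r.points for r in rules if r.check(metrics))
    return earned / total if total else 0.0


def failing(metrics: Dict[str, float], rules: List[Rule] = RULES) -> List[str]:
    """Return the names of the rules this service does not pass."""
    return [r.name for r in rules if not r.check(metrics)]


# Made-up sample data for a single service.
sample = {
    "mean_seconds_to_resolve": 2400,
    "off_hour_interruptions": 5,
    "customer_incidents": 1,
}
```

With this sample data, the service passes the resolve-time and incident rules and fails only the off-hours rule, so `failing(sample)` points directly at the area that needs attention.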

Operational Readiness

Are services ready to be deployed to production? Are there runbooks, dashboards, logs, on-call escalation policies, monitoring/alerting, and accountable owners?
Sample Scorecard Rules
owners.count > 2
Incident response requires crystal clear accountability, so make sure there are owners defined for each service.
10
oncall.escalations.count > 1
Check that there are at least 2 levels in the escalation policy, so that if the first on-call does not ack, there is a backup.
30
runbooks.count >= 1
Create a culture of preparation by requiring that there are runbooks in place for the service.
10
links("logs").count > 1
When there is an incident, responders should be able to easily find the right logs (usually load balancer logs + application logs).
20
dashboards.count >= 1
Responders should have standard dashboards quickly accessible for every service to speed up triage.
10
custom("pre-prod-enabled") = true
Use an asynchronous process to check whether there is a live pre-prod environment for the service, and send a true/false flag to Cortex using the custom metadata API.
10
sonarqube.metric("vulnerabilities") < 3
Ensure that production services are not deployed with a high number of security vulnerabilities.
10
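
The asynchronous pre-prod check described above might look something like the following Python sketch. The payload field names and the `probe` callable are hypothetical, chosen only for illustration; they are not Cortex's actual custom metadata schema, and the code deliberately stops short of making any HTTP call because the endpoint is not given here.

```python
import json
from typing import Callable


def check_pre_prod(probe: Callable[[], bool]) -> bool:
    """Run a health probe and treat any exception as 'not live'."""
    try:
        return bool(probe())
    except Exception:
        return False


def build_flag_payload(service_tag: str, is_live: bool) -> str:
    """Serialize a true/false flag for one service as JSON.

    The field names are hypothetical, not Cortex's actual schema.
    """
    return json.dumps(
        {"service": service_tag, "key": "pre-prod-enabled", "value": is_live},
        sort_keys=True,
    )


# Example: a probe that always succeeds stands in for a real environment check.
payload = build_flag_payload("payments-api", check_pre_prod(lambda: True))
```

Treating a failed or erroring probe as `False` keeps the flag conservative: a service only passes the rule when the environment was positively confirmed to be live.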

Development Maturity

Is code coverage adequate? Do the right lock files and READMEs exist? Are the right package versions being used? Is ownership properly defined?
Sample Scorecard Rules
owners.count > 2
Catch organizational risk by detecting orphaned services.
10
git.fileExists("package-lock.json")
Developers should be checking in lockfiles to ensure repeatable builds.
30
sonarqube.metric("coverage") > 80.0
Set a threshold that’s achievable, so there’s an incentive to actually try. This also serves secondarily as a check that the service is hooked up to SonarQube and reporting frequently.
10
git.lastCommit.freshness < duration("P30D")
Counterintuitively, services that receive infrequent commits are at greater risk. People who are familiar with the service may leave the team, tribal knowledge accumulates, and the service may be running outdated versions of your platform tooling.
20
git.fileExists("*Test.java")
Use a wildcard search to make sure there are unit tests enabled.
10
git.numRequiredApprovals >= 1
Ensure that a rigorous PR process is in place for the repo, and PRs must be approved by at least one user before merging.
10
git.fileContents(".circleci/config.yml").matches(".*npm test.*")
Enforce that a CI pipeline exists, and there is a testing step defined in the pipeline.
10
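
To make the repository checks above more concrete, the following Python sketch approximates three of them: a wildcard file search, a regular-expression match on file contents, and a 30-day commit freshness test (`P30D` is the ISO 8601 duration for 30 days). How Cortex evaluates these rules internally is not described here, so the helper names and the matching semantics (wildcards applied to the full path, the regex applied to the whole file) are assumptions.

```python
import fnmatch
import re
from datetime import datetime, timedelta, timezone
from typing import Iterable


def file_exists(paths: Iterable[str], pattern: str) -> bool:
    """True if any repository path matches the shell-style wildcard pattern."""
    return any(fnmatch.fnmatchcase(path, pattern) for path in paths)


def contents_match(contents: str, pattern: str) -> bool:
    """True if the whole file matches the regex; DOTALL lets '.' span newlines."""
    return re.fullmatch(pattern, contents, flags=re.DOTALL) is not None


def is_fresh(last_commit: datetime, now: datetime,
             max_age: timedelta = timedelta(days=30)) -> bool:
    """True if the last commit is more recent than max_age."""
    return now - last_commit < max_age


# Made-up repository data for illustration.
repo_paths = ["README.md", "src/test/java/OrderServiceTest.java"]
ci_config = "jobs:\n  test:\n    steps:\n      - run: npm test\n"
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
```

For example, `file_exists(repo_paths, "*Test.java")` and `contents_match(ci_config, ".*npm test.*")` both hold for this sample data, while a last commit 45 days before `now` would fail `is_fresh`.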

Migrations

Have teams moved to the right platform library version? Is the migration to the new Kubernetes cluster complete? How many teams have the right CI file checked in?
Sample Scorecard Rules
custom("ci-platform-version") > semver("1.1.3")
Having every CI pipeline send a current version to Cortex on each master build lets you catch services that are on outdated versions of tooling (like CI, deploy scripts, etc).
10
package("apache.commons.lang") > semver("1.2")
Cortex automatically parses dependency management files, so you can easily enforce library versions for platform migrations, security audits, and more.
10
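
Version rules like the ones above depend on comparing versions numerically rather than as strings. The Python sketch below shows one simple way to do that; it is an illustration only, not how Cortex's `semver(...)` works, and it assumes plain `MAJOR.MINOR.PATCH` versions with no pre-release or build suffixes. The function names are hypothetical.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR[.MINOR[.PATCH]]' into a 3-tuple, padding missing parts with 0."""
    parts = [int(p) for p in text.strip().split(".")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unsupported version format: {text!r}")
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def is_newer(reported: str, minimum: str) -> bool:
    """True if the reported version is strictly greater than the minimum."""
    return parse_version(reported) > parse_version(minimum)
```

Comparing tuples of integers avoids the classic string-comparison trap: `is_newer("1.10.0", "1.9.9")` is true here, whereas comparing the raw strings would rank `"1.10.0"` below `"1.9.9"`.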

One-click integration with third-party tools

Scorecards fetch data automatically from your integrations without manual work, letting you easily enforce standards across all your tools.

Make sure each service has accountable owners, an on-call rotation, high test coverage, and much more.

The flexibility to meet your organization’s needs

Our robust APIs make it easy to use data from custom sources in your scorecards.

Cortex Query Language (CQL) enables you to create complex rules that can compare data across multiple sources or write expressive logical statements.
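
The idea of combining data from several sources into one logical statement can be sketched in Python as follows. This is not CQL syntax and not Cortex's implementation; the fact keys and helper names are hypothetical, and the example only shows how a compound rule might be composed from smaller checks.

```python
from typing import Any, Callable, Dict

Facts = Dict[str, Any]
Predicate = Callable[[Facts], bool]


def all_of(*predicates: Predicate) -> Predicate:
    """Combine predicates with logical AND."""
    return lambda facts: all(p(facts) for p in predicates)


def any_of(*predicates: Predicate) -> Predicate:
    """Combine predicates with logical OR."""
    return lambda facts: any(p(facts) for p in predicates)


# Hypothetical checks, each drawing on a different data source.
def has_backup_on_call(facts: Facts) -> bool:
    return facts["escalation_levels"] > 1


def well_tested(facts: Facts) -> bool:
    return facts["coverage_percent"] > 80.0


def not_in_production(facts: Facts) -> bool:
    return not facts["in_production"]


# A service must have a backup on-call and, once in production, good coverage.
production_ready = all_of(has_backup_on_call, any_of(well_tested, not_in_production))
```

Composing small predicates this way keeps each individual check easy to read while still allowing rules that span on-call, code quality, and deployment data.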

Enable leaders to make informed decisions

Historical data and organizational summaries give leadership deep visibility into progress, bottlenecks, and areas of risk.

Drive organizational progress with ease using Initiatives

Within any Scorecard, assign owners and due dates to drive any best practice, platform migration, or audit requirement.