Methodology · PreFlight · 14 minute read

A project quality score for SAP delivery: why we built one, and how it works.

Most SAP programmes discover quality problems at UAT. By then the cost of resolution is multiples of what it would have been in build. PreFlight is the instrument set we use to surface those problems earlier — and to score them so a sponsor can read project health at a glance.

Raptors PreFlight team · Delivery methodology · May 2026

Why most SAP projects discover quality problems at UAT

The pattern is consistent across programmes. Build phases close on time. Configuration is reviewed against requirements; integrations are unit-tested; data migrations pass the dry-run. The team moves into User Acceptance Testing — and the dashboard goes red.

What surfaces at UAT is rarely a single defect. It is a cluster — gaps in configuration coverage, integration paths that work in isolation but fail at scale, data quality issues that survived migration, governance debt that accumulated through unrecorded scope changes. The cluster forces a re-plan, a slip, and — sometimes — a partial re-implementation.

None of this is new. Every methodology document warns against it. SAP Activate spends entire pages on entry and exit criteria for each phase. The reason the pattern persists is not that practitioners don't know — it's that the measurement is qualitative and the rhythm is wrong. "Configuration looks complete" is not a measurement. A phase-gate review held quarterly is not a rhythm.

PreFlight is our attempt to fix both. It is a set of four quantitative instruments, a composite scoring model that rolls them into one number, and a threshold rule that ties the score to a recommended action. We use it on every Raptors programme. We also offer it as a diagnostic to customers running programmes someone else delivered.

The four instruments

Each instrument measures one dimension. The four are deliberately orthogonal — a programme can score well on three and badly on one, and the threshold rule will catch it.

One: Configuration coverage. The percent of in-scope configuration objects that have been built, reviewed, and tested against requirements. Measured per SAP Activate phase. Targets requirements traceability — what was specified, what was built, what was verified.

Two: Integration readiness. The percent of in-scope integrations that have been built, end-to-end tested with realistic data volume, and exception-tested against known failure modes. Targets the integration layer that fails most often at scale.

Three: Data quality posture. A composite of migration-ready percent, post-migration validation coverage, and the residual count of known issues. Targets the data estate that has to land clean on go-live day.

Four: Governance discipline. A score on scope-change documentation, decision-log completeness, risk register currency, and phase-gate evidence. Targets the audit trail that has to exist when the steering committee asks why.

The PreFlight score was the first number on the programme that I trusted. Everything else was a story.
Customer sponsor, KSA retailer

The composite score — and why a single number matters

The four instruments roll into one composite, scored 0-100. The composite is weighted: configuration coverage 30%, integration readiness 30%, data quality posture 25%, governance discipline 15%. The weights are deliberate — they reflect the relative cost of each dimension when it goes wrong at UAT.

The argument for a single number is not that the number is sufficient. It isn't — the instruments matter individually, and a triage conversation reads them individually. The argument is that a single number is sufficient at the steering-committee surface. A programme's executive sponsor does not want a five-dimensional radar chart. They want a current health reading they can act on.

The composite is recalculated weekly. The trajectory matters as much as the level — a programme at 78 trending down is a different conversation from a programme at 78 trending up.
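The weighted roll-up can be sketched in a few lines. This is an illustrative reconstruction, not the production PreFlight tooling: the weights are the ones stated above, but the instrument keys and the `composite_score` function are assumptions of ours.

```python
# Sketch of the PreFlight composite roll-up. Weights are from the article;
# the key names and function are hypothetical, for illustration only.
WEIGHTS = {
    "configuration_coverage": 0.30,
    "integration_readiness": 0.30,
    "data_quality_posture": 0.25,
    "governance_discipline": 0.15,
}

def composite_score(instruments: dict) -> float:
    """Roll four 0-100 instrument readings into one weighted 0-100 composite."""
    missing = WEIGHTS.keys() - instruments.keys()
    if missing:
        raise ValueError(f"missing instrument readings: {sorted(missing)}")
    return sum(instruments[name] * weight for name, weight in WEIGHTS.items())

# A programme strong on config and integration but weak on data quality:
reading = {
    "configuration_coverage": 92,
    "integration_readiness": 85,
    "data_quality_posture": 55,
    "governance_discipline": 70,
}
print(composite_score(reading))  # roughly 77.35: amber despite two strong instruments
```

The example shows why the instruments are read individually in triage: two green readings do not rescue a composite dragged down by a weak data estate.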

The threshold rule (≥85, 70-84, 50-69, <50)

The score-to-action mapping is the part that makes the measurement operational. Four bands, four recommended actions.

  • ≥85 — green. The programme is on track against the entry criteria of the next phase. The next phase gate can be approved on the current evidence.
  • 70-84 — amber, monitor. Specific instruments are below target but the composite is defensible. A weekly remediation plan is required. Phase gate may proceed with the remediation plan as condition.
  • 50-69 — amber, intervene. The composite is below the threshold a sponsor can defend in steering. Phase gate is held. The intervention is documented; the gate reopens when the intervention closes the gap.
  • <50 — red, re-plan. The programme is materially behind. A re-plan is the only honest path. The re-plan starts from a fresh PreFlight reading on the new dates.

The rule is harder to game than narrative reporting. A programme cannot talk its way past 50.
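The band boundaries map mechanically to actions, which is what makes the rule hard to argue with. A minimal sketch, with bands and actions paraphrased from the list above; `threshold_action` is a hypothetical name, not part of the PreFlight tooling.

```python
# The four-band threshold rule as a straight lookup. Boundaries are from
# the article; wording of the actions is paraphrased.
def threshold_action(score: float) -> tuple:
    """Map a 0-100 composite score to (band, recommended action)."""
    if score >= 85:
        return ("green", "approve the next phase gate on current evidence")
    if score >= 70:
        return ("amber-monitor", "proceed, with a weekly remediation plan as a gate condition")
    if score >= 50:
        return ("amber-intervene", "hold the phase gate until the intervention closes the gap")
    return ("red", "re-plan, starting from a fresh PreFlight reading on the new dates")

print(threshold_action(78))  # amber-monitor: below target, composite still defensible
```

Note that the mapping takes only the score; there is no argument through which a narrative can move a programme up a band.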

What gets measured per Activate phase

PreFlight is built around the SAP Activate phase model. Each phase has its own instrument-level targets and its own composite threshold for the gate.

Discover and Prepare. Governance and data quality posture lead. Configuration coverage targets are light — most config hasn't started — but the requirements register has to be measurable.

Explore. Configuration coverage scales into the dominant instrument. Integration readiness starts to register; data quality posture matures with the migration prep.

Realize. All four instruments are at full target weight. This is the phase the PreFlight composite is optimised for — every dimension has to score by the end.

Deploy. Configuration coverage and integration readiness are sustained at target; data quality posture targets shift to cutover readiness; governance scores include the go-live decision pack.

Run. The instruments shift to operational metrics. PreFlight is more diagnostic than gate-keeping here — but it is the input to the AMS Wave Health view.

Applying PreFlight to a programme that's not yours

The diagnostic mode is where customers usually first meet PreFlight. A programme is in trouble. Another SI is running it. The customer wants a second reading that isn't from the team in the chair.

We instrument the programme without disrupting delivery. Four to six weeks of access, two senior practitioners, one composite score and four instrument readings at the end. The output is a recommendation — proceed, intervene, re-plan — with the evidence trail.

We've run diagnostic-mode PreFlight on programmes we then took over and on programmes we did not. The honesty of the framework is the point. A diagnostic that always recommends "you should switch to us" is not a diagnostic.

Where to go next

If you're running an SAP programme — or sponsoring one, or about to sponsor a recovery of one — PreFlight is the instrument set we'd use ourselves on day one. Two artefacts start the conversation: your current phase plan and the most recent dashboard report. We can run a two-hour PreFlight read against them and tell you where the signal is.

See the PreFlight service page for the instrument detail and dashboard examples, or talk to our team directly.
