Practice 03

Reference Data & Governance

Data quality problems are almost always data ownership problems. When nobody is accountable for the accuracy of a record, every downstream consumer adjusts it before using it — and no two consumers adjust it the same way. We establish the ownership, the lineage, and the governance so the data becomes something people trust.

Engagement entry point
Fixed-fee diagnostic, 2–4 weeks
Typical clients
Mutual lenders, fund administrators, platform businesses with customer data fragmentation across core systems, CRM, and reporting
Tooling
Informatica, Alteryx, Dataiku, PostgreSQL, Supabase. Platform-agnostic governance frameworks.

The single customer view that does not exist

Most mid-market financial institutions have a single customer view in their strategic roadmap. Many have had it there for three or more years. The obstacle is rarely technology — it is the unglamorous work of establishing what the authoritative version of each data element is, who is accountable for it, and how conflicts between systems are resolved when they arise.

Without that work, every data quality project becomes a project to clean the data rather than fix the conditions that produced it. The data is clean for six months and dirty again by year end.

Multiple versions of the same customer

Core banking, CRM, and reporting systems hold different versions of the same record. Addresses, contact details, and product holdings do not align. Nobody is sure which is right.

Regulatory reporting adjustments

Reports submitted to regulators are manually adjusted before submission. The adjustments correct for known data quality issues that have never been addressed at the source.

No lineage for critical data

When a figure in a board report is questioned, tracing it back to its source is a half-day exercise involving multiple people. Audit requests generate disproportionate internal effort.

Instrument reference data drift

For institutions with investment or treasury exposure, instrument data — pricing sources, classification hierarchies, identifier mappings — drifts over time without systematic stewardship.

Ownership gaps

It is unclear who is responsible for the accuracy of key data elements. Data quality issues are raised with IT, which cannot fix them because the problem is in the business process, not the system.

AI readiness blocked

AI and automation initiatives are stalled because the training data or the input data cannot be trusted. The technology investment is ready. The data is not.

Phase 1 — Data landscape diagnostic (weeks 1–4)

We map the authoritative sources, the conflicts, and the trust gaps across your critical data domains: customer reference data, product and account data, and instrument data where relevant. We identify which data quality issues are structural (fixable at source) and which are process-driven (requiring ownership and stewardship changes).

We verify against your actual schema — not your data dictionary, which in most environments does not reflect the live database.
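To make the idea concrete, here is a minimal sketch of comparing a data dictionary with a live schema. It is an illustration under stated assumptions, not a description of any particular tool: the `compare_schema` function, the table and column names, and the sample data are all hypothetical. In PostgreSQL, the live side could be read from the standard `information_schema.columns` view, but the sketch uses plain dictionaries so it runs on its own.

```python
from typing import Dict, List, Tuple

# Both inputs map (table, column) -> declared data type.
Schema = Dict[Tuple[str, str], str]


def compare_schema(documented: Schema, live: Schema) -> Dict[str, List[Tuple[str, str]]]:
    """Report where a data dictionary and a live schema disagree."""
    documented_keys = set(documented)
    live_keys = set(live)
    return {
        # Columns that exist in the database but were never documented.
        "undocumented": sorted(live_keys - documented_keys),
        # Columns the dictionary describes that no longer exist.
        "missing_in_live": sorted(documented_keys - live_keys),
        # Columns present on both sides with different declared types.
        "type_mismatch": sorted(
            key
            for key in documented_keys & live_keys
            if documented[key].lower() != live[key].lower()
        ),
    }


# Hypothetical example data, for illustration only.
documented = {
    ("customer", "customer_id"): "integer",
    ("customer", "postcode"): "char(4)",
    ("customer", "fax_number"): "varchar",
}
live = {
    ("customer", "customer_id"): "integer",
    ("customer", "postcode"): "varchar",
    ("customer", "email_verified"): "boolean",
}

report = compare_schema(documented, live)
for category, columns in report.items():
    print(category, columns)
```

Each category in the report points to a different conversation: undocumented columns need an owner, missing columns need the dictionary corrected, and type mismatches need a decision about which side is right.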

Phase 2 — Governance framework and stewardship model

A governance framework that assigns ownership, defines authoritative sources, establishes conflict resolution protocols, and creates the stewardship function that maintains data quality over time. Proportionate to the size of the organisation — we do not impose enterprise data governance bureaucracy on a 100-person institution.

Phase 3 — Lineage and observability

Data lineage documentation for critical reporting flows. Observability tooling that surfaces data quality degradation before it reaches a regulator or a board report. Designed to be maintained by the people who work with the data — not by the consultants who built it.

  • Framework designed to scale with the organisation, not to require ongoing consultancy to maintain
  • Regulatory reporting readiness as a first-class outcome, not an afterthought
  • Compatible with AI and automation initiatives that depend on trusted data as an input

Customer reference data

The authoritative customer record: identity, contact, product holdings, relationship history. Alignment across core banking, CRM, and any third-party systems that hold customer data. Deduplication logic that addresses the cause of duplicate records, not just the symptoms.
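One way to picture field-level alignment is a survivorship rule: each field takes its value from the system designated as authoritative, and any disagreement between systems is recorded for a steward to resolve. The sketch below is illustrative only. The `AUTHORITATIVE_SOURCE` policy, the system names, and the `build_golden_record` function are assumptions made for the example, not a prescribed design.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical policy: which system is authoritative for each field.
AUTHORITATIVE_SOURCE: Dict[str, str] = {
    "legal_name": "core_banking",
    "address": "core_banking",
    "email": "crm",
}


def build_golden_record(
    records_by_system: Dict[str, Dict[str, Optional[str]]],
    policy: Dict[str, str] = AUTHORITATIVE_SOURCE,
) -> Tuple[Dict[str, Optional[str]], List[str]]:
    """Pick each field from its authoritative system and list conflicting fields."""
    golden: Dict[str, Optional[str]] = {}
    conflicts: List[str] = []
    for field, owner in policy.items():
        # The authoritative system wins, even when other systems disagree.
        golden[field] = records_by_system.get(owner, {}).get(field)
        # Collect the distinct non-empty values held across all systems.
        distinct = {
            record.get(field)
            for record in records_by_system.values()
            if record.get(field) is not None
        }
        if len(distinct) > 1:
            conflicts.append(field)
    return golden, conflicts


# Hypothetical example data, for illustration only.
records = {
    "core_banking": {
        "legal_name": "Jane Citizen",
        "address": "1 High St",
        "email": "jane@old.example",
    },
    "crm": {
        "legal_name": "Jane Citizen",
        "address": "1 High Street",
        "email": "jane@new.example",
    },
}

golden, conflicts = build_golden_record(records)
print(golden)
print(conflicts)
```

The conflict list matters as much as the golden record. A field that keeps appearing there indicates a process that is writing to the wrong system, which is the cause rather than the symptom.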

Instrument and product reference data

For institutions with investment, treasury, or lending product exposure. Classification hierarchies, identifier mappings (ISIN, CUSIP, internal codes), pricing source governance, and the stewardship model that keeps instrument data current without manual intervention.
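As a small example of an automated stewardship check on identifier mappings, the sketch below validates the check digit of an ISIN. It follows the standard approach of converting letters to numbers (A=10 through Z=35) and applying the Luhn algorithm to the resulting digit string. The function name `is_valid_isin` is our own, and the check confirms only that an identifier is well-formed, not that it is assigned to a real instrument.

```python
def is_valid_isin(isin: str) -> bool:
    """Return True if the string is a well-formed ISIN with a correct check digit."""
    isin = isin.strip().upper()
    # Structure: 12 ASCII alphanumerics, 2-letter prefix, numeric check digit.
    if len(isin) != 12 or not isin.isascii() or not isin.isalnum():
        return False
    if not isin[:2].isalpha() or not isin[-1].isdigit():
        return False

    # Letters become two-digit numbers (A=10 ... Z=35); digits stay as they are.
    digits = "".join(str(int(ch, 36)) for ch in isin)

    # Luhn: from the right, double every second digit and sum the digit values.
    total = 0
    for position, ch in enumerate(reversed(digits)):
        value = int(ch)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


print(is_valid_isin("US0378331005"))  # well-formed, check digit matches
print(is_valid_isin("US0378331006"))  # last digit altered
```

A check like this catches transcription errors at the point of entry, before a malformed identifier propagates into downstream mappings.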

Data lineage for regulatory reporting

End-to-end lineage for the data elements that appear in APRA, ASIC, or ATO submissions. Traceable from source system to submitted figure. Auditable without a half-day investigation.
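Lineage of this kind can be thought of as a graph in which each data element points to the elements it is derived from. The sketch below walks such a graph from a reported figure back to its source-system fields. The element names, the `LINEAGE` mapping, and the `trace_to_sources` function are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

# Hypothetical lineage: each element maps to the elements it is derived from.
LINEAGE: Dict[str, List[str]] = {
    "regulatory_return.total_deposits": ["warehouse.deposit_balances"],
    "warehouse.deposit_balances": [
        "core_banking.account_balance",
        "core_banking.account_type",
    ],
}


def trace_to_sources(element: str, lineage: Dict[str, List[str]]) -> List[str]:
    """Return the source elements (those with no recorded parents) behind an element."""
    sources: List[str] = []
    seen: Set[str] = set()
    stack = [element]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles and repeated paths
        seen.add(current)
        parents = lineage.get(current, [])
        if parents:
            stack.extend(parents)
        else:
            sources.append(current)
    return sorted(sources)


print(trace_to_sources("regulatory_return.total_deposits", LINEAGE))
```

With lineage captured in a structure like this, answering where a submitted figure came from becomes a lookup rather than an investigation.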

Data quality observability

Automated monitoring of data quality against defined rules — completeness, consistency, timeliness, and accuracy thresholds. Alerts that reach the right owner before the problem reaches a report.
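A minimal sketch of rule-based monitoring follows, covering two of the dimensions above: completeness and timeliness. Each rule names an owner, so a breach produces an alert addressed to someone accountable. The rule format, the thresholds, the owner names, and the sample records are assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List

Record = Dict[str, Any]


def completeness(records: List[Record], field: str) -> float:
    """Share of records where the field is populated."""
    if not records:
        return 1.0
    populated = sum(1 for r in records if r.get(field) not in (None, ""))
    return populated / len(records)


def freshness(records: List[Record], as_of: datetime, max_age: timedelta) -> float:
    """Share of records updated within max_age of the as_of date."""
    if not records:
        return 1.0
    fresh = sum(1 for r in records if as_of - r["updated_at"] <= max_age)
    return fresh / len(records)


def evaluate(records: List[Record], rules: List[Dict[str, Any]], as_of: datetime) -> List[Dict[str, Any]]:
    """Return one alert per rule whose observed value falls below its threshold."""
    alerts = []
    for rule in rules:
        if rule["check"] == "completeness":
            observed = completeness(records, rule["field"])
        elif rule["check"] == "timeliness":
            observed = freshness(records, as_of, timedelta(days=rule["max_age_days"]))
        else:
            raise ValueError(f"unknown check: {rule['check']}")
        if observed < rule["threshold"]:
            alerts.append({"owner": rule["owner"], "check": rule["check"], "observed": observed})
    return alerts


# Hypothetical rules and records, for illustration only.
rules = [
    {"check": "completeness", "field": "email", "threshold": 0.9, "owner": "member_services"},
    {"check": "timeliness", "max_age_days": 365, "threshold": 0.7, "owner": "operations"},
]
records = [
    {"email": "a@example.com", "updated_at": datetime(2024, 6, 1)},
    {"email": "", "updated_at": datetime(2024, 5, 15)},
    {"email": "c@example.com", "updated_at": datetime(2022, 1, 1)},
    {"email": "d@example.com", "updated_at": datetime(2024, 6, 20)},
]

alerts = evaluate(records, rules, as_of=datetime(2024, 6, 30))
for alert in alerts:
    print(alert)
```

In this example only the completeness rule is breached, so only one owner is alerted. Consistency and accuracy checks would follow the same pattern, with rules that compare values across systems or against a trusted reference.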

Data stewardship operating model

The roles, responsibilities, and cadence that sustain data quality after the engagement ends. Designed to fit into an existing operating model rather than require a dedicated data governance function that the organisation cannot afford.

Zero

Manual adjustments to regulatory submissions — when reporting data traces cleanly to authoritative sources

Single

Authoritative customer record across all systems — the single customer view that has been on the roadmap for years, delivered structurally rather than through periodic data cleaning

Audit-ready

Data lineage that supports regulatory examination, internal audit, and board-level questions without disproportionate internal effort

The longer return

Data governance does not pay back in the first quarter. The return is in the AI and automation initiatives that become possible when the input data is trusted, in the regulatory examinations that produce findings about your competitors rather than you, and in the operational cost that stops accumulating in manual data reconciliation and rework.

The organisations that invest in this work are not doing it because it is exciting. They are doing it because they have calculated what it is costing them not to.

Start with a conversation

Tell us which data domains are causing the most pain. We will respond within one business day.
