SIGRID
Trade Identity · 26 Apr 2026 · 6 min read

The gap between ISDA CDM and trade matching

ISDA's Common Domain Model is a schema. Trade matching is an operation on data. The first does not give you the second.

Every discovery call I've had about post-trade recon in OTC rates includes the same question, sometimes inside the first ten minutes: "Won't CDM solve this?"

The short answer is no. The longer answer is more interesting, and worth working through if you're an ops head considering a project that depends on that assumption.

ISDA's Common Domain Model is a data model. It defines, in machine-readable form, the structure of an OTC trade: what fields exist, what types they have, how they relate. It first dropped publicly in 2018 under ISDA's stewardship, with the CDM working group doing most of the early heavy lifting. Since 2022 the project has lived at FINOS, the Linux Foundation's fintech open source foundation. The working group meets regularly, with public meeting notes on the FINOS site. The model lives at github.com/finos/common-domain-model.

The CDM is a serious piece of work. The model captures, with care, the structures that anyone who's tried to write a generic OTC trade representation has wrestled with: legs and their roles; product taxonomy; lifecycle events (resets, fixings, knock-outs); collateral linkages; clearing relationships. The Java and Python distributions ship a runtime that lets you instantiate, validate, and serialise CDM objects. The DSL (Rosetta) is its own ecosystem of tooling, with a transpiler that emits Java, Python, and TypeScript from a model definition.

What it doesn't ship is a canonical-form derivation.

That distinction is the whole gap.

What "canonical-form derivation" means

If you give two parties the same trade and ask them each to produce a CDM-shaped representation, you'll get two structures that are economically identical but syntactically different. One side might write the legs in [fixed, floating] order; the other writes [floating, fixed]. One uses ACT/360; the other writes Actual/360. One serialises decimals as 0.035; the other as 0.0350. One puts the calendar as a single string; the other as a list. One declares the floating spread as 0.0; the other omits it because it defaults to zero.
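The divergence is easy to reproduce. Here's a minimal Python sketch; the field names and trade shapes are illustrative stand-ins, not real CDM attributes:

```python
import hashlib
import json

# Two simplified representations of the same swap. Hypothetical field
# names -- not CDM -- but the same categories of mismatch described above.
trade_a = {
    "notional": "10000000",
    "legs": [
        {"type": "fixed", "rate": "0.035", "dayCount": "ACT/360"},
        {"type": "floating", "index": "SOFR", "spread": "0.0"},
    ],
}
trade_b = {
    "notional": "10000000",
    "legs": [
        {"type": "floating", "index": "SOFR"},  # spread omitted: defaults to zero
        {"type": "fixed", "rate": "0.0350", "dayCount": "Actual/360"},
    ],
}

def naive_hash(trade):
    """Hash the trade as-serialised: syntactic noise leaks straight into the digest."""
    return hashlib.sha256(json.dumps(trade, sort_keys=True).encode()).hexdigest()

# Same economics, different bytes, different hashes.
assert naive_hash(trade_a) != naive_hash(trade_b)
```

Sorting the JSON keys, as above, is not enough: leg order, decimal rendering, synonyms, and omitted defaults all survive the sort.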

All of these are CDM-valid. The model doesn't say "always write legs in this order, always pick this day-count spelling, always serialise decimals to 10 places, always include defaults explicitly". It can't, really, without prescribing a serialisation that the working group has consciously left out of scope.

Ask CDM-the-model whether two CDM objects are equivalent and the answer depends entirely on the form they arrived in. There's no built-in compare(a, b) that ignores syntactic non-uniformity. The Rosetta DSL gives you equality by structural recursion, but structural recursion treats legs[0,1] and legs[1,0] as different objects. So you don't get economic equivalence; you get structural equivalence in whatever serialisation happened to come in.

For affirmation that's catastrophic. Two ops teams running CDM-shaped representations of the same trade will hash to different bytes. They'll show up at the matching engine as a break.

The fix is to define a canonical form: a deterministic, normative reduction of any CDM object (or any trade representation, full stop) to a single byte sequence. Once you have that, the rest of the architecture is downhill. SHA-256 of the canonical bytes is the trade's identity. Two parties who agree on the trade get the same identity by construction. Two parties who disagree get different identities and a field-level diff falls out for free.
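What a derivation function looks like, sketched in Python. Everything normative here is loudly assumed: the alias map, the defaults table, the leg ordering, and the field names are illustrative placeholders, not a published standard.

```python
import hashlib
import json
from decimal import Decimal

# Assumed normalisation tables -- illustrative, not from any standard.
DAY_COUNT_ALIASES = {"Actual/360": "ACT/360", "Act/360": "ACT/360"}
LEG_DEFAULTS = {"floating": {"spread": "0"}}  # defaults always made explicit

def canonicalise(trade):
    """Deterministically reduce a simplified trade dict to one normative shape."""
    legs = []
    for leg in trade["legs"]:
        leg = {**LEG_DEFAULTS.get(leg["type"], {}), **leg}  # fill omitted defaults
        if "dayCount" in leg:
            leg["dayCount"] = DAY_COUNT_ALIASES.get(leg["dayCount"], leg["dayCount"])
        for field in ("rate", "spread"):
            if field in leg:  # fixed 10-decimal rendering: 0.035 and 0.0350 converge
                leg[field] = f"{Decimal(leg[field]):.10f}"
        legs.append(leg)
    legs.sort(key=lambda l: l["type"])  # simplified normative leg ordering
    return {"legs": legs, "notional": f"{Decimal(trade['notional']):.10f}"}

def identity(trade):
    """SHA-256 over a deterministic JSON encoding of the canonical form."""
    blob = json.dumps(canonicalise(trade), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode()).hexdigest()

# Same economics written down two ways:
a = {"notional": "10000000", "legs": [
    {"type": "fixed", "rate": "0.035", "dayCount": "ACT/360"},
    {"type": "floating", "index": "SOFR", "spread": "0.0"}]}
b = {"notional": "10000000.00", "legs": [
    {"type": "floating", "index": "SOFR"},
    {"type": "fixed", "rate": "0.0350", "dayCount": "Actual/360"}]}

assert identity(a) == identity(b)  # same identity by construction
```

When the identities differ, diffing the two canonical structures field by field gives you the break report; the diff is against normalised values, so every difference it shows is economic, not syntactic.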

The canonical form is a separate piece of work from the data model. CDM doesn't include it. ISDA hasn't published one. FINOS hasn't either. Several private projects have, in narrow scopes (the FpML canonical form for an earlier era; bilateral matching schemes inside individual sell-side firms; the Sigrid spec, declaratively).

I know this will annoy the CDM working group, who've put a decade of careful thought into the schema, and I'm partly playing devil's advocate. But I can't help noticing that the largest gap between CDM-as-shipped and a working bilateral matching system is exactly the part the working group has consciously declined to standardise.

Why CDM stops where it stops

There's a reason the Rosetta DSL doesn't prescribe canonical bytes, and it's not negligence. The CDM is a description of contract economics. A canonical serialisation is a description of how to write those economics down for one specific operational purpose. Different purposes need different canonicalisations. A canonical form for hashing wants determinism above readability. A canonical form for human review wants readability above determinism. A canonical form for compression wants minimum byte count.

Pick one and you've privileged a use case. The working group's discipline on this is a feature, not a bug. They've built a model that supports any number of operational standards on top.

What they haven't done is build the operational standard the recon use case needs. That's a separate project. It looks like: a normative ordering of legs (currency precedence, then leg type, then index); a normative serialisation of decimals (fixed precision, no scientific notation); a normative resolution of synonyms (Actual/360 to ACT/360); a normative treatment of defaults (always made explicit); a single deterministic JSON encoding (RFC 8785 if you're lucky).
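The leg-ordering piece, for instance, reduces to a sort key. A sketch, with the caveat that both precedence tables here are placeholders; deciding the real ones is the governance question:

```python
# Hypothetical precedence tables -- the real ones need an owner and a process.
CURRENCY_PRECEDENCE = {"USD": 0, "EUR": 1, "GBP": 2, "JPY": 3}
LEG_TYPE_PRECEDENCE = {"fixed": 0, "floating": 1}

def leg_sort_key(leg):
    """Normative ordering: currency precedence, then leg type, then index name."""
    return (
        CURRENCY_PRECEDENCE.get(leg.get("currency", ""), 99),
        LEG_TYPE_PRECEDENCE.get(leg.get("type", ""), 99),
        leg.get("index", ""),  # breaks ties between two floating legs (basis swaps)
    )

legs = [
    {"type": "floating", "currency": "EUR", "index": "EURIBOR"},
    {"type": "fixed", "currency": "USD"},
]
legs.sort(key=leg_sort_key)  # USD fixed leg sorts first, whatever order came in
```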

About 200 pages of work, depending on coverage. It needs governance: who decides the precedence of currencies, who maintains the index alias map, who's the arbiter when a new index gets standardised.

In the absence of an industry-blessed canonical form, the recon use case is going to be solved bilaterally first. Two firms will agree (or implicitly converge) on a derivation function, and use it between themselves. That's how most of post-trade infrastructure has actually evolved: bilaterally, then by accretion.

The wrong way to read this

The wrong reading of the above is "CDM has failed".

It hasn't. The model is right. The gap is the layer above it.

The right reading is that anyone telling you "we use CDM" or "we'll use CDM" as a recon answer is conflating two different problems. Schema agreement is necessary for matching. It is not sufficient. The piece between schema and match, canonical bytes, is where the work is.

If you're a buy-side ops head being pitched a recon platform that says "based on ISDA CDM" without saying anything about how it derives canonical bytes, you don't yet have an answer to the question of whether two parties will agree on the same hash for the same trade. Ask them. The reply you want to hear is something like: "Currency-precedence-sorted legs, fixed-10-decimal-place notional, ACT/360-canonicalised day counts, RFC 8785 JSON, SHA-256 first 100 bits." If they look blank, you have your answer.
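For reference, "SHA-256 first 100 bits" is 25 hex characters of the digest, since each hex character carries 4 bits. The helper name below is mine, not anyone's API:

```python
import hashlib

def short_identity(canonical_bytes: bytes) -> str:
    """First 100 bits of SHA-256, rendered as hex: 25 characters x 4 bits each."""
    return hashlib.sha256(canonical_bytes).hexdigest()[:25]

digest = short_identity(b"canonical trade bytes")
# len(digest) == 25, i.e. 100 bits of the 256-bit digest
```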

CDM is the language. The canonical form is the grammar. You can speak the same language with different grammars and find that you don't, in fact, agree.

A small note on what's coming next

There is industry appetite for fixing this. The CDM Working Group has, in the past 18 months, started discussing canonicalisation in side meetings, not as a CDM deliverable, but as a separate convention. ISDA's recent papers on operational risk in post-trade have mentioned the gap. The OSTTRA roadmap, in the bits you can read in their public material, hints at moving toward field-level matching even if their existing platform's architecture doesn't easily support it.

Whoever builds it first writes the de facto standard. That's how FpML happened. That's how SWIFT MT happened. That's how almost every operational standard in capital markets infrastructure has happened.

It won't be CDM that does it. CDM is the schema. The thing on top is somebody else's project.
