Field Notes on Workflow & AI

The catalogs you can't turn off.

Why enterprises run five overlapping cataloging platforms, why everyone already knows it's wasteful - and why the thing that finally makes consolidation possible is not a technology, but the removal of an excuse.

A field note from Moschetti Consulting ~13 min read

There is a category of enterprise software with no agreed name, so let us give it one: catalogish. A catalogish system is an inventory of assets - data, APIs, services, models, configuration items, lineage; almost everything in the management plane, as opposed to the business transaction plane. Its defining economic trait: its value to the vendor grows as the inventory grows. Collibra, Alation, Atlan, Informatica, Purview, DataHub on one side; Apigee, MuleSoft Anypoint, SwaggerHub, Postman, Kong on another; ServiceNow's CMDB sitting quietly at the center, the purest specimen of all. The list is long, and it gets longer every year.

Here is the uncomfortable fact the rest of this note unpacks: a large enterprise typically runs three or more of these. Each was a reasonable purchase. Neither the vendors nor the technology evaluation team did anything wrong. And the aggregate is a structure almost nobody would design on purpose and almost nobody can dismantle.

The reason it can't be dismantled was, until recently, economic. That reason just evaporated. What is left standing in its place is more interesting, and harder to talk about, than cost.

The thing each one does not do

Start with why the purchases were sensible, because the argument is worthless if it pretends they weren't. Each catalogish product does one or two things genuinely well; that is why it was purchased. The problem was never what these products do. The problem is what each one does not do - and the fact that the set of things your enterprise actually needs is always larger than the set any single product covers.

Call it eight needs. It might be five, it might be twenty. At the start, these needs are covered only marginally - by whatever existed before, plus the inevitable understory of spreadsheets and scripts that quietly do the real work. Call that the legacy. Hold the number eight in your head, because the entire trap is a story about a gap between eight and two that never closes - it only moves.

§01 The reasonable beginning.

Platform P1 is purchased. It satisfies two of the eight needs, and satisfies them well. The other six don't go away. They get handled three ways, all of them locally rational:

Wedged in. A need gets "temporarily" stuffed into P1's custom fields, description fields, free-text notes - the places a system offers when it can't model what you actually need but won't stop you from typing it.
Micro-systemed. A spreadsheet, a script, a small database somebody's team owns. It works. It is invisible to governance. It is load-bearing.
Deferred. Declared lower-value and pushed to a someday that has no date.

Nothing here is a mistake. This is a sound posture. But notice what already exists at the end of the first purchase: latent problems with a future maturity date, sitting in custom fields and spreadsheets that no architecture diagram will ever show. They are this practice's gap fillers in a new costume - the undocumented human-and-spreadsheet tissue that silently closes the distance between what the system does and what the business needs.

§02 The platform-of-truth problem is born.

Platform P2 is purchased. It satisfies one of the deferred needs. Grant the best case: P2 covers a genuinely deferred need, so nothing has to be torn out of P1 to make room. No migration, no decommissioning. The friendliest possible second purchase. It still doesn't get you out of the woods:

A platform-of-truth problem now exists. P1 and P2 both hold things that overlap, and something has to be authoritative. Usually P1 becomes master and feeds P2. If you're unlucky, parts of the legacy feed both, and then P2 has to be reconciled back to P1 - a three-way reconciliation before you've finished the second purchase.
A cottage industry of feeds and reconciliations begins. Quietly. Nobody approved "build a reconciliation practice." It accreted.
P1 and P2 begin to compete for the five needs still uncovered. Each vendor's roadmap, each internal team's ambition, now points at the same open territory.

That third point matters more than it looks, because of what it becomes one purchase from now.

§03 The math turns, and so do the teams.

Platform P3 is purchased. Two more needs covered. Now the cost of the posture starts to overshadow its benefit, and it does so along two completely different axes - one mechanical, one human.

The mechanical axis: reconciliation is combinatorial. If you want the common data across the platforms to actually agree, the number of reconciliations between N platforms is not N. It is every pair:

reconciliations(N) = N · (N−1) / 2
N=2 → 1, N=3 → 3, N=4 → 6, N=5 → 10
The cost you can see is a fraction of the cost you depend on.

Let me be honest about what actually happens instead: it does not get done with precision. In practice it collapses to "legacy-to-P1, then P1-to-P2 and P1-to-P3, with caveats" - and the caveats, the ones about create / update / delete cadence and which system learns about a change first and how late, are undocumented. They live in the heads of the people who run the feeds. More gap fillers.
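To make the gap between the full mesh and the shortcut concrete, here is a minimal sketch in Python. The function names and the four-system list are illustrative assumptions, not anything a vendor ships; the only thing taken from the note is the pairwise formula and the "everything goes through P1" collapse.

```python
from itertools import combinations


def full_mesh(platforms):
    """Every unordered pair of platforms: N * (N - 1) / 2 reconciliations."""
    return [frozenset(pair) for pair in combinations(platforms, 2)]


def hub_and_spoke(hub, platforms):
    """Only the pairs that include the hub: N - 1 reconciliations."""
    return [frozenset((hub, p)) for p in platforms if p != hub]


def unreconciled(hub, platforms):
    """Pairs the hub-and-spoke shortcut never compares directly."""
    covered = set(hub_and_spoke(hub, platforms))
    return [sorted(pair) for pair in full_mesh(platforms) if pair not in covered]


# Hypothetical estate: the legacy plus three purchased platforms.
systems = ["legacy", "P1", "P2", "P3"]

print(len(full_mesh(systems)))            # 6 pairs for N=4
print(len(hub_and_spoke("P1", systems)))  # 3 feeds, all through P1
print(unreconciled("P1", systems))        # the pairs nobody checks
```

With P1 as the hub, three of the six pairs are never compared directly: legacy-to-P2, legacy-to-P3, and P2-to-P3. Those are exactly the places where the undocumented caveats accumulate.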

Meanwhile, some of the earlier micro-systems should now be retired. They are not. The attention is all on platforms, vendor spend, "technology efficiency" initiatives - the line items with logos. Nobody wants to open the spreadsheets, because the spreadsheets have no budget code and no owner who'd survive an audit of what they actually do. The hidden cost stays hidden because looking at it is itself unbudgeted.

And then the irony: new micro-systems are born to do the things P1, P2, and P3 can't do in combination - span their data domains, bridge their update cadences, handle the data-sensitivity rules no single platform was built for. The understory doesn't shrink as you add platforms. It grows to fill the seams between them.

The human axis - and this is the one that decides everything. In the previous chapter the platform teams competed for open needs. Now that the territory is carved up, the behavior reverses. The teams circle the wagons. Each one entrenches around the one or two things its platform does well, reinforces that moat, and stops caring about the rest. Not our scope. Let someone else figure out the cross-platform need - but do not encroach on ours.

Circling the wagons is not dysfunction by bad people. It is the most rational possible response to how these teams are funded, staffed, and measured. A platform team's existence is justified by its platform. Anything that spans platforms threatens the logic that keeps the team whole. So the moment consolidation is even whispered, it is heard - correctly - as an existential question by precisely the people whose cooperation consolidation would require.

This is the human twin of a fact everyone in the building already knows about the vendors.

Un-integratability is not a flaw the vendor failed to fix. It is the product functioning as designed - because a catalog that integrated away cleanly would be a catalog you could leave.

The export exists. It is even documented. It somehow never quite produces the artifact that would let you consolidate. No conspiracy is required and none is alleged - the incentive does all the work in broad daylight. Stickiness, just enough un-integratability, is simply how the category stays in business. Everyone knows it. The vendor's own product manager knows it. The procurement officer signing the renewal knows it. It is the open secret the whole market is organized around not saying out loud.

So you have two equilibria, stacked. The vendor won't integrate outward; the internal team won't integrate outward. Both are behaving rationally given how they are paid. That is what makes this a trap and not a scandal - and it is why, for thirty years, the technically buildable consolidation was never built.

§04 AI arrives, and amplifies everything non-linearly.

Platform P4 is introduced - the modern one, AI-enhanced, able to holistically scan for sensitive information. It satisfies Need #6. Every issue from the previous chapter is now amplified, and not linearly. P4 brings new APIs, so it brings yet another set of feeds. But the sharp new problem is a turf war the earlier chapters only rehearsed.

P4's capabilities don't stay in their lane. It has predictive and analytical reach that touches not just Need #6 but Needs #1 and #2 - the needs P1 owns. Team P1 immediately objects: P4 should not be doing this analysis; the AI capability should be added to their platform, because they have the data.

Team P4 replies that it has the same feed data everyone else has - at which point something gets exposed that nobody had looked at closely: P1 was never feeding everything. It couldn't, or didn't, and no one checked, because checking was nobody's job. Team P1 now argues the cost of remediating its own feeds exceeds the value of doing the AI locally. And when asked about the value of insight that spans Needs #1, #2, and #6 together - the whole point of fluid, AI-assisted analysis - Team P1 demands to see the use cases, the budgets, the business-approved impact statements first.

Sit with that demand, because it is the honest objection to everything this note argues, and it deserves a straight answer rather than a dismissal. Demanding pre-approved use cases and quantified impact before allowing cross-domain analysis is not obstruction by unreasonable people. It is, again, the rational posture of a team measured on its own platform's defined deliverables. But it is fundamentally incompatible with what AI-assisted insight is for - which is discovering the questions you didn't already know to ask, across data that was never meant to sit together. You cannot write the business case for an insight you haven't had yet. The demand for one is not a fair test; it is the wagons, circling, wearing the respectable uniform of fiscal discipline.

And this is the answer to the obvious counterargument against everything that follows. Someone will say: your five catalogs are five fiefdoms, and the consolidated system you propose has to survive the same organizational forces that produced the five - so you'll just rebuild the mess under one roof. Correct. That is exactly the risk, and it is why this was never a technology problem. The single system was always buildable. What wasn't available was the authority to make the wagons stand down - plus an excuse expensive enough to justify never trying. AI removes the excuse. It does not remove the wagons. Anyone who sells you consolidation as a purely technical or purely financial exercise is selling you the same naïveté that produced the original sprawl. Consolidation is an authority problem wearing a technology costume - which is the whole thesis of this practice, and the reason the cost argument below is necessary but not sufficient.

§05 The bolt-on that proves the point.

One more, because it's the most ordinary and therefore the most damning. It's determined that P3 needs historical lookback - the ability to ask what the data said as of some past date. The vendor product offers no such thing. So a project, call it P3A, is launched: nightly copies of P3's data, stamped with an asOf date and a version integer, bolted onto the information architecture. Predictably, this spawns:

A small team allocated to build and run it - more cost.
Another set of ETL scripts to maintain - more cost.
Another set of reconciliation scripts to maintain - more cost.
More security operations, more storage, more HA/DR, more data-entitlement administration - more cost.

None of this appears on P3's invoice. P3 still looks like a clean line item with a logo and a renewal date. The P3A apparatus - the people, the scripts, the storage, the risk - is the real cost of P3, and it is structurally invisible because it lives in the gap between what P3 does and what you needed P3 to do. The same gap as the first chapter. The same gap, every chapter, just more expensive each time it moves.
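For readers who have not built one, a P3A-style bolt-on looks roughly like the sketch below. This is an assumption-laden illustration in Python: the class names, the in-memory list, and the sample asset are all hypothetical, and a real P3A would sit on a database with ETL around it. Only the asOf date and the version integer come from the description above.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Snapshot:
    as_of: date   # the nightly copy's date stamp
    version: int  # increasing copy number
    rows: dict    # asset id -> attributes as copied from P3 that night


class SnapshotStore:
    """Hypothetical stand-in for the P3A store; not a vendor API."""

    def __init__(self):
        self._snapshots = []  # kept sorted by as_of

    def add_nightly_copy(self, as_of, rows):
        if self._snapshots and as_of <= self._snapshots[-1].as_of:
            raise ValueError("copies must arrive in date order")
        version = len(self._snapshots) + 1
        self._snapshots.append(Snapshot(as_of, version, dict(rows)))

    def lookup(self, asset_id, as_of):
        """What did P3 say about asset_id as of the given date?"""
        dates = [s.as_of for s in self._snapshots]
        i = bisect_right(dates, as_of)
        if i == 0:
            return None  # no copy exists that early
        return self._snapshots[i - 1].rows.get(asset_id)


store = SnapshotStore()
store.add_nightly_copy(date(2024, 1, 1), {"db-7": {"owner": "team-a"}})
store.add_nightly_copy(date(2024, 1, 2), {"db-7": {"owner": "team-b"}})

print(store.lookup("db-7", date(2024, 1, 1)))  # {'owner': 'team-a'}
print(store.lookup("db-7", date(2024, 1, 5)))  # {'owner': 'team-b'}
```

Note what the sketch cannot do: it knows what the copy said on a given night, not when a change actually took effect or when a wrong value was later corrected. That limitation is part of why the bolt-on keeps growing.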

§· What was actually hard, and what no longer is.

Step back and look at what all five chapters have in common. At every stage, someone could have stopped and consolidated, and didn't - and the reason was never that the unified system was impossible to build. The reason was always one of two things: either the cost of building all the integration, migration, and reconciliation scaffolding was prohibitive, or the people who'd have to build it were also the people paid to maintain the sprawl. The first reason - the scaffolding cost - is the one that just collapsed.

It is not easy or obvious, but it is nevertheless tractable to design a single information architecture that covers:

Core entities: Software and hardware, networks, components (the executable units: a database, a web server, an app), data shapes, and workflow.
Bi-temporality: Versioning and bi-temporal history built into the model, not bolted on as a P3A. The thing five separate platforms couldn't give you becomes a property of the architecture itself.
Narrative metadata: Robust narrative metadata linked directly to the data and functions - self-documenting, and consumable by LLMs as specification rather than decoration.
Scope fluidity: One representation that flexes across the three scopes that matter: in-memory (thousands to hundreds of thousands of things, ultra-fast lookup, no modify); database (millions, fast lookup, ACID modify); analytics (billions, scalable grouping, no modify).
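As one way to picture bi-temporality and narrative metadata living in the model itself, consider the sketch below. It is a minimal illustration in Python under stated assumptions: the Fact record, its field names, and the as_of function are hypothetical, and the append-only list stands in for whatever store the architecture actually uses. The idea it shows is standard: each fact carries both when it was true in the world and when the system learned it.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fact:
    entity_id: str
    value: str
    narrative: str            # description readable by people and by LLMs
    valid_from: date          # when the fact became true in the world
    valid_to: Optional[date]  # None means still true
    recorded_on: date         # when the system learned it


def as_of(facts, entity_id, valid_on, known_on):
    """Return what was believed on known_on about the state on valid_on."""
    candidates = [
        f for f in facts
        if f.entity_id == entity_id
        and f.recorded_on <= known_on
        and f.valid_from <= valid_on
        and (f.valid_to is None or valid_on < f.valid_to)
    ]
    # The most recently recorded matching fact wins; nothing is overwritten.
    return max(candidates, key=lambda f: f.recorded_on, default=None)


# Hypothetical history: ownership changed on Feb 1 but was recorded on Mar 1.
facts = [
    Fact("web-1", "team-a", "Initial owner of the web server.",
         date(2024, 1, 1), None, date(2024, 1, 1)),
    Fact("web-1", "team-a", "Owner until the handover (late correction).",
         date(2024, 1, 1), date(2024, 2, 1), date(2024, 3, 1)),
    Fact("web-1", "team-b", "Owner after the handover (recorded late).",
         date(2024, 2, 1), None, date(2024, 3, 1)),
]

before = as_of(facts, "web-1", date(2024, 2, 15), known_on=date(2024, 2, 20))
after = as_of(facts, "web-1", date(2024, 2, 15), known_on=date(2024, 3, 2))
print(before.value)  # team-a: what we believed at the time
print(after.value)   # team-b: what we know now
```

Both answers about February 15 are retained: what the system believed then, and what it knows after the late correction. A nightly-copy bolt-on can only give you the first.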

This was always buildable in principle. What changed is that AI makes the temporary, throwaway part - the transformation layers, the migration tooling, the reconciliation harnesses, the compatibility wrappers, the test scaffolding - nearly free to generate and free to discard when the migration is done. The scaffolding was the thing nobody wanted to build, because it was expensive and you throw it away at the end. The necessary but "boring" and tiresome-at-scale bits - the adapters, the format converters - were too exhausting and low-value to build rather than buy. And truly nobody saw the value in reengineering dozens or hundreds of screens to connect to a new substrate. AI does not care about scaffolding. It is very good - and getting even better - at creating screens with complex dialogs and interactions, and it can arguably produce a whole set of adapters, including a modular underlying design, better and faster than the vendors themselves. It will generate all of it.

Which inverts the calculation that has held for thirty years. Defaulting to buy stops making sense the moment you are buying three or more catalogish products that all have to be integrated anyway - because at three or more, you are no longer buying products. You are buying the products plus the N·(N−1)/2 reconciliation mesh, plus the micro-system understory that fills the seams, plus the P3A bolt-ons for everything the products don't do, plus the vendor management, the multiplied audit surface, the multiplied attack surface, the duplicated HA/DR, the headcount of all of it - and none of that "plus" is on any invoice you can point to.

A simple, honest self-test, since the threshold is the whole point. Count the catalogish platforms you run:

One: You do not need this. Keep it. Move on.
Two: You could likely benefit, but the case is not yet overwhelming.
Three: There is real, nameable opportunity.
Four+: The build-versus-buy math is, for practical purposes, settled in favor of build - not because the products are bad, but because the interstitial cost between them has quietly become the largest line item nobody is allowed to see.
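The threshold sits where it does because of how fast the pairwise mesh grows behind each answer. A small sketch in Python, where the function name and the mapping are my framing of the self-test rather than a formal scoring model:

```python
def self_test(platform_count):
    """Map a count of catalogish platforms to the verdict and the mesh size."""
    if platform_count < 1:
        raise ValueError("count the platforms you actually run")
    pairs = platform_count * (platform_count - 1) // 2
    if platform_count == 1:
        verdict = "You do not need this. Keep it. Move on."
    elif platform_count == 2:
        verdict = "You could likely benefit, but the case is not yet overwhelming."
    elif platform_count == 3:
        verdict = "There is real, nameable opportunity."
    else:
        verdict = "Build-versus-buy is, for practical purposes, settled in favor of build."
    return pairs, verdict


for n in range(1, 6):
    pairs, verdict = self_test(n)
    print(f"{n} platform(s), {pairs} pairwise reconciliation(s): {verdict}")
```

One platform has no mesh at all, two have a single pair, and by four there are six pairs to keep in agreement - before counting the legacy or any bolt-ons.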

— Moschetti Consulting

If you are at three or more and the interstitial cost has started to feel like a weather system rather than a line item, that is the conversation worth having.

inquiries@moschetticonsulting.com