What does "dead registry" mean in the context of a CMDB?

A "dead registry" is a CMDB that formally exists, or is maintained only on paper, but is not actually used because its data no longer matches reality. Typically, during incidents and audits, information is collected in chats and spreadsheets instead of being taken from the CMDB.

Why does a CMDB often become outdated even if people are "trying"?

Because manual updates almost always lose to the pace of change: patches, VM migrations, disk replacements and emergency fixes happen constantly. If CMDB updates are not built into a recurring process, they get postponed until the data loses its usefulness.

Who should own data in the CMDB and why is an owner needed?

An owner is needed not to "control everything", but to resolve disputes and maintain data quality rules. In practice owners are assigned by domain: who is responsible for hardware and virtualization, who for security, and who for services and their criticality.

What does "CMDB as code" mean in simple terms?

CMDB as code is an approach where configuration data is collected automatically from fact sources (for example, from servers), normalized, and stored with change history. The key idea is that the CMDB is updated regularly and auditable, not only "when someone has time."

What fields should be mandatory at the start?

A minimal starting set typically includes a stable identifier (serial number, UUID or asset tag), hostname, OS and version, role/purpose, environment (prod/test/dev), basic resources (CPU/RAM/disks), IP/MAC and the owner or support team. This is enough to speed up incident triage and detect configuration drift.

How do you identify a server correctly so you don't lose its change history?

First define what you consider to be the "same machine" and which identifier is primary. It's more reliable to rely on attributes that rarely change; hostname and IP should be stored as supplementary attributes because they are often renamed or reassigned.

Why separate "facts" and "intentions" in the CMDB?

You should separate "what is now" from "what should be." Facts show the current state on the host, while intentions describe the desired state and policies. Mixing them makes it unclear whether a discrepancy is drift or a deliberate change.

Why can't you just write all Ansible facts directly into the CMDB?

Because raw facts are often noisy: inconsistent version formats, unstable interface names, transient fields and fluctuating values. Normalization converts them to a canonical form so servers can be compared and changes tracked without false positives.

How do you prevent garbage from getting into the CMDB?

Validate before writing: required fields must be present, types and value ranges must be checked, and errors should not pass silently. Also keep a change report so you can see what was added, changed or removed since the last run.

Which changes should be considered important and require attention?

Review is required only for changes that affect risk: OS and key package versions, open ports and firewall rules, disks and RAID, domain membership and admin rights, and critical service settings. Noisy fields like uptime or interface order should, by agreement, be excluded from the set of significant changes.

CMDB as Code: Ansible Facts, Normalization and Change Control

Why a CMDB stops being useful

A "dead registry" is a CMDB that exists on paper but is not trusted. Records are outdated, some fields are filled in only to tick a box, and decisions are still made in chats, spreadsheets and from people's memory.

This usually starts with manual entry. While there are only dozens of servers, someone still has time to update records after changes. But when regular patches, disk replacements, VM migrations and urgent night-time fixes begin, the registry falls behind. People are busy with incidents and projects, and updating the CMDB doesn't feel urgent until an audit or an outage happens.

The situation worsens when there is no data owner. Admins think it's the operations team's job, operations think it's security or architects, and service owners see the CMDB only in reports.

The symptoms are usually obvious:

a single server has different names in monitoring, inventory and CMDB
fields like "OS", "owner", "environment" are missing in places or use inconsistent wording
change timestamps don't match actual maintenance windows
new servers appear on the network but not in the CMDB
nobody can say which record is "correct"

This affects work downstream. During an incident it's harder to find the owner and dependencies; recovery takes longer. On audits you have to collect evidence manually. Procurement and planning get distorted: it looks like there is enough hardware while some nodes have long been replaced or are overloaded.

CMDB as code is valuable not by itself, but as a way to make updates regular, verifiable and tied to facts rather than manual promises.

CMDB as code in plain language

CMDB as code is an approach where information about servers and their configuration lives not in a manually edited database but as data that can be automatically collected, validated by rules and stored with change history. If infrastructure changes daily, the CMDB must be updated daily without manual entry.

A typical spreadsheet breaks in two places. First, it's unclear who edited a record and when, and whether it can be trusted. Second, updates keep being postponed: no time, no owner, no clear rules. The CMDB becomes an archive of guesses.

In the "as code" approach the machines themselves become the source of truth (for example, via Ansible facts). You add a layer of rules: which fields matter, how they are named, how to compare values and what counts as an error. Then changes are recorded as strictly as code changes: you can see what changed, where and when.

Success is measured not by how many fields are filled, but by three things:

accuracy: data matches reality on the servers
coverage: the needed groups of systems are included, not just a few critical ones
update speed: changes reach the CMDB quickly, not once a quarter

To start without overload, choose a minimal set: a unique host identifier, OS and version, roles or purpose, main resources (CPU/RAM/disks), network identifiers (IP/MAC), and an owner or support team. That is enough to detect configuration drift and avoid a "dead registry" in the first month.

Source of truth and boundaries of responsibility

To prevent CMDB as code from becoming a contested directory, decide in advance where the "truth" lives for each type of data. The same server record may have different official sources: hardware and serial numbers are best taken from procurement or the vendor, network parameters from IPAM, credentials and policies from security systems, and live indicators like OS and installed packages from Ansible facts.

It's useful to separate two layers: facts and intentions. Facts answer "what is now": kernel version, list of disks, running services. Intentions answer "what should be": target OS image, allowed packages, mandatory agents. When these layers are mixed, the CMDB begins to lie. An admin updated a package but the "should be" value wasn't changed, and it's unclear if this is drift or a new rule.
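The separation of facts and intentions can be sketched as a comparison of two dictionaries. This is a minimal illustration, not a prescribed implementation: the function name find_drift and the field names (os_version, kernel, edr_agent) and their values are hypothetical.

```python
from typing import Any, Dict, Tuple


def find_drift(observed: Dict[str, Any], desired: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {field: (observed, desired)} for every intention the host does not meet.

    Only fields listed in `desired` are checked: a fact with no stated
    intention (here, the kernel) is just a fact, not drift.
    """
    return {
        field: (observed.get(field), want)
        for field, want in desired.items()
        if observed.get(field) != want
    }


# Hypothetical example data: "what is now" vs "what should be".
observed = {"os_version": "22.04", "kernel": "5.15.0-91", "edr_agent": False}
desired = {"os_version": "22.04", "edr_agent": True}

drift = find_drift(observed, desired)
print(drift)  # {'edr_agent': (False, True)}
```

Keeping the two layers in separate structures means a deliberate change is made by editing `desired`, so anything still reported afterwards is drift by definition.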

Boundaries of responsibility are easier to enforce with data owners. A typical scheme works like this:

infrastructure: hosts, virtualization, network, basic OS parameters
security: encryption, account policies, EDR agents, criticality
applications: service, environment (dev/test/prod), dependencies, maintenance windows

Then agree on a minimal mandatory set of fields. For example: unique identifier, owner, environment, role, criticality, data source and a timestamp of last confirmation. Make everything else optional. Otherwise people will fill fields "for the sake of it" and you'll get a "dead registry" again.

Data model: what exactly you store in the CMDB

To avoid a "dead registry", agree first on the data model, not the tool: which objects you describe, which fields are mandatory, and what you consider to be "the same machine." In CMDB as code this is especially important because automatic collection will quickly populate "everything everywhere" without delivering useful outcomes.

A pragmatic start is to store only what is needed for operations and change control, and fetch the rest on demand. Usually the following set of entities is sufficient:

server (physical or virtual) as the main inventory object
OS and basic parameters (distribution, version, kernel)
network (interfaces, IP, VLAN/subnet, DNS name)
disks and filesystems (size, type, mount points)
services and applications (what is running and who owns it)

Next define identification. Hostname is convenient but unreliable: it gets renamed, cloned, or mistyped. Prefer a "hard" key (serial number, UUID, asset tag) plus one or two "soft" keys (hostname, primary IP). It is useful to store all variants in the CMDB but choose one as the primary.

Connections give data meaning: a server is linked to a site and rack, to a role (e.g., DB, web), to an owner (team or department) and to criticality. Then when, for example, a network interface changes, you can see which service and owner are affected.

Fix representation rules. They may seem small but they save you from messy reports:

sizes only in GiB (or only in GB), avoid mixing
CPU: separate "cores" and "threads", not a single combined field
names: a single hostname and environment format (prod/test/dev)
dictionaries: roles, sites, owners drawn from controlled lists
empty values: prefer null over the text "unknown"
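Two of these rules (one unit for sizes, null instead of "unknown") are easy to enforce in code. The sketch below assumes sizes arrive as strings such as "512 GiB" or "1 TB"; the helper name to_gib and the list of placeholder strings are assumptions made for illustration.

```python
import re
from typing import Optional

# Multipliers in bytes. Decimal (GB) and binary (GiB) units are kept apart on purpose.
_UNITS = {
    "b": 1,
    "kb": 1000, "mb": 1000**2, "gb": 1000**3, "tb": 1000**4,
    "kib": 1024, "mib": 1024**2, "gib": 1024**3, "tib": 1024**4,
}
_SIZE_RE = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)\s*([A-Za-z]+)\s*$")
# Placeholder strings treated as "no value" (an assumed list, extend as needed).
_EMPTY = {"", "unknown", "n/a", "none", "-"}


def to_gib(text: Optional[str]) -> Optional[float]:
    """Convert a size string to GiB; return None for empty or placeholder values."""
    if text is None or text.strip().lower() in _EMPTY:
        return None
    match = _SIZE_RE.match(text)
    if not match:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = float(match.group(1)), match.group(2).lower()
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in: {text!r}")
    return round(value * _UNITS[unit] / 1024**3, 2)


print(to_gib("512 GiB"))    # 512.0
print(to_gib("16384 MiB"))  # 16.0
print(to_gib("1 TB"))       # 931.32
print(to_gib("unknown"))    # None
```

Unparseable input raises an error instead of being stored as text, which keeps the field numeric in every record.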

Step-by-step flow: from collecting Ansible facts to writing to the CMDB

A CMDB won't turn into a living source of truth without a repeatable data flow: from inventory to recording changes. The flow must operate the same for production and testing, and across different sites and branches.

1) From inventory to observed reality

Start with a clear list of which hosts are "in scope" and how they are grouped. Usually two or three dimensions are enough: environment (prod/dev), site (DC1/DC2), role (web/db/backup). If a server isn't in the Ansible inventory, it won't end up in the CMDB. That's fine if everyone knows the rule.

Enable fact collection and add what standard facts don't cover: monitoring agent presence, versions of critical packages, RAID parameters, virtualization tags. Implement these as small, separate checks so they're easier to maintain.

2) Transforming facts into a CMDB record

After a playbook runs, export results to a unified format (commonly JSON) so they can be processed uniformly regardless of source. Then run validation: required fields (hostname, serial or UUID, OS), types (number, string, list), allowed ranges (e.g., disk size cannot be negative). Validation errors must not silently pass, otherwise the CMDB will contain garbage.
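As a rough sketch, the three kinds of checks named above (required fields, types, ranges) might look like this in plain Python. The record layout and the field name disks_gib are assumptions; a real deployment might express the same rules in a schema language instead.

```python
from typing import Any, Dict, List


def validate_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of validation errors; an empty list means the record may be written."""
    errors: List[str] = []

    # Required fields: hostname, serial or UUID, OS.
    if not record.get("hostname"):
        errors.append("hostname is required")
    if not (record.get("serial") or record.get("uuid")):
        errors.append("serial or uuid is required")
    if not record.get("os"):
        errors.append("os is required")

    # Types and ranges: disk sizes must be a list of non-negative numbers.
    disks = record.get("disks_gib", [])
    if not isinstance(disks, list):
        errors.append("disks_gib must be a list")
    else:
        for i, size in enumerate(disks):
            if isinstance(size, bool) or not isinstance(size, (int, float)):
                errors.append(f"disks_gib[{i}] must be a number")
            elif size < 0:
                errors.append(f"disks_gib[{i}] cannot be negative")
    return errors


good = {"hostname": "db01", "serial": "ABC123", "os": "ubuntu", "disks_gib": [100, 500]}
bad = {"hostname": "", "os": "ubuntu", "disks_gib": [100, -5, "big"]}

print(validate_record(good))  # []
for problem in validate_record(bad):
    print(problem)
```

Returning every error at once, rather than stopping at the first, makes the rejection report useful to whoever has to fix the source data.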

Record the result as an artifact and produce a change report. A working minimum looks like this:

select target host groups and tag environment and site
collect standard facts and add 2–5 custom checks
convert the output to a single JSON template
validate schema and required fields, reject invalid records
store a version of the data and show a diff: what was added, changed, removed
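The last step, a diff between two stored versions, can be sketched as follows. It assumes each snapshot is a dictionary keyed by a stable host identifier; the identifiers and fields in the example are made up.

```python
from typing import Any, Dict

Snapshot = Dict[str, Dict[str, Any]]  # stable host ID -> normalized record


def diff_snapshots(previous: Snapshot, current: Snapshot) -> Dict[str, Any]:
    """Report which hosts were added or removed and which fields changed."""
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    changed: Dict[str, Dict[str, Any]] = {}
    for host_id in previous.keys() & current.keys():
        old, new = previous[host_id], current[host_id]
        fields = {
            name: (old.get(name), new.get(name))
            for name in old.keys() | new.keys()
            if old.get(name) != new.get(name)
        }
        if fields:
            changed[host_id] = fields
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots from two consecutive runs.
previous = {
    "SN-001": {"hostname": "web01", "ram_gib": 16},
    "SN-002": {"hostname": "db01", "ram_gib": 64},
}
current = {
    "SN-001": {"hostname": "web01", "ram_gib": 32},
    "SN-003": {"hostname": "app01", "ram_gib": 8},
}

report = diff_snapshots(previous, current)
print(report)
```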

If a server in a branch was replaced with a new one, the CMDB should show the change of identifier and characteristics as a controlled modification, not as a "lost" host.

Normalization: turning facts into understandable records

Ansible facts are useful, but in raw form they are noisy telemetry: different names, varied version formats, lots of transient fields. If you write them directly into the CMDB without processing, you'll quickly end up with a view where servers can't be compared and changes can't be tracked.

Normalization starts with a simple rule: for each CMDB entity there must be one clear canonical form. Map all variants from sources to that canonical form.

Unified names and dictionaries

Comparisons usually fail because of inconsistent naming. An interface may appear as ens192, eth0 or only as a MAC without a name; a disk as /dev/sda, naa.* or a RAID serial. You need a mapping layer: decide what is the "primary name" and how to find the same device in the next collection.

Do the same for OS: agree on a canonical vocabulary. Don't store "Ubuntu 20", "20.04" and "Ubuntu20.04" as different values. Normalize to fields like os_family=ubuntu, os_version=20.04, kernel=5.15.0-....
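A minimal sketch of such a normalizer, assuming the OS arrives as a free-text string. The alias table that expands a bare major release ("20") to the full version is an assumption you would maintain yourself, and default_family is a hypothetical parameter for inputs that carry only a version number.

```python
import re
from typing import Dict, Optional

# Optional distribution name followed by an optional major[.minor] version.
_OS_RE = re.compile(r"^\s*([A-Za-z]+)?\s*(\d+(?:\.\d+)?)?")

# Assumed alias table: a bare major release mapped to the canonical version.
_RELEASE_ALIASES = {("ubuntu", "20"): "20.04", ("ubuntu", "22"): "22.04"}


def normalize_os(raw: str, default_family: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Map spelling variants of an OS string to canonical os_family / os_version."""
    match = _OS_RE.match(raw)
    family = (match.group(1) or default_family or "").lower() or None
    version = match.group(2)
    version = _RELEASE_ALIASES.get((family, version), version)
    return {"os_family": family, "os_version": version}


for variant in ("Ubuntu 20", "Ubuntu20.04", "Ubuntu 20.04.6 LTS"):
    print(normalize_os(variant))
# Each line prints {'os_family': 'ubuntu', 'os_version': '20.04'}

print(normalize_os("20.04", default_family="ubuntu"))
```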

To avoid duplicates, define an order of identifiers. A good practice: primary key is inventory ID or serial, then UUID, then FQDN, then IP (the least reliable). If two servers claim the same identifier, that's a data-quality incident, not an accounting peculiarity.
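The precedence order and the duplicate check can be sketched like this. The field names (inventory_id, serial, uuid, fqdn, ip) are assumed names for the identifiers listed above, and lowercasing values before comparison is an extra assumption so that "SN-001" and "sn-001" are treated as the same key.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

# Most reliable identifier first, IP last.
ID_PRECEDENCE = ("inventory_id", "serial", "uuid", "fqdn", "ip")


def primary_key(record: Dict[str, Any]) -> Tuple[str, str]:
    """Return (identifier type, normalized value) for the best identifier present."""
    for field in ID_PRECEDENCE:
        value = record.get(field)
        if value:
            return field, str(value).strip().lower()
    raise ValueError("record has no usable identifier")


def find_conflicts(records: List[Dict[str, Any]]) -> Dict[Tuple[str, str], List[Any]]:
    """Return every primary key claimed by more than one record."""
    seen: Dict[Tuple[str, str], List[Any]] = defaultdict(list)
    for record in records:
        seen[primary_key(record)].append(record.get("hostname"))
    return {key: hosts for key, hosts in seen.items() if len(hosts) > 1}


# Hypothetical records: a cloned VM reuses the serial of the original.
records = [
    {"hostname": "web01", "serial": "SN-001", "ip": "10.0.0.5"},
    {"hostname": "web01-clone", "serial": "sn-001"},
    {"hostname": "legacy", "ip": "10.0.0.9"},
]

conflicts = find_conflicts(records)
print(conflicts)  # {('serial', 'sn-001'): ['web01', 'web01-clone']}
```

A non-empty result is the data-quality incident described above and should block the write rather than silently merge the two hosts.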

Tags and raw facts separately

Normalized records are useful when they have consistent tags: environment (prod/test), criticality, owner, support level, site. These fields should come from inventory and processes, not guessed from hostname.

Store raw Ansible facts separately from normalized data. Raw facts are useful for audits and troubleshooting, but use the normalized layer for drift checks and reports. That way changes are comparable and the CMDB doesn't become a dump of fields nobody understands.

Recording changes and drift control

Collecting Ansible facts alone does not prevent drift. Value appears when changes are recorded, explained and regularly reviewed. In CMDB as code this is achieved by separation: Git stores rules and transformations, while the CMDB stores the state and history.

Versions, diffs and noise

In Git you typically keep playbooks, roles, normalization schemas, field mappings and exception lists. In the CMDB keep the current normalized state of the object and a change log (snapshots or events) so you can answer: what changed and when?

To make diffs useful, agree in advance which fields count as changes and which are noise. Noise usually includes temporary metrics, counters, uptime, interface ordering and fields that fluctuate on reboot.

Review is not needed for every small change, only for items that affect risk. For example:

OS and kernel versions, packages and patches
open ports and firewall rules
disk composition and RAID, filesystem parameters
domain membership, local admins, sudoers
critical service parameters (DB, web, backup)
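One way to apply the noise and review agreements is to classify the changed fields of a diff before anyone looks at it. The field names in both sets are illustrative assumptions; the actual lists are whatever your teams agree on.

```python
from typing import Any, Dict, Tuple

# Assumed field lists, agreed in advance and kept under version control.
NOISE_FIELDS = {"uptime_seconds", "interface_order"}
REVIEW_FIELDS = {"os_version", "kernel", "open_ports", "disks", "local_admins"}

Change = Tuple[Any, Any]  # (old value, new value)


def classify_changes(changed: Dict[str, Change]) -> Dict[str, Dict[str, Change]]:
    """Drop noise and split the rest into 'review' (risk-relevant) and 'info'."""
    result: Dict[str, Dict[str, Change]] = {"review": {}, "info": {}}
    for field, pair in changed.items():
        if field in NOISE_FIELDS:
            continue  # excluded from the diff by agreement
        bucket = "review" if field in REVIEW_FIELDS else "info"
        result[bucket][field] = pair
    return result


changed = {
    "uptime_seconds": (100, 5),
    "open_ports": ([22], [22, 8080]),
    "motd": ("a", "b"),
}
classified = classify_changes(changed)
print(classified)
# {'review': {'open_ports': ([22], [22, 8080])}, 'info': {'motd': ('a', 'b')}}
```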

History and regular reconciliations

A good change record answers four questions: who, what, when and why. "Who" is the executor (CI/CD, engineer, contractor), "what" is the diff of normalized fields, "when" is the time of the change or of the collection run that detected it, and "why" is a ticket or short reason.

Operational practice: run fact collection at night and produce a short daytime report of discrepancies. If a bank server suddenly has a new open port, you see it immediately. You can require approval or rollback instead of discovering the problem a week later.

How not to end up with a "dead registry": practical rules

A "dead registry" appears when a CMDB is populated manually, rarely updated and expected to be perfect from day one. CMDB as code has a different goal: store the minimum that helps make decisions and update it automatically.

Start small but useful

Don't try to document the entire server estate on day one. Pick the 20% of fields that deliver 80% of value: stable identifier, environment (prod/test), role, owner, OS and version, CPU/RAM, critical services. Add other fields only when there's a concrete operational or business need.

Then enforce a rule: if a field is not used in any report, check or decision, it is a candidate for removal.

Updates must be regular and automatic

Freshness won't stick if updates run "when someone remembers." Set schedules and triggers: after deploys, after patches, on reboot or nightly. It's better to run frequently in small batches than seldom in large ones.

Helpful simple rules:

polling and recording basic facts is mandatory before putting a new host into production
any configuration change is considered complete only after the CMDB update
if a host hasn't been updated within an allowed window, mark it as "suspicious" and exclude it from critical decisions
different data types have different freshness expectations: hardware less often, packages and open ports more often
pre-agree exceptions: how to handle "non-pollable" nodes (isolation, manual owner, temporary status)
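The freshness rules can be checked mechanically. The sketch below assumes each host record stores a last-confirmed timestamp per data category; the category names and the allowed windows (30 days, 1 day) are example values, not recommendations.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

# Assumed freshness windows: hardware changes rarely, packages and ports often.
MAX_AGE = {
    "hardware": timedelta(days=30),
    "packages": timedelta(days=1),
    "open_ports": timedelta(days=1),
}


def stale_categories(last_seen: Dict[str, datetime], now: datetime) -> List[str]:
    """Return the categories that were never confirmed or are older than allowed."""
    return sorted(
        category
        for category, limit in MAX_AGE.items()
        if category not in last_seen or now - last_seen[category] > limit
    )


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "hardware": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),  # 9 days old: fine
    "packages": datetime(2024, 1, 8, 12, 0, tzinfo=timezone.utc),  # 2 days old: stale
}

stale = stale_categories(last_seen, now)
print(stale)  # ['open_ports', 'packages']
suspicious = bool(stale)  # a host with any stale category is marked "suspicious"
```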

For new servers you can enforce a gate: without a successful fact collection, the host does not get production status. This disciplines the process better than reminders.

Common mistakes and traps with CMDB as code

The most frequent reason a registry "dies" is simple: everything gets dumped into it. Ansible facts provide hundreds of fields, but if nobody uses them for decisions (inventory, licensing, capacity, security) they become junk. After a couple of months the data ceases to pass sanity checks and the CMDB is ignored.

The second trap is the lack of a stable identifier. A host is renamed, its IP changes or the VM is recreated, and its history "gets lost." For servers and workstations rely on attributes that rarely change: serial number, asset tag, UUID, inventory number. Keep hostname and IP as auxiliary attributes, not the primary key.

Processes break when manual edits and auto-collection are mixed without a clear source-of-truth priority. For example, an engineer corrects a model by hand and the nightly fact collection overwrites the correction. That creates endless disputes about "whose truth" matters. Predefine which fields are auto-only, manual-only, or mixed with a fixed precedence.
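A per-field policy table is one way to make that precedence explicit. In this sketch the policy names ("auto", "manual", "manual_first"), the field assignments and the example values are all assumptions; manually entered values are kept in a separate dictionary so that collection never overwrites them.

```python
from typing import Any, Dict

# Assumed policy per field; fields not listed default to "auto".
FIELD_POLICY = {
    "os_version": "auto",     # only auto-collection may set it
    "owner": "manual",        # only people may set it
    "model": "manual_first",  # a manual correction wins, otherwise use the collected value
}


def resolve(field: str, collected: Any, manual: Any) -> Any:
    """Pick the value for one field according to its policy."""
    policy = FIELD_POLICY.get(field, "auto")
    if policy == "auto":
        return collected
    if policy == "manual":
        return manual
    return manual if manual is not None else collected


def build_record(collected: Dict[str, Any], manual: Dict[str, Any]) -> Dict[str, Any]:
    """Combine collected facts and manual entries into one CMDB record."""
    fields = collected.keys() | manual.keys()
    return {f: resolve(f, collected.get(f), manual.get(f)) for f in sorted(fields)}


collected = {"os_version": "22.04", "model": "model-a", "owner": "guessed-from-hostname"}
manual = {"os_version": "20.04", "model": "model-b", "owner": "team-db"}

record = build_record(collected, manual)
print(record)  # {'model': 'model-b', 'os_version': '22.04', 'owner': 'team-db'}
```

Because the nightly run only ever replaces the `collected` input, the engineer's correction to the model survives every subsequent collection.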

Another subtle problem is identical field names with different meanings across teams. "Environment" can mean prod/test to one team, site to another, and criticality to a third. Without a vocabulary, normalization generates noise and reports conflict.

Don't ignore access control. If "anyone can edit," the CMDB quickly becomes a battleground. A minimal set of rules typically includes:

restrict who can edit dictionaries and normalizers
separate rights for manual edits and auto-updates
log who changed what and why
protect critical fields (identifiers, links, owner)
assign data owners per domain (OS, hardware, network)

In large estates these mistakes surface quickly. It's better to set rules before the first bulk import.

Short checklist before production rollout

Before turning CMDB as code on in production, do a quick readiness check. You should already know what is an error, how to distinguish one server from another and who decides when data conflicts occur.

Check that the basics of data quality are in place:

required fields are described and validated (e.g., hostname, environment, owner, criticality). Empty values return clear errors rather than being silently written
stable identification keys and duplicate rules exist: which is primary (serial, UUID, asset tag), how to handle conflicts, how to "merge" records after renaming
data discovered automatically (Ansible facts) is separated from data entered by people (purpose, owner). Manual fields have a simple update path without changing code

Then ensure freshness is maintained without heroics:

a schedule for runs is defined, not "when remembered." There is a report: what changed, what failed, which hosts didn't respond
drift thresholds and reactions are in place: which changes are acceptable and which require a ticket or approval

Assign a process owner and a single point of contact for changes. For example: when operations add a new class of servers (e.g., S200 for racks), who updates the fact-to-model mapping and who approves new mandatory fields.

If these points are covered, a production rollout will look like a controlled process, not a registry-for-the-sake-of-registry.

Example scenario: keeping data current in practice

A company with two sites (primary and backup) runs about 60 servers. One team manages virtualization, another handles application services, and some legacy machines were maintained manually. The previous CMDB aged quickly: a record was created at purchase and then only updated "when someone remembered."

They decided to implement CMDB as code and began with simple rules. In the Ansible inventory hosts were grouped by site and role (db, app, infra), and each host had basic tags: service owner, criticality (high/medium/low) and environment (prod/test/dev). Importantly, owner and criticality were set by people and went through approval, while hardware and OS facts were collected automatically.

At first they collected only what helps day-to-day work and affects risk:

OS and version, kernel, architecture
CPU, RAM, disk sizes and free space at key mount points
network addresses and primary interface
serial number or VM identifier, model (if available)
a set of "control" packages and the monitoring agent status

They then normalized names: OS names to a single catalog (e.g., "Ubuntu 22.04 LTS" without variants), units to one format (GiB) and network interfaces to a primary_ip rule. They agreed on what counts as a "significant change": an OS change, a resource increase or decrease, the addition of a new disk, a primary IP change, monitoring agent deactivation, and package changes, but only for packages on an allowlist (e.g., ssh, nginx, postgres).

After a month the effect was visible. Incidents closed faster: the on-call engineer could immediately see what had changed since the last successful run and who owned the service. Approvals became simpler: a request to "temporarily add a disk" was recorded automatically and not lost. Audits were easier: the team could show which systems were in production and who was responsible. Planning improved: upgrades and migrations relied on actual parameters rather than a year-old registry entry.

Next steps: pilot, process enforcement and support

Start small but formally. A "dead registry" usually appears not because of tools but because nobody owns the data or the changes. The first step is to agree what is truth and who confirms disputed fields.

Lock in a minimal data model: 15–30 fields, without trying to describe everything at once. Usually the following are enough: node identifier, environment (prod/dev), role, OS, kernel version, CPU/RAM/disks, network interfaces, key installed packages and an owner tag.

Set up scheduled Ansible fact collection and a unified export format. It's important that this becomes a single "data contract" for all teams: identical field names, units and rules for empty values.

Before writing, add three safety layers: validation, normalization and change reporting. Validation catches invalid values (e.g., negative disk size or unexpected architecture). Normalization converts data to a single form (GiB vs GB, OS name variants, different serial formats). The change report shows what exactly changed since the previous run and whether it requires action.

To avoid scope explosion, run a pilot on a group of 20–50 similar hosts where feedback is easy to collect. Expand coverage in phases: infrastructure services first, then business systems.

Make sure the process is institutionalized:

there is a data model owner and a collection process owner
collection runs on schedule and is monitored
changes are not lost: there is a log and a clear escalation path
the model is reviewed monthly and cleaned of unnecessary fields

If you need help with architecture, integration and infrastructure for this process, you can involve GSE.kz as a systems integrator: from selecting and supplying servers to implementation and 24/7 support within a single accountability boundary.