What actually happens when a certificate expires?

An expired certificate breaks trust in TLS: clients stop connecting, integrations start failing, and some users see warnings or connection refusals. Availability suffers, and rushed “fixes” often introduce insecure temporary workarounds.

Which systems usually fail first because of expiration?

External websites and portals, public APIs, load balancers and proxies, VPN/remote access, and mail gateways using TLS usually fail first. Internally, mTLS between services, Kubernetes components, message brokers and authentication components also commonly break when certificates expire.

What must be stored in a certificate inventory for it to be useful?

At minimum: where it’s installed, what it’s used for, expiration date, CN/SAN, certificate chain, environment (prod/test), service criticality, owner and escalation contact, and where the private key is stored. Without owner and installation point the inventory turns into a checkbox list that won’t help in an incident.

Where to start inventorying to find "forgotten" certificates?

Start with external domains and perimeter nodes, then check server stores on Windows/Linux, JVM keystores, network devices (WAF/ADC/proxies) and Kubernetes Secrets/Ingress. Consolidate the data into one format and deduplicate by fingerprint (or by issuer plus serial number, since serial numbers are unique only within one CA) so you don’t count the same certificate multiple times.

How to assign a certificate owner so they don’t “disappear”?

Assign the owner as a role or team responsible for the specific service, not a single person. When ownership changes, update the registry and notifications—otherwise emails go to empty inboxes and the issue surfaces only on the day of downtime.

Which certificate policies are essential to start with?

Don’t start with everything at once. Set basic enforceable rules: allowed algorithms and key lengths, standard validity periods, SAN requirements, a ban on self-signed certs in production, and a defined exception process with an expiry date. The key is that every rule can be checked against the registry and reports, so compliance is auditable.

Why doesn’t auto-renewal always prevent downtime?

Auto-renewal must cover not only issuance but also delivery to the endpoint, activation of the new certificate, and verification that the service actually uses it. If the CA issued a new cert but the web server or load balancer didn’t reload it, the process looks complete on paper while users still see errors.

How to avoid issues with the chain or the new certificate not being picked up?

Use a two-step control: verify policy compliance at issuance and then verify actual application on the service after installation. Separately validate the chain (intermediates), trusted stores and required component restarts; otherwise a technically correct certificate can still cause surprises.

What certificate reporting is actually useful for security and audit?

Useful reports for security and audit are: expirations (what expires soon), policy violations (weak algorithms, short keys, outdated protocols), orphaned certificates without owners, and an audit trail of who requested, approved, installed or revoked a certificate. If these reports require manual aggregation, your registry and management process aren’t unified yet.

How to choose the pilot scope and who can help with implementation (for example, GSE.kz)?

Run the pilot in a clearly bounded scope where the pain is visible: a limited set of certificates of one type with known owners. If you need help not just setting up the platform but integrating it, deploying it to infrastructure and ensuring stable rotation in production, it’s practical to work with a systems integrator such as GSE.kz to reach predictable results.

Certificate Lifecycle Management: Venafi, Keyfactor

Why expired certificates become a risk

A certificate often feels like a “technical detail” until it expires. After that, some services behave as if they were switched off: clients fail to connect, integrations break, users see errors, and teams scramble to locate the problematic certificate.

Expiration hits two areas at once: availability and security. For the business it’s downtime and missed deadlines. For security teams it’s a sign of weak control that can easily turn into a real incident. A common scenario: a temporary fix is applied “urgently” without proper checks or approvals.

The first components to fail are usually those tied to TLS and mutual authentication: public websites and client portals, APIs between systems and microservices, VPN and remote access, mail gateways (TLS for SMTP), and internal services like AD FS, proxies and load balancers.

Manual tracking works when you have dozens of certificates. But when you have hundreds or thousands (TLS, internal PKI, containers, clouds), a spreadsheet quickly becomes a lottery: some certificates aren’t recorded, owners change, contractors leave, notifications go to old addresses. Managing the certificate lifecycle is therefore not a matter of convenience but of risk reduction.

It’s time to get organized if any of these sound familiar:

renewals happen at the last minute under fire;
you can’t answer “how many certs do we have and where”;
it’s unclear who owns a specific certificate and who approves replacements;
teams issue certificates in different ways without consistent rules;
security asks for reports you spend weeks compiling manually.

A typical incident: an integration between two systems fails at night because the certificate expired on an intermediate proxy. The service is formally "up," but all requests are rejected. Logs show many errors, but the root cause is a single expired cert. Finding it without an up-to-date registry and reminders is difficult.

Where certificates live in a company and who is responsible

Certificate problems almost always start the same way: they’re spread across teams and systems, and many have no clear owner. So begin not with selecting a tool but with a map: where certificates actually live and who is responsible for renewing them.

External TLS certificates typically sit at the edge: websites, public APIs, load balancers, WAFs, CDNs and ingress gateways. Formally they’re owned by the team responsible for service availability (web/DevOps), but in practice renewal often depends on one person who once “enabled HTTPS.”

Internal certificates are even more common: mTLS for service-to-service, Kubernetes clusters, message brokers, databases, internal API gateways. Responsibility blurs between the platform team and application teams, and during migrations or domain changes certificates easily get "lost."

A separate category is user and device certificates: VPN, Wi‑Fi 802.1X, MDM, EDR, and client certs for portal access. IT admins and support usually handle these, but policy and expiry control typically belong with security.

There are also signing certificates: code signing for builds and updates, document signing in internal processes and with partners. Owners here are development, PKI admins and business-process owners.

Owners and contacts usually "disappear" when:

a certificate was issued by a contractor or former employee;
a service moved to a new server or the cloud and the registry wasn’t updated;
a certificate is baked into an application or container image;
the same certificate is used by multiple systems;
there are no rules on who approves issuance, who stores the key and who is responsible for renewal.

A practical example: a company has a public API and DevOps renews the gateway certificate. The same domain is used by a mobile app and a partner integration, and renewal notices go to a shared mailbox. Everyone assumes “it’s not my problem,” and the risk of outage becomes real.

The certificate lifecycle in simple terms

A certificate is like a passport for a website, server or application: it verifies identity and makes encrypted connections possible. It always has an expiry date. If that date is missed, availability drops: portals won’t open, integrations break and services stop.

Think of certificate management as a repeating process:

inventory: which certificates exist, where they’re installed and when they expire;
ownership: who is responsible and under what rules;
issuance: CA request, approvals, issuance;
installation and rotation: replacement on servers, load balancers, containers, applications;
revocation and archiving: if compromised or no longer needed.

The key element is the owner. Without one you get a gray area: IT assumes security owns it, security assumes it’s the admins’ responsibility, and the business learns about issues through downtime. The owner is not necessarily a single person; often it’s a role or team (e.g., the service operations team). Policies, in turn, define validity periods and renewal deadlines, templates, allowed algorithms and key lengths.

Next you need controls that show the situation to different audiences. IT needs to see what expires in 7/30/90 days and where. Security needs to know who issued a certificate, under which policies, and where exceptions were made. Auditors need the history of changes, rotations and revocations with reasons.

If your company runs dozens of internal portals and servers, spreadsheets almost always diverge from reality. A registry, owners and automatic renewal remove the main risk: “it suddenly expired.”

Inventory: how to build a complete certificate registry

Most problems start with one simple fact: nobody knows exactly how many certificates the company has or where they live. The first step is to create a registry that covers both external and internal services.

Start by scanning for TLS certificates visible externally and inside the network: domains, load balancers, web servers, API gateways, mail and VPN nodes. "Forgotten" certificates often turn up in test environments, in branch offices, or on old IP addresses still used by applications.

Then collect data from places where certificates are stored inside systems: Windows stores and IIS, Linux PEM files and web server directories, Java keystores (JKS, PKCS#12) and JVM apps, Kubernetes (Secrets, Ingress, service mesh), and devices like ADCs, WAFs, proxies and network gear.

To make the registry complete, import data from your enterprise CA and, if relevant, from public CAs, using issuance logs for your domains (for example, public Certificate Transparency logs). This helps spot certificates issued outside the approved process.

The final step is normalization. Convert data into a single format: CN/SAN, expiry, chain, algorithm, installation point, owner, environment (prod/test), service criticality. Take a baseline snapshot and deduplicate by fingerprint (or by issuer plus serial number, since serial numbers are unique only within one CA).
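As an illustration of the deduplication step, here is a minimal Python sketch. It assumes each scan finding arrives as the raw DER bytes of the certificate plus a name and a location; the `CertRecord` schema and the function names are hypothetical, not part of any specific platform.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class CertRecord:
    """One normalized registry row (hypothetical schema for illustration)."""
    fingerprint: str      # hex SHA-256 of the DER-encoded certificate
    common_name: str
    locations: set = field(default_factory=set)  # every place the cert was found


def sha256_fingerprint(der_bytes: bytes) -> str:
    """Return the hex SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(der_bytes).hexdigest()


def deduplicate(findings):
    """Merge raw scan findings into one record per unique certificate.

    `findings` is an iterable of (der_bytes, common_name, location) tuples.
    The same certificate found in several places becomes a single record
    that lists all of its installation points.
    """
    registry = {}
    for der_bytes, common_name, location in findings:
        fp = sha256_fingerprint(der_bytes)
        record = registry.setdefault(fp, CertRecord(fp, common_name))
        record.locations.add(location)
    return registry
```

Keeping all installation points on the merged record matters as much as the deduplication itself: one certificate shared by a load balancer and two web servers is one row to renew but three places to deploy.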

Policies and accountability: avoid gray areas

Most incidents start from uncertainty: who owns the cert, what rules apply, what’s normal and what’s an exception. You need simple, verifiable policies.

A minimal rule set typically includes: validity periods (e.g., 90–180 days for external and 180–365 for internal, if preferred), allowed algorithms and key lengths, SAN requirements (which DNS names are allowed and who approves them), and templates for common cases (web service, mTLS, mail, VPN). State what is forbidden (e.g., self-signed in production, deprecated algorithms) and what is temporarily allowed.
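A rule set like this is easiest to enforce when it can be evaluated mechanically. The sketch below is one possible shape in Python. The upper validity limits come from the example ranges above; the 2048-bit RSA minimum and all names are assumptions made for illustration.

```python
# Upper validity limits taken from the example ranges in the text.
MAX_VALIDITY_DAYS = {"external": 180, "internal": 365}

# Assumed minimum RSA key length; set this from your own crypto policy.
MIN_RSA_BITS = 2048


def policy_violations(exposure, environment, validity_days,
                      key_algorithm, key_bits, self_signed):
    """Return a list of human-readable policy violations (empty if compliant)."""
    violations = []
    limit = MAX_VALIDITY_DAYS[exposure]
    if validity_days > limit:
        violations.append(f"validity {validity_days}d exceeds {limit}d for {exposure}")
    if key_algorithm == "RSA" and key_bits < MIN_RSA_BITS:
        violations.append(f"RSA key of {key_bits} bits is below {MIN_RSA_BITS}")
    if self_signed and environment == "prod":
        violations.append("self-signed certificate in production")
    return violations
```

Returning a list of reasons rather than a single yes/no keeps the same function usable for both blocking issuance and feeding a violations report.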

To make rules work, split responsibilities:

service owner: responsible for SAN names and whether the certificate is needed;
PKI/platform admin: issues certificates and configures auto-renewal;
security: defines crypto requirements, validity and exceptions and enforces compliance;
audit/control: verifies the registry and reports match reality.

Approval workflows aren’t always needed. For standard templates, prefer automated issuance (e.g., for internal services in Kubernetes or standard web servers). Reserve approvals for atypical cases: new domains, public exposure, nonstandard lifetimes, or regulatory needs.

In the registry keep a few mandatory attributes: system/service and environment (prod/test), criticality, owner and escalation contact, key storage location (HSM/file/secret store), the reason for any exception and a review date.

Exceptions and temporary certificates are acceptable but must have an expiry and a plan to exit the exception. Otherwise “temporary” quickly becomes a permanent risk.
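The mandatory attributes and the exception rule can be checked together. The following Python sketch assumes a hypothetical `RegistryEntry` record; the field names simply mirror the attributes listed above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RegistryEntry:
    """Hypothetical registry row carrying the mandatory attributes."""
    service: str
    environment: str          # "prod" or "test"
    criticality: str
    owner: str
    escalation_contact: str
    key_storage: str          # e.g. "HSM", "file", "secret store"
    exception_reason: Optional[str] = None
    exception_review_date: Optional[date] = None


MANDATORY = ("service", "environment", "criticality",
             "owner", "escalation_contact", "key_storage")


def registry_problems(entry: RegistryEntry, today: date):
    """List missing mandatory attributes and unmanaged exceptions."""
    problems = [f"missing {name}" for name in MANDATORY if not getattr(entry, name)]
    if entry.exception_reason:
        if entry.exception_review_date is None:
            problems.append("exception has no review date")
        elif entry.exception_review_date < today:
            problems.append("exception review is overdue")
    return problems
```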

Step-by-step: how to automate issuance and renewal

Automation starts with understanding where you issue certificates and how they get to the systems that use them. The goal is a predictable process: issuance, installation, renewal, rotation and verification.

A typical sequence for platforms like Venafi, Keyfactor and similar:

Define issuance sources: enterprise CA for internal services and, if needed, public CAs for external sites and APIs. Decide which teams are authorized to request certificates.
Configure templates: validity, key parameters and naming rules. It helps when names indicate owner and purpose (service, environment, domain).
Connect deployment to target systems via agents or integrations: web servers, load balancers, Kubernetes, mail gateways. The platform must not only issue a certificate but also deliver it where it’s used.
Enable auto-renewal on thresholds (e.g., 30/14/7 days) and define maintenance windows when updates can be applied safely.
Set up notifications and emergency procedures: who to contact, steps for failed installations, and how to issue a temporary certificate quickly.
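How thresholds and maintenance windows interact is a policy decision, but it helps to write it down explicitly. The Python sketch below shows one possible interpretation of the 30/14/7-day thresholds; the action names and the escalation logic are assumptions, not the behavior of any particular platform.

```python
from datetime import date

RENEW_AT_DAYS = 30      # start trying to renew
WARN_AT_DAYS = 14       # notify the owner if no window was available
ESCALATE_AT_DAYS = 7    # renew regardless of the maintenance window


def renewal_action(not_after: date, today: date, in_maintenance_window: bool) -> str:
    """Decide what to do with a certificate today."""
    days_left = (not_after - today).days
    if days_left > RENEW_AT_DAYS:
        return "wait"
    if days_left <= ESCALATE_AT_DAYS:
        # Too close to expiry to wait for a safe window.
        return "renew-now-and-escalate"
    if in_maintenance_window:
        return "renew"
    if days_left <= WARN_AT_DAYS:
        return "notify-owner"
    return "wait-for-window"
```

For example, a certificate with 24 days left is renewed only inside a maintenance window, while one with 4 days left is renewed immediately and escalated.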

Separately, verify key rotation and rollback plans. If TLS fails after an update, there must be a clear procedure to revert to the previous version and document the incident. These operational details distinguish “auto-issuance” from a truly safe process.

Integrations: CAs, HSMs, infrastructure and processes

For certificate management to work in practice, the platform must be integrated into your infrastructure. Otherwise you’ll get a nice registry while renewals remain manual and keys are stored inconsistently.

Start with sources of authority and owners. Integrating with AD/LDAP helps populate owners by service, team or OU and grant access via groups. This reduces the chance of a certificate existing without a responsible owner.

What to connect first

Three zones matter most: issuance, key storage and deployment.

A CA (internal or external) is needed for automated issuance, renewal and revocation according to policy. HSMs are required if there are strict key storage requirements, especially for critical systems. Deployment points (load balancers, web servers, API gateways) are important so certificates are not only issued but rolled out without manual copying.

For Kubernetes and service mesh, agree in advance where certificates live: in Secrets, external stores, a sidecar or the ingress controller. Equally important is who performs rotation: the platform, a cluster operator or the mesh itself. Without this clarity, a certificate may be updated in one place but never applied in another.

How not to break processes

Automation depends on processes. A good practice is to connect certificate events to ITSM/Service Desk: issuance requests, approvals, expiration warnings, and incidents on rotation failures.

A working flow example: an external portal certificate auto-renews 30 days before expiry, the new cert is deployed to the load balancer and web servers, and a Service Desk ticket is created with the result and owner confirmation. If installation fails, an incident is opened and the old certificate remains active until fixed so the service doesn’t go down.
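The flow above can be sketched as a small orchestration function in Python. All five callables are hypothetical hooks that the caller supplies (deployment tooling, a verification check, and Service Desk integration); the sketch only shows the control flow, in particular that a failure leaves the old certificate in place.

```python
def rotate_certificate(deploy, verify, rollback, open_ticket, open_incident):
    """Deploy a renewed certificate and keep the old one if anything fails.

    deploy()           installs the new certificate on the target nodes
    verify()           returns True if the service now uses the new certificate
    rollback()         restores the previous certificate
    open_ticket(msg)   records a successful rotation for owner confirmation
    open_incident(msg) records a failed rotation
    """
    try:
        deploy()
        if not verify():
            raise RuntimeError("endpoint does not serve the new certificate")
    except Exception as error:
        # Keep the service up on the old certificate and make the failure visible.
        rollback()
        open_incident(str(error))
        return False
    open_ticket("rotation completed, awaiting owner confirmation")
    return True
```

This only works while the old certificate is still valid, which is why the renewal has to start well before expiry.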

Reporting for security: what to show without drowning in data

For security teams, certificates matter as managed risk. Reports should answer clear questions: what will fail soon, where crypto policies are violated, and who owns each certificate.

Minimal report set for security and audit

A few regular slices usually suffice:

expirations: what expires in 7/30/90 days with system and criticality;
policy violations: weak algorithms, short keys, outdated protocols, incorrect SANs or purposes;
owners: unknown, unconfirmed or departed owners and orphaned certificates;
audit trail: who requested, approved, installed, revoked, when and why;
inventory coverage: how many found, where found and what remains uncontrolled.
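Two of these slices, expirations and orphaned certificates, reduce to simple queries once the registry is unified. The Python sketch below assumes each registry row is a dictionary with hypothetical `name`, `not_after` and `owner` keys.

```python
from datetime import date

WINDOWS_DAYS = (7, 30, 90)


def expiry_report(certs, today: date):
    """Group certificates by the tightest expiry window they fall into."""
    report = {"expired": []}
    report.update({f"{window}d": [] for window in WINDOWS_DAYS})
    for cert in certs:
        days_left = (cert["not_after"] - today).days
        if days_left < 0:
            report["expired"].append(cert["name"])
            continue
        for window in WINDOWS_DAYS:
            if days_left <= window:
                report[f"{window}d"].append(cert["name"])
                break  # count each certificate once, in its tightest window
    return report


def orphaned(certs, active_owners):
    """Return names of certificates whose owner is missing or no longer active."""
    return [c["name"] for c in certs if c.get("owner") not in active_owners]
```

In practice `active_owners` would come from the directory or team registry, so that a departed employee or a dissolved team shows up as an orphan automatically.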

How not to drown in data

The issue is often too many reports rather than too few. Agree on 5–10 regular metrics to monitor; tag certificates (system, prod/test, criticality, owner, business service); set thresholds and handle exceptions separately (e.g., keep test certs apart from production); and build reports top-down, from summary to detail.

Example: security asks for “everything expiring in 30 days” and “all RSA 1024.” If the platform holds a unified registry and audit trail, these reports are generated automatically without manual pulls from different CAs and spreadsheet reconciliation.

Example scenario: how implementation looks in a real company

A typical mid-size company with branches often has internal web services and mail in headquarters, VPN gateways and terminal farms in regions, plus a couple of public sites and client APIs. Certificates are issued by admins, network teams and contractors. Some live on load balancers, some in Windows and Linux, some in containers, and a few old domains are forgotten.

The sensible starting point is a single source of truth: a unified registry. In the first 2–4 weeks teams usually find critical points (public domains, VPNs, external integrations, payment and medical systems), record owners and expirations, and configure notifications. This alone significantly reduces the risk of sudden expiry.

Next run a pilot to avoid scope creep: one domain, one server type (e.g., Nginx or IIS) and one approval process. Often this looks like:

weeks 1–2: collect the registry and assign owners;
weeks 3–4: run pilot issuance and renewal for one template;
month 2: connect other web servers and VPNs and move manual certs into the managed flow;
then: expand to Kubernetes Ingress, load balancers and, if needed, integrate HSM and tighten policies.

Measure results: fewer manual operations and “urgent renewal” tickets, fewer expiry incidents, clear security reporting and faster issuance for new services without losing control.

Common mistakes and traps in certificate management

Most painful failures are not caused by issuance itself but by surrounding details.

One common case: a certificate was auto-renewed but the service still uses the old one. The new cert is present in the CA or store, but the web server didn’t reload it, the container wasn’t restarted, or the load balancer still serves the old chain. On paper the renewal succeeded, but users still see errors.

Worse is when a certificate has no owner. Notifications go to a shared mailbox, a departed employee, or an abandoned ticketing project. Renewal becomes “someone’s problem” only on the incident day.

Frequent root causes:

renewal without deployment (CA logs look good but the endpoint expiry didn’t change);
mixed public and internal CAs without clear rules (unclear source of truth);
lack of a test environment and rollback plan (updates applied directly in prod);
ignoring dependencies (chain, intermediates, trust stores on servers and endpoints);
relying only on reminders without verifying application.

A practical approach is two-step control: first confirm issuance (certificate exists and meets policy), then confirm application (the service actually uses the certificate and the chain is valid). In critical environments these become mandatory checklist items for change management.
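The second step, confirming application, can be approximated by comparing the fingerprint of the issued certificate with the one the endpoint actually serves. The Python sketch below uses only the standard `ssl` and `hashlib` modules; the function names are hypothetical, and the `fetch` parameter exists so the check can be exercised without a network connection.

```python
import hashlib
import ssl


def fetch_served_der(host: str, port: int = 443) -> bytes:
    """Fetch the leaf certificate the endpoint currently serves (network call)."""
    pem = ssl.get_server_certificate((host, port))
    return ssl.PEM_cert_to_DER_cert(pem)


def is_applied(issued_der: bytes, host: str, port: int = 443,
               fetch=fetch_served_der) -> bool:
    """Return True if the endpoint serves exactly the certificate that was issued."""
    served_der = fetch(host, port)
    issued_fp = hashlib.sha256(issued_der).hexdigest()
    served_fp = hashlib.sha256(served_der).hexdigest()
    return issued_fp == served_fp
```

This covers only the leaf certificate. Chain and trust-store validation is a separate check, for instance a verifying TLS handshake from a client that uses the same trust store as real consumers.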

Short checklist: readiness for automation and control

Automation pays off when basic order exists: you know where certificates live, who’s responsible, and what’s critical.

5 checks in 15 minutes

Answer yes or no:

Is there a single registry showing where a certificate is used (system/service), the owner and expiry, and the criticality?
Are reminder thresholds set (for example, 60/30/14 days) and are responsible people assigned who confirm actions rather than just receiving emails?
Is auto-renewal enabled for main certificate types (web, inter-service TLS, VPN, mail) with documented exceptions?
Is it verified that a certificate can be not only issued but also correctly installed on required nodes, in required stores, with chain updates and service restarts where needed?
Is there security reporting and an audit trail: who issued, renewed, revoked, changed policies; and are logs retained for 6–12 months?

If you answer “no” to at least two items, start with the minimum: registry, owners and reminder thresholds.

Minimal implementation plan by stages

To avoid project creep, define stages and responsibilities: cover critical services and common cases first, then expand. A practical order is: pilot on 1–2 systems → expand by certificate types → connect additional CAs/stores → standardize policies.

Next steps: how to pick a platform and start a pilot

Begin with requirements, not vendors. For certificate lifecycle management clarify: where certificates originate (internal CA, public CAs, devices), where they are used (web, VPN, mail, DB, IoT), and what reports are needed by security and auditors.

Document requirements in 10–15 lines: mandatory integrations (CAs, AD, CI/CD, clouds, HSM), roles (security, admins, service owners), and desired automation level (auto-issuance, auto-renewal, key rotation, approvals).

When comparing Venafi, Keyfactor or alternatives, focus on constraints: support for your CAs, quality of inventory, policy enforcement, role separation, and reporting. A good sign is a platform that shows a certificate owner and service impact, not just a list of files.

Run the pilot in a representative scope where the pain is visible: e.g., 20–50 TLS certificates for internal web services and load balancers. A pilot often fits into 4–6 weeks. Success criteria are simple: most renewals happen without manual steps, the registry is complete and in one place, owners receive notifications, and an audit report for security takes minutes to produce.

Implementations are usually done jointly by security and operations, sometimes with a systems integrator to help with integrations and process design. For example, GSE.kz (gse.kz), as a manufacturer and systems integrator, can join infrastructure and support work when it’s important not only to set up the platform but also to bring deployment and rotation to stable production operation.