What exactly does the "pull one cord" test check?

This test checks whether a server survives the loss of one power feed without rebooting or stopping services. The value of the test is that it shows the real behavior of the hardware, UPS, PDU and monitoring, not just how it should work on paper.

When is it best to run this test and where to start?

Run the test in a pre-agreed low-activity window with a clear rollback plan. Start with a single non-critical server or a single rack to avoid turning the check into a widespread incident.

Which cord is better to disconnect and where?

It's usually safer to disconnect the cable at the PDU side rather than the PSU to avoid disturbing the PSU module. Pick the line you are sure is labeled, and agree in advance who confirms you are pulling A or B.

How to quickly tell that lines A and B are truly independent?

At minimum, confirm the cables go to different PDUs; ideally those PDUs are fed by different protective devices or separate UPSs. If switching off one breaker or UPS kills both lines, it isn’t true A/B redundancy.

What does a "successful" test result look like?

A successful test means the server keeps running without reboot, and redundancy status changes predictably: one input lost, the other holds the load. Events should appear in the management controller and monitoring, and the system should return to normal after power is restored without new errors.

What does it mean if the UPS goes to battery when one cord is disconnected?

Often this means you lost input to the UPS or a common section, not just a single server cable. It usually indicates the lines are not actually separated and that you’ve tested a source failure rather than one server cord.

Where to look for events and which logs matter most?

Look at the server management controller events, OS events and monitoring records to get the full picture. Make sure timestamps match; otherwise it’s hard to tell what happened first: power loss, PSU fault or reboot.

What to do if the server didn’t crash but monitoring is silent?

If no alert arrived, this is an observability problem: a real single-line failure can go unnoticed. Enable alerts for input loss, redundancy degradation and PSU state changes, then retest to ensure notifications reach the on-call channel.

What to do if a reboot occurred or disks/network disappeared during the test?

Immediately restore power, verify both PSUs are present and healthy, and confirm the server didn’t enter a reboot loop or degraded state. Then investigate: common causes include overload of the remaining line, a failed secondary PSU, a shared PDU/breaker or a weak link in the chain causing voltage sag.

How often should the test be repeated and how to lock in the result?

Common practice: quarterly for critical racks, semiannually for others, and always after any work on UPS, PDU, electrical systems or server replacement. Record results in a short protocol so the test is repeatable and changes in the rack won’t silently break redundancy.

Server Power Reliability: the "Pull One Cord" Test

What the "pull one cord" test shows

The "pull one cord" test answers a simple question: will a server survive the loss of a single power feed without service interruption? It's a quick way to verify that redundancy works in practice, not just on a diagram.

This check helps uncover issues ahead of time that usually appear at the worst possible moment. For example: both PSUs plugged into the same PDU, rack outlets mislabeled, or a UPS that holds load worse than its indicators suggest. On servers with dual power supplies the test shows whether the system can smoothly switch to the second supply and keep running.

Most often the test reveals three groups of problems: no real A/B separation, a weak link in the chain (cable, connector, PDU, breaker, UPS), or monitoring that doesn't log the event or notify the on-call engineer.

Success is not just that the server "didn't crash", but predictable behavior of the whole system. A good result looks like this:

services and the OS continue running without reboot;
the second power supply takes the load and statuses change correctly on the front panel and in the management systems;
a clear log entry is created and a notification is sent;
after the cable is returned the power configuration returns to normal without new errors.

If disconnecting the cord causes a reboot, disks disappear, the management network drops, or alerts remain silent, this resembles an incident—only in controlled conditions.

Preparation: safety, maintenance window and rollback plan

The test sounds simple only in theory. Preparation determines whether it will be a short check or a real outage. Start with a pilot run: one server or one rack where load is predictable and the impact of a stop is minimal.

Arrange a maintenance window. Prefer low-activity periods: night, early morning or a pre-approved technical window. If possible, temporarily reduce load: move heavy jobs, pause backups, limit batch operations.

Notify in advance everyone who might notice changes: the on-call admin and the onsite engineer, the service owner, facilities (electrician/rack engineer), the monitoring or SOC team, and, if necessary, users.

Plan a rollback. It should be short and doable under stress: restore power, check PSU status, confirm the server didn't reboot, and bring the system back to normal wiring. For critical nodes, prepare a fast failover method (cluster, spare server, replica).

Before you start, make sure you have:

physical access to the server and rack, adequate lighting;
clear labeling of A and B power runs (or at least precise knowledge which is which);
access to the server console and monitoring systems;
contact for someone who can quickly assist with electrical or UPS work;
time to observe after the action (5–10 minutes), not just the pull itself.

And safety first: do not pull on cables that hang unsupported (hold the plug itself), don’t yank connectors, don’t work with wet hands, and don’t open distribution boards without authorization. If in doubt, stop and call a specialist. Power reliability starts with discipline, not bravery.

How power redundancy should be arranged

Redundancy is usually built around two independent feeds, conventionally A and B. In plain terms, two separate paths bring electricity to the server. Ideally they do not cross until the server: different breakers, different cables, different outlets, and often different UPS units.

Two PSUs in a server exist so it can keep running if one feed is lost. Usually PSUs share the load: each carries part of the consumption, and if one feed fails the remaining PSU quickly picks up 100%.

For redundancy to be real, separate PDUs and independent inputs matter. If both server cords are plugged into the same PDU that’s fed from the same breaker or UPS, it’s not redundancy — it’s two cords into a single point of failure. A common issue is accidental bridging of A and B with one extension, a shared power strip or two outlets that are actually on the same circuit.

Common schemes you will encounter:

2 PSUs and 2 feeds (A to PDU-A, B to PDU-B) — correct approach;
2 PSUs and 1 feed — the server survives a PSU failure but not a feed loss;
1 PSU and 2 cords — you can only use one feed at a time; no real redundancy;
2 PSUs but both plugs in one PDU — looks redundant but fails with the PDU.
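
One way to make the difference between these schemes checkable is to write down the upstream path of each cord and look for components that appear in both. The sketch below is only an illustration: the component names (PDU-A, BRK-1, UPS-1) and the functions `shared_points` and `classify` are hypothetical, and the paths themselves have to come from physically tracing the cables.

```python
def shared_points(path_a, path_b):
    """Return upstream components that appear in both feed paths."""
    return sorted(set(path_a) & set(path_b))


def classify(paths):
    """paths: one list of upstream component names per power cord."""
    if len(paths) < 2:
        return "fewer than two cords: no feed redundancy"
    common = shared_points(paths[0], paths[1])
    if common:
        return "shared point of failure: " + ", ".join(common)
    return "independent A/B feeds"


# Hypothetical rack: both cords end up on the same UPS.
print(classify([["PDU-A", "BRK-1", "UPS-1"], ["PDU-B", "BRK-2", "UPS-1"]]))
# prints: shared point of failure: UPS-1
```

Any non-empty intersection is a place where one failure removes both feeds, which is exactly the "looks redundant" case from the list.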

Step-by-step: how to run the test with minimal risk

The idea is simple: disconnect one power cord and confirm the server runs on the other feed. But do it carefully or you can get a false success or an unexpected outage.

First, make sure the two cords are really separate. A common trap: both plugs go to one PDU or to two outlets fed by the same breaker. Minimum check: trace cables to the PDU, then to the UPS or switchboard, and mark which is A and which is B.

Before disconnecting, record the baseline state. Check current load (CPU, disks, network), PSU status on the server panel, PDU and UPS indicators, and confirm no active critical alerts. If a heavy operation is in progress (backup, migration, update), postpone the test.

Safe sequence of actions:

confirm the maintenance window and rollback plan;
verify both PSUs are on and show normal status and that power is truly split across different feeds;
disconnect one cord (usually easier at the PDU) and note reaction time;
verify there was no reboot: open the management console, check connections and services;
return the cord and confirm both PSUs are normal again without new errors.
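
The "no reboot" check in the sequence above can be made less subjective by comparing system uptime before and after the pull. This is a minimal sketch, assuming you read the uptime values yourself (for example from /proc/uptime on Linux); the function name and the sample numbers are illustrative.

```python
def rebooted_during_test(uptime_before_s, uptime_after_s):
    """A reboot resets uptime, so a smaller value after the pull means a restart."""
    return uptime_after_s < uptime_before_s


# Uptime read just before the pull and again after the observation period.
print(rebooted_during_test(864000.0, 864420.0))  # prints: False
print(rebooted_during_test(864000.0, 95.0))      # prints: True
```

Uptime only proves the OS did not restart; brief network or disk drops still have to be checked separately.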

Listen to the server while testing. A short fan speed increase is acceptable, but hangs, network drops or reboots indicate problems.

What to watch on the server and PSUs

During the test it's important not only to see the server stay on, but to understand how it tolerated the loss of one feed. Sometimes the system stays up but experiences a short dip, loses one PSU, or goes into an alarm state that later goes unnoticed.

First, check front panel LEDs and the PSUs themselves. It’s normal that after pulling one cord the remaining PSU stays lit as "OK" while the pulled one shows "fault/AC lost" (or its input LED goes out). A short beep and a fan speed increase lasting a few seconds are acceptable.

Warning signs of instability:

blinking LEDs, audible clicks and a noticeable pause like a reboot;
continuous alarm that does not clear after switching to the second feed;
dramatic fan noise and heating of the single PSU carrying all load;
repeated restart attempts, LED color changes, or entry into an emergency mode;
the server stays up but the network or disks drop for seconds (an indirect sign of a sag).

Also check load redistribution. With two healthy PSUs the power is shared; after disconnecting one cord the remaining PSU should take 100% smoothly without oscillation or overheating.

Record at minimum: exact disconnection time, indicators and sounds before and after, whether there was any pause (even fractions of a second), and the final result — the server continued or not. This helps correlate BMC and OS events.

Checking the UPS, PDU and rack electricals

A server test often turns into an audit of the entire power chain. Diagrams often show two feeds, but in reality both may converge at one point: a single UPS, a single PDU or a single breaker.

UPS: switching to battery is not always a good sign

If the UPS goes to battery when you pull one cord, it usually means you cut the UPS input or a shared upstream section, not just one server cable. This commonly happens when both server feeds come from the same UPS or the same supply.

The normal picture for independent lines: you pull A, the server stays on B, and the UPS on line A (if present) keeps running from mains and does not discharge batteries. Battery usage is acceptable only if the design requires it; then it’s important batteries are healthy and have enough runtime to restore power.

PDU and breakers: does the load "tilt"?

When one feed is removed all load moves to the other. This can reveal an overloaded PDU, cable, outlet or breaker. Often each line looks fine alone, but after switching one line operates at its limit.

Check simple metrics: measure load on both lines before the test (UPS display, metered PDU or breaker readings). After disconnecting, ensure the remaining line is not overloaded and that its breaker does not trip, including after a delay of 10–30 seconds. If the remaining line runs at the edge, that’s a risk for peak loads and equipment spin-ups.
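
The arithmetic behind this check is simple: after one feed is removed, the surviving line carries roughly the sum of both pre-test loads. The sketch below assumes readings in amperes and a planning margin of 80% of the rating; that margin, the function name and the sample values are illustrative assumptions, so substitute the limits that apply to your PDU, breaker and local rules.

```python
def failover_headroom(load_a_amps, load_b_amps, rating_amps, margin=0.8):
    """Estimate whether one line can carry the combined load of both."""
    combined = load_a_amps + load_b_amps
    limit = rating_amps * margin  # assumed planning limit, not a standard value
    return {"combined_amps": combined, "limit_amps": limit, "ok": combined <= limit}


# Hypothetical readings from two metered PDUs on 16 A circuits.
print(failover_headroom(6.0, 5.0, 16.0)["ok"])  # prints: True
print(failover_headroom(7.0, 7.0, 16.0)["ok"])  # prints: False
```

Running the numbers before the test tells you whether pulling a cord is likely to trip the other line, which is better learned on paper than in the rack.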

Phasing and distribution: a quick sanity check

In small server rooms people often confuse "two lines" with "two different outlets." A sign of true independence is different sources and different protective devices for A and B.

Without complex tools you can verify at rack level: check cable labels for A/B, see which PDUs and breakers they connect to, and ensure turning off one group does not power down the other. If switching off one breaker kills both lines, the redundancy is only on paper.

Logs and alerts: what should be recorded

It’s important not only that the server didn’t power off. The system should record the loss of one feed, indicate which PSU/input was affected, and send a clear alert. Otherwise a real failure may pass quietly until the second channel fails.

What log entries are normal

After pulling one cord, events typically appear in several places. Ideally you see a chain: input lost, switch to single-channel operation, then restore after plugging back in.

Typical expected events:

input AC lost on PSU A or PSU B (AC lost, input lost);
redundancy degraded (redundancy lost, degraded);
warning about running without redundancy but without server shutdown;
AC restored after the cable is returned.
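
The expected chain can be checked mechanically once the log messages are collected. The following is a rough sketch: the keyword patterns are taken from the list above, but real BMC and OS wording differs between vendors, so treat `PATTERNS`, the function names and the sample messages as assumptions to adapt.

```python
# Keyword patterns are illustrative; adjust them to your vendor's log wording.
PATTERNS = {
    "ac_lost": ("ac lost", "input lost"),
    "redundancy_degraded": ("redundancy lost", "redundancy degraded"),
    "ac_restored": ("ac restored",),
}
EXPECTED_CHAIN = ["ac_lost", "redundancy_degraded", "ac_restored"]


def classify_event(message):
    """Map a log message to an event kind, or None if it is unrelated."""
    text = message.lower()
    for kind, needles in PATTERNS.items():
        if any(needle in text for needle in needles):
            return kind
    return None


def chain_complete(messages):
    """True if the expected kinds occur in order; other events may be interleaved."""
    position = 0
    for message in messages:
        if position < len(EXPECTED_CHAIN) and classify_event(message) == EXPECTED_CHAIN[position]:
            position += 1
    return position == len(EXPECTED_CHAIN)


log = ["PSU 1: AC lost", "Power redundancy degraded", "Fan speed raised", "PSU 1: AC restored"]
print(chain_complete(log))      # prints: True
print(chain_complete(log[:3]))  # prints: False
```

A missing "restored" entry after the cord is back is worth investigating even if the server looks healthy.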

Where to look: server management controller (BMC, iLO/iDRAC/IPMI), OS events (Windows Event Viewer, Linux journald/syslog) and monitoring. Compare all three sources.

Time in logs: a small detail that can break investigations

If BMC, OS and monitoring clocks differ, it’s hard to prove the sequence of events: power loss, PSU fault or reboot. Check time sync (usually via NTP) before the test.
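
A quick way to quantify clock drift is to read each source's current time at roughly the same moment and compare the values. The sketch below assumes ISO 8601 timestamps that all carry a UTC offset; the 2-second tolerance, the function names and the sample readings are illustrative, not a recommendation from any vendor.

```python
from datetime import datetime


def max_skew_seconds(clock_readings):
    """clock_readings: source name -> ISO 8601 time read at the same moment."""
    parsed = [datetime.fromisoformat(value) for value in clock_readings.values()]
    return (max(parsed) - min(parsed)).total_seconds()


def clocks_in_sync(clock_readings, tolerance_s=2.0):
    """True if the largest difference between sources is within tolerance."""
    return max_skew_seconds(clock_readings) <= tolerance_s


readings = {
    "bmc": "2024-05-01T02:15:03+00:00",
    "os": "2024-05-01T02:15:04+00:00",
    "monitoring": "2024-05-01T02:15:40+00:00",
}
print(max_skew_seconds(readings))  # prints: 37.0
print(clocks_in_sync(readings))    # prints: False
```

With a skew like this, a reboot recorded by one source could appear to precede the power loss recorded by another.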

What the alert should contain

An alert should be concise and actionable. Minimum: which server and its location (site, rack, service), which PSU/input was lost, whether redundancy remains, event time and who to escalate to (on-call IT and power/facilities contacts).
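
As an illustration, the minimum fields listed above fit into a small record that renders to one readable line. This is a sketch, not the format of any particular monitoring product: the class name, field names and sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PowerAlert:
    server: str
    site: str
    rack: str
    service: str
    lost_input: str             # which PSU/input was lost
    redundancy_remaining: bool  # whether redundancy remains
    event_time: str
    escalate_to: list           # on-call IT and power/facilities contacts

    def summary(self):
        """Render the alert as a single actionable line."""
        state = "redundancy remains" if self.redundancy_remaining else "NO redundancy"
        return (
            f"[{self.event_time}] {self.server} ({self.site}/{self.rack}, {self.service}): "
            f"{self.lost_input} lost, {state}; escalate to {', '.join(self.escalate_to)}"
        )


alert = PowerAlert("srv-01", "site-1", "R12", "file server", "PSU A input",
                   False, "2024-05-01T02:15:03+00:00", ["on-call IT", "facilities"])
print(alert.summary())
```

If any of these fields cannot be filled from your monitoring data, that gap is itself a finding of the test.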

If you pulled the cable, the server stayed up, and monitoring is silent, a real failure will only be noticed by complaints. That’s why the test is valuable — it reveals gaps in observability.

Common mistakes that turn the test into a problem

The most common failure is simple: two cords exist physically, but there is no real redundancy.

First trap: both cables are fed from the same source (one outlet, one PDU, one breaker group or one UPS). When that source fails you lose both feeds, so the "two" feeds are an illusion.

Second mistake: two PSUs are installed but one is faulty or switched off. Typical causes: the PSU switch is off, the module is loosely seated, the cable is wrong, or a degraded PSU went unnoticed by ops.

Third problem: power margin is too tight. With two PSUs the load is shared; after removing one cord the remaining PSU may go into protection.

Fourth: UPS behavior. False transfers or overly sensitive thresholds look like an outage. You end up testing UPS behavior rather than a single-cord failure and drawing wrong conclusions.

Fifth: testing "blind." Without alerts from BMC, OS and monitoring you may miss that one PSU is in error and the other is running on the edge.

What to check if the test yields unexpected results

Signs that redundancy was only on paper:

both cords come to the same PDU or the same breaker group;
one PSU reports an error or is not detected (LED off, not visible in IPMI);
total power is close to the maximum of one PSU, especially during peaks;
UPS switches to battery with a delay or shows a short dip;
no monitoring or alerts, so it’s unclear what happened during disconnection.

Short checklist for a successful test

A successful test looks boring: services keep running and you get clear signals that one feed failed and the system moved to redundant operation.

Power cords are truly separated: A goes to one feed/PDU, B to another, confirmed by tracing and rack labels.
The check was done under real load: services and VMs running, no manual reboots "just in case."
UPS and PDUs behave as expected: UPS does not go to battery without reason, breakers don’t trip, PDUs show no overload.
Logs contain clear entries with precise time and description (e.g., "AC input lost / PSU redundant state changed").
Alerts reached the right people: notifications were delivered to a channel that is actually monitored and clearly indicate what happened.

A sanity check: after returning the cord the system returns to the original state without errors and redundancy status is normal. If any item fails, the test is not yet passed.

Real-life example: a rack in an office or small server room

A small company keeps a rack in a closet: two 1U servers and a single 3 kVA UPS. Servers have dual PSUs and the rack has two PDUs, but connections were done "for convenience" without a wiring diagram. One server handles files and accounting, the other handles backups and domain services.

They ran the test on one server during a quiet time. Both power LEDs were on and fans were steady. The admin unplugged one cord at the PDU, held it and watched: the server should not reboot and its power status should switch to "1 of 2."

After a few seconds the server logged a PSU warning, but monitoring stayed silent. The OS log recorded a PSU loss, and the management controller (IPMI/iDRAC/iLO) also logged the event — but alerts did not arrive.

While investigating they discovered a worse issue: both server cords were plugged into the same PDU, and the second PDU was powered from the same UPS and breaker. Formally two cords existed, but a PDU or breaker failure would cut both.

They fixed it quickly:

routed cords to PDU A and PDU B and labeled them;
verified PDUs are actually fed by different lines or at least different UPS outputs;
enabled power event alerts in monitoring and tested notification delivery;
repeated the test and recorded behavior and reaction times.

On the second run the server stayed up, an alert arrived within a minute, and the UPS log showed the load change. Small fixes like this are what keep power reliable, even in racks with only a couple of machines.

Next steps: lock in the result and maintain reliability

A one-off test is useful, but power reliability depends on habits. After a successful check, record exactly what was unplugged, what you observed and what needs fixing. Then future runs will be faster and calmer.

Make the procedure regular: for critical racks quarterly, for others semiannually, and after any UPS, PDU, electrical work or server replacement. Ensure the test is not done from memory: any shift should be able to follow a clear written scenario.
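
To keep the schedule from depending on someone's memory, the next due date can be computed from the last test date. The sketch below maps "quarterly" to 3 months and "semiannually" to 6; the rack class labels and the function name are illustrative, and unscheduled retests after electrical work still have to be triggered by hand.

```python
import calendar
from datetime import date

# Interval in months per rack class; the class labels are hypothetical.
INTERVAL_MONTHS = {"critical": 3, "other": 6}


def next_test_due(last_test, rack_class):
    """Add the interval in calendar months, clamping to the month's last day."""
    month_index = last_test.month - 1 + INTERVAL_MONTHS[rack_class]
    year = last_test.year + month_index // 12
    month = month_index % 12 + 1
    day = min(last_test.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


print(next_test_due(date(2024, 1, 15), "other"))      # prints: 2024-07-15
print(next_test_due(date(2024, 11, 30), "critical"))  # prints: 2025-02-28
```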

Standardize:

labeling of cords and outlets (A and B, color and tags on both ends);
rack power diagram (which PDU is A or B, where inputs originate);
protocol template (date, rack, server, what was disconnected, what happened, who confirmed);
routing rules (A and B cords not bundled together and not routed through a single point of failure);
responsibilities and alert channels (who responds and where alerts go).
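
The protocol template from the list above can be kept as structured records so that results from different runs are comparable. This is a minimal sketch with hypothetical field names and sample values; a spreadsheet or a ticket form with the same fields serves the same purpose.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PullTestRecord:
    date: str
    rack: str
    server: str
    disconnected: str   # what was disconnected, e.g. which cord and where
    observed: str       # what happened
    confirmed_by: str   # who confirmed
    passed: bool


def to_json_line(record):
    """Serialize one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


record = PullTestRecord("2024-05-01", "R12", "srv-01", "cord A at PDU-A",
                        "no reboot, alert received", "on-call admin", True)
print(to_json_line(record))
```

One line per test makes it easy to see when a rack last passed and what changed between runs.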

Audit and rewire racks if you see red flags: one PDU feeding both PSUs, A and B on the same breaker, overloaded UPS, or servers rebooting after a test. These often appear after moves, expansions or "temporary" cabling that became permanent.

If you need help designing and implementing an A/B scheme, selecting servers with dual PSUs, integrating and supporting the solution, this can be done with GSE.kz (gse.kz) as a manufacturer and system integrator offering 24/7 technical support and a service network across Kazakhstan. This is useful when you need not just equipment but verifiable operational fault tolerance.