What a 3-year forecast gives you and common mistakes

A 3-year forecast is not about producing a pretty table. It supports decisions that can’t be reversed in a week: what to buy and when, whether there’s enough rack space, power and cooling, which license costs will rise with core and socket counts, how long to sign support contracts for, and whether to plan site expansion in advance.

The main mistake is replacing the forecast with a single number like “add +30%.” CPU, RAM, storage and network grow for different reasons and at different rates. A new analytics layer may add almost no CPU but sharply increase RAM and IOPS. Connecting branch offices can hit port and uplink limits even if server count grows only slightly.

Underestimating capacity usually ends with degraded services (peaks, queues, timeouts) and urgent “yesterday” purchases when choices are limited and prices are higher. Overestimating is also dangerous: money is frozen in idle hardware, and license, power and maintenance costs grow.

It’s more convenient to plan by resource: CPU (peak and average), RAM (actual usage, not allocation), storage (capacity, IOPS and throughput), network (bandwidth and port counts). The goal is a clear buffer for growth and new projects without overpaying or risking downtime. For government organizations and large enterprises this is especially important: procurement and approval cycles often take longer than it takes for load to outgrow the available capacity.

What data to collect before calculations

Before calculating, collect data that reflects not only current usage but also real site constraints. Otherwise numbers will be “accurate on paper” and useless for procurement.

Start with an inventory, but not just “by names” — by resources and dependencies. Record servers and clusters, hypervisors and versions, storage systems and connection topology, switches and port speeds, uplinks and their actual utilization. Note separately where racks, ports, licenses, warranties and support end.

Next, gather historical metrics for at least 3–6 months, preferably a year: averages say little without peaks and trends. Look at CPU peaks, RAM usage, storage latency and IOPS, pool fill levels, network traffic on uplinks and between segments. Note seasonality (for example, reporting periods) and regular spikes from backups.

To avoid drowning in detail, assemble a short input pack:

current infrastructure and critical service diagrams;
peaks and the 95th percentile, plus trends for CPU, RAM, storage and network;
business plans (new systems, user growth, security requirements);
physical limits (power, cooling, space, lead times);
availability requirements (N+1, replication, DR, maintenance windows).

A simple example: if you plan to deploy a SIEM and retain logs for 12 months, growth won’t be only in capacity. IOPS and network (collection, normalization, replication) will also increase. Get these requirements from the project owner in advance, before you give everyone the same “buffer.”

Growth model: drivers, scenarios and assumptions

To avoid turning the forecast into guesswork, start with a simple model: what will grow and why. Choose a planning cadence: quarters are convenient if projects start often; half-years work if changes are rare and budgets are approved yearly.

Then break growth down by drivers. This keeps discussions focused on causes rather than abstract percentages: how many users, which processes, how much data.

Common drivers include: users and endpoints (VDI, branches), transactions and operations (CRM/ERP, online channels), data (logs, video, archives, analytics), and new services (containers, AI pilots, integrations).

Define three scenarios. They show a budget range and help identify where a buffer is needed.

Base: growth per current business plans.
Optimistic: faster user growth or additional services launched.
Stress: projects are delayed but sudden requirements for security, retention or resilience appear.

Also note step changes: ERP deployment, mass VDI rollout, video streams, AI inference, opening branches. These events cause jumps in CPU/RAM/storage/network and rarely follow a smooth curve.

Finally, record assumptions: what is considered likely (e.g., “+500 users by Q4,” “replication 2x,” “log retention 180 days”) and who validates them — service owner, security, data owner, architecture, finance. Without confirmations the forecast quickly loses value.
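
The driver-and-scenario model above can be sketched in a few lines. This is a minimal illustration, not a definitive method: the `Scenario` class, the `project` function and all growth rates and step sizes are hypothetical placeholders to be replaced with figures confirmed by the owners of each assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """Hypothetical scenario description: smooth growth plus dated step changes."""
    name: str
    quarterly_growth: float  # organic growth per quarter, 0.03 means 3%
    steps: dict = field(default_factory=dict)  # quarter number -> one-off addition


def project(baseline, scenario, quarters=12):
    """Demand at the end of each quarter: compound growth, then any step change."""
    demand = baseline
    series = []
    for quarter in range(1, quarters + 1):
        demand *= 1 + scenario.quarterly_growth
        demand += scenario.steps.get(quarter, 0.0)
        series.append(round(demand, 1))
    return series


# Illustrative inputs only: 2,000 GB of RAM in use today, VDI rollout as a step change.
base = Scenario("base", 0.03, {4: 600.0})
optimistic = Scenario("optimistic", 0.05, {3: 600.0, 8: 400.0})
for s in (base, optimistic):
    print(s.name, project(2000.0, s)[-1])
```

Running the same function per resource (CPU cores, RAM, TB, Gbps) and per scenario gives the budget range, and the step changes stay visible as separate inputs instead of being hidden inside a percentage.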

CPU: how to forecast and what buffer to include

CPU is often misunderstood because of terminology. Physical cores provide baseline performance. Threads (Hyper-Threading/SMT) don’t always help; the gain depends on the workload. vCPU in virtualization is a planning unit, not a speed guarantee. Agree in advance which units you’ll use (for example, pCPU cores and frequency) and what peak utilization is acceptable.

Base your planning on peaks, not monthly averages. If accounting workloads spike at month-end and BI runs reports in the morning, those windows dictate requirements. A common practical approach: use the 95th or 99th percentile by cluster and separately record the heaviest days (period closes, scheduled calculations).
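
To show why the percentile matters more than the average, here is a small sketch using the nearest-rank definition of a percentile. The `percentile` helper and the sample values are illustrative assumptions; monitoring tools may use a slightly different interpolation.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up cluster CPU utilization (%): quiet most of the time, busy 10% of the time.
cpu_samples = [15] * 90 + [90] * 10

print("average:", mean(cpu_samples))          # 22.5
print("p95:", percentile(cpu_samples, 95))    # 90
```

The average suggests a nearly idle cluster, while the 95th percentile shows the load that users actually experience during the busy windows.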

Don’t forget overhead: hypervisor, security agents, monitoring, backups, system services. Even 5–10% extra can be the difference between a stable cluster and a constantly constrained one.

CPU buffer usually has two parts: failure/maintenance reserve (N+1, sometimes N+2 for critical clusters) and growth reserve tied to peak utilization thresholds (for example, keeping peaks below 60–70%).
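
The two-part buffer can be expressed as a simple sizing formula. This is a sketch under stated assumptions: the function name, the 10% overhead and the 65% peak ceiling are examples taken from the ranges mentioned above, not fixed recommendations.

```python
import math


def hosts_needed(peak_cores, cores_per_host, max_peak_util=0.65,
                 overhead=0.10, spare_hosts=1):
    """Hosts required to keep peaks under the ceiling, plus failure/maintenance spares."""
    demand = peak_cores * (1 + overhead)            # workload peak plus platform overhead
    usable_per_host = cores_per_host * max_peak_util
    return math.ceil(demand / usable_per_host) + spare_hosts


# Example: workloads peak at 200 physical cores, hosts have 64 cores each, N+1.
print(hosts_needed(200, 64))  # 7
```

Raising `spare_hosts` to 2 models N+2 for critical clusters; lowering `max_peak_util` models a larger growth reserve.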

Before finalizing numbers, check “silent” bottlenecks: core frequency (important for databases), NUMA boundaries, license limits per core, and platform limits. Often these are more important than simply “adding more CPU.”

RAM: account for actual consumption and overcommit risks

Memory often runs out sooner than CPU. Reports frequently show a lot “allocated” but little “used,” which looks reassuring. But during peaks and failures it is actual consumption at that moment that matters, not monthly averages.

Start by comparing how much RAM systems were given versus how much they actually use. Do this by groups: databases, virtualization, VDI, application services, infrastructure (AD, monitoring, backups). This quickly shows where there’s a safe cushion and where memory is constantly near the limit.

What usually inflates RAM usage

Memory growth is rarely linear. It often jumps due to caches, indexes and new features.

Typical causes: larger DB caches and parallelism, new modules and queues in applications, more VMs and HA requirements, heavier VDI profiles (browser, video calls), and security agents (EDR, encryption, scanning).

Overcommit: when risk becomes real

Memory overcommit can work on calm days but fails in three situations: peaks, mass restarts and VM migrations (for example, during host maintenance or failure). If the cluster is tight, migrating several large VMs can hit RAM limits and break recovery plans.

A practical rule: calculate “normal” load and a “stress” scenario separately (peak load with one host down, i.e. N-1 hosts remaining). For databases and VDI, make separate estimates, because these workloads tolerate aggressive overcommit poorly.

Don’t forget reserve for OS and platform updates. New versions often demand more memory, and adding security and monitoring agents to hundreds of VMs yields noticeable aggregate growth. This buffer is almost always consumed without warning.
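
The normal-versus-stress comparison can be checked with a short calculation. This is a minimal sketch: the function name, the 10% platform reserve and the host sizes are assumptions for illustration, and it conservatively assumes the largest hosts are the ones that fail.

```python
def ram_headroom_after_failure(host_ram_gb, peak_used_gb,
                               failed_hosts=1, platform_reserve=0.10):
    """Spare RAM in GB (negative means shortfall) once the largest hosts are down."""
    survivors = sorted(host_ram_gb)[: len(host_ram_gb) - failed_hosts]
    usable = sum(survivors) * (1 - platform_reserve)  # keep a share for hypervisor and agents
    return usable - peak_used_gb


hosts = [512, 512, 512, 512]   # GB per host, illustrative
peak_used = 1500               # GB actually consumed at peak, illustrative

normal = ram_headroom_after_failure(hosts, peak_used, failed_hosts=0)
stress = ram_headroom_after_failure(hosts, peak_used, failed_hosts=1)
print(f"normal: {normal:.0f} GB, one host down: {stress:.0f} GB")
```

In this example the cluster looks comfortable on a normal day but cannot absorb a single host failure at peak, which is exactly the case where overcommit stops being safe.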

Storage: capacity, IOPS and the impact of backups and replication

For storage, separate two questions: how much data (TB) and how fast data is read/written (IOPS and MBps). A frequent mistake is counting only capacity and then wondering why disks “look empty” while applications are slow.

Build a profile per major service: current data volume, average daily growth, read/write peaks, share of random I/O (usually higher for DBs), and block sizes. If metrics are missing, start with rough estimates by workload class: databases, virtualization, file shares, video surveillance, logs.

Count add-ons that often double (or more) capacity needs: logs and telemetry (especially with increased auditing), backups and snapshots with retention periods, replication to a second site and DR (plus network traffic).

Then separate data by “temperature”: hot (needs IOPS and low latency) and cold (TB matters more). This helps choose storage tiers: fast pools for critical systems and larger pools for archives and backups.

Validate forecasts with compression, deduplication and thin provisioning, but be cautious. Savings vary: virtual OS disks often dedupe well; encrypted backups often do not.

Include reserve for recovery and test environments. If you have 50 TB of production data, 30 days of backups and replication, real capacity demand can grow to 150–250 TB. IOPS often become critical during nightly backup windows and virtual machine boot storms.
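
One way to arrive at a figure in that range is shown below. The 2% daily change rate, one full backup plus daily incrementals, a single replica and a 15% free-space target are all assumptions chosen for illustration; actual backup schemes and data reduction ratios will change the result.

```python
def storage_demand_tb(primary_tb, daily_change=0.02, retention_days=30,
                      replicas=1, free_space_target=0.15):
    """Rough total capacity: primary data, backups, replicas and free-space reserve."""
    # One full backup plus one incremental per retained day (assumed backup scheme).
    backups = primary_tb + primary_tb * daily_change * retention_days
    replicated = primary_tb * replicas
    raw = primary_tb + backups + replicated
    return round(raw / (1 - free_space_target), 1)


print(storage_demand_tb(50))  # 211.8
```

Here 50 TB of production data becomes 80 TB of backups and 50 TB of replica, and keeping 15% of the pools free brings the total to roughly 212 TB before any allowance for test copies.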

Network: how to forecast bandwidth and ports

The network often breaks the plan because traffic growth is not always visible on external links. Start by separating east-west (inside the data center) and north-south (to users, branches, internet). East-west usually grows faster due to microservices, virtualization, clusters and internal APIs.

Forecast based on peaks, not averages: use the 95th or 99th percentile for interfaces and port queues. Add a line for “hidden” traffic that grows with infrastructure: backups, replication, snapshots, logs, monitoring, updates.

Check the network at three levels: server uplinks, the leaf (access/aggregation) layer and the spine or core. The bottleneck is often not total bandwidth but a shortage of ports at the required speed, or specific uplinks that are overloaded.

Quick checklist before calculation:

estimate peak east-west and north-south and note seasonality (month-end, nightly windows);
assess backup and replication contributions and their schedules;
check spare capacity on 10/25/100G ports and what can be used without rework;
account for overhead from segmentation and security (ACLs, firewalls, encryption, inspection);
design a "speed ladder" to move to faster interfaces without replacing the entire layer.

Example: adding an analytics cluster may increase external traffic by 10% but internal data transfers between nodes by 2x. The forecast must include a port and interface upgrade plan, not just gigabits.
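
Translating such growth factors into ports can be sketched as follows. The function, the 70% utilization ceiling, the single redundant link and the traffic figures are hypothetical; the point is that bandwidth growth ends up as a count of ports at a specific speed.

```python
import math


def links_needed(peak_gbps, growth_factor, link_speed_gbps,
                 max_util=0.7, redundancy=1):
    """Links of a given speed needed to carry the grown peak, plus redundant links."""
    target = peak_gbps * growth_factor
    return math.ceil(target / (link_speed_gbps * max_util)) + redundancy


# Illustrative peaks: 30 Gbps in each direction today, carried over 25G links.
north_south = links_needed(30, 1.1, 25)   # external traffic +10%
east_west = links_needed(30, 2.0, 25)     # internal traffic 2x
print(north_south, east_west)             # 3 5
```

Comparing these counts with the free 25G ports on each layer shows whether the growth fits the existing switches or requires the move to faster interfaces.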

Reserve for new projects: how not to overpay or take a risk

A reserve is not “just in case” but for specific unknowns. First list “known unknowns”: new customer-facing services, regulatory requirements, integrations, reporting growth, import substitution, cross-site migrations. This list helps arguments focus on real events, not percentages.

Then define reserve logic. A fixed buffer (e.g., +20% CPU/RAM and +30% storage) is simple but often leads to overspend. It’s more practical to size buffers by workload class: DBs need IOPS and low latency, VDI and analytics need RAM and GPUs, file services need capacity and growth speed.

A portfolio of reserves works well: a small fast pool for pilots (4–8 weeks), a planned reserve for 6–12 months to scale approved initiatives, a separate network reserve (ports, uplink, licenses), and storage reserve accounting for backups and replication.

To prevent the reserve from being used up unintentionally, set rules for project admission to the data center: minimum requirements, request template, confirmation timelines, and who prioritizes. Keep an emergency plan (for example, 2x growth in a quarter): which nodes to expand first, what to move to another cluster, what to buy.

Step-by-step 3-year calculation method

To avoid turning calculations into opinion battles, start simple: fix the baseline, agree assumptions and calculate by consistent periods (for example, quarterly). Separate current load from peaks and from what new projects will add.

Calculation steps

Capture the baseline and inventory. Gather actual CPU, RAM, disk usage (used and growth), IOPS/latency, and network traffic. Add the list of servers, hypervisors, storage, switches and software versions. Use at least 4–8 weeks of data to see typical peaks; seasonality needs the longer 3–12 month history.
Describe growth and the change calendar. List drivers (users, new services, data growth, security) and schedule: what launches, migrates or is decommissioned and when.
Calculate needs by period. For each quarter compute CPU and RAM (based on real utilization and overcommit policy), storage (capacity plus IOPS), and network (Gbps and number of ports). The output should be a “was – will be” table.
Add buffers with rules. Include N+1/failure reserves, seasonal peaks, backup and replication windows, and a project pool that can be quickly allocated. Phrase buffers specifically: “base need for Q3 + peak buffer + CRM launch buffer.”
Check physical limits and turn calculations into a plan. Verify racks, power, cooling, spare ports and uplinks, and lead times. Then split procurement and implementation into phases: what’s needed now, what in 12–18 months, and what can be deferred if growth turns out slower than the base scenario.

A good methodology lets any participant take the inputs and reproduce the result; scenario changes should immediately show budget and schedule impact.
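
Keeping each buffer as its own named line is what makes the result reproducible. A minimal sketch, with a hypothetical function name and made-up numbers for RAM in GB:

```python
def quarter_need(base, peak_buffer_pct, project_buffers):
    """Itemized need for one quarter: base demand, peak buffer, named project buffers."""
    lines = {"base need": base, "peak buffer": base * peak_buffer_pct}
    lines.update(project_buffers)
    lines["total"] = sum(lines.values())
    return lines


q3_ram = quarter_need(1000, 0.2, {"CRM launch buffer": 150})
for item, value in q3_ram.items():
    print(f"{item:20} {value:8.0f}")
```

Anyone reviewing the forecast can then challenge a single line (for example, the CRM buffer) without reopening the whole calculation.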

Common mistakes and traps in planning

The most common mistake is using averages. A 20% average CPU looks safe, but if month-end or business-hours peaks reach 85–95%, users notice slowdowns. Over a multi-year horizon, understanding peaks and their causes matters more than neat averages.

Second trap: mixing test and production loads in one calculation. Test environments spike unpredictably: near-zero one day, then a release run the next day that saturates compute and network. Mixing them leads either to overspending on rare spikes or to putting production at risk.

Teams often underestimate sideways data growth: application logs, audit trails, monitoring, metrics, SIEM. When teams enable more detailed logging for investigations, log volume can double in a year, and write IOPS increase. If not planned, storage will fill unexpectedly even though “business data” changed little.

Another issue is underestimating resilience and maintenance windows. A tight plan breaks during host repairs or hypervisor upgrades: a cluster should be sized N+1 (sometimes N+2) so that it survives the loss of a host, which affects CPU, RAM, network and storage.

Before approving architecture, check platform and license limits: per-socket or per-core licensing, cluster size limits and software versions, RAM limits per host or VM, encryption, backup and replication requirements, and compatibility with planned CPUs and NICs.

Finally, don’t plan only for servers. Increased compute almost always drives storage and network growth: more traffic to databases, more backups, more replication. End-to-end bottleneck checks usually save budget and time.

Pre-budget checklist

Before signing a budget, ensure you rely on facts, not impressions.

Metrics should cover at least 6–12 months to reveal seasonality and peaks. For CPU and RAM distinguish averages and the 95th percentile; for storage track data growth and IOPS during both working hours and backup windows.

Check that the forecast is broken down by periods (quarters or half-years) and that buffers are explicit: what is mandatory growth and what is a reserve for new projects.

historical metrics covering 6–12 months and clear peaks for CPU/RAM/storage/network;
three scenarios agreed and a list of projects with launch dates and owners;
per-resource forecast by period with buffer and rationale in a separate line;
site limits verified: power, cooling, racks/units, ports, uplink, licenses;
review points set (quarterly or semi-annually) and criteria to start the next procurement.

Example trigger: “If in the base scenario cluster utilization stays above 70% CPU or RAM for two consecutive months and a new project launches within 90 days — start procurement.” If you depend on manufacturing and delivery timelines, such triggers are crucial.
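
A trigger like this is easy to make unambiguous by writing it down as a rule. The sketch below is one possible encoding; the function name and parameter defaults are assumptions, and `monthly_util` is taken to be the higher of CPU and RAM utilization for each month, as a fraction.

```python
from datetime import date


def should_start_procurement(monthly_util, next_launch, today,
                             threshold=0.70, months=2, lead_days=90):
    """True if utilization stayed above the threshold and a launch is close enough."""
    recent = monthly_util[-months:]
    sustained = len(recent) == months and all(u > threshold for u in recent)
    launch_soon = (next_launch is not None
                   and 0 <= (next_launch - today).days <= lead_days)
    return sustained and launch_soon


# Two months above 70% and a project launching in 45 days: the trigger fires.
print(should_start_procurement([0.60, 0.72, 0.75], date(2025, 3, 1), date(2025, 1, 15)))
```

If delivery takes longer than 90 days, `lead_days` should be raised to match the real lead time, otherwise the trigger fires too late.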

Example scenario: user growth and new services launch

Imagine a company with 20% annual user growth and 3–4 new branches opening within 12 months. Two projects launch: VDI for 150–200 seats and a new analytics platform. This shows why CPU-only planning fails.

Split growth into baseline (users and branches) and project-driven (VDI and analytics). Baseline growth moderately increases CPU and RAM but noticeably raises WAN/LAN traffic and data volumes (mail, files, logs, backups).

VDI causes a sharp increase in RAM and IOPS. Even if average CPU per desktop is low, disk peaks occur during mass logins, updates and antivirus scans. Analytics may be light on CPU on normal days but require fast disks and a fast network during data loads, replication and refreshes.

Thus storage and network often grow faster than CPU: data volume drives backups, test copies, replication and journaling, while network grows due to branches, VDI traffic and cluster node communications.

Divide reserve into two layers: N+1 for cluster resilience and a pilot buffer (10–20% on key resources) with clear rules when it’s considered used.

Present results as a quarterly table: CPU, RAM, usable capacity, IOPS/throughput, ports/speed. Add expansion triggers: buy one server node when cluster RAM hits 70–75%; expand SSD pool when latency or IOPS exceed targets; increase uplink bandwidth when peaks persist; add backup and replication capacity; reserve rack space and power for the next step.

Next steps: from forecast to procurement and modernization plan

When calculations are ready, convert them into a plan that will survive the budget committee and not collapse with the first project. Prepare a summary table for CPU, RAM, storage (capacity and IOPS) and network (bandwidth and ports) by year. Next to it, list current limits and real utilization to see gaps: what’s lacking now and what will be short in 12–24 months.

Then prioritize actions by speed and risk. Two horizons work well: quick expansion to cover the next 3–6 months and planned modernization to avoid panic buys. Quick measures: add nodes, disks, licenses, increase uplink, rebalance loads. Planned measures: refresh server generations, expand storage arrays, rework ToR/aggregation network, provision new racks. Set checkpoints: quarterly recalculation and scenario review when projects change.

Tie each purchase to a forecast metric, not a vague “we need more hardware.” Pre-agree operational details: delivery times, installation windows, migration plan and 24/7 support requirements. Define acceptance criteria (performance tests, resilience, monitoring).

If local delivery and system integration in Kazakhstan matter, discuss calculations and configurations with GSE.kz (gse.kz) as a vendor/integrator to align servers, network and data center infrastructure into a single modernization plan.