RDS Load Calculation: CPU, RAM and Profiles Without Queues
RDS load calculation helps choose CPU, RAM and disks, assess user profiles and run tests to avoid login queues as load grows.

Why RDS slows down and where queues come from
Users rarely say: “we don't have enough CPU.” They see waiting. Login drags, windows open with a pause, the cursor stutters, printing fails intermittently, and 1C or the browser shows delays after a click. In RDS this almost always means one thing: dozens of sessions simultaneously contend for a shared resource and start waiting.
The problem is often not a single bottleneck. It's a combination.
For example, the CPU may still be “tolerable”, but disks are busy with profiles and temp files, so login time grows. Or there seems to be enough memory, but under RAM pressure the system leans on the pagefile and the disk ends up bearing the whole load. Another frequent culprit is the network and printers: users think “the server is slow”, but in fact printing hangs or a profile loads over a slow link.
Typical “queues” that are easy to recognize from complaints:
- long login and “preparing the desktop” at shift start;
- delays when launching applications and opening files;
- freezes during mass printing, especially to shared printers;
- slow browser behavior when working with documents and tabs;
- “lags” at the end of the day when there are many active sessions and background tasks.
Scaling by eye is doubly dangerous: you can buy extra hardware and still get queues because you strengthened the wrong component. That is why you need a proper load calculation: estimate CPU and RAM per user, understand where profiles and caches live, and verify hypotheses with tests before purchasing.
A practical approach usually looks like this: collect input about users and applications, run basic measurements on a pilot, calculate headroom and confirm numbers with a load run. After launch, enable monitoring so you see overload in advance, not through a barrage of tickets.
Collecting inputs: who and how will work
Any calculation starts not with formulas but with a clear picture of how people work. If inputs are vague, planning almost always ends in either overspending or queues at login and freezes during peak hours.
First determine the real number of concurrent sessions. What matters is not how many employees the company has, but how many people will be in RDS at the same time, and when. Peaks often fall at 9:30–11:30 and 14:00–16:00, plus separate month-end spikes for accounting.
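If logon and logoff times are available (for example, exported from existing session logs), the peak can be counted rather than estimated. The sketch below is one possible way to do it with a simple sweep over events; the function name `peak_concurrency` and the sample intervals are illustrative assumptions, not data from a real environment.

```python
from datetime import datetime

def peak_concurrency(sessions):
    """Return (peak_count, time_of_peak) for a list of (logon, logoff) datetimes."""
    events = []
    for logon, logoff in sessions:
        events.append((logon, 1))    # a session opens
        events.append((logoff, -1))  # a session closes
    # At identical timestamps, process closes (-1) before opens (+1)
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    peak_time = None
    for moment, delta in events:
        current += delta
        if current > peak:
            peak, peak_time = current, moment
    return peak, peak_time

# Hypothetical sample data, not taken from a real log
fmt = "%Y-%m-%d %H:%M"
day = "2024-03-01 "
sample = [
    (datetime.strptime(day + "09:00", fmt), datetime.strptime(day + "13:00", fmt)),
    (datetime.strptime(day + "09:30", fmt), datetime.strptime(day + "18:00", fmt)),
    (datetime.strptime(day + "10:00", fmt), datetime.strptime(day + "11:30", fmt)),
    (datetime.strptime(day + "14:00", fmt), datetime.strptime(day + "16:00", fmt)),
]
peak, when = peak_concurrency(sample)
print(f"peak concurrent sessions: {peak} at {when:%H:%M}")
```

Running the same function over several weeks of data, including a month-end, shows whether the peak is a daily pattern or a seasonal spike.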
Then segment users by behavior, not by job title. An “office” user with email and 3–5 browser tabs and a user who opens a heavy accounting system may differ in load by multiples. Mark separately those who:
- work in browser systems with many tabs;
- use thick clients (accounting, ERP, DMS) and print frequently;
- run heavy applications (CAD, analytics, reports);
- spend a lot of time in video calls or often play video.
Don't forget loads that are often “not counted” but consume resources daily: antivirus scans, encryption, inventory agents, backup, DLP, monitoring. Clarify what must be installed in-session and what can be moved to the image level or configured with exclusions.
Another important block is availability requirements. One server is simpler and cheaper, but any downtime immediately stops work. A farm (several RDSH, broker, profiles, file services) is more expensive but gives flexibility and maintenance planning without stopping services.
Finally, record constraints: budget for servers and storage, procurement timelines, licenses (Windows, RDS CAL), requirements for local production and support. For example, in the government sector of Kazakhstan this often affects the choice of platform and vendor, and thus how quickly you can scale.
Where to start: basic measurements before calculations
Before you start counting cores and gigabytes, establish one simple thing: how many resources a single session really consumes in your environment. “RDS users” that look identical on paper can behave very differently because of Windows version, application set, antivirus, policies, printer drivers and even network quality.
Start with a small control group: 5–10 typical employees working as usual. Collect baseline metrics on their sessions. Watch not only CPU usage but where queues build up.
Minimum counter set to record:
- CPU: utilization and processor queue length;
- memory: free RAM and active consumption per session;
- disk: average disk queue length and read/write latencies;
- network: inbound/outbound traffic and loss/errors;
- responsiveness: how fast the interface reacts (user perception and proxy metrics).
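The host-level counters from this list can be logged with the built-in Windows `typeperf` tool. The sketch below only assembles the command line and does not run it. The counter paths are the standard English names and may differ on localized Windows; the 15-second interval, the sample count and the output file name are assumptions for illustration. Per-session memory and interface responsiveness are not covered by these counters and need separate measurement.

```python
# Standard English performance counter paths for host-level baseline metrics.
COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\System\Processor Queue Length",
    r"\Memory\Available MBytes",
    r"\Memory\Pages/sec",
    r"\PhysicalDisk(_Total)\Avg. Disk Queue Length",
    r"\PhysicalDisk(_Total)\Avg. Disk sec/Read",
    r"\PhysicalDisk(_Total)\Avg. Disk sec/Write",
    r"\Network Interface(*)\Bytes Total/sec",
    r"\Network Interface(*)\Packets Received Errors",
]

def typeperf_command(output_csv, interval_s=15, samples=1920):
    """Build a typeperf argument list: sample every interval_s seconds, `samples` times."""
    return [
        "typeperf", *COUNTERS,
        "-si", str(interval_s),   # sampling interval in seconds
        "-sc", str(samples),      # number of samples (1920 x 15 s = 8 hours)
        "-f", "CSV",              # output format
        "-o", output_csv,         # output file (hypothetical name below)
    ]

cmd = typeperf_command("rds_baseline.csv")
print(" ".join(f'"{part}"' if " " in part else part for part in cmd))
```

On the pilot host the list can be passed to `subprocess.run(cmd)`; the resulting CSV is then the input for the average and percentile analysis.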
Take metrics during a workday and definitely during peak. Averages often deceive: “average 40% CPU” can hide short peaks to 95% that users experience as freezes. So, besides averages, record percentiles (for example, 95th) — they better show what happens in the worst moments.
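To illustrate how an average hides spikes, here is a minimal sketch that summarizes CPU samples with the mean, the 95th percentile and the maximum. The sample values are hypothetical and chosen only to show the effect; `summarize_cpu` is an illustrative name.

```python
import statistics

def summarize_cpu(samples):
    """Return (mean, p95, maximum) for CPU utilization samples in percent."""
    mean = statistics.fmean(samples)
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile
    p95 = statistics.quantiles(samples, n=100, method="inclusive")[94]
    return mean, p95, max(samples)

# Hypothetical samples: 90 calm readings at 35% and 10 short spikes at 95%
samples = [35] * 90 + [95] * 10
mean, p95, peak = summarize_cpu(samples)
print(f"mean={mean:.1f}%  p95={p95:.1f}%  max={peak}%")
```

With these numbers the mean stays near 41% while the 95th percentile sits at 95%, which is the figure users actually feel.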
To make measurements useful, fix the “reference” environment and don’t change it during measurement:
- OS version and update level;
- gold image and installed applications;
- GPO/security policies, folder redirection;
- profile settings (local, containers, redirection);
- antivirus and exclusion rules.
Simple example: accounting runs 1C, email and a couple of browser tabs. A “typical” session is light, but at 10:00 everyone simultaneously generates reports and prints. If you miss that peak, the calculation will be optimistic and queues and long logins will appear after launch.
CPU estimation methodology: how to approximate cores per user
CPU in RDS rarely loads evenly. It’s important to distinguish average load (what you see most of the day) from peaks (what happens during login, starting 1C/browser, opening heavy reports). Plan for the peak, otherwise queues will appear exactly when everyone needs to work.
The practical scheme is simple: measure first, then scale.
Step 1: pilot group and live measurement
Choose a pilot: 10–20 typical users with real tasks and the same apps that will run in production. For 2–3 workdays record host metrics and note events: shift start, mass login, key app launch, printing.
Signs you hit CPU rather than disk or network are usually:
- Processor Queue Length grows and stays high for minutes;
- application response time increases while RAM remains available;
- during mass logins CPU spikes to 90–100% and stays there;
- delays worsen when new sessions start.
Then calculate “cores per user” from the peak: multiply the host’s vCPU count by its peak CPU utilization (as a fraction) and divide by the number of active sessions at that moment. Example: 16 vCPU, peak 80%, 40 active sessions. That gives 16 * 0.8 / 40 = 0.32 vCPU per session. For 120 sessions that’s about 38 vCPU (0.32 * 120 = 38.4). Add 15–25% headroom for growth and unforeseen loads, which brings the target to roughly 44–48 vCPU.
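The same arithmetic as a small sketch, using the example figures from the text (16 vCPU, 80% peak, 40 sessions, 120 planned sessions, 15–25% headroom). The function names are illustrative, and the model assumes load scales linearly with session count, which the pilot and load test still need to confirm.

```python
def vcpu_per_session(host_vcpus, peak_cpu_fraction, active_sessions):
    """vCPU consumed per session at the measured peak."""
    return host_vcpus * peak_cpu_fraction / active_sessions

def required_vcpus(per_session, planned_sessions, headroom):
    """Total vCPU for the planned sessions, including fractional headroom."""
    return per_session * planned_sessions * (1 + headroom)

per_session = vcpu_per_session(16, 0.80, 40)
low = required_vcpus(per_session, 120, 0.15)
high = required_vcpus(per_session, 120, 0.25)
print(f"per session: {per_session:.2f} vCPU")
print(f"target for 120 sessions: {low:.1f} to {high:.1f} vCPU")
```

The result is a total across the farm; how it is split between hosts depends on the scaling choice discussed next.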
Step 2: account for spikes and choose scaling
Login and app-start spikes often matter more than the “steady” day. If the pilot shows short but sharp peaks, adding cores is not the only remedy; distributing users across hosts helps as well.
Horizontal scaling (more nodes) is usually better when peaks are frequent, you need fault tolerance, or apps consume a core per session. Vertical scaling (more CPU per node) fits when management is simpler on a single server and load is relatively even. In integration projects like those GSE.kz runs, this choice is usually confirmed by the pilot and availability requirements.
RAM estimation methodology: base, per-session and headroom
RAM shortages in RDS are often masked as disk problems: users complain everything “waits”, and monitoring shows rising disk queues. In reality the system actively swaps to the pagefile, and even fast SSDs become a bottleneck due to constant small I/O.
Memory is consumed not only by in-session applications. Part is used by the OS itself, file and update cache, drivers, print services, browser engines, and security agents (EDR/antivirus), inventory and monitoring. These “quiet consumers” are particularly noticeable after updates or when new policies are enabled.
Working method: measure the “base” first, then add “per session” and verify at peak.
- Measure RAM usage right after boot and service warm-up (the base).
- Connect 5–10 test users and see how much memory increases (per session).
- Multiply per-session by planned concurrent sessions and add the base.
- Verify the result with a load test in a “peak hour” scenario.
- Leave headroom for growth, updates and heavy days.
It’s better to keep a noticeable buffer so you don’t hit swapping during peaks and trigger a chain of delays. If you plan to expand software or know browsers and messengers will grow, keep a larger buffer.
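The base plus per-session method can be written down as a short calculation. All numbers below are hypothetical placeholders (8 GB base, 20 GB with 10 test users, 120 planned sessions), and the 20% buffer is an assumption, since the right buffer depends on your own growth plans. The sketch applies the buffer to the whole total, base included.

```python
def per_session_from_pilot(base_gb, ram_with_users_gb, test_users):
    """Average RAM added by one session, measured on the pilot."""
    return (ram_with_users_gb - base_gb) / test_users

def required_ram_gb(base_gb, per_session_gb, planned_sessions, buffer):
    """Base + per-session * sessions, increased by a fractional buffer."""
    return (base_gb + per_session_gb * planned_sessions) * (1 + buffer)

# Hypothetical measurements, for illustration only
base = 8.0                     # GB used after boot and service warm-up
per_session = per_session_from_pilot(base, 20.0, 10)
total = required_ram_gb(base, per_session, 120, buffer=0.20)  # 20% is an assumed buffer
print(f"per session: {per_session:.2f} GB, plan for about {total:.0f} GB")
```

If the pilot group is lighter than the heaviest user segment, measure that segment separately and sum the segments instead of using one average.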
Signs of memory pressure are usually obvious: frequent pagefile accesses, increased delays when switching windows, long login time, sudden app crashes and freezes when opening documents. If this repeats, RAM is insufficient, even if CPU still looks “ok”.
User profiles and storage: how not to kill login time
Queues in RDS often start not at the CPU but at login. At login the server reads the user profile: registry branches (NTUSER), app settings, caches, shortcuts, files in AppData. If the profile is on slow storage or full of tiny files, dozens of concurrent logins become a “storm” and users wait for the desktop.
There are three basic approaches for profile storage.
Local profiles on each terminal server usually give fast logins but make it harder to move users between hosts during balancing and failures. Profiles on a file server are easier to manage but increase dependency on the network and the file server’s disks. The third option, profile containers (the profile lives in a container file attached at login), often yields more predictable login time because it reduces the scatter of small files across the network.
For disks, gigabytes matter less than responsiveness. Planning should consider IOPS, average read/write latency, disk queue length and real speed on small blocks. A 300–800 MB profile can load slowly not because of size but due to thousands of small objects.
A separate pain point is antivirus. It likes to scan every tiny file in the profile and login slows sharply. This is especially noticeable in the morning: at 09:00, 120 employees log in at once and the file server sees a burst of read/write operations.
What typically helps reduce the risk of a login “storm”:
- shrink profiles: clean caches, move heavy data out of the profile, restrict unnecessary folders;
- set antivirus exclusions for typical profile/container paths (with InfoSec agreement);
- stagger mass logins: shift schedules, delay auto-start tasks after login;
- keep profiles and user data separate so the desktop appears faster;
- test storage with many small files, not only with MB/s measurements.
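A small-file test can be as simple as writing and re-reading a few hundred 4 KB files and looking at per-file latency instead of MB/s. The sketch below is a rough probe, not a replacement for a proper storage benchmark: the file count and size are assumptions, the reads right after the writes may be served from cache, and the demo runs against a local temporary directory rather than the real profile share.

```python
import os
import statistics
import tempfile
import time

def small_file_benchmark(directory, count=200, size_bytes=4096):
    """Write and re-read `count` small files; return latency stats in milliseconds."""
    payload = os.urandom(size_bytes)
    paths = [os.path.join(directory, f"item_{i:05d}.bin") for i in range(count)]
    write_ms, read_ms = [], []
    for path in paths:
        start = time.perf_counter()
        with open(path, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())  # push the write to storage, not just the cache
        write_ms.append((time.perf_counter() - start) * 1000)
    for path in paths:
        start = time.perf_counter()
        with open(path, "rb") as fh:
            fh.read()
        read_ms.append((time.perf_counter() - start) * 1000)

    def stats(values):
        p95 = statistics.quantiles(values, n=100, method="inclusive")[94]
        return {"avg_ms": statistics.fmean(values), "p95_ms": p95}

    return {"files": count, "write": stats(write_ms), "read": stats(read_ms)}

# Demo on a temporary directory; pass a folder on the profile storage for a real test
with tempfile.TemporaryDirectory() as tmp:
    result = small_file_benchmark(tmp, count=50)
print(result)
```

Running it from several hosts at once gets closer to a morning login burst than a single run does.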
If you purchase hardware for RDS, discuss upfront with the integrator where profiles will live and the expected peak of concurrent logins. In GSE.kz projects this is usually a key point because it determines both user comfort and disk and network requirements.
Ancillary loads: printing, browser, video and network
RDS is often calculated for “core” apps, while queues emerge from surrounding loads: printing, browser, video calls, device redirection. These loads are poorly visible in preliminary calculations but they’re what turn “generally fine” into “it’s slow” at peak times.
Printing and scanning sometimes consume more resources than expected. Print queues grow when drivers vary, there are many printers, and jobs are heavy (PDFs, images, reports). Sometimes the issue isn’t CPU but a hung driver in a session: the user’s session freezes and it looks as if the server itself were at fault, even though the app is light. If printing is heavy, agree on standard drivers in advance and offload printing to a separate print server so failures and queues don’t hit terminal nodes.
Browsers and video calls increase CPU without “heavy” programs. A single user with many tabs, webmail and CRM can consume more than an accounting system. In-browser video calls often spike CPU for encode/decode, especially when sharing a screen.
Is a GPU needed?
A GPU is usually unnecessary for office tasks and thin apps. But it becomes important if many users work with 3D, mapping, heavy dashboards, frequent video conferencing or graphic editors. Practical rule: if tests already show CPU hitting 80–100% due to graphics and video, consider nodes with GPUs or at least check hardware acceleration settings in applications.
For the network between RDS and the rest of the infrastructure, latency matters more than raw bandwidth (“how many gigabits”). For profiles and home folders latency is critical: tens of extra milliseconds can turn login into waiting. The same applies to connectivity to file servers, profile servers and VM storage.
Peripherals and redirection
USB, smartcard, scanner and audio redirection often produce “strange” lags that are hard to link to CPU/RAM load. Check in advance what will be allowed for redirection and where the control point will be.
In short, clarify before scaling:
- how many printers and which drivers, and whether heavy PDF/image printing exists;
- how many users typically use browsers and how many tabs;
- whether there are constant video calls and screen sharing;
- where profiles and files are stored and what the latency to storage is;
- what devices will be redirected into sessions and whether exclusions are needed.
When building RDS on new nodes, it’s useful to assign separate roles from the start (for example, printing separate from terminal servers) and test network latency to file services. For government and large organizations in Kazakhstan such details are better discussed before procurement — later they cost more than extra cores or RAM.
Tests before scaling: what to run before purchasing
Before buying hardware or adding hosts, run a short but honest pilot. It shows better than any formula where the bottleneck is: CPU, RAM, disks, network or profiles. This is crucial if your calculation is tight and you want to avoid login queues and app freezes.
Make a test plan that resembles a real day. Choose a pilot group (for example, 10–20% of users) and a control group (with the same tasks, but in normal mode). Gradually increase concurrent sessions: 25%, 50%, 75%, 100% of planned load, holding each level for 30–60 minutes. Also run a “morning peak” — mass login in 10–15 minutes.
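The step plan can be turned into a concrete schedule before the test day. This is a minimal sketch: the 45-minute hold is an assumed midpoint of the 30–60 minute range, and the 120 planned sessions and 10-minute login window are example values.

```python
def step_schedule(planned_sessions, steps=(0.25, 0.50, 0.75, 1.00), hold_minutes=45):
    """Return one entry per load level: start offset, target sessions, hold time."""
    schedule = []
    start = 0
    for fraction in steps:
        schedule.append({
            "start_min": start,
            "target_sessions": round(planned_sessions * fraction),
            "hold_min": hold_minutes,
        })
        start += hold_minutes
    return schedule

def mass_login_rate(users, window_minutes):
    """Average logins per minute needed to fit all users into the window."""
    return users / window_minutes

plan = step_schedule(120)
for level in plan:
    print(level)
print(f"morning peak: {mass_login_rate(120, 10):.0f} logins per minute")
```

The logins-per-minute figure is useful when sizing profile storage, because it sets how many profiles are read at the same time.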
During the test record not only resource loads but user perception. Minimum metrics:
- login time (until desktop is ready);
- key application start time (first start and subsequent launches);
- input latency (typing in Word/1C, menu responsiveness);
- number of hangs, session drops and profile errors;
- disk queue and read/write latency on storage.
To simulate work, don’t limit testers to “click around”. Use scenarios: open mail, 3–5 browser tabs, a document, print to PDF, work in the accounting system, copy files. A good sign is when testers perform regular tasks from a checklist, not try to “break” the system.
Test profiles and disk separately. For example, run 30–50 sequential logins/logouts with cache clearing, then a mass login. If login time jumps and disk queues grow, the problem is usually profile storage and many small files rather than CPU.
Readiness for scaling is easier to assess by thresholds. Red flags:
- login longer than 60–90 seconds in peak moments;
- noticeable input lag and freezes when opening menus;
- frequent profile errors or “temporary profile” occurrences;
- persistent disk queues and rising latency for the same operations.
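These red flags are easier to apply consistently when they are written as explicit pass/fail checks. In the sketch below only the 90-second login limit comes from the text; the input-lag and disk-latency limits are placeholder assumptions to be replaced with values agreed for your environment, and treating any temporary profile as a flag is a deliberately strict choice. The metric names are hypothetical.

```python
def red_flags(metrics, login_limit_s=90, input_lag_limit_ms=200, disk_latency_limit_ms=20):
    """Return the list of red flags raised by one test run.

    input_lag_limit_ms and disk_latency_limit_ms are placeholder assumptions.
    """
    flags = []
    if metrics["peak_login_s"] > login_limit_s:
        flags.append("login time")
    if metrics["input_lag_ms"] > input_lag_limit_ms:
        flags.append("input lag")
    if metrics["temp_profiles"] > 0 or metrics["profile_errors"] > 0:
        flags.append("profile errors")
    if metrics["disk_latency_ms"] > disk_latency_limit_ms:
        flags.append("disk latency")
    return flags

# Hypothetical results of a 100% load step
run = {
    "peak_login_s": 105,
    "input_lag_ms": 120,
    "temp_profiles": 2,
    "profile_errors": 0,
    "disk_latency_ms": 35,
}
flags = red_flags(run)
print(flags if flags else "no red flags")
```

Recording the flags for each load step (25%, 50%, 75%, 100%) shows at which level the first one appears.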
If the test shows a hardware bottleneck, the configuration discussion becomes concrete: add hosts, move profiles to separate storage or rework roles. In practice pilots are often run on a standard rack server to see behavior under real sessions instead of guessing before procurement.
Monitoring after launch: signs of approaching overload
Calculations matter, but after launch the real picture emerges only in monitoring. With a few clear charts at hand, overload is usually visible days or weeks before queues and complaints begin.
Daily, it’s useful to look at the same metrics for peak hours (morning, after lunch, end of day):
- CPU and processor queue (if CPU isn’t always 100% but queue grows, the node is already strained);
- memory (memory pressure, active usage, rising paging);
- disk and I/O latency (especially where profiles and temp files live);
- logins: login time and number of login errors (including profile failures);
- session density: active and disconnected sessions and their growth over days.
A one-time issue is marked by a spike after an update, a faulty print driver or a network incident that returns to normal the next day. A systemic problem is visible as a trend: each week peak-hour charts rise and the “normal” level degrades. This often happens when the business gradually adds users or applications while capacity planning targeted the old scenario.
It’s time to add a node or resources if peak-hour CPU stays high and the processor queue doesn’t drop, disk latencies keep rising, or average login time worsens week over week. Agree on SLAs with the business: for example, “login no longer than N seconds” and “application responsiveness without hangs for typical operations.”
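A week-over-week trend can be checked with a simple least-squares slope over weekly peak-hour averages. The sketch below uses hypothetical login times; the function names are illustrative, and the point at which a slope becomes a reason to scale is something to agree with the business rather than a fixed rule.

```python
def weekly_slope(values):
    """Least-squares slope: average change per week across the series."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

def worsening_streak(values):
    """Number of consecutive week-over-week increases at the end of the series."""
    streak = 0
    for prev, cur in zip(reversed(values[:-1]), reversed(values[1:])):
        if cur > prev:
            streak += 1
        else:
            break
    return streak

# Hypothetical average peak-hour login time in seconds, one value per week
login_s = [32, 34, 37, 41, 44]
slope = weekly_slope(login_s)
streak = worsening_streak(login_s)
print(f"trend: {slope:+.1f} s per week, worsening for {streak} weeks in a row")
```

The same two functions work for disk latency or processor queue length, as long as each value is taken from the same peak window every week.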
For incident triage enable logging of login events, profile load, application failures and critical service events. Then instead of arguments like “everything is fine” you’ll have facts: when it started, how many users were affected, and at which point the infrastructure hit its limit.
Typical mistakes in RDS calculations and how to avoid them
The most common cause of queues in RDS is simple: the calculation was based on the “average across the board.” The system looks calm until peak hour, mass login or heavy task start.
Mistakes that most often lead to overload
Below are frequent slip-ups and what to do instead.
- Looking at average CPU load (for example, 30%) and feeling assured. Better to focus on 95–99th percentiles and processor queue length: if CPU hits 90–100% even briefly in peak, users will feel it as a freeze.
- Putting “everyone” on one node: accounting, call center, engineers with heavy apps. Separate load profiles by collections or nodes and set limits (CPU/RAM) for the most demanding scenarios.
- Planning resources “as is” but forgetting growth: Windows/Office updates, antivirus, EDR agents, new policies, browser plugins. Include headroom and verify changes on a test node after each major update.
- Storing profiles on slow or overloaded file storage. This causes long logins and “black screens” after login. Check IOPS and latencies, separate profiles and shared folders, monitor the number of small files and antivirus scanning.
- Running a test on 5 users and multiplying by 100. That doesn’t work: disk and network contention grow, file locks appear, print queues form. You need tests at least at 20–30% of real load with similar scenarios.
Example: 200 operators work in browser and CRM, but on Monday at 9:00 everyone logs in almost simultaneously. If you measured only “daily average”, you’ll miss the login bottleneck: profiles, GPO, antivirus and disk.
How not to be wrong in practice
Agree beforehand on success metrics and measure not only session counts but activity:
- login time and time to desktop;
- peak CPU/RAM per host and per process;
- disk latency and peak IOPS (especially for profiles);
- network peaks (updates, printing, file operations).
If these numbers are stable in peak and in “mass login” and “heavy task” scenarios, the likelihood of queues after scaling is much lower.
Short checklist and next steps
When you think everything is calculated, small things often turn into queues: an extra 2–3 seconds on profile load, a sudden print job at shift end, or one heavy user taking CPU from everyone.
Before finalizing calculations and moving to procurement, run through the checklist:
- users and peaks: concurrent sessions average and maximum, apps used by each group, seasonal spikes;
- CPU and RAM: per-session estimates, check during peak hours, plus headroom for background services (antivirus, updates, agents);
- profiles and storage: where profiles live, their size, how fast they load (latency matters more than disk capacity);
- disks and latency: is there enough performance for profiles and temp files, are there drops during login;
- test plan: which scenarios you will run and thresholds for pass/fail (login time, app responsiveness, stability at peak).
Once every item on this list is covered, the calculation becomes noticeably more accurate and easier to defend to the business.
Next steps
A practical path usually looks like this:
- run a pilot on 1–2 nodes with real users (or a load that closely matches reality);
- set success criteria: e.g., login within N seconds, no hangs in typical apps, headroom in peak CPU/RAM;
- after the pilot refine per-user limits and recalculate farm capacity;
- scale gradually, adding nodes and repeating peak tests.
If you need help selecting servers and deploying RDS, involve an integrator that covers both hardware and project work. GSE.kz as a manufacturer and system integrator in Kazakhstan typically joins at the pilot stage: helping choose configurations, plan profile storage and support the infrastructure in operation (including 24/7).
FAQ
Why does RDS slow down even though the "hardware doesn't look at 100%"?
Most often it's waiting for a shared resource, not just “not enough CPU”. Typical causes:
- slow storage for profiles and temporary files (long login, “black screen”);
- memory pressure and swapping to the pagefile (disk latency rises);
- CPU spikes during mass login and application startup;
- printing and problematic drivers (a session freezes and it looks as if the server itself were hanging);
- network latency to the file/profile server, or packet loss.
How do you quickly determine whether the bottleneck is CPU, RAM, disk or network?
First, look for where queues accumulate:
- **CPU**: *Processor Queue Length* grows and stays high for minutes.
- **RAM**: active paging/frequent pagefile activity; applications freeze when switching.
- **Disk**: read/write latency and disk queue length increase, especially during logins.
- **Network**: logins/profiles "drag", there are losses/errors, high latency to file services.
Practice: compare metrics during the peak (mass login, starting 1C/browser, printing), not the daily average.
How many users can you host on one RDS host?
Focus on **concurrent active sessions** in peak windows, not total employees. A minimal approach:
- identify time peaks (morning, after lunch, end of day, month-end);
- group users into 2–4 behavior profiles (browser/thick client/heavy reports/video);
- allow 15–25% spare capacity for growth and background agents.
If unsure, start with a pilot of 10–20 users and collect metrics; that’s more accurate than any fixed table.
How to estimate CPU per user without guessing?
Use the pilot peak and calculate “vCPU per session”. Example:
- host has **16 vCPU**, peak **80%** CPU;
- active sessions: **40**.
Estimate: `16 × 0.8 / 40 = 0.32 vCPU per session`. Multiply by the planned number of sessions and add a buffer (usually 15–25%). Separately test the morning mass login, since it often produces the worst spikes.
How to correctly calculate RAM for RDS?
A reliable scheme:
1) measure the RAM “base” after boot and service warm-up;
2) connect 5–10 test users and measure the additional RAM per session;
3) multiply the per-session increase by the planned concurrent sessions and add the base;
4) add a buffer for updates, antivirus/EDR, cache and heavy days.
Red flag: during peak you see active paging and simultaneously rising disk latency. In that case memory is already insufficient even if CPU looks fine.
Where is it best to store profiles so logins don't turn into a queue?
Common options:
- **Local profiles on each terminal server**: fast login, but harder to move users during balancing or failures.
- **Profiles on a file server**: easier centralized management, but you depend on network and file server disks.
- **Profile containers**: often give more predictable login times because they reduce the scatter of many small files over the network.
Rule of thumb: login times are usually killed not by gigabytes but by **thousands of small files** and high I/O latency on storage.
Why does antivirus suddenly make RDS slow and what to do about it?
Antivirus can slow down logins and profile operations by scanning many small files. Common mitigations (with InfoSec agreement):
- add exclusions for typical profile/container paths and temp folders;
- avoid heavy scans during mass login periods;
- clean caches and reduce profile size;
- pay attention to the file server that holds profiles, because it will also be stressed by scanning.
When dozens of users log in simultaneously in the morning, antivirus scanning often becomes the main source of delays.
What to do if RDS "dies" during mass printing?
Printing often causes weird hangs due to drivers and queues, not CPU/RAM. Practical actions:
- standardize drivers and reduce their variety;
- if possible, move printing to a separate print server;
- check whether drivers hang inside sessions (then one user’s session freezes and it looks like a server-wide problem);
- test mass printing scenarios separately (PDFs, images, reports).
If printing is critical, treat it as a separate load and test it independently.
Which tests to run before buying servers for RDS?
The test should mimic a real day, not just idle clicks. Minimum plan:
- step load: 25% → 50% → 75% → 100%, hold each level 30–60 minutes;
- separate “morning” scenario: mass login within 10–15 minutes;
- metrics: login time, key app startup, input latency, profile errors, disk latency.
Red flags: persistent login times over 60–90 seconds in peak, constant disk queues, noticeable input lag during typical actions.
What signs in monitoring show that queues and complaints are coming?
Keep a few simple peak-focused charts:
- CPU and processor queue;
- memory pressure and paging indicators;
- disk I/O latency (especially where profiles and temp files live);
- login time and profile load errors;
- number of active/disconnected sessions and weekly trends.
It’s time to scale if in peak the processor queue stays high, disk latency keeps rising weekly, or average login time worsens by trend. These signs usually appear before a flood of complaints.