QoS for voice and video: configuration and verification in the network
QoS for voice and video: practical settings for marking, queues and shaping, plus a measurement method to compare quality before and after deployment.

What breaks in calls and video meetings and why
The most common complaints about voice and video sound similar: “the other person sounds like a robot,” “words drop out,” “echo,” “video turns into blocks,” “the picture freezes,” “sometimes you can hear them, sometimes not,” “the call drops when screen sharing starts.” A speed test may still show good numbers, yet the issue persists.
Most often the problem isn't a “small pipe” but how traffic goes through queues in network devices. When congestion appears somewhere (on the WAN, a switch uplink, Wi‑Fi, or a VPN gateway), packets start to pile up. Voice and video need predictability more than raw bandwidth.
There are three things to distinguish:
- Latency: how long a packet takes from you to the other side.
- Jitter: how much that latency varies from packet to packet.
- Loss: how many packets never arrive.
Even small loss and jitter are more noticeable than “a few fewer megabits.” Voice can mask occasional losses, but a series of losses causes gaps and robotic speech. Interactive video is just as sensitive to timing: if frames don't arrive in time, artifacts and freezes appear quickly.
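As a rough illustration of how the three metrics are derived from per-packet data, here is a minimal sketch. The `Packet` record and its field names are assumptions made for the example; the jitter estimate follows the smoothed interarrival formula in the style of RFC 3550, and one-way latency is only meaningful if sender and receiver clocks are synchronized.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    seq: int        # sender sequence number (hypothetical field name)
    sent_ms: float  # send time on the sender clock, ms
    recv_ms: float  # arrival time on the receiver clock, ms

def mean_one_way_latency(packets):
    # Assumes both clocks are synchronized; otherwise measure round trip instead.
    return sum(p.recv_ms - p.sent_ms for p in packets) / len(packets)

def interarrival_jitter(packets):
    # Smoothed estimate with gain 1/16, in the style of RFC 3550.
    jitter = 0.0
    prev = None
    for p in packets:
        if prev is not None:
            d = (p.recv_ms - prev.recv_ms) - (p.sent_ms - prev.sent_ms)
            jitter += (abs(d) - jitter) / 16.0
        prev = p
    return jitter

def loss_percent(packets, expected):
    # expected = number of packets the sender actually transmitted
    received = len({p.seq for p in packets})
    return 100.0 * (expected - received) / expected
```

A stream with constant delay yields zero jitter regardless of how large the delay is, which is why latency and jitter have to be tracked separately.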
Without priority, voice and video compete with any “heavy” traffic: file uploads, updates, backups, cloud sync. Example: accounting starts a backup to NAS, a switch uplink fills up, the queue grows, and the director's call becomes a stream of fragments.
QoS for voice and video doesn't "speed up the Internet" — it defines what is important, how to mark traffic and which queue to put it in so that, under congestion, conversation packets pass first, not large data transfers.
Which metrics to set as targets and what to measure
For queues and prioritization to actually work, agree in advance what numbers you call “good” and how you measure them. Otherwise you'll set rules and later won't be able to prove whether things improved.
Key metrics for voice and video
Latency, jitter and loss are almost always the crucial metrics for calls and meetings. They directly translate into robotic audio, cut phrases and pixelation.
Vendor‑neutral guidance:
- Latency (one‑way): for voice aim for up to 150 ms; for video often tolerable up to 200 ms.
- Jitter: for voice prefer up to 20–30 ms; for video up to 30–50 ms.
- Loss: for voice strive for 0–1%; video shows problems already around 1% (especially in spikes).
Useful additional indicators: retransmissions (when available) and short queue spikes. Often the egress queue on an interface is the main source of problems.
These aren't magic numbers; what matters more is that the values stay steady and don't spike.
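To make the targets checkable rather than just quotable, they can be written down as data. The sketch below uses the upper bounds from the guidance above; the dictionary layout and the `violations` helper are illustrative names, not part of any tool.

```python
# Upper bounds taken from the guidance above; tune them to your own agreement.
TARGETS = {
    "voice": {"latency_ms": 150, "jitter_ms": 30, "loss_pct": 1.0},
    "video": {"latency_ms": 200, "jitter_ms": 50, "loss_pct": 1.0},
}

def violations(kind, measured):
    # Returns the names of metrics that exceed the agreed target for this class.
    limits = TARGETS[kind]
    return sorted(name for name, limit in limits.items() if measured[name] > limit)
```

For example, a sample with 45 ms jitter fails the voice target but still passes the video one.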
Why stability matters more than the mean
An average latency of 30 ms looks great, but if once a minute there's a spike to 300 ms, people will still talk over each other and ask to repeat. So look at p95, maximum and frequency of spikes, not just the average.
A typical case: during large file copies the average loss is near zero, but every 10–20 seconds there is a short spike of loss and jitter. Users complain exactly at those moments even though “on average everything's fine.”
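A small sketch of why the mean hides the problem: the same series is summarized by mean, p95, maximum and spike count. The nearest-rank percentile and the 150 ms spike threshold are choices made for this example.

```python
import math
from statistics import mean

def percentile(values, pct):
    # Nearest-rank percentile; pct is an integer such as 95 or 99.
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def summarize(latencies_ms, spike_threshold_ms=150.0):
    return {
        "mean": mean(latencies_ms),
        "p95": percentile(latencies_ms, 95),
        "max": max(latencies_ms),
        "spikes": sum(1 for v in latencies_ms if v > spike_threshold_ms),
    }
```

With 94 samples at 30 ms and 6 at 300 ms the mean stays near 46 ms while p95 is already 300 ms. With a single 300 ms sample in 100, even p95 stays at 30 ms, which is why the maximum and the spike count are worth keeping alongside it.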
Two to three metrics to watch constantly
To avoid drowning in charts, keep a minimal set that gives a clear picture:
- jitter and loss for voice traffic;
- latency (and p95) on the key WAN interface or internet edge;
- queue length and drops on egress interfaces where congestion is most likely.
This makes it easier to tell whether prioritization helped and where quality is breaking.
Inventory: where voice and video travel in the network
Before configuring QoS, map where traffic appears and which devices it actually traverses. Otherwise you'll configure QoS “across the network” while the issue remains on one overloaded switch, a WAN link, or in Wi‑Fi.
Start with sources. Voice and video are not only IP phones: they include softphones on laptops, meeting room systems, mobile clients on corporate Wi‑Fi, and sometimes server components (gateways, SBCs, recording, VDI). For each source note the connection type (wired, Wi‑Fi, VPN) and the first hop (access switch, Wi‑Fi controller, VPN gateway).
Then record directions: inside the office, between branches, to the data center, to cloud, to the Internet. The same user can call a colleague in the next room or an external partner, and those paths load the network differently.
A quick path and bottleneck map can often be built in 1–2 hours:
- mark critical zones (call center, meeting rooms, reception, classrooms);
- find switch ports where phones and room systems connect (MAC/LLDP, port descriptions);
- list places where speed can “dip”: 1G uplinks, overloaded inter‑VLAN interfaces, WAN, VPN, Wi‑Fi;
- capture current utilization and errors on these interfaces (peaks, drops, queue stats);
- describe 2–3 typical call and meeting scenarios and their routes.
Separate flows. Media (RTP/SRTP) matters most for quality: it's sensitive to latency and loss. Signaling (SIP, HTTPS for cloud services) is usually lighter in volume but its failure looks like “calls don't connect.”
Practical example: video meetings between a branch and the data center go over VPN, while IP phone calls go via MPLS. If you only look at “telephony,” you might miss a hotspot on the VPN gateway where video loses priority and degrades during evening backups.
Marking traffic: DSCP and 802.1p without confusion
DSCP and 802.1p/CoS are ways to tag packets so switches and routers know what to forward first. DSCP is in the IP header (L3) and 802.1p/CoS is in the VLAN tag (L2). It's important not only to pick the right values but to ensure three things: marks are applied in one place, preserved along the path and interpreted the same way by all devices.
Agree on a short set of classes and apply them consistently on endpoints, access points, switches and gateways. A typical minimum:
- voice (RTP): DSCP EF (46), CoS 5
- video (meetings/video RTP): DSCP AF41 (34), CoS 4
- signaling (SIP/H.323): DSCP CS3 (24), CoS 3
- "other": DSCP 0, CoS 0
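The class table above can be kept as a single source of truth and checked against the standard code point arithmetic. In this sketch the dictionary layout and helper names are illustrative; the formulas (AFxy = 8x + 2y, CSn = 8n, DSCP in the upper six bits of the ToS / Traffic Class byte) are the standard DiffServ encoding.

```python
CLASSES = {
    "voice":     {"dscp": 46, "cos": 5},  # EF
    "video":     {"dscp": 34, "cos": 4},  # AF41
    "signaling": {"dscp": 24, "cos": 3},  # CS3
    "other":     {"dscp": 0,  "cos": 0},  # best effort
}

def af(cls, drop):
    # Assured Forwarding AFxy: class x (1-4), drop precedence y (1-3)
    return 8 * cls + 2 * drop

def cs(n):
    # Class Selector CSn
    return 8 * n

def tos_byte(dscp):
    # DSCP sits in the upper six bits; the lower two bits are ECN.
    return dscp << 2
```

The `tos_byte` value is what shows up as the raw ToS field in many packet captures: 184 (0xB8) for EF and 136 (0x88) for AF41.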
Where to apply markings
Ideally the source marks packets: an IP phone or video app if it supports it. But some clients mislabel or intentionally raise priority. In practice, marking is often applied or corrected at the access port (edge): the switch or Wi‑Fi controller classifies by VLAN, port, ACL or application recognition (for example, NBAR on Cisco devices) and then sets DSCP/CoS.
On the WAN gateway (or border router) markings are usually verified and remapped before traffic exits the site. That's where queues and rate limits typically start.
Trust boundary
The trust boundary is the point after which you "trust" the marking and stop changing it. A good rule: trust managed devices (IP phones, corporate video endpoints) and re‑mark everything else at the edge.
Simple rules to avoid losing marks:
- on ports to phones — trust the phone, but not a PC behind the phone (PC‑port);
- on user ports — don't trust, classify and mark on the switch;
- on trunks — preserve CoS and map CoS <-> DSCP where needed;
- on Wi‑Fi/VPN — ensure DSCP is not zeroed during encapsulation;
- on the WAN edge — use a consistent class table and the same DSCP values in both directions.
Example: if a video client on a laptop sets DSCP 46 (voice), don't trust it. Re‑mark at access to AF41 and give it priority, but not so much that it pushes out speech.
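The trust rules can be sketched as a small decision function. The port role names and the idea that the switch produces its own classification (`classified_as`) are assumptions for the example; real devices express this through trust settings and class maps.

```python
# Illustrative class table and port roles; not a vendor configuration.
CLASS_DSCP = {"voice": 46, "video": 34, "signaling": 24, "other": 0}
TRUSTED_PORT_ROLES = {"phone", "video_endpoint"}

def egress_dscp(port_role, received_dscp, classified_as):
    # Trusted ports keep the marking set by the managed device.
    if port_role in TRUSTED_PORT_ROLES:
        return received_dscp
    # Everything else is re-marked from the switch's own classification,
    # regardless of what the client claimed.
    return CLASS_DSCP[classified_as]
```

The laptop from the example above sends DSCP 46 on a user port, is classified as video by the switch, and leaves with 34.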
Queues and schedulers: how to actually give priority
QoS queues usually take effect on egress, not ingress. On egress the device decides what to send now and can order packets. On ingress packets have already arrived — if the port is saturated, you can only buffer or drop.
The idea is simple: voice needs minimal jitter, video needs to avoid loss, and everything else can wait a bit longer. A common approach combines a strict priority queue for the most sensitive traffic and weighted queues for the rest.
A basic queue layout that usually works
This isn't the only correct setup, but it's a clear starting point:
- strict priority queue (LLQ) for voice — serviced first but with a hard bandwidth cap;
- separate queue for video — guaranteed share plus an upper limit;
- queue for critical business apps — weighted, without absolute priority;
- "everything else" — lowest priority but not zero.
Don't blindly reserve a fixed chunk for voice and video. If you give video too much, it will crowd out applications. If voice gets unlimited priority, any mis‑marked flow can masquerade as voice and dominate the link.
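To see how a capped priority queue and weighted queues interact, here is a simplified bandwidth allocation model. It is a sketch, not a packet scheduler: the function name, the kbps units and the weights are assumptions, and real LLQ implementations typically enforce the cap only under congestion.

```python
def allocate(link_kbps, demand, priority_class, priority_cap_kbps, weights):
    alloc = {name: 0.0 for name in demand}
    # Strict priority: served first, but never above its cap.
    alloc[priority_class] = float(min(demand[priority_class], priority_cap_kbps, link_kbps))
    remaining = link_kbps - alloc[priority_class]
    active = {n for n in weights if demand[n] > 0}
    # Weighted sharing; bandwidth a class does not need is redistributed.
    while active and remaining > 1e-9:
        total_w = sum(weights[n] for n in active)
        satisfied, spent = set(), 0.0
        for n in active:
            share = remaining * weights[n] / total_w
            need = demand[n] - alloc[n]
            give = min(share, need)
            alloc[n] += give
            spent += give
            if give >= need - 1e-9:
                satisfied.add(n)
        remaining -= spent
        if not satisfied:
            break
        active -= satisfied
    return alloc
```

With a 10 Mbps link and a 1 Mbps cap, a mis-marked 50 Mbps flow in the voice class is held to 1 Mbps, and video and business traffic still receive what they asked for.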
Surviving burst traffic with simple tools
Bursts occur when traffic comes in spikes (updates, backups, large attachments). Two basic tools often deliver quick benefits:
- Shaping: smooths bursts by spreading sending over time. Useful on the WAN where the interface is faster than the actual link.
- Policing: strictly limits a flow. More useful on ingress where a scheduler can't help.
WRED is “intelligent early dropping”: when a queue fills, the device drops some lower‑priority packets proactively to avoid full congestion and mass loss. This helps background flows slow down and leaves room for voice and video.
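The early-drop behavior can be illustrated with the classic RED drop curve: no drops below a minimum threshold, a linear ramp up to a maximum probability, and drop-everything above the maximum threshold. The threshold values in the usage note are illustrative; actual profiles and defaults are vendor-specific.

```python
def wred_drop_probability(avg_queue, min_th, max_th, max_p):
    # avg_queue: smoothed queue depth in packets (the averaging itself is not shown)
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)
```

Giving the background class a lower `min_th` than the video class means that at the same queue depth background packets are already being dropped while video packets are not, which is the "weighted" part of WRED.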
WAN and internet links: shaping and policing
In LANs, bottlenecks are often switch ports, and prioritization there works predictably. On the WAN it's more complex: the link is limited and the provider may apply their own rules. Latency, jitter and loss often appear at the internet or MPLS edge.
The main idea: it's better for queuing and dropping to happen on your router (where you control priorities) than at the provider (where voice/video is just another flow). Therefore, apply shaping on the WAN egress — limit the rate slightly below the real link speed.
Practical speed setup:
- find the contractual and actual speed (at different times of day);
- set the shaper to 90–95% of the guaranteed speed so you don't fall into provider policing;
- apply queues and priorities inside the shaped rate (for example, a queuing policy nested under the shaper) so voice/video leave first under congestion;
- check settings separately for upload and download (often upload is the issue).
Hard policing (especially with instant cuts) is dangerous for voice and video. It tends to drop packets in bursts, increasing jitter and creating robotic audio and video artifacts. If policing is needed (e.g., to limit guest traffic or hungry backups), apply it to low‑importance classes and keep critical traffic under shaping.
Another trap is tunnel overhead. PPPoE, VLAN tags, GRE/IPsec, L2TP and other tunnels reduce usable bandwidth and change frame sizes. If you configure the shaper to the nominal line rate, it may exceed the provider's real capacity and cause provider‑side drops. Allow 5–10% headroom and verify MTU/MSS when VPNs are active.
Example: a branch has 50 Mbps but all traffic goes over IPsec. If you set shaping to exactly 50 Mbps, the provider may start dropping and video will feel jitter first. Shaping at 45–47 Mbps plus priority queues usually yields steadier quality.
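The arithmetic behind that example is simple enough to script. In this sketch the 90–95% range comes from the guidance above, while the per-packet overhead figure passed to `tunnel_efficiency` is purely illustrative: real overhead depends on the tunnel type, cipher and mode.

```python
def shaper_rate_range(guaranteed_mbps, low=0.90, high=0.95):
    # Returns the (lower, upper) shaper rate for the given guaranteed speed.
    return guaranteed_mbps * low, guaranteed_mbps * high

def tunnel_efficiency(packet_bytes, overhead_bytes):
    # Share of the on-wire bytes that is the original packet.
    return packet_bytes / (packet_bytes + overhead_bytes)
```

For a 50 Mbps link this gives 45 to 47.5 Mbps. The efficiency helper also shows that small voice packets lose proportionally more to tunnel overhead than large data packets, which is one reason to measure rather than assume.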
Wi‑Fi and VPN: how to keep priority on the last mile
Even with QoS set on switches and WAN, quality often collapses on the last mile: Wi‑Fi and VPN. Priority is easily lost there and jitter and delay grow in bursts.
On Wi‑Fi it's critical to have WMM (Wi‑Fi Multimedia) — the mechanism for queues and priority on the wireless side. If WMM is disabled or misconfigured, voice and video compete with downloads equally: one laptop can start a download and cause pauses, robotic audio and smeared video on the call.
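One detail worth checking alongside WMM itself is how DSCP is translated to a WMM access category. The sketch below assumes a common default, where the user priority is taken from the three most significant DSCP bits; actual mapping is vendor- and configuration-specific, and the `overrides` parameter is a hypothetical stand-in for a custom mapping table.

```python
# WMM access categories by 802.1D user priority (UP).
UP_TO_AC = {
    1: "AC_BK", 2: "AC_BK",
    0: "AC_BE", 3: "AC_BE",
    4: "AC_VI", 5: "AC_VI",
    6: "AC_VO", 7: "AC_VO",
}

def default_up_from_dscp(dscp):
    # Common default: use the top three DSCP bits as the user priority.
    return dscp >> 3

def access_category(dscp, overrides=None):
    up = (overrides or {}).get(dscp, default_up_from_dscp(dscp))
    return UP_TO_AC[up]
```

Under this default, EF (46) maps to UP 5 and lands in the video category rather than voice; an explicit override such as `{46: 6}` moves it to AC_VO. It is worth confirming which behavior your controller actually applies.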
Wi‑Fi: roaming and AP overload
Jitter comes from both the link and the radio environment. Roaming between APs can take time and a heavily loaded AP builds queues so packets arrive in bursts. Typical scenario: a meeting room with 15 people in a video conference plus a couple of guests updating devices in the background.
What to check on the controller or AP:
- is WMM enabled on the SSID used by call participants?
- how many clients per AP and what's the real airtime/utilization?
- are there frequent roam events during a call?
- are there high retry counts due to interference or weak signal?
VPN: where DSCP disappears and how to check
In VPNs priority often vanishes in two places: during encapsulation (when the outer header doesn't carry the marking) and at network borders (where a provider or device zeroes DSCP). Sometimes marks exist inside the tunnel but are not set on the outer header, so traffic becomes best‑effort beyond the tunnel.
A simple test without a lab: start a test call, generate background load, and capture traffic before entering the VPN and after exiting. Compare DSCP values on RTP/video packets and on the outer tunnel header. If marks disappear, configure copy/remap on the VPN gateway and queues on the egress interface.
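Once you have captures from both sides, the comparison itself can be automated. This sketch reads DSCP from a raw IPv4 header and reports flows whose marking changed; the flow identifiers and the dictionary format are assumptions, and extracting headers from a capture file is left to whatever tool you already use.

```python
def ipv4_dscp(header: bytes) -> int:
    # Byte 0: version/IHL, byte 1: DSCP (upper 6 bits) + ECN (lower 2 bits).
    if len(header) < 20 or header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    return header[1] >> 2

def changed_markings(before, after):
    # before/after: dict of flow id -> DSCP seen at each capture point.
    # Returns flows whose DSCP differs (None means the flow was not seen).
    return {
        flow: (dscp, after.get(flow))
        for flow, dscp in before.items()
        if after.get(flow) != dscp
    }
```

Run the same comparison on the outer tunnel header as well: a flow can keep its inner DSCP and still travel as best effort if the outer header carries 0.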
For remote users, know your control boundaries. You control the VPN gateway, corporate Wi‑Fi and edge router policies. You can't change an employee's home router, but you can reduce risk: recommend 5 GHz, limit background sync over corporate VPN during work hours, and ensure corporate PCs and thin clients set correct markings on outbound traffic.
Step‑by‑step plan to roll out QoS in a real network
How to proceed step by step
QoS for voice and video must preserve quality at peak times when the network is truly loaded. So build the plan around bottlenecks, not around “a little bit everywhere.”
- Define traffic classes. Usually 3–4 classes suffice: voice, video, critical business apps, and everything else. Classify by what you can reliably use: phone VLAN/subnet, media server IPs, signaling ports or tags from the UC client.
- Document trust boundaries. Don't trust arbitrary DSCP from user PCs at the access. Typical approach: trust IP phones and video endpoints, re‑mark PC traffic at the first switch or Wi‑Fi controller.
- Prioritize where queues form. In most networks this is the WAN/internet egress and inter‑segment uplinks. Configure queues on those interfaces: give voice strict priority with a limit, place video in a guaranteed queue, and leave the rest as best effort.
- Keep background under control. Schedule or limit updates, backups and sync during work hours, especially on the WAN.
A practical order of work that usually doesn't take months:
- lock down classes and rules to identify voice and video;
- configure marking and trust boundaries from access to WAN;
- enable queues on the busiest egress interfaces and verify priority doesn't starve other traffic;
- throttle background transfers during business hours;
- enable monitoring of QoS counters and start with a pilot (one office or floor).
After the pilot, compare quality on the same scenarios: 2–3 concurrent calls, a video meeting and simultaneous background load. If counters show traffic moving into priority queues and complaints drop, scale the settings by template to other sites.
Measurement methodology before and after: how to prove the effect
The benefit of QoS is best seen in numbers. First capture a baseline: the same metrics, the same measurement points and identical conditions. Then repeat after changes.
Baseline: how to measure "before"
Collect data as a series, not a single sample, otherwise you may compare a "good day" to a "bad day."
- pick 2–3 typical periods: morning peak, midday load, late hour;
- gather 3–5 working days in a row to check repeatability;
- record WAN utilization, drops at the edge, and number of active calls;
- note current policies: markings, queues, shaping and where they are applied;
- define measurement points: access switch, WAN edge, Wi‑Fi controller (if present).
Then run active tests: at the same time start a 10‑minute video call and a controlled background load (e.g., copying a large file). Repeat the scenario several times to see when degradation begins.
Collect passive data from UC reports (MOS/quality, jitter, loss) and from network stats: interface drops, queue fill, policy‑map/class‑map counters (or equivalents).
Comparing "before/after" without self‑deception
Good report artifacts are simple and comparable:
- graphs of latency, jitter and loss by hour;
- p95/p99 tables (not only averages) before and after;
- counter snapshots: drops per class and queue fill levels;
- comparison with link load to show improvement is not just because the link became less busy.
To separate prioritization effects from other changes, don't change routing or capacity at the same time; record path and use identical test scenarios. If other changes are unavoidable, log them and compare only matching load intervals.
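One way to compare only matching load intervals is to bucket samples by link utilization and compare p95 within each bucket. The sketch below assumes samples are `(link_load_percent, jitter_ms)` pairs and uses 20% wide buckets; both are choices made for the example.

```python
import math
from collections import defaultdict

def p95(values):
    # Nearest-rank 95th percentile.
    ordered = sorted(values)
    return ordered[max(1, math.ceil(95 * len(ordered) / 100)) - 1]

def by_load_bin(samples, bin_width=20):
    bins = defaultdict(list)
    last_bin = 100 // bin_width - 1
    for load_pct, value in samples:
        bins[min(int(load_pct // bin_width), last_bin)].append(value)
    return bins

def compare(before, after, bin_width=20):
    # Only load ranges present in both series are compared.
    b, a = by_load_bin(before, bin_width), by_load_bin(after, bin_width)
    return {
        f"{k * bin_width}-{(k + 1) * bin_width}%": (p95(b[k]), p95(a[k]))
        for k in sorted(b.keys() & a.keys())
    }
```

Buckets that exist only in one series are skipped, so a quiet "after" week cannot masquerade as an improvement at peak load.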
Common QoS mistakes and how to find them quickly
A frequent situation: packets are marked but quality doesn't improve. Usually the issue isn't DSCP/802.1p itself but that the marking never becomes a real priority in the egress queues.
Marks exist but there's no priority
DSCP or 802.1p may be set correctly, but the border router or switch lacks a class‑to‑queue binding. Then voice and video end up in best effort with downloads. Quick check: look at class counters on the egress interface. If voice/video counters don't grow or are in the same queue as Best Effort, there's no real priority.
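The counter check can be reduced to comparing two snapshots taken during a test call. The class names and the snapshot format here are hypothetical; how you obtain per-class packet counters (CLI, SNMP, telemetry) depends on the platform.

```python
def classes_without_traffic(start, end, expected_classes):
    # start/end: dict of class name -> matched packet counter at two moments.
    # A class whose counter did not grow during the call is not matching traffic.
    return [
        name for name in expected_classes
        if end.get(name, 0) - start.get(name, 0) <= 0
    ]
```

If this returns `["video"]` while a video meeting is running, the marking is probably not being matched on that interface and the stream is going out in the default class.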
Wrong trust boundary
If you trust markings on user ports, any PC can claim priority and push out calls. In practice trust is placed close to voice sources (IP phones, SBC, MCU) or on ports where a phone is in front of a PC and the switch can distinguish their traffic.
Priority queue misconfigured
Errors go both ways: the priority queue can be too small (voice drops under load) or too large (voice starves others). Signs: the priority class shows drops at moderate load, or other classes are constantly starved.
Small example: video meetings fail only when someone downloads a file to a branch. Video was marked but the WAN egress had no queue for that class and only voice got priority.
Quality drops only in one direction
Sometimes QoS is applied almost everywhere except egress on one link (a particular access switch or one WAN router). Then audio is fine in one direction but broken on the return path.
Quick checks that often find the issue in 15 minutes:
- compare QoS policies and counters on both ends of the problematic link;
- ensure the class is applied on egress, not only ingress;
- check drops per queue and for microbursts (if supported).
Policing cuts RTP
Policing on the WAN can cut short RTP/video bursts and cause losses even when average link use is below thresholds. Sign: losses rise in a specific class while overall link load is below the limit. Usually a softer shaping, correct burst parameters and matching rates to the provider's real capacity help.
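A single-rate token bucket shows why this happens. The sketch below is a simplified policer model with illustrative numbers: bursts of 20 packets of 1200 bytes once per second average only 192 kbps, yet a 1 Mbps policer with a small burst allowance still drops most of each burst.

```python
def police(packets, rate_bps, burst_bytes):
    # packets: list of (time_s, size_bytes), sorted by time.
    # Returns one boolean per packet: True = forwarded, False = dropped.
    tokens = float(burst_bytes)
    last = packets[0][0] if packets else 0.0
    verdicts = []
    for t, size in packets:
        # Refill at the configured rate, never above the bucket depth.
        tokens = min(burst_bytes, tokens + (t - last) * rate_bps / 8.0)
        last = t
        if size <= tokens:
            tokens -= size
            verdicts.append(True)
        else:
            verdicts.append(False)
    return verdicts
```

With an 8000-byte bucket only 6 of every 20 packets pass; with a 32000-byte bucket the same traffic passes untouched. That is the effect of "correct burst parameters" in practice, and a shaper would instead delay the excess packets rather than drop them.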
In large organizations keep a “QoS map”: where markings are trusted, where queues are enabled, and where shaping is applied. That helps quickly find the one node that deviates from the design.
Quick checklist and next steps
If quality improved after changes, lock the result and look for remaining weak spots. This short checklist is handy to run 30–60 minutes after any network changes.
Quick checklist
During a real call or test video session at peak time verify:
- markings are present: voice and video exit endpoints with correct DSCP and switches show proper 802.1p (if used);
- queues are active: on the WAN device and key switches the priority queue is served first without starving others;
- no loss of priority: voice and video traffic isn't falling into the default class due to a wrong match or rewrite;
- buffers under control: no constant queue overfill on narrow links and uplinks;
- numbers improve: latency, jitter and loss for voice and video are better than before and remain stable.
If something doesn't add up, trace the path: phone or PC, then access switch, core, border router and provider edge. Marks are most often lost at domain boundaries (office to branch, before VPN, in Wi‑Fi) and queues accumulate at the narrowest point where there's no shaping.
How to document results
To get changes approved and keep a record, document:
- "before/after": 3–5 measurements under identical conditions (peak hour, same path);
- snapshots of key counters: queues, drops, utilization on narrow links;
- what was configured: classes, policies, where marks are set and trusted.
Then it's usually clear what to do next: turn the QoS profile into a standard (templates for switches, Wi‑Fi, WAN), roll it out to branches and enforce “don't change transit markings without reason.”
If you need help with audit, deployment and support, consider engaging a systems integrator. For example, GSE.kz as a manufacturer and integrator helps build infrastructure for UC and video: from the network layer to selecting servers and workstations for load, with continued support.
FAQ
Why is the speed test normal, but calls still sound robotic and fall apart?
Most often the issue is queues and congestion on a specific network segment, not a "small link." When voice and video packets end up in a general queue alongside downloads and backups, you get spikes in latency, jitter and loss, even though a speed test may look "good."
What latency, jitter and loss values are normal for UC?
For calls the most important metrics are one‑way latency, jitter and packet loss. As practical targets, aim for one‑way latency up to 150 ms for voice and about 200 ms for video, jitter preferably 20–30 ms for voice and 30–50 ms for video, and keep losses around 0–1% without spikes.
Why is looking only at average latency a bad idea?
Because people feel the spikes, not the average. Even if average latency is low, rare spikes to hundreds of milliseconds break the conversational flow, cause interruptions and dropped words, and video freezes or pixelation.
Where to start looking for the bottleneck for voice and video in the network?
Start from the traffic path: where the source is and which devices the media stream actually traverses. The bottleneck is usually the WAN/internet egress, a switch uplink, an overloaded Wi‑Fi access point or a VPN gateway. Look there for queues, drops and increased jitter.
Which DSCP markings are commonly used for voice, video and signaling?
A practical minimum is to agree on a few classes and use them everywhere. Typically voice is DSCP EF (46), video DSCP AF41 (34), signaling DSCP CS3 (24), and everything else Best Effort — but the key is that markings are applied, preserved along the path and mapped to real queues at egress points.
Where to place the trust boundary and who shouldn't be trusted with DSCP?
Trust markings only from managed devices such as IP phones and corporate video endpoints. Traffic from user PCs and guest devices should be re‑marked at the access port or Wi‑Fi controller, otherwise any client can claim high priority and push calls aside.
Why is QoS almost always configured on the egress interface rather than ingress?
Because QoS decides what to send on the wire when an interface is saturated, and that decision happens on egress. On ingress packets have already arrived and during overload the device can only buffer or drop, which quickly becomes noticeable for voice and video.
What is better for calls on the WAN — shaping or policing?
Shaping evens out bursts and helps ensure that queuing and drops occur on your router where you control priorities. Policing is harsher and often creates bursty losses that increase jitter and produce robotic voice and video artifacts. So keep critical traffic under shaping and apply policing to less important classes.
Why does QoS "get lost" in Wi‑Fi and VPN, and how to quickly check it?
On Wi‑Fi check that WMM is enabled on the SSID and that an access point is not overloaded by airtime use, otherwise priority won't work over the radio. For VPNs, ensure DSCP is not zeroed during encapsulation and that queues are applied to the outer header on egress; a quick way to verify is to capture traffic before entering the tunnel and after exiting it and compare DSCP values.
When does it make sense to involve an integrator and how can GSE.kz help?
It makes sense when you need deployment under load, server and workstation selection, or ongoing support of UC and video infrastructure; GSE.kz, as a manufacturer and integrator in Kazakhstan, typically covers both the network and compute side. Whoever does the work, capture "before/after" metrics under the same conditions during peak hours, then compare metrics and queue counters at bottlenecks so the result is clear and provable.