Replacing a Commercial Proxy with Squid: Cache, ACLs, Logs, and TLS
Replacing a commercial proxy with Squid: migration plan, caching, ACLs, logging and TLS inspection where permitted, plus measures to avoid breaking critical web services.

Why replace a proxy at all and what can break
A commercial proxy often performs several tasks at once: internet access control, accounting and investigation via logs, basic protection from malicious domains, sometimes convenient authentication (AD/LDAP/SSO), security reports and ready-made policies. When replacing a commercial proxy with Squid, it’s important not just to install a new server but to preserve the functions that are actually used every day.
The reasons are usually simple. Licenses get more expensive, terms change, and rules and exceptions remain inside a “black box” that’s hard to explain to an auditor or even to yourself after six months. Add vendor lock-in and support difficulties, and it’s clear why organizations choose a more transparent option.
The proxy is a point through which critical traffic passes. Migration mistakes often don’t show immediately but appear as “odd” failures for users and services. Authentication and SSO (login loops, password prompts, unexpected 407) suffer most often, as do business sites with strict TLS requirements (certificate pinning, mTLS, bank clients), OS and antivirus updates (repositories and clouds), and video calls and messengers (WebRTC, nonstandard ports, large streams). A separate risk area is integrations where the proxy is hardcoded in an app or set via PAC/WPAD.
The risk isn’t only in blocking. Sometimes traffic unexpectedly bypasses rules, and logs lack the picture needed for investigation.
Warn more than just IT in advance. The people usually involved are IT operations (network, AD, endpoints), security (policies, log requirements, permissibility of TLS inspection), owners of key services (ERP, accounting, government portals, payments), and support, so that they can distinguish real incidents from “the migration.”
Simple example: in a government organization access to portals and digital signatures may rely on strict TLS while accounting depends on an external bank. If exceptions and tests aren’t agreed beforehand, migration becomes a string of urgent requests “allow this right now.”
Preparation: requirements, inventory, success criteria
Before replacing a commercial proxy with Squid, agree on goals and boundaries. Most failures come not from configuration syntax but from the fact that nobody documented what counts as normal operation and what counts as an incident.
Start with requirements. “Need internet via proxy” is not enough. Clarify who your users are (office, branches, remote), which applications access the network (browsers, OS updates, ERP, video conferencing), and what restrictions exist: regulatory rules, log retention, prohibition of traffic decryption, need for allowlists.
Immediately record critical services that must not break: bank client, government procurement portals, EDI, supplier portals, video conferencing, healthcare (MIS), learning (LMS). Accounting may rely on a portal that only works with direct TLS without certificate substitution, and a call center may use web telephony sensitive to latency.
Inventory takes less time than post-incident forensics. Gather what exists in the current solution: access rules, exceptions, schedules, domain and category lists, block reports, typical user complaints. Separately list “grey zones”: rules nobody remembers the purpose of, and exceptions left “just in case.”
Minimum to collect before the first test installation:
- subnets and user groups that will go through the proxy
- critical domains and IPs (including external APIs and SaaS)
- current block rules and exceptions
- requirements for log storage and access
- applications that cannot work through a proxy
Success criteria
Criteria must be measurable. Otherwise migration turns into an argument of “seems worse.”
A clear set of criteria:
- availability: key services open and allow authentication
- latency: pages and web apps are not noticeably slower at peak
- log quality: you can see who visited what, when and why access was denied
- policy enforcement: blocks behave predictably without random denials
- security: TLS inspection is enabled only where permitted and agreed
At the end choose a cutover scenario. Parallel launch is almost always safer: new Squid serves a pilot group or one branch, while the rest stay on the old proxy. A single-moment switch is justified only if the configuration is simple, the inventory is complete, and rollback is fast (for example, returning routing or a PAC file in 5–10 minutes).
Squid deployment architecture in plain words
Squid can be deployed in different ways, which affects how smoothly migration goes. The clearest starting option is a standalone server or VM that accepts user traffic and accesses the internet on their behalf. This makes it easier to assess load, move settings and quickly roll back.
If traffic is large or the proxy becomes a critical point, plan for at least two nodes. This doesn’t have to be a complex cluster: often a pair of identical Squid nodes and a failover mechanism (virtual IP, load balancer or quick client switch to a second address) are enough. It is important that both nodes have identical rules and identical network access.
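The "quick client switch to a second address" idea can be sketched as a small health check. This is a minimal illustration, not a failover product: the host names are hypothetical, and port 3128 is assumed only because it is Squid's default listening port. An open TCP port shows that something is listening, not that the proxy is serving requests correctly.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Node = Tuple[str, int]  # (host, port)

# Hypothetical node addresses; replace with your own.
PROXY_NODES = [
    ("proxy1.example.internal", 3128),
    ("proxy2.example.internal", 3128),
]


def tcp_port_open(node: Node, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the node succeeds within the timeout."""
    try:
        with socket.create_connection(node, timeout=timeout):
            return True
    except OSError:
        return False


def pick_node(
    nodes: Iterable[Node],
    is_up: Callable[[Node], bool] = tcp_port_open,
) -> Optional[Node]:
    """Return the first node that passes the health check, or None if all fail."""
    for node in nodes:
        if is_up(node):
            return node
    return None
```

The health check is passed in as a parameter so that the selection logic can be tested without a network, and later swapped for a stricter check (for example, fetching a known URL through the proxy).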
Explicit proxy or transparent mode
For controlled migration an explicit proxy is usually simpler. Devices receive the proxy address and port via policy, profile, MDM or browser settings. Then you know exactly who is going where and can move user groups in stages.
Transparent mode looks tempting (no client changes) but more often causes surprises: harder to debug, higher risk of breaking some sites, and harder to make applications aware that traffic goes through a proxy. For first deployments it’s usually left for later, after exceptions and service behavior are understood.
Separation of roles to avoid confusion
Let Squid do proxying, not everything at once. Keep the supporting services nearby but separate: DNS (fast and predictable), a user directory (AD/LDAP) for groups and access rules, log collection (syslog, SIEM or a dedicated server) and availability monitoring (port checks, latency, cache disk usage).
Simple example: move accounting to an explicit proxy first. If their bank client stops opening, you can quickly determine whether it’s DNS, an access rule, a certificate issue, or a domain that needs an exception. When roles are separated, root cause hunting takes minutes, not days.
Overall the architecture should answer three questions: where Squid is located, how users reach it, and what happens if a node fails. If these are thought through, discussing cache, ACLs and logs is much safer for critical web services.
Step-by-step migration without service interruption
A reliable migration is not “one night and switch”; it’s a series of small verifiable steps. At any time you should be able to return to the old proxy within minutes.
Start with a separate test Squid that doesn’t affect primary operations. Connect a small group to it: IT, a couple of heavy web-app users and at least one person from accounting or procurement. This yields quick signals about problems that lab tests miss.
Next bring over basic access rules “as is.” Don’t try to improve logic, merge lists or “clean up” at the start. The fewer differences from current behavior, the fewer surprises. Focus on three things: who can access the internet, which domains are essential for work, and which restrictions are already in force (for example, bans on anonymizers or torrents).
Enable caching cautiously and only after access is stable. A good practice is to start with conservative settings and measure the effect: what sped up, where load increased, and which services began acting strangely. Many modern SaaS and banking portals are pointless or harmful to cache.
When the pilot runs smoothly, expand by departments. After each expansion record changes: what rules were added, which exceptions were made, what domains were added to the allowlist. This disciplines the process and helps resolve complaints that appear “two weeks after cutover.”
Concurrently prepare and test a rollback plan:
- what exactly is switched back (PAC file, DHCP/AD policies, browser settings)
- who performs the rollback and who confirms recovery
- which metrics indicate “it got worse” (rise in 407/403, complaints about specific services)
- how long restoration takes
If your goal is to replace a commercial proxy with Squid, the secret is simple: change one parameter at a time and always keep a quick rollback.
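The "it got worse" metrics from the rollback plan can be turned into an explicit check. The sketch below compares the share of 403 and 407 responses in a baseline window with the share in the current window. The 5-percentage-point threshold is an arbitrary assumption for illustration; pick a value that matches your own traffic.

```python
from collections import Counter
from typing import Dict, Iterable, List

WATCHED = ("403", "407")  # status codes named in the rollback plan


def status_share(statuses: Iterable[str]) -> Dict[str, float]:
    """Share of each watched status code among all requests in a window."""
    counts = Counter(statuses)
    total = sum(counts.values())
    if total == 0:
        return {code: 0.0 for code in WATCHED}
    return {code: counts[code] / total for code in WATCHED}


def rollback_triggers(
    baseline: Iterable[str],
    current: Iterable[str],
    max_increase: float = 0.05,  # assumed threshold: 5 percentage points
) -> List[str]:
    """Return the watched codes whose share rose by more than max_increase."""
    base = status_share(baseline)
    now = status_share(current)
    return [code for code in WATCHED if now[code] - base[code] > max_increase]
```

An empty result means no trigger fired. A non-empty result is a signal for the person who owns the rollback decision, not an automatic switch.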
Caching: where it helps and where it hurts
Squid cache is useful where the same files are downloaded by many people and rarely change. In an office this often yields noticeable effects: less outbound traffic, faster repeat loads and lower peak latency. But cache can harm if it stores content that must always be fresh or personal.
Start conservatively: cache only explicitly safe static content and leave the rest untouched. When replacing a commercial proxy with Squid this reduces complaints about “stale data” and strange web-app errors.
Typically safe to cache: OS and app updates, package repositories, static files (images, styles, scripts, public fonts), distributions and internal portals with static content updated on a schedule.
Better to exclude from cache: personal accounts, banking and payment pages, government resources, medical systems, dynamic web apps (feeds, chats), API requests and any scenario with authentication, tokens or individualized responses.
Keep four terms in mind: cache size (disk space), TTL (how long an object is considered fresh), hit (response served from cache) and miss (Squid contacted the internet). In practice balance matters more than a “perfect config”: better a small cache without surprises.
Check benefits with simple observations: does hit rate for static content grow, does outbound traffic drop at peak, are there no error spikes or complaints about old versions or login issues. If complaints arise, first disable cache for the problematic domain or URL type, then fine-tune TTL.
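Hit rate per site can be estimated from access.log. The sketch below assumes Squid's native log format, where the fourth field is the result code and status (for example TCP_HIT/200), the sixth is the method and the seventh is the URL. Treating any result code that contains "HIT" as a cache hit is a simplification, and CONNECT tunnels are skipped because they are not cached.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple
from urllib.parse import urlsplit


def hit_ratio_by_host(lines: Iterable[str]) -> Dict[str, Tuple[int, int, float]]:
    """Return {host: (hits, total, ratio)} for non-CONNECT requests."""
    stats = defaultdict(lambda: [0, 0])  # host -> [hits, total]
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # not a complete native-format line
        result_code = fields[3].split("/")[0]
        method, url = fields[5], fields[6]
        if method == "CONNECT":
            continue  # encrypted tunnel, nothing to cache
        host = urlsplit(url).hostname or "unknown"
        stats[host][1] += 1
        if "HIT" in result_code:  # simplification: TCP_HIT, TCP_MEM_HIT, ...
            stats[host][0] += 1
    return {h: (hits, total, hits / total) for h, (hits, total) in stats.items()}
```

Running this on a day of logs before and after a cache change gives a simple per-host comparison. A host with many requests and a near-zero ratio is a candidate for exclusion rather than tuning.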
ACLs in Squid: how to keep rules clear and durable
ACLs can easily become a tangle of exceptions if added ad hoc and without structure. Readable rules usually look boring: organized by purpose, documented and changed rarely. This matters especially when you replace a commercial proxy with Squid to gain control, not to create a new source of risk.
Start with a basic structure: define users and groups separately, resource categories separately, and exceptions separately. In the config this often means separate files or blocks where each rule answers one question: “who?”, “where?” and “how?”. Then in an incident you look for “which group and which resource” rather than “why did it break?”.
Implement least privilege in steps:
- first allow OS and antivirus updates, corporate DNS and NTP (if they go through the proxy)
- then allow main business web services only to the groups that need them
- finally explicitly deny “everything else”
- add new permissions via named ACLs with a short comment saying “why”
Keep critical services as a separate layer. For many organizations these are government portals, EDI, bank portals, payment gateways, cloud office suites and MFA services. These resources have strict TLS, cookie and redirect requirements, so they benefit from separate ACLs and a clear temporary bypass method for failures. Example: instead of “allow everything for accounting” create a group users_accounting and ACL bank_portals, and allow only their intersection.
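The "intersection" idea can be illustrated with a toy model of rule evaluation. It mirrors two properties of Squid's http_access lines: every ACL named on one line must match, and the first matching line decides. Everything else is an assumption for illustration: the user names and domain are hypothetical, and this is not Squid's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Request = Dict[str, str]  # e.g. {"user": "anna", "host": "bank.example"}
Acl = Callable[[Request], bool]

USERS_ACCOUNTING = {"anna", "boris"}  # hypothetical group members
BANK_PORTALS = {"bank.example"}       # hypothetical domain list

# Named ACLs: each answers exactly one question about the request.
ACLS: Dict[str, Acl] = {
    "users_accounting": lambda r: r["user"] in USERS_ACCOUNTING,
    "bank_portals": lambda r: r["host"] in BANK_PORTALS,
    "all": lambda r: True,
}


@dataclass
class Rule:
    action: str      # "allow" or "deny"
    acls: List[str]  # every named ACL must match (logical AND)
    comment: str     # the short "why" that each permission should carry


RULES = [
    Rule("allow", ["users_accounting", "bank_portals"], "accounting needs the bank portal"),
    Rule("deny", ["all"], "explicitly deny everything else"),
]


def decide(request: Request, rules: List[Rule] = RULES) -> Tuple[str, str]:
    """Return (action, comment) of the first rule whose ACLs all match."""
    for rule in rules:
        if all(ACLS[name](request) for name in rule.acls):
            return rule.action, rule.comment
    return "deny", "no rule matched"  # this model's own fallback
```

Returning the comment together with the decision is the point of the sketch: when a request is denied, the answer to "which rule fired and why" is already attached.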
Exceptions should show who requested them and a review date. Without an owner an exception almost always becomes permanent.
Also separate time-based access and guest devices. For guests it’s simpler to have a separate set of rules: fewer rights, limited categories and optional time limits. For emergencies a separate break-glass group with strict accounting and quick disable is useful.
Logging: investigate incidents instead of guessing
When replacing a commercial proxy with Squid, logs get noticed in the first week. When a user “cannot open a site,” logs show what happened: blocked by ACL, DNS error, certificate refusal, timeout or service-side problem.
A minimum usually includes three layers. The access log (access.log) answers “who went where” (IP/login, method, response code, time). cache.log helps you see Squid errors, restarts, disk and memory problems and DNS failures. Also log denial events: reasons for blocks (by domain, time, auth) and signs of suspicious activity (many 407s, spikes in requests, many CONNECTs to nonstandard ports).
To avoid drowning in data, agree on format and a “field map” in advance. Usually record: source IP, user (if auth), domain or SNI (if available), URL, response code, size, time and denial reason. Then you can run targeted queries without reading gigabytes.
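A field map becomes useful once log lines are parsed into named fields. The sketch below assumes Squid's native access.log format (timestamp, elapsed time, client address, result code/status, size, method, URL, user name, then hierarchy and content type). If you define a custom log format, the field positions will differ, and the denial reason would need a field of its own.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional


@dataclass
class LogEntry:
    when: datetime
    elapsed_ms: int
    client_ip: str
    result_code: str  # e.g. TCP_DENIED, TCP_MISS
    status: int       # HTTP status, 0 if not numeric
    size: int
    method: str
    url: str
    user: str         # "-" when no authentication is in use


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one native-format access.log line; return None if it is malformed."""
    f = line.split()
    if len(f) < 8:
        return None
    result_code, _, status = f[3].partition("/")
    try:
        return LogEntry(
            when=datetime.fromtimestamp(float(f[0]), tz=timezone.utc),
            elapsed_ms=int(f[1]),
            client_ip=f[2],
            result_code=result_code,
            status=int(status) if status.isdigit() else 0,
            size=int(f[4]),
            method=f[5],
            url=f[6],
            user=f[7],
        )
    except ValueError:
        return None


def denials_for_user(entries: Iterable[LogEntry], user: str) -> List[LogEntry]:
    """Targeted query: all denied requests for one user."""
    return [e for e in entries if e.user == user and e.result_code == "TCP_DENIED"]
```

With entries in this shape, a question such as "what was denied for this user between 10:00 and 10:15" becomes a filter over fields instead of a text search through gigabytes.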
Agree on storage and access before launch. Simple principle: store enough to close investigations and limit access so logs don’t become a leakage source. Assign a log owner, a process for extract requests and who can view personal data.
Security teams typically want four things: who accessed a domain at a specific time and whether CONNECT was used to bypass; why a request was blocked and which rule fired; whether there were mass requests to suspicious domains; and controls over integrity and who can change logging settings.
Test log usefulness with a simple scenario. For example, in a hospital or government office a supplier portal stopped opening. Within 10 minutes collect the chain “user – domain – code – reason,” confirm or exclude a block and produce a short report. If this requires guessing and manual workarounds, the field format or logging levels were chosen poorly.
TLS inspection: where it’s allowed and how not to violate rules
TLS inspection (SSL-bump) is when the proxy doesn’t just pass encrypted traffic but temporarily decrypts it inside the perimeter, checks it against rules (DLP, antivirus, categories) and then re-encrypts to the destination. Without inspection HTTPS usually reveals only metadata: where the connection goes, not its content.
Technically this requires a corporate root certificate on managed devices: browsers will trust certificates the proxy issues on the fly. Without that you’ll get widespread certificate errors.
The question then becomes permissibility. On corporate-managed devices with an explicit acceptable-use policy the risk is lower. In practice inspection is appropriate on managed devices (MDM/AD) with approved policy, in segments with higher requirements (classrooms, shared terminals, guest networks), and for specific categories or incidents rather than “everything.” And almost always start with a pilot group.
There are zones where inspection often causes more problems than benefits and can be legally sensitive: finance (banks, acquiring), healthcare, government resources, personal email and messengers, and services where interception is prohibited. Even if technically possible, it can violate internal rules, contracts or compliance requirements.
Typical failures: apps with certificate pinning (some bank clients, messengers, updaters), SSO login errors, endless captchas and redirects, broken APIs for business systems. Externally this looks like “site won’t open” though the network is up.
The safest approach when replacing a commercial proxy with Squid is targeted inspection plus a large exclusion list:
- start with splice (no decryption) for everything and enable bump only for selected domains or categories
- immediately exclude banks, payments, healthcare, government resources, OS updates and critical clouds
- maintain a test-user group and a quick rollback method (ACL flag or separate port)
- log the fact of blocking and the reason, but don’t collect unnecessary content
- align policy with security and legal teams before enabling in prod
If TLS errors and complaints grow after enabling inspection, don’t try to “tune” endlessly. Often the right move is expanding exclusions and keeping inspection only where it clearly helps without breaking business services.
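The "splice by default, bump selectively, exclusions win" policy can be written down as a small decision function before it is translated into proxy configuration. This is a model of the policy only, not of Squid's SslBump processing. The domain lists and the pilot-user flag are hypothetical.

```python
from typing import Iterable

# Hypothetical lists; in practice these come from the agreed policy.
NEVER_INSPECT = {"bank.example", "gov.example", "clinic.example"}
INSPECT = {"fileshare.example"}


def matches(host: str, suffixes: Iterable[str]) -> bool:
    """True if host equals a listed domain or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == s or host.endswith("." + s) for s in suffixes)


def tls_action(sni: str, pilot_user: bool) -> str:
    """Return "splice" (pass through untouched) or "bump" (inspect)."""
    if matches(sni, NEVER_INSPECT):
        return "splice"  # exclusions take precedence over everything
    if pilot_user and matches(sni, INSPECT):
        return "bump"    # inspection only for selected domains and test users
    return "splice"      # default: no decryption
```

Checking exclusions first means that a domain accidentally placed on both lists is never decrypted, which is the safer failure mode.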
How not to break critical web services: practical techniques
Critical web services break not because of Squid itself but due to small issues: unexpected authentication, blocked methods, TLS interception, timeouts or rule mismatches. Before replacing a commercial proxy with Squid agree what must not be taken down and who confirms that “everything works.”
Start with a list of critical services and owners. For each service record the business owner, technical owner, test window and success criteria. Example criteria: “SSO login succeeds,” “document signing works,” “API responds within 300 ms at the 95th percentile.”
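A criterion such as "responds within 300 ms at the 95th percentile" is easy to check once latency samples exist. The sketch below uses the nearest-rank definition of a percentile, which is one of several common definitions. How the samples are collected (for example, from the elapsed-time field in access.log) is left as an assumption.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_latency_target(samples_ms: Sequence[float], target_ms: float = 300, pct: int = 95) -> bool:
    """True if the pct-th percentile of the samples is within the target."""
    return percentile(samples_ms, pct) <= target_ms
```

Measuring the same percentile through the old proxy first gives a baseline, so the criterion compares like with like.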
Checks to run before wide rollout
Run a short test suite on the pilot group (10–20 users or one office), not across the whole network. Check logins (SSO, OAuth, MFA, portals with CAPTCHA), data transfer (upload/download of large files, cloud drives), real-time scenarios (video calls, chats, WebSocket), special functions (digital signatures, crypto providers, government cabinets, bank portals) and integrations (service APIs, webhooks, AV updates, management agents).
Typical problem signs repeat: SSO loops, OAuth returning 403/429, infinite CAPTCHA, WebSocket disconnects every 30–60 seconds, updates blocked by domain or nonstandard paths.
How to diagnose and fix quickly
Keep the ability to compare old proxy vs new proxy behavior. Best is a test machine where switching takes a minute (via a separate group, PAC/browser setting or temporary exception).
During an incident record three things: URL or host, exact time, and user or subnet. In logs check codes (TCP_DENIED/403, 407, 502/504), response size and timing. If it works via the old proxy but not the new one, common culprits are ACLs (no CONNECT to required port or domain), TLS inspection (not allowed for that service) or too-strict timeouts.
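The first-pass reading of log codes can be captured in a small triage helper. The mapping below only restates the rules of thumb from this section and is a starting hypothesis, not a diagnosis. The 407 check comes first because Squid can log an authentication challenge as TCP_DENIED/407.

```python
from collections import Counter
from typing import Iterable, Tuple


def triage(result_code: str, status: int) -> str:
    """Map a Squid result code and HTTP status to a likely area to check first."""
    if status == 407:
        return "auth"      # proxy authentication required or rejected
    if result_code == "TCP_DENIED":
        return "acl"       # blocked by an access rule (domain, port, CONNECT)
    if status in (502, 504):
        return "upstream"  # destination unreachable, TLS failure or timeout
    return "other"         # nothing obviously proxy-side


def summarize(events: Iterable[Tuple[str, int]]) -> Counter:
    """Count likely causes across many (result_code, status) pairs."""
    return Counter(triage(code, status) for code, status in events)
```

Running the summary over the same time window on the old and the new proxy shows quickly which category grew after the switch.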
Plan 3–5 days of heightened readiness: change windows, a fast feedback channel and a tested rollback plan. Appoint one decision maker for temporary bypasses (e.g., excluding a domain or enabling direct access) so critical services don’t stay down for hours.
Quick checks and next steps
Before final cutover to Squid take a brief pause and run through checks. This reduces the risk of sudden failures with mail, banks, government or internal web systems.
Ten minutes before switching that save hours
Ensure you understand what will happen at cutover and that basics are ready:
- ACLs checked: who accesses what, and whether there are separate rules for server subnets and admin devices
- exceptions agreed: banks, SSO/IdP, OS and AV updates, video conferencing, partner APIs
- logs enabled and clear: access.log, cache.log, retention and time sync (NTP)
- monitoring ready: proxy availability, CPU/RAM/disk, space for cache and logs, error counts
- rollback plan documented: how to revert routes or PAC/WPAD, who executes, and time required
Then run a control test on the pilot group (one department or branch): open 5–10 critical sites, test corporate logins and download a large file to observe cache behavior and timeouts.
After switching: what to watch in the first hours
Right after moving traffic don’t expect silence. Quickly distinguish systemic issues from isolated cases. Collect feedback (mail, EDI, internet bank, government portals, video calls), watch metrics (rise in 4xx/5xx, increased timeouts, CONNECT spikes), inspect top errors from cache.log (DNS, TLS handshake, denied, auth, certificates) and confirm that exceptions for critical domains and updates actually work.
If an error repeats, record: time, user IP, domain, Squid code and a snippet from access.log. This speeds up root cause analysis noticeably.
Then update documentation: which subnets use the proxy, where ACLs and exceptions live, who approves changes and how to add a new service without risking an overly permissive rule. Move on to planned improvements: enable cache where it helps, clean up stale rules, and schedule reports on top domains and denials.
If the project needs servers for the proxy, logs and related tasks (for example, for colocation with other infrastructure or to meet support requirements), this is often handled via GSE.kz: S200 rack servers are usually suitable for such roles, and the system integration team can help build a solution that fits your architecture and requirements.
FAQ
Why replace a commercial proxy with Squid?
People replace commercial proxies when licenses and support become too expensive or unpredictable, and access rules remain inside a “black box.” Squid is chosen for transparency: you can clearly see which ACLs and exceptions are active and explain this to security and auditors. The key is to migrate not just the server, but the functions people actually use: authentication, reports, logs and critical exceptions.
What breaks most often when migrating to Squid?
The most common breakages are authentication (SSO, 407 loops), sites with strict TLS (certificate pinning, mTLS, bank clients), OS and antivirus updates, and real-time apps (video calls, messengers, WebRTC, nonstandard ports, large streams). Another frequent issue is applications with proxy settings hardcoded or distributed via PAC/WPAD that are overlooked during migration.
Where to start preparing before moving to Squid?
Start with requirements and an inventory: who are the users, which applications access the internet, what are constraints on TLS inspection and logs, and which domains are critical. Then record measurable success criteria: key services open and allow login, latency hasn't noticeably increased at peak, and logs provide clear reasons for denials. Without this, complaints turn into debates about whether things actually got worse.
Which is better for migration: explicit proxy or transparent mode?
For an initial rollout an explicit (manual) proxy is usually simpler: you control who is moved and can roll back specific groups quickly. Transparent mode may seem attractive (no client changes) but often brings surprises: harder debugging, more risk of breaking HTTPS sites and apps that don’t expect a proxy. Consider transparent mode only after you’ve learned exceptions and stabilized rules.
How to migrate to Squid without interrupting users?
The safest approach is a parallel run: new Squid serves a pilot group or one office while the rest stay on the old proxy. This reduces downtime risk and provides real cases to tweak ACLs, exceptions and timeouts. A single-moment cutover is only appropriate when the configuration is simple and rollback takes minutes.
Should I enable caching in Squid and how not to cause harm?
Start conservatively: cache only clearly static, non-personal content after access has stabilized. Cache helps with repeated downloads like OS updates, repositories and static files, but can harm personal accounts, banking pages and dynamic web apps. If users complain about “stale data” or login issues, first disable caching for that domain or URL type.
How to build ACLs so the config doesn’t turn into a mess?
Make rules readable: define users and groups separately, resources separately, and exceptions separately. Don’t mix meanings in one rule. Practical sequence: allow essential items first (updates, antivirus, corporate DNS/NTP), then permit business services to the groups that need them, and finally explicitly deny everything else. Every exception should include a reason and an owner; otherwise it becomes a permanent hole.
Which logs are essential when switching to Squid?
At minimum collect access logs (who accessed what and when) and Squid system logs to see DNS, TLS, disk and restart errors. Agree on field formats and retention times in advance so you can make quick queries during incidents without manual guesswork. Also restrict log access: they often contain personal data and activity details.
When is TLS inspection in Squid appropriate and when is it better to avoid?
If there’s no agreement with security and legal, the safe default is not to decrypt traffic and to rely on HTTPS metadata. TLS inspection is appropriate selectively on managed devices when policy allows and there’s a clear purpose (e.g., checking specific categories or segments). Banks, payments, healthcare, government services and any pinned-certificate services are usually best placed on an exclusion list to avoid breaking business processes.
How to organize a fast rollback if problems start after the switch?
A rollback plan must be practical and fast: know exactly what you will switch back (PAC/WPAD, policies, routes), who does it and how recovery is confirmed. Aim to restore traffic to the old proxy in 5–10 minutes and define “it got worse” triggers such as sudden spikes in 407/403/5xx or mass complaints about critical services. If rollback is slow or untested, migration is risky regardless of configuration quality.