Replacing Nexus with open source: package migration and storage policies
Replacing Nexus with open source: how to plan package migration, set permissions, retention and disk control without downtime.

Why replace Nexus or Artifactory and what usually hurts
When a repository is small, a commercial Nexus or Artifactory feels like a "set and forget" solution. But as teams and the number of packages grow, problems emerge that hit both budget and reliability.
The first reason is almost always about money and vendor lock-in. Licenses grow with the number of users, repositories and extra features. If updates or support become unavailable, the repository turns into a single point of failure: without it builds and releases stop.
The second reason is operations. Pain usually appears in several places at once: disk runs out and cleanups are done manually; builds slow down because of indexing and heavy metadata; access rights drift over time; snapshots, junk versions and duplicates accumulate.
Against this background, moving to open source is usually about control, not just "save at all costs." You want to set storage rules yourself, know where data lives, and avoid being tied to a single vendor.
Before you start, it’s important to list what you cannot lose during the migration. Most critical are familiar URLs and repository structure (so you don’t have to edit hundreds of configs), security (authentication, tokens, audit) and download speed for CI.
Even before choosing an Artifactory alternative it helps to agree on migration constraints: the allowable downtime window (or a zero-downtime requirement), the switch order, and the RPO/RTO goals (how much data you can afford to lose and how quickly service must be restored). For example, an RPO of “we don’t lose any published package” means a clear synchronization and publish-freeze plan, while an RTO of “builds must be working within an hour” affects DNS prep, mirrors and rollback scenarios.
Open source alternatives: what to compare before choosing
If you plan to replace Nexus with open source, start not from a product name but from a map of real needs. One solution can cover Maven well but be awkward for Docker/OCI. Another may offer a nice UI but require more manual permission setup.
First fix which package types live in your repo now and which will appear in the next 6–12 months. Usually that’s Maven, npm, PyPI, NuGet and Docker/OCI containers, but sometimes rare formats surface. The more accurate the list, the fewer surprises during migration.
Operation modes and compatibility
Check which modes you need and what they’re called in the chosen solution. In Nexus those are proxy, hosted and group. Open source alternatives may use different names, but the idea is the same: cache external sources, host your own packages, and provide a single address to clients.
Before installing in prod compare basics: supported formats and quirks (especially npm and NuGet), availability of proxy/hosted/group modes, ease of configuration for teams, permission model (roles, groups, granularity, protection against accidental publish), reliability (cluster or at least a clear recovery and backup plan), and UI and CI integrations.
Special note on Docker/OCI containers
Container registries often become the main disk consumer, so tag and cleanup rules should be checked in advance.
Minimum checks: can you protect release tags and forbid overwrites (including latest), how cleanup is implemented (old tags, unused layers), and how compatible the solution is with your build and deploy pipelines.
A practical point: in integration projects for government or large enterprises you usually need a predictable publish process. If the repo can’t strictly prevent overwriting release versions and lacks a clear retention policy for OCI, problems won’t appear during migration but a few months into operation.
Inventory: what you have now
Before replacing Nexus with open source it’s useful to describe the current picture as if handing the repository to another team. That saves days: fewer surprises, fewer rollbacks.
Start with a list of repositories and their roles. Usually there are stores for releases (immutable versions), snapshots (frequent builds), proxies to external sources (speed and resilience) and local repos for internal packages. Note what’s actually used and what is kept "just in case."
Then estimate volumes and growth. One total disk size is not enough: you need to know where the data accumulates. Often 20% of repos occupy 80% of space, and that defines transfer strategy and storage rules.
Record who and how publishes artifacts. Split publishing flows into CI (e.g., builds per commit), manual uploads, IDE publishes, separate teams or contractors. Manual uploads are a common sign you’ll need to tidy permissions and naming rules in the new system.
A minimal dataset to collect in one table:
- repository, type (release, snapshot, proxy, local) and owners
- current size, monthly growth, top heavy projects
- publish methods (pipelines, tokens, accounts)
- critical integrations (Jenkins or GitLab CI, IDEs, Kubernetes, vulnerability scanners)
- external sources that are proxied and why
If snapshots take more than half the disk and are published on every push while releases barely grow, the main migration risk is not the transfer itself but that the new store will fill up quickly unless retention is agreed first.
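To make the "20% of repos occupy 80% of space" observation concrete, here is a minimal Python sketch that picks the few repositories responsible for most of the disk. The repository names and sizes are hypothetical; in practice the numbers would come from the inventory table.

```python
def heaviest_repositories(sizes_gb, share=0.8):
    """Return the largest repositories that together cover `share` of total size."""
    total = sum(sizes_gb.values())
    picked, running = [], 0.0
    # Walk repositories from largest to smallest until the share is covered.
    for name, size in sorted(sizes_gb.items(), key=lambda kv: kv[1], reverse=True):
        if total == 0 or running / total >= share:
            break
        picked.append(name)
        running += size
    return picked


# Hypothetical inventory numbers, for illustration only.
inventory = {
    "maven-snapshots": 620,
    "docker-hosted": 250,
    "maven-releases": 80,
    "npm-proxy": 40,
    "npm-hosted": 10,
}
print(heaviest_repositories(inventory))
```

The repositories this returns are the ones whose retention rules are worth agreeing first, since they decide how fast the new store fills.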
Target layout: how the new repository will look
You need a target layout before migration. Otherwise you’ll move packages and then start reworking addresses, permissions and storage rules. It’s easier to document in advance how developers will publish and fetch artifacts in each environment.
Repositories by environment: dev, stage, prod
A clear option is to split publication by environment so a dev build can’t accidentally end up in prod. Usually:
- dev: frequent publishes, snapshots, experimental packages
- stage: release candidates, reproducibility checks
- prod: releases only, minimal publish permissions
Decide whether the separation is physical (different repos/buckets/directories) or logical (rules and permissions in one store). Small teams often get by with logical separation. For strict security boundaries you usually choose physical separation.
Domains and URLs: keep old or introduce new
If CI, Docker, Maven and npm are already configured for old addresses, keeping URLs significantly reduces risk. This is usually done with a DNS alias or a reverse proxy that accepts old requests and forwards them to the new service.
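Before configuring a reverse proxy, it can help to write down the old-to-new path mapping and test it against real client URLs. The sketch below only models the rewrite logic in Python. The host name and path prefixes are assumptions made for illustration, not the layout of any particular product.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical host and path prefixes; replace with your real layout.
NEW_HOST = "repo.example.internal"
PATH_MAP = {
    "/repository/maven-releases/": "/maven/releases/",
    "/repository/npm-group/": "/npm/all/",
}


def rewrite_url(old_url):
    """Map an old repository URL to its new location, or None if unmapped."""
    parts = urlsplit(old_url)
    for old_prefix, new_prefix in PATH_MAP.items():
        if parts.path.startswith(old_prefix):
            new_path = new_prefix + parts.path[len(old_prefix):]
            return urlunsplit(
                (parts.scheme, NEW_HOST, new_path, parts.query, parts.fragment)
            )
    # Unknown path: report it instead of guessing a target.
    return None
```

Running such a function over URLs taken from CI logs shows which requests have no mapping yet, before any client is switched.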
New URLs make sense when you want to clean things up: unified naming, a separate domain for internal services, clear repository names. If so, allocate time to update CI and developer configs.
Storage and availability
Many start with local disk but quickly hit growth limits. Network storage simplifies scaling and backups. Object storage works when you need cheap capacity and have experience with S3-compatible systems.
For availability choose a clear minimum: a single node with good backups or multiple nodes with load balancing. If growth is expected, plan where the repo will live (for example, in your datacenter on a GSE S200-class server or on an existing platform), and what CPU, memory and disk buffer you need for 12–18 months.
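A rough way to size the disk for the 12–18 month horizon is a linear projection plus a safety buffer. This is a simplifying assumption (growth is rarely perfectly linear), and the function name, default buffer and sample numbers below are illustrative only.

```python
def projected_disk_gb(current_gb, monthly_growth_gb, months=18, buffer_ratio=0.3):
    """Linear projection: current size plus growth, with a safety buffer on top."""
    return (current_gb + monthly_growth_gb * months) * (1 + buffer_ratio)


# Example with made-up numbers: 1 TB today, 50 GB/month, 12 months, 25% buffer.
print(projected_disk_gb(1000, 50, months=12, buffer_ratio=0.25))
```

If retention is introduced during the migration, the monthly growth measured on the old system is likely an overestimate, so the result works as an upper bound.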
Step-by-step plan to migrate packages without surprises
The main fork in approach is parallel run or single-moment cutover. Parallel is safer: the old repo keeps working while the new one already accepts publishes and serves dependencies. A one-time switch is faster but requires perfect prep and a short change window.
Steps to follow in order
Start with a testbed and run the full flow for each package type: publish, download, metadata updates, snapshot handling, cache cleanup. Errors appear more often on publish and artifact signing than on read.
Then proceed in stages:
- Define switch strategy and freeze date (when you stop publishing to the old repo).
- Configure proxies to external sources and pre-warm caches for critical dependencies so the first build is not a surprise.
- Migrate hosted repositories: shared libraries first, then services, then rarely used packages.
- Synchronize metadata and version tags where applicable, then enable test publishing from CI to the new repo.
- Switch clients: CI, developer workstations, clusters. Centralized variables and config templates make this easier.
A safe pattern to reduce risk: first move reads to the new repo while keeping writes on the old repo for 1–2 weeks. When builds are stable, enable publishing to the new repo and disable writes on the old one.
Rollback plan that actually works
Rollback should be tested in advance and take minutes, not hours:
- restore old repository addresses in CI and templates
- temporarily reopen writes on the old repository
- note which versions were only pushed to the new repo and how to reconcile them back
This way replacing Nexus with open source becomes predictable: the system changes but the build and release paths remain intact.
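The reconciliation step in the rollback plan comes down to comparing two sets of published coordinates. A minimal sketch, assuming each repository can export its contents as a list of strings such as "group:artifact:version" (how that list is obtained depends on the product and is not shown here):

```python
def diff_repositories(old_versions, new_versions):
    """Compare published coordinates between the old and the new repository."""
    old_set, new_set = set(old_versions), set(new_versions)
    return {
        # Published after cutover: must be copied back if you roll back.
        "only_in_new": sorted(new_set - old_set),
        # Not migrated yet: must be copied forward before writes are frozen.
        "missing_in_new": sorted(old_set - new_set),
    }


# Hypothetical coordinates, for illustration only.
old = ["com.acme:core:1.0.0", "com.acme:core:1.1.0"]
new = ["com.acme:core:1.1.0", "com.acme:core:1.2.0"]
print(diff_repositories(old, new))
```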
Access rights: migrating roles and protecting publishes
During migration, access is more likely to break than artifact transfer. People suddenly can’t download dependencies, or CI publishes artifacts to the wrong repository, such as a shared one. Migrate permissions before the main migration and test them with a sample project.
First document the access model: local users, LDAP/AD groups, roles, and tokens (user tokens, access tokens, API keys). Note separately what CI robots use and which repos are available to external teams.
Mapping permissions in the new solution
Permissions in almost any repository reduce to a small set of actions. Agree on clear roles (reader, publisher, maintainer, admin) and map them to permissions:
- Read (download) and Browse: for most developers
- Write (publish): only for release pipelines and responsible teams
- Delete: a narrow group, and only through an agreed process
- Admin: a minimal number of people, ideally a separate group
Create CI service accounts separate from personal accounts. Give them minimal rights: publish only to specific hosted repositories and read only the necessary proxy/group repos.
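The role-to-permission mapping can be written down as data and checked before it is recreated in the new system. In the sketch below the role names come from the text, while the exact action sets per role and the repository names are assumptions for illustration.

```python
# Assumed action sets per role; adjust to the permission model of your product.
ROLE_PERMISSIONS = {
    "reader": {"read", "browse"},
    "publisher": {"read", "browse", "write"},
    "maintainer": {"read", "browse", "write", "delete"},
    "admin": {"read", "browse", "write", "delete", "admin"},
}


def is_allowed(account_grants, repo, action):
    """account_grants maps repository name -> role held on that repository."""
    role = account_grants.get(repo)
    return role is not None and action in ROLE_PERMISSIONS[role]


# Hypothetical CI service account: publishes to one hosted repo, reads one group.
ci_release_bot = {"maven-releases": "publisher", "maven-group": "reader"}

print(is_allowed(ci_release_bot, "maven-releases", "write"))
print(is_allowed(ci_release_bot, "maven-releases", "delete"))
```

Anything not granted explicitly is denied, which matches the minimal-rights idea for service accounts.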
Make tokens short-lived, store them in CI secrets, enable rotation (e.g., every 30–90 days) and revoke when someone leaves or a contractor changes.
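Rotation is easier to enforce when something reports overdue tokens. A minimal sketch, assuming you can export token names with their creation timestamps (the export itself is product-specific and not shown):

```python
from datetime import datetime, timedelta, timezone


def tokens_due_for_rotation(tokens, max_age_days=90, now=None):
    """tokens maps token name -> timezone-aware creation datetime."""
    now = now or datetime.now(timezone.utc)
    limit = timedelta(days=max_age_days)
    return sorted(name for name, created in tokens.items() if now - created >= limit)


# Hypothetical tokens, for illustration only.
sample_tokens = {
    "ci-release": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "ci-snapshot": datetime(2024, 5, 20, tzinfo=timezone.utc),
}
print(tokens_due_for_rotation(sample_tokens, now=datetime(2024, 6, 1, tzinfo=timezone.utc)))
```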
For external teams and contractors follow a need-to-know principle: access only to required repositories and namespaces. Prefer publishing to a separate repo that can be disabled easily.
Logs and audit
To investigate incidents and pass audits decide in advance what to log. Minimum:
- who downloaded or published a package (user or service account)
- from where and when
- actions blocked by policies
- permission and role changes
- delete and republish operations
This makes it easier to find the cause when an unsigned version appears or someone accidentally overwrote a release.
Retention policy: how not to drown in versions
Retention is about predictability, not only saving space. When versions accumulate for years builds start pulling random dependencies and important artifacts are lost among temporary ones.
First split artifacts by meaning. Releases that go to production are usually stored long-term and rarely deleted. Temporary artifacts (feature branches, nightly, experimental packages) should live for a limited time.
A good starting point is simple rules that are easy to explain and verify:
- keep the last N release versions (e.g., 20) and do not touch versions labeled LTS
- delete temporary builds by age (e.g., older than 14–30 days)
- make exceptions by tags or prefixes (keep-, release-, hotfix-)
- pin critical artifacts before cleanup
- require clear tags and versions instead of random ones
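The rules above can be sketched as a function that only produces a deletion plan and never deletes anything itself. The artifact record shape (name, kind, created, labels) and the "pinned" label are assumptions for this example, while the prefixes and the LTS label follow the list.

```python
from datetime import datetime, timedelta, timezone

KEEP_PREFIXES = ("keep-", "release-", "hotfix-")


def plan_cleanup(artifacts, keep_releases=20, temp_max_age_days=30, now=None):
    """Return names of artifacts that the rules would delete.

    Each artifact is a dict with: name, kind ("release" or "temp"),
    created (timezone-aware datetime) and labels (set of strings).
    """
    now = now or datetime.now(timezone.utc)
    to_delete = []

    # Rule 1: keep the newest N releases; never touch LTS or prefixed ones.
    releases = sorted(
        (a for a in artifacts if a["kind"] == "release"),
        key=lambda a: a["created"],
        reverse=True,
    )
    for index, art in enumerate(releases):
        protected = "LTS" in art["labels"] or art["name"].startswith(KEEP_PREFIXES)
        if index >= keep_releases and not protected:
            to_delete.append(art["name"])

    # Rule 2: delete temporary builds by age unless pinned or prefixed.
    max_age = timedelta(days=temp_max_age_days)
    for art in artifacts:
        if art["kind"] != "temp":
            continue
        protected = "pinned" in art["labels"] or art["name"].startswith(KEEP_PREFIXES)
        if not protected and now - art["created"] > max_age:
            to_delete.append(art["name"])
    return to_delete
```

Keeping the planning step separate from deletion makes it possible to review the list with the teams before anything is removed.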
For Docker/OCI you often need separate rules. The latest tag says nothing by itself: it’s just a movable pointer. It’s usually better to define rules by tag pattern: keep main and release tags longer, clean feature/* and nightly tags faster, and retain the last 5–10 images per service for rollbacks. Also verify that you don’t delete images still referenced by clusters or deployment manifests.
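Pattern-based rules for image tags, together with the "still referenced" check, can be sketched as follows. The patterns and lifetimes are illustrative assumptions. Docker tags cannot contain "/", so feature branches are assumed here to be published as feature-<name>. The "last 5–10 images per service" rule is left out to keep the example short.

```python
from fnmatch import fnmatchcase

# (pattern, days to keep); the first matching pattern wins. Values are assumptions.
TAG_RULES = [
    ("release-*", 365),
    ("main*", 90),
    ("feature-*", 7),
    ("nightly*", 7),
]


def days_to_keep(tag, default_days=30):
    """Look up the lifetime for a tag from the first matching pattern."""
    for pattern, days in TAG_RULES:
        if fnmatchcase(tag, pattern):
            return days
    return default_days


def deletable_tags(tag_ages, referenced):
    """tag_ages maps tag -> age in days; referenced holds tags still deployed."""
    return sorted(
        tag
        for tag, age in tag_ages.items()
        if tag not in referenced and age > days_to_keep(tag)
    )
```

The set of referenced tags would typically be collected from cluster state or deployment manifests. How that is done depends on the platform and is outside this sketch.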
Snapshots are easier when they have short TTLs and auto-cleanup enabled. But pin any snapshots used by testbeds or regression checks. For example QA may request to keep builds under test until the release cycle is closed.
Agree retention with teams using concrete scenarios. If a developer needs to roll back to a two-week-old version but rules delete it after 7 days, the rollback will fail. Either change thresholds or introduce an intermediate rule: anything that reached staging lives at least 30 days.
Disk control and maintenance
After replacing Nexus with open source problems often arrive later: disk unexpectedly fills, builds fail, and there’s no time to investigate. Volume control and regular maintenance should be mandatory parts of operation.
Limits and monitoring
Start with thresholds and clear alerts. Alerts should reach the people who can act, not just a single admin.
- 70% used: warning, assess growth and upcoming releases
- 85%: urgent cleanup per retention rules, stop unnecessary publishes
- 95%: emergency mode, temporarily forbid push, schedule maintenance window
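The three thresholds map directly onto a small check that a cron job or monitoring agent could run. This sketch uses Python's standard shutil.disk_usage. The level names are assumptions, and the path to check is whatever volume holds your blob storage.

```python
import shutil


def disk_alert_level(used_fraction):
    """Translate a used-space fraction (0..1) into an alert level."""
    if used_fraction >= 0.95:
        return "emergency"  # forbid pushes, schedule a maintenance window
    if used_fraction >= 0.85:
        return "urgent"  # clean up per retention rules
    if used_fraction >= 0.70:
        return "warning"  # assess growth and upcoming releases
    return "ok"


def check_path(path):
    """Return the alert level for the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    return disk_alert_level(usage.used / usage.total)
```

For example, check_path("/var/lib/repo") would report the level for that volume; the path here is a placeholder.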
To understand what grows, produce a weekly consumption report: top repositories, top projects, largest artifacts. This quickly identifies issues like nightlies published as releases or gigabyte-sized debug symbols with no lifecycle.
Cleanup and backups
Cleanup should be regular and predictable. A fixed schedule with a short maintenance window is better than rare “deep cleans” in the middle of a release. Test rules on a copy or a single repository before enabling auto-clean to avoid accidental deletions.
Proxy caches and temporary files can usually be cleaned safely: caches are rebuilt on the next download, whereas locally published artifacts cannot be restored that way. Check where temporary upload directories and incomplete sessions live.
Backups should cover both metadata and blobs:
- metadata (DB, configs, users/roles, settings)
- binaries (blob storage, artifacts)
- frequency: metadata daily, binaries according to your RPO (daily incremental backups and a weekly full backup are often sufficient)
If a proxy repo grew by hundreds of gigabytes in a month while local releases barely changed, the cache is usually to blame. TTLs, regular cleanup of unused packages and reports of what is cached but no longer requested all help.
Example migration scenario for a medium project
Imagine a medium product: 3–4 teams, CI builds, using Maven (Java), npm (frontend) and Docker images for test and prod environments. The goal is to replace Nexus with open source without stopping releases or losing artifacts.
First, deploy the new repository in a separate environment and configure proxies for external sources (Maven Central, npm registry, base Docker images). This reduces internet dependence and lets you test caching without touching current pipelines.
Migration then proceeds in waves. It’s convenient to start with releases: they change rarely, and it’s easier to verify integrity, signatures and metadata. After the releases, move snapshots, then branch builds (feature, hotfix), which usually create most of the "trash."
Example schedule:
- Day 1–2: deploy new repo, configure proxies and basic repositories (maven-releases, maven-snapshots, npm, docker).
- Day 3–4: migrate release artifacts and images, verify checksums and tags.
- Day 5: migrate snapshots and branch builds, enable initial retention policies.
- Day 6: switch one project in CI to publish to the new repo and observe.
- Day 7: switch the remaining projects and teams.
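The checksum verification on days 3–4 can be as simple as comparing SHA-256 digests recorded from the old repository with files downloaded from the new one. A minimal sketch, assuming you already have the expected digests and local copies of the files (how they are exported and fetched is product-specific):

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatches(expected, downloaded_paths):
    """expected: name -> sha256 hex digest; downloaded_paths: name -> local file.

    Returns names that are missing from the new repo or differ in content.
    """
    bad = []
    for name, want in expected.items():
        path = downloaded_paths.get(name)
        if path is None or sha256_of(path) != want.lower():
            bad.append(name)
    return sorted(bad)
```

An empty result means every listed artifact arrived intact. Anything returned should be re-transferred before the freeze is lifted.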
Start with simple permissions: developers read and fetch dependencies, CI publishes using a separate token, admins manage repos and deletion rules. This reduces the risk of accidental release overwrite or publishing to the wrong repo.
Finish by updating developer configs (settings.xml, .npmrc, docker login), measuring download speed and disk load, and keeping the old repository read-only for another 2–4 weeks. After this safe period, once releases are being built reliably from the new store, decommission the old server.
Common mistakes and how to avoid them
The most expensive mistake is starting the move with "what’s easiest" without a full picture. Rare repos (an old npm or private Docker registry) used once a month often break a release at the worst moment. Remedy: before copying, document formats, repos, proxies, cleanup schedules and real clients (CI, developers, build agents).
A second common issue is overly broad CI permissions. If a build token can delete, overwrite or publish into a release repo, a wrong tag is enough to lose trust in the store. Separate rights: CI publishes to snapshots or staging, while promotion to releases is a separate role and pipeline step.
How not to break builds on cutover day
Breakages usually come from client settings: mirrors, paths, certificates, repo names. Allocate time for a test with one real service, one real pipeline and a full clean build. Keep a rollback plan: how to quickly restore the old URL or proxy if the new repo turns out to be missing something.
Retention and backups: be cautious with automation
It’s risky to enable retention immediately after transfer. A typical scenario: a “keep the last 10 versions” rule deletes the only usable tag because it happened to be older than the ten newest builds. Run retention in report mode first, agree on exclusions (releases, tags, important branches), and only then enable deletion.
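Report mode can be modelled as a flag that defaults to "do nothing". In this sketch the delete callback and the exclusion prefixes are hypothetical. A real implementation would call whatever deletion API your repository exposes.

```python
def apply_retention(candidates, delete, exclusions=(), dry_run=True):
    """Report what retention would do; delete only when dry_run is False.

    candidates: artifact names selected by the retention rules
    delete: callable invoked with a name to actually remove it
    exclusions: name prefixes that must never be deleted
    """
    report = []
    for name in candidates:
        if any(name.startswith(prefix) for prefix in exclusions):
            report.append((name, "kept (excluded)"))
        elif dry_run:
            report.append((name, "would delete"))
        else:
            delete(name)
            report.append((name, "deleted"))
    return report
```

Defaulting dry_run to True means a forgotten argument produces a report rather than data loss.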
Before you start check basics:
- do you have backups not only of blobs but also of metadata (indexes, DB)?
- have you tested restore on a separate machine?
- is disk growth measured and are alerts configured?
- is the DNS/URL change and rollback procedure documented?
- have you listed non-obvious clients (old agents, scripts, cron jobs)?
If a Maven project has been built for years with one settings.xml, changing the repository name without an alias will likely break builds even if all artifacts are already migrated.
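One way to find such files before cutover is to scan settings.xml copies (from build agents, developer templates and CI images) for URLs that still point at the old host. A minimal sketch using Python's standard XML parser; the host name in the usage note is a placeholder.

```python
import xml.etree.ElementTree as ET


def repository_urls(settings_xml):
    """Return every <url> value in a Maven settings.xml (text or bytes)."""
    root = ET.fromstring(settings_xml)
    urls = []
    for element in root.iter():
        # Tags look like "{namespace}url" when the file declares an xmlns.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "url" and element.text:
            urls.append(element.text.strip())
    return urls


def stale_urls(settings_xml, old_host):
    """URLs in the file that still reference the old repository host."""
    return [url for url in repository_urls(settings_xml) if old_host in url]
```

For example, stale_urls(open("settings.xml", "rb").read(), "nexus.example.internal") lists the mirror and repository entries that need either an alias or an update.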
Short checklist and next steps
To prevent replacing Nexus with open source from becoming endless manual work, close the basic questions before final cutover. Agreeing and documenting rules once usually costs less than later investigating missing tags, mixed-up permissions or sudden disk exhaustion.
Check the minimum:
- Formats and sources recorded: which packages are stored (Maven, npm, Docker), which repositories are needed and who owns each.
- Permissions documented: who reads, who publishes, who administers, and what counts as "prod" (e.g., forbid overwriting releases).
- Retention policy agreed: how many versions to keep, what to delete, what exceptions apply (critical releases, regulatory needs, "golden" images).
- Disk control set: thresholds, alerts, scheduled cleanup and a clear emergency procedure for overflow.
- Test run completed: migrate part of the packages to a testbed, verify builds and have a documented rollback plan.
Next steps: proceed gradually so you don’t break development mid-sprint. Schedule the cutover window and tell teams which operations will be frozen (publishing, deletion, creating new repos). Before the window, run the critical pipelines (release, clean build, dependency fetch) in an isolated environment.
Practical next actions:
- Confirm the cutover date and brief rules for migration day.
- Prepare instructions for teams: addresses, tokens, publishing rules.
- Final verification: storage size, speeds, permissions, retention, alerts.
- Plan a post-migration audit in 1–2 weeks: what grew, what to clean, where restrictions are missing.
If you need help designing and implementing the migration you can involve a systems integrator. For example, GSE.kz (gse.kz) can handle the infrastructure side: server selection and delivery, deployment and 24/7 support.
FAQ
Why replace Nexus or Artifactory with open source?
Usually it comes down to license cost and vendor lock-in, plus operational pains: disk fills up, manual cleanups, slowed builds because of indexing and heavy metadata, drifting permissions, accumulation of snapshots and duplicates. When the repository becomes a single point of failure, any problem immediately stops builds and releases.
How do I choose which open source repository is right for me?
Start by listing the formats you actually use now and those you expect in the next 6–12 months: Maven, npm, PyPI, NuGet, Docker/OCI and any rare formats. Then check how the candidate solution implements proxy/hosted/group modes (names may differ). Those modes determine caching of external sources and whether you can present a single URL to clients.
Do I need to keep old URLs during migration?
In most cases it’s critical to keep familiar URLs and repository structure to avoid editing hundreds of CI and developer configs. This is often solved by a DNS alias or a reverse proxy that accepts the old requests and forwards them to the new service. Introduce new URLs only if you’re prepared to update all client settings and want to standardize naming immediately.
What must I collect before starting migration?
Start with an inventory: list repositories and their roles, sizes and growth, who and how publishes artifacts, which external sources are proxied, and which integrations depend on the repository. Also mark manual uploads and nonstandard clients — those are the ones that usually break during cutover.
What’s safer: run in parallel or switch at once?
Parallel run is generally safer: the old repository stays active while the new one serves dependencies and gradually accepts publishes. In practice it’s convenient to switch reads to the new repository first and keep writes on the old one for a short period until builds are stable. A single-moment cutover should be chosen only with excellent preparation and a well-defined change window.
How to determine RPO/RTO requirements for repository migration?
Define RPO and RTO in advance. If RPO is “we don’t lose a single published package”, you need synchronization before cutover and a defined freeze for publishing to the old system. If RTO is “builds must be working within an hour”, prepare a fast address-switch procedure, a warmed cache for critical dependencies, and a tested rollback plan.
How to migrate permissions without breaking access?
First document the access model: local users, LDAP/AD groups, roles, CI service accounts and tokens. In the new repository create separate service accounts for CI and give them minimal permissions: publish only to specific hosted repositories, and only read access to required proxy/group repos. That lowers the risk of accidental release overwrites or publishing to the wrong place.
What to look for first in a Docker/OCI registry?
Check whether you can prevent overwriting release tags and how cleanup works: removal of old tags, unused layers and growth control. The practical minimum is release protection, a predictable retention policy and compatibility with your build and deployment pipelines. If these are missing, problems usually appear a few months later when disk usage spikes.
How to set retention so I don’t delete important versions?
Start with simple rules: keep releases for a long time and delete temporary builds by age. Snapshots and branch builds usually need short lifetimes, but allow exceptions for artifacts used by test beds or regression runs. Before enabling automatic deletion, run retention rules in report (dry‑run) mode so you don’t remove important artifacts.
How to avoid disk overrun and what to do with backups?
Set clear disk-fill thresholds and alerts, and run regular reports showing what consumes space. Have a scheduled cleanup and backups covering both metadata (configs, users, roles, DB) and blobs. Crucially, test restore on a separate testbed at least once — otherwise backups might be unusable in a real incident.