Sep 21, 2025·8 min

On‑premise chatbot platforms: Rasa, Botpress and Microsoft Bot Framework

On‑premise chatbot platforms: how to compare Rasa, Botpress and Microsoft Bot Framework on NLU, integrations, dialog logs and security.

Why choose a local chatbot platform without external cloud

A chatbot quickly becomes a destination for real business data. Even if it “just answers questions,” users tend to write whatever is most convenient: full name, phone number, IIN (individual identification number), address, request details, sometimes scans and certificates. In corporate scenarios you add logins, contract numbers, amounts, payment statuses, data about patients or students.

Dialogs typically include personal data and identifiers, financial information, medical and social details, internal records (internal requests, incidents), fragments of documents and correspondence.

That is why many choose on‑premise chatbot platforms. The reasons are pragmatic: you need full control over where data is stored and who can access it; there are regulator and internal security requirements; you don't want to depend on external providers and traffic routes. In a protected contour (an isolated, access‑controlled network perimeter) it is important not only to “not send data outside” but also to be able to prove it: logging, network segmentation, key management and backups.

There is also common confusion in terminology. NLU is the part that recognizes intent and entities in text. Scenarios are the logic of steps and branches. Channels are where the bot “lives” (web chat, messenger, contact center). A platform can be good at text recognition but weak in integrations or dialog storage. This gap often appears only after a polished pilot.

You can deploy Rasa, Botpress and Microsoft Bot Framework locally. But beforehand, assess the surrounding systems without which a pilot won’t become a product: where the bot will run (VMs, Kubernetes, standalone servers), how it will connect to your systems (CRM, Service Desk, LDAP/AD), where to store dialogs and metrics (DB, SIEM, retention), how to arrange access and audit.

In organizations in Kazakhstan, where sovereignty and supply chain transparency requirements are especially noticeable, such implementations are often done with an integrator. For example, GSE.kz as a manufacturer and system integrator helps build a secure infrastructure and support so the solution doesn’t “fall apart” after the pilot.

Start by fixing requirements and constraints

Platform selection doesn't start with feature comparison but with a short 1–2 page document: where the bot will live and which rules cannot be violated. Without this, it’s easy to pick "the most convenient" system and later discover it fails security checks or doesn't support needed channels.

The first decision is placement. This can be your own data center, a private cloud or a fully isolated contour with no internet access. That determines installation and updates, and which external services are allowed at all. If you consider on‑premise chatbot platforms, clarify whether internet access is permitted for model downloads, telemetry, licensing and certificate checks.

Next, consider operations: who administers, who is responsible for patches and how quickly vulnerabilities are fixed. A platform can be convenient for development but heavy to maintain: updates may break connectors and rollback may be poorly thought out.

To avoid missing something, document a minimum set of requirements:

  • Contour and access: network segments, logging requirements, prohibition of external APIs.
  • Reliability: redundancy, uptime and recovery time objectives.
  • Scaling: expected peaks and concurrent dialog counts.
  • Channels: web chat on a portal, messengers, integration with a call center or Service Desk.
  • Languages: Russian and Kazakh (often critical), plus English if needed.

If a government body needs a bot in a protected contour with Russian and Kazakh and integration into an internal portal, network and update requirements will outweigh “ready‑made” cloud connectors. Such a list immediately filters out unsuitable options and saves weeks of pilot work.

Briefly about Rasa, Botpress and Microsoft Bot Framework

When comparing on‑premise chatbot platforms, it helps to separate two things: how the bot “understands” the user (NLU) and how you build logic, channels, storage and maintenance. The three options organize these parts differently.

Rasa is often chosen when you need highly customized NLU and complex dialog rules. It’s for teams with developers ready to work on training data, experiments and tuning. Rasa’s strength is deep control over intent and entity recognition and the ability to precisely tailor behavior to processes.

Botpress is usually picked when it’s more important to assemble scenarios quickly and work with visual dialogs. It’s commonly deployed next to internal services and configured so logs and dialogs remain inside the contour. This approach suits teams with an analyst and a developer for integrations who don’t want to constantly “live” in NLU settings.

Microsoft Bot Framework is a logical choice when the corporate environment is built on Microsoft and you need familiar development tools, accounts, access policies and integrations. It fits enterprises that already have standards around .NET, Active Directory and internal gateways.

In any of these three, you must still pick and agree on channels, dialog and metadata storage (and retention), monitoring and logging (what to write and what to mask), authentication and roles, and deployment contours (test, pilot, production).

In protected‑contour projects this often matters more than the platform brand: the final architecture depends on security, data and who will support the solution after the pilot.

How to evaluate NLU quality without self‑deception

NLU evaluation easily turns into a pretty report that fails on day one in production. To really understand what on‑premise chatbot platforms can do, test on data similar to yours and fix evaluation rules in advance.

Start with metrics but don’t limit yourself to “overall accuracy.” It’s more useful to see where the model confuses things and what it doesn’t recognize:

  • Precision and recall per intent: separately for each intent.
  • Confusion matrix: which intents the model systematically mixes up.
  • Entity extraction quality: correctness of values and boundaries (for example, contract number, IIN, date).
  • Share of requests the model should mark as unknown (and actually does).
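
The metrics above can be computed with a few lines of code once you have gold labels and model predictions for the same test phrases. The following is a minimal sketch, not tied to any of the three platforms; the intent names, the `unknown` label and the sample data are made up for illustration.

```python
from collections import Counter

UNKNOWN = "unknown"  # hypothetical label for out-of-scope requests

def per_intent_report(gold, predicted):
    """Return ({intent: (precision, recall)}, confusion counter)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    confusion = Counter(zip(gold, predicted))  # (true, predicted) -> count
    report = {}
    for intent in sorted(set(gold) | set(predicted)):
        tp = confusion[(intent, intent)]
        predicted_total = sum(c for (_, p), c in confusion.items() if p == intent)
        gold_total = sum(c for (g, _), c in confusion.items() if g == intent)
        precision = tp / predicted_total if predicted_total else 0.0
        recall = tp / gold_total if gold_total else 0.0
        report[intent] = (precision, recall)
    return report, confusion

def unknown_recall(gold, predicted):
    """Share of truly out-of-scope requests that the model marked as unknown."""
    total = sum(1 for g in gold if g == UNKNOWN)
    hit = sum(1 for g, p in zip(gold, predicted) if g == UNKNOWN and p == UNKNOWN)
    return hit / total if total else 0.0

# Illustrative data only: five test phrases with gold and predicted intents.
gold = ["reset_password", "reset_password", "vpn_issue", "unknown", "unknown"]
pred = ["reset_password", "vpn_issue", "vpn_issue", "unknown", "reset_password"]

report, confusion = per_intent_report(gold, pred)
for intent, (precision, recall) in report.items():
    print(f"{intent}: precision={precision:.2f} recall={recall:.2f}")
print("unknown recall:", unknown_recall(gold, pred))
```

The `confusion` counter holds the off-diagonal pairs, for example `("reset_password", "vpn_issue")`, which show what the model systematically mixes up.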

Collect the test dataset from real inquiries: support chats, emails, tickets. Anonymize personal data and internal identifiers. Watch class balance: if one intent appears 10 times more often, the model will overpredict it. A good practice is to keep a “pilot” training set for development and a separate frozen set for final comparison.

Check robustness — often more important than perfect examples from the training set:

  • typos, missing spaces, varied casing;
  • slang and abbreviations ("VPN not working", "account blocked");
  • language mixing (Russian + Kazakh + English terms);
  • long messages with multiple requests.

Also evaluate failures: how the bot behaves on unknown requests. If an employee writes: "Can't log into 1C, password not accepted, and the printer won't print either," and the intent is unclear, the correct reaction is a clarifying question or escalation to an operator with context transfer, not a confident but incorrect answer.
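
One way to express this behavior is a small decision rule on top of NLU confidence scores. The sketch below is an assumption about how such a rule might look, not the built-in logic of Rasa, Botpress or Bot Framework; the thresholds `accept`, `margin` and `floor` are hypothetical and should be tuned on your frozen test set.

```python
from dataclasses import dataclass

@dataclass
class NluResult:
    intent: str
    confidence: float

def decide(candidates, accept=0.75, margin=0.15, floor=0.40):
    """Return (action, intents) where action is 'answer', 'clarify' or 'escalate'."""
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    if not ranked or ranked[0].confidence < floor:
        # Nothing plausible: hand over to an operator together with the context.
        return "escalate", []
    top = ranked[0]
    runner_up = ranked[1].confidence if len(ranked) > 1 else 0.0
    if top.confidence >= accept and top.confidence - runner_up >= margin:
        return "answer", [top.intent]
    # Plausible but ambiguous: ask which of the top two intents is meant.
    return "clarify", [c.intent for c in ranked[:2]]

# A message that mixes a login problem and a printer problem.
mixed = [NluResult("printer_problem", 0.50), NluResult("login_problem", 0.55)]
print(decide(mixed))
```

With these scores neither intent clears the `accept` threshold, so the rule returns `clarify` with both candidates instead of picking one.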

Compare models fairly: same data, same post‑processing rules (synonyms, normalization, confidence thresholds) and the same escalation scenarios. Otherwise settings, not algorithms, will win.

Integrations: what to check in advance

In on‑premise projects integrations often determine a pilot’s fate. Even if NLU looks convincing in a demo, in a real environment the bot must call CRM, Service Desk, ERP and sometimes several systems at once. For on‑premise platforms this is especially important: all connections, rights and logs remain inside your contour.

Start with a data map: which fields the bot reads and writes, where personal data sits and who owns each system. This helps avoid risky choices like giving the bot direct access to production databases without an intermediate layer.

A practical pattern is to route calls through an API gateway. It provides a single control point for access, rate limits and clear logging. When security asks who called the CreateTicket method and when, it’s easier to answer if all calls go through one entry.

Not all integrations must be synchronous. If the bot creates an incident, triggers an approval or generates a document in ERP, asynchronous exchange via queues or events is usually more reliable: the bot responds quickly and the result arrives later (e.g., a ticket number). This reduces dependence on slow external systems.
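
The asynchronous pattern can be illustrated with an in-process queue. This is only a sketch: a real deployment would more likely use a message broker inside the contour, and `create_ticket` here is a hypothetical stand-in for the slow Service Desk or ERP call.

```python
import itertools
import queue
import threading
import uuid

jobs = queue.Queue()
results = {}                      # request_id -> ticket number (in-memory stand-in)
_ticket_numbers = itertools.count(1)

def create_ticket(text):
    """Hypothetical stand-in for a slow Service Desk or ERP call."""
    return f"INC-{next(_ticket_numbers):04d}"

def submit(text):
    """Called from the dialog: enqueue the job and return at once."""
    request_id = str(uuid.uuid4())
    jobs.put((request_id, text))
    return request_id

def worker():
    """Background consumer: performs the slow call and stores the result."""
    while True:
        request_id, text = jobs.get()
        try:
            results[request_id] = create_ticket(text)
        finally:
            jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

request_id = submit("Printer on floor 3 does not print")
jobs.join()                       # demo only: wait until the worker is done
print(results[request_id])        # INC-0001
```

The bot can reply "request accepted" as soon as `submit` returns and send the ticket number when the result appears.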

Also verify user entry: AD/LDAP and SSO. Understand how the identifier is passed, how roles are assigned and what happens if a token expires.

Acceptance criteria for integrations are best recorded in advance:

  • Response times for key actions (e.g., client lookup, ticket creation).
  • Stability under peaks and limits on internal APIs.
  • Clear user errors and clear logs for support.
  • Retry on failure and duplicate protection.
  • Environment separation (test, pilot, prod) without manual edits.
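
The "retry on failure and duplicate protection" criterion is easiest to check when every call carries an idempotency key. The sketch below assumes that the target system (or the API gateway in front of it) deduplicates by that key; `FakeServiceDesk`, `TransientError` and the key format are invented for the example.

```python
import time

class TransientError(Exception):
    """Raised for timeouts and similar failures that are safe to retry."""

def call_with_retry(operation, idempotency_key, attempts=3, base_delay=0.5,
                    sleep=time.sleep):
    """Call operation(key), retrying transient errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation(idempotency_key)
        except TransientError:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))

class FakeServiceDesk:
    """Test double: stores the ticket, then times out on the first call."""
    def __init__(self, fail_first=1):
        self.tickets = {}
        self.calls = 0
        self.fail_first = fail_first

    def create_ticket(self, key):
        self.calls += 1
        if key not in self.tickets:          # deduplication by idempotency key
            self.tickets[key] = f"INC-{len(self.tickets) + 1:04d}"
        if self.calls <= self.fail_first:    # the reply is lost after the write
            raise TransientError("timeout")
        return self.tickets[key]

desk = FakeServiceDesk()
ticket = call_with_retry(desk.create_ticket, "chat-42-msg-7", base_delay=0)
print(ticket, desk.calls, len(desk.tickets))
```

The double simulates the worst case for duplicates: the ticket is written, but the reply is lost. Because the retry reuses the same key, the second call returns the existing ticket rather than creating a new one.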

Example: an employee in a protected contour asks the bot for ticket status. The bot authenticates via SSO, calls the Service Desk through an API gateway, and if the system is unavailable returns a clear message to the user and records the technical cause in logs for administrators.

Storing dialogs and logging without risks

In on‑premise projects the main risk is often not where the NLU model sits but how you store and expose conversation history. For an on‑premise chatbot platform decide in advance what you save, who sees it and how quickly it can be deleted.

Split data by purpose. This makes it easier to grant access selectively: for example, support sees dialog texts while the security team sees only technical events:

  • Dialog texts: user messages and bot replies, attachments, session context.
  • Technical events: errors, timeouts, integration traces, request IDs.
  • Metrics: number of dialogs, scenario success rates, response time, load.

Store dialogs in a managed database with access control and backups. Flat files may be acceptable for prototypes, but they are harder to protect and difficult to query. Technical events and metrics are often better sent to centralized logs or a SIEM so the security team can spot anomalies and have a single audit source.

Then define a retention policy. Set retention periods (e.g., 30–90 days for texts and longer for anonymized metrics), deletion rules and export procedures for user requests or incident investigation. Shorter retention makes approvals easier.
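
A retention rule is easier to approve when it is an explicit, testable job rather than a manual procedure. The following sketch uses SQLite only to stay self-contained; the table name `dialog_messages`, its columns and the 90-day value are assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 90  # assumed value; take the real one from your policy

def purge_expired(conn, now, retention_days=RETENTION_DAYS):
    """Delete dialog texts older than the retention period; return rows removed."""
    # ISO-8601 strings compare correctly only if all are stored in UTC
    # in the same format, which this sketch assumes.
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "DELETE FROM dialog_messages WHERE created_at < ?", (cutoff,)
        )
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dialog_messages ("
    "id INTEGER PRIMARY KEY, created_at TEXT NOT NULL, text TEXT NOT NULL)"
)
now = datetime(2025, 9, 21, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO dialog_messages (created_at, text) VALUES (?, ?)",
    [
        ((now - timedelta(days=120)).isoformat(), "old message"),
        ((now - timedelta(days=10)).isoformat(), "recent message"),
    ],
)
deleted = purge_expired(conn, now)
print("deleted rows:", deleted)
```

Passing `now` as a parameter keeps the job deterministic in tests; in production it would be the current UTC time.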

The most common mistake is logging personal data as‑is. Configure masking before writing: phones, IIN, emails, card numbers, addresses. If analytics require raw text, store it separately and tightly restrict access.
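
Masking before writing can start with a small set of regular expressions. The patterns below are deliberately simple assumptions (12-digit IIN, 16-digit card number, phone numbers starting with +7 or 8) and will miss other formats; addresses and names cannot be caught reliably this way, so treat this as a starting point.

```python
import re

# Order matters: longer digit patterns are masked first so that a card
# number is not partially matched as an IIN or a phone number.
PATTERNS = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("CARD", re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")),
    ("IIN", re.compile(r"\b\d{12}\b")),
    ("PHONE", re.compile(
        r"(?<!\d)(?:\+7|8)[ -]?\(?\d{3}\)?[ -]?\d{3}[ -]?\d{2}[ -]?\d{2}(?!\d)"
    )),
]

def mask_pii(text):
    """Replace recognized identifiers with a label such as [IIN] or [PHONE]."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text

raw = "My IIN is 900101300123, call +7 701 123 45 67 or mail a.b@example.kz"
print(mask_pii(raw))
# My IIN is [IIN], call [PHONE] or mail [EMAIL]
```

Call such a function at the single point where messages are handed to the logger or storage layer, so that no code path can write unmasked text.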

Treat access control as a process, not an agreement:

  • roles and least privilege (support, analytics, admins);
  • separated environments (test logs must not contain real PII);
  • audit of actions: who viewed, exported or deleted data;
  • two‑factor authentication for admins;
  • periodic review of access lists and keys.

This approach is crucial in protected contours of government agencies, banks and large enterprises where conversation history quickly becomes part of critical data.

Security requirements: what the security team will ask

Security usually doesn't start with engine selection but with questions about risks and controls. For projects where on‑premise chatbot platforms matter, describe which data the bot sees, where it writes and who can change its behavior.

The first request will be a simple threat model. Consider an insider (admin or developer access), leaked integration tokens (CRM, email, Service Desk) and vulnerabilities at integration points where the bot accesses internal systems.

Common questions include:

  • In which contour does the bot live and where is it allowed to go on the network?
  • Which secrets does the bot use (tokens, API keys), where are they stored and how are they rotated?
  • Where are dialogs and logs stored, how long are they retained and who can access them?
  • How are roles separated: platform admin, developer, support operator?
  • How are components updated and vulnerabilities fixed without downtime?

Next they check segmentation and isolation. A good practice is separate subnets for the bot and for integrations with explicit rules for allowed ports and services. In a protected contour it is important that dialogs, telemetry and backups do not leave the perimeter.

Secrets should not be in config files or code. Use a unified issuance and rotation mechanism and a fast revocation path if a token is compromised.

For logging there are two layers: bot runtime logs (errors, integration calls) and admin audit (who changed settings, NLU models, permissions). Logs must be protected from deletion and tampering.
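
One common technique for making tampering detectable is a hash chain: each audit entry stores the hash of the previous one. The sketch below illustrates the idea only; the field names are made up, and a production setup would also ship entries to a separate system (for example a SIEM), because a chain alone does not reveal that the newest entries were cut off.

```python
import hashlib
import json

GENESIS = "0" * 64  # fixed "previous hash" for the first entry

def _digest(body):
    """SHA-256 over a canonical JSON form of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()

def append_entry(log, actor, action):
    """Append an audit entry linked to the previous one by its hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"actor": actor, "action": action, "prev": prev_hash}
    log.append({**body, "hash": _digest(body)})

def verify(log):
    """Return True if no entry was edited, reordered or removed from the middle."""
    prev_hash = GENESIS
    for entry in log:
        body = {"actor": entry["actor"], "action": entry["action"], "prev": entry["prev"]}
        if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True

audit_log = []
append_entry(audit_log, "admin1", "changed NLU model")
append_entry(audit_log, "admin2", "granted export permission")
print(verify(audit_log))          # True

audit_log[0]["action"] = "nothing happened"
print(verify(audit_log))          # False
```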

How to prepare for review

Gather a minimal package: network diagram, list of integrations, data inventory for dialogs, role matrix, and update plan. In projects requiring closed segments this package speeds up security approvals and reduces the risk of rework during the pilot.

An example question that often appears late

If a bot helps HR or IT support, dialogs quickly contain IIN, ticket numbers, full names and incident details. Without masking rules, retention times and access separation, security can block launch even if NLU performs well.

Step‑by‑step plan to choose a platform for a pilot

Pilots work better when you compare platforms under equal conditions. Then “which is better” becomes numbers and clear risks. For companies that need on‑premise chatbot platforms this is especially critical: network and data constraints quickly break beautiful demos.

First, list 10–20 typical bot tasks and the channels where it will live (web widget, corporate messenger, contact center). Mark which data is sensitive and must not be sent outside.

Prepare a unified set of test dialogs and success criteria. For example: intent recognition accuracy on “noisy” phrases, share of dialogs without operator involvement, response time, correct handoff to a human.

Run the pilot so the comparison is fair:

  • Build a minimal working bot on each platform with the same integrations (e.g., employee directory and ticketing system).
  • Run NLU tests and check resilience: what happens when a database goes down, an API is unavailable or a container restarts.
  • Test load: simultaneous session peaks, log growth, recovery time.
  • Evaluate operations: monitoring, backups, updates, roles and audit.
  • Record deployment constraints: OS, containers, DBs, isolation requirements.

In the end, consolidate results in a single table and add total cost of ownership for 1–3 years: licenses, hardware, team effort, support and security risks. If the pilot runs in a protected contour, agree with security in advance where logs will be stored and who can access them to avoid architecture rework later.

Typical mistakes when choosing an on‑premise chatbot

The most common mistake is buying the “feeling from a demo.” Demos usually show an ideal scenario: clean phrases, a single integration, no access restrictions. Real quality appears only on your test dataset: real user requests, your terminology, your typos and your channels.

Another issue is mixing NLU and business logic into a single blob without clear rules. Then it’s unclear whether to fix errors in intents, dialog rules or integrations. A mature sign is clear boundaries: what NLU recognizes, what dialog rules exist and where integration logic lives.

Many pilots start without a dialog retention policy. Later it turns out logs contain IIN, phone numbers, addresses, medical data or credentials and security halts the project. Decide upfront what to log, how to mask sensitive fields and how long to keep data.

Support is often underestimated. On‑premise is not “install and forget.” You need monitoring, backups, updates, certificate management, access control and an incident response plan. If the platform updates often, decide who will test changes before rollout.

Finally, the risk of single‑person dependency: if the bot relies on one person who “knows everything,” changes become expensive and risky. Minimum safeguards:

  • documentation for scenarios and integrations;
  • automated tests for dialogs and NLU;
  • a clear release and rollback process;
  • environment scheme: dev, test, prod;
  • a list of components and their owners.

In protected contours this is especially important: when security and operations join, the project survives only with transparent rules and reproducible builds.

Quick checklist before the final decision

Before choosing a platform and signing a budget, check basic things that usually surface after the pilot. For an on‑premise chatbot platform it matters not only whether it "can answer" but how it will live in your contour for months.

  • Deployment: installs locally and doesn't require critical external dependencies (mandatory SaaS, hidden APIs, external keys).
  • Data: it’s clear where dialogs, logs and files physically reside and how access is limited (roles, admin rights, action audit).
  • Operations: a plan for updates, rollback and backups (what is backed up, how often, how restore is verified).
  • NLU: tested on your phrases, with metrics (precision, recall, unknown share) and thresholds that block releases.
  • Integrations and load: connections work in the target network segment, survive peaks and don’t fail due to timeouts or queues.
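
"Thresholds that block releases" from the NLU item can be implemented as a small gate in the build pipeline. The sketch below is an assumption about how such a gate might look; the metric names and threshold values are illustrative and should come from your own pilot.

```python
def release_blockers(metrics, thresholds):
    """Return human-readable reasons that block the release (empty list = go)."""
    blockers = []
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        if value is None:
            blockers.append(f"{name}: metric missing")
        elif value < minimum:
            blockers.append(f"{name}: {value:.2f} < {minimum:.2f}")
    return blockers

# Illustrative numbers only.
thresholds = {"precision": 0.85, "recall": 0.80, "unknown_recall": 0.70}
metrics = {"precision": 0.91, "recall": 0.76, "unknown_recall": 0.88}

blockers = release_blockers(metrics, thresholds)
print(blockers)  # ['recall: 0.76 < 0.80']
```

In a pipeline, a non-empty list would fail the job, so a model that regresses on the frozen test set never reaches production unnoticed.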

Also check a “plan B” for failures. The bot must escalate to an operator: what the operator sees (dialog context), how data is transferred and what happens if the operator is unavailable. Prepare scenarios for NLU or integration outages: clear user message, incident logging, request retries and time limits.

A practical test: ask the team to show one full path — from incoming message to storage entry and alert. In protected‑contour projects (e.g., government or healthcare) this path usually shows whether the platform is ready for deployment.

A real scenario example: a bot in a protected contour

A clinic or bank wants a chatbot for appointments, inquiries and checking request status. Personal data and dialog history must not leave internal systems. In such cases organizations typically choose on‑premise chatbot platforms and deploy everything inside a protected contour.

A minimum viable pilot often looks like: user identification (e.g., by client number or IIN via an internal service), appointment booking or ticket creation, status check by number and a short FAQ block. This is enough to validate value without overcomplicating the project.

Technically the scheme is simple: the chatbot app runs on a local server, a database for logs and service tables sits nearby, and access to medical or banking systems goes through an internal API gateway with limited rights. Dialog logs are best stored separately from operational data and sensitive fields should be masked from the start.

Pilots test grounded things: NLU quality on real user phrases and the frequency of handoffs to operators, response time at peak hours and resilience to integration failures, correct access rights (who can view logs and export reports) and audit completeness (what changed, when and by whom).

Successful launches often start with a limited contour: one channel, 2–3 scenarios, strict roles and short log retention. Then add topics and channels, expand integrations and gradually increase the bot’s autonomy while keeping security requirements unchanged.

Next steps: pilot, infrastructure and rollout

When requirements (channels, integrations, security, dialog storage) are fixed and 2–3 candidates are chosen, move to a pilot. The pilot must validate your scenario: typical questions, recognition quality, response speed, supportability and compliance with security requirements.

A practical pilot plan:

  • pick 1–2 processes with clear value (e.g., ITSM tickets and employee FAQs);
  • prepare a test set of phrases and dialogs from real users;
  • connect one key integration (AD/LDAP, ITSM or CRM) and verify access rights;
  • configure log storage and mask personal data;
  • agree on success criteria: NLU accuracy, response time, stability and maintenance effort.

Plan infrastructure in parallel. For on‑premise, CPU and RAM matter but so do disks for logs, redundancy and observability. Basic questions: where will containers or services run, where store dialogs, how to back up, how to set up monitoring and who will restore the system after failure.

If the bot is in a protected contour, hardware provenance and local support may be required. In such cases discuss implementation with a system integrator. For example, GSE.kz combines production and integration: for a contour you can select racks and compute on S200 servers, and admin/operator workstations on L200 PCs.

Rollout usually breaks on operations rather than development. Assign roles (bot owner, engineer, security, support), document update and incident procedures and decide whether 24/7 support is needed and how quickly you must recover from an outage.

FAQ

Why do companies choose a chatbot without external cloud?

Most often because dialogs quickly include personal data and internal identifiers — IIN, phone numbers, contract numbers, payment statuses, patient or employee data. A local platform gives control over where this data is physically stored, who can access it and how access is audited.

What should be checked first if the bot must run in a closed contour without internet?

Clarify immediately whether any internet access is needed for model downloads, licensing, certificate checks or telemetry. If such connections are forbidden, plan for fully offline updates and deliver packages through your internal deployment and verification processes.

How do NLU, scenarios and channels differ and why does it matter when choosing a platform?

NLU recognizes intents and entities in text; scenarios define dialog steps and branching; channels are where users interact with the bot (web chat, corporate messengers, contact center). Many pilots fail because they evaluate only NLU while issues appear later in scenarios, dialog storage and integrations.

When does it make sense to choose Rasa, Botpress or Microsoft Bot Framework?

Rasa is usually chosen when you need maximum control over NLU and complex logic and have a team ready to work with training data and tuning. Botpress is convenient when you need to build scenarios quickly and maintain them without constant fine‑tuning of recognition. Microsoft Bot Framework fits well when you already have a Microsoft ecosystem and established practices around .NET, Active Directory and internal gateways.

How to fairly evaluate NLU before going to production?

Use real user requests from support chats, emails or tickets and anonymize sensitive fields. Compare platforms on the same frozen test set and fix post‑processing rules; otherwise settings, not models, will win. Test behavior on typos, mixed languages and long messages with multiple requests.

Which integrations most often break on‑premise pilots and how to avoid that?

Start with a data map: what fields the bot reads and writes, where personal data resides and who owns each system. Prefer routing calls through a single API gateway to centralize access control and logging. For long operations (creating a ticket, approvals) use asynchronous exchanges so the bot responds quickly and does not depend on downstream latency.

How to store dialog history and logs safely to avoid creating a leak yourself?

Separate dialog texts, technical events and metrics, because they require different access rights. Mask IINs, phone numbers, emails and other sensitive fields before writing to storage, not during exports. Define retention periods and deletion procedures up front to avoid surprises during security review.

What questions does the security team almost always ask about a chatbot?

Security teams usually ask where the bot lives, which network rules apply, how secrets (tokens and keys) are handled, how roles and action audit are set up, and how updates and vulnerability fixes are rolled out. Prepare a simple threat model and show how you limit access, where logs are stored and who can change bot behavior. The earlier this is documented, the less rework there is after the pilot.

How to run a pilot so platform comparison is fair?

Define 10–20 typical tasks and one set of success criteria: recognition accuracy, share of operator escalations, response time and resilience to failures. Build a minimal equivalent bot on each platform with the same integrations, storage and logging rules. Also test ops: backups, monitoring, roles, updates and recovery after restarts.

What are the most frequent mistakes when choosing an on‑premise chatbot platform?

A common mistake is underestimating the surrounding infrastructure: dialog storage, audit, backups, monitoring and update procedures. Another frequent error is starting without a masking policy and retention rules, which lets IINs and other sensitive data accumulate in logs. If you lack 24/7 support and operational processes, consider engaging an integrator to take on infrastructure and maintenance; in Kazakhstan, for example, GSE.kz does this.
