For Developers & Engineers

SARAH Code. Coding by voice, by chat, by intent.

Describe what you want built — out loud, in a chat, or on a screen. SARAH Code does the engineering work inside your own long-lived repo, then hands the diff back through whichever channel you started in.

Delivered exclusively over our Private Enterprise IP Network. No CLI to install, no keys to manage, no context to rebuild between sessions, and no public-internet hop between your seat and your workspace.

Built for the developer who already knows what they want — and wants the engineering done correctly, on owned silicon, in their own repo.

Architecture Comparison · 2026

SARAH AI Suite on NVIDIA DGX GB300
vs. OpenClaw / Hermes on a Public-Cloud VPS

Two ways to run an agentic AI platform. One owns the hardware, the memory, the storage, and the network. The other rents all four from a multi-tenant vendor and reaches them over the Public Internet. The architectures are not comparable — and the spec sheets prove it.

The Two Architectures

Same agentic workload — answer a customer call, look up the CRM, book the meeting, send the email. Two completely different stacks underneath.

Sovereign · Our Mini Data Center · GB300

SARAH AI Suite

SARAH Spark 2 Router on the customer premises · up to 400 GE backhaul to our Data Centre · DGX GB300 there runs the LLM for every call · audio never leaves your premises · zero Public-Internet hop.

Edge (on-prem)
SARAH Spark 2 Router · voice path local · audio never leaves the premises
Backhaul
Up to 400 GE · Private Enterprise IP Network · physical fibre · zero Public-Internet hop
DC compute
72× NVIDIA Blackwell Ultra · GB300 full rack · LLM inference served over the up-to-400-GE backhaul
DC VRAM
20 TB HBM3e total · 3 GB dedicated per active conversation
Memory bandwidth
576 TB/s aggregate · per-GPU HBM3e
Storage
Local NVMe at both ends · weights on-DC · per-call working set on-Spark
Vendor reach
Direct peering — Google Cloud, AWS, Azure, Cloudflare
Public-internet exposure
None. The platform is not addressable from the open web.
Tenant model
Single-tenant. The hardware is yours.

Tenant · Public Cloud · Shared

OpenClaw / Hermes on a VPS

Open-source agent framework on a rented GPU instance with Public-Internet ingress.

Compute
1× shared GPU on a rented instance (A10 / A100 / H100 typical)
VRAM
16–80 GB on instance · multi-tenant slot · no per-call dedication
Memory bandwidth
~2–3 TB/s peak · contended with co-tenants
Storage
Cloud block storage · network-attached · ms latency
Network
Shared cloud fabric egressing the Public Internet for any external call
Vendor reach
Public-internet hop to every dependency, even same-cloud services without VPC peering
Public-internet exposure
Full attack surface · public IPs · DDoS vectors
Tenant model
Multi-tenant. Your conversation shares silicon with strangers.

Four Ways to Ask

Voice. Portal. WhatsApp. Telegram.

SARAH Code answers in the channel you opened. Start a feature by voice, watch the diff land in the portal, get the result on WhatsApp or Telegram. Same workspace, same memory, same engineer.

By voice

Call your SARAH number. Describe the change. "Add a refund button to the order page, only show it to admins." SARAH confirms scope and texts you when the diff is ready.

In the portal

An IDE-class workspace under your account. File tree, diff view, conversation log. Watch the work happen, accept or reject the change, push to your branch.

On WhatsApp

Message your SARAH number. Describe the task in plain language, attach screenshots or specs, get the diff back as a document. The fastest way to put SARAH Code in the hands of every operator on your team without onboarding them to anything new.

On Telegram

Type a task. Get a patch attachment back. Long tasks come with a progress note and a follow-up message when the work lands. Built for builders who live on their phone.

Long-Lived Workspace

One repo per customer. Forever.

Your SARAH Code workspace is a real git repository on persistent storage in our Chicago Data Center. It survives across sessions, channels, and devices. Your prior conversations, decisions, and code all live in the same place.

Persistent state

Pick up where you left off. No "what were we building again?" — SARAH already knows the answer.

Isolated

Filesystem and process isolation per customer. Your code is yours alone. We never train on it. We never cache it.

Portable

It is a real git repo. Push to GitHub, GitLab, your own Gitea — anywhere you want. We never lock you in.
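Because the workspace is an ordinary git repository, mirroring it elsewhere should need nothing beyond standard git commands. A minimal sketch, assuming you have a local clone of the workspace; the remote name `mirror`, the URL, and the branch `main` are hypothetical placeholders, not values SARAH Code defines:

```python
import subprocess
from typing import List


def mirror_commands(remote_name: str, remote_url: str, branch: str) -> List[List[str]]:
    """Build the git commands that add a second remote and push one branch to it."""
    return [
        ["git", "remote", "add", remote_name, remote_url],
        ["git", "push", remote_name, branch],
    ]


def run_in_repo(commands: List[List[str]], repo_dir: str) -> None:
    """Run each command inside repo_dir, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, cwd=repo_dir, check=True)


# Hypothetical values, for illustration only.
commands = mirror_commands("mirror", "git@example.com:acme/app.git", "main")
for argv in commands:
    print(" ".join(argv))
```

Calling `run_in_repo(commands, "/path/to/clone")` would execute the two commands; the sketch only prints them so it can be read without touching a real repository.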

SARAH Code vs DIY

Doing it the hard way.

Six layers a senior engineer ends up rebuilding the moment they pick up a raw CLI. SARAH Code ships them all in the first call.

Capability | SARAH Code | Direct CLI / DIY
Voice input | Native, sub-second confirmation, callback for long tasks | Type only. No phone, no callback, no hands-free.
Multi-channel handoff | Voice, portal, WhatsApp, and Telegram on the same workspace | One channel at a time. Context rebuilt by you each time.
Long-lived repo | Persistent per customer, hosted, backed up nightly | You install, configure, back up, restore.
Setup time | Zero. Call the number. Speak the task. | Install CLI, set keys, configure, learn.
Account + identity | One SARAH login covers voice, code, integrations, smart home | Per-tool accounts, per-tool billing, per-tool keys.
L1 support | Human support on the SARAH side, plus SARAH herself walks you through it | Docs and a community Discord.

Pricing

Per-seat, per-month.

Every tier is delivered over Private Enterprise IP Network connectivity. Zero public-internet hops between you and your workspace.

Enterprise

$10,000 / user / month

For organizations who need sovereign, isolated, regulated-industry-grade software engineering at scale.

  • 20 GB dedicated VRAM · zero contention
  • Everything in Pro
  • Dedicated workspace volume
  • Private Enterprise IP Network with named-circuit option
  • Named technical contact + SLA
  • Co-branded portal option
  • Annual or multi-year contract
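The per-seat, per-month price turns into a contract total with simple multiplication. A rough sketch, assuming the quoted list price is flat across the term, with no discounts, taxes, or setup fees (none of which the page states either way); the seat counts are arbitrary examples:

```python
ENTERPRISE_USD_PER_SEAT_PER_MONTH = 10_000  # list price quoted above


def contract_total_usd(seats: int, years: int) -> int:
    """Total list-price cost for a given seat count and contract length."""
    if seats < 0 or years < 0:
        raise ValueError("seats and years must be non-negative")
    return seats * ENTERPRISE_USD_PER_SEAT_PER_MONTH * 12 * years


one_year = contract_total_usd(seats=5, years=1)
three_year = contract_total_usd(seats=5, years=3)
print(one_year)    # 600000
print(three_year)  # 1800000
```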
Detailed Specifications

Layer by layer, side by side.

Compute, memory, storage, network, security, sovereignty, cost. Every layer of an AI platform measured against its real-world counterpart.

Layer | SARAH AI Suite (NVIDIA DGX GB300) | OpenClaw / Hermes on a Public-Cloud VPS
Edge / DC architecture | On-prem SARAH Spark 2 Router handles the voice path locally · DGX GB300 in our DC handles inference · up to 400 GE between them · audio never leaves the premises | Everything on one rented GPU in someone else's region · every stage contends for the same VRAM slot
GPU silicon | 72× NVIDIA Blackwell Ultra · GB300 full rack · Light Matter chips & switches · LLM-only workload | 1× shared instance GPU · whatever the cloud vendor schedules you
VRAM (total) | 20 TB HBM3e · single coherent pool | 16–80 GB on the instance · ends at the box boundary
VRAM (per call) | 3 GB dedicated · isolated to that conversation · zero contention | No per-call allocation · whatever the runtime scrapes from a shared pool
Memory bandwidth | 576 TB/s aggregate | ~2–3 TB/s peak per GPU · degrades under noisy-neighbour load
Model storage | Local NVMe · ~670 GB Deep Thinker + ~244 GB Doer · loaded once, served forever | Cloud block storage or Hugging Face pull at boot · re-downloaded on instance restart
Per-call working memory | 128K-token context window held in dedicated VRAM for the life of the call | Context window survives only as long as the shared GPU lets it
Backbone network | Up to 400 GE from SARAH Spark 2 Router to DGX GB300 · Private Enterprise IP Network (PEIPN) · physical fibre interconnect | Shared cloud-vendor fabric · TCP over the open internet for anything external
Public-internet exposure | None. The platform is unreachable from the open web by design. | Public IPs · open ports · part of the cloud-vendor's blast radius
External-vendor reach | Direct peering with Google Cloud, AWS, Azure, Cloudflare · private interconnect, no public hop | Public-internet egress to every service, even same-cloud APIs unless you build VPC peering yourself
Inference latency | Sub-400 ms first-word · streaming TTS · parallel sentence synthesis | Variable: cold-start + queue + cloud-network hops + shared GPU contention
Tenant model | Single-tenant · the silicon is physically yours | Multi-tenant · your conversation shares hardware with arbitrary strangers
Data sovereignty | 100% on your premises (or our PEIPN) · data never crosses borders unless you say so | Vendor terms govern what they do with your prompts and outputs
Cost model | Buy once, own forever · zero per-token meter · zero per-block charge | Per-token, per-second-GPU, per-egress-GB · the meter never stops
Vendor lock-in | None. The hardware and the software are yours; open-source LLMs fine-tuned in-house. | Cloud vendor + framework vendor + occasional model vendor: three locks per workflow
Failure domain | A single rack you can see · 394 restore points · 200 kW EMG off-grid power | A region in someone else's data centre. Their outage is your outage.
Compliance posture | SOC 2 / ISO 27001 / GDPR / CCPA / HIPAA / PCI DSS · examiner-ready audit trail | Inherits cloud-vendor SOC 2 + your own scaffolding · audit trail you have to build

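The VRAM rows in the table imply a rough ceiling on concurrent calls. A back-of-envelope sketch using only the figures quoted there, under several assumptions that the page does not confirm: decimal units (1 TB = 1,000 GB), one resident copy of each model at its quoted on-disk size, "128K" read as 131,072 tokens, and no other memory overhead. Real capacity would be lower.

```python
TOTAL_VRAM_GB = 20_000       # "20 TB HBM3e", decimal units assumed
PER_CALL_VRAM_GB = 3         # "3 GB dedicated" per call
DEEP_THINKER_GB = 670        # "~670 GB Deep Thinker"
DOER_GB = 244                # "~244 GB Doer"
CONTEXT_TOKENS = 128 * 1024  # "128K-token context window" (assumed 131,072)


def max_concurrent_calls(total_gb: int, weights_gb: int, per_call_gb: int) -> int:
    """Upper bound on calls once the model weights are subtracted from the pool."""
    return (total_gb - weights_gb) // per_call_gb


def bytes_per_token(per_call_gb: int, tokens: int) -> float:
    """Average VRAM budget per context token if the whole slot held context state."""
    return per_call_gb * 1e9 / tokens


ceiling = max_concurrent_calls(TOTAL_VRAM_GB, DEEP_THINKER_GB + DOER_GB, PER_CALL_VRAM_GB)
per_token = bytes_per_token(PER_CALL_VRAM_GB, CONTEXT_TOKENS)
print(ceiling)           # 6362
print(round(per_token))  # 22888
```

The second figure says a full 128K-token context would have roughly 23 KB of VRAM per token available inside a 3 GB slot, which is a budget, not a statement about how the models actually store context.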
The Network Layer

Up to 400 GE connectivity to our Data Centre · via the SARAH Spark 2 Router · through to the DGX GB300.

Up to 400 GE backhaul between the on-prem SARAH Spark 2 Router and our Data Centre. Only the prompt and response text traverse the long-haul link; your audio never leaves your premises. No Public-Internet hop. No Layer-2 segment shared with another tenant.

Direct peering with the major hyperscalers

SARAH AI Suite's Private Enterprise IP Network terminates directly into the four interconnect fabrics that run most of the world's cloud workloads. When SARAH needs to read a Google Sheet, write to an S3 bucket, call an Azure Cognitive Services endpoint, or push through Cloudflare, none of those packets touch the open internet. They ride a private cross-connect.

GCP · Google Cloud · Direct peering
AWS · Amazon Web Services · Direct peering
AZ · Microsoft Azure · Direct peering
CF · Cloudflare · Direct peering
4 TB/E
Layer-2 fibre backbone
10 GE
Edge minimum
0
Hops through the open internet
1 VLAN
Per client site · zero exposure to other tenants
Every client site runs in its own VLAN on the PEIPN. The physical fibre is shared with our other clients, but the Layer-2 boundary is yours alone — no broadcast, no ARP visibility, no inter-tenant traffic ever lands on your interface. Your private network ends at your premises, full stop.
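The per-site VLAN boundary can be pictured as a filter on who receives a broadcast frame. A toy model, not a description of the actual switch configuration; the port names and VLAN IDs are hypothetical:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Port:
    name: str
    vlan_id: int


def broadcast_recipients(sender: Port, ports: List[Port]) -> List[str]:
    """Names of ports that would receive a broadcast (e.g. an ARP request) from sender.

    A Layer-2 broadcast stays inside the sender's VLAN, so ports in any
    other VLAN never see it, even when they share the same physical fibre.
    """
    return [p.name for p in ports if p.vlan_id == sender.vlan_id and p != sender]


# Hypothetical ports: two belong to site A (VLAN 101), one to site B (VLAN 102).
ports = [
    Port("site-a-router", 101),
    Port("site-a-uplink", 101),
    Port("site-b-router", 102),
]
site_a_view = broadcast_recipients(ports[0], ports)
site_b_view = broadcast_recipients(ports[2], ports)
print(site_a_view)  # ['site-a-uplink']
print(site_b_view)  # []
```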

The OpenClaw / Hermes VPS comparison: a public IP, a TCP egress over a shared cloud fabric, a Public-Internet hop to every external dependency, and a full attack surface that the public web can probe at will. Same workload. Two universes of risk.

The Cost Reality

The meter is the point.

An open-source agent framework on a rented GPU is "free" the way a treadmill at a gym is free — you pay for everything attached to it. SARAH AI Suite does not have a meter to attach.

Cost item | SARAH AI Suite | OpenClaw / Hermes on a VPS
GPU instance time | Included · the silicon is yours | Per-second meter · 24/7 to keep the agent warm
Token throughput | No per-token meter · run it as hard as the silicon will go | Per-token bill if you use a hosted LLM behind the framework
Egress bandwidth | Direct peering · effectively flat-rate inside the PEIPN | Per-GB egress meter to every external destination
Storage I/O | Local NVMe · no IOPS bill | Per-GB-month + per-IOPS on cloud block storage
Idle cost | Zero. Idle silicon is silicon you already own. | The VPS is billing from the moment you spin it up, even at 3am with nobody calling
Year-3 cost trajectory | Maintenance only ($300K/yr Enterprise · $3M/yr DC) | Same line items, same meters, three more years of inflation

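One way to read the Year-3 row is to ask what always-on hourly bill would equal the quoted maintenance figures. A minimal sketch, assuming a metered stack that runs 24/7 for a 365-day year; it deliberately excludes the hardware purchase price and any actual cloud rates, since the page states neither:

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, always-on, non-leap year


def break_even_hourly_usd(annual_cost_usd: float) -> float:
    """Hourly metered spend that adds up to the given annual cost."""
    return annual_cost_usd / HOURS_PER_YEAR


enterprise_rate = break_even_hourly_usd(300_000)  # "$300K/yr Enterprise"
dc_rate = break_even_hourly_usd(3_000_000)        # "$3M/yr DC"
print(f"Enterprise: ${enterprise_rate:,.2f} per hour")  # Enterprise: $34.25 per hour
print(f"DC: ${dc_rate:,.2f} per hour")                  # DC: $342.47 per hour
```

A metered deployment whose combined GPU, token, egress, and storage charges stay below these hourly figures would cost less per year than the maintenance line alone; above them, more.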
OpenClaw and Hermes are good open-source agent frameworks. Run on a public-cloud VPS, they will get you a demo. They will not get you an enterprise. Once the conversation matters, the architecture decides everything, and a sovereign, GB300-class platform on a 4 TB/E private fibre network is a different category of system than a multi-tenant agent on a rented GPU.

200×
More memory bandwidth (GB300 rack aggregate vs a single ~3 TB/s cloud GPU)
3 GB
Dedicated VRAM per call · zero contention
0
Public-internet hops in the call path
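The bandwidth multiple can be checked against the page's own numbers. A short sketch using the quoted 576 TB/s rack aggregate, the 72-GPU count, and the ~2–3 TB/s range given for a single cloud GPU; note that it compares a whole rack with one GPU, so it is not a per-GPU speedup:

```python
RACK_AGGREGATE_TBPS = 576.0            # "576 TB/s aggregate"
RACK_GPU_COUNT = 72                    # "72x NVIDIA Blackwell Ultra"
CLOUD_GPU_TBPS_LOW, CLOUD_GPU_TBPS_HIGH = 2.0, 3.0  # "~2-3 TB/s peak"


def bandwidth_ratio(rack_tbps: float, single_gpu_tbps: float) -> float:
    """How many times the rack's aggregate bandwidth exceeds one GPU's."""
    return rack_tbps / single_gpu_tbps


per_gpu_tbps = RACK_AGGREGATE_TBPS / RACK_GPU_COUNT
ratio_vs_fast_gpu = bandwidth_ratio(RACK_AGGREGATE_TBPS, CLOUD_GPU_TBPS_HIGH)
ratio_vs_slow_gpu = bandwidth_ratio(RACK_AGGREGATE_TBPS, CLOUD_GPU_TBPS_LOW)
print(per_gpu_tbps)  # 8.0
print(f"{ratio_vs_fast_gpu:.0f}x to {ratio_vs_slow_gpu:.0f}x")  # 192x to 288x
```

On these figures the roughly 200× headline corresponds to the 3 TB/s end of the quoted range.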
Open a Workspace

Open a SARAH Code workspace.

If you are already on SARAH AI Suite, SARAH Code shows up in your portal sidebar the day after we activate it for you. If you are new, the fastest path is to book a call with the creators — we will tell you whether SARAH Code fits the way you build.

01
Schedule a call
02
We activate your workspace
03
Call SARAH on her dedicated extension · authenticate over SIP · start coding by talking to her
04
Ship the diff
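Step 03 mentions authenticating over SIP without saying how. SIP's standard challenge-response scheme is HTTP Digest, so the sketch below shows the classic digest calculation (MD5, no `qop`) as one plausible shape for that step. Whether SARAH uses this scheme is an assumption, and the username, realm, password, URI, and nonce are all made-up placeholders:

```python
import hashlib


def _md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str) -> str:
    """Classic HTTP Digest response without qop: MD5(HA1:nonce:HA2)."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")  # identity and secret
    ha2 = _md5_hex(f"{method}:{uri}")                 # the request being authorised
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")


# Placeholder values only; a real nonce comes from the server's challenge.
response = digest_response(
    username="alice",
    realm="sarah.example",
    password="not-a-real-password",
    method="REGISTER",
    uri="sip:sarah.example",
    nonce="abc123",
)
print(len(response))  # 32
```

The password itself never crosses the wire in this scheme; the client sends only the hash, and a fresh server nonce changes the expected response for each challenge.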