Why law firms should stop arguing about cloud AI and just run it on-prem
Most law firms are stuck between 'AI is table stakes' and 'we can't let client privilege end up in someone else's training data.' On-prem AI resolves the standoff. Here's what it costs and what it does.
Half the law firms I talk to have an AI committee. The committee produces a policy. The policy says "we don't use public chatbots for client work." Some of the lawyers use public chatbots for client work anyway, because the alternative is drafting the memo from scratch and they have 12 other memos due. The policy is honoured in the breach.
This is not a sustainable equilibrium. The answer is not "ban AI harder." The answer is to give lawyers a tool that does what the public chatbots do, without the data ever leaving your walls.
That's what GhostLink is.
The actual problem
Law firms have three real constraints that cloud AI can't satisfy cleanly:
1. Client privilege. You can't put privileged material into a system that trains on it, or that retains it anywhere your client's adverse counsel could subpoena.
2. Ethical walls. Firm A defends Pfizer. Firm B sues Pfizer. Firm A's AI system must have zero path, architectural rather than promise-based, by which prep strategy could leak to Firm B. Cloud multi-tenancy is structurally wrong for this.
3. Regulatory / insurance. Your PI carrier has opinions. Your Law Society has opinions. Your clients' in-house counsel have opinions. "The vendor said they don't train on our data" is not a reassurance that survives a root cause analysis after a breach.
The cloud AI vendors keep trying to resolve this with SOC 2 reports and enterprise terms. Those help, but they're contractual protection against risks that are partly architectural. If your data never enters a multi-tenant inference pool, you never need to trust the SOC 2 report.
What on-prem AI actually looks like in 2026
The picture changed a lot in the last 18 months:
- Models caught up. Modern open-weight models are now good enough for 90% of the document review, summarisation, contract drafting, and research tasks firms want to automate.
- Hardware got cheap. A workstation-class GPU rig runs those models at practical speeds for a small-to-mid firm. Under AUD $35K including the workstation.
- Tooling matured. Deploying a production inference endpoint is now a one-afternoon job, not a six-month systems integration.
The net effect: a mid-sized firm (20-80 lawyers) can run local AI for under $80K of total build + hardware, with zero ongoing API fees to any cloud vendor.
What GhostLink does
GhostLink is the package I ship for regulated industries:
Document intelligence
- Ingest your document management system (most major DMS products are supported)
- Index every document into a private vector store that runs on the same machine or a separate one
- Query: "find me every clause in our M&A deals from the last 3 years that waives consequential damages and has a cap below 1x fees"
- Cite-by-pulling — every answer links to the source document and page
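A toy sketch of what cite-by-pulling looks like under the hood. Everything here is illustrative, not GhostLink's implementation: the in-memory index stands in for a private vector store, keyword overlap stands in for embedding similarity, and the document names are made up. The point is the shape of the answer: every hit carries a link back to its source document and page.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    doc_id: str   # DMS document identifier
    page: int     # page number for the citation link
    text: str     # extracted text of the chunk

# Toy in-memory "index"; a real deployment embeds chunks into a vector store.
INDEX = [
    Chunk("deal-2023-017.docx", 42, "Neither party is liable for consequential damages."),
    Chunk("deal-2024-003.docx", 7, "Liability is capped at 0.5x of fees paid."),
    Chunk("deal-2022-101.docx", 12, "This clause governs notice periods."),
]

def search(query: str, top_k: int = 2) -> list[dict]:
    """Rank chunks by naive keyword overlap; return each hit with its citation."""
    terms = set(query.lower().split())
    scored = []
    for c in INDEX:
        overlap = len(terms & set(c.text.lower().split()))
        if overlap:
            scored.append((overlap, c))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [
        {"text": c.text, "cite": f"{c.doc_id}#page={c.page}"}
        for _, c in scored[:top_k]
    ]

hits = search("consequential damages cap fees")
for h in hits:
    print(h["cite"], "->", h["text"])
```

In the real product the `cite` field becomes a clickable link into the DMS, so a lawyer can verify every claim against the source page in one click.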
Contract review
- Drafting assist ("rewrite this clause to be seller-favourable")
- Clause extraction (lookups against playbooks)
- Redlining against your firm's playbook
- All of it in your firm's tone and style, trained on your own precedent bank
Research
- Ingest CaseBase, AustLII exports, LexisNexis downloads
- Cross-reference with client matter context that lives in your DMS
- Never sends anything to an external service
Voice interface (optional)
- Partners dictate drafts to the assistant via a local voice interface
- All on-prem — the microphone audio never leaves the building
Ethical walls
- Matter-based access control — the assistant physically cannot see documents tagged to walls the user doesn't have access to
- Per-matter audit logs — every query is logged to the matter file, admissible in PI defence
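The two bullets above can be sketched as a pre-retrieval filter: documents tagged to matters outside the user's walls are excluded before anything reaches the model, and every retrieval is written to an audit trail. All names and structures here are hypothetical, chosen only to show the mechanism.

```python
# Hypothetical matter-wall filter. Walled-off documents are excluded
# *before* retrieval, so the model architecturally cannot see them.
DOCS = [
    {"id": "brief-001", "matter": "M-1001", "text": "Plaintiff prep strategy"},
    {"id": "memo-007", "matter": "M-2042", "text": "Defence witness list"},
]

USER_MATTERS = {
    "alice": {"M-1001"},             # Alice sits inside the M-1001 wall only
    "bob": {"M-1001", "M-2042"},
}

AUDIT_LOG: list[tuple[str, str, str]] = []   # (user, matter, query)

def retrieve(user: str, query: str) -> list[dict]:
    """Return only documents the user's walls permit; log every access."""
    allowed = USER_MATTERS.get(user, set())
    visible = [d for d in DOCS if d["matter"] in allowed]
    for d in visible:
        AUDIT_LOG.append((user, d["matter"], query))  # per-matter audit trail
    return visible

print([d["id"] for d in retrieve("alice", "witness list")])  # memo-007 excluded
```

Because the filter runs before retrieval rather than on the model's output, there is no prompt-injection path around it: the walled documents were never in the context window to begin with.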
What it doesn't do
- Replace lawyers. Obviously. It drafts, summarises, searches, cross-references. Humans practise law.
- Work offline from the cloud forever. Some clients ask for fully air-gapped. That's possible but more expensive (you maintain your own model updates). Most of our clients choose "hybrid air-gap" — the inference stays local, but model updates come from our encrypted distribution channel on a schedule they control.
- Solve hallucinations. No AI does. But grounding on your own document corpus, with every answer citing its source page, reduces hallucinations roughly tenfold compared to asking a generic cloud model the same question.
The hardware reality
For a small firm (5-20 lawyers):
- Single-node deploy: one workstation with workstation-class GPU(s), plenty of RAM, fast local storage
- Cost: AUD $18K-$28K hardware
- Performance: faster than most humans read
For a mid firm (20-80 lawyers):
- Two-node deploy: workstation for inference + separate server for vector DB + document processing
- Cost: AUD $35K-$55K hardware
- Performance: up to ~15 concurrent users without queueing
For a larger firm (80+ lawyers):
- Rack-scale: GPU servers in your data centre (or colocated)
- Cost: quoted per scope, usually AUD $80K-$250K depending on concurrency and redundancy requirements
- Performance: multi-model routing for different task types with proper load balancing
The build scope
Standard GhostLink build includes:
- Hardware spec + procurement (we quote, you buy direct — no markup)
- OS + inference stack deploy (we set it up and harden it)
- DMS connector (read-only integration with your DMS)
- Vector index build (one-time indexing of your existing corpus)
- Web interface — internal URL your lawyers go to, SSO via your existing identity provider
- Admin dashboard — usage analytics, model switching, audit logs
- Staff training — half-day session per office
Build cost: AUD $15K-$40K depending on scope. Timeline: 4-8 weeks.
The ongoing cost
Here's the part the cloud AI vendors don't want you to do the math on:
A firm doing serious AI-assisted work with cloud APIs burns $8-15 per lawyer per working day once you're using it for actual drafting and research. For a 40-lawyer firm at ~250 working days a year, that's roughly $80K-$150K/year in API fees alone. Forever. Growing.
On-prem: you buy the hardware once. Amortise over 3-4 years. Electricity + maintenance + model updates run about $8K-$15K/year for a mid-firm deployment.
Crossover: under 12 months for most mid-firms. After that, you're saving up to six figures annually AND you control your data.
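A quick way to sanity-check the crossover against your own numbers. All figures are illustrative AUD ballparks from this article, not a quote; plug in your own headcount and usage.

```python
def crossover_months(hardware: float, build: float, annual_run_cost: float,
                     lawyers: int, cloud_cost_per_lawyer_day: float,
                     working_days: int = 250) -> float:
    """Months until on-prem capex is repaid by avoided cloud API fees."""
    cloud_annual = lawyers * cloud_cost_per_lawyer_day * working_days
    monthly_saving = (cloud_annual - annual_run_cost) / 12
    return (hardware + build) / monthly_saving

# Mid-firm example: $45K hardware + $25K build, $12K/yr running cost,
# 40 lawyers at $10 per lawyer per working day on cloud APIs.
months = crossover_months(45_000, 25_000, 12_000, 40, 10)
print(f"crossover ≈ {months:.1f} months")
```

At those inputs the crossover lands well inside the first year; heavier cloud usage (the $15/day end) only pulls it earlier.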
Regulation watch
The specific legislation to track in Australia:
- Privacy Act 1988 + APPs — unchanged, but the OAIC has signalled heightened scrutiny on AI processing of personal information
- Australian Information Commissioner guidance on AI (updated 2025) — emphasises privacy impact assessments for AI systems processing personal data
- Professional Conduct Rules (per state Law Society) — most now have explicit AI usage guidance; cloud AI for client matters is increasingly discouraged without specific waivers
- Supervisory Notices from PI insurers — several major firms have received written warnings about cloud AI use
On-prem deployment sidesteps most of these cleanly.
How to start
- Run the scope generator with a description of your firm size + document volume + main use cases
- Book a 30-min call — NDA first if you want; I sign before we start talking specifics
- Pilot in one practice group — typically corporate / M&A or dispute resolution, where the document volume pays back fastest
- Expand after 60 days of measured usage
Minimum engagement: AUD $15K for a pilot. Most firms end up in the $35K-$80K range for full deployment + training.
FAQ
Q: Can the model still be good if it's running locally? A: Modern open-weight models outperform last year's frontier cloud models on most legal-task benchmarks. The gap with 2026's frontier cloud models exists but is narrow for grounded document-work tasks, which is 80% of what law firms actually need.
Q: What happens when new models come out? A: We push updates on your schedule via our encrypted distribution channel. You control when. You can skip versions. If you want fully air-gapped, you handle updates yourself — we document the process.
Q: Who owns the hardware? A: You. We spec it, you buy direct from the vendor. No markup. No lease.
Q: What if I change my mind and want cloud AI later? A: The interface layer is portable. The same frontend that talks to local models can talk to a cloud reasoning layer with a config change. You're not locked into on-prem — you just default to it.
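As a sketch of what "a config change" amounts to: many local inference servers expose an OpenAI-compatible HTTP endpoint, so the frontend only needs to know which base URL and model name to use. The endpoint names and structure below are hypothetical, purely to show the shape of the switch.

```python
# Hypothetical backend registry: the frontend talks to whichever endpoint
# the config names. Local inference servers commonly expose an
# OpenAI-compatible API, so "local" vs "cloud" is just a base-URL change.
BACKENDS = {
    "local": {"base_url": "http://inference.internal:8000/v1", "model": "local-model"},
    "cloud": {"base_url": "https://api.example.com/v1", "model": "cloud-model"},
}

def client_config(mode: str) -> dict:
    """Resolve the active backend; the rest of the app never changes."""
    cfg = BACKENDS[mode]
    return {"base_url": cfg["base_url"], "model": cfg["model"]}

print(client_config("local")["base_url"])
```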
Q: What industries besides legal does GhostLink work for? A: Healthcare (clinical coding, medical records summarisation), finance (compliance review, trade settlement reconciliation), government (internal document triage, FOI processing), defence (classified environment AI). Anywhere client / patient / national privilege trumps API convenience.
If this sounds like your firm, book a 30-min call. Mutual NDA first if you want it. Scope + fixed-price deployment plan within a week of the NDA being signed.
Got an AI project in mind?
I'm Nikolaos. I build the kind of systems I write about — solo, end-to-end, Melbourne. 30-min call. Fixed-scope quote in 48 hrs. No decks.