Personal → org memory loop
Senior staff's "magic prompts" auto-promote into shared knowledge via three independent paths: frequency, outcome correlation, and LLM self-eval.
Praxia is a workflow-specialized multi-agent orchestrator with a built-in personal-to-organizational memory loop. Your senior engineers' tacit knowledge promotes itself into shared best practices, automatically.
Frequency-based, outcome-correlated, and LLM-scored paths run in parallel, so no single signal is ever decisive. Configurable thresholds for auto-promote vs. review.
Not just memory: your personal skills get tracked, scored, and promoted to the org skill catalog when they prove themselves.
record_outcome() attaches success/failure to episodes. The consolidator uses these signals statistically; no separate analytics pipeline needed.
Per-user switch: accumulate (default) or read_only. Read-only sessions silently drop writes, which is useful for sensitive content. Admins can lock the mode tenant-wide or by role.
Run several LTMs in parallel and fuse with Reciprocal Rank Fusion, or route per query (temporal → Zep, audit → JSON, entity → Mem0). English + Japanese keyword detection. Higher recall without picking a winner.
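To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over several backends' result lists. The function name and the sample episode IDs are illustrative, not the real Praxia API:

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60):
    """Fuse several backends' ranked result lists with Reciprocal Rank Fusion.

    Each list is ordered best-first; a document's fused score is the sum of
    1 / (k + rank) over every list it appears in (rank is 1-based).
    """
    scores = defaultdict(float)
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results from three memory backends for one query:
zep_hits  = ["ep-7", "ep-2", "ep-9"]
mem0_hits = ["ep-2", "ep-7", "ep-4"]
json_hits = ["ep-2", "ep-9", "ep-1"]

fused = rrf_fuse([zep_hits, mem0_hits, json_hits])
# "ep-2" wins: it appears at or near the top of all three lists
```

The appeal of RRF is that it only needs ranks, never raw scores, so heterogeneous backends (vector, graph, JSON) can be fused without score calibration.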
JSON, Mem0, LangMem, Letta, Zep, HindSight: switch with one line. Plus Graph layer (optional) for relationship-heavy domains. Zero vendor lock-in.
Claude, ChatGPT, Gemini, Gemma, Qwen-API, Qwen-local (Ollama), DeepSeek, Mistral, Grok, Llama, Cohere, Perplexity, Phi + 100+ via LiteLLM. Same models on enterprise clouds: azure/* (Azure OpenAI), azure_ai/* (AI Foundry), bedrock/* (AWS Bedrock), vertex_ai/* (GCP Vertex AI). Auto-detect from env vars; switch model per-call.
API key + JWT + OIDC (Google/MS/Okta/GitHub/Keycloak) + 4 default roles + append-only audit log. Most competitors paywall this.
Each Praxia user authorizes Box / SharePoint / Dropbox / Drive / Salesforce with their own credentials. The external system's native ACL is enforced per Praxia user โ alice can only see what alice has access to.
OAuth tokens use envelope encryption: fresh DEK per write, AES-GCM payload, DEK wrapped by your KMS. 5 adapters: local / aws / azure / gcp / vault. The master key never lives on the application host.
praxia serve exposes /api/v1/oauth/{provider}/{start,callback,status}. Multi-worker-safe state cache (TTL-pruned JSON), pinned redirect URI via PRAXIA_PUBLIC_URL, optional success-redirect to your frontend.
Glob-pattern allow / deny rules per resource type (connector, memory, prompt, skill). Built for enterprise IS departments. Every decision audit-logged.
An LLM-driven tool-use loop over your full Praxia stack: personal memory, org memory, frozen layer, skills, connectors. The agent picks tools on its own (search → run skill → pull connector → answer) with ACL gates and audit logging. Ships as praxia.agent.AutonomousAgent, praxia agent run, and an MCP meta-tool for remote clients.
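The tool-use loop described above can be sketched in a few lines. Every name here is a hypothetical stand-in (this is not the real AutonomousAgent API): pick_tool plays the LLM, acl_allows is the ACL gate, and audit records every decision.

```python
def agent_loop(pick_tool, tools, acl_allows, audit, task, max_steps=5):
    """Toy tool-use loop: the LLM picks a tool each turn, an ACL gate runs
    before every call, and every action is audit-logged. Illustrative only."""
    context = [task]
    for _ in range(max_steps):
        name, args = pick_tool(context)
        if name == "answer":                 # terminal action: return the answer
            audit("answer", args)
            return args
        if not acl_allows(name, args):       # ACL gate before every tool call
            audit("denied", {"tool": name})
            context.append(f"{name}: access denied")
            continue
        result = tools[name](**args)         # e.g. search / run skill / pull connector
        audit(name, args)
        context.append(f"{name}: {result}")
    return None                              # step budget exhausted

# Minimal usage with stub callables:
log = []
def pick(ctx):
    return ("search", {"q": "acme"}) if len(ctx) == 1 else ("answer", "done")
answer = agent_loop(pick, {"search": lambda q: f"3 hits for {q}"},
                    lambda name, args: True, lambda event, data: log.append(event),
                    "prep meeting")
```

The key design point is that the ACL check and the audit write sit inside the loop, so no tool invocation can bypass them regardless of what the LLM decides.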
Describe the task in one line ("score contract risk 1–5 in JSON") and get a production-grade prompt design back: tuned system message, ${variable} user template, 2–3 few-shot examples, 5-criterion rubric. Per-LLM idioms applied automatically (Claude XML / OpenAI JSON-mode / DeepSeek-R1 reasoning / Mistral concise / Llama numbered steps).
The LLM authors python-pptx / python-docx code, a sandbox runs it (AST allowlist + 30s timeout + 512MB cap on POSIX), and you get a design-rich .pptx / .docx back: multi-column layouts, matrix slides, embedded matplotlib charts, themed branding (colors / fonts / logo / footer from .praxia/themes/). On a traceback the error is fed back to the LLM and the attempt repeats up to 3 times. Themes are managed in Admin → Themes.
Sales prep, logic checking, RAG self-correction: three production-ready multi-agent pipelines that run in 5 minutes. No bespoke orchestration code required.
Investment, sales, design, purchasing, patent, legal โ domain-tuned agents with built-in guardrails (tax law, jurisdictional caveats, hallucination guards).
Skills serialize to standard SKILL.md. Drop into Claude Skills, Cursor Skills, or any MCP-compatible registry without code changes.
Sentence-level hallucination detection and retrieval metrics ship as first-class modules. "It works" comes with proof attached.
Catch quality regressions before merge. tests/llm_eval/ grades real LLM output against rubrics + a committed baseline. Score drop > 5pt fails the build. Per-skill cases ship for all 6 skills.
Test prompt variants on real users with deterministic per-user assignment (SHA-256 bucket). Audience filter (roles / users / window). Outcome rollup + tentative winner detection. CLI + SDK.
Every public surface (auth / memory / fusion / exporters / OAuth / parsers / CLI / extensions / experiments / connectors / agent) ships with backend stubs, fixture factories, and protocol-conforming drivers, so contributors can write hermetic tests without standing up real services. CI runs them on every PR.
Box / SharePoint / Dropbox / Drive / kintone / Salesforce + Notion / Confluence / Jira / Slack / Teams / GitHub / HubSpot / Zendesk / Linear / S3 / Azure Blob / GCS / WebDAV / Email. Per-user OAuth means alice only sees what alice can see in each system.
Drop a file in → auto-dispatch by extension. PDF page-by-page, Word with heading detection, Excel as Markdown tables, PowerPoint with speaker notes. Custom formats register via entry points.
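The dispatch-by-extension idea can be sketched as a registry keyed on file suffix. The registry contents here are hypothetical placeholders; in Praxia the real parsers would be discovered via entry points:

```python
from pathlib import Path

# Hypothetical parser registry; real parsers register via entry points.
PARSERS = {
    ".pdf":  lambda p: f"parsed {p} page-by-page",
    ".docx": lambda p: f"parsed {p} with heading detection",
    ".xlsx": lambda p: f"parsed {p} as Markdown tables",
    ".pptx": lambda p: f"parsed {p} with speaker notes",
}

def parse(path: str) -> str:
    """Dispatch a file to the parser registered for its extension."""
    ext = Path(path).suffix.lower()
    try:
        return PARSERS[ext](path)
    except KeyError:
        raise ValueError(f"no parser registered for {ext!r}")
```

A third-party package can then extend the pipeline simply by adding one entry to the registry, without touching the dispatch logic.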
Skills produce Markdown by default. OutputFormatSkill infers the requested format from natural-language hints (a Japanese "in PowerPoint" hint → PPTX, "as a Word doc" → DOCX). Custom formats register via entry points.
Speech-to-text (Whisper) and text-to-speech (OpenAI TTS / ElevenLabs / Piper). Embedded in Streamlit UI as record-and-go input and read-aloud output.
Create / update / delete / deactivate / rotate keys / change roles: all via CLI, UI, or SDK. All operations audited.
Pin which backend(s) users may pick and what the default mode is, at the tenant level. Resolution: admin enforced > call-site > user pref > admin default.
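The resolution chain above is simple to express in code. Parameter names here are illustrative, not the real SDK signature; the precedence order itself is taken from the line above:

```python
def resolve_backend(admin_enforced=None, call_site=None,
                    user_pref=None, admin_default="json"):
    """Pick the memory backend for one call.

    Precedence: admin enforced > call-site > user pref > admin default.
    The first non-None value in that order wins.
    """
    for choice in (admin_enforced, call_site, user_pref, admin_default):
        if choice is not None:
            return choice
```

So an admin-enforced backend always overrides whatever the code or the user asked for, and the admin default only applies when nothing else was specified.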
CSV / JSON / JSONL exports of the audit log, users, usage, memory, and policies, for compliance, SIEM, and backups. Each export action is itself audited.
Flow / skill counts, success rate, top users, promoted blocks, frozen files, distributed skills: out of the box, with no separate analytics pipeline.
Users save personal prompts. Admins promote them to org or push to specific roles / users. Three scopes with merge precedence.
Use Praxia from Claude Desktop / Cursor / Continue.dev. Local: praxia mcp serve. Remote (multi-host): praxia serve exposes /api/v1/mcp with auth + audit log. Every skill + flow becomes an MCP tool automatically.
Use Praxia as a brain behind your own frontend (SDK embed or praxia serve FastAPI HTTP API), or run the bundled Streamlit UI for the fastest path. Same auth, memory, skills.
Permissive license, commercial-friendly. NOTICE.md inventories every dependency's license. Open Core path for enterprise extras planned.
Landing has chip-style nav on phones, scrollable tabs, ≥44px touch targets, prefers-reduced-motion respected. Streamlit UI injects responsive CSS + a "Compact mode" toggle for slow connections.
Pick a role + use case to see the matching CLI command, sample output, and concrete Before/After. Then click Run preview to see a typed-out simulation in your browser โ no install yet.
You're meeting Acme Manufacturing tomorrow at 14:00. Praxia ingests their IR, recent press, and your past wins, then produces top-3 pain hypotheses, a 5-row FAQ with citations, and a proposal outline.
Variation: attach a .pdf board deck and Praxia auto-parses and cites it. Or pull straight from Salesforce: praxia connector pull salesforce "SELECT Id,Name FROM Account WHERE Id='001..'".
praxia run sales \
--customer-name "Acme Manufacturing" \
--product "Praxia"
# Click ▶ Run preview to see a typed-out simulation
Praxia is opinionated about where it shines: mid-cap to large enterprises with senior staff whose tacit knowledge is currently locked in one person's editor.
Need: Roll out AI tools across the org without handing every team a different vendor, and without paywalling SSO / RBAC / audit.
Fit: Auth + RBAC + ACL + per-user OAuth + audit log all in OSS, not behind an enterprise tier. Self-hostable on-prem or private cloud. Same code as the OSS, just operated by you.
Typical year-1 result: 100 knowledge workers, ~$1.25M net benefit, full audit trail, no per-seat licensing surprises.
Need: Senior architects' code-review and design intuition is the bottleneck. Junior PMs ramp in 12–18 months. Best practices live in Slack threads and one staff engineer's head.
Fit: DesignSkill + sleep-time consolidation distills "how senior X reviews specs" into reusable shared blocks. Markdown + git frozen layer fits existing PR review workflow.
Typical year-1 result: Senior load 16h/wk → 4h/wk, junior PM ramp 6–9 months, NFR coverage 5–7 → 15–20 axes.
Need: 50–100 contracts/month bottlenecked on 2–3 people. Critical risk slips through under deadline. Need an auditable AI workflow with no vendor lock-in.
Fit: LegalSkill (RACE framework) + read-only memory mode for sensitive contracts + per-user OAuth respects external system ACL + every action audited. Apache 2.0 means you can show the source to your auditors.
Typical year-1 result: Per-contract review 60–90 min → 10–15 min, throughput 50–80/mo → 200–300/mo, critical-miss rate 5–10% → 1–2%.
Need: Build a domain-specific agent system over Mem0 / LangGraph / your-own-vector-DB without re-implementing auth, memory cycling, dashboards, exporters yourself.
Fit: 7 plugin types (~50 LoC each): connectors, memory backends, parsers, exporters, OAuth providers, skills, flows. Use as a Python library, run praxia serve as a backend, embed in LangGraph. Apache 2.0.
Typical day-30: domain skill PR'd, custom connector pip-installable, memory cycling working, ~3 weeks ahead of building it from scratch.
Need: 50+ AEs prepping for meetings; quality of pre-call research is uneven. Senior reps win 2× more deals than juniors and the pattern doesn't transfer.
Fit: SalesSkill + memory cycling distills "how senior X researches an account" into shared playbooks. Salesforce + Slack + GitHub connectors feed real customer context. Per-user OAuth means each AE only sees their own pipeline.
Typical year-1 result: Pre-call prep 6h → 1h, proposal acceptance rate +15–20 pt, meetings/wk per AE 3 → 6–8.
Need: 5-supplier RFQs take 2–3 weeks. ESG / BCP / single-source risk is treated as an afterthought. Subcontract Act / Anti-Bribery compliance creates legal exposure if missed.
Fit: PurchasingSkill (QCD+S framework) + connectors to Salesforce / kintone / Box for RFQ documents. Audit log captures every supplier evaluation step.
Typical year-1 result: 5-supplier eval 3–4 wk → 3–5 days, hidden-cost discovery +30%, single-source detection 70% → 95%+.
Need: Prior-art searches cost $3-5k each via outside counsel. Cross-domain art is often missed. Inventors expect first-pass results in days, not weeks.
Fit: PatentSkill (5-step framework) + file parsers for inventor disclosure docs. Memory cycling captures "patterns that distinguish prior art from real novelty" across cases. Read-only memory mode for confidential client work.
Typical year-1 result: Per-case time 1–2 days → 2–4h internal, external counsel fees −50–70%, faster turns for inventors.
OIDC SSO (Google / Microsoft / Okta / GitHub / Keycloak) is in the OSS. Most agent frameworks ship without it; most agent platforms paywall it. Praxia treats it as table stakes.
Layer 4 is plain Markdown in your git repo. Layer 3 exports to JSONL. Layer 1 is your chosen backend's native format. The framework doesn't hold your data hostage โ leaving costs nothing.
Apache 2.0. Show the source to your auditors, your security team, your customers. No "trust us, the SaaS is secure"; inspect the auth manager yourself.
Run Mem0 + Zep + HindSight in parallel and fuse with RRF, or route per query. No commercial agent platform exposes this; they pick a backend and lock you in. Praxia treats it as a first-class feature.
When alice pulls from Box, Box's own ACL applies: alice only sees what alice can see. Service-account designs (the typical SaaS shortcut) leak data across users. Praxia's per-user OAuth makes this the default.
Set PRAXIA_LOCAL_MODEL=gemma, run Ollama, choose backend=json. No cloud LLM, no cloud vector DB, no telemetry. Air-gapped customers run identical code to cloud customers.
OAuth tokens are envelope-encrypted with the master key in AWS KMS, Azure Key Vault, GCP KMS, HashiCorp Vault, or locally for dev. Most agent frameworks store tokens with a local symmetric key; Praxia treats KMS as a first-class concern in the OSS.
Multi-worker safe: the state cache survives across processes (TTL-pruned JSON file), and the redirect URI is pinned via env var. Run praxia serve behind nginx and the callback works correctly with N replicas. Most OSS competitors only support CLI loopback.
Run controlled experiments on prompts / skills / LLMs with deterministic assignment + outcome tracking. CI-gate quality regressions with a baseline-flagging eval framework. Both in the OSS; no separate "experimentation platform" subscription.
Run flows and skills via CLI / SDK / UI. Every interaction lands in your personal memory automatically; no save() calls in business code.
p = Praxia(user_id="alice")
p.run(SalesAgentFlow, inputs={...})
# Memory accumulates implicitly
When deals close, tests pass, or PRs merge, attach an outcome. The consolidator uses these to weight which patterns are actually effective.
p.personal_memory.record_outcome(
episode_id=ep.id,
success=True, score=0.9,
notes="closed-won",
)
The Sleep-time Consolidator clusters similar memories across users, runs each through the 3-path engine, and auto-promotes the high-confidence ones.
praxia consolidate
# auto_promoted: 3, review_queued: 5
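The split between auto-promoted and review-queued items can be sketched as a gate over the three independent signals. This is an illustrative model, not the shipped algorithm; the 0.75 default comes from the FAQ, while the 0.5 review floor and the min() combination rule are assumptions:

```python
def promotion_decision(frequency, outcome_corr, llm_score,
                       auto_threshold=0.75, review_threshold=0.5):
    """Illustrative 3-path promotion gate (not the shipped algorithm).

    Each signal is normalized to [0, 1]. A pattern auto-promotes only when
    every independent path clears the high bar; mid-confidence patterns go
    to the human review queue; the rest stay personal.
    """
    weakest = min(frequency, outcome_corr, llm_score)
    if weakest >= auto_threshold:
        return "auto_promote"
    if weakest >= review_threshold:
        return "review_queue"
    return "keep_personal"
```

Gating on the weakest signal is what makes the three paths genuinely independent: one enthusiastic LLM self-eval cannot promote a pattern that rarely recurs or correlates poorly with outcomes.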
Promoted shared blocks become living org knowledge. The most stable get frozen into Markdown + git for PR review. Every step is auditable.
praxia freeze --block manufacturing_pain
# → .praxia/frozen/.../*.md
Customer IR + past minutes + RAG → hypotheses → FAQ → proposal outline.
praxia run sales \
--customer-name "Acme" \
--product "BizFlow"
Three agents (structure / contradiction / reader) review long docs.
praxia run logic \
--document spec.md
Self-correcting RAG: query expansion → eval → hallucination check loop.
praxia run rag \
--question "What license?"
Equity research, due diligence, portfolio decisions with bull/bear analysis.
Account research, proposal drafting, FAQ prep, objection handling.
System design review, requirements engineering, architecture trade-offs.
Supplier evaluation, RFQ analysis, TCO calculation, BCP risk scoring.
Prior-art search, claims drafting, patent maps, filing strategy.
Contract review, compliance checks, M&A diligence, policy drafting.
pip install to live agent in under 5 minutes.
# Install (with UI + connectors + office parsers)
pip install "praxia[ui,connectors,office]"
# Initialize
praxia init
# Run flows + skills
praxia run sales --customer-name "Acme"
praxia skill run investment "3-year thesis on Acme Mfg (fictional)"
# Launch the UI (11 tabs incl. Dashboard / Policies / Admin / Connectors)
praxia ui --port 8501
# OR โ backend-only mode for your own frontend (FastAPI HTTP)
pip install "praxia[server]"
praxia serve --host 0.0.0.0 --port 8000
# Output exporters โ render skill output to HTML / PPTX / DOCX
praxia export report.md slides.pptx --title "Q3 Review"
# Memory mode โ accumulate (default) or read-only per user
praxia memory mode --user-id alice read_only
praxia admin memory-policy-set --enforced-backend mem0 --allowed mem0,zep
# A/B experiments โ test prompt variants with deterministic assignment
praxia experiment create proposal_v2 --name "Prompt v2" \
--variants '{"control":{"prompt":"..."},"candidate":{"prompt":"..."}}' \
--traffic-split "control=0.5,candidate=0.5"
praxia experiment start proposal_v2
# Production-grade OAuth + KMS-encrypted tokens
export PRAXIA_KMS_ADAPTER=aws
export PRAXIA_KMS_KEY_ID=arn:aws:kms:...
pip install "praxia[server,kms-aws]"
praxia serve --host 0.0.0.0 --port 8000
# Personal โ org memory distillation
praxia consolidate
# Enterprise: resource policies, audit exports, connectors
praxia policy add deny connector "box:/Confidential/*" --principals "role:member"
praxia admin export-audit audit.csv --since-days 30
praxia connector pull salesforce "SELECT Id, Name FROM Account"
Bring your own LLM key: Anthropic, OpenAI, Google (Gemini / Gemma), Alibaba (Qwen), or run Gemma / Qwen locally via Ollama. Two deployment modes: full-stack praxia ui or backend-only praxia serve behind your own frontend; see deployment-modes.md.
See full Before/After tables in docs/use-cases.md.
Plus Users, Prompts, Memory, Consolidate, and About tabs (11 in total). Local file upload supported throughout.
Before: 4–6h reading the deck, scrubbing competitor research, modeling financials.
After: Full 5-section memo (Profile / Quant / Qual / Risk / Decision) with bull-and-bear cases and confidence intervals.
# CLI
praxia skill run investment "\
Mid-term thesis on a hypothetical issuer:
- sector: consumer electronics, mid-cap JP
- horizon: 3 years
- compare with two anonymized peers
"
Before: Hit-or-miss prep based on LinkedIn skim. CFO asks about a recent capex you didn't know about.
After: Praxia ingests IR + 6 months of press, extracts top-3 pain hypotheses, and generates a 5-row FAQ with citations.
# Multi-agent flow
praxia run sales \
--customer-name "Acme Manufacturing" \
--product "Praxia Sales" \
--additional-context "Mid-term plan
calls for 30B JPY DX investment"
Before: Senior architect spending 16h/wk on PR-style design reviews. NFRs slip through.
After: The DRAGON framework (Data flow / Requirements / Architectural fit / Gaps / Operation / NFRs) checks all 6 axes systematically.
praxia run logic --document spec.md
# or single-skill review
praxia skill run design "\
Review the attached architecture for
the new payments microservice...
"
Before: Direct cost only; ESG / geopolitics / BCP risk treated as afterthoughts.
After: Full TCO matrix + QCD+S framework + Subcontract Act compliance check + risk grid.
praxia skill run purchasing "\
Evaluate 5 PCB suppliers for our new
product line. Annual volume 2M units.
Constraints: Japan-domiciled HQ,
ISO9001, no Russia/Belarus exposure.
"
Before: ¥300k–500k per case to outside counsel for first-pass research.
After: 5-step framework (element extraction → IPC/FI/F-term search formula → hit analysis → novelty → inventive step). Counsel only reviews the draft.
praxia skill run patent "\
Prior-art search: solid-state battery
with three-layer ceramic electrolyte
and Li-rich cathode. Provide:
1. Element decomposition
2. IPC/FI/F-term search strategy
3. Hit-analysis table
4. Novelty + inventive-step verdict
"
Before: 50โ100 contracts/month bottlenecked on 2โ3 people. Critical risks slip through.
After: RACE framework (Risk / Allocation / Compliance / Exit) + 🔴/🟡/🟢 severity ladder. Critical-risk miss rate falls from 5–10% to 1–2%.
praxia skill run legal "\
Review this services agreement
focusing on:
- Liability cap
- IP assignment vs license
- Data return on termination
- Anti-bribery clause
"
Before: Recruiter screens 50-80 resumes/day; quality varies; senior recruiters' "spot the right hire" instinct doesn't transfer to juniors.
After: Custom HRSkill applies your role criteria + culture fit signals consistently. Memory cycling captures "what predicted a successful hire" from past placements.
# Custom skill (yours) + connectors
praxia connector pull s3 \
"hiring-bucket/q3-applicants/" \
--user-id alice
praxia skill run hr_screener "\
Apply ICP criteria + grade against
the Senior PM role posted Sep 5.
Output: top-10 ranked + flag risks.
"
Before: 200+ tickets/day, junior agents escalate ~40% to seniors; SLA breaches in regulated industries trigger fines.
After: Zendesk + GitHub + Confluence connectors give context. Custom SupportSkill drafts replies in your voice. Memory cycling captures "how senior X handled the tricky ones".
praxia connector pull zendesk \
"tickets:status:open priority:high"
praxia skill run support_triage "\
Read the ticket and last 5 comments.
Suggest a reply matching our brand voice.
Flag if escalation is needed.
"
Before: 6-month vendor evaluation, lawyer review, custom SSO integration, separate audit log pipeline. Each tool needs its own.
After: OIDC SSO (Microsoft / Okta) day-one. SCIM provisioning auto-syncs user lifecycle. KMS-backed token encryption per cloud. Audit log for SIEM ingest.
# Production deploy
export PRAXIA_SSO_PROVIDER=microsoft
export PRAXIA_KMS_ADAPTER=aws
export PRAXIA_SCIM_TOKEN="$(openssl rand -hex 32)"
praxia serve --host 0.0.0.0 --port 8000
# Okta admin: point SCIM at /scim/v2/Users
Before: PhD students spend weeks doing prior-art / state-of-the-art reviews. Sometimes they miss the one paper that already solved the problem.
After: Email + GitHub + WebDAV (institutional repo) + S3 (preprints) connectors feed papers in. Custom ResearchSkill extracts methodology + findings + relevance score. RAG-fused memory across the lab's history.
praxia connector pull s3 "arxiv-mirror/2024/cond-mat/"
praxia run rag --question "\
Latest research on three-layer ceramic
electrolytes for solid-state batteries โ
group by approach, flag contradictions.
"
Full Before/After tables (10 industries × 3 use cases each) in docs/use-cases.md.
Year 1 ROI = (N × C × t × s₁) + Q − P
Year 2+ ROI = (N × C × t × s₂) + Q × g − P
N = knowledge workers in scope
C = loaded cost / FTE
t = time on routine work
s₁ = year-1 time savings (typ. 30–50%)
s₂ = year-2 time savings (typ. 50–75%)
(s₂ > s₁ because org memory compounds)
Q = quality lift (errors avoided)
g = growth multiplier on Q in later years
P = Praxia cost (license + infra)
| Variable | Year 1 | Year 2 |
|---|---|---|
| Workers in scope (N) | 100 | 100 |
| Loaded cost (C) | $90k | $90k |
| Routine work share (t) | 40% | 40% |
| Time savings (s) | 35% | 60% |
| Quality lift (Q) | $65k | $200k |
| Praxia cost (P) | $80k | $80k |
| Net benefit | $1.25M | $2.30M |
3-year cumulative net ≈ $5.2M. Even after halving each parameter, ROI remains > 10×.
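As a sanity check, the example table's net-benefit figures follow directly from the year-1 formula above (the year-2 row uses the table's Q of $200k directly, i.e. Q × g already applied):

```python
def net_benefit(N, C, t, s, Q, P):
    # Net benefit = (N × C × t × s) + Q − P, per the ROI model above.
    return N * C * t * s + Q - P

# Plugging in the example table's values:
year1 = net_benefit(N=100, C=90_000, t=0.40, s=0.35, Q=65_000,  P=80_000)
year2 = net_benefit(N=100, C=90_000, t=0.40, s=0.60, Q=200_000, P=80_000)
# year1 = 1,245,000 (≈ $1.25M); year2 = 2,280,000 (≈ $2.30M)
```

Both results round to the table's "Net benefit" row, so the table and the formula are internally consistent.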
| KPI | Before | 1 year | 3 years |
|---|---|---|---|
| New-hire ramp time | 6–12 months | 4–6 months | 2–3 months |
| Knowledge loss on departure | Several / yr | 50% reduction | Zero |
| Output quality variance | 2–3× spread | 50% narrower | ≤ 20% spread |
| Cross-team best-practice flow | Almost none | 5–10 / mo | 30+ / mo |
| AI utilization (org avg / individual best) | 30–50% | 60–70% | 80%+ |
Define a multi-agent pipeline by subclassing Flow. Each step references prior outputs via ${var} templates.
class IncidentResponseFlow(Flow):
name = "incident_response"
steps = [
FlowStep("triage", ...),
FlowStep("hypothesis", ...),
FlowStep("mitigation", ...),
]
Subclass Skill with a system prompt + manifest. Auto-serializes to SKILL.md for MCP / Claude Skills.
class HRRecruitingSkill(Skill):
manifest = SkillManifest(
name="hr_recruiting",
domain="hr",
...
)
system_prompt = """..."""
Implement the 4-method MemoryBackend protocol. Plug in any vector DB (Pinecone, Weaviate, Qdrant, ...), and optionally combine with built-ins via CompositeBackend / RoutedBackend.
class PineconeBackend:
def add(...): ...
def search(...): ...
def all(...): ...
def clear(...): ...
Implement the 2-method Connector protocol (pull / push). Per-user OAuth, ACL enforcement, and audit logging plug in for free. End-to-end Notion example in the guide.
class NotionConnector:
name = "notion"
def pull(self, path, *, limit): ...
def push(self, path, data): ...
Built-in: HTML, PPTX, DOCX, MD, JSON. Add your own (LaTeX? RTF? Confluence Storage?) by implementing the Exporter protocol and declaring an entry-point.
class LatexExporter:
format = "latex"
extensions = ("tex",)
def export(self, content) -> bytes: ...
Detailed extension guides: PLUGINS.md · CUSTOM_CONNECTORS.md · design specs (EN + JA).
praxia serve as the HTTP backend
Detailed feature inventory and integration matrix in docs/FEATURES.md.
No. The default json backend stores everything on local disk. LLM calls go to whichever provider you configure โ pick qwen-local (Ollama) for fully in-house operation. You choose the trust boundary.
Mem0 is a memory layer. Praxia is the orchestrator + memory + skill registry + flows + eval + auth. Mem0 is one of six interchangeable backends inside Praxia.
Three guardrails: (1) the auto-threshold defaults to a high 0.75; (2) a review queue catches mid-confidence items for human approval; (3) the audit log records every promotion, so rollback is trivial.
Yes. Pick qwen-local (Ollama) for the LLM and json or self-hosted Mem0/HindSight for memory. No cloud calls.
LangGraph excels at general agent orchestration but doesn't ship workflow templates, business skills, memory cycling, or auth. Praxia is opinionated and batteries-included for the "specialized multi-agent + organizational memory" niche.
Yes: Apache 2.0. Even auth/SSO is in the OSS, where competitors typically paywall those features.
No. Add your own with ~20 lines. PRs that contribute new skills are very welcome.
Skills serialize to standard SKILL.md frontmatter. Drop any Praxia skill into Claude Skills, Cursor Skills, or any MCP registry without code changes.
No. Layer 4 is plain Markdown in your git repo. Layer 3 exports to JSONL. Layer 1 personal memory is standard JSONL or your chosen backend's native format. You can leave at any time.
JSON backend handles ~10k users comfortably. Beyond that, switch to Mem0 + Qdrant/Pinecone or HindSight. The promotion engine scales with LLM tokens; budget 10–50 LLM calls per consolidation per cluster.
Yes, that's mode B. Two paths: embed the Python SDK directly if your backend is Python, or run praxia serve (FastAPI, 8 endpoints under /api/v1) and call it from any HTTP client. Same auth, RBAC, ACL, and audit log as the Streamlit UI. Setup recipe: deployment-modes.md.
No. Personal memory accumulates implicitly during normal use. Per user, you can also flip read_only mode for sensitive sessions: writes are silently dropped, reads still work. Admins can lock the mode tenant-wide or per role.
Use OutputFormatSkill: it detects format hints in natural language (a Japanese "in PowerPoint" hint / "as a Word doc" / "HTML please") and renders via the matching exporter. CLI: praxia export report.md slides.pptx. Custom formats register via the praxia.exporters entry-point.
Yes. The Connector protocol is two methods (pull / push) and ~50 lines. Per-user OAuth, ACL enforcement, and audit logging plug in for free. End-to-end Notion example: CUSTOM_CONNECTORS.md.
Yes. gemma / gemma-2b / gemma-9b / gemma-27b via local Ollama; gemma-cloud via Google Vertex AI. PRAXIA_LOCAL_MODEL=gemma makes auto_detect() fall back to Gemma instead of Qwen-local when no cloud key is set.
Envelope encryption: a fresh 256-bit DEK per token, AES-GCM payload encryption, and the DEK wrapped by a configurable KmsAdapter. 5 adapters ship: local (HKDF, dev), aws (AWS KMS CMK), azure (Key Vault Keys), gcp (Cloud KMS), vault (HashiCorp Transit). Master key never leaves the KMS / HSM. Switch with PRAXIA_KMS_ADAPTER=aws.
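The envelope-encryption flow described here can be sketched end to end. This is a toy model for illustration only: the XOR keystream stands in for AES-GCM, ToyKms stands in for a real KmsAdapter, and every name is hypothetical; in production the master key never leaves the KMS/HSM and a real AEAD cipher is used.

```python
import hashlib
import os

def _keystream_xor(key: bytes, data: bytes) -> bytes:
    # Toy stand-in for AES-GCM: SHA-256 counter-mode keystream XOR.
    # Illustrative only; real code would use an authenticated cipher.
    out = bytearray()
    for offset in range(0, len(data), 32):
        ks = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out += bytes(a ^ b for a, b in zip(chunk, ks))
    return bytes(out)

class ToyKms:
    """Stand-in for a KmsAdapter: wraps/unwraps DEKs under a master key."""
    def __init__(self):
        self._master = os.urandom(32)       # in production: lives in the KMS/HSM
    def wrap(self, dek: bytes) -> bytes:
        return _keystream_xor(self._master, dek)
    def unwrap(self, blob: bytes) -> bytes:
        return _keystream_xor(self._master, blob)

def seal_token(kms, token: bytes):
    dek = os.urandom(32)                    # fresh 256-bit DEK per write
    return kms.wrap(dek), _keystream_xor(dek, token)

def open_token(kms, wrapped_dek, ciphertext):
    return _keystream_xor(kms.unwrap(wrapped_dek), ciphertext)

kms = ToyKms()
wrapped, ct = seal_token(kms, b"oauth-access-token")
```

Because each write mints its own DEK, compromising one stored token never exposes the others, and key rotation only has to re-wrap DEKs rather than re-encrypt every payload.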
Yes, built into the OSS. Define an experiment with control / treatment variants, set traffic split, restrict the audience (roles / users / time window). Each user's assignment is deterministic (SHA-256 hash) so they always see the same variant during the experiment. Outcomes auto-track via the existing record_outcome() API. praxia experiment results <id> shows the tentative winner.
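Deterministic assignment of the kind described above is usually a hash-to-bucket mapping. The exact hashing scheme here is an assumption (the function name and key format are illustrative, not the real implementation):

```python
import hashlib

def assign_variant(experiment_id: str, user_id: str, split: dict) -> str:
    """Deterministically bucket a user into a variant.

    SHA-256 of experiment+user maps to a point in [0, 1); cumulative
    traffic-split boundaries pick the variant. Same inputs always give
    the same answer, with no assignment table to store.
    """
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode()).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64
    cumulative = 0.0
    for variant, share in split.items():
        cumulative += share
        if point < cumulative:
            return variant
    return variant  # float rounding: last variant absorbs the remainder

split = {"control": 0.5, "candidate": 0.5}
chosen = assign_variant("proposal_v2", "alice", split)
```

Keying the hash on both experiment and user means alice can land in control for one experiment and candidate for another, while staying stable within each.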
Yes. Praxia is an MCP server in two flavors. Local (recommended for desktop): praxia mcp serve, then configure Claude Desktop's mcp.json to spawn it via stdio. Remote (multi-host / team): run praxia serve and the MCP HTTP+SSE endpoints under /api/v1/mcp are available. Auth via API key, JWT, or a shared X-MCP-Token. Every business skill + multi-agent flow + memory search becomes an MCP tool automatically; no per-tool wiring required.
Run tests/llm_eval/ in CI. Each PR runs 6 canonical cases (one per business skill) against the configured LLM and grades output with rubrics (keyword / structure / length / must-not-contain / LLM-as-judge). Scores below the committed baseline minus 5pt fail the build. Update the baseline with --update-baselines after a known-good change.
Praxia is fully open source under Apache 2.0: every feature (SSO, RBAC, ACL, audit, OAuth, all skills, all connectors, AutonomousAgent) is in the OSS package. The hosted edition is invitation-only alpha while we tune onboarding; commercial pricing will be set at v1.0.
$0
forever ยท Apache 2.0
Invitation-only
pricing TBD at v1.0 ยท waitlist open
Alpha status: hosted backend is being stabilized. We onboard waitlist organizations in batches of ~10 as capacity allows. Need on-prem or compliance review? Mention it in the waitlist form and we'll coordinate.
Join waitlist
Looking for OSS license interpretation, embedded use, or revenue share? See LICENSE and NOTICE.
Star us on GitHub, run the quickstart, or reach out for a tailored PoC.