Milens in Action

From one-command onboarding to enterprise multi-agent pipelines — see how Milens transforms development at every scale. 17 Workflow Stories + 38 tactical scenarios.

17 Workflow Stories

38 Tool Scenarios

43 MCP Tools

7 Agent Harnesses

Workflow Stories

Not just tools. Complete workflows.

Each story chains multiple Milens tools into a real workflow — the kind that saves hours or days. Pick your scale: startup, enterprise, legacy, or greenfield.

New Developer: Productive in 1 Hour

Any Scale New Hire Onboarding

A new developer joins a 500k-line codebase. Zero documentation. Instead of weeks reading code, they ask the AI agent to explain the codebase — and Milens feeds it exactly what it needs, in the right order, with minimal tokens.

codebase_summary() → domains() → routes() → context({name: "OrderService"}) → ...start coding

Outcome: Developer understands the project structure, key entry points, and the module they need to work on — all in under 60 minutes. Milens keeps the AI agent context under 1000 tokens per call.

Startup: Ship Features Daily (2–5 Devs)

Small Team 2–5 Devs Velocity

A 4-person startup needs to move fast. Every PR must be safe, tested, and reviewed — but there's no dedicated QA or security team. Milens acts as the automated safety net.

init --profile full → impact({target: "Payment"}) → test_plan({name: "Payment"}) → review_pr() → pre_commit_check()

Outcome: Every commit: blast-radius checked, tests planned, PR reviewed, security scanned. All automated. 4 devs ship like a 10-person team. No dedicated QA needed.

Enterprise: Monorepo with 200+ Engineers

Large Enterprise 200+ Devs Governance

A company with 15 microservices in one monorepo. Changing a shared type in @company/common could break 12 services. Milens indexes all 15 services and tracks cross-service impact automatically.

repos() → impact({target: "UserDTO", depth: 5}) → review_pr({ref: "main"}) → orchestrate() → test_impact()

Outcome: Before merging any shared-type change: see exactly which 12 services break, get risk scores per service, know which tests to run, and get a structured action plan — automatically.

Legacy Codebase: Migrate Without Fear

Legacy Any Team Migration

A 10-year-old PHP monolith with 300k lines and zero tests. The team needs to extract the billing module into a microservice. No one knows what's safe to touch.

find_dead_code() → domains() → trace({name: "billingProcess", direction: "from"}) → impact({target: "Invoice"}) → annotate({symbol: "Invoice", key: "migration", value: "extracted to svc-billing"})

Outcome: Dead code removed first (safely). Billing domain isolated via graph analysis. Full downstream trace of every function the module touches. Annotations track migration progress across multiple sessions and agents.

Multi-Agent Pipeline: 5 Agents, 1 Feature

Any Scale AI-Native Team Automation

A feature request goes through 5 specialized AI agents — Planner, Coder, Reviewer, Tester, Security — each with its own Milens session. Handoffs preserve context, annotations track decisions, and nothing gets lost.

session_start("planner") → impact + test_plan → handoff(to: "coder") → edit_check + implement → handoff(to: "reviewer") → review_pr + review_symbol → handoff(to: "tester") → test_impact + test_generate → handoff(to: "security") → security_scan + fix_apply

Outcome: Planner analyzes blast radius → Coder implements with real-time safety checks → Reviewer scores every changed symbol → Tester auto-generates test files → Security scans and auto-fixes. Each handoff transfers full context. Audit trail via annotations.

CI/CD: Automated Quality Gate

Any Scale DevOps Pipeline

Every PR gets a Milens quality gate before merge. No human reviewer needed for routine changes. Risky changes get flagged automatically for human attention.

detect_changes() → pre_commit_check() → review_pr() → test_impact() → orchestrate()

Outcome: GitHub App auto-comments on every PR: risk scores, affected tests, dead code introduced, coverage gaps. Low-risk PRs auto-merge. High-risk PRs ping the right team. Zero manual triage.

Security Compliance: SOC2 / ISO 27001 Ready

Enterprise Security Team Compliance

A fintech company needs SOC2 compliance. Every commit must be scanned for secrets, injection, and data leaks. Auditors need evidence of continuous monitoring.

security_scan({scope: "all", severity: "LOW"}) → security_scan({scope: "secrets"}) → fix_apply({ruleId: "..."}) → annotate({key: "security", value: "audited 2026-05-28"}) → recall({key: "security"})

Outcome: 190+ rules run on every commit. Secrets caught pre-push. Auto-fixes create auditable backups. Annotations provide timestamped evidence. Recall produces audit trail on demand. SOC2 evidence: done.

Tech Debt Sprint: From Chaos to Backlog

Any Scale Engineering Lead Maintenance

Your team knows there's tech debt but can't quantify it. Is it 20 hours? 200 hours? Milens turns vague "we should clean up" into a prioritized, risk-scored backlog in 10 minutes.

find_dead_code({limit: 100}) → test_coverage_gaps({limit: 50}) → review_symbol({name: "..."}) → impact({target: "...", depth: 3}) → annotate({key: "refactor", value: "priority: P1, risk: LOW, effort: 2h"})

Outcome: 100 dead symbols found. 50 untested high-risk symbols surfaced. Each scored by blast radius. Annotations tag priority and effort. Your tech debt backlog writes itself — with actual data, not guesswork.

Bug Hunt: Production Incident in 20 Minutes

Any Scale On-Call Incident Response

Payment processing is failing. Error: "Transaction timeout" at PaymentGateway.process(). Thousands of users affected. You need the root cause NOW — not after reading 50 files.

trace({name: "PaymentGateway.process", direction: "to"}) → smart_context({name: "PaymentGateway", intent: "debug"}) → explain_relationship({from: "PaymentGateway", to: "BankAdapter"}) → compare_impact({name: "PaymentGateway", action: "snapshot"}) → ...fix & deploy

Outcome: Trace shows the exact call chain: API → PaymentController → PaymentGateway → BankAdapter (where it times out). Smart context reveals what changed. compare_impact shows the regression diff. Root cause in 20 min, not 4 hours.

Greenfield: Best Practices From Day Zero

New Project Any Team Foundation

Starting a new project. You want AI agents to follow your conventions, write tested code, and never commit secrets. Milens sets up the guardrails before the first line of code.

init --profile full --interactive → analyze -p . --force --skills → AGENTS.md generated → 6 skill files installed → 190+ security rules active → pre-commit hooks ready

Outcome: One command. AGENTS.md tells every AI agent your conventions. Skills define workflows. Security rules run on every commit. Pre-commit hooks block risky changes. Your project starts with enterprise-grade guardrails — at zero cost.

The Self-Learning Codebase

Any Scale All Teams AI Learning

An AI agent discovers a subtle bug in createUser(): "must call normalizeEmail() first or it corrupts data." This knowledge shouldn't die when the session ends. Milens makes your codebase learn — and the agent gets smarter every session.

annotate({symbol: "createUser", key: "bug", value: "Call normalizeEmail() first"}) → Session 2 → recall({symbol: "createUser"}) → session_context() → Session 5: confidence=0.9 → evolve

Outcome: Session 1: agent finds and annotates the bug. Session 2: agent auto-recalls the warning before touching createUser(). Session 5: confidence reaches 0.9 → milens evolve promotes it to .agents/skills/milens-bug/SKILL.md. Now enforced as a permanent rule. Your codebase literally gets smarter over time. No other tool does this.

Context Never Dies: The Hook System

Any Scale AI-Native Continuity

AI agents have context windows (~200k tokens). When they fill up, the agent forgets everything — all the conventions it learned, all the bugs it found. Milens hooks automate context persistence: save before compaction, restore after. The agent never truly resets.

hook_preCompact() → save metrics snapshot → hook_postCompact() → recall annotations → session_start() → annotate + recall

Outcome: 6 hooks automate everything: onSessionStart refreshes index + recalls warnings. onSessionEnd auto-reviews changes + annotates. onPreCommit runs safety checks. onFileChange live-updates impact. onPreCompact saves before context loss. onPostCompact restores after. Agent works for hours without losing a thing.

Zero-Trust: Fully Offline Intelligence

Enterprise Security Team Compliance

Your source code is your most sensitive asset. Sending it to cloud AI APIs is a non-starter for defense contractors, banks, hospitals, and anyone under GDPR/HIPAA/ITAR. Milens runs entirely on your machine. Zero network. Zero telemetry. Zero external calls.

npx milens init → analyze -p . --force → SQLite + tree-sitter WASM (local) → serve --http --port 3100 → 43 tools over localhost MCP

Outcome: All 43 tools work completely offline. Knowledge graph stored in local SQLite. Tree-sitter runs as WASM — no native binaries. MCP server binds 127.0.0.1 only. No API keys, no cloud dependency, no telemetry. Defense-grade code intelligence without defense-grade complexity.

Standardize Your AI Team

Enterprise Team Lead / CTO Governance

5 developers, 5 AI agents, 5 different coding styles. Chaos. Milens auto-generates AGENTS.md from your actual codebase conventions — not a generic template. Skill files enforce workflows. Security rules run on every commit. And evolve turns learned patterns into permanent rules.

init --profile full --interactive → AGENTS.md auto-generated → 6 skill files installed → 190+ security rules active → evolve --schedule install

Outcome: Every agent reads the same AGENTS.md — "use tabs, not spaces", "always run impact before editing", "never commit without detect_changes". Conventions enforced, not suggested. evolve runs weekly via cron: high-confidence patterns auto-promote to permanent rules. One command. One standard. Zero chaos.

Reduced Token Usage = $1,000s Saved Monthly

Any Scale Cost-Aware Optimization

AI agents charge per token. Every time your agent reads a full file to understand one function, you're burning cash. Milens' compact format cuts 40-60% of tokens. codebase_summary is ~500 tokens for the entire project. smart_context returns only what your intent needs. For a 10-person team using AI daily, this saves thousands per month.

codebase_summary() → ~500 tokens → smart_context({intent: "edit"}) → callers only → overview() → 1 call vs 3 separate → context({detail: "L0"}) → names only → $1000s saved/month

Outcome: The compact format returns name [kind] file:line instead of raw code blocks. Detail levels L0/L1/L2 control output size. overview combines 3 tools in 1 call (saves 2 round trips). smart_context filters by intent. For a team making 200 tool calls/day, reduced token usage has a real budget impact. Your CFO will love this page.
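As a rough illustration of the compact format described above, the sketch below renders symbols as single lines and as a names-only list. The function names, the dictionary layout, and the hashPassword entry are hypothetical; this is not Milens's actual serializer, only a minimal sketch of the name [kind] file:line idea.

```python
def format_symbol(name, kind, file, line, exported=False):
    """Render one symbol as a single compact line."""
    text = f"{name} [{kind}] {file}:{line}"
    return f"{text} (exported)" if exported else text


def names_only(symbols):
    """Smallest rendering: just the names, one per line."""
    return "\n".join(s["name"] for s in symbols)


# Hypothetical symbol records, used only for illustration.
symbols = [
    {"name": "AuthService", "kind": "class", "file": "src/auth.ts", "line": 12, "exported": True},
    {"name": "hashPassword", "kind": "function", "file": "src/auth.ts", "line": 48, "exported": False},
]

compact = [
    format_symbol(s["name"], s["kind"], s["file"], s["line"], s["exported"])
    for s in symbols
]
```

A one-line record like this carries the location an agent needs to open the right file, without shipping the function body along with it.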

Clean Uninstall — Remove Every Trace

You're done with milens on a project and want full cleanup. Not just npm uninstall — injected blocks, generated files, hooks, cron, registry entries, MCP configs, env vars.

milens uninstall → Scans 11 trace categories → Interactive removal → Final verification scan

Outcome: milens uninstall scans injected blocks, generated files, git hooks, cron entries, Windows tasks, databases, registry, MCP configs, deps, env vars, and manual references. --dry-run previews without deleting. --scan-only detects only. One command, zero traces left.

12-Language Accuracy — Precision/Recall Validation

You maintain a multi-language codebase (TypeScript, Python, Ruby, Vue). You need to know how accurate milens's symbol detection is for each language before trusting it in production.

8 accuracy fixtures → expected.json per project → analyze() → precision/recall → Validated per language

Outcome: Ruby detects constants, instance/class variables, type bindings, Rails DSL. Vue SFC parses Composition API + AST template + scoped CSS. CSS extracts class/id/keyframes/media queries. HTML uses tree-sitter + form/link detection. Cross-language linking: HTML class → CSS symbol. 8 test projects validate precision/recall across all 12 languages.
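The precision/recall check above can be sketched as a set comparison between the symbols listed in an expected.json and the symbols the analyzer reports. The function name and the sample symbol names are assumptions made for illustration; how Milens actually loads fixtures and matches symbols is not shown in the source.

```python
def precision_recall(expected, detected):
    """Compare expected symbol names with detected ones.

    precision = true positives / everything detected
    recall    = true positives / everything expected
    Returning 0.0 for an empty denominator is a choice made for this sketch.
    """
    expected, detected = set(expected), set(detected)
    true_positives = len(expected & detected)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical fixture: four expected symbols, one missed, one spurious.
expected = ["UserModel", "create_user", "MAX_RETRIES", "@email"]
detected = ["UserModel", "create_user", "MAX_RETRIES", "tmp_helper"]
precision, recall = precision_recall(expected, detected)
```

Low precision means the index contains symbols that do not exist; low recall means real symbols are missing, which matters more for impact analysis.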

💥 Check Blast Radius Editing

You say

"edit the loginHandler function"

Tool

impact({target: "loginHandler", depth: 3})

What happens

Shows every symbol that breaks if loginHandler changes. Depth-1: 3 dependents. Depth-2: 8. Depth-3: 14. If depth-1 > 5, Milens warns you before editing.
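The depth-based blast radius above can be pictured as a breadth-first walk over "who depends on me" edges. The sketch below is not Milens's implementation; the graph, the function name, and the choice to count each symbol only at the depth where it is first reached are assumptions. The warning threshold (more than 5 direct dependents) comes from the description above.

```python
from collections import deque


def dependents_by_depth(reverse_deps, target, max_depth=3):
    """Return {depth: symbols first reached at that depth} for a target."""
    seen = {target}
    levels = {}
    frontier = deque([(target, 0)])
    while frontier:
        symbol, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for dependent in reverse_deps.get(symbol, ()):
            if dependent not in seen:
                seen.add(dependent)
                levels.setdefault(depth + 1, set()).add(dependent)
                frontier.append((dependent, depth + 1))
    return levels


# Hypothetical graph: each key maps to the symbols that depend on it.
graph = {
    "loginHandler": ["authRoute", "sessionGuard"],
    "authRoute": ["apiRouter"],
    "sessionGuard": ["apiRouter", "adminPanel"],
}
levels = dependents_by_depth(graph, "loginHandler")

# Warn when a symbol has more than 5 direct dependents.
needs_warning = len(levels.get(1, ())) > 5
```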

🗑️ Delete a Symbol Safely Editing

You say

"delete oldHelper"

Tool

grep({pattern: "oldHelper"})
impact({target: "oldHelper"})

What happens

Finds ALL text references (templates, configs, docs, routes) + all code dependents. Nothing is missed — grep catches what impact can't see.

✏️ Rename a Symbol Editing

You say

"rename getUser to fetchUser"

Tool

grep({pattern: "getUser"})
impact({target: "getUser", direction: "upstream"})

What happens

Combines grep (text search across all files) + impact (code-level dependents). Shows exactly which files need updating before you touch anything.

🔒 Safe Commit Editing

You say

"let's commit"

Tool

detect_changes({ref: "HEAD"})

What happens

Scans git diff, returns only the symbols that actually changed + their direct dependents. Know exactly what you're committing before you commit.

📊 PR Risk Assessment Review

You say

"review this PR"

Tool

review_pr({ref: "HEAD"})

What happens

Scores every changed symbol CRITICAL, HIGH, MEDIUM, or LOW based on heat (incoming refs), dependents, and test coverage. Instantly see which files are risky.
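To make the scoring idea concrete, here is a minimal sketch that combines the three inputs named above (heat, dependents, test coverage) into one of the four levels. The weights and cut-offs are invented for illustration; the source does not state how Milens weighs these factors.

```python
def risk_level(heat, dependents, has_tests):
    """Map heat, dependent count, and test status to a risk label.

    Weights and thresholds are hypothetical, chosen only for this sketch.
    """
    score = heat + 2 * dependents
    if not has_tests:
        score *= 2  # untested symbols are treated as twice as risky
    if score >= 40:
        return "CRITICAL"
    if score >= 20:
        return "HIGH"
    if score >= 8:
        return "MEDIUM"
    return "LOW"
```

For example, a symbol with heat 10 and 5 dependents scores 20 (HIGH) when tested and 40 (CRITICAL) when untested under these assumed weights.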

🎯 Deep-Dive a Symbol Review

You say

"is the handlePayment function risky?"

Tool

review_symbol({name: "handlePayment"})

What happens

Returns role + heat + full dependent list + test status + risk level + actionable recommendation. One call, complete picture.

🧹 Find Dead Code Review

You say

"is there any dead code?"

Tool

find_dead_code({limit: 30})

What happens

Lists all exported symbols with zero incoming references. Filter by kind (function, class, method) to find unused code instantly.
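The rule described above (exported, zero incoming references, optional filter by kind) can be sketched in a few lines. The data layout and names below are assumptions for illustration, not Milens's internal schema.

```python
def find_dead_exports(symbols, references, kind=None):
    """Return exported symbols that nothing references.

    symbols:    list of dicts with "name", "kind", "exported"
    references: list of (source, target) name pairs
    """
    referenced = {target for _source, target in references}
    return [
        s["name"]
        for s in symbols
        if s["exported"]
        and s["name"] not in referenced
        and (kind is None or s["kind"] == kind)
    ]


# Hypothetical index contents.
symbols = [
    {"name": "oldHelper", "kind": "function", "exported": True},
    {"name": "LegacyCache", "kind": "class", "exported": True},
    {"name": "formatDate", "kind": "function", "exported": True},
    {"name": "internalTmp", "kind": "function", "exported": False},
]
references = [("renderInvoice", "formatDate")]

dead = find_dead_exports(symbols, references)
dead_functions = find_dead_exports(symbols, references, kind="function")
```

A graph-only check like this cannot see text references in templates or configs, which is why the deletion scenario pairs impact with grep.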

🚦 Pre-Commit Gate Review

You say

"check pre-commit"

Tool

pre_commit_check()

What happens

Runs review_pr + find_dead_code + test_coverage_gaps in one call. A complete pre-commit safety gate before you push.

📋 Plan Tests Testing

You say

"write tests for createUser"

Tool

test_plan({name: "createUser"})

What happens

Returns mock strategy (what to stub/spy/fake) + 3+ test scenarios: happy path, edge case, error handling. Know exactly what to test.

🕳️ Coverage Gaps Testing

You say

"where are tests missing?"

Tool

test_coverage_gaps({limit: 20})

What happens

Lists untested exported symbols sorted by risk (CRITICAL first). Prioritize what to test based on actual impact, not guesswork.

🎯 Test Impact Testing

You say

"which tests should I run after this change?"

Tool

test_impact({ref: "HEAD"})

What happens

Maps changed symbols directly to affected test files. Run only the tests that matter — no more running the entire suite.
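One way to picture the mapping from changed symbols to test files: walk every transitive dependent of the changed symbols, then collect the test files known to exercise any symbol reached. The sketch below assumes two hypothetical lookup tables (reverse_deps and tests_by_symbol); it is not Milens's actual algorithm.

```python
def affected_tests(changed, reverse_deps, tests_by_symbol):
    """Return the sorted test files touched by a set of changed symbols."""
    seen = set(changed)
    stack = list(changed)
    while stack:
        symbol = stack.pop()
        for dependent in reverse_deps.get(symbol, ()):
            if dependent not in seen:
                seen.add(dependent)
                stack.append(dependent)
    tests = set()
    for symbol in seen:
        tests.update(tests_by_symbol.get(symbol, ()))
    return sorted(tests)


# Hypothetical data: formatDate changed, renderInvoice depends on it.
reverse_deps = {"formatDate": ["renderInvoice"]}
tests_by_symbol = {
    "formatDate": ["tests/date.test.ts"],
    "renderInvoice": ["tests/invoice.test.ts"],
    "login": ["tests/auth.test.ts"],
}
to_run = affected_tests(["formatDate"], reverse_deps, tests_by_symbol)
```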

🤖 Auto-Generate Tests Testing

You say

"generate tests for formatDate"

Tool

test_generate({symbol: "formatDate"})

What happens

Auto-detects test framework (vitest/jest/mocha/pytest) and generates a complete test file with mock strategy and scenarios.

🛡️ Full Security Audit Security

You say

"run a security scan on the whole project"

Tool

security_scan({scope: "all", severity: "HIGH"})

What happens

190+ rules across 10 categories, including secrets, injection, unicode, dangerous patterns, config leaks, data leaks, crypto, auth, and file access. One call replaces 10 manual grep searches.

🔧 Auto-Fix Security Security

You say

"fix the security issue on line 42 of auth.ts"

Tool

fix_apply({ruleId: "secret-in-code", file: "src/auth.ts", line: 42})

What happens

Applies the security fix automatically and creates a backup in .milens/backups/. Safe, reversible, auditable.

🔑 Detect Leaked Secrets Security

You say

"have any secrets leaked?"

Tool

security_scan({scope: "secrets"})

What happens

Detects API keys, access tokens, passwords, private keys, and connection strings embedded in source code. Know before they reach production.
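Secret detection of this kind is commonly built on line-by-line pattern matching. The sketch below uses three illustrative regular expressions; the rule names and patterns are assumptions and are not taken from Milens's rule set. The sample key is the placeholder value AWS uses in its own documentation.

```python
import re

# Illustrative patterns only; a real rule set would be far larger.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key-header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "hardcoded-password": re.compile(r"(?i)\bpassword\s*[:=]\s*['\"][^'\"]+['\"]"),
}


def scan_text(text):
    """Return (line_number, rule_name) for every pattern hit."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings


sample = 'const key = "AKIAIOSFODNN7EXAMPLE";\nconst password = "hunter2";\nconst port = 3100;'
findings = scan_text(sample)
```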

🔍 Trace Execution Debugging

You say

"there's an error in sendEmail, trace the execution path"

Tool

trace({name: "sendEmail", direction: "to"})

What happens

Shows the full call chain from entrypoints down to sendEmail. Understand exactly how this code gets triggered at runtime.

🔗 Find Relationships Debugging

You say

"how are AuthService and TokenManager related?"

Tool

explain_relationship({from: "AuthService", to: "TokenManager"})

What happens

Returns the shortest dependency path between any two symbols. AuthService → AuthMiddleware → TokenManager (2 steps).
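A shortest dependency path between two symbols can be found with a plain breadth-first search. The sketch below reproduces the AuthService example on a small hypothetical graph (the Logger node is invented); whether Milens uses BFS internally is an assumption.

```python
from collections import deque


def shortest_path(deps, start, goal):
    """Return the shortest list of symbols from start to goal, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk parent links back to start
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in deps.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical graph: each key maps to the symbols it depends on.
deps = {
    "AuthService": ["AuthMiddleware", "Logger"],
    "AuthMiddleware": ["TokenManager"],
    "Logger": [],
}
path = shortest_path(deps, "AuthService", "TokenManager")
steps = len(path) - 1
```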

🩺 Debug a Symbol Debugging

You say

"find out why the User model is failing"

Tool

smart_context({name: "User", intent: "debug"})

What happens

Returns execution paths + data flow for the User symbol. See what calls it, what it calls, and where data flows — all in one call.

📦 Codebase Overview Architecture

You say

"what's the overall picture of this project?"

Tool

codebase_summary()

What happens

~500 token overview: domains, top hub symbols, test coverage %, annotation counts. The fastest way to understand any codebase.

🧩 Domain Clusters Architecture

You say

"which modules does this project have?"

Tool

domains()

What happens

Shows logical module clusters based on the dependency graph. See which files form cohesive groups — auth, db, routes, utils, config.

🌐 API Endpoints Architecture

You say

"which API endpoints are there?"

Tool

routes({framework: "express"})

What happens

Detects framework routes and maps them to handler functions. [GET] /api/users → userController.list. Supports Express, FastAPI, NestJS, Flask, Go, PHP, Rails.

🔎 Find Symbols Discovery

You say

"find the function that handles authentication"

Tool

query({query: "auth"})

What happens

FTS5 symbol search returns definitions instantly. AuthService [class] src/auth.ts:12 (exported). Use query for code identifiers, grep for display text.

📝 Text Search Discovery

You say

"where does the text 'User not found' appear?"

Tool

grep({pattern: "User not found"})

What happens

Searches ALL files — templates, configs, styles, docs, routes. Finds every occurrence, not just code symbols. Perfect for UI labels and error messages.

🪞 Find Similar Code Discovery

You say

"is there any function similar to formatCurrency?"

Tool

find_similar({name: "formatCurrency"})

What happens

Returns top 10 symbols that share callers or callees. Great for finding patterns to copy, refactor together, or avoid duplicating.
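"Shares callers or callees" can be turned into a ranking in several ways. The sketch below uses Jaccard overlap of each symbol's combined caller and callee sets, which is a stand-in technique chosen for illustration and not necessarily what Milens computes. All symbol names other than formatCurrency are hypothetical.

```python
def neighbour_similarity(a, b, callers, callees):
    """Jaccard overlap of the combined caller/callee sets of two symbols."""
    na = callers.get(a, set()) | callees.get(a, set())
    nb = callers.get(b, set()) | callees.get(b, set())
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


def find_similar(name, callers, callees, top=10):
    """Rank other symbols by neighbour overlap, best first."""
    candidates = (set(callers) | set(callees)) - {name}
    scored = [(neighbour_similarity(name, c, callers, callees), c) for c in candidates]
    ranked = sorted((p for p in scored if p[0] > 0), key=lambda p: (-p[0], p[1]))
    return [c for _score, c in ranked[:top]]


# Hypothetical call graph.
callers = {
    "formatCurrency": {"renderCart", "renderInvoice"},
    "formatPercent": {"renderCart", "renderInvoice"},
    "parseDate": {"loadOrder"},
}
callees = {
    "formatCurrency": {"roundTo"},
    "formatPercent": {"roundTo"},
    "parseDate": {"tzLookup"},
}
similar = find_similar("formatCurrency", callers, callees)
```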

📄 File Symbols Discovery

You say

"which symbols are in src/auth.ts?"

Tool

get_file_symbols({file: "src/auth.ts"})

What happens

Lists every symbol in the file with kind, line number, and reference/dependency counts. Understand any file at a glance without opening it.

🧠 Semantic Search Discovery

You say

"find the concept 'payment processing', without needing to know the function name"

Tool

semantic_search({query: "payment processing"})

What happens

Finds relevant symbols by meaning, not name. Discover checkoutFlow even though you searched "payment". Requires --embeddings flag for vector search.

🌳 Explore AST Discovery

You say

"parse this code snippet and show me the AST structure"

Tool

ast_explore({code: "const x = 1", language: "typescript"})

What happens

Returns the S-expression AST tree for any code snippet. Supports all 12 milens-indexed languages. Debug parsers, write tree-sitter queries, or understand language grammar.

🏛️ Type Hierarchy Architecture

You say

"which classes does PaymentProcessor inherit from?"

Tool

get_type_hierarchy({name: "PaymentProcessor"})

What happens

Shows full inheritance tree: BaseProcessor → PaymentProcessor → CreditCardProcessor, PayPalProcessor. Understand OOP structure instantly.

🩺 Index Health Architecture

You say

"is the index database still healthy?"

Tool

status()

What happens

Returns symbols count, links count, files, test coverage %, and staleness. Know instantly if your index is up to date or needs analyze --force.

🗂️ List Repos Architecture

You say

"how many repos have been indexed?"

Tool

repos()

What happens

Lists all indexed repositories with symbol counts, file counts, and last-indexed dates. Essential for monorepo and multi-project setups.

⚠️ Pre-Edit Safety Check Editing

You say

"before I edit resolveLinks, are there any warnings?"

Tool

edit_check({name: "resolveLinks"})

What happens

Returns callers + export status + re-export chains + warnings. "WARNING: exported function — changing signature is a breaking change." Designed specifically for editing intent.

⚡ Overview in One Call Editing

You say

"give me context + impact + grep for Database in one call"

Tool

overview({name: "Database"})

What happens

Combines context, impact, and grep into ONE call. Saves 2 round trips. The preferred tool before editing, deleting, or renaming any symbol.

📊 Compare Impact Debugging

You say

"I've finished refactoring UserService; did the blast radius grow or shrink?"

Tool

compare_impact({name: "UserService", action: "snapshot"})
compare_impact({name: "UserService", action: "compare"})

What happens

Snapshot before, compare after. Detects regressions: new dependents appeared, heat increased, blast radius widened. Your automated regression guard for every refactor.
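The snapshot-then-compare flow boils down to a set difference plus a heat delta. The sketch below assumes a snapshot is a dictionary holding a dependents list and a heat number; that shape, the field names, and the sample values are hypothetical.

```python
def compare_snapshots(before, after):
    """Diff two impact snapshots and flag a regression.

    A regression here means new dependents appeared or heat went up.
    """
    old, new = set(before["dependents"]), set(after["dependents"])
    added = sorted(new - old)
    removed = sorted(old - new)
    heat_delta = after["heat"] - before["heat"]
    return {
        "added": added,
        "removed": removed,
        "heat_delta": heat_delta,
        "regression": bool(added) or heat_delta > 0,
    }


# Hypothetical snapshots taken before and after a refactor.
before = {"dependents": ["UserController", "AuthService"], "heat": 7}
after = {"dependents": ["UserController", "AuthService", "BillingJob"], "heat": 9}
diff = compare_snapshots(before, after)
```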

🔄 File Change Impact Debugging

You say

"I just edited 3 files; which symbols are affected?"

Tool

hook_onFileChange({files: ["src/auth.ts", "src/models.ts"]})

What happens

Re-analyzes changed files and shows impact on all affected symbols. Hook triggers automatically when files change — no manual re-analysis needed.

🔬 Test Tree-Sitter Query Testing

You say

"test this tree-sitter query before adding it to the parser"

Tool

test_query({query: "(identifier) @name", code: "...", language: "typescript"})

What happens

Runs a tree-sitter query against code and returns all captured matches. Test queries before adding them to a LangSpec. Essential for parser developers.

📜 Session History Discovery

You say

"which tools has this session used?"

Tool

session_context({session_id: "..."})

What happens

Returns full session metadata: tool calls made, annotations recorded, duration, agent name. Complete audit trail for every AI session.

📋 Symbol-Level PR Review Code Review

You say

"review this PR, but don't flag symbols that haven't changed"

Tool

review_pr({ref: "main"})

What happens

Symbol-level diff via git show — compares old vs new commit symbols, not entire files. Cross-file impact tracks downstream callers. Fixture/test files excluded. Internal implementation details deprioritized. Only actually changed symbols flagged.

🧹 Clean Uninstall Architecture

You say

"remove all milens traces from this project"

CLI

milens uninstall
milens uninstall --dry-run

What happens

Scans 11 trace categories: injected blocks, generated files, git hooks, cron, Windows tasks, databases, registry, MCP configs, deps, env vars, manual refs. Runs as an interactive wizard or removes traces automatically; --dry-run previews without deleting anything.