HIP-192 | Draft | Standards Track | Interface

Unified MCP Tools Architecture

Hanzo AI Team
Created: 2025-01-21

HIP-0300: Unified MCP Tools Architecture

Abstract

This HIP proposes consolidating hanzo-mcp's 52+ individual tools into 8 core orthogonal tools, following the Unix philosophy. Each tool handles one axis (for example bytes, processes, symbols, diffs, or UI) with minimal overlap, and tools compose via stable identifiers.

Principles

  1. De-dupe: One canonical way to do a thing; everything else is alias/shim
  2. Unify: Identical envelope + paging + error model + path/range semantics
  3. Orthogonal: Each tool does one axis with minimal overlap
  4. Composable: Tools are pure-ish functions over stable IDs (uri, hash, proc_id), so agents can pipe outputs reliably

Specification

The 8 Core Tools

| Tool | Domain | Axis | Key Actions |
|------|--------|------|-------------|
| ws | Workspace | Context | detect, capabilities, help, schema |
| fs | Filesystem | Bytes + Paths | read, write, stat, list, apply_patch, search_text |
| code | Semantics | Symbols | definition, references, rename, format, diagnostics |
| proc | Processes | Execution | exec, ps, kill, logs |
| test | Testing | Verification | list, run, status |
| vcs | Version Control | Diffs + History | status, diff, apply, commit, branch |
| net | Network | HTTP/API | request, download, open |
| ui | Interface | Screen/Browser | click, type, screenshot, focus, record |

Optional (heavy dependencies):

| Tool | Domain | Key Actions |
|------|--------|-------------|
| llm | LLM Providers | query, stream, embed, consensus |
| memory | Persistent Storage | recall, create, update, facts |
| hanzo | Platform | cloud, deploy, auth, node |

Hard De-dupe Rules (Non-negotiable)

1. One Edit Primitive

Only fs.apply_patch(base_hash=...) mutates existing files.

  • Prevents stale edits
  • Enables transactions/replay
  • Makes "review before apply" natural
# The ONLY way to edit files
fs(action="apply_patch", uri="file:///...", patch="...", base_hash="abc123")
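
The precondition check behind base_hash can be sketched as follows. This is a minimal illustration, not the hanzo-mcp implementation: the HIP does not specify the patch format, so a single search/replace pair stands in for the patch, and `ConflictError`, `content_hash`, and this `apply_patch` signature are hypothetical names.

```python
import hashlib
from pathlib import Path


class ConflictError(Exception):
    """Hypothetical error type carrying the expected and actual hashes."""

    def __init__(self, expected: str, actual: str):
        super().__init__("base_hash mismatch")
        self.details = {"expected": expected, "actual": actual}


def content_hash(data: bytes) -> str:
    # Assumed format: "sha256:" followed by the hex digest of the file bytes.
    return "sha256:" + hashlib.sha256(data).hexdigest()


def apply_patch(path: Path, old: str, new: str, base_hash: str) -> str:
    """Replace the first `old` with `new`, but only if the file still matches base_hash."""
    current = path.read_bytes()
    actual = content_hash(current)
    if actual != base_hash:
        # The file changed since the caller read it: refuse the stale edit.
        raise ConflictError(expected=base_hash, actual=actual)
    text = current.decode("utf-8")
    if old not in text:
        raise ValueError("patch does not apply: old text not found")
    updated = text.replace(old, new, 1).encode("utf-8")
    path.write_bytes(updated)
    # Return the new hash so the caller can chain a further edit.
    return content_hash(updated)
```

Returning the new hash lets an agent apply several edits in sequence without re-reading the file between them.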

2. One Search Primitive Per Axis

| Axis | Tool | Action |
|------|------|--------|
| Text (grep) | fs | search_text |
| Symbols | code | search_symbol |

Never mix these.

3. One Execution Primitive

All command execution goes through proc.exec. test.run is a normalized preset on top of it.

# All execution
proc(action="exec", command="npm test")

# Testing (wrapper over proc.exec)
test(action="run", suite="unit")

4. One Diff Primitive

  • vcs.diff for repo diffs
  • fs.diff(hash_a, hash_b) optional, content-hash based

Unified Envelope

Every tool returns:

{
  "ok": true,
  "data": { },
  "error": null,
  "meta": {
    "tool": "fs",
    "action": "read",
    "trace_id": "...",
    "backend": "ripgrep|lsp|...",
    "paging": { "cursor": null, "more": false }
  }
}

Error response (typed codes rather than free-form strings):

{
  "ok": false,
  "data": null,
  "error": {
    "code": "CONFLICT",
    "message": "base_hash mismatch",
    "details": { "expected": "abc123", "actual": "def456" }
  },
  "meta": { }
}
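
On the client side, the two envelope shapes above can be collapsed into "return data or raise". The sketch below assumes only the fields shown in this section; `ToolError` and `unwrap` are hypothetical helper names, not part of hanzo-mcp.

```python
from typing import Any


class ToolError(Exception):
    """Hypothetical client-side exception built from an error envelope."""

    def __init__(self, code: str, message: str, details: dict):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.details = details


def unwrap(envelope: dict) -> Any:
    """Return envelope['data'] on success; raise ToolError on failure."""
    if envelope.get("ok"):
        return envelope.get("data")
    err = envelope.get("error") or {}
    raise ToolError(
        code=err.get("code", "ERROR"),
        message=err.get("message", ""),
        details=err.get("details") or {},
    )
```

Callers can then branch on `exc.code` (for example `"CONFLICT"`) instead of matching on message text.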

Composability Primitives

Stable identifiers for chaining outputs:

| ID | Format | Example |
|----|--------|---------|
| uri | file:///... always | file:///src/main.py |
| hash | Content hash from fs.read/stat | sha256:abc123... |
| range | {start:{line,col}, end:{line,col}} 0-based | {start:{line:10,col:0}} |
| ref | Log/blob handles | stdout_ref:proc_123 |
| symbol_id | Stable IDs from LSP/TS | sym:UserService.auth |
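
A minimal sketch of how three of these identifiers might be produced, using only the Python standard library. The helper names `to_uri`, `content_hash`, and `make_range` are illustrative assumptions, as is the choice to hash the raw file bytes.

```python
import hashlib
from pathlib import Path


def to_uri(path: str) -> str:
    """Normalize any local path (relative or absolute) to a file:// URI."""
    return Path(path).resolve().as_uri()


def content_hash(data: bytes) -> str:
    """Content hash in the 'sha256:<hex>' form shown in the table."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def make_range(start_line: int, start_col: int, end_line: int, end_col: int) -> dict:
    """0-based {start:{line,col}, end:{line,col}} range."""
    return {
        "start": {"line": start_line, "col": start_col},
        "end": {"line": end_line, "col": end_col},
    }
```

Normalizing at the tool boundary means agents never have to convert between path styles or 0-based and 1-based positions themselves.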

Example pipe:

1. fs.search_text("error") → {matches:[{uri, range}]}
2. fs.read(uri, range+context) → {text, hash}
3. fs.apply_patch(uri, patch, base_hash=hash)
4. test.run()
5. vcs.diff()

No tool needs to "know" the others; each one just consumes normalized IDs.

Tool Specifications

ws (Workspace)

| Action | Description | Returns |
|--------|-------------|---------|
| detect | Auto-detect project | {root, languages, build, test, vcs, backends} |
| capabilities | List available backends | {lsp: [...], vcs: "git", ...} |
| help | Tool manpage + examples | Markdown |
| schema | JSON Schema for all actions | JSON Schema |

fs (Filesystem)

| Action | Description | Key Params |
|--------|-------------|------------|
| read | Read file | uri, range?, encoding? |
| write | Create new file | uri, content |
| stat | File metadata | uri (returns {size, hash, mtime}) |
| list | List directory | uri, depth?, pattern? |
| mkdir | Create directory | uri |
| rm | Remove (guarded) | uri, confirm |
| apply_patch | Edit file | uri, patch, base_hash |
| search_text | Ripgrep search | pattern, path?, paging? |
| diff | Content diff | hash_a, hash_b |

Critical: apply_patch is the ONLY mutation for existing files.

code (Semantics)

| Action | Description | Key Params |
|--------|-------------|------------|
| definition | Go to definition | uri, position |
| references | Find references | uri, position |
| rename | Rename symbol | uri, position, new_name |
| format | Format code | uri, range? |
| diagnostics | Get errors/warnings | uri |
| search_symbol | Search symbols | query, scope? |
| hover | Hover information | uri, position |
| completion | Code completion | uri, position |

Backends negotiated via ws.capabilities: LSP → TreeSitter → heuristic. All results normalized to {uri, range, snippet, symbol_id}.
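
The fallback order can be pictured as trying each backend in turn until one can answer. The sketch below is an assumption about how such a chain might look; the `resolve` function and the convention that a backend returns `None` when it cannot serve a query are illustrative, not taken from hanzo-mcp.

```python
from typing import Callable, Optional

# A backend takes a query and returns normalized results, or None when it
# cannot serve the request (not installed, unsupported language, ...).
Backend = Callable[[str], Optional[list]]


def resolve(query: str, backends: list) -> dict:
    """Try (name, backend) pairs in priority order; the first answer wins."""
    for name, backend in backends:
        results = backend(query)
        if results is not None:
            # Report which backend answered, as meta.backend does in the envelope.
            return {"backend": name, "results": results}
    return {"backend": None, "results": []}
```

Passing the chain as data keeps the priority order (LSP, then TreeSitter, then heuristic) configurable per workspace.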

proc (Processes)

| Action | Description | Key Params |
|--------|-------------|------------|
| exec | Run command | command, cwd?, env?, timeout? |
| ps | List processes | filter? |
| kill | Kill process | proc_id, signal? |
| logs | Get process logs | proc_id, tail?, since? |

Returns: {proc_id, exit_code, stdout_ref, stderr_ref}

No filesystem edits or output parsing: proc only runs things.
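
A rough sketch of the return shape, assuming output is stored behind handles rather than inlined. The in-memory `LOGS` store, the `proc_exec` name, and the use of an argument list instead of a shell command string are simplifications made for this example.

```python
import subprocess
import sys
import uuid

# Hypothetical in-memory log store keyed by ref; a real tool would persist
# or stream output so that the logs action can serve it later.
LOGS: dict = {}


def proc_exec(command: list, cwd=None, timeout=None) -> dict:
    """Run a command to completion and return IDs and refs, not raw output."""
    proc_id = "proc_" + uuid.uuid4().hex[:8]
    done = subprocess.run(
        command, cwd=cwd, timeout=timeout, capture_output=True, text=True
    )
    stdout_ref = f"stdout_ref:{proc_id}"
    stderr_ref = f"stderr_ref:{proc_id}"
    LOGS[stdout_ref] = done.stdout
    LOGS[stderr_ref] = done.stderr
    return {
        "proc_id": proc_id,
        "exit_code": done.returncode,
        "stdout_ref": stdout_ref,
        "stderr_ref": stderr_ref,
    }


if __name__ == "__main__":
    # sys.executable keeps the demo portable across platforms.
    result = proc_exec([sys.executable, "-c", "print('hi')"])
    print(result["exit_code"], LOGS[result["stdout_ref"]].strip())
```

Returning refs instead of inline output keeps large logs out of the envelope and lets callers fetch only the tail they need.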

test (Testing)

| Action | Description | Key Params |
|--------|-------------|------------|
| list | List test suites | path? |
| run | Run tests | suite?, filter?, parallel? |
| status | Get test status | run_id |

Normalizes: {pass, fail, skip, duration, failure_locations}.

Opinionated wrapper over proc.exec.

vcs (Version Control)

| Action | Description | Key Params |
|--------|-------------|------------|
| status | Working tree status | - |
| diff | Show diff | ref?, staged? |
| apply | Apply patch | patch |
| commit | Create commit | message, files? |
| branch | Branch operations | op (list, create, delete) |
| checkout | Switch branch | ref |
| log | Commit history | limit?, path? |

Outputs diffs in unified patch format; integrates with fs.apply_patch.

net (Network)

| Action | Description | Key Params |
|--------|-------------|------------|
| request | HTTP request | url, method, headers?, body? |
| download | Download file | url, output |
| open | Open URL in browser | url |

Separate from proc to avoid "curl in shell" duplication.

ui (Interface)

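| Action | Description | Key Params |
|--------|-------------|------------|
| click | Click at position | x, y, button? |
| type | Type text | text |
| screenshot | Capture screen | region? |
| focus | Focus window | title?, pid? |
| record | Start recording | duration? |
| stop | Stop recording | - |
| session | Record + analyze | duration |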

Consolidates computer + screen tools.

Built-in Actions (Every Tool)

| Action | Description |
|--------|-------------|
| help | Short manpage + examples |
| schema | JSON Schema for each action |
| status | Tool status (enabled, version, backend) |

Modes (Convention, Not Tools)

| Mode | Allowed Actions |
|------|-----------------|
| inspect | read, search, diagnose only |
| apply | patch, rename |
| verify | test, run |
| ship | commit, tag, release |

Encoded as meta.intent hints or config, not baked into actions.
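
One way to read "config, not baked into actions" is a lookup table consulted before dispatch. The mapping below is an illustrative guess at how the informal names in the table (read, search, patch, and so on) might correspond to concrete tool actions. It is not defined by this HIP, and `tag` and `release` are omitted because no matching action is specified.

```python
# Hypothetical mode configuration: mode name -> set of allowed "tool.action" names.
MODE_ALLOWED = {
    "inspect": {"fs.read", "fs.search_text", "code.search_symbol", "code.diagnostics"},
    "apply": {"fs.apply_patch", "code.rename"},
    "verify": {"test.run", "proc.exec"},
    "ship": {"vcs.commit"},
}


def is_allowed(mode: str, tool: str, action: str) -> bool:
    """Return True if the mode's configuration permits tool.action."""
    return f"{tool}.{action}" in MODE_ALLOWED.get(mode, set())
```

Because the check lives in configuration, adding or tightening a mode does not require touching any tool's action handlers.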

Consolidation Mapping

# Old → New

# Filesystem (7 → 1)
read            → fs(action="read")
write           → fs(action="write")
edit            → fs(action="apply_patch")  # base_hash required
tree            → fs(action="list", depth=...)
find            → fs(action="list", pattern=...)
search          → fs(action="search_text")
ast             → code(action="search_symbol")

# Shell (8 → 1)
zsh/bash        → proc(action="exec")
ps              → proc(action="ps")
npx             → proc(action="exec", command="npx ...")
uvx             → proc(action="exec", command="uvx ...")
open            → net(action="open") or ui(action="focus")
curl            → net(action="request")
wget            → net(action="download")

# Computer (2 → 1)
computer        → ui(action="click|type|...")
screen          → ui(action="session|record|stop")

# Code (2 → 1)
lsp             → code(action="definition|references|...")
refactor        → code(action="rename|format|...")

# Git (implicit)
git commands    → vcs(action="status|diff|commit|...")

# Testing (new)
test runners    → test(action="run|list|status")

Implementation

Phase 1: Foundation

  1. Implement UnifiedToolBase with envelope + schema/help
  2. Implement fs.apply_patch with base_hash preconditions
  3. Implement ws.detect / ws.capabilities

Phase 2: Core Tools

  1. Merge lsp + refactor → code with backend fallback
  2. Merge screen + computer → ui
  3. Wrap test runners into test.run over proc.exec

Phase 3: Polish

  1. Add paging/cursor support to all search/list actions
  2. Implement vcs as thin wrapper over git
  3. Add net for HTTP without shell
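
For the paging item in Phase 3, a client could drain a paged action by following `meta.paging` until `more` is false. How the cursor is passed back to a tool is not specified in this HIP, so the sketch takes a caller-supplied function that accepts the cursor; `iter_pages` is a hypothetical helper name.

```python
from typing import Any, Callable, Iterator, Optional


def iter_pages(call: Callable[[Optional[str]], dict]) -> Iterator[Any]:
    """Yield the data of each page, following meta.paging.cursor until done."""
    cursor = None
    while True:
        envelope = call(cursor)
        if not envelope["ok"]:
            # Surface the typed error code instead of continuing silently.
            raise RuntimeError(envelope["error"]["code"])
        yield envelope["data"]
        paging = envelope["meta"]["paging"]
        if not paging["more"]:
            return
        cursor = paging["cursor"]
```

Because it is a generator, a caller can stop early after the first useful page without fetching the rest.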

Base Class

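from typing import Any, Awaitable, Callable, Optional


class ConflictError(Exception):
    """Raised when a precondition such as base_hash fails."""

    def __init__(self, message: str, details: Optional[dict] = None):
        super().__init__(message)
        self.details = details or {}


# BaseTool is the existing hanzo-mcp tool base class (import not shown here).
class UnifiedToolBase(BaseTool):
    """Base for unified tools with action routing and envelope."""

    name: str

    def __init__(self) -> None:
        # Assumes BaseTool needs no constructor arguments.
        super().__init__()
        # Per-instance registries; annotations alone would leave them unset.
        self._handlers: dict[str, Callable[..., Awaitable[Any]]] = {}
        self._schemas: dict[str, dict] = {}

    def action(self, name: str, schema: Optional[dict] = None):
        """Decorator to register action handler with schema."""
        def decorator(fn):
            self._handlers[name] = fn
            if schema:
                self._schemas[name] = schema
            return fn
        return decorator

    def _get_help(self) -> dict[str, str]:
        """Map each registered action to its handler's docstring."""
        return {n: (fn.__doc__ or "") for n, fn in self._handlers.items()}

    async def call(self, ctx, action: str = "help", **kwargs) -> dict:
        if action == "help":
            return self._envelope({"actions": self._get_help()}, action=action)
        if action == "schema":
            return self._envelope({"schemas": self._schemas}, action=action)
        if action not in self._handlers:
            return self._error("UNKNOWN_ACTION", f"Unknown: {action}", action=action,
                               details={"available": list(self._handlers)})
        try:
            result = await self._handlers[action](ctx, **kwargs)
            return self._envelope(result, action=action)
        except ConflictError as e:
            return self._error("CONFLICT", str(e), action=action, details=e.details)
        except Exception as e:
            # Catch-all so every failure still comes back as an envelope.
            return self._error("ERROR", str(e), action=action)

    def _envelope(self, data, action=None, paging=None):
        return {
            "ok": True,
            "data": data,
            "error": None,
            "meta": {
                "tool": self.name,
                "action": action,
                "paging": paging or {"cursor": None, "more": False}
            }
        }

    def _error(self, code, message, action=None, details=None):
        # Extra context always goes under "details", matching the error envelope.
        return {
            "ok": False,
            "data": None,
            "error": {"code": code, "message": message, "details": details or {}},
            "meta": {"tool": self.name, "action": action}
        }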

Entry Points

[project.entry-points."hanzo.tools"]
ws = "hanzo_tools.ws:TOOLS"
fs = "hanzo_tools.fs:TOOLS"
code = "hanzo_tools.code:TOOLS"
proc = "hanzo_tools.proc:TOOLS"
test = "hanzo_tools.test:TOOLS"
vcs = "hanzo_tools.vcs:TOOLS"
net = "hanzo_tools.net:TOOLS"
ui = "hanzo_tools.ui:TOOLS"

# Optional
llm = "hanzo_tools.llm:TOOLS"
memory = "hanzo_tools.memory:TOOLS"
hanzo = "hanzo_tools.hanzo:TOOLS"

Anti-patterns to Avoid

  1. Mega-tool with 70 actions - Kills orthogonality
  2. Multiple edit primitives (edit/write/patch) - Pick apply_patch
  3. Mixing UI automation with semantic code ops - Keep separate
  4. Non-normalized paths/ranges - Agents hallucinate conversions
  5. Strings for errors - Use typed error codes

Rationale

Why 8 Tools?

Each handles one orthogonal axis:

  • ws: Project context
  • fs: Bytes and paths
  • code: Symbols and semantics
  • proc: Process execution
  • test: Verification
  • vcs: History and diffs
  • net: Network requests
  • ui: Screen interaction

No overlap. Maximum composability.

Why Clean Break?

  • Deprecation paths add complexity
  • Old clients can pin to v0.11.x
  • New clients get clean API immediately
  • Less code to maintain

Why base_hash for Edits?

Prevents race conditions and stale edits:

# Read file, get hash
result = fs(action="read", uri="file:///main.py")
base_hash = result["data"]["hash"]  # named base_hash to avoid shadowing the built-in hash()

# Edit with precondition
fs(action="apply_patch", uri="file:///main.py",
   patch="...", base_hash=base_hash)  # Fails if file changed

Backwards Compatibility

No backward compatibility. This is a breaking change.

Clients using v0.11.x should:

  1. Pin to hanzo-mcp<0.12
  2. Or migrate to unified tools

Security Considerations

  • Same permission model applies
  • Same sandboxing for file/shell operations
  • fs.rm requires explicit confirm=true
  • proc.exec respects existing restrictions

Test Cases

  1. help action - Returns manpage for all tools
  2. schema action - Returns valid JSON Schema
  3. unknown action - Returns structured error with available actions
  4. base_hash conflict - Returns CONFLICT error with details
  5. paging - Large results paginate correctly
  6. composability - Pipe outputs chain without interpretation

Implementation Status

Python Implementation

Repository: hanzo/mcp (Python)

| Tool | Status | Actions |
|------|--------|---------|
| proc | ✅ Full | exec, ps, kill, logs, help |
| fs | ✅ Full | read, write, edit, patch, tree, find, search, info |
| think | ✅ Full | think, critic, review |
| memory | ✅ Full | recall, create, update, delete, manage, facts, summarize |
| browser | ✅ Full | 90+ Playwright actions |
| ui | ✅ Full | click, type, screenshot, focus, record, session |
| mode | ✅ Full | list, activate, show, current |
| plan | ✅ Full | create, update, list, get |

Test Coverage: ~95% across all tools

Rust Implementation

Repository: hanzo/mcp/rust

| Tool | Status | Actions | Notes |
|------|--------|---------|-------|
| proc (ShellTool) | ✅ Full | exec, ps, kill, logs, help | Auto-backgrounding at 45s |
| fs (FsTool) | ✅ Full | read, write, edit, patch, tree, find, search, info | Tree-sitter AST search |
| think (ThinkTool) | ✅ Full | think, critic, review | All review focus areas |
| memory (MemoryTool) | ✅ Full | recall, create, update, delete, manage, facts, summarize | Session/project/global scopes |
| browser (BrowserTool) | ⚠️ Partial | 90+ actions via Playwright | Requires Playwright runtime |
| ui (UiTool) | ✅ Full | macOS native + cross-platform | Quartz backend on macOS |
| mode (ModeTool) | ✅ Full | list, activate, show, current | 10+ developer modes |
| plan (PlanTool) | ✅ Full | create, update, list, get | Task management |

Rust-specific features:

  • Tree-sitter AST search (8 languages: Rust, JS, TS, Python, Go, Java, C, C++)
  • Native macOS UI automation via Quartz
  • Process auto-backgrounding with 45s timeout
  • Unified search with modality detection (Text, AST, Symbol, Vector, Memory, File)

Test Coverage: ~90% parity with Python tests

Test Files (Rust)

rust/tests/
├── test_shell_tools.rs    # proc tool tests
├── test_fs_tools.rs       # fs tool tests
├── test_search_tools.rs   # unified search tests
├── test_think_tools.rs    # think/critic/review tests
├── test_memory_tools.rs   # memory operations tests
└── test_browser_tools.rs  # browser automation tests

Search Modalities (Rust)

pub enum SearchModality {
    Text,    // Ripgrep-based text search
    Ast,     // Tree-sitter AST search
    Symbol,  // Symbol/definition search
    Vector,  // Semantic vector search
    Memory,  // Memory/knowledge search
    File,    // File pattern search
}

Modality auto-detection:

  • Natural language queries → Vector + Text
  • Code patterns (class, fn, def) → AST + Text
  • Single identifiers → Symbol + Text
  • File paths/extensions → File + Text
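
The detection rules above are stated informally. The sketch below shows one possible set of heuristics, written in Python for illustration. The regular expressions and the order in which the rules are checked are assumptions, not the Rust implementation's actual logic.

```python
import re


def detect_modalities(query: str) -> list:
    """Guess which search modalities to combine for a query (heuristic sketch)."""
    q = query.strip()
    # Code patterns: query starts with a declaration keyword.
    if re.match(r"(class|fn|def)\s", q):
        return ["Ast", "Text"]
    # File paths/extensions: no spaces, and a slash or a short trailing extension.
    if " " not in q and ("/" in q or re.search(r"\.\w{1,5}$", q)):
        return ["File", "Text"]
    # Single identifiers: one bare word made of identifier characters.
    if re.fullmatch(r"[A-Za-z_]\w*", q):
        return ["Symbol", "Text"]
    # Everything else is treated as natural language.
    return ["Vector", "Text"]
```

Text appears in every result, so a plain text search always runs alongside whichever specialized modality is chosen.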

Key Rust Structures

// Process execution
pub struct ProcToolArgs {
    pub action: String,
    pub command: Option<Value>,  // String or Array
    pub cwd: Option<String>,
    pub env: Option<HashMap<String, String>>,
    pub timeout: Option<u64>,
    pub shell: Option<String>,
    pub proc_id: Option<String>,
}

// Filesystem operations
pub struct FsToolArgs {
    pub action: String,
    pub path: Option<String>,
    pub content: Option<String>,
    pub old_text: Option<String>,
    pub new_text: Option<String>,
    pub pattern: Option<String>,
    pub depth: Option<usize>,
    pub offset: Option<usize>,
    pub limit: Option<usize>,
}

// Think/reasoning
pub struct ThinkToolArgs {
    pub action: String,
    pub thought: Option<String>,
    pub analysis: Option<String>,
    pub work_description: Option<String>,
    pub focus: Option<String>,
    pub code_snippets: Option<Vec<String>>,
    pub file_paths: Option<Vec<String>>,
    pub context: Option<String>,
}

Building & Testing

# Build Rust MCP
cd hanzo/mcp/rust
cargo build --release

# Run all tests
cargo test

# Run specific test module
cargo test -p hanzo-mcp test_shell_tools

# Run with verbose output
cargo test -- --nocapture

Copyright

This document is licensed under the MIT License.