CLI Reference
Commands
pikoclaw extract
Extract knowledge from one or more email archives.
Arguments:
| Argument | Description |
|---|---|
| `files` | One or more PST/OST files, Maildir directories, or MBOX files |
Options:
| Flag | Default | Description |
|---|---|---|
| `-o, --output DIR` | `./pikoclaw-output` | Output directory |
| `--json` | off | Generate `extraction.json` alongside wiki |
| `--csv` | off | Generate CSV files (`emails.csv`, `contacts.csv`, `threads.csv`) |
| `--wiki / --no-wiki` | `--wiki` | Generate markdown wiki output |
| `--extract-attachments` | off | Save binary attachment data to `wiki/attachments/` |
| `--max-messages N` | unlimited | Cap messages processed per source file |
| `--password PASS` | none | Password for encrypted PST/OST files |
| `--skip-protected` | off | Skip password-protected files instead of failing |
| `--incremental` | off | Delta extraction: skip messages already extracted |
| `--redact [TYPES...]` | off | Redact PII (`email`, `phone`, `ssn`, `credit_card`, `ip`) |
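To give a feel for what type-selective redaction involves, here is a minimal sketch of pattern-based PII replacement. The regular expressions, the `redact` function, and the `[REDACTED:type]` placeholder are all assumptions made for illustration. PikoClaw's actual patterns and output format are not described in this reference, and only three of the five listed types are covered.

```python
import re

# Hypothetical patterns keyed by type names listed for --redact.
# Deliberately simple; these are NOT PikoClaw's actual rules.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact(text, types=None):
    """Replace matches of the selected PII types with a placeholder.

    With types=None every known pattern is applied. Treating a bare
    --redact that way is an assumption about the CLI, not a documented fact.
    """
    selected = PATTERNS if types is None else {t: PATTERNS[t] for t in types}
    for name, pattern in selected.items():
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text
```

Calling `redact(body, types=["email"])` would leave IP addresses and SSNs in `body` untouched, which is the kind of narrowing the optional `TYPES` list suggests.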
Usage Examples
Basic Extraction
Custom Output Directory
Multiple Sources, Mixed Formats
All messages are merged into a single knowledge base. Threading and contact aggregation work across sources.
JSON Only (No Wiki)
Produces extraction.json without generating the markdown wiki.
Save Attachments
Binary attachment data is saved to wiki/attachments/msg-NNNNNN/filename. Filenames are sanitized to prevent path traversal.
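As an illustration of the kind of sanitization that prevents path traversal, the sketch below reduces an untrusted attachment name to a single safe path component. The function name, the allow-list of characters, and the `attachment` fallback are assumptions. The exact rules PikoClaw applies are not documented here.

```python
import re
from pathlib import PureWindowsPath


def sanitize_filename(name, fallback="attachment"):
    """Reduce an untrusted attachment name to one safe path component.

    Hypothetical helper: an illustration, not PikoClaw's implementation.
    """
    # PureWindowsPath treats both "/" and "\" as separators, so .name
    # drops any directory part regardless of the sender's platform.
    base = PureWindowsPath(name).name
    # Replace every character outside a conservative allow-list.
    base = re.sub(r"[^A-Za-z0-9._-]", "_", base)
    # Strip leading dots so ".." and hidden-file names cannot survive.
    base = base.lstrip(".")
    return base or fallback
```

With this approach a name such as `../../etc/passwd` collapses to `passwd`, so the file can only land inside its own `msg-NNNNNN/` directory.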
Limit Processing
Processes at most 5,000 messages from each source file. Useful for previewing large archives.
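For scripted use, the invocations described in this section can be assembled as argument lists. The sketch below uses only flags from the options table. The file names are hypothetical, and two things are assumptions: that options may follow the file arguments, and that the JSON-only case corresponds to combining `--json` with `--no-wiki`.

```python
def extract_command(files, output=None, json_output=False, wiki=True,
                    extract_attachments=False, max_messages=None):
    """Build the argument list for `pikoclaw extract`.

    Flags left at their default value are omitted from the result.
    """
    argv = ["pikoclaw", "extract", *files]
    if output is not None:
        argv += ["--output", str(output)]
    if json_output:
        argv.append("--json")
    if not wiki:
        argv.append("--no-wiki")
    if extract_attachments:
        argv.append("--extract-attachments")
    if max_messages is not None:
        argv += ["--max-messages", str(max_messages)]
    return argv


# Hypothetical file names, one call per example in this section.
basic = extract_command(["mailbox.pst"])
custom_dir = extract_command(["mailbox.pst"], output="./out")
mixed = extract_command(["mailbox.pst", "archive.mbox", "maildir/"])
json_only = extract_command(["mailbox.pst"], json_output=True, wiki=False)
attachments = extract_command(["mailbox.pst"], extract_attachments=True)
preview = extract_command(["mailbox.pst"], max_messages=5000)
```

Each list can be passed to `subprocess.run(argv, check=True)` once `pikoclaw` is on the `PATH`.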
Output
PikoClaw prints a summary after extraction:
============================================================
Processing: mailbox.pst [pst]
============================================================
12847 messages, 234 calendar events
--- Results ---
Emails: 12847
Calendar: 234
Contacts: 342
Threads: 4291
Multi-msg: 1847
Generating wiki in ./pikoclaw-output/wiki/
Created: index.md
Created: contacts.md
Created: contacts.json
Created: threads.md
Created: calendar.md
Created: emails/all.md
Created: emails/inbox.md
Created: emails/sent.md
Created: provenance.json
Created: network-analysis.md
Done. Output in ./pikoclaw-output/
Source hash: a1b2c3d4e5f6a7b8...
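If a script needs the counts from this summary, the `Label: count` lines after `--- Results ---` can be picked out with a small parser. This is a sketch that assumes the layout matches the sample above. The summary format may differ between versions, so the `--json` output is probably a sturdier source for anything beyond a quick check.

```python
import re

# Matches lines such as "Emails: 12847" or "Multi-msg: 1847".
SUMMARY_LINE = re.compile(r"^\s*([A-Za-z][A-Za-z-]*):\s+(\d+)\s*$")


def parse_results(output):
    """Collect the `Label: count` lines that follow `--- Results ---`."""
    counts = {}
    in_results = False
    for line in output.splitlines():
        if line.strip() == "--- Results ---":
            in_results = True
            continue
        if in_results:
            match = SUMMARY_LINE.match(line)
            if not match:
                break  # the first non-matching line ends the results block
            counts[match.group(1)] = int(match.group(2))
    return counts
```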
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Error (file not found, unsupported format, extraction failure) |
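Because every failure maps to the same exit code, a calling script has to rely on the error text to tell the causes apart. The wrapper below is a minimal sketch of that pattern. `run_cli` is a hypothetical helper, and it assumes error messages are written to stderr, which this reference does not state.

```python
import subprocess


def run_cli(argv):
    """Run a command and return its stdout; raise if the exit code is non-zero."""
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        # Exit code 1 covers file not found, unsupported format, and
        # extraction failure alike, so surface stderr for the details.
        raise RuntimeError(
            f"{argv[0]} exited with code {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    return result.stdout
```

A call such as `run_cli(["pikoclaw", "extract", "mailbox.pst"])` (hypothetical file name) would then either return the printed summary or raise with the reported error.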
Format Auto-Detection
PikoClaw detects the format automatically:
- PST/OST -- `.pst` or `.ost` extension (requires `libpff-python`)
- Maildir -- directory containing `cur/new/tmp` subdirectories, or Enron-style directory tree with message files
- MBOX -- `.mbox` or `.mbx` extension
If no adapter matches, PikoClaw exits with an error listing supported formats.
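The detection rules above can be sketched roughly as follows. This is an illustration, not PikoClaw's adapter code: the function name, the order of the checks, and the case-insensitive extension match are assumptions, and the Enron-style directory tree case is left out because its criteria are not spelled out here.

```python
from pathlib import Path

MAILDIR_SUBDIRS = ("cur", "new", "tmp")


def detect_format(path):
    """Return "pst", "maildir", or "mbox"; raise ValueError if nothing matches."""
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix in (".pst", ".ost"):
        return "pst"
    if suffix in (".mbox", ".mbx"):
        return "mbox"
    # A Maildir is recognized by its three standard subdirectories.
    if p.is_dir() and all((p / sub).is_dir() for sub in MAILDIR_SUBDIRS):
        return "maildir"
    raise ValueError(
        f"unsupported format: {path} (supported: PST/OST, Maildir, MBOX)"
    )
```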
pikoclaw search
Search extracted email archives using the TF-IDF index.
| Flag | Default | Description |
|---|---|---|
| `-n, --top-n N` | 10 | Number of results to return |
| `--sender EMAIL` | none | Filter by sender email address |
| `--after DATE` | none | Only show results after this date (ISO 8601) |
| `--before DATE` | none | Only show results before this date (ISO 8601) |
Examples:
pikoclaw search ./pikoclaw-output "wastewater compliance 2017"
pikoclaw search ./pikoclaw-output "budget review" --sender alice@acme.com
pikoclaw search ./pikoclaw-output "project update" --after 2024-01-01 --before 2024-06-30
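For readers unfamiliar with TF-IDF ranking, the sketch below scores a handful of in-memory documents against a query. It is a generic textbook variant (term frequency times a smoothed inverse document frequency), not PikoClaw's index. The tokenizer, the weighting formula, and the function names are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(docs, query, top_n=10):
    """Return indices of the top_n documents by summed TF-IDF of query terms."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for idx, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if tf[term]:
                idf = math.log(n_docs / df[term]) + 1.0  # smoothed so idf > 0
                score += (tf[term] / len(tokens)) * idf
        if score > 0:
            scores.append((score, idx))
    # Highest score first; ties fall back to document order.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scores[:top_n]]
```

Terms that appear in few documents receive a larger weight, which is why a query like "budget review" favors a message containing both words over one that only mentions "budget".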
pikoclaw validate
Validate an extraction.json file against the PikoClaw schema.
Prints detailed validation results with error counts and, on success, a summary of the extraction.
pikoclaw stats
Show quick overview statistics for an extraction.
Displays extraction metadata, email/contact/thread counts, top contacts, date range, and folder distribution.
pikoclaw viz
Generate an interactive D3 force-directed graph visualization.
| Flag | Default | Description |
|---|---|---|
| `-o, --output FILE` | `graph.html` | Output HTML file |
The generated HTML file opens in any browser. Node size reflects message count, and node color reflects the Louvain community. The graph supports drag, zoom, and hover tooltips.
pikoclaw serve
Start an HTTP API server for querying extracted archives.
Local vs. Deployed API
The pikoclaw serve command starts a local API server for development and direct agent integration.
This is distinct from the production Cloudflare Worker API (api.pappas.work), which has a different set of available endpoints and does not include a /search endpoint.
The local API is intended for direct access to your extracted data within your environment.
| Endpoint | Description |
|---|---|
| `GET /api/status` | Server status and dataset summary |
| `GET /api/stats` | Extraction statistics |
| `GET /api/emails?limit=N&offset=N&sender=EMAIL` | Paginated email listing |
| `GET /api/contacts` | Contact directory |
| `GET /api/threads?limit=N&offset=N` | Conversation threads |
| `GET /api/provenance` | Extraction provenance metadata |
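A client for the local API can be as small as a URL builder plus a JSON fetch. The sketch below uses only the standard library. The base URL is a placeholder, since the address and port that `pikoclaw serve` listens on are not given in this reference, and it assumes the endpoints return JSON bodies.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: placeholder address; adjust to wherever the server listens.
BASE_URL = "http://localhost:8000"


def emails_url(limit=None, offset=None, sender=None, base_url=BASE_URL):
    """Build a GET /api/emails URL, omitting parameters that are not set."""
    params = {"limit": limit, "offset": offset, "sender": sender}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base_url}/api/emails" + (f"?{query}" if query else "")


def get_json(url):
    """Fetch a URL and decode the body, assuming the server returns JSON."""
    with urlopen(url) as response:
        return json.load(response)
```

Paging through a sender's mail would then amount to calling `get_json(emails_url(limit=50, offset=n, sender="alice@acme.com"))` with `n` advancing by 50 on each request.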
pikoclaw schema
Print the JSON schema for PikoClaw extraction output.
Outputs a JSON Schema (draft-07) document describing the structure of extraction.json.