CLI Reference

Commands

pikoclaw extract

Extract knowledge from one or more email archives.

pikoclaw extract <files...> [options]

Arguments:

| Argument | Description |
| --- | --- |
| files | One or more PST/OST files, Maildir directories, or MBOX files |

Options:

| Flag | Default | Description |
| --- | --- | --- |
| -o, --output DIR | ./pikoclaw-output | Output directory |
| --json | off | Generate extraction.json alongside wiki |
| --csv | off | Generate CSV files (emails.csv, contacts.csv, threads.csv) |
| --wiki / --no-wiki | --wiki | Generate markdown wiki output |
| --extract-attachments | off | Save binary attachment data to wiki/attachments/ |
| --max-messages N | unlimited | Cap messages processed per source file |
| --password PASS | none | Password for encrypted PST/OST files |
| --skip-protected | off | Skip password-protected files instead of failing |
| --incremental | off | Delta extraction: skip messages already extracted |
| --redact [TYPES...] | off | Redact PII (email, phone, ssn, credit_card, ip) |
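
To make the idea behind --redact concrete, here is a minimal sketch of type-selectable redaction. It is not PikoClaw's implementation: the regular expressions, the `[REDACTED:type]` placeholder, and the `redact` function are all assumptions, and only three of the five listed types (email, ssn, ip) are covered to keep the patterns simple.

```python
import re
from typing import Iterable, Optional

# Hypothetical patterns; real PII detection needs far more care than this.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact(text: str, types: Optional[Iterable[str]] = None) -> str:
    """Replace matches of the selected PII types with a placeholder.

    Passing types=None applies every pattern in this sketch (an assumption,
    not documented PikoClaw behavior).
    """
    names = list(PATTERNS) if types is None else list(types)
    for name in names:
        text = PATTERNS[name].sub(f"[REDACTED:{name}]", text)
    return text
```

For example, `redact("Mail bob@example.com", ["email"])` would leave any IP addresses or SSN-shaped strings in the text untouched.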

Usage Examples

Basic Extraction

pikoclaw extract mailbox.pst

Custom Output Directory

pikoclaw extract mailbox.pst -o ~/Documents/email-archive

Multiple Sources, Mixed Formats

pikoclaw extract mailbox.pst archive.mbox /data/maildir --output ./kb

All messages are merged into a single knowledge base. Threading and contact aggregation work across sources.

JSON Only (No Wiki)

pikoclaw extract mailbox.pst --json --no-wiki

Produces extraction.json without generating the markdown wiki.

Save Attachments

pikoclaw extract mailbox.pst --extract-attachments

Binary attachment data is saved to wiki/attachments/msg-NNNNNN/filename. Filenames are sanitized to prevent path traversal.
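
For readers wondering what such sanitization typically involves, the sketch below reduces an untrusted attachment name to a single safe path component. It is an illustration only, not PikoClaw's actual code: `sanitize_filename`, `attachment_path`, the allow-list of characters, and the six-digit zero padding used for msg-NNNNNN are assumptions.

```python
import re
from pathlib import Path, PureWindowsPath


def sanitize_filename(name: str, fallback: str = "attachment") -> str:
    """Reduce an untrusted attachment name to one safe path component."""
    # PureWindowsPath splits on both "/" and "\", so this drops any
    # directory part regardless of which separator the sender used.
    base = PureWindowsPath(name).name
    # Replace everything outside a conservative allow-list.
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base)
    # Strip leading dots so "." and ".." can never survive.
    cleaned = cleaned.lstrip(".")
    return cleaned or fallback


def attachment_path(output_dir: Path, message_index: int, name: str) -> Path:
    """Build <output_dir>/wiki/attachments/msg-NNNNNN/<safe name>."""
    # Six-digit zero padding is an assumption based on the NNNNNN pattern.
    folder = f"msg-{message_index:06d}"
    return output_dir / "wiki" / "attachments" / folder / sanitize_filename(name)
```

Taking only the final path component before filtering characters means a name such as ../../etc/passwd collapses to passwd inside the message's own folder.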

Limit Processing

pikoclaw extract huge-archive.pst --max-messages 5000

Processes at most 5,000 messages from each source file. Useful for previewing large archives.

Output

PikoClaw prints a summary after extraction:

============================================================
Processing: mailbox.pst [pst]
============================================================
  12847 messages, 234 calendar events

--- Results ---
  Emails:     12847
  Calendar:   234
  Contacts:   342
  Threads:    4291
  Multi-msg:  1847

Generating wiki in ./pikoclaw-output/wiki/
  Created: index.md
  Created: contacts.md
  Created: contacts.json
  Created: threads.md
  Created: calendar.md
  Created: emails/all.md
  Created: emails/inbox.md
  Created: emails/sent.md
  Created: provenance.json
  Created: network-analysis.md

Done. Output in ./pikoclaw-output/
Source hash: a1b2c3d4e5f6a7b8...

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Success |
| 1 | Error (file not found, unsupported format, extraction failure) |
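
When calling PikoClaw from a script, the exit code is the simplest success signal. The helper below is a minimal sketch that uses only the Python standard library; `run_and_check` is a hypothetical name, and the sketch assumes nothing about what PikoClaw writes to stderr.

```python
import subprocess
import sys


def run_and_check(argv: list[str]) -> bool:
    """Run a command and return True only if it exited with code 0."""
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        # Any non-zero code is treated as failure; the table above lists 1.
        print(f"command failed with exit code {result.returncode}", file=sys.stderr)
        print(result.stderr, file=sys.stderr, end="")
        return False
    return True


# Possible usage, assuming pikoclaw is on PATH:
#   ok = run_and_check(["pikoclaw", "extract", "mailbox.pst"])
```

Note that `subprocess.run` raises FileNotFoundError if the executable itself cannot be found, which is separate from PikoClaw reporting a missing input file through exit code 1.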

Format Auto-Detection

PikoClaw detects the format automatically:

  1. PST/OST -- .pst or .ost extension (requires libpff-python)
  2. Maildir -- directory containing cur/new/tmp subdirectories, or Enron-style directory tree with message files
  3. MBOX -- .mbox or .mbx extension

If no adapter matches, PikoClaw exits with an error listing supported formats.
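
The detection order above can be sketched as a small function. This is an approximation rather than PikoClaw's adapter code: `detect_format` and the "maildir" and "mbox" labels are assumptions (only "pst" appears in the sample output), and the Enron-style directory tree check is deliberately left out.

```python
from pathlib import Path
from typing import Optional

MAILDIR_SUBDIRS = ("cur", "new", "tmp")


def detect_format(path: Path) -> Optional[str]:
    """Return a format label for path, or None if nothing matches."""
    suffix = path.suffix.lower()
    # 1. PST/OST: decided by file extension alone.
    if suffix in (".pst", ".ost"):
        return "pst"
    # 2. Maildir: a directory holding cur, new and tmp subdirectories.
    #    (Enron-style trees are not handled in this sketch.)
    if path.is_dir() and all((path / sub).is_dir() for sub in MAILDIR_SUBDIRS):
        return "maildir"
    # 3. MBOX: decided by file extension alone.
    if suffix in (".mbox", ".mbx"):
        return "mbox"
    return None
```

A caller would treat a None result as the "no adapter matches" case and report the supported formats.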


pikoclaw search

Search extracted email archives using the TF-IDF index.

pikoclaw search <source> <query> [options]

Options:

| Flag | Default | Description |
| --- | --- | --- |
| -n, --top-n N | 10 | Number of results to return |
| --sender EMAIL | none | Filter by sender email address |
| --after DATE | none | Only show results after this date (ISO 8601) |
| --before DATE | none | Only show results before this date (ISO 8601) |

Examples:

pikoclaw search ./pikoclaw-output "wastewater compliance 2017"
pikoclaw search ./pikoclaw-output "budget review" --sender alice@acme.com
pikoclaw search ./pikoclaw-output "project update" --after 2024-01-01 --before 2024-06-30
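
This reference says only that search relies on a TF-IDF index. As background, the sketch below shows one common way to rank documents by TF-IDF. The tokenizer, the smoothed IDF formula, and the `tfidf_search` function are assumptions and may differ from what PikoClaw does internally.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_search(docs: dict[str, str], query: str, top_n: int = 10) -> list[tuple[str, float]]:
    """Rank documents by the summed TF-IDF weight of the query terms."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for toks in tokens.values():
        df.update(set(toks))
    query_terms = set(tokenize(query))
    scores = {}
    for doc_id, toks in tokens.items():
        if not toks:
            continue
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term in tf:
                # Smoothed IDF keeps the weight positive for common terms.
                idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
                score += (tf[term] / len(toks)) * idf
        if score > 0:
            scores[doc_id] = score
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]
```

The `top_n` parameter mirrors the role of --top-n: documents that share no term with the query are dropped, and the rest are truncated to the highest-scoring N.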

pikoclaw validate

Validate an extraction.json file against the PikoClaw schema.

pikoclaw validate <source>

Prints detailed validation results, including error counts. On success, it also prints a summary of the extraction.


pikoclaw stats

Show quick overview statistics for an extraction.

pikoclaw stats <source>

Displays extraction metadata, email/contact/thread counts, top contacts, date range, and folder distribution.


pikoclaw viz

Generate an interactive D3 force-directed graph visualization.

pikoclaw viz <source> [-o output.html]

Options:

| Flag | Default | Description |
| --- | --- | --- |
| -o, --output FILE | graph.html | Output HTML file |

The generated file opens in any browser. Node size reflects message count, and node color reflects the Louvain community. The graph supports drag, zoom, and hover tooltips.


pikoclaw serve

Start an HTTP API server for querying extracted archives.

Local vs. Deployed API

The pikoclaw serve command starts a local API server for development and direct agent integration. This is distinct from the production Cloudflare Worker API (api.pappas.work), which has a different set of available endpoints and does not include a /search endpoint. The local API is intended for direct access to your extracted data within your environment.

pikoclaw serve <source> [--port 8080] [--host 127.0.0.1]

Endpoints:

| Endpoint | Description |
| --- | --- |
| GET /api/status | Server status and dataset summary |
| GET /api/stats | Extraction statistics |
| GET /api/emails?limit=N&offset=N&sender=EMAIL | Paginated email listing |
| GET /api/contacts | Contact directory |
| GET /api/threads?limit=N&offset=N | Conversation threads |
| GET /api/provenance | Extraction provenance metadata |
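
A client for the local server could build its request URLs as in the sketch below. The endpoint path and the limit, offset and sender parameters come from the list above; `emails_url` and `fetch_json` are hypothetical helpers, and the assumption that responses are JSON is not confirmed by this reference.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def emails_url(base: str, limit: int, offset: int = 0, sender: Optional[str] = None) -> str:
    """Build a URL for one page of the /api/emails listing."""
    params: dict[str, object] = {"limit": limit, "offset": offset}
    if sender is not None:
        params["sender"] = sender
    # urlencode percent-escapes values such as the "@" in an address.
    return f"{base.rstrip('/')}/api/emails?{urlencode(params)}"


def fetch_json(url: str):
    """GET a URL and decode the body as JSON (assumed response format)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


# Possible usage against a server started with `pikoclaw serve`:
#   page = fetch_json(emails_url("http://127.0.0.1:8080", limit=50))
```

Paging through a large archive then amounts to repeating the call with offset increased by limit each time.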

pikoclaw schema

Print the JSON schema for PikoClaw extraction output.

pikoclaw schema

Outputs a JSON Schema (draft-07) document describing the structure of extraction.json.