imessage-history
Open-source Python CLI · stdlib only
macOS keeps every iMessage you have ever sent in a local SQLite database (chat.db), but the raw schema is awkward: outgoing rows point at the recipient instead of the sender, long messages are stuffed into an Apple typedstream blob that truncates at 255 characters if you parse the length prefix wrong, and tapbacks live as cross-referenced rows. imessage-history smooths all of that out and emits one window of one conversation as a clean, speaker-attributed dataset ready for analysis or LLM prompting.
It is Python 3.10+ with zero required runtime dependencies. It opens chat.db read-only with both URI flags and PRAGMA query_only, backed by a regression test that proves every write statement raises and the file is byte-for-byte unchanged after close. It also ships an opt-in pseudonymizer for handles, names, phone numbers, emails, and URLs, for the moments when the only reasonable way to think about a long thread is to drop it into a hosted model.
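A minimal sketch of that read-only open, using only the stdlib sqlite3 module. The function name open_read_only is hypothetical and the real tool may structure this differently; only the URI flags and the PRAGMA come from the description above.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path):
    """Open a SQLite file so that no statement can modify it.

    Hypothetical helper for illustration, not the tool's actual code.
    """
    # mode=ro refuses writes at the VFS level; immutable=1 tells SQLite the
    # file will not change, so it takes no locks and creates no side files.
    uri = Path(db_path).expanduser().resolve().as_uri() + "?mode=ro&immutable=1"
    conn = sqlite3.connect(uri, uri=True)

    # Second layer: reject write statements on this connection.
    conn.execute("PRAGMA query_only = ON")

    # Verify the PRAGMA actually took instead of trusting it silently.
    (flag,) = conn.execute("PRAGMA query_only").fetchone()
    if flag != 1:
        conn.close()
        raise RuntimeError("PRAGMA query_only did not take effect")
    return conn
```

With a connection opened this way, an INSERT, UPDATE, or DELETE should raise sqlite3.OperationalError rather than touch the file, which is the behavior the regression test is described as checking.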
An optional Textual TUI ships under the [tui] extra for a guided picker-and-window flow; the headless CLI still works exactly the same. Free and open source.
A read-only, stdlib-only Python exporter that pulls one iMessage conversation off macOS chat.db and turns it into AI-ready CSV / JSON / Markdown / TXT. Optional pseudonymization for hosted-LLM use. Free and open source.
What it is, in one paragraph
imessage-history is a single-purpose CLI for a single use case: take one conversation out of your local Messages database, in one time window, and produce clean files you can analyze, archive, or feed to an LLM. macOS keeps every iMessage in a local SQLite database (~/Library/Messages/chat.db), but the raw schema is awkward and easy to misread. This tool smooths it out and emits a speaker-attributed dataset with a clear audit trail.
Why it exists
The chat.db schema has subtle traps: outgoing rows point at the recipient (not the sender); long plain-text messages live inside an Apple NSAttributedString typedstream blob whose length prefix truncates at 255 chars if you parse it wrong; tapbacks live as cross-referenced rows with prefixed GUIDs; edits, retractions, and app-payload rows can have NULL text AND NULL attributedBody.
What it does, concretely
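To make the time-window handling in this section concrete, here is a minimal sketch of converting a local-time bound to Apple's 2001-epoch nanoseconds and back. The helper names (to_apple_ns, from_apple_ns, day_window) and the half-open one-day window are assumptions for illustration, not the tool's actual code.

```python
from datetime import datetime, timedelta, timezone

# Apple's reference date: 2001-01-01 00:00:00 UTC.
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def to_apple_ns(dt: datetime) -> int:
    """Convert a datetime to nanoseconds since the Apple epoch."""
    if dt.tzinfo is None:
        # Naive datetimes are interpreted in the system's local timezone.
        dt = dt.astimezone()
    delta = dt - APPLE_EPOCH
    # Integer arithmetic avoids float rounding at nanosecond scale.
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


def from_apple_ns(ns: int) -> datetime:
    """Convert Apple-epoch nanoseconds back to an aware UTC datetime."""
    return APPLE_EPOCH + timedelta(microseconds=ns // 1_000)


def day_window(date_str: str) -> tuple[int, int]:
    """Return (start_ns, end_ns) for one local calendar day.

    Assumption: the window is half-open, [start, end).
    """
    start = datetime.strptime(date_str, "%Y-%m-%d")
    end = start + timedelta(days=1)
    return to_apple_ns(start), to_apple_ns(end)
```

Because both bounds go through the local timezone, a "day" can span 23 or 25 hours across a DST change, which is one reason to record the resolved window in both local time and UTC.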
- imessage_export.py plus a small package under imessage_export/. Python 3.10+. Zero required runtime dependencies.
- Opens chat.db with mode=ro&immutable=1, sets PRAGMA query_only=ON, asserts the read-only PRAGMA actually took, and ships a regression test proving every write statement (DELETE / UPDATE / INSERT / CREATE / DROP / ALTER / REPLACE) raises and the file is byte-for-byte unchanged after close.
- author_label is the source of truth: outgoing rows are relabeled with --me-name; incoming rows resolve via your contacts.csv (phone + email normalization).
- --date / --start-time / --end-time / --start-datetime / --end-datetime all interpret bounds in the system's local timezone, convert to Apple's 2001-epoch nanoseconds, and write the resolved window (local + UTC + Apple-ns + detected unit) into the JSON metadata block and the AI-ready header.
- Writes conversation.csv, conversation.json, conversation.txt, conversation.md, and conversation_ai_ready.txt (header + speaker-attributed body + attribution footer), plus an analysis_prompt.txt template.
- --redact / --redact-only swap handles, names, phone numbers, emails, and URLs for Person A / Person B / [PHONE] / [EMAIL] / [URL]. A pseudonym_map.json is written alongside for de-redaction. --suggest-names scans for likely third-party names not in your contacts.
- pip install 'imessage-history[tui]' gives you a guided picker-and-window flow inside the terminal. The headless CLI (imessage-export --chat-id N --date YYYY-MM-DD) still works exactly the same.
- Sets umask(0o077) so new files are 600 and new dirs 700. exports/, contacts.csv, and pseudonym_map.json are gitignored.
Tech and architecture
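The length-prefix handling that the decoder has to get right can be sketched roughly as below. It relies on the one fact stated in this write-up: a 0x81 tag means the next two bytes are a little-endian length. Treating any byte below 0x80 as the length itself, rejecting every other tag, and the names read_length and read_text are assumptions for illustration; this is not the actual decode_attributed_body.

```python
import struct


def read_length(buf: bytes, offset: int) -> tuple[int, int]:
    """Return (length, next_offset) for a length prefix at `offset`."""
    first = buf[offset]
    if first == 0x81:
        # Tag byte followed by a two-byte little-endian length.
        (length,) = struct.unpack_from("<H", buf, offset + 1)
        return length, offset + 3
    if first < 0x80:
        # Assumption: small values are stored directly in one byte.
        return first, offset + 1
    # Other tags are out of scope for this sketch.
    raise ValueError(f"unhandled length tag 0x{first:02x}")


def read_text(buf: bytes, offset: int) -> tuple[str, int]:
    """Read one length-prefixed UTF-8 string; return (text, next_offset)."""
    length, start = read_length(buf, offset)
    end = start + length
    if end > len(buf):
        raise ValueError("length prefix runs past end of buffer")
    return buf[start:end].decode("utf-8", errors="replace"), end
```

A parser that reads only a single length byte can never describe more than 255 bytes, so a regression test with a message longer than that is a cheap way to catch the truncation.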
- The [tui] extra adds Rich + Questionary + Textual.
- Core package under imessage_export/ (timestamps, decoder, models, db, contacts, window, export, writers, redactor, cli). TUI under imessage_export/tui/ (Phase 1 linear questionary wizard + Phase 2 Textual app). Core modules never import TUI at module top-level.
- decode_attributed_body walks Apple's typedstream, correctly parses the 0x81 two-byte little-endian length prefix (the truncate-at-255 gotcha), and strips the U+FFFC attachment placeholder.
- write_csv, write_json, write_txt, write_markdown, write_ai_ready each take (path, messages, metadata), share no state, and are unaware of redaction; the Redactor runs as an optional second pass before the writers see the data.
- tui/theme.py semantic roles.
- Tests use unittest and run with no external deps. They include the read-only regression test, decoder regression tests against the typedstream edge cases, and timestamp math against known macOS DB samples.
Use cases
- conversation.md is structured for it.
- Run --redact-only first so names, phone numbers, emails, and URLs are swapped for Person A / [PHONE] before anything leaves the machine. The pseudonym_map.json stays local for de-redaction.
What I'd talk about in an interview
- Every trap (handle_id is the OTHER party on outgoing rows, attributedBody length prefix is 2 bytes, tapback associated_message_guid has a p:N/ or bp: prefix, NULL text doesn't mean empty) has a comment in the code and a test case. Future-me can't accidentally re-introduce them.
- umask(0o077), gitignored outputs, no network calls, redactor as an opt-in pre-writer pass, pseudonym map treated like a password. The README leads with privacy and .gitignore already excludes everything sensitive.
- pip install imessage-history to keep working on machines where adding deps is friction.
Repo, install, links
- pip install imessage-history
- pip install 'imessage-history[tui]'
- chat.db, zero network calls, optional pseudonymization