Emlx is a lightweight parser for .emlx files as used by Mail.app.
Install and update using pip:
pip install emlx
>>> import emlx
>>> m = emlx.read("12345.emlx")
>>> m.headers
{'Subject': 'Re: Emlx library ✉️',
'From': 'Michael <[email protected]>',
'Date': 'Thu, 30 Jan 2020 20:25:43 +0100',
'Content-Type': 'text/plain; charset=utf-8',
...}
>>> m.text
"you're welcome :) ..."
>>> m.html is None
True
>>> m.plist
{'color': '000000',
'conversation-id': 12345,
'date-last-viewed': 1580423184,
'flags': {...},
...}
>>> m.flags
{'read': True, 'answered': True, 'attachment_count': 2}
Make sure the terminal or IDE you are using has access to the Mail folders. For example, if you are using PyCharm, you need to grant it "Full Disk Access" by going to System Settings > Privacy & Security and turning it on for PyCharm. This resolves errors such as "Operation not permitted".
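A quick way to tell whether the permission is in place is to try listing the folder before parsing anything. The sketch below uses only the standard library; the helper name can_read_folder is made up for illustration, and ~/Library/Mail is assumed to be the Mail.app data location on your system.

```python
from pathlib import Path


def can_read_folder(folder: Path) -> bool:
    """Return True if this process can list the contents of `folder`."""
    try:
        # Asking for the first entry forces the OS-level access check.
        next(folder.iterdir(), None)
        return True
    except PermissionError:
        # macOS reports "Operation not permitted" here when
        # Full Disk Access has not been granted.
        return False
    except FileNotFoundError:
        # The folder does not exist at all (for example, not on macOS).
        return False


if __name__ == "__main__":
    # Assumed location of Mail.app data; adjust if yours differs.
    mail_dir = Path.home() / "Library" / "Mail"
    print(mail_dir, "readable:", can_read_folder(mail_dir))
```

If this prints False on a Mac where the folder exists, the terminal or IDE most likely still lacks Full Disk Access.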
An .emlx file consists of three parts:
1. bytecount on the first line;
2. email content in MIME format (headers, body, attachments);
3. Apple property list (plist) with metadata.
The second part (2.) is parsed by the email library, which is included in the Python standard library. Message objects generated by emlx extend email.message.Message and thus give access to its handy features. Additionally, emlx message objects provide the attributes bytecount (1.) as an integer and plist (3.) as a Python dictionary. For convenience, they also offer the attributes headers, text, html, url, id, and flags.
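To make the three-part layout concrete, here is a minimal sketch that splits an .emlx byte string by hand using only the standard library. It is not how emlx itself is implemented; the function name split_emlx and the synthetic sample message are assumptions made for illustration.

```python
import email
import plistlib


def split_emlx(raw: bytes):
    """Split raw .emlx bytes into (bytecount, message, plist)."""
    # Part 1: the first line holds the size of the MIME part in bytes.
    first_line, _, rest = raw.partition(b"\n")
    bytecount = int(first_line.decode("ascii").strip())
    # Part 2: exactly `bytecount` bytes of MIME content follow.
    message = email.message_from_bytes(rest[:bytecount])
    # Part 3: whatever remains is the property list with metadata.
    plist = plistlib.loads(rest[bytecount:].strip())
    return bytecount, message, plist


# Build a tiny synthetic .emlx in memory so the sketch runs anywhere.
mime = b"Subject: Hello\r\nFrom: a@example.com\r\n\r\nHi there\r\n"
meta = plistlib.dumps({"color": "000000", "conversation-id": 12345})
raw = str(len(mime)).encode("ascii") + b"\n" + mime + meta

bytecount, message, plist = split_emlx(raw)
print(bytecount, message["Subject"], plist["conversation-id"])
```

The message returned here is a plain email.message.Message, so methods such as get_content_type() and walk() work on it in the same way they do on the objects emlx produces.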
The emlx file format was introduced by Apple in 2005. It is similar to the eml files popular with other email clients; the difference is the added bytecount (at the start) and plist (at the end). For more, see here.