Email Messages, MIME, and Message 2.0

The following is just a normal, average email:

Basic email message with IMF headers encoded in yellow, and the MIME tree encoded in blue

Note that I’ve highlighted what libetpan calls the “IMF headers” (basically what’s referred to by RFC822 and its descendants) in yellow; everything in blue is the MIME tree (including the root node headers).

This is a simple message: one top-level text/plain node in the MIME tree, and nothing else. In principle, though, this is how all messages start out, no matter how complex the tree.

This is how pretty much all of the emails we get are structured, regardless of what we’ve done to them: a bunch of IMF headers, which are usually what’s necessary for transport or what’s added along the way in transport (To, From, Received, X-Shark-Bait and other optional X headers, etc.). Some of these get filtered out by various products (Symantec firewalls, for one), but they are not part of the MIME tree, which is specifically related to content.

The MIME tree consists of a root node which gives, among other things, the content type of the entire message (here, we have a plaintext email with no attachments), which also indicates something about the kind of MIME tree to build. More about that later. Anyway, in blue we have the root node MIME header:

MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit

The rest of the text in blue is the content of that text/plain node - in this case, the full content of the email message. The fact that another IMF header - X-detected-operating-system - comes before the content isn’t a problem and doesn’t make it part of the MIME tree. Message format rules allow for that kind of annoying modification along the way from message creation to delivery.
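To make the yellow/blue split concrete, here is a minimal sketch using Python’s standard email package. The sample message, the addresses, and the rule “MIME-Version plus anything starting with Content- belongs to the MIME root node” are assumptions made for illustration; this is not how libetpan or the engine classify headers internally.

```python
from email import message_from_string

# A made-up message; note the stray IMF header sitting between MIME headers.
RAW = """\
From: alice@example.org
To: bob@example.org
Subject: Hello
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
X-detected-operating-system: hypothetical value added in transit
Content-Transfer-Encoding: 7bit

Hi Bob.
"""


def is_mime_header(name: str) -> bool:
    """Assumed rule: MIME-Version and Content-* describe the MIME root node."""
    lowered = name.lower()
    return lowered == "mime-version" or lowered.startswith("content-")


def split_headers(raw: str):
    """Return (IMF headers, MIME root headers, body) for a single-part message."""
    msg = message_from_string(raw)
    imf = [(k, v) for k, v in msg.items() if not is_mime_header(k)]
    mime = [(k, v) for k, v in msg.items() if is_mime_header(k)]
    return imf, mime, msg.get_payload()


imf_headers, mime_headers, body = split_headers(RAW)
print("IMF (yellow):", [k for k, _ in imf_headers])
print("MIME (blue): ", [k for k, _ in mime_headers])
```

The X-detected-operating-system header ends up on the IMF side even though it sits in the middle of the MIME headers, which is the point made above.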

In any event, this is the basic picture you need to keep in mind for the rest of the description.

Headers, MIME, and Encrypted Payloads

Let’s forget about Message 2.0 for now, and just look at what happens when we want to encrypt a message. I’ll talk about what happens in the Engine in the next section, but here, let’s stick to the emails.

Preencryption

Let’s say we want to encrypt this email with Enigmail, or any other client. (Ignore the subject for now. We take care of it so you don’t have to.)

We start with the unencrypted message:

Email message, preencryption, with IMF headers encoded in yellow, and the MIME tree encoded in blue

N.B. The message was produced by Thunderbird for a test, but the encryption we discuss below was done by the engine - ignore the User-Agent information here, it’s not relevant to the discussion.

Again, the non-MIME headers are in yellow, and the full MIME tree is in blue. The MIME headers are nearly the same as above:

MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8 
Content-Transfer-Encoding: 7bit

Now, here’s what’s important to understand about how this message changes from its plaintext form during encryption and subsequent encoding: when we encrypt this message, what we’re doing is taking this blue part, encrypting it as a whole (with MIME headers and all), and using that as the payload in a new MIME tree which, in the message, will replace the old one.

The encrypted message

To start the encryption process, the first thing we do is render the full plaintext message as a data blob for encryption. In other words, we turn a message from whatever internal form is inside the engine (in our case, a message struct) into one string representing the MIME tree - and only the MIME tree - you see above. The IMF headers are not normally included here because we are only encrypting the message content. In the simplest case, we replace the above plaintext MIME tree with a new MIME tree that carries the original tree as ciphertext. On decryption of this payload at the receiver end, the plaintext MIME tree we started with will replace the encrypted MIME tree that was sent, and the message can be read the same way it was written.

So above, we basically take a string that looks like this (MIME headers, newlines, and all):

MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

So this is a very exciting MIME mail. Yay!

Hi Bob. Is Eve listing in on our phone calls again? Mallory said so, but
he's always up to something, and I never know what to believe. He's
certainly no Trent.

Not as suspicious as Victor, though, always wanting my ID. Meh.

Tell Carol and Dave I said hi, and let's all invite Edmundo over to play
cards sometime.

and use the encryption engine of our choice to encrypt that blob of data as is to insert into the payload of a multipart/encrypted MIME tree, which will replace the text/plain MIME tree in the above message.
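As a rough illustration of the “render only the MIME tree” step, the sketch below uses Python’s standard email package to produce that blob from a single-part message. The function name render_mime_tree and the sample message are hypothetical, and the engine works from its own message struct rather than from raw text; the encryption call itself is deliberately left out.

```python
from email import message_from_string

# A made-up single-part message.
RAW = """\
From: alice@example.org
To: bob@example.org
Subject: Cards
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Hi Bob. Let's play cards sometime.
"""


def render_mime_tree(raw: str) -> str:
    """Return only the MIME tree: MIME headers, a blank line, then the body.

    Sketch for single-part messages only; IMF headers are left out on purpose.
    """
    msg = message_from_string(raw)
    mime_lines = [
        f"{name}: {value}"
        for name, value in msg.items()
        if name.lower() == "mime-version" or name.lower().startswith("content-")
    ]
    return "\n".join(mime_lines) + "\n\n" + msg.get_payload()


blob = render_mime_tree(RAW)
print(blob)  # this string is what would be handed to the encryption backend
```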

Here’s what we have post encryption as a full message:

Basic email message, post encryption and ready to send, with IMF headers encoded in yellow, and the MIME tree encoded in blue

Ignore the fact that the subject in the IMF headers has changed - that’s something to discuss in the next section and something we do intentionally.

The root node of the new MIME tree is this:

MIME-Version: 1.0
Content-Type: multipart/encrypted; boundary="515f007c5bd062c2122008544db127f8"; 
 protocol="application/pgp-encrypted"

The first thing to note in this is that we now have a multipart root node as opposed to a single part one (i.e. text/plain). Multipart nodes have subnodes separated by boundaries (515f007c5bd062c2122008544db127f8 above), and can have multipart children which have their own boundaries separating their child nodes, etc.
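If nested multipart nodes are new to you, the following sketch may help. It parses a made-up message with Python’s standard email package and lists every node with its depth in the tree. The boundaries “outer” and “inner” and the outline helper are invented for the example.

```python
from email import message_from_string

# Made-up tree: multipart/mixed containing a multipart/alternative and an attachment.
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="outer"

--outer
Content-Type: multipart/alternative; boundary="inner"

--inner
Content-Type: text/plain; charset=utf-8

plain version
--inner
Content-Type: text/html; charset=utf-8

<p>html version</p>
--inner--
--outer
Content-Type: application/octet-stream

attachment bytes
--outer--
"""


def outline(part, depth=0):
    """Return (depth, content type) for this node and all of its descendants."""
    rows = [(depth, part.get_content_type())]
    if part.is_multipart():
        for child in part.get_payload():
            rows.extend(outline(child, depth + 1))
    return rows


tree = outline(message_from_string(RAW))
for depth, content_type in tree:
    print("  " * depth + content_type)
```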

What you can see here is that when we encrypt the message, we create a new MIME tree of type multipart/encrypted to put in the outgoing message. The entire blue part of the last one, including MIME headers, is encrypted as a whole, and that payload - the part starting with BEGIN PGP MESSAGE and ending with END PGP MESSAGE - is now the contents of the application/octet-stream attachment of the main message.

This multipart/encrypted tree has two child nodes: an application/pgp-encrypted node containing a version number, and an application/octet-stream node containing the ASCII-armored encrypted data we produced above (the encrypted plaintext MIME tree) between -----BEGIN PGP MESSAGE----- and -----END PGP MESSAGE----- tags. The nodes are separated by the boundary delimiter (“--515f007c5bd062c2122008544db127f8”), and the tree is ended by the same boundary with two trailing hyphens (“--515f007c5bd062c2122008544db127f8--”).
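The shape of that tree can be reproduced with Python’s standard email package, as sketched below. The armored block is a placeholder rather than real ciphertext, build_encrypted_tree is a hypothetical helper, and nothing here is the engine’s own code; it only shows the two-child layout described above.

```python
from email import encoders
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart

BOUNDARY = "515f007c5bd062c2122008544db127f8"  # boundary from the example above

# Placeholder standing in for the output of a real encryption backend.
ARMORED = (
    "-----BEGIN PGP MESSAGE-----\n"
    "\n"
    "placeholder, not real ciphertext\n"
    "-----END PGP MESSAGE-----\n"
)


def build_encrypted_tree(armored: str, boundary: str) -> MIMEMultipart:
    """Build a multipart/encrypted root with its two child nodes."""
    root = MIMEMultipart(
        "encrypted",
        boundary=boundary,
        protocol="application/pgp-encrypted",
    )
    # First child: control part carrying the version number.
    control = MIMEApplication(
        "Version: 1\n", "pgp-encrypted", _encoder=encoders.encode_7or8bit
    )
    # Second child: the ASCII-armored encrypted plaintext MIME tree.
    payload = MIMEApplication(
        armored, "octet-stream", _encoder=encoders.encode_7or8bit
    )
    for child in (control, payload):
        del child["MIME-Version"]  # only the root node needs MIME-Version
        root.attach(child)
    return root


root = build_encrypted_tree(ARMORED, BOUNDARY)
text = root.as_string()
print(text)
```

Printing the result shows the boundary delimiter between the two nodes and the closing delimiter at the end of the tree.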

Note that with the exception of the subject change, we simply replace the MIME tree we had pre-encryption with this MIME tree, leaving the IMF headers untouched.

This is what Message 1.0 did (subject replacement excluded), and is in principle all you need to understand if you don’t have to look inside Message 2.0 in the engine.

Message 2.0 extends and changes this, ensuring only necessary headers are present for transport on the wire, but from the outside, it will look the same and we use this basic principle.

Down in the Engine, with Special Guest: Message 2.0

Coming soon to a wiki near you!

Message 2.1

(FIXME: add the RFC section for why we can do this.) By adding an X-pEp-WrappedMessageInfo to the inner header, and a comment to the MIME header for the inner message/rfc822 attachment, we can get around this OUTER and INNER manipulation of the plaintext. It’s coming ASAP.

Some more info (TODO: convert it to documentation instead of chat):

11:23 < darthmama> huss: as you may know, right now, we take the message we get from the app, add
                   a line to its plaintext saying "X-pEp-wrapped-message-info: INNER" and turn
                   the whole thing into a message/rfc822 attachment, create a blank outer message
                   with only selected headers from the input message copied into it (necessary
                   for transport), make a fake message-id and add the aforementioned attachment
                   to it, create the outer
11:23 < darthmama> plaintext saying "X-pEp-wrapped-message-info: OUTER", encrypt the whole mess,
                   and send it back to you
11:23 < darthmama> huss: all message 2.1 does is this:
11:25 < darthmama> huss: we don't want to modify the plaintext, but we still need something in
                   the decrypted outer message that tells us we have a wrapped message that we
                   need to pull out, and something in the inner message saying "yup, I'm really
                   an inner message, take my headers instead of the outer envelope please and
                   make me the main message"
11:26 < darthmama> huss: so 2.1, instead of adding these lines to the plaintext, does two things:
                   it adds a parameter to the MIME content-type of the message/rfc822 attachment:
                   "forwarded=no" (so that we know it's an inner message rather than a message
                   forwarded as an attachment), and
11:26 < darthmama> huss: it adds, on the inner message, an X-header: X-pEp-wrapped-message-info:
                   INNER (or KEY_RESET or TRANSPORT or...)
11:27 < darthmama> huss: these the engine treats as it did the plaintext wrapped message info
                   lines before and does the exact same thing as 2.1 otherwise
11:27 < darthmama> er 2.0
11:27 < darthmama> huss: additionally, we keep track of the highest pEp version we have seen from
                   other users
11:28 < darthmama> huss: so that we know whether to send them 2.0 or 2.1 messages
11:28 < darthmama> huss: (or 1.0 messages for OpenPGP only, as usual)
11:28 < darthmama> huss: this may be per-identity actually, I'd have to look.
11:29 < darthmama> huss: but the other thing is that in doing all this other stuff, we've also
                   added the sender_fpr as an explicit field on the inner message, and we read it
                   into the message struct when it's there in the inner message as well
11:29 < darthmama> huss: so X-pEp-sender-fpr
11:30 < darthmama> huss: anyway, I'd have to look at the code some more to see if there's
                   anything else, but that's more or less it. Feel free to copy to a Wiki or
                   something - I do intend to finish the docs, just have had no time.
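
Until the proper docs exist, here is a rough sketch of the 2.1 wrapping described in the chat, written with Python’s standard email package. It is not the engine’s code. The choice of From and To as the “selected headers”, the multipart/mixed outer container, and the helper names wrap_message and find_inner are all assumptions, and encrypting the outer message is left out entirely.

```python
from email import message_from_string
from email.message import Message
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from typing import Optional

# Assumption: which headers count as "necessary for transport".
TRANSPORT_HEADERS = ("From", "To")


def wrap_message(inner: Message, fake_message_id: str) -> Message:
    """Wrap `inner` as a message/rfc822 attachment of a fresh outer message.

    Note: this marks `inner` in place by adding the X-header.
    """
    inner["X-pEp-wrapped-message-info"] = "INNER"
    attachment = MIMEMessage(inner)          # Content-Type: message/rfc822
    attachment.set_param("forwarded", "no")  # not a forwarded-as-attachment mail
    outer = MIMEMultipart("mixed")           # assumed container type
    for name in TRANSPORT_HEADERS:
        if name in inner:
            outer[name] = inner[name]
    outer["Message-ID"] = fake_message_id
    outer.attach(attachment)
    return outer


def find_inner(outer: Message) -> Optional[Message]:
    """Return the wrapped inner message, or None if there isn't one."""
    for part in outer.walk():
        if (part.get_content_type() == "message/rfc822"
                and part.get_param("forwarded") == "no"):
            wrapped = part.get_payload(0)
            if wrapped.get("X-pEp-wrapped-message-info") == "INNER":
                return wrapped
    return None


INNER_RAW = """\
From: alice@example.org
To: bob@example.org
Subject: Secret plans
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

Cards on Friday?
"""

outer = wrap_message(message_from_string(INNER_RAW), "<fake-id@example.org>")
received = message_from_string(outer.as_string())  # simulate the decrypted outer
inner = find_inner(received)
print(inner["Subject"])
```

The plaintext body of the inner message is never touched; both markers live in headers, which is the difference from 2.0 described in the chat.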