$Id: minion-spec.txt,v 1.31 2007/04/21 16:57:47 arma Exp $ MIX3:1 Type III (Mixminion) Mix Protocol Specification George Danezis Roger Dingledine Nick Mathewson (who else?) Status of this Document This draft document ("minion-spec.txt") describes a proposed specification for Type III remailers. It is not a final version. It has not yet been submitted to any standards body. (This draft supersedes the old "minion-spec.tex", which was never really a TeX file in the first place. When you edit it, try to keep it looking like an RFC.) The organization of this document loosely follows the structure of RFC2440. Abstract This document describes the message formats and protocols required to implement an anonymizing Type III relay mix server compatible with the Mixminion reference implementation, and with the forthcoming Mixmaster 4. It does not provide information about end-to-end message encoding as performed by client software and delivery mixes; nor does it describe the formats and protocols used by servers to publish their identities and capabilities: these issues are described in the adjunct documents "E2E-spec.txt" and "dir-spec.txt" respectively. Although this document discusses security issues in implementing Type III mix software, it is not comprehensive, nor does it discuss all implementation issues. Type III mix software uses a fairly conventional mix-net design to provide sender and recipient anonymity services for (comparatively) high latency electronic messages. When senders and recipients use the system, an adversary should not be able to learn who is communicating with whom. The Type III mix system is not steganographic, nor is it designed to tunnel arbitrary TCP or IP traffic. Table of Contents Status of this Document X Abstract Table of Contents 1. Introduction 1.1. Terminology 2. System overview 2.1. Sender anonymity 2.2. Recipient anonymity 2.3. Bidirectional anonymity 3. Packet format 3.1. Preliminaries and Definitions 3.1.1. Cryptographic primitives 3.1.2. Notation 3.2. The Type III Packet Format 3.2.1. Subheaders 3.2.1.1. Routing information 3.2.2. Header structure 3.2.3. The payload 3.2.4. Constructing a Type III packet 3.2.4.1. Building a header 3.2.4.2. Encrypting the payload 3.2.4.3. Constructing the packet 3.2.5. Processing a Type III packet 3.2.5.1. Replay avoidance 3.2.5.2. Packet pooling and delivery 3.3. SURB exchange formats 4. Transport protocol A.1. Appendix: A suggested pooling rule A.2. Appendix: Compatibility with Type II mixes A.3. Appendix: Versioning and alphas 1. Introduction This document describes the operation of servers on a Type III mix network. It does not describe the history of mix-nets and other anonymity systems; nor does it provide a full design rationale for the Type III mix service. For the latter, see "Mixminion: Design of a Type III Anonymous Remailer Protocol". [XXXX In fact, until you've read the design paper above, you'll have a hard time understanding below what we mean about 'crossover points' and so on. Go read it now, then come back. XXXX] 1.1. Terminology * Mix - a server that provides anonymity to clients by accepting messages encrypted to its public key, which it then decrypts, batches, re-orders, and transmits either to another mix or to a final recipient (as specified in the messages). [The term is due to Chaum; 'remailer' is used elsewhere.] * Mixminion - a project to design an improved Mix service as described in this specification. Also, the name of the reference software to implement this service, currently under development. * Type I - an anonymity service originally written by Eric Hughes and others, based on the PGP message format, using SMTP for an underlying transport. Also known as the "cypherpunk" implementation. * Type II - an anonymity service originally designed by Lance Cottrell to address the flaws of the Type I system, and implemented in the software "Mixmaster". * Type III - the anonymity service described in this document. This document uses the terms "MUST", "SHOULD", "MAY", "MUST NOT", "SHOULD NOT", and "MAY NOT" as defined in RFC 2119. [XXXX is this wise? Observable variation=linkability. -NM] * Packet - A 32 kilobyte string transmitted anonymously through the Type III remailer network. * Payload - The 28 kilobyte portion of a packet containing a message, or part of a message, to be delivered anonymously. * Message - A variable-length sequence of octets sent anonymously through the network. Short messages are sent in a single packet; long messages are fragmented across multiple packets. See 'E2E-spec.txt' for more information about encoding messages into payloads. 2. System overview In this section, we provide an overview of the operations of Type III remailers [XXXX We need to be more explicit here, or somewhere, about crossover points and what exactly is encrypted with what. XXXX] 2.1. Sender (forward) anonymity When a sender wishes to send a message to a known recipient without revealing to any parties that the sender has done so, the following steps occur: 1. The sender creates a message, and chooses an ultimate recipient. 2. The sender compresses the message, splits it into fixed-sized chunks, and (optionally) encrypts the chunks with a known public key for the recipient. [Described in "E2E-spec.txt".] 3. The sender learns about the Type III mixes on the network. [Described in "dir-spec.txt".] For each chunk from step 2: 4. The sender chooses a list of the mixes from step 3 to constitute a path through the mix-net. [Described in "E2E-spec.txt".] 5. The sender successively wraps the chunk in a layer of encryption for each node in the chosen path, from last to first. Along with the data encrypted for each mix, the sender includes the address of the succeeding mix. Along with the data encrypted for the final mix, the sender includes the address of the recipient. [Described in section 3.2.4 below.] 6. The sender delivers the Type III packet to the first server in the path. [Described in section 4 below.] For each server in the path: 7. The server receives the Type III packet, decrypts it, and determines whether to discard it, deliver it to another mix, or deliver it to a final recipient. [Described in section 3.2.5 below.] 8. The server waits until enough time has passed, and/or enough packets have been received. [A suggested rule is described in appendix A.1 below.] 9. If the server is the final mix on the path, it delivers the message to the recipient [as described in section 3.2.5.2.], else it delivers it to the next mix [as described in section 4]. 2.2. Anonymous reply messages When a recipient wishes to receive messages from one or more senders without revealing his identity, the following steps occur: 1. As in steps 3 and 4 from section 2.1 above, the recipient learns about the available Type III mixes, and chooses one or more paths through the network. For each Type III packet the recipient wishes to receive: 2. The recipient generates a Single-Use Reply Block (SURB), by successively encrypting his address for each mix in the chosen path, from the last to the first. Along with the data encrypted for each mix, the recipient includes the address of the succeeding mix. [Described in section 3.3. below]. 3. Through some means the recipient delivers the SURBs anonymously to the sender, possibly as the contents of a forward message. 4. The sender constructs one or more packets for the recipient and inserts them into the network to be delivered, as in steps 4-9 in section 2.1 above. The sender uses a different SURB for each packet. (The only difference in the sender's behavior is that the second half of the path is already encoded in the SURB. To the mixes, the delivery process proceeds identically: rather than decrypting the packet contents, the nodes in the second half of the path effectively encrypt it.) 5. When the recipient receives the final message, he removes a layer of encryption for each of the nodes in the chosen path, and learns the message's contents. 2.3. Bidirectional anonymity Because an anonymous recipient can tell which SURB was used to encode which packet, the process in section 2.2. above does not provide bidirectional anonymity, unless the SURB was delivered using a recipient anonymous channel. For a scheme which supports this, see the Type III nymserver specification at "nym-spec.txt". [XXXX this document isn't there yet. XXXX] 3. Packet Format [XXXX] 3.1. Preliminaries and Definitions [XXXX] 3.1.1. Cryptographic primitives Implementing a Type III remailer only requires the combination of standardized cryptographic primitives. Most third party cryptographic libraries or providers should support RSA-OAEP (PKCS#1 Standard [XXXX Ref]), AES, SHA-1 (NIST Standards [XXXX Refs]), and ASN.1 notation. Furthermore a source of cryptographic random numbers should be available. 3.1.1.1. Message digest For our message digest we use SHA-1. Message digests are HASH_LEN=20 octets long. 3.1.1.2. Stream generator To generate predictable random padding, and to implement a stream cipher, we use AES (Rijndael) in counter mode with a 128-bit block size and 128-bit keys, and an all-zero Initial Vector (IV). Specifically, to generate N octets of a random stream with key K, we successively encrypt the following ceil(N/16) blocks (given in hexadecimal): [00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00] [00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01] [00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02] [00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03] .... We then return the first N octets of the concatenated result. 3.1.1.3. Super-pseudorandom permutation To encrypt an octet array so that any change in the encrypted value will make the decryption look like random bits, we use an instance of the LIONESS SPRP, with SHA-1 for a hash and the stream described in 3.1.1.2 above. Thus, in the notation described below, we encrypt a message M with a key K as follows: K1 = K K2 = K ^ [00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01] K3 = K ^ [00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02] K4 = K ^ [00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03] L := M[0:20] R := M[20:Len(M)-20] R := Encrypt( Hash(K1 | L | K1)[0:16], R ) L := L ^ Hash(K2 | R | K2) R := Encrypt( Hash(K3 | L | K3)[0:16], R ) L := L ^ Hash(K4 | R | K4) SPRP_Encrypt(K, M) = L | R We decrypt a message M with a key K as follows: K1 = K K2 = K ^ [00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01] K3 = K ^ [00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02] K4 = K ^ [00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03] L := M[0:20] R := M[20:Len(M)-20] L := L ^ Hash(K4 | R | K4) R := Encrypt( Hash(K3 | L | K3)[0:16], R ) L := L ^ Hash(K2 | R | K2) R := Encrypt( Hash(K1 | L | K1)[0:16], R ) SPRP_Decrypt(K, M) = L | R K must be 20 octets long; M must be at least 20 octets long. 3.1.1.4. Public-key cipher For a public-key cipher, we use RSA with PKCS1 encoding. To make messages non-malleable, we use OAEP padding with SHA-1 as the hash function and MGF1 as our mask function. The security parameter (P in the OAEP specification) is set to the following 84-character ASCII string (a quotation from Thomas Paine): "He who would make his own liberty secure, must guard even his enemy from oppression." (In hexadecimal, the hash of P is: [90 89 60 BC 88 EF 92 B0 9C 58 26 C4 BE 32 B3 34 A7 1C AA 5A].) For all public key operations in Type III packets, we use 2048-bit keys. We define: PK_OVERHEAD_LEN = 42 (octets added by OAEP padding) PK_ENC_LEN = 256 (size of RSA-encrypted data) PK_MAX_DATA_LEN = PK_ENC_LEN - PK_OVERHEAD_LEN = 214 (Longest octet array that can be RSA-encrypted.) All public keys must use 65537 as their exponent. 3.1.2. Notation We use the following function-style notations in describing our packet format. - Len(A) - The length, in octets, of an octet array A. - A[i:j] - If A is an octet array, A[i:j] is a subarray starting at octet i with length j. (Len(A[i:j]) = j) - Rand(n) - Generates n random octets by any secure method. Successive instances of Rand generate different and uncorrelated sequences. (Len(Rand(n)) = n) - Z(n) - Generates n octets with value 0. (Len(Z(n)) = n) - A|B - The concatenation of octet array A with octet array B. (Len(A|B) = Len(A)+Len(B)) - A^B - The bitwise exclusive or of two octet arrays A and B with equal lengths. (Len(A^B) = Len(A) = Len(B)) - Hash(M) - A cryptographic digest of an octet array M. (Len(Hash(M)) = HASH_LEN = 20) - - Denotes a keypair for our public key encryption system. - PK_Encrypt(PK, M) - The public-key encryption of an octet array M using the public key PK. (Len(M) <= PK_MAX_DATA_LEN = 214; Len(PK_Encrypt(PK,M) = PK_ENC_LEN = 256) - PK_Decrypt(SK, M) - The private-key decryption of an octet array M using the secret key SK. (0 <= Len(PK_Decrypt(SK,M)) <= PK_MAX_DATA_LEN = 214) - PK_Encode(PK) - The ASN.1 encoding of a public key PK. - PRNG(K, n) - Uses a cryptographic stream generator to produce n octets, using key K. (Len(K) = KEY_LEN = 16; Len(PRNG(K,n)) = n) - Encrypt(K, M) = M ^ PRNG(K,Len(M)) - The encryption of an octet array M with our stream cipher, using key K. (Encrypt(Encrypt(M)) = M.) - SubKey(K,S) - Derivation of a named sub key, using a master key K (Len(K) = 16) and an arbitrary length ASCII String. (Len(SubKey(K,S)) = 16) SubKey(K,S) = Hash(K | S)[0:16] - SPRP_Encrypt(K, M) - The encryption of an octet array M using our super-pseudorandom permutation with key K. (Len(K) = SPRP_KEY_LEN = 20; Len(SPRP_ENC(K,M)) = Len(M)) SPRP_Encrypt(K,S,M) is shorthand for SPRP_Encrypt(Hash(K|S),M) - SPRP_Decrypt(K, M) - The decryption of an octet array M using our super-pseudorandom permutation with key K. (Len(K) = SPRP_KEY_LEN = 20; Len(SPRP_DEC(K,M)) = Len(M)) SPRP_Decrypt(K,S,M) is shorthand for SPRP_Decrypt(Hash(K|S),M). - Int(bits, N) - A big-endian, bits-bit representation of the integer N. To improve the readability of our specification, we use the following symbolic constants: - HASH_LEN = 20 - KEY_LEN = 16 - PK_OVERHEAD_LEN = 42 - PK_ENC_LEN = 256 - PK_MAX_DATA_LEN = 214 - SPRP_KEY_LEN = 20 - HEADER_LEN=2048 All fixed strings are encoded in ASCII. All fields are packed in Internet order (MSB first). 3.2. The Type III Packet Format Each Type III packet is composed of a header section and a payload. The header section has a main header and a secondary header, both of which have identical structure. Each header is further composed of a series of subheaders, which are addressed and encrypted to the intermediate mixes. The packet is laid out as follows: Main header [2K] Secondary header [2K] Payload [28K] [Total: 32K] 3.2.1. Subheader structure A subheader is a variable-length structure containing all the information that a mix needs to decrypt a packet, check its integrity, and route it to the next hop on its path. Every subheader contains the following fields: V Version Major 1 octet V Version Minor 1 octet SK Shared Secret KEY_LEN=16 octets D Digest HASH_LEN=20 octets RS Routing Size 2 octets RT Routing Type 2 octets [Total 42 octets] RI Routing Info [Variable length; RS=Len(RI).] (We use the symbolic constant MIN_SH=42 below to indicate the length of the invariant part of a subheader.) The fields in the subheader are used as follows: * The Version field is used to manage concurrent versions of the protocol. If a Mix receives a packet with an unrecognized version, it must discard that packet. Note that the major and minor version numbers are stored using their binary representations (not the corresponding ASCII codes.) (Nodes must advertise in their status blocks what versions of the protocol they support; see "dir-spec.txt".) * The Shared Secret is the base secret that is used to generate all other keys for the operations the node performs on the packet. It must be kept secret and discarded as soon as the packet has been processed. * The Digest contains an integrity check of the part of the current header encrypted using AES in counter mode. The digest does not cover the RSA encrypted part of the header: modifications to it are detected because of the OAEP padding. * The Routing Type defines how the mix should deliver or relay the packet. If a mix receives a routing type it does not recognize, it must discard the packet (see 2.2.1.1.). * Most routing methods require additional addressing information, which is given in the variable length Routing Info field. * The Routing Size field indicates the total size in octets of the Routing Information. We will refer to the fixed-size part of the subheader structure as: FSHS(V, SK, D, RS, RT) = V|SK|D|RS|RT [MIN_SH = 42 octets] We will refer to the entire subheader as: SHS(V, SK, D, RS, RT, RI) = FSHS(V,SK,D,RS,RT) | RI [42+Len(RI) octets] 3.2.1.1. Routing information There are currently 7 defined routing types: 0x0000-0x00FF: PROTOCOL SUPPORT: NON-EXIT TYPES 0x0000 DROP (0 octets of routing information) 0x0001 FWD/IP4 (IP: 4 octets, PORT: 2 octets, KEYID: 20 octets): 26 octets 0x0002 SWAP-FWD/IPV4 (same info as FWD/IP4) 0x0003 FWD/HOST (PORT: 2 octets, KEYID: 20 octets, NAME: variable width) 0x0004 SWAP-FWD/HOST (same info as FWD/HOST) 0x0100-0x0FFF: PREDEFINED DELIVERY TYPES. 0x0100 SMTP [See "E2E-spec.txt"] 0x0101 MBOX [See "E2E-spec.txt"] 0x0102 MIX2 (EMAIL ADDRESS: variable). Type II remailer support. 0x0103 FRAGMENT (Fragment to be reassembled by exit node; see E2E-spec) 0x0104 NEWS [See "E2E-spec.txt"] 0x0105 PING (server-dependent) 0x1000-0xEFFF: UNALLOCATED 0xF000-0xFFFF: FOR EXPERIMENTAL USE A DROP routing type indicates a dummy packet. It must be discarded. To prevent servers from distinguishing among clients, every DROP packet should have a random payload. DROP messages must be discarded, and not inserted into the mix pool. A FWD/IP4 routing type indicates that the packet must be retransmitted using the TLS/Mixminion transport protocol (see section 4.) The IP field represents the IPv4 address. The KEYID field contains the SHA1 hash of the ASN.1 representation of the next node's identity public key. A SWAP routing type tells the node to exchange headers as described below. The FWD/HOST and SWAP-FWD/HOST routing type are analogous to FWD/IPV4 and SWAP-FWD/IPV4, except that they expect fully qualified hostnames rather than IPv4 addresses. Hostnames SHOULD be in lowercase. Servers SHOULD not block while resolving the hostnames. [The *FWD/HOST family first appears in Mixminion 0.0.6, and is meant to replace *FWD/IPV4. Mixminion 0.0.7 and later will not generate or accept *FWD/IPV4 messages. If a server is addressed via a static IPs, it should use a dotted quad as their hostname. As of Mixminion 0.0.7, the types formerly associated with *FWD/IPv4 will become unallocated.] See 'E2E-spec.txt' for more information about SMTP and MBOX delivery. 3.2.2. Header Structure A header contains a set of variable-length subheaders, one for each mix in the corresponding leg of the path. Each node sees the main header as a 256-octet portion encrypted with the node's public key, followed by a 1792-octet portion encrypted with a stream cipher. The key for the stream cipher is derived from the master secret SK as follows: K = SubKey(SK,"HEADER SECRET KEY") The subheader is always at the start of the RSA-encrypted portion, and is followed by header material intended for the next mix. Note that because the full subheader (including routing information) may be more or less than PK_MAX_DATA_LEN=214 octets long, some of the RI field may spill into the stream-encrypted portion, or some of the next mix's header may spill into the RSA-encrypted portion. These diagrams should help explain: 1. If the subheader is less than PK_MAX_DATA_LEN octets long, it fits into a single RSA encryption along with some of the data from the rest of the header. Before: (S is the new subheader, H is the existing header material) SSSSSSSSSS HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH..... After: (Encrypted with RSA) (Encrypted with AES) SSSSSSSSSSHHHHHHHHHH HHHHHHHHHHHHHHHHHHHHHHHHHHHHHH..... 2. Otherwise, the subheader won't fit into a single RSA encryption, so we spill some of it over: Before: SSSSSSSSSSSSSSSSSSSSSSSSS HHHHHHHHHHHHHHHHHHHHHHHHH..... After: (Encrypted with RSA) (Encrypted with AES) SSSSSSSSSSSSSSSSSSSS SSSSSHHHHHHHHHHHHHHHHHHHHHHHHH..... In order to decode an encrypted header H addressed to a mix with public key PK, into a subheader SH and the rest of the header MH, the following operations could be performed. // Decrypts the header using the public key and the extracted key. H1 = PK_Decrypt(PK, H[0:PK_ENC_LEN]) K = SubKey(H1[2:KEY_LEN], "HEADER SECRET KEY") H2 = Encrypt(K, H[PK_ENC_LEN: 2048 - PK_ENC_LEN] // Reads the length of the routing information and extracts the // sub header. SH = (H1|H2)[0:MIN_SH + H1(38:2)] MH = (H1|H2)[Len(SH):2048-Len(SH)] 3.2.3. The payload [XXXX writeme] 3.2.4. Constructing a Type III packet The information about how to build packets, headers and SURBs is provided for information purposes only, since a pure type III remailer server is not required to be capable of creating packets. Understanding how packets are created may be beneficial for understanding why the remailer processes them in the way specified in section 3.2.5. 3.2.4.1. Building a header Each type III packet has two headers with identical structure. These headers are swapped at the crossover point. [XXXX describe crossover] A header is HEADER_LEN=2048 octets long and contains up to 2048/(PK_OVERHEAD_LEN+MIN_SH)=24 subheaders. Starting with N subheaders SH_0..SH_N containing secrets SK_0..SK_N (and placing routing extension blocks directly after their respective subheaders), the header is constructed by appending random padding to achieve a total size of 2048 octets. Then, each subheader key is used to create a key SubKey(SharedSecret, "HEADER SECRET KEY") with which the part of the header after the subheader (but including its routing extension) is encrypted using counter-mode AES. We construct the subheaders from last to first, so that each can contain a digest of the subsequent subheaders and padding data. A single headers are created as follows: First, the header constructor obtains (as input) a list of nodes, the public key and routing information for each node, a randomly generated secret to use in the subheader for each node, and the 'delivery' routing information the last node should use to deliver a packet. The constructor then determines the amount of data (SIZE_i) that will be added for each node's subheader. This amount is equal to MIN_SH, plus PK_OVERHEAD_LEN, plus the length of the routing information for the following node (or the 'delivery' routing information if this node is last). The constructor generates a chunk of random padding, equal in length to HEADER_LEN minus the sum of all SIZE_i. For each node NODE_i with master secret SK_i, the constructor calculates a header key K_i=SubKey(SK, "HEADER SECRET KEY") and a junk key JUNK_KEY_i=SubKey(SK, "RANDOM JUNK). It then calculates the node-generating junk J_i that will be generated by each node NODE_i -- this is obtained by starting with no junk (as seen by the first node), and appending to the junk each node sees, SIZE_i bytes, generated with the PRNG key JUNK_KEY_i. The junk is also encrypted by K_i, starting with the counter offset that NODE_i will use to encrypt the junk it adds. Finally, the constructor builds the header itself. It starts with the random padding, and then proceeds *in reverse order* from the last node to the first node. For each node, it constructs and encodes a subheader containing appropriate routing and key information. The 'RSA portion' of the header is taken to be the first PK_MAX_DATA_LEN bytes of the subheader-concatenated-with-the- header; the 'AES portion' of the header is taken to be the rest of the subheader-concatenated-with-the-header. The constructor encrypts the 'AES portion' with K_i, and sets the digest in the 'RSA portion' to the hash of the encrypted 'AES portion' concatenated with the junk J_i seen by node NODE_i. Finally, the constructor RSA-encrypts the 'RSA portion' with PK_i. PROCEDURE: Create a single header. Inputs: RI_1..RI_n, RT_1..RT_N (addresses of intermediate nodes), PK_1 .. PK_N (Public keys of intermediate nodes), SK_1 .. SK_N (Secret keys to be shared with intermediate nodes), R Routing type and information of last header (FWD, DROP, SWAP, etc.) Output: H (The header) Process: // Calculate the sizes of the subheaders. for i = 1 .. N // OAEP Padding plus invariant parts plus routing info. if i = N then Set RI from R else Set RI = RI_(i+1) endif SIZE_i = MIN_SH + PK_OVERHEAD_LEN + Len(RI) JUNK_KEY_i = SubKey(SK_i, "RANDOM JUNK") K_i = SubKey(SK_i, "HEADER SECRET KEY") end PADDING_LEN = 2048-SUM(SIZE_1 ... Size_N) // Calculate the Junk that will be appended during processing. // J_i is the junk that node i will append, and node i+1 will see. J_0 = "" for i = 1 .. N J_i = J_(i-1) | PRNG(JUNK_KEY_i, SIZE_i) Stream_i = PRNG(K_i, 2048 + SIZE_i) // Before we encrypt the junk, we encrypt all the data, and all // the initial padding, but not the RSA-encrypted part. // OFFSET = PADDING_LEN + SUM(SIZE_i ... SIZE_N) - 256 // = 2048 - SUM(SIZE_1 ... SIZE_N) + SUM(SIZE_i ... SIZE_N) // -256 // = 2048-256 - SUM(SIZE_1 ... SIZE_(i-1)) // = 2048 - 256 - len(J_{i-1}) OFFSET = HEADER_LEN - PK_ENC_LEN - Len(J_(i-1)) J_i = J_i ^ Stream_i[OFFSET:Len(J_i)] end // Create the Header, starting with the padding. H_(N+1) = Rand(PADDING_LEN) for i = N .. 1 if i = N then Set RT and RI from R. else Let RT = RT_(i+1), RI = RI_(i+1) endif SH0 = SHS(V, SK_i, Z(HASH_LEN), len(RI), RT, RI) SH_LEN = LEN(SH0) H0 = SH0 | H_(i+1) REST = H0[PK_MAX_DATA_LEN : Len(H0) - PK_MAX_DATA_LEN] EREST = Encrypt(K_i, REST) DIGEST = HASH(EREST | J_(i-1)) SH = SHS(V, SK_i, DIGEST, len(RI), RT, RI) UNDERFLOW = Max(PK_MAX_DATA_LEN - SH_LEN, 0) RSA_PART = SH | H0[PK_MAX_DATA_LEN - UNDERFLOW : UNDERFLOW] ESH = PK_ENCRYPT(PK_i, RSA_PART) H_i = ESH | EREST end return H_1 It is important to note that a user can create a SURB by following a similar procedure as described above. Since the objective is for SURBs to be stateless, in the sense that they do not require any secrets to be stored by their creator to be decoded, the full procedure is more intricate (see "E2E-spec.txt" for details.) 3.2.4.2. Encoding the payload The payload of a Mixminion packet has a fixed length of 32 KB - 2*16*128 octets = 28KB. Payloads indicate the size of the messages they contain as described in "E2E-spec.txt". (When sending a reply message with a SURB, we use payload encryption to prevent the crossover point from seeing an unencrypted payload. See the "E2E-spec.txt" for more information.) We denote a payload as P. 3.2.4.3. Constructing the packet Given two headers and a payload one can construct a packet. The first header must end with a subheader with routing type SWAP. To construct a Type III packet, the sender applies the SPRP steps for payload transmission in reverse order: If the sender has a forward message, she first SPRP_Encrypts the payload with the secret keys for the hops in the second header in reverse order. If she has a SURB, she SPRP_Decrypts the payload with the SURB encryption key. [We use decryption in the reply case so that the recipient only needs to use SPRP_Encrypt steps to retrieve the payload.] Next, the sender SPRP_Encrypts the second header with a hash of the payload, then SPRP_Encrypts the payload with a hash of the encrypted second header. This step (which the SWAP operation will reverse) makes the second header and the payload unreconstructable if either is damaged. Finally, the sender SPRP_Encrypts the payload and the second header with each of the first header's secret keys, in reverse order. Concatenating the first header, the second header, and the payload yields the final packet. PROCEDURE: Construct a packet Input: H1 (header containing keys SK1_1... SK1_N) and H2 (either a header containing keys SK2_1... SK2_N if we constructed it, or a header with unknown keys if we're using a reply block and a SURB secret key.) P (Payload) Output: M (the 32KB packet) Process: // Phase 1 if H2 is a reply block, then Let K_SURB = the end-to-end encryption key in H2 P = SPRP_Decrypt(K_SURB, "PAYLOAD ENCRYPT", P) else // H2 is not a reply block for i = N .. 1 P = SPRP_Encrypt(SK2_i, "PAYLOAD ENCRYPT", P) end endif // Phase 2 H2 = SPRP_Encrypt(SHA1(P), "HIDE HEADER", H2) P = SPRP_Encrypt(SHA1(H2), "HIDE PAYLOAD", P) for i = N .. 1 H2 = SPRP_Encrypt(SK1_i, "HEADER ENCRYPT",H2) P = SPRP_Encrypt(SK1_i, "PAYLOAD ENCRYPT",P) end M = H1 | H2 | P 3.2.5. Processing a Type III packet Packets are transferred from node to node using the custom Type III transport protocol (see below). A node with private key PK receiving a packet M = H1 | H2 | P performs the following operations: PROCEDURE: Process a packet M H1 = M[0:HEADER_LEN] H2 = M[HEADER_LEN:HEADER_LEN] P = M[2*HEADER_LEN:PAYLOAD_LEN] PK_PART = PK_Decrypt(PK,H1[0:PK_ENC_LEN]) If there is any problem with the OAEP padding discard the packet. If Len(PK_PART) != PK_MAX_ENC_LEN, discard the packet. FSHS(V, SK, D, RS, RT) = Check that D = Hash(H1[PK_ENC_LEN:HEADER_LEN-PK_END_LEN]), and discard if not. Check for replays, as described in 3.2.5.1. JUNK_KEY = SubKey(SK, "RANDOM JUNK") H1 = H1[PK_ENC_LEN : 2048-PK_ENC_LEN] | PRNG(JUNK_KEY, PK_OVERHEAD_LEN + MIN_SH + RS) HEADER_KEY = SubKey(SK, "HEADER SECRET KEY") H1 = Encrypt(HEADER_KEY, H1) FULL_H = PK_PART[MIN_SH:Len(PK_PART)-MIN_SH] | H1 RI = FULL_H[0:RS] H1 = FULL_H[RS:Len(FULL_H) - RS] H2 = SPRP_Decrypt(SK, "HEADER ENCRYPT",H2) P = SPRP_Decrypt(SK, "PAYLOAD ENCRYPT",P) if routing type is DROP: End. if routing type is SWAP-FWD: P = SPRP_Decrypt(Hash(H2), "HIDE PAYLOAD", P) H2 = SPRP_Decrypt(Hash(P), "HIDE HEADER", H2) Swap H1 and H2 if routing type is SWAP-FWD or FWD: Put H1 | H2 | P in queue to be sent to the address in RI. Otherwise: Give (RT, RI, P) to the delivery subsystem. The operation of the delivery subsystem is beyond the scope of this document. In general, "RT" will specify a means of message delivery (such as 'SMTP') and thereby an interpretation of RI (such as "RFC822 mailbox") and a correct disposition of P (such as "decode and deliver via SMTP"). For more information, see section 2.2.1.1 and 'E2E-spec.txt'. 3.2.5.1. Replay avoidance The nodes MUST implement a mechanism to make sure that packets cannot be replayed. To do this a hash of the secret contained in the subheader is kept for as long as the public key under which it was encrypted is in use. The Hash should be computed in the following way: X = HASH(SK | "REPLAY PREVENTION") The value X is not secret, and its secrecy should not be relied upon. The integrity of the list should be secured and the X values may be made public. 3.2.5.2. Packet pooling and delivery Instead of processing and delivering packets immediately upon receipt, a node SHOULD process packets as soon as possible (in order to prevent linking in the case of compromise), but MUST accumulate a batch of deliverable packets before delivering any of them (in order to frustrate traffic analysis). Nodes SHOULD choose a batching strategy that blends packets with one another even in the event of a flooding or n-1 attack; such an algorithm is given in Appendix A.3 below. If a packet or message cannot be delivered immediately due to a (possibly) transient error, the node SHOULD retry the packet or message periodically until either it can be delivered, or until a given interval has passed. After the interval, the packet or message is discarded. 3.3. SURB exchange formats. This section describes how to encode SURBs for storage or transmission. A SURB can be encoded in a standard binary or ASCII format. Binary Format: Begin Marker: 4 octets Version: 2 octets Use-by-Date: 4 octets SURB header: 2048 octets Routing Size: 2 octets Routing Type: 2 octets Encryption key: 16 octets Routing Info: (Routing Size) octets Total: 30 octets + Header size + Routing info size. * The begin marker is the ASCII 4-octet string 'SURB'. [53 55 52 42] * The version number contains the format version of the SURB. (should be hex 01 and 00 for this standard). * Routing type/routing size/routing info: Defined as in subheaders. These fields encode the address of the hop which the SURB user should use as an exit point. * Use-by-Date: indicates the expiry date the SURB should be used by. MUST be no later than the key rotation of any node on the SURB Path. This field must be given as a number of seconds since midnight GMT on Jan 1, 1970 -- but must be aligned to the start of a day (in other words, it must be divisible by 60*60*24). (Misaligned dates must be rejected as invalid.) (To avoid leaking parameters, clients SHOULD pick a use-by date, then pick a path accordingly, rather than leaking their choices in the use-by date. To avoid distinguishability, clients SHOULD NOT choose use-by dates that are long in the future.) [XXXX Should there be a use-after date too? -NM] * SURB data: Contains the SURB that is created as described above. * Encryption key: used to SPRP-encrypt the payload before sending it into the network. To encode a SURB for transport, implementations SHOULD support OpenPGP-style ASCII armor, with the header text "BEGIN TYPE III REPLY BLOCK" and a single armor header, "Version: 1.0". 4. Transport protocol A special channel should be established between mixes that provides forward secrecy making it impossible to recognize or decrypt any packet that went through it in the past. In order to establish this channel one of the two mixes initiates the connection but at the end of the key exchange protocol the channel is bi-directional. The protocol should be used when the SWAP-FWD/IP4 or FWD/IP4 address type is specified in a subheader. Servers must not connect to themselves over the network when routing packets to their own published IP/Port combination. The Mixminion protocol uses TLS (the IETF standardization of SSL) with the ciphersuite "TLS_DHE_RSA_WITH_AES_128_CBC_SHA" (defined in tls-ciphersuite-03.txt). No other ciphersuite is permitted for mix-to-mix communications. [Servers must allow incoming connections via SSL3_RSA_DES_192_CBC3_SHA for clients written with older SSL libraries. However, servers must never initiate connections with this suite.] The X.509 certificate must be signed by the server's identity key, and a certificate containing the server's identity key must be included with the certificate chain set to the client. If the identity key doesn't match the KEYID portion of the header's routing data, the client closes the connection. Packets are sent from client to server. Session suspension is not permitted. Protocol outline: (Portions marked with '*' are normative; other portions are non-normative descriptions of TLS.) - A invents a new Diffie Hellman key (of at least 1024 bits modulus) and makes a certificate signed by her signing key. A then initiates the SSL Handshake protocol with B. - B invents a DH key and sends a certificate chain containing: - a certificate with B's transport public key, signed by B's identity key. - a self-signed certificate containing B's identity key. * A checks that the Hash of the identity key is the same as the one contained in the routing info of the subheader. - The SSL handshake protocol proceeds as normal until a session key has been established. All communications are then encrypted using this session key. * A sends "MMTP 1.0", CRLF. This indicates the protocol versions that A supports. (Future clients that support more protocols should transmit "MMTP ", a list of comma-separated protocol versions, and a CRLF.) * If B is not willing to use any protocol A supports, B closes the connection. B sends "MMTP 1.0", CRLF. This indicates B's choice of protocol. If A is not willing to support B's choice, A closes the connection. * Packet case: * A sends "SEND", CRLF, M, HASH(M|"SEND") (6 + 32k + 20 octets) * B sends "RECEIVED", CRLF, HASH(M|"RECEIVED") (10 + 20 octets) [Note that A SHOULD NOT wait for B's reply before sending further packets; rather, A SHOULD start sending its next packet immediately. Node B SHOULD NOT send a reply until it has committed the packet to local storage, and Node A SHOULD NOT remove the packet from local storage before it has it has received B's reply. Node A MAY pause if it is waiting for 16 or more hashes at a time.] * Padding case: * A sends "JUNK", CRLF, Junk, HASH(Junk|"JUNK") (6 + 32k + 20 octets) (where Junk is an arbitrary 32k sequence.) * B sends "RECEIVED", CRLF, HASH(Junk|"RECEIVED JUNK") (10 + 20 octets) [Note that both cases require the same number of octets and processing time. Implementations SHOULD make sure the real packet and the padding cases are indistinguishable to a third party, or even to the parties involved after the keys have been updated.] * Error case: * A sends "SEND", CRLF, M, HASH(M|"SEND") (6 + 32k + 20 octets) * B sends "REJECTED", CRLF, HASH(M|"REJECTED") (10 + 20 octets) [B must reject a packet if its hash is incorrect; if the disk is full, or if for some other reason B cannot accept the packet. Again, implementations should not be distinguishable in their timing in the case where the packet is accepted or rejected. When A receives a "REJECTED" reply, it must behave as if delivery had failed, and retry the packet later (up to a reasonable retry interval).] * If a connection persists for longer than 15 minutes, the client must initiate key renegotiation. If it has not, the server must close the connection. NOTES: The old keys must be permanently overwritten. Special care should be taken to permanently erase them from the Hard Disk and memory. The standard transport mechanism over which the Mixminion Transfer Protocol talks is TCP over IP. The standard listening TCP port should be number 48099 (until we register a port with www.iana.org) All possible checks should be performed during the transfer protocol and if any fail the connection MUST stop and all state MUST be deleted. An error MAY be logged. Current implementations do not report their version as 1.0. See appendix A.3. A.1. Appendix: A suggested pooling rule In order to allow room for future experimentation, we do not require a single batching rule. Nonetheless, we describe a recommended rule (as used in Mixmaster) which is somewhat resistant to flooding attacks. Implementors are strongly encouraged to use this algorithm, or another equally robust against active and passive attacks. (Be sure to read \cite{batching-taxonomy}. [XXXX Ref]) PROCEDURE: Choose sets of messages to transmit ("Cottrell-style batching") Inputs: Q (a queue of messages) N (the number of messages in the queue). MIX_INTERVAL (algorithm parameter; time to wait between batches of messages. Should be around XXXXX. Must be >= 0.) POOL_SIZE (algorithm parameter; minimum size of pool. Should be at least XXXXXXXX. Must be >= 0.) MAX_REPLACEMENT_RATE (algorithm parameter; largest allowable rate for messages to be removed from the pool. Should be between XXXX and XXXX. Must have 0.0 < MAX_REPLACEMENT_RATE <= 1.0) Outputs: (A set of messages sent to the network). 1. Wait for MIX_INTERVAL seconds. 2. If N > POOL_SIZE, then let 'max_send' = FLOOR(N*MAX_REPLACEMENT_RATE). [If 'max_send' < 0, let max_send = 1.] Choose Min(N-POOL_SIZE, max_send) messages from Q. Transmit the selected messages. 3. Repeat indefinitely. A.2. Appendix: Backward compatibility with Type II remailers In order to share anonymity sets with Type III remailers while retaining type II support, some remailers may wish to use Mixminion to deliver type II messages. This is done as follows: Nodes that accept both type II and type III may advertise the fact in their server descriptor by including a section of the form: [Incoming/Mix2] Address: (type II remailer's email address) Key: (type II key) KeyID: (type II keyid) Signature: (signature of identity key with type II key) (Optionally, KeyID and Signature repeated any number of times.) This section advertises that the mix can handle type II messages intended for a given type II identity (email address) and set of keys. The value of 'key' is the base-64 representation of the ASN.1 encoding of the Mix node's type II key. The value of 'Signature' must be the base-64 representation of the RSA-OAEP/PKCS1 signature (using the type II remailer key) of the SHA-1 hash of the ASN.1 representation of this node's identity key. Directory servers and bridging nodes _must_ verify that keyid and signature are correctly computed. Upon receiving a type II message via SMTP, a bridging node checks whether the destination node is also a type III node, by looking for a type III node whose KeyID matches the KeyID for the packet. [See below] If it finds one, the bridging node unbase64's the type II message's contents, and uses them (plus random padding) as the payload of a type III packet for that node. The routing type must be 'MIX2' (0x0102); the routing info must be equal to the destination mix's type II address. [[Non-normative note: extracting KeyID and message contents]] (This information is included in the Type II remailer spec; it is included here only for reference.) A type II message follows the format: "-----BEGIN REMAILER MESSAGE-----" NL PacketLength NL Checksum NL Packet NL "-----END REMAILER MESSAGE-----" NL PacketLength is the length of the packet in octets, encoded as a decimal integer."Checksum" is equal to the MD5 hash of the packet, encoded in Base 64. The packet is also encoded in base 64; The first 16 octets of the packet are the KeyID of the recipient. The KeyID of a type II node is calculated by taking the public key , expressing n and e as zero-padded, big-endian integers, concatenating them, and taking the MD5 hash of the result. When encoding a type II message for transmission in a type III payload, a type III node should include: S [2 octets] (Size, big-endian) CHK [16 octets] (MD5 checksum as given in type II packet, base-64 decoded) PKT [S octets] (Packet as given in in type II packet, base-64 decoded) PAD [28KB-16-2-S octets] (Random padding) A.3. Appendix: Versioning and alphas Today's alpha code does not publish its version as '1.0'; it uses '0.x' instead (currently '0.3' for packets, '0.3' for MMTP, and '0.1' for everything else). Production versions MUST NOT retain backward compatibility with pre-production releases.