Description

Parse and generate MIME messages.

Author

Shiro Kawai; Chicken port and some additions by Hans Bulfone

Version

Requires

Usage

(require-extension mime)

Download

mime.egg

Documentation

This documentation is based on Gauche's rfc.mime and rfc.quoted-printable documentation, with changes where the Chicken version differs.

This egg provides utility procedures to handle Multipurpose Internet Mail Extensions (MIME) messages, defined in RFC2045 through RFC2049. It is intended to be used together with the rfc822 egg.

Quoted-printable encoding/decoding

A few functions to encode/decode Quoted-printable format, defined in RFC2045, section 6.7.

procedure: (quoted-printable-encode)
Reads a byte stream from the current input port, encodes it in Quoted-printable format and writes the result character stream to the current output port. The conversion ends when it reads #!eof from the current input port.
procedure: (quoted-printable-encode-string STRING)
Converts the contents of STRING to Quoted-printable encoded format. The input string can be either a complete or incomplete string; it is always interpreted as a byte sequence.
procedure: (quoted-printable-decode)
Reads characters from the current input port, decodes them from Quoted-printable format and writes the result byte stream to the current output port. The conversion ends when it reads #!eof. If it encounters illegal character sequences (such as #\= followed by non-hexadecimal characters), it copies them literally to the output.
procedure: (quoted-printable-decode-string STRING)
Decodes a Quoted-printable encoded string STRING and returns the result as a string.
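
As a small illustration of the string variants, the following sketch encodes a string and decodes it again. It assumes the egg is installed and loaded as shown under Usage; the variable names are only illustrative.

```scheme
(require-extension mime)

;; Encode a string. RFC2045 requires "=" itself to be escaped
;; (as "=3D"), since "=" introduces an escape sequence.
(define encoded
  (quoted-printable-encode-string "This = a simple test."))

;; Decoding the encoded text should give back the original bytes.
(define decoded
  (quoted-printable-decode-string encoded))

(print encoded)
(print (string=? decoded "This = a simple test."))
```

The port-based procedures work the same way but read from the current input port and write to the current output port, which suits large inputs that should not be held in memory as a single string.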

Utilities for header fields

A few utility procedures to parse MIME-specific header fields.

procedure: (mime-parse-version FIELD)
If FIELD is a valid header field for MIME-Version, returns its major and minor versions in a list. Otherwise, returns #f. FIELD may also be #f, so you can pass the result of rfc822-header-ref to it directly. Given a parsed header list from rfc822-header->list, you can get the MIME version (currently, it should be (1 0)) with the following code:
(mime-parse-version (rfc822-header-ref headers "mime-version"))
Note: a simple regexp such as "\d+\.\d+" is not sufficient, because FIELD may contain comments between tokens.
procedure: (mime-get-attributes INPUT)
Reads an attribute/value list in the form ;attr1=value1;attr2=value2 from INPUT (which should be an open input-port) and returns it as an alist.
procedure: (mime-parse-content-type FIELD)
Parses the "content-type" header field, and returns a list such as:
(type subtype (attribute . value) ...)
where type and subtype are the MIME media type and subtype, respectively, as strings.
(mime-parse-content-type "text/html; charset=iso-2022-jp")
=> ("text" "html" ("charset" . "iso-2022-jp"))
If FIELD is not a valid content-type field, #f is returned.
procedure: (mime-decode-word WORD)
Decodes an RFC2047-encoded word. If WORD isn't an encoded word, it is returned as is.
(mime-decode-word "=?iso-8859-1?q?this=20is=20some=20text?=")
=> "this is some text"
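
The header utilities can be tried out on literal field values, as in the sketch below. The sample values are made up for illustration (the commented MIME-Version value follows the style of the example in RFC2045), and the exact shape of the alist returned by mime-get-attributes is assumed to match the attribute pairs returned by mime-parse-content-type.

```scheme
(require-extension mime)

;; A comment inside the field value is legal, which is why
;; a plain regexp match on the digits is not enough.
(define version
  (mime-parse-version "1.0 (produced by MetaSend Vx.x)"))

;; #f is accepted, e.g. when the message has no MIME-Version header.
(define no-version (mime-parse-version #f))

;; mime-get-attributes reads from a port, so wrap the string in one.
(define attrs
  (mime-get-attributes
   (open-input-string ";charset=utf-8;format=flowed")))

(print version)     ; per the description above, this should be (1 0)
(print no-version)
(print attrs)
```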

Streaming parser

The streaming parser is designed so that you can decide what to do with the message body before the entire message is read.

procedure: (mime-parse-message PORT HEADERS HANDLER)

The fundamental streaming parser. PORT is an input port from which the message is read. HEADERS is a list of headers parsed by rfc822-header->list; that is, this procedure is meant to be called after the header part of the message has been parsed from PORT:

(let* ((headers (rfc822-header->list port)))
  (if (mime-parse-version (rfc822-header-ref headers "mime-version"))
     ;; parse MIME message
     (mime-parse-message port headers handler)
     ;; retrieve a non-MIME body
     ...))

mime-parse-message analyzes headers, and calls HANDLER on each message body with two arguments:

(HANDLER PART-INFO XPORT)

PART-INFO is a :mime-part record, described below, that encapsulates the information about this part of the message. XPORT is an input port, initially positioned at the beginning of the message body. The handler can read from it as if it were reading from the original port. However, XPORT recognizes MIME boundaries internally and returns #!eof when it reaches the end of the part. (Do not read from the original port directly, or you will corrupt the internal state of XPORT.)

HANDLER can read the part into memory, save it to disk, or even discard it. Whatever it does, it has to read from XPORT until XPORT returns #!eof.

The return value of HANDLER is stored in the content slot of PART-INFO. If the message has nested multipart messages, HANDLER is called for each "leaf" part, in depth-first order. HANDLER can determine its nesting level by examining the PART-INFO record. The message doesn't need to be of a multipart type; if it is a MIME message type, HANDLER is called on the body of the enclosed message. If it is another media type such as text or application, HANDLER is called on the (only) message body.
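
The handler contract described above can be sketched as follows. The names drain-port, part-depth and text-only-handler are hypothetical helpers introduced here for illustration, not part of the egg; the sketch also assumes that XPORT supports read-char like any other input port.

```scheme
(require-extension mime)

;; Read and discard characters until the port reports end of file.
;; A handler that is not interested in a part still has to do this.
(define (drain-port xport)
  (let loop ((c (read-char xport)))
    (if (not (eof-object? c))
        (loop (read-char xport)))))

;; Nesting level of a part: the number of enclosing parts,
;; found by following the parent links up to the top (#f).
(define (part-depth part-info)
  (let loop ((parent (mime-part:parent part-info)) (depth 0))
    (if parent
        (loop (mime-part:parent parent) (+ depth 1))
        depth)))

;; Keep the decoded body of text/* parts, skip everything else.
;; The return value ends up in the content slot of PART-INFO.
(define (text-only-handler part-info xport)
  (if (string-ci=? (mime-part:type part-info) "text")
      (mime-body->string part-info xport)
      (begin
        (drain-port xport)
        #f)))
```

Such a handler would be passed as the third argument to mime-parse-message; comparing the type case-insensitively is a defensive choice, since this documentation does not say whether the type string is normalized.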

record: :mime-part

A SRFI-9 record that encloses metainformation about a MIME part. It is constructed when the header of the part is read, and passed to the handler that reads the body of the part.

The following procedures for manipulating :mime-part records exist:

procedure: (make-mime-part #:type TYPE #:subtype STYPE #:parameters PARAMS #:transfer-encoding TENC #:parent P #:index I #:headers HDRS #:content C #:attrs ATTRS)
Creates a :mime-part record. The arguments default to "text", "plain", '(), "7bit", #f, 0, '(), #f, and '(), respectively.
procedure: (mime-part:type PART-INFO)
procedure: (mime-part:type-set! PART-INFO TYPE)
MIME media type string. If the part has no content-type header, an appropriate default value is set.
procedure: (mime-part:subtype PART-INFO)
procedure: (mime-part:subtype-set! PART-INFO SUBTYPE)
MIME media subtype string. If the part has no content-type header, an appropriate default value is set.
procedure: (mime-part:parameters PART-INFO)
procedure: (mime-part:parameters-set! PART-INFO PARAMETERS)
Association list of the parameters given in the content-type header field.
procedure: (mime-part:transfer-encoding PART-INFO)
procedure: (mime-part:transfer-encoding-set! PART-INFO TRANSFER-ENCODING)
The value of the content-transfer-encoding header field. If the header field is omitted, an appropriate default value is set.
procedure: (mime-part:parent PART-INFO)
procedure: (mime-part:parent-set! PART-INFO PARENT)
If this is a part of a multipart message or an encapsulated message, points to the enclosing part's :mime-part record. Otherwise #f.
procedure: (mime-part:index PART-INFO)
procedure: (mime-part:index-set! PART-INFO INDEX)
Sequence number of this part within the same parent.
procedure: (mime-part:headers PART-INFO)
procedure: (mime-part:headers-set! PART-INFO HEADERS)
The list of header fields, as parsed by rfc822-header->list.
procedure: (mime-part:content PART-INFO)
procedure: (mime-part:content-set! PART-INFO CONTENT)
If this part is multipart/* or message/* media type, this slot contains a list of parts within it. Otherwise, the return value of handler is stored.
procedure: (mime-part:attrs PART-INFO)
procedure: (mime-part:attrs-set! PART-INFO ATTRS)
This alist provides a place for the message parser/generator or the application to store additional information. At the moment, only the 'qp-encode-binary? attribute is used by the quoted-printable encoder to decide if the content should be treated as binary.
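
A short sketch of the record procedures listed above. The value #t for the 'qp-encode-binary? attribute is an assumption made for illustration; this documentation only states that the attribute exists.

```scheme
(require-extension mime)

;; All keyword arguments are optional; omitted ones take
;; the documented defaults.
(define part (make-mime-part #:content "hello"))

(print (mime-part:type part))              ; documented default: "text"
(print (mime-part:subtype part))           ; documented default: "plain"
(print (mime-part:transfer-encoding part)) ; documented default: "7bit"

;; The -set! procedures modify the record in place.
(mime-part:transfer-encoding-set! part "quoted-printable")

;; attrs is an alist; the value #t here is an assumption.
(mime-part:attrs-set! part '((qp-encode-binary? . #t)))
```
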
procedure: (mime-retrieve-body PART-INFO XPORT OUTP)

A procedure to retrieve a message body. It is intended to be a building block for a handler to be passed to mime-parse-message.

PART-INFO is a :mime-part record. XPORT is an input port passed to the handler, from which the MIME part can be read. This procedure reads from XPORT until it returns #!eof. It also looks at the transfer-encoding of PART-INFO and decodes the body accordingly; that is, base64 and quoted-printable encodings are handled. The result is written to the output port OUTP.

This procedure does not handle charset conversion. The caller can use facilities from the charconv and/or iconv modules if conversion is desired.

A couple of convenience procedures are defined for typical cases on top of mime-retrieve-body.

procedure: (mime-body->string PART-INFO XPORT)
procedure: (mime-body->file PART-INFO XPORT FILENAME)
Reads in the body of a MIME message, decoding the transfer encoding, and returns it as a string or writes it to a file, respectively.
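
To show how these procedures fit into a handler, here is a minimal sketch of two handlers. The names string-handler and file-handler, as well as the "part-N.dat" file naming scheme, are hypothetical. The first handler is presumably close to what mime-body->string does, but that is an assumption rather than a description of the actual implementation.

```scheme
(require-extension mime)

;; Collect the decoded body in a string port and return it.
(define (string-handler part-info xport)
  (let ((out (open-output-string)))
    (mime-retrieve-body part-info xport out)
    (get-output-string out)))

;; Save the decoded body to a file named after the part's index
;; and return the file name, which then becomes the part's content.
(define (file-handler part-info xport)
  (let ((filename
         (string-append "part-"
                        (number->string (mime-part:index part-info))
                        ".dat")))
    (mime-body->file part-info xport filename)
    filename))
```

Note that the index is only unique among parts with the same parent, so with nested multipart messages a real handler would need a naming scheme that also takes the parent chain into account.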

Message generator

The message generator generates an RFC822/MIME message out of :mime-part objects.

procedure: (mime-part-write PART-INFO)

Formats PART-INFO as a MIME message and writes the result to the current output port.

PART-INFO may describe a single mail body or a multipart/* or message/* hierarchy.

For multipart messages a boundary may be given in (mime-part:parameters PART-INFO) which will be used as a base for the message boundary (but modified if needed).

If (mime-part:transfer-encoding PART-INFO) specifies "base64" or "quoted-printable" the body is encoded accordingly; else it is put into the message literally.

procedure: (mime-part->string PART-INFO)

Like mime-part-write, but returns the message as a string.
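
For a single-part message, the generator could be used as in the following sketch. The header values are made up, and the header list is written in the same shape as in the multipart example in the Examples section.

```scheme
(require-extension mime)

;; A single text/plain part (type and subtype are left at their
;; defaults) whose body is encoded as quoted-printable.
(define message
  (mime-part->string
   (make-mime-part
    #:headers '(("mime-version" "1.0")
                ("subject" "a test"))
    #:transfer-encoding "quoted-printable"
    #:content "This = a simple test.")))

(display message)
```

Getting the message as a string is convenient when it is handed to another procedure, for example one that sends mail, instead of being written to a port directly.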

Examples

The simplest form of MIME message parser would be like this:

(let ((headers (rfc822-header->list port)))
  (mime-parse-message port headers
                      (cut mime-body->string <> <>)))

This reads the whole message into memory (i.e. the content fields of the "leaf" :mime-part records hold each part's body as a string) and returns the top :mime-part record. Content transfer encoding is recognized and handled, but character set conversion isn't done.

You may want to write the message body to a file directly, or even skip some bodies depending on MIME media types and/or other header information. You can put that logic in the handler closure. That is why this module provides building blocks instead of an all-in-one procedure.
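
Once a message has been parsed, the resulting tree of :mime-part records can be walked with the record accessors. The sketch below uses the hypothetical helpers container? and print-structure and, to stay self-contained, builds a small tree by hand with make-mime-part instead of parsing a real message.

```scheme
(require-extension mime)

;; multipart/* and message/* parts hold a list of parts as content.
(define (container? part)
  (or (string-ci=? (mime-part:type part) "multipart")
      (string-ci=? (mime-part:type part) "message")))

;; Print "type/subtype" for each part, indented by nesting level.
(define (print-structure part indent)
  (print (make-string indent #\space)
         (mime-part:type part) "/" (mime-part:subtype part))
  (if (container? part)
      (for-each (lambda (child)
                  (print-structure child (+ indent 2)))
                (mime-part:content part))))

;; A hand-built tree standing in for the result of mime-parse-message.
(define tree
  (make-mime-part
   #:type "multipart" #:subtype "mixed"
   #:content
   (list (make-mime-part #:content "This = a simple test.")
         (make-mime-part #:type "application" #:subtype "octet-stream"
                         #:content "a simple test"))))

(print-structure tree 0)
```

The same procedure can be applied to the top :mime-part record returned by mime-parse-message, since leaf parts then simply carry whatever the handler returned.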

A simple MIME message could be generated as follows:

(mime-part-write
 (make-mime-part
  #:type "multipart"
  #:subtype "mixed"
  #:headers
  '(("from" "test <test@test.com>")
    ("to"   "foo <foo@bar.com>")
    ("mime-version" "1.0")
    ("subject" "a test")
    ("message-id" "<test123@test.com>"))
  #:content
  (list
   (make-mime-part
    #:transfer-encoding "quoted-printable"
    #:content
    "This = a simple test.")
   (make-mime-part
    #:type "application" #:subtype "octet-stream"
    #:transfer-encoding "base64"
    #:content "a simple test"))))

This writes the following to the current output port:

From: test <test@test.com>
To: foo <foo@bar.com>
Mime-Version: 1.0
Subject: a test
Message-Id: <test123@test.com>
Content-Type: multipart/mixed;boundary="MIME-Message-Boundary-"
Content-Transfer-Encoding: 7bit

This message is in MIME format.

--MIME-Message-Boundary-
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

This =3D a simple test.
--MIME-Message-Boundary-
Content-Type: application/octet-stream
Content-Transfer-Encoding: base64

YSBzaW1wbGUgdGVzdA==

--MIME-Message-Boundary---

License

Copyright (c) 2000-2004 Shiro Kawai, All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.

3. Neither the name of the authors nor the names of its contributors
   may be used to endorse or promote products derived from this
   software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.