Mailparser :: Nodemailer #363

gowol64 opened this issue Sep 23, 2023 · 0 comments

MAILPARSER
Advanced email parser for Node.js. Everything is handled as a stream which should make it able to parse even very large messages (100MB+) with relatively low overhead.

The module exposes two separate modes: a lower-level MailParser class and a simpleParser function. The latter is simpler to use (hence the name) but is less resource-efficient, as it buffers attachment contents in memory.

Install
npm install mailparser --save
simpleParser
simpleParser is the easiest way to parse emails. You only need to provide a message source to get a parsed email structure in return. As a bonus, all embedded images in HTML (e.g. images that point to attachments using cid: URIs) are replaced with base64-encoded data URIs, so the message can be displayed without any additional processing. Be aware that this module does not do any security sanitization (e.g. removing JavaScript); that is left to your own application.

const simpleParser = require('mailparser').simpleParser;
simpleParser(source, options, (err, parsed) => {});
See MailParser options list

or as a Promise:

simpleParser(source, options)
    .then(parsed => {})
    .catch(err => {});
or even with async..await:

let parsed = await simpleParser(source);
Where

source is either a stream, a Buffer or a string that needs to be parsed
options is an optional options object
err is the possible error object
parsed is a structured email object (the mail object described below)
mail object
The parsed mail object has the following properties:

headers – a Map object with lowercase header keys
subject is the subject line (also available from the header mail.headers.get('subject'))
from is an address object for the From: header
to is an address object for the To: header
cc is an address object for the Cc: header
bcc is an address object for the Bcc: header (usually not present)
date is a Date object for the Date: header
messageId is the Message-ID value string
inReplyTo is the In-Reply-To value string
replyTo is an address object for the Reply-To: header
references is an array of referenced Message-ID values
html is the HTML body of the message. If the message included embedded images as cid: URLs, these are all replaced with base64-formatted data: URIs
text is the plaintext body of the message
textAsHtml is the plaintext body of the message formatted as HTML
attachments is an array of attachments
address object
Address objects have the following structure:

value is an array with address details
    name is the name part of the email/group
    address is the email address
    group is an array of grouped addresses
text is a formatted address string for plaintext context
html is a formatted address string for HTML context

Example

{
    value: [
        {
            address: '[email protected]',
            name: 'Andris Reinman'
        },
        {
            address: '[email protected]',
            name: ''
        }
    ],
    html: 'Andris Reinman <[email protected]>, [email protected]',
    text: 'Andris Reinman [email protected], [email protected]'
}
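Since an entry in value may be a group rather than a single mailbox, code that needs a flat list of recipients has to descend into group arrays. The following is a minimal sketch under that assumption; listAddresses and the sample data are hypothetical and not part of mailparser.

```javascript
// Collect every plain email address from a parsed address object,
// descending into groups (entries that carry a `group` array).
function listAddresses(addressObject) {
    const result = [];
    const walk = entries => {
        for (const entry of entries || []) {
            if (Array.isArray(entry.group)) {
                walk(entry.group);
            } else if (entry.address) {
                result.push(entry.address);
            }
        }
    };
    walk(addressObject && addressObject.value);
    return result;
}

// Hypothetical sample data in the documented shape
const sample = {
    value: [
        { address: 'alice@example.com', name: 'Alice' },
        { name: 'Team', group: [{ address: 'bob@example.com', name: '' }] }
    ]
};

console.log(listAddresses(sample)); // [ 'alice@example.com', 'bob@example.com' ]
```

The helper returns an empty array when the header is missing, so it can be called on mail.to, mail.cc and similar properties without checking each one first.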
headers Map
headers is a Map with lowercase header keys. So if you want to check for the Subject: header then you can do it like this:

if (mail.headers.has('subject')) {
    console.log(mail.headers.get('subject'));
}
The format of a header depends on the specific key. For most header keys the value is either a string (a single header) or an array of strings (multiple headers with the same key were found).

Special header keys are the following:

All address headers are converted into address objects:
    from
    to
    cc
    bcc
    sender
    reply-to
    delivered-to
    return-path
All different priority headers are converted into priority with the following values:
    'high'
    'normal'
    'low'
references is a string if only a single reference ID exists or an array if multiple IDs exist
date value is a Date object
The following headers are parsed into structured objects, where the value property holds the main value as a string and the params property holds an object of additional arguments as key-value pairs:
    content-type
    content-disposition
    dkim-signature
Some headers are also automatically mime-word decoded:
    all address headers (name parts and punycode encoded domains are converted to unicode)
    subject is converted to unicode
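Because most header values are either a single string or an array of strings, calling code often ends up branching on the type. One way to avoid that is a small wrapper that always returns an array. This is only a sketch: headerValues is a hypothetical helper, and the Map below is hand-built sample data rather than parser output.

```javascript
// Return the values stored under a header key as an array,
// whether the Map holds a single value, an array, or nothing.
function headerValues(headers, key) {
    const value = headers.get(key.toLowerCase());
    if (value === undefined) {
        return [];
    }
    return Array.isArray(value) ? value : [value];
}

// Hypothetical headers Map in the documented shape (lowercase keys)
const sampleHeaders = new Map([
    ['received', ['from a.example', 'from b.example']],
    ['references', '<id1@example.com>']
]);

console.log(headerValues(sampleHeaders, 'References').length); // 1
console.log(headerValues(sampleHeaders, 'received').length); // 2
```

For the structured keys (address headers, date, content-type and so on) the wrapper simply returns a one-element array holding the structured value.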
attachment object
Attachment objects have the following structure:

filename (if available) file name of the attachment
contentType MIME type of the message
contentDisposition content disposition type for the attachment, most probably “attachment”
checksum an MD5 hash of the message content
size message size in bytes
headers a Map value that holds MIME headers for the attachment node
content a Buffer that contains the attachment contents
contentId the header value from ‘Content-ID’ (if present)
cid contentId without < and >
related if true then this attachment should not be offered for download (at least not in the main attachments list)
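The related flag suggests keeping two lists: attachments to offer for download and attachments that only exist to be embedded in the HTML body. A minimal sketch of that split follows; partitionAttachments and the sample objects are hypothetical, and only the related and filename properties are taken from the list above.

```javascript
// Separate attachments meant for download from those embedded in the body.
function partitionAttachments(attachments) {
    const downloadable = [];
    const inline = [];
    for (const attachment of attachments || []) {
        (attachment.related ? inline : downloadable).push(attachment);
    }
    return { downloadable, inline };
}

// Hypothetical sample data; real attachment objects carry more properties
const sampleAttachments = [
    { filename: 'report.pdf', related: false },
    { filename: 'logo.png', related: true },
    { filename: 'notes.txt' }
];

const parts = partitionAttachments(sampleAttachments);
console.log(parts.downloadable.map(a => a.filename)); // [ 'report.pdf', 'notes.txt' ]
console.log(parts.inline.map(a => a.filename)); // [ 'logo.png' ]
```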
MailParser
MailParser is a lower-level email parsing class. It is a transform stream that takes email source as bytestream for the input and emits data objects for attachments and text contents.

const MailParser = require('mailparser').MailParser;
let parser = new MailParser();
options
skipHtmlToText boolean Don’t generate plaintext from HTML. Defaults to undefined (falsy).
maxHtmlLengthToParse number The maximum amount of HTML to parse in bytes. Defaults to undefined (Infinity).
formatDateString function Provide a custom formatting function. Defaults to undefined.
skipImageLinks boolean Skip converting CID attachments to data URL images. Defaults to undefined (falsy).
skipTextToHtml boolean Don’t generate HTML from plaintext message. Defaults to undefined (falsy).
skipTextLinks boolean Do not linkify links in plaintext content. Defaults to undefined (falsy).
Iconv object Defaults to iconv-lite
keepCidLinks boolean simpleParser-only option. Sets skipImageLinks to true.
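As an illustration, an options object built from the names above might look like the following. The values are arbitrary examples chosen for this sketch, not recommended settings.

```javascript
// Illustrative values only; the option names come from the list above.
const parserOptions = {
    skipHtmlToText: true, // do not derive plaintext from HTML
    skipTextLinks: true, // leave links in plaintext content untouched
    maxHtmlLengthToParse: 1024 * 1024 // limit HTML parsing to 1 MiB
};

// The object would then be passed as `new MailParser(parserOptions)`
// or as the second argument of `simpleParser(source, parserOptions)`.
console.log(Object.keys(parserOptions).length); // 3
```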
Event ‘headers’
The parser emits ‘headers’ once message headers have been processed. The headers object is a Map. Different header keys have different kinds of values: for example, address headers have the address object/array as the value, while the subject value is a string.

Header keys in the Map are lowercase.

parser.on('headers', headers => {
    console.log(headers.get('subject'));
});
Event ‘data’
Event ‘data’ or ‘readable’ emits message content objects. The type of the object can be determined by the type property. Currently there are two kinds of data objects:

‘attachment’ indicates that this object is an attachment
‘text’ indicates that this object includes the html and text parts of the message. This object is emitted once and it includes both values
attachment object
Attachment object is the same as in simpleParser except that content is not a buffer but a stream. Additionally there’s a method release() that must be called once you have processed the attachment. The property related is set after message processing is ended, so at the data event this value is not yet available.

parser.on('data', data => {
    if (data.type === 'attachment') {
        console.log(data.filename);
        data.content.pipe(process.stdout);
        data.content.on('end', () => data.release());
    }
});
If you do not call release() then the message processing is paused.
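If a small attachment is needed as a Buffer rather than piped somewhere, the chunks can be collected before release() is called. The sketch below assumes only that content is a readable stream and that release() exists, as described above. collectAttachment is a hypothetical helper, and the EventEmitter stands in for a real attachment stream so the example runs without a message.

```javascript
const { EventEmitter } = require('events');

// Buffer one streamed attachment, then release it so parsing can continue.
function collectAttachment(data, done) {
    const chunks = [];
    data.content.on('data', chunk => chunks.push(chunk));
    data.content.on('error', err => done(err));
    data.content.on('end', () => {
        data.release(); // without this call the parser stays paused
        done(null, { filename: data.filename, content: Buffer.concat(chunks) });
    });
}

// Hypothetical stand-in for a MailParser attachment object, for demonstration only
const fakeContent = new EventEmitter();
const fakeAttachment = {
    type: 'attachment',
    filename: 'note.txt',
    content: fakeContent,
    released: false,
    release() {
        this.released = true;
    }
};

let collected = null;
collectAttachment(fakeAttachment, (err, result) => {
    collected = result;
});
fakeContent.emit('data', Buffer.from('hello '));
fakeContent.emit('data', Buffer.from('world'));
fakeContent.emit('end');

console.log(collected.content.toString()); // hello world
```

Buffering gives up the memory advantage of the streaming API, so this pattern only makes sense when attachment sizes are known to be small.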

text object
Text object has the following keys:

text includes the plaintext version of the message. Is set if the message has at least one ‘text/plain’ node
html includes the HTML version of the message. Is set if the message has at least one ‘text/html’ node
textAsHtml includes the plaintext version of the message in HTML format. Is set if the message has at least one ‘text/plain’ node.
parser.on('data', data => {
    if (data.type === 'text') {
        console.log(data.html);
    }
});
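Since html and textAsHtml are each set only when the corresponding MIME node exists, a renderer usually needs a fallback order. The sketch below shows one possible order; pickDisplayBody is a hypothetical helper and the sample objects are hand-written.

```javascript
// Prefer the HTML part, fall back to the HTML-formatted plaintext part.
function pickDisplayBody(textObject) {
    if (textObject.html) {
        return { kind: 'html', body: textObject.html };
    }
    if (textObject.textAsHtml) {
        return { kind: 'textAsHtml', body: textObject.textAsHtml };
    }
    return { kind: 'none', body: '' };
}

// Hypothetical text objects in the documented shape
const plainOnly = { type: 'text', text: 'Hi', textAsHtml: '<p>Hi</p>' };
const withHtml = { type: 'text', html: '<b>Hi</b>', text: 'Hi', textAsHtml: '<p>Hi</p>' };

console.log(pickDisplayBody(plainOnly).kind); // textAsHtml
console.log(pickDisplayBody(withHtml).kind); // html
```

The html value is not sanitized by the parser, so it should still be cleaned by the application before being shown.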
Issues
Charset decoding is handled using iconv-lite, except for ISO-2022-JP and EUCJP, which are handled by encoding-japanese. Alternatively, you can use the node-iconv module instead for all charset decoding. This module is not included in the mailparser package; you have to provide it to MailParser or simpleParser as a configuration option.

const Iconv = require('iconv').Iconv;
const MailParser = require('mailparser').MailParser;
let parser = new MailParser({ Iconv });
or

const Iconv = require('iconv').Iconv;
const simpleParser = require('mailparser').simpleParser;
simpleParser('rfc822 message', { Iconv }, callback);
License
Dual licensed under MIT or EUPLv1.1+
