I have found conflicting information about dot stuffing when transmitting an email.
- stuff a dot if the line contains a single dot (to avoid premature termination)
- stuff a dot on every line that starts with a dot
- stuff a dot as in the first option, and additionally on every line that is part of a quoted-printable message part
Can anyone clarify?
According to the SMTP standard RFC 5321, section 4.5.2:
http://tools.ietf.org/html/rfc5321#section-4.5.2
To allow all user composed text to be transmitted transparently, the following procedures are used:
- Before sending a line of mail text, the SMTP client checks the first character of the line. If it is a period, one additional period is inserted at the beginning of the line.
- When a line of mail text is received by the SMTP server, it checks the line. If the line is composed of a single period, it is treated as the end of mail indicator. If the first character is a period and there are other characters on the line, the first character is deleted.
So, from the three points of your question, the second one is right.
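The two procedures quoted above can be sketched in Python. This is only an illustration, not a full SMTP implementation: the function names `dot_stuff` and `dot_unstuff` are made up for this example, lines are assumed to be CRLF-separated, and the server side is assumed to have already removed the final end-of-data line.

```python
CRLF = b"\r\n"

def dot_stuff(body: bytes) -> bytes:
    """Client side: insert one extra period before every line that starts with a period."""
    lines = body.split(CRLF)
    return CRLF.join(b"." + line if line.startswith(b".") else line for line in lines)

def dot_unstuff(data: bytes) -> bytes:
    """Server side: delete the first character of every line that starts with a period.

    Assumes the terminating single-period line has already been stripped.
    """
    lines = data.split(CRLF)
    return CRLF.join(line[1:] if line.startswith(b".") else line for line in lines)

body = b"hello\r\n.\r\n..two dots\r\n.hidden\r\nend"
stuffed = dot_stuff(body)
restored = dot_unstuff(stuffed)
```

Here a line containing only "." goes over the wire as "..", so it can no longer be mistaken for the end of the mail, and `restored` is identical to `body`.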
The practical answer: if you're using the quoted-printable encoding, always translate a dot to =2E. You can't rely on all SMTP servers doing the dot removal correctly.
If you want to assume the whole world is standards compliant then go with answer 2 above.
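As a rough sketch of the =2E approach, the helper below rewrites text that is already quoted-printable encoded. The name `encode_leading_dots` is hypothetical, and it is an assumption of this sketch that only a dot at the start of a line needs encoding, since that is the only position SMTP treats specially. It also does not re-check the quoted-printable line length limit, which the two extra characters could exceed on a line that is already full.

```python
def encode_leading_dots(qp_text: str) -> str:
    """Replace a dot at the start of any line with its quoted-printable form, =2E.

    Assumes qp_text is already quoted-printable encoded and uses CRLF line endings.
    """
    lines = qp_text.split("\r\n")
    return "\r\n".join("=2E" + line[1:] if line.startswith(".") else line for line in lines)

encoded = encode_leading_dots(".\r\nfoo.bar\r\n..x")
```

After this step no line begins with a dot, so the message no longer depends on the SMTP hops stuffing and unstuffing correctly.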
In the SMTP protocol, the mail data is terminated by a line containing only a single dot, followed by a newline sequence.
In simple terms something like:
\r\n.\r\n
The characters:
CR LF DOT CR LF
This corresponds to a line consisting of nothing but a single dot.
If the mail data contains a line with a single dot at the beginning that is immediately followed by a newline, the SMTP server will treat it as the end of the mail, and only part of the mail would be delivered.
So the whole idea is to avoid this kind of situation by padding with an extra dot.
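The truncation can be shown with a toy receiver. This is a simplified sketch, not real server code: `receive_data` is a made-up helper that just cuts the stream at the first CR LF DOT CR LF, and the stuffing here is done with a plain `replace`, which assumes the body does not start with a dot on its very first line.

```python
TERMINATOR = b"\r\n.\r\n"  # CR LF DOT CR LF

def receive_data(stream: bytes) -> bytes:
    """Return everything before the first end-of-data marker, as a naive server would."""
    end = stream.find(TERMINATOR)
    return stream if end == -1 else stream[:end]

body = b"first line\r\n.\r\nsecond line"

# Without stuffing, the lone dot inside the body looks like the terminator.
unstuffed_stream = body + TERMINATOR
truncated = receive_data(unstuffed_stream)

# With stuffing, every dot that follows a line break gets one extra dot.
stuffed_stream = body.replace(b"\r\n.", b"\r\n..") + TERMINATOR
complete = receive_data(stuffed_stream)
```

In the first case the receiver stops after "first line" and the rest of the mail is lost. In the second case the whole body arrives, still carrying the extra dot that the server removes afterwards.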