How to encode special characters in email addresses

Email addresses do not consist only of these parts:

localpart@domain.tld

The complete next line (including the quoted part, the quotes themselves, and the angle brackets) is also a valid address:

"John Doe" <localpart@domain.tld>

And when I replace "John Doe" with my name, I get an address that I can enter into my email client without any complaints (note the "ö" in my last name, which is not an ASCII character):

"Hubert Schölnast" <localpart@domain.tld>

So it seems (to the user of a regular email client like Thunderbird) as if the special characters in the quoted section were okay.

But when I validate this full email address in a Perl script using the CPAN module Email::Valid, I get an error saying that the address does not comply with the RFC 822 rules, and the documentation of that module says that RFC 822 does not allow a non-ASCII character anywhere in the email address. (When I omit the letter ö or replace it with an ASCII letter, the check reports that the address is valid.)

So it seems obvious that any email client must encode the email address before sending the email to the SMTP server, and must decode it when it receives a new email and displays the header information to the user. But I can't figure out how this is done, even though I really did my best to search the web.

I need this encoding algorithm because I want to write a Perl script that accepts any valid email address (including ones with special characters in the quoted section) and then sends emails to those addresses.



2 answers


Perl core has Encode.pm:

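#!/usr/bin/perl
use strict;
use warnings;
use Encode;

# The script file must be saved as UTF-8: decode_utf8 turns the raw bytes
# of the literal into a Perl character string.
my $from_header = decode_utf8 q{From: "Hubert Schölnast" <localpart@domain.tld>};
# 'MIME-Header' is the encoding name documented by Encode::MIME::Header.
print encode('MIME-Header', $from_header);

1;
__END__
From: "=?UTF-8?B?SHViZXJ0IFNjaMO2bG5hc3Q=?=" <localpart@domain.tld>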
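The encoded part in the output above is an RFC 2047 "encoded-word": the charset name, the letter B (for Base64), and the Base64 form of the UTF-8 bytes, wrapped in =? and ?=. A minimal sketch of that algorithm, applied to the display name only and using only core modules, might look like this. The helper name encode_display_name is made up for illustration, and the sketch assumes the name is short enough to fit into a single encoded-word (RFC 2047 limits each one to 75 characters); longer names would have to be split into several encoded-words.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode_utf8);
use MIME::Base64 qw(encode_base64);

# Hypothetical helper (not part of any module): wrap a display name in a
# single RFC 2047 encoded-word if it contains anything other than
# printable ASCII. Assumes the result stays within 75 characters.
sub encode_display_name {
    my ($name) = @_;

    # Printable ASCII only: nothing to encode.
    return $name unless $name =~ /[^\x20-\x7E]/;

    # Characters -> UTF-8 bytes -> Base64 without a trailing newline.
    my $b64 = encode_base64(encode_utf8($name), '');
    return "=?UTF-8?B?$b64?=";
}

# "\x{F6}" is the letter ö, written as an escape so that the example does
# not depend on the encoding of the script file.
my $name   = encode_display_name("Hubert Sch\x{F6}lnast");
my $header = "From: $name <localpart\@domain.tld>";
print "$header\n";
# prints: From: =?UTF-8?B?SHViZXJ0IFNjaMO2bG5hc3Q=?= <localpart@domain.tld>
```

Only the display name is encoded here; the address between the angle brackets stays untouched, so it can still be checked separately with a validator.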

      

RFC 822 / RFC 2822 have many requirements that make it difficult to work with emails.



RFC 2822 also prohibits any line of a message from containing more than 998 characters (not counting the terminating CRLF), and recommends no more than 78. Long header lines should be folded across multiple lines by indenting the continuation lines.

This means that we have to pay attention to the length of the lines, because converting special characters and adding the header label both make them longer.
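Folding can be sketched in a few lines of Perl. The helper name fold_header and the default limit of 78 characters are assumptions made for illustration. The sketch only breaks at single spaces, so a single word longer than the limit is left as it is.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: fold a header line at spaces so that, where
# possible, no physical line is longer than $limit characters.
# Continuation lines start with a space, which marks them as folded.
sub fold_header {
    my ($header, $limit) = @_;
    $limit = 78 unless defined $limit;

    my @lines;
    my $current = '';
    for my $word (split / /, $header) {
        if ($current eq '') {
            $current = $word;                      # first word of the header
        }
        elsif (length($current) + 1 + length($word) <= $limit) {
            $current .= " $word";                  # still fits on this line
        }
        else {
            push @lines, $current;                 # close the current line
            $current = " $word";                   # indented continuation line
        }
    }
    push @lines, $current if $current ne '';
    return join "\r\n", @lines;
}

my $long   = 'Subject: ' . join ' ', ('word') x 30;
my $folded = fold_header($long);
print "$folded\r\n";
```

Unfolding is the reverse step: removing every CRLF that is directly followed by a space or tab gives back the original line.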



Use MIME::Words to encode or decode addresses, subjects, etc.

E.g., when creating an email:

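#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # the string literal below is written in UTF-8

use MIME::Words qw{ encode_mimeword };

# Per the MIME::Words documentation, the defaults are "Q" encoding and the
# ISO-8859-1 charset; both can be passed as extra arguments.
my $encoded = encode_mimeword('Hubert Schölnast');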

      



To encode a name together with an address, use encode_mimewords.

When processing received email, use decode_mimewords instead.
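If MIME::Words is not installed, the decoding direction can also be sketched with the core Encode module, which is a different tool from decode_mimewords but handles the same encoded-words. The header value below is just the sample name from the question in encoded form.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# A header as it might arrive over the wire (plain ASCII only).
my $raw = 'From: =?UTF-8?B?SHViZXJ0IFNjaMO2bG5hc3Q=?= <localpart@domain.tld>';

# decode() with 'MIME-Header' replaces each encoded-word with the
# characters it stands for and returns a Perl character string.
my $decoded = decode('MIME-Header', $raw);

# Encode the characters as UTF-8 when printing them.
binmode STDOUT, ':encoding(UTF-8)';
print "$decoded\n";
```

The result is a character string, so it should be encoded again (as with binmode here) before it is written to a terminal or a file.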
