Python IMAP search using an ISO-8859-1 encoded subject
From another account, I sent an email with the subject "Test de réception en local". Now I want to find this email over IMAP by searching on its subject.
When I search with ALL and look for the email in the output, I see:
Subject: =?ISO-8859-1?Q?Test_de_r=E9ception_en_local?=
So now, searching over IMAP, I am trying:
import imaplib
from email.header import Header

M = imaplib.IMAP4_SSL('imap.gmail.com', 993)
M.login('user@gmail.com', 'password')
M.select('[Gmail]/All Mail')
subject = Header(email_model.subject, 'iso-8859-1').encode() #email_model.subject is in unicode, utf-8 encoded
typ, data = M.search('iso-8859-1', '(SUBJECT "%s")' % subject)
for num in data[0].split():
    typ, data = M.fetch(num, '(RFC822)')
    print 'Message %s\n%s\n' % (num, data[0][1])
M.close()
M.logout()
print 'Fin'
If you print subject, you can see that it matches what the IMAP server returned in my earlier, broader search. However, it does not match when I run this more specific search.
For searching, I tried everything I could:
typ, data = M.search('iso-8859-1', '(HEADER subject "%s")' % subject)
typ, data = M.search('iso-8859-1', 'ALL (SUBJECT "%s")' % subject)
And others that I can't remember at the moment, all with no luck.
I can search for (and match) emails whose subjects are pure ASCII, but it doesn't work for any subject that requires encoding. So...
With IMAP, what is the correct way to search for an email by a subject that requires encoding?
Thanks.
When talking to IMAP servers, check the IMAP RFC.
You should remove the extra quotes, and you shouldn't MIME-encode the string. Also, the charset argument specifies the encoding of the search query, not the encoding of the message header. This should work (it works for me):
M.search("utf-8", "(SUBJECT %s)" % u"réception".encode("utf-8"))
# this also works:
M.search("iso8859-1", "(SUBJECT %s)" % u"réception".encode("iso8859-1"))
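To make the distinction concrete, here is a small Python 3 sketch (the snippets above are Python 2) that uses only the standard `email.header` module and needs no server. It shows that the stored Subject header is an RFC 2047 encoded-word, whereas the search term is just the decoded text encoded as raw bytes in the charset you name. The variable names are illustrative only.

```python
from email.header import Header, decode_header, make_header

# The header exactly as it appears in the raw message.
raw_header = "=?ISO-8859-1?Q?Test_de_r=E9ception_en_local?="

# Decoding the encoded-word gives back the human-readable subject.
decoded = str(make_header(decode_header(raw_header)))

# What should go into the SEARCH command: the plain text as bytes
# in the charset passed as the first argument to search().
search_term = "réception".encode("utf-8")

# What the question sent instead: the MIME encoded-word form,
# which the server compares against decoded text and so never matches.
mime_encoded = Header(decoded, "iso-8859-1").encode()

print(decoded)
print(search_term)
print(mime_encoded)
```

In other words, the encoded-word is only a transport format for the header; the search should be done with the decoded text.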
Edit:
Apparently some servers (at least Gmail as of August 2013) only support UTF-8 strings when they are sent as literals. Python's imaplib has very limited support for literals; the best one can do is something like:
term = u"réception".encode("utf-8")
M.literal = term
M.search("utf-8", "SUBJECT")
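For Python 3, the same workaround might be wrapped in a small helper like the sketch below. The name `search_by_subject` is hypothetical, and the approach relies on the same undocumented `literal` attribute of imaplib connections used above, so treat it as an assumption that may not hold across versions or servers.

```python
def search_by_subject(conn, text):
    """Search the selected mailbox for `text` in the Subject header.

    `conn` is assumed to be a logged-in imaplib.IMAP4 / IMAP4_SSL
    connection with a mailbox already selected.
    """
    # imaplib appends the pending literal to the next command it sends,
    # so the subject text travels as an IMAP literal in UTF-8.
    conn.literal = text.encode("utf-8")
    return conn.search("UTF-8", "SUBJECT")


# Usage, assuming M is connected as in the question:
# typ, data = search_by_subject(M, "réception")
# for num in data[0].split():
#     typ, msg = M.fetch(num, "(RFC822)")
```

The helper returns the same `(typ, data)` pair as `M.search`, so the fetch loop from the question can be reused unchanged.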