Reading email from a mail file
I have a huge number of mail archives that I want to de-duplicate and sort. The archives are either in mbox format or contain a single mail message. To add a bit of complexity, some of the files use Windows EOL sequences and some use Unix EOL sequences.
Using C#, how can I read an archive and split it into separate messages, or read a single message file? In Python I would use the mailbox.mbox class, but I don't see any equivalent functionality in the C# documentation.
You are unlikely to find a ready-made library for reading this format in C#: there are not many Unix users who also use C#.
I would do one of the following:
- Read the Python code and port it to C#
- Find a description of the mbox format online. Since it is a Unix format, it is most likely just a text file and should be simple enough to parse.
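As a reference point for the first option, this is roughly the Python behaviour a C# port would need to reproduce. It is only a minimal sketch: `list_subjects` is a hypothetical helper name, and the file is assumed to be a well-formed mbox with Unix line endings.

```python
import mailbox


def list_subjects(path):
    """Return the Subject header of every message in an mbox file."""
    # create=False makes a missing file an error instead of creating it.
    box = mailbox.mbox(path, create=False)
    try:
        # Iterating over the mailbox yields one message object per entry;
        # a message without a Subject header yields None.
        return [message["Subject"] for message in box]
    finally:
        box.close()
```

Each message comes back as an object with dictionary-style header access, so the port needs both the splitting step and some form of header parsing.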
Most standard Unix mail files delimit entries with a line starting with "From " (the word From followed by a space).
So if you read the mail file as a text file and start a new mail record every time you see "From " at the beginning of a line, it should work. Any such line elsewhere in a message should already have been escaped by the email program (typically as ">From ").
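The splitting rule described above can be sketched as follows. The logic is plain string handling, so it should port to C# almost line for line. The names `split_mbox` and `_flush` are made up for illustration, and the un-escaping step assumes the simple ">From " convention; a file with no separator line at all is returned as one message, which covers the single-message files from the question.

```python
def _flush(lines, messages):
    """Append the collected lines as one message, skipping empty ones."""
    body = "\n".join(lines).strip("\n")
    if body:
        messages.append(body)


def split_mbox(text):
    """Split mbox text into message strings; separator lines are dropped."""
    # Normalise Windows (CRLF) and bare CR line endings to LF first.
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    messages, current = [], []
    for line in lines:
        if line.startswith("From "):
            # Separator line: close the previous message, start a new one.
            _flush(current, messages)
            current = []
        else:
            # Assumption: body lines were escaped as ">From ", so undo that.
            if line.startswith(">From "):
                line = line[1:]
            current.append(line)
    _flush(current, messages)
    return messages
```

Normalising the line endings before splitting means the same code handles files with either EOL style.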
If this is a one-off job, I think the simplest steps to sort the messages are:
- combine all mbox files into one
- import the combined file into Thunderbird as a local folder
- run one of the duplicate message finder add-ons on the folder
- remove the duplicates it finds
- compact the folder
- take away a list of duplicate-free emails :)
Duplicate Eliminators (Thunderbird Add-ons)
I used this: Remove duplicate messages (Alternate)
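For the first step, combining the mbox files, a minimal sketch might look like this. `combine_mboxes` is a hypothetical helper, and it assumes every input already is an mbox file, that is, each one starts with a "From " separator line. Single-message files would need such a line added before they could be appended, and the destination must not be one of the sources.

```python
from pathlib import Path


def combine_mboxes(sources, destination):
    """Concatenate mbox files into one, normalising line endings to LF."""
    with open(destination, "wb") as out:
        for source in sources:
            # Convert Windows line endings so the result is consistent.
            data = Path(source).read_bytes().replace(b"\r\n", b"\n")
            if not data.strip():
                continue  # skip empty files
            # End each file with a blank line so the next "From " line
            # is seen as the start of a new message.
            out.write(data.rstrip(b"\n") + b"\n\n")
```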