C# Parsing Best Practices

I need to parse some well-known file formats, one of which is CUSCAR. I firmly believe RegEx will do the job. Any suggestions?



1 answer


I just looked at the CUSCAR spec, and I think you will end up with some pretty ugly regex code if you try to parse all of it. You can get away with regex if you only need to pull apart part of the message. You will have to test the speed, since your main bottleneck will be I/O.
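To illustrate why a small hand-written splitter can be easier to live with than one large regex, here is a minimal sketch. It assumes the file uses the default UN/EDIFACT service characters (`'` ends a segment, `+` separates data elements, `?` is the release character); a real CUSCAR file may declare different ones in a UNA header, which this sketch does not read. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class EdifactSketch
{
    // Splits text on every separator that is not preceded by the release character.
    // Escape pairs (release character + next character) are kept as-is in the output,
    // so the result can be split again on a different separator.
    // A trailing empty piece (text ending in the separator) is dropped.
    public static List<string> SplitUnescaped(string text, char separator, char release = '?')
    {
        var parts = new List<string>();
        var current = new StringBuilder();

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c == release && i + 1 < text.Length)
            {
                // Copy the release character and the character it escapes together.
                current.Append(c).Append(text[++i]);
            }
            else if (c == separator)
            {
                parts.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        if (current.Length > 0)
        {
            parts.Add(current.ToString());
        }
        return parts;
    }
}
```

Calling `SplitUnescaped(message, '\'')` yields the segments, and calling it again on one segment with `'+'` yields its data elements, with the segment tag as the first item. If only a few segment types matter, the caller can check that tag and skip the rest, which is the "take apart only part of it" case.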

I did something similar with vendor files that came from QWEST. Those beasts were hierarchical text files, and they were painful to work with. I currently create and parse text files of 4 to 50 million lines each, every day.



There is a nice framework called the FileHelpers Library. It helps you create an object-oriented representation of records (lines of text). It even has a good wizard to guide you through creating the classes that represent the records. It handles basic delimited and fixed-length formats easily.
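As a rough sketch of what that looks like in code, the example below maps a pipe-delimited line to a class and reads two records from a string. It assumes the FileHelpers NuGet package is referenced. The `VendorRecord` class, its fields, and the sample data are invented for illustration and do not correspond to CUSCAR or to any real vendor format.

```csharp
using System;
using FileHelpers; // assumes the FileHelpers NuGet package is installed

// Hypothetical record layout: one pipe-delimited line per record.
[DelimitedRecord("|")]
public class VendorRecord
{
    public string AccountId;
    public string Description;
    public decimal Amount;
}

public static class FileHelpersSketch
{
    public static void Main()
    {
        // The engine maps each line onto one VendorRecord instance.
        var engine = new FileHelperEngine<VendorRecord>();

        // Made-up sample data; ReadFile(path) does the same for a file on disk.
        VendorRecord[] records =
            engine.ReadString("A100|Line rental|12.50\nA200|Usage|3.75\n");

        foreach (var record in records)
        {
            Console.WriteLine($"{record.AccountId}: {record.Amount}");
        }
    }
}
```

`FileHelperEngine<T>` loads every record into an array, so for files with millions of lines it is probably worth looking at the library's `FileHelperAsyncEngine<T>`, which reads one record at a time.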
