Group results by value after splitting
I searched but couldn't find an answer. Disclaimer: I'm new to C#, but I have a task to create the following program: read existing log files, split each line on tabs, keep only the rows with a specific status (Process E-mail), group them by division (e.g. Investment Bank), then calculate per-division totals such as the number of conversions and write them to a new log file.
I wanted to give some background on the program before asking the question. I am currently at the point where I need to group the rows by division and cannot figure out how to do it.
EDIT: original data:
Status          Division         Time      Run Time    Zip Files  Conversions  Returned Files  Total E-Mails
Process E-mail  Investment Bank  12:00 AM  42.8596599  1          0            1               1
End Processing                   12:05 AM  44.0945784  0          0            0               0
Process E-mail  Investment Bank  12:10 AM  42.7193253  2          1            0               1
Process E-mail  Treasury         12:15 AM  4.6563394   1          0            2               2
Here is the code I have up to this point:
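// Requires: using System; using System.Collections.Generic; using System.IO;
static void Main()
{
    List<string> list = new List<string>();
    using (StreamReader reader = new StreamReader(Settings.LogPath + "2012-3-10.log"))
    {
        string line;
        int i = 0;
        while ((line = reader.ReadLine()) != null)
        {
            list.Add(line);
            i++;
            string[] split = line.Split('\t');
            string processing = split[0];
            if (processing.StartsWith("Process"))
            {
                // Columns: 1 = Division, 4 = Zip Files, 5 = Conversions,
                // 6 = Returned Files, 7 = Total E-Mails
                string division = split[1];
                int zipFiles;
                int.TryParse(split[4], out zipFiles);
                int conversions;
                int.TryParse(split[5], out conversions);
                int returnedFiles;
                int.TryParse(split[6], out returnedFiles);
                int totalEmails;
                int.TryParse(split[7], out totalEmails);

                // Print the division followed by its four counters, one per line
                Console.WriteLine(division);
                Console.WriteLine(zipFiles);
                Console.WriteLine(conversions);
                Console.WriteLine(returnedFiles);
                Console.WriteLine(totalEmails);
            }
        }
    }
}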
So the program is at the point where it prints the following to the console:
Investment Bank
1
0
1
1
Treasury
1
0
2
2
Investment Bank
2
1
0
1
What I want now is to group the rows by division (Investment Bank, Treasury, etc.) and then calculate the totals for each group.
The final log file will look like this:
Division         Zip Files  Conversions  Returned Files  Total E-mails
Investment Bank  3          1            1               2
Treasury         1          0            2               2
The following code does what you need:
// Requires: using System; using System.IO; using System.Linq;
string filename = @"D:\myfile.log";

// Column indices follow the sample data in the question:
// 1 = Division, 4 = Zip Files, 5 = Conversions, 6 = Returned Files, 7 = Total E-Mails
var statistics = File.ReadLines(filename)
    .Where(line => line.StartsWith("Process"))   // keep only "Process E-mail" rows
    .Select(line => line.Split('\t'))
    .GroupBy(items => items[1])                  // one group per division
    .Select(g =>
        new
        {
            Division = g.Key,
            ZipFiles = g.Sum(i => Int32.Parse(i[4])),
            Conversions = g.Sum(i => Int32.Parse(i[5])),
            ReturnedFiles = g.Sum(i => Int32.Parse(i[6])),
            TotalEmails = g.Sum(i => Int32.Parse(i[7]))
        });

Console.Out.WriteLine("Division\tZip Files\tConversions\tReturned Files\tTotal E-mails");
statistics
    .ToList()
    .ForEach(d => Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}",
        d.Division,
        d.ZipFiles,
        d.Conversions,
        d.ReturnedFiles,
        d.TotalEmails));
It can be even shorter (albeit less clear) if you skip the anonymous type and work with arrays instead. Let me know if you are interested in such code.
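For reference, here is a rough sketch of what such an array-based variant might look like. It is only an illustration, not a definitive implementation: the `DivisionStats` class and its `Summarize` method are hypothetical names, and the column positions (1 for Division, 4 through 7 for the four counters) are assumed from the sample data in the question. Rows with fewer than eight columns are skipped, which is also an assumption about how malformed lines should be handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DivisionStats
{
    // Hypothetical helper: turns raw tab-separated log lines into report lines.
    // Assumed layout: column 1 = Division, columns 4..7 = the four counters.
    public static List<string> Summarize(IEnumerable<string> lines)
    {
        var report = new List<string>
        {
            "Division\tZip Files\tConversions\tReturned Files\tTotal E-mails"
        };

        report.AddRange(lines
            .Where(line => line.StartsWith("Process"))   // only "Process E-mail" rows
            .Select(line => line.Split('\t'))
            .Where(cols => cols.Length >= 8)             // skip rows that are too short
            .GroupBy(cols => cols[1])                    // one group per division
            .Select(g => g.Key + "\t" + string.Join("\t",
                // Sum each counter column (4, 5, 6, 7) across the group
                Enumerable.Range(4, 4).Select(c => g.Sum(cols => int.Parse(cols[c]))))));

        return report;
    }
}

public static class Program
{
    public static void Main()
    {
        // Sample rows copied from the question, joined with tabs
        var sample = new[]
        {
            "Status\tDivision\tTime\tRun Time\tZip Files\tConversions\tReturned Files\tTotal E-Mails",
            "Process E-mail\tInvestment Bank\t12:00 AM\t42.8596599\t1\t0\t1\t1",
            "End Processing\t\t12:05 AM\t44.0945784\t0\t0\t0\t0",
            "Process E-mail\tInvestment Bank\t12:10 AM\t42.7193253\t2\t1\t0\t1",
            "Process E-mail\tTreasury\t12:15 AM\t4.6563394\t1\t0\t2\t2"
        };

        foreach (var reportLine in DivisionStats.Summarize(sample))
        {
            Console.WriteLine(reportLine);
        }
    }
}
```

To get the new log file the question asks for, the same list could be passed to `File.WriteAllLines(outputPath, DivisionStats.Summarize(File.ReadLines(filename)))`, where `outputPath` is whatever target file you choose. Note that `int.Parse` throws on non-numeric counters. If the logs may contain such values, `int.TryParse` is the safer choice.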