Group results by value after splitting
I searched but couldn't find an answer. Disclaimer: I'm new to C#, but I have a task to create the following program: read existing log files, split each line on tabs, keep only the rows with a specific status (Process E-mail), group them by division (e.g. Investment Bank), then calculate per-division statistics such as the number of e-mail conversions and write them to a new log file.
Before asking my question, here is some background on the program itself. I am currently at the point where I want to group the rows by division and cannot figure out how to do it.
EDIT: original data:
    Status          Division         Time      Run Time    Zip Files  Conversions  Returned Files  Total E-Mails
    Process E-mail  Investment Bank  12:00 AM  42.8596599  1          0            1               1
    End Processing                   12:05 AM  44.0945784  0          0            0               0
    Process E-mail  Investment Bank  12:10 AM  42.7193253  2          1            0               1
    Process E-mail  Treasury         12:15 AM  4.6563394   1          0            2               2
Here is the code I have up to this point:
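    static void Main()
    {
        List<string> list = new List<string>();
        using (StreamReader reader = new StreamReader(Settings.LogPath + "2012-3-10.log"))
        {
            string line;
            int i = 0;
            while ((line = reader.ReadLine()) != null)
            {
                list.Add(line);
                i++;
                string[] split = line.Split('\t');
                string processing = split[0];
                if (processing.StartsWith("Process"))
                {
                    // Columns: 0 Status, 1 Division, 2 Time, 3 Run Time,
                    // 4 Zip Files, 5 Conversions, 6 Returned Files, 7 Total E-Mails
                    string division = split[1];
                    int zipFiles;
                    int.TryParse(split[4], out zipFiles);
                    int conversions;
                    int.TryParse(split[5], out conversions);
                    int returnedFiles;
                    int.TryParse(split[6], out returnedFiles);
                    int totalEmails;
                    int.TryParse(split[7], out totalEmails);

                    // Print the parsed values, one per line
                    Console.WriteLine(division);
                    Console.WriteLine(zipFiles);
                    Console.WriteLine(conversions);
                    Console.WriteLine(returnedFiles);
                    Console.WriteLine(totalEmails);
                }
            }
        }
    }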
So the program is at the point where it prints the following to the console:
Investment Bank
1
0
1
1
Treasury
1
0
2
2
Investment Bank
2
1
0
1
What I want now is to group the rows by division (Investment Bank, Treasury, etc.) and then calculate the totals for each group.
The final log file will look like this:
    Division         Zip Files  Conversions  Returned Files  Total E-mails
    Investment Bank  3          1            1               2
    Treasury         1          0            2               2
The following code does what you need:
    string filename = @"D:\myfile.log";

    // In the sample data, columns 2 and 3 are Time and Run Time;
    // the four counters are in columns 4 to 7.
    var statistics = File.ReadLines(filename)
        .Where(line => line.StartsWith("Process"))
        .Select(line => line.Split('\t'))
        .GroupBy(items => items[1])
        .Select(g =>
            new
            {
                Division = g.Key,
                ZipFiles = g.Sum(i => Int32.Parse(i[4])),
                Conversions = g.Sum(i => Int32.Parse(i[5])),
                ReturnedFiles = g.Sum(i => Int32.Parse(i[6])),
                TotalEmails = g.Sum(i => Int32.Parse(i[7]))
            });

    Console.Out.WriteLine("Division\tZip Files\tConversions\tReturned Files\tTotal E-mails");
    statistics
        .ToList()
        .ForEach(d => Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}",
            d.Division,
            d.ZipFiles,
            d.Conversions,
            d.ReturnedFiles,
            d.TotalEmails));
It can be even shorter (albeit less clear) if you skip the anonymous type and work with plain arrays instead. Let me know if you are interested in such code.
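For reference, an array-based variant might look roughly like the sketch below. It is only an illustration, not tested against the real log files: it assumes the eight tab-separated columns from the question's sample data (counters in columns 4 to 7), and the `ArrayStats` class, the `Summarize` helper and the in-memory sample lines are made-up names and data for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ArrayStats
{
    // Hypothetical helper: returns one row per division, shaped as
    // [division, zipFiles, conversions, returnedFiles, totalEmails].
    public static List<string[]> Summarize(IEnumerable<string> lines)
    {
        return lines
            .Where(line => line.StartsWith("Process"))
            .Select(line => line.Split('\t'))
            .GroupBy(fields => fields[1])
            .Select(g => new[] { g.Key }
                // Assumption: columns 4..7 hold the four integer counters.
                .Concat(Enumerable.Range(4, 4)
                    .Select(col => g.Sum(fields => int.Parse(fields[col])).ToString()))
                .ToArray())
            .ToList();
    }

    public static void Main()
    {
        // Sample rows copied from the question, joined with tabs.
        var sample = new[]
        {
            "Process E-mail\tInvestment Bank\t12:00 AM\t42.8596599\t1\t0\t1\t1",
            "End Processing\t\t12:05 AM\t44.0945784\t0\t0\t0\t0",
            "Process E-mail\tInvestment Bank\t12:10 AM\t42.7193253\t2\t1\t0\t1",
            "Process E-mail\tTreasury\t12:15 AM\t4.6563394\t1\t0\t2\t2"
        };

        Console.WriteLine("Division\tZip Files\tConversions\tReturned Files\tTotal E-mails");
        foreach (var row in Summarize(sample))
            Console.WriteLine(string.Join("\t", row));
    }
}
```

To write the result to a new log file instead of the console, the rows can be joined the same way and passed to `File.WriteAllLines`. Note that `int.Parse` throws on a malformed field; `int.TryParse` is the more forgiving choice if the logs are not guaranteed to be clean.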
I would build a class to handle this, something like:

    public class xxxx
    {
        public string Division { get; set; }
        public Dictionary<string, int> something { get; set; }
    }

Then you can just collect them in a list:

    List<xxxx> Divisions;

Not sure if this is optimal, but it will work.
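To make the idea a bit more concrete, here is one way the class-based approach could be filled in. Treat it as a sketch under assumptions rather than a finished solution: `DivisionStats`, `DivisionAggregator`, `Accumulate` and the dictionary keys are hypothetical names chosen for the example, and the column positions (counters in columns 4 to 7) are taken from the sample data in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the placeholder class above.
public class DivisionStats
{
    public string Division { get; set; }
    public Dictionary<string, int> Totals { get; set; }
}

public static class DivisionAggregator
{
    // Assumed counter names, in the order they appear in columns 4..7.
    static readonly string[] Columns =
        { "Zip Files", "Conversions", "Returned Files", "Total E-mails" };

    public static List<DivisionStats> Accumulate(IEnumerable<string> lines)
    {
        var result = new List<DivisionStats>();                 // keeps first-seen order
        var byDivision = new Dictionary<string, DivisionStats>(); // fast lookup by name

        foreach (var line in lines)
        {
            var fields = line.Split('\t');
            // Skip short rows and anything that is not a "Process ..." row.
            if (fields.Length < 8 || !fields[0].StartsWith("Process"))
                continue;

            DivisionStats stats;
            if (!byDivision.TryGetValue(fields[1], out stats))
            {
                stats = new DivisionStats
                {
                    Division = fields[1],
                    Totals = Columns.ToDictionary(c => c, c => 0)
                };
                byDivision.Add(fields[1], stats);
                result.Add(stats);
            }

            for (int i = 0; i < Columns.Length; i++)
            {
                int value;
                int.TryParse(fields[4 + i], out value); // value stays 0 if parsing fails
                stats.Totals[Columns[i]] += value;
            }
        }
        return result;
    }
}
```

Each `DivisionStats` then carries the running totals for one division, so printing the final table is a matter of looping over the returned list and reading the four entries from `Totals`.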