Group rows in a DataTable by DateTime - C#
I have a LARGE DataTable (500k-1m rows). Without going into details, this is a requirement, as the end user needs/wants to see all the data. This is on a local server, so bandwidth etc. is not a concern for me.
I have a DateTime field in a DataTable that I need to group on. Let me explain what I mean by grouping... This is probably not what you think (I've looked at other questions here!).
var table = new DataTable();
table.Columns.Add("EventTime", typeof(DateTime));
table.Columns.Add("Result", typeof(String));
table.Columns.Add("ValueOne", typeof(Int32));
table.Columns.Add("ValueTwo", typeof(Int32));
table.Rows.Add("2012-02-06 12:41:45.190", "A", "7", "0");
table.Rows.Add("2012-02-06 12:45:41.190", "B", "3", "89");
table.Rows.Add("2012-02-06 12:59:41.190", "C", "1", "0");
table.Rows.Add("2012-02-06 13:41:41.190", "D", "0", "28");
table.Rows.Add("2012-02-06 17:41:41.190", "E", "0", "37");
table.Rows.Add("2012-02-07 12:41:45.190", "F", "48", "23");
I expect the above table to be grouped to get the sum of the "ValueOne" column and the average of the "ValueTwo" column. I need the grouping to be a little flexible, so that I can specify that I want to group by minutes (only the first and last rows will be grouped, the rest will just provide their values) or by days (everything but the last row will be grouped into one row), etc.
I have tried this several times, but nothing works for me. My LINQ knowledge is not great, but I thought I could do it!
Note: the DataTable is already on the machine, in a computed/view form that cannot be modified, so "Stop being an idiot, filter in SQL!!!" is a correct answer, just useless to me! :-D
Also, if you missed it in the title, I need this in C#. I'm working with .NET 4.0...
Thanks in advance if you decide to help! :-)
The other three answers are close, but as you pointed out, they group events that happened in the same second of any minute, rather than events that happened in the same actual second, which is what you want. Try the following:
var query = from r in table.Rows.Cast<DataRow>()
            let eventTime = (DateTime)r[0]
            // Rebuilding the DateTime without milliseconds truncates it to the second.
            group r by new DateTime(eventTime.Year, eventTime.Month, eventTime.Day,
                                    eventTime.Hour, eventTime.Minute, eventTime.Second)
            into g
            select new {
                g.Key,
                Sum = g.Sum(r => (int)r[2]),
                Average = g.Average(r => (int)r[3])
            };
You can customize which components you pass to the DateTime constructor to group by a different time unit. For example, pass 0 instead of eventTime.Second to group by minute.
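As a rough sketch of that idea, the truncation could be pulled into a small helper so the unit can be chosen at run time. The names TimeUnit, DateTimeTruncation and Truncate are hypothetical (they are not from the answer above), and the code sticks to syntax available in .NET 4.0:

```csharp
using System;

// Hypothetical enum naming the granularity to group by.
public enum TimeUnit { Second, Minute, Hour, Day }

public static class DateTimeTruncation
{
    // Returns t with every component smaller than 'unit' set to zero,
    // so that all values falling in the same unit share one key.
    public static DateTime Truncate(DateTime t, TimeUnit unit)
    {
        switch (unit)
        {
            case TimeUnit.Second:
                return new DateTime(t.Year, t.Month, t.Day, t.Hour, t.Minute, t.Second);
            case TimeUnit.Minute:
                return new DateTime(t.Year, t.Month, t.Day, t.Hour, t.Minute, 0);
            case TimeUnit.Hour:
                return new DateTime(t.Year, t.Month, t.Day, t.Hour, 0, 0);
            case TimeUnit.Day:
                return t.Date;
            default:
                throw new ArgumentOutOfRangeException("unit");
        }
    }
}
```

In the query above, the grouping clause would then become something like: group r by DateTimeTruncation.Truncate(eventTime, TimeUnit.Minute).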
The only thing you need to change is the property you want to group by.
// Note: Minute is the minute of the hour (0-59), so 12:41 and 13:41 share a key.
var query = from x in DataSource
            group x by x.EventTime.Minute into g
            select new
            {
                Unit = g.Key,
                SumValueOne = g.Sum(y => y.ValueOne),
                AverageValueTwo = g.Average(y => y.ValueTwo),
            };
Something like this should work:
DataTable dt = GetDataTableResults();
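// AsEnumerable() and Field<T>() need a reference to System.Data.DataSetExtensions.
var results = from row in dt.AsEnumerable()
              group row by new { EventDate = row.Field<DateTime>("EventTime").Date } into rowgroup
              select new
              {
                  EventDate = rowgroup.Key.EventDate,
                  ValueOne = rowgroup.Sum(r => r.Field<int>("ValueOne")),
                  // ValueTwo is an Int32 column, so it must be read as int; Field<decimal> would throw.
                  ValueTwo = rowgroup.Average(r => r.Field<int>("ValueTwo"))
              };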
This is what your basic code would look like:
var query = table.Rows.Cast<DataRow>()
    .GroupBy(r => ((DateTime)r[0]).Second)
    .Select(g => new
    {
        g.Key,
        Sum = g.Sum(r => (int)r[2]),
        Average = g.Average(r => (int)r[3])
    });
To add flexibility, you can have something like this:
IEnumerable<IGrouping<object, DataRow>> Group(IEnumerable<DataRow> rows, GroupType groupType)
{
    // switch case would be preferable, but you get the idea.
    if(groupType == GroupType.Minutes) return rows.GroupBy(r => ((object)((DateTime)r[0]).Minute));
    if(groupType == GroupType.Seconds) return rows.GroupBy(r => ((object)((DateTime)r[0]).Second));
    ...
}
var baseQuery = table.Rows.Cast<DataRow>();
var grouped = Group(baseQuery, groupType);
var query = grouped
    .Select(g => new
    {
        g.Key,
        Sum = g.Sum(r => (int)r[2]),
        Average = g.Average(r => (int)r[3])
    });
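For comparison, here is a minimal self-contained sketch of the same flexible idea that takes the bucketing rule as a delegate instead of an enum, run against the sample rows from the question. GroupResult, EventGrouping, Summarize and BuildSampleTable are hypothetical names introduced for illustration, and the rows are added as typed values rather than strings to avoid culture-dependent parsing:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical result type; an anonymous type would work equally well inline.
public class GroupResult
{
    public DateTime Key;
    public int Sum;
    public double Average;
}

public static class EventGrouping
{
    // Groups rows by whatever 'bucket' maps EventTime to, then sums
    // ValueOne and averages ValueTwo within each group.
    public static List<GroupResult> Summarize(DataTable table, Func<DateTime, DateTime> bucket)
    {
        return table.Rows.Cast<DataRow>()
            .GroupBy(r => bucket((DateTime)r["EventTime"]))
            .Select(g => new GroupResult
            {
                Key = g.Key,
                Sum = g.Sum(r => (int)r["ValueOne"]),
                Average = g.Average(r => (int)r["ValueTwo"])
            })
            .ToList();
    }

    // Rebuilds the sample table from the question with typed values.
    public static DataTable BuildSampleTable()
    {
        var table = new DataTable();
        table.Columns.Add("EventTime", typeof(DateTime));
        table.Columns.Add("Result", typeof(String));
        table.Columns.Add("ValueOne", typeof(Int32));
        table.Columns.Add("ValueTwo", typeof(Int32));
        table.Rows.Add(new DateTime(2012, 2, 6, 12, 41, 45, 190), "A", 7, 0);
        table.Rows.Add(new DateTime(2012, 2, 6, 12, 45, 41, 190), "B", 3, 89);
        table.Rows.Add(new DateTime(2012, 2, 6, 12, 59, 41, 190), "C", 1, 0);
        table.Rows.Add(new DateTime(2012, 2, 6, 13, 41, 41, 190), "D", 0, 28);
        table.Rows.Add(new DateTime(2012, 2, 6, 17, 41, 41, 190), "E", 0, 37);
        table.Rows.Add(new DateTime(2012, 2, 7, 12, 41, 45, 190), "F", 48, 23);
        return table;
    }

    public static void Main()
    {
        // Group by calendar day: the five rows from 6 February collapse into one group.
        foreach (var g in Summarize(BuildSampleTable(), t => t.Date))
            Console.WriteLine("{0:yyyy-MM-dd} Sum={1} Average={2}", g.Key, g.Sum, g.Average);
    }
}
```

Passing t => new DateTime(t.Year, t.Month, t.Day, t.Hour, t.Minute, 0) instead groups by minute. Because that key keeps the date, 12:41 on 6 February and 12:41 on 7 February stay in separate groups; if they should be merged, the key would have to drop the date part, which is an assumption about the intended behaviour.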