Time series segmentation
I have time series arrays, on average about 1000 values per array. I need to independently define time series segments in each array.
I couldn't find much information on standard ways to do this. The approach I'm currently using is to compute the mean gap between consecutive elements and start a new segment whenever the gap between two neighbouring elements exceeds that mean. I'm sure there are more suitable methods.
This is the code I am currently using.
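# Splits timestamps into clusters wherever the gap between
# neighbouring values exceeds the mean gap.
def time_cluster(input)
  sorted = input.sort                       # sort a copy; the caller's array is left untouched
  return [] if sorted.empty?
  # Gaps between consecutive elements.
  differences = sorted.each_cons(2).map { |a, b| b - a }
  # Array has no built-in #mean, so compute it explicitly.
  mean = differences.empty? ? 0 : differences.sum.to_f / differences.size
  clusters = [[sorted.first]]
  differences.each_with_index do |gap, i|
    clusters << [] if gap > mean            # a large gap starts a new cluster
    clusters.last << sorted[i + 1]
  end
  clusters
end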
A sample call to this code
time_cluster([1, 2, 3, 4, 7, 9, 250, 254, 258, 270, 292, 340, 345, 349, 371, 375, 382, 405, 407, 409, 520, 527])
Outputs
1 2 3 4 7 9, sparsity 1.3
250 254 258 270 292, sparsity 8.4
340 345 349 371 375 382 405 407 409, sparsity 7
520 527, sparsity 3
Another array
time_cluster([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1000, 1020, 1040, 1060, 1080, 1200])
Outputs
1 2 3 4 5 6 7 8 9 10, sparsity 0.9
1000 1020 1040 1060 1080, sparsity 16
1200
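The sparsity figures in the two outputs are not produced by time_cluster itself, and the question does not say how they were computed. As a rough sketch only, one reading is the span of a cluster divided by its element count. This matches 8.4, 0.9 and 16 exactly and the remaining values only approximately. The helper name `sparsity` and the choice to return nil for a lone value are assumptions, not part of the original code.

```ruby
# Hypothetical helper: "sparsity" is never defined in the question.
# Assumed definition: (largest value - smallest value) / number of elements.
def sparsity(cluster)
  return nil if cluster.size < 2   # assumption: a lone value such as [1200] has no span
  (cluster.max - cluster.min).to_f / cluster.size
end

puts sparsity([250, 254, 258, 270, 292])          # prints 8.4
puts sparsity([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])    # prints 0.9
puts sparsity([1000, 1020, 1040, 1060, 1080])     # prints 16.0
```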
2 answers
Use K-means. http://ai4r.rubyforge.org/machineLearning.html
gem install ai4r
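To see what k-means does with one-dimensional timestamps before pulling in a gem, here is a minimal plain-Ruby sketch of Lloyd's algorithm. It does not use the ai4r API. The name `kmeans_1d`, the `max_iterations` parameter and the deterministic choice of starting centroids (k evenly spaced elements of the sorted input) are all assumptions made for illustration.

```ruby
# Minimal 1-D k-means (Lloyd's algorithm). Names and defaults are illustrative only.
def kmeans_1d(values, k, max_iterations: 100)
  sorted = values.sort
  raise ArgumentError, "k must be between 1 and #{sorted.size}" unless k.between?(1, sorted.size)

  # Assumed deterministic start: k evenly spaced elements of the sorted data.
  centroids = Array.new(k) { |i| sorted[(i * (sorted.size - 1)) / [k - 1, 1].max].to_f }

  clusters = []
  max_iterations.times do
    # Assignment step: each value joins its nearest centroid.
    clusters = Array.new(k) { [] }
    sorted.each do |v|
      nearest = centroids.each_index.min_by { |c| (v - centroids[c]).abs }
      clusters[nearest] << v
    end

    # Update step: move each centroid to the mean of its cluster
    # (an empty cluster keeps its previous centroid).
    updated = clusters.each_with_index.map do |cluster, c|
      cluster.empty? ? centroids[c] : cluster.sum.to_f / cluster.size
    end
    break if updated == centroids   # converged
    centroids = updated
  end

  clusters.reject(&:empty?)
end

sample = [1, 2, 3, 4, 7, 9, 250, 254, 258, 270, 292, 340, 345, 349,
          371, 375, 382, 405, 407, 409, 520, 527]
kmeans_1d(sample, 4).each { |cluster| puts cluster.join(" ") }
```

With k = 4 on the first array from the question, this start converges to the same four groups the gap-based code produced. Unlike the mean-gap method, k has to be chosen in advance, and the result depends on the starting centroids, so on other inputs the two methods can disagree.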
Singular value decomposition may also interest you. http://www.igvita.com/2007/01/15/svd-recommendation-system-in-ruby/
If you can't do it in Ruby, here's a great example in Python.