Time series segmentation

I have time series arrays, averaging about 1000 values per array. I need to identify time series segments in each array independently.

I couldn't find much information on standard ways to do this. My current approach is to compute the mean gap between consecutive elements and start a new segment wherever the gap between two neighboring elements exceeds that mean. I'm sure there are more suitable methods.

This is the code I am currently using.

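def time_cluster(input)
  return [] if input.empty?

  # Sort a copy so the caller's array is not mutated.
  sorted = input.sort
  # Gaps between consecutive values.
  differences = sorted.each_cons(2).map { |a, b| b - a }
  return [sorted] if differences.empty?

  # Core Array has no #mean, so compute it explicitly.
  mean = differences.sum.to_f / differences.size

  # Start a new cluster whenever a gap exceeds the mean gap.
  clusters = [[sorted.first]]
  differences.each_with_index do |gap, i|
    clusters << [] if gap > mean
    clusters.last << sorted[i + 1]
  end

  clusters
end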

      

A couple of samples from this code:

time_cluster([1, 2, 3, 4, 7, 9, 250, 254, 258, 270, 292, 340, 345, 349, 371, 375, 382, 405, 407, 409, 520, 527])

      

Outputs

1  2  3  4  7  9, sparsity 1.3
250  254  258  270  292,  sparsity 8.4
340  345  349  371  375  382  405  407  409, sparsity 7
520  527, sparsity 3

      

Another array

time_cluster([1, 2, 3, 4 , 5, 6, 7, 8, 9, 10, 1000, 1020, 1040, 1060, 1080, 1200])

      

Outputs

1  2  3  4  5  6  7  8  9  10, sparsity 0.9
1000  1020  1040  1060  1080, sparsity 16
1200

      


2 answers


Use k-means. http://ai4r.rubyforge.org/machineLearning.html

gem install ai4r

      

Singular value decomposition may also interest you. http://www.igvita.com/2007/01/15/svd-recommendation-system-in-ruby/



If you can't do it in Ruby, here's a great example in Python.

Unsupervised clustering with an unknown number of clusters


You can try clustering algorithms (like k-means).
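
For illustration, here is a minimal sketch of 1-D k-means (Lloyd's algorithm) in plain Ruby, assuming you already know the number of clusters k. The name kmeans_1d, its parameters, and the evenly spaced starting centroids are my own illustrative choices and do not come from any library.

```ruby
# Minimal 1-D k-means (Lloyd's algorithm). Illustrative sketch only:
# kmeans_1d and its parameters are hypothetical names, not a library API.
def kmeans_1d(values, k, max_iterations = 100)
  sorted = values.sort
  # Deterministic start (an assumption of this sketch): take centroids at
  # evenly spaced positions of the sorted data.
  centroids = Array.new(k) do |c|
    sorted[(c * (sorted.size - 1)) / [k - 1, 1].max].to_f
  end

  # Index of the centroid closest to value v.
  nearest = ->(v) { centroids.each_index.min_by { |c| (v - centroids[c]).abs } }

  max_iterations.times do
    # Assignment step: group each value under its nearest centroid.
    groups = sorted.group_by { |v| nearest.call(v) }
    # Update step: move each centroid to the mean of its group
    # (an empty group keeps its old centroid).
    updated = centroids.each_with_index.map do |old, c|
      members = groups[c]
      members ? members.sum.to_f / members.size : old
    end
    break if updated == centroids
    centroids = updated
  end

  # Return the clusters ordered by centroid index.
  sorted.group_by { |v| nearest.call(v) }.sort_by(&:first).map(&:last)
end

clusters = kmeans_1d([1, 2, 3, 4, 7, 9, 250, 254, 258, 270, 292, 340, 345, 349,
                      371, 375, 382, 405, 407, 409, 520, 527], 4)
```

With k = 4 this gives the same four groups as the first sample in the question. The result depends on k and on the starting centroids, so other inputs may settle on a less natural split.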

Some links:


