How to split a sequence by increasing subsequence in Clojure?
I have a sequence of integers and I would like to split them into increasing segments and I want to have as few segments as possible. So I want to have
(segmentize [1 2 3 4 3 8 9 1 7] <=)
;=> [[1 2 3 4][3 8 9][1 7]]
I have implemented sharding as follows:
(defn segmentize [col lte]
(loop [col col s [] res []]
(cond (empty? col) (conj res s)
(empty? s) (recur (rest col) (conj s (first col)) res)
(lte (last s) (first col)) (recur (rest col) (conj s (first col)) res)
:else (recur col [] (conj res s)))))
But I was wondering if there is some convenient clojure function that does exactly this, or if there is a more idiomatic way to do it.
source to share
You can build this one withpartition-by
(defn segmentize [cmp coll]
(let [switch (reductions = true (map cmp coll (rest coll)))]
(map (partial map first) (partition-by second (map list coll switch)))))
(segmentize <= [1 2 3 4 3 8 9 1 7])
;=> ((1 2 3 4) (3 8 9) (1 7))
The first two map
last lines can be changed to mapv
if you really want vectors and not lazy sequences.
source to share
Another lazy implementation. Basically figure out how many consecutive pairs of numbers the "lte" (take-while + segment) function returns true, and then splits the original collection by that number. Repeat with collection of reminders:
(defn segmentize
[coll lte]
(lazy-seq
(when-let [s (seq coll)]
(let [pairs-in-segment (take-while (fn [[a b]] (lte a b)) (partition 2 1 s))
[segment reminder] (split-at (inc (count pairs-in-segment)) s)]
(cons segment
(segmentize reminder lte))))))
source to share
This is a special case of some sequence processing functions in org.flatland/useful
, in particular flatland.useful.seq/partition-between
:
(partition-between (partial apply >) xs)
If you need an implementation from scratch, with no external dependencies, I would prefer dAni's answer.
source to share
Because everyone loves lazy sequences:
(defn segmentize [coll cmp]
(if-let [c (seq coll)]
(lazy-seq
(let [[seg rem] (reduce (fn [[head tail] x]
(if (cmp (last head) x)
[(conj head x) (next tail)]
(reduced [head tail])))
[(vec (take 1 c)) (drop 1 c)]
(drop 1 c))]
(cons seg (segmentize rem cmp))))))
The code for calculating each segment could probably be made a little less verbose using a / recur loop, but I find it reduce
more readable more often than not.
source to share