Why is Clojure spending so much time on clojure.lang.Iterate.first?

Question

Why is Clojure spending so much time on clojure.lang.Iterate.first?

I really love the story of Frank Nelson Cole , who in 1903 demonstrated the first factorization of 2 ^ 67 - 1 into the famous "Lecture Without Words". These days factorization can be easily found using the following naive algorithm:

(def mersenne67 (dec (expt 2 67)))

(->> (iterate inc 2)
     (filter #(zero? (rem mersenne67 %)))
     (first))

However, I noticed that this Clojure code is about twice as long as the equivalent Java or Kotlin code. (~ 40 vs ~ 20 seconds on my machine)
Here's Java I compared it to:

  public static BigInteger mersenne() {
    BigInteger mersenne67 = 
      BigInteger.valueOf(2).pow(67).subtract(BigInteger.ONE);

    return Stream.iterate(BigInteger.valueOf(2), (x -> x.add(BigInteger.ONE)))
      .filter(x -> mersenne67.remainder(x).equals(BigInteger.ZERO))
      .findFirst()
      .get();
  }

Rewriting Clojure's code at a lower level didn't matter:

(def mersenne67 (-> (BigInteger/valueOf 2)
                (.pow (BigInteger/valueOf 67))
                (.subtract BigInteger/ONE)))

(->> (iterate #(.add ^BigInteger % BigInteger/ONE) (BigInteger/valueOf 2))
     (filter #(= BigInteger/ZERO (.remainder ^BigInteger mersenne67 %)))
     (first))

After profiling the code with VisualVM, the main suspect seems to clojure.lang.Iterate.first()

be that it almost accurately determines the difference in the duration of these functions. The Java equivalent java.util.stream.ReferencePipeline.findFirst()

only works on a fractional basis (~ 22 vs ~ 2 seconds). This leads to my question: How can Java (and Kotlin) get away with spending much less time on this task?

+3

java performance clojure

Sebastian Oberhoff June 13. 17 at 20:05

source to share

2 answers

Your problem is that you are inefficiently iterating through iterate

. This is why you see first

at the top when profiling. This is, of course, the result of all the basic clojure functions dealing with many different data structures.

The best way to avoid this is to use reduce

which gives the object the task of calling the function in a loop.

So, this is about as fast:

(reduce
      (fn [_ x]
        (when (identical? BigInteger/ZERO (.remainder ^BigInteger mersenne67 x))
          (reduced x)))
      nil
      (iterate #(.add ^BigInteger % BigInteger/ONE) (BigInteger/valueOf 2)))

+2

ClojureMostly June 14. 17 at 17:05

source to share

Bojan Horvat · Accepted Answer · 2017-09-07T17:47:33+0000

I apologize in advance for the burial, but I'm a little worried about ClojureMostly's answer. This certainly solves the problem in a timely manner, but it looks like a dirty hack to me: passing in an anonymous reduction function that ignores the current result (_) and ends as soon as the first factor is found (reduced).

How to use transducers and transducer function:

(defn first-factor-tr 
  [^BigInteger n] 
  (transduce
    (comp (filter #(identical? BigInteger/ZERO (.remainder ^BigInteger n %))) (take 1))
    conj nil
    (iterate #(.add ^BigInteger % BigInteger/ONE) (BigInteger/valueOf 2))))

Filter all values in the collection with a remainder of zero (filter ...) and take one first (take ...).

The execution time of this solution is on par with the reduction time:

(defn first-factor-reduce 
  [^BigInteger n] 
  (reduce
    (fn [_ x]
      (when (identical? BigInteger/ZERO (.remainder ^BigInteger n x))
            (reduced x)))
    nil
    (iterate #(.add ^BigInteger % BigInteger/ONE) (BigInteger/valueOf 2))))

=> (time (first-factor-tr mersenne67))
"Elapsed time: 20896.594178 msecs"
(193707721)
=> (time (first-factor-reduce mersenne67))
"Elapsed time: 20424.323894 msecs"
193707721

Why is Clojure spending so much time on clojure.lang.Iterate.first?

More articles: