Clojure commute and alter performance

Here is a slightly modified version of the example from clojure.org/refs:

(defn mod-nth [v i f] (assoc v i (f (v i))))
(defn run [oper nvecs nitems nthreads niters]
  (let [vec-refs (vec (map (comp ref vec)
                        (partition nitems (repeat (* nvecs nitems) 0))))
        sum  #(reduce + %)
        swap #(let [v1 (rand-int nvecs)
                    v2 (rand-int nvecs)
                    i1 (rand-int nitems)
                    i2 (rand-int nitems)]
                (dosync
                  (let [temp (nth @(vec-refs v1) i1)]
                    (oper (vec-refs v1) mod-nth i1 inc)
                    (oper (vec-refs v2) mod-nth i2 dec))))
        report #(do
                  (prn (map deref vec-refs))
                  (println "Sum:"
                    (reduce + (map (comp sum deref) vec-refs))))]
    (report)
    (dorun (apply pcalls (repeat nthreads #(dotimes [_ niters] (swap)))))
    (report)))

(time (run alter 100 10 10 100000))

      

Sample output

([0 0 0 0 0 0 0 0 0 0] [...])
Sum: 0
([15 -14 -8 57 -26 -12 -49 -29 33 -3] [...])
Sum: 0
"Elapsed time: 1995.938147 msecs"

      

Instead of swapping unique numbers, I move a unit from one vector element to another (one element is incremented, another is decremented).

This operation can be treated as commutative, so there is a second test that uses commute instead of alter:

(time (run commute 100 10 10 100000))

      

with sample output like

([0 0 0 0 0 0 0 0 0 0] [...])
Sum: 0
([8 48 -10 -41 -17 -32 -4 50 -31 88] [...])
Sum: 0
"Elapsed time: 3141.591517 msecs"
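As a quick check of the commutativity claim, the sketch below applies the increment and the decrement in both orders and compares the results. It reuses mod-nth from the code above; the names start, inc-then-dec and dec-then-inc are made up for this illustration.

```clojure
;; mod-nth as defined in the question: apply f to element i of vector v.
(defn mod-nth [v i f] (assoc v i (f (v i))))

;; Hypothetical starting vector, used only for this illustration.
(def start [0 0 0])

;; Apply the two updates in both possible orders.
(def inc-then-dec (-> start (mod-nth 0 inc) (mod-nth 1 dec)))
(def dec-then-inc (-> start (mod-nth 1 dec) (mod-nth 0 inc)))

(println inc-then-dec)                  ; prints [1 -1 0]
(println (= inc-then-dec dec-then-inc)) ; prints true
```

Because the final vector is the same in either order, the STM is free to reapply these functions at commit time against whatever value the ref holds by then.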

      

Surprisingly, the first example runs in roughly 2 seconds and the second in roughly 3 seconds.

But as mentioned in this SO answer, commute is "an optimized version of alter for those times when the order of things really doesn't matter".

How can it be optimized if it takes longer to do the same job in this simple case? What is the purpose of commute?


2 answers


I used VisualVM to monitor the clojure.core functions involved when running the example, with both alter and commute.

[Screenshots: VisualVM profiler results for the alter run and for the commute run]



If my interpretation of the results is correct, the accumulated time taken by each function shows that commute is actually faster than alter. It seems that the overhead of all the other operations needed to run the code in parallel is what hurts performance.

Benchmarking code is quite complex, and using time is sometimes misleading. The information provided by VisualVM might not be the final word either; profiling and using tools like criterium is probably the best way to get trustworthy results.
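As a rough illustration of why a single call to time can mislead, here is a minimal sketch of a hand-rolled timing helper that warms up the JVM first and then reports the spread over several runs. The names elapsed-ms and bench and the :warmup and :runs options are invented for this sketch; this is not criterium's API, and criterium does considerably more (for example, statistical analysis and GC handling).

```clojure
(defn elapsed-ms
  "Runs (f) once and returns the wall-clock time in milliseconds."
  [f]
  (let [start (System/nanoTime)]
    (f)
    (/ (- (System/nanoTime) start) 1e6)))

(defn bench
  "Calls f `warmup` times without timing it, then times it `runs` times
  and returns the minimum, median and maximum in milliseconds."
  [f & {:keys [warmup runs] :or {warmup 3 runs 7}}]
  (dotimes [_ warmup] (f))                ; let the JIT compile hot paths first
  (let [samples (vec (sort (repeatedly runs #(elapsed-ms f))))]
    {:min    (first samples)
     :median (nth samples (quot runs 2))
     :max    (peek samples)}))

;; Example with a trivial workload:
(println (bench #(reduce + (range 100000)) :warmup 2 :runs 5))
```

The same helper could wrap the benchmark above, for example (bench #(run alter 100 10 10 100000)), bearing in mind that run also prints its reports on every call.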

Another important fact is that the operations performed inside the dosync block do not take long, so even if one of them retries, the extra time is not significant. Adding a small delay inside dosync makes the difference between retrying (alter) and not retrying (commute) more noticeable:

(defn mod-nth [v i f] (assoc v i (f (v i))))
(defn run [oper nvecs nitems nthreads niters]
  (let [vec-refs (vec (map (comp ref vec)
                        (partition nitems (repeat (* nvecs nitems) 0))))
        sum  #(reduce + %)
        swap #(let [v1 (rand-int nvecs)
                    v2 (rand-int nvecs)
                    i1 (rand-int nitems)
                    i2 (rand-int nitems)]
               (dosync
                 (let [temp (nth @(vec-refs v1) i1)]
                   (Thread/sleep 1)                     ;; This was added
                   (oper (vec-refs v1) mod-nth i1 inc)
                   (oper (vec-refs v2) mod-nth i2 dec))))
        report #(do
                  (prn (map deref vec-refs))
                  (println "Sum:"
                    (reduce + (map (comp sum deref) vec-refs))))]
    (doall (apply pcalls (repeat nthreads #(dotimes [_ niters] 
                                            (swap)))))))

(time (run alter 100 10 10 5000))
;= "Elapsed time: 15252.427 msecs"
(time (run commute 100 10 10 5000))
;= "Elapsed time: 13595.399 msecs"

      



It is important to understand what optimization commute performs, specifically: commute lets you avoid unnecessarily re-running the code in your dosync block in situations where alter would have to discard its results and retry.

The constant-factor overhead between the implementations of commute and alter is not specified, so what you see here does not violate any part of the Clojure specification. However, as the amount of time spent by individual transactions within your dosync block grows, the penalty for using alter when you could have used commute will grow in a similar way.
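The retry behaviour can be made visible with a minimal sketch that counts how many times the transaction body actually runs. It assumes a single heavily contended ref; the helper count-attempts and its result keys are hypothetical names introduced for this illustration. The counter is an atom because changes to atoms are not rolled back when a transaction retries, so it records every attempt.

```clojure
(defn count-attempts
  "Runs nthreads tasks, each performing niters transactions that update one
  shared ref with oper (alter or commute). Returns the final ref value, the
  number of times the transaction body ran, and the number of transactions."
  [oper nthreads niters]
  (let [r        (ref 0)
        attempts (atom 0)
        task     #(dotimes [_ niters]
                    (dosync
                      (swap! attempts inc) ; side effect: repeated on every retry
                      (Thread/sleep 1)     ; widen the window for conflicts
                      (oper r inc)))]
    (dorun (apply pcalls (repeat nthreads task)))
    {:value        @r
     :attempts     @attempts
     :transactions (* nthreads niters)}))

(println "alter:  " (count-attempts alter 4 10))
(println "commute:" (count-attempts commute 4 10))
```

With alter, :attempts is usually well above :transactions under contention, because conflicting bodies are discarded and rerun. With commute, it typically stays at or very near :transactions, since only the commuted function is reapplied at commit time. Exact numbers vary by machine and run.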



Generally:

  • Microbenchmarks are evil (in the sense that they reward bad practices that don't scale for real world use). Note the performance behavior in real-world scenarios, not contrived test cases.
  • Use commute with Clojure STM whenever you can.