Lazy seqs and heap exhaustion


#1

Hi,

While the code below solves the Euler problem #1 I was curious to explore the results for much higher limits. At 10,000,000 I run into heap exhaustion. in fn multiples-of-3-and-5-below-n everything is lazy seqs until the reduce, so I don’t understand why there would be a heap issue at all. I would understand an integer overflow, but not a heap issue.

Any ideas?

Thanks,

—Ralph

project-euler.core=> (multiples-of-3-or-5-below-n 10000000)
project-euler.core=> Exception in thread "nREPL-worker-8" java.lang.OutOfMemoryError: Java heap space


(ns project-euler.core)

(def natural-numbers
 (iterate inc 1))

(defn multiple-of-n?
 [n number]
 (zero? (rem number n)))

(defn multiple-of-3-or-5?
 [number]
 (or (multiple-of-n? 3 number)
     (multiple-of-n? 5 number)))

(defn multiples-of-3-and-5-below-n
 [n]
 (->> natural-numbers
      (take (dec n))
      (filter multiple-of-3-or-5?)
      (reduce +)))


(defn -main
 []
 (spit "answers.txt"
       (str
         (str "Problem 001: " (multiples-of-3-and-5-below-n 1000) "\n"))))

#2

Hi Ralph!

This is a great question. Your code is really nice and clear. It was a pleasure to read and I think it’s essentially correct.

Your solution is totally lazy, which is great. The problem with running out of memory is because you are defining a lazy seq and “holding onto the head”, so it cannot be garbage collected. This is one issue with infinite or very large lazy data structures.

You define natural-numbers at the top level. It’s indeed lazy, but once it has been realized, it will stay in memory because it is pointed to by a var. Because it’s potentially infinite, it will eat up all memory if it can’t be collected.

My recommendation is to generate the natural numbers each time multiples-of-3-and-5-below-n is called. So you can either inline the iterate expression or turn natural-numbers into a function that returns a new seq of numbers each time.

This is a great lesson and one of the reasons I like the Project Euler problems. They really push the limits of these abstractions.

Rock on!
Eric


#3

Thanks Eric. I changed natural-numbers to a fn and had no issues calculating (multiples-of-3-and-5-below-n 1000000000), except for waiting a while for the result.

Great answer. This helps a lot.

–Ralph


#4

Lazy seqs are great for this. If it takes too long, I would suggest trying it with a loop/recur instead of a lazy seq to avoid creating the list in the first place. A lot of the other bits you’ve made will still be helpful.


#5

I get 10x speed improvement with the loop/recur approach. But the code is 10x less readable as well:

(defn multiples-of-3-and-5-below-n-loop
  "loop/recur instead of laze seqs"
  [n]
  (loop [acc 0
         i 1]
    (if (>= i n)
        acc
        (recur (if (multiple-of-3-or-5? i)
                 (+ acc i)
                 acc)
               (inc i)))))

; project-euler.core=> (time (multiples-of-3-and-5-below-n-loop 1000000000))
; "Elapsed time: 25984.120406 msecs"
; 233333333166666668
; project-euler.core=> (time (multiples-of-3-and-5-below-n 1000000000))
; "Elapsed time: 277033.267078 msecs"
; 233333333166666668

#6

BTW, the transduce version is almost as quick as the loop/recur version. Readability is almost as good:

(defn multiples-of-3-and-5-below-n-transduce
  [n]
  (let [d (comp (filter multiple-of-3-or-5?)
                (take-while #(< % n)))]
    (->> (natural-numbers)
         (transduce d +))))

; project-euler.core=> (time (multiples-of-3-and-5-below-n-transduce 1000000000))
; "Elapsed time: 36757.548489 msecs"
; 233333333166666668