You’re asking very detailed and pointed questions. I think we’ll have to go deep into the source for this one. Let’s follow the trail through the code and see where it leads us.
In the code for clojure.core/future, we see the docstring:
Takes a body of expressions and yields a future object that will
invoke the body in another thread, and will cache the result and
return it on all subsequent calls to deref/@. If the computation has
not yet finished, calls to deref/@ will block, unless the variant of
deref with timeout is used. See also - realized?.
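The behaviors the docstring describes are easy to see at a REPL. Here’s a small sketch (the sleep times and names are arbitrary):

```clojure
;; The body runs on another thread; deref blocks until it finishes,
;; then the result is cached for all later derefs.
(def f (future (Thread/sleep 50) (+ 40 2)))

@f  ;; blocks briefly the first time, then returns 42
@f  ;; returns the cached 42 immediately

;; The timeout variant of deref returns a fallback value instead of
;; blocking indefinitely.
(deref (future (Thread/sleep 10000) :done) 100 :timed-out)
;; => :timed-out

(realized? f)  ;; => true, since the body has finished
```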
You said that there’s “no guarantee that the task will start running before the call to future returns”. True, but for the task to run at all, it eventually has to end up on some thread’s call stack. Let’s trace the whole path and see how it gets there.
We see that future is a macro that expands to a call to clojure.core/future-call. Here are the relevant lines of its source:
```clojure
(let [f (binding-conveyor-fn f)
      fut (.submit clojure.lang.Agent/soloExecutor ^Callable f)]
```
clojure.core/binding-conveyor-fn is not something you use every day. It wraps f in a new function that captures the calling thread’s dynamic bindings and installs them on whichever thread eventually runs f. Vars have a root value and possibly a per-thread value, per their thread-safe semantics; this wrapper is what “conveys” the caller’s per-thread bindings over to the future’s thread. That detail isn’t central to the question, though. With f wrapped,
(.submit clojure.lang.Agent/soloExecutor f) is called.
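The effect of binding conveyance is easy to see with a dynamic var (a quick sketch; *greeting* is a made-up var):

```clojure
(def ^:dynamic *greeting* "hello")  ;; root value

;; The future's body runs on a pool thread, yet it sees the
;; per-thread binding established by the calling thread.
(binding [*greeting* "bonjour"]
  @(future *greeting*))
;; => "bonjour"
```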
clojure.lang.Agent/soloExecutor is a standard java.util.concurrent.ExecutorService, specifically an unbounded cached thread pool: it reuses an idle thread when one is available and spins up a new one when none is.
.submit returns a java.util.concurrent.Future, which Clojure then wraps to give it deref semantics. So that’s the end of the journey: the body of your future is wrapped up a little in some utility functions and handed to an ExecutorService that’s part of the Clojure runtime. The ExecutorService puts the task on a queue, and a pool thread eventually picks it up and runs it. That gap between submission and pickup is exactly why there’s no guarantee the task has started by the time future returns.
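Stripped of the binding machinery, the hand-off is ordinary java.util.concurrent interop. Here’s a sketch using a freshly created pool rather than the real soloExecutor:

```clojure
(import '[java.util.concurrent Executors ExecutorService Future])

(let [exec (Executors/newCachedThreadPool)
      ;; .submit queues the Callable; it may or may not have started
      ;; running by the time .submit returns. Clojure functions
      ;; implement Callable, so a plain fn works here.
      fut  (.submit ^ExecutorService exec ^Callable (fn [] (+ 1 2)))
      ;; .get blocks until a pool thread has run the task.
      result (.get ^Future fut)]
  (.shutdown exec)  ;; release the pool threads
  result)
;; => 3
```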
That was a very detailed question, and it touches machinery most Clojure programmers never deal with directly, if at all. But I appreciate the curiosity. Clojure’s concurrency and thread-parallelism story is very solid: the assurance that the task will run at some point is handled completely by the runtime.
core.async does have its own thread pool that actually runs the go blocks. Yes, you need to worry about infinite loops in go blocks. You also need to worry about deadlock. Luckily, go blocks tend to be small and easy to understand.
Now, that’s not to say you never want an infinite loop in a go block. If you’re waiting on a channel inside an infinite loop, that go block will be “parked” while it waits and won’t be blocking any threads. Infinite loops are fine as long as you take from or put to channels inside the loop. In fact, infinite go-loops are very common for implementing local event loops.
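A minimal local event loop along those lines might look like this (a sketch; the channel name and handler are made up):

```clojure
(require '[clojure.core.async :as async :refer [go-loop chan <! >!! close!]])

(def events (chan))

;; Loops forever, but parks (rather than blocks) at <! whenever
;; the channel is empty, so no pool thread is tied up.
(go-loop []
  (when-let [e (<! events)]
    (println "handling" e)
    (recur)))

(>!! events :click)   ;; the loop wakes, handles :click, parks again
(close! events)       ;; <! returns nil, when-let fails, the loop ends
```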
Because you’ve been asking for lots of gory details, I’ll go deeper. Go blocks are essentially callback functions that are parked in the channels they are waiting on. So when the channels are out of scope and the go block is parked, the go block can be collected along with the channel. That’s a very nice feature because you almost never have to worry about ending a go block like you do in the Go language. Channels and go blocks are just objects and the GC takes care of everything.
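Concretely, that means a go block parked on an unreachable channel simply becomes garbage (a sketch):

```clojure
(require '[clojure.core.async :as async])

;; This go block parks on a take that can never complete.
(let [c (async/chan)]
  (async/go (async/<! c)))
;; Once c is unreachable, the references to the parked go block live
;; only in the channel itself, so the channel and the go block can be
;; garbage-collected together. No explicit shutdown or cancellation
;; is needed, unlike goroutines in the Go language.
```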
Watch this talk by Rich Hickey if you like these details.