I have a fairly general question: in practice, when are Clojure’s concurrency primitives useful as opposed to a database? In what kinds of applications and contexts does an app need to correctly handle concurrent access to data, but not also need to persist that data to disk?
I’m asking this because I’m noticing a mistake I’ve made twice, and trying to reflect on the lesson of it. What I’ve done is, first, carefully implement a web app that uses Clojure’s own collections to manage data, and therefore uses Clojure’s concurrency reference types to manage multithreaded access. Then, second, I realize that I really want the app to persist the data, so that it will survive process exits, server reboots, etc.
At this point, I realize that I need a database instead of in-memory data structures. And then I see that the care I’ve taken in using Clojure’s concurrency primitives, like the transform function defined to work with `swap!`, or the code in a `dosync` block, was wasted work, because if I’m moving the data into a database, then I will need to let the database manage the synchronization of access. So I will need to rewrite my Clojure transform function as a Postgres stored procedure, or a Datomic database function, or whatever.
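To make that concrete, here is a minimal sketch of the kind of code I mean. The names (`add-item`, `cart`, `transfer!`, `checking`, `savings`) and the data shapes are hypothetical, made up for illustration rather than taken from a real app.

```clojure
;; Hypothetical example: all names and data shapes are illustrative only.

;; A pure transform function, written to be passed to swap!.
(defn add-item
  "Returns a new cart map with qty more of item."
  [cart item qty]
  (update cart item (fnil + 0) qty))

;; Atom: uncoordinated, synchronous updates to a single piece of state.
(def cart (atom {}))

(swap! cart add-item :apple 2)
(swap! cart add-item :apple 1)

;; Refs: coordinated updates to several pieces of state in one transaction.
(def checking (ref 100))
(def savings  (ref 0))

(defn transfer!
  "Moves amount from one ref to the other inside a single transaction."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! checking savings 30)
```

Both `swap!` and `dosync` only coordinate threads inside one JVM process, which is why this is the part that ends up being redone once the data lives in a database.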
Given that I’ve made this error twice, one lesson I take from it is, I suppose, fairly obvious in retrospect: if it’s remotely likely I’m going to want to make the storage durable in the future, then I should just architect to use a database from the beginning. I could even use an in-memory database in the beginning and then configure it to hit the disk later, only when that’s relevant. It occurs to me, however, that taking this lesson to heart amounts to doing away with a significant part of what makes Clojure distinctive and basically adopting the strategy that apps in other languages use: punt on data synchronization issues by saving everything in a database and relying on the database to handle them.
So that’s why I’m wondering, just as a practical matter, what kinds of apps, or what parts of apps, most often have requirements to handle data concurrently but not persist it?
I’d love to hear about people’s experiences on this.