# coroutines
r
Given the following example, how do I ensure `expensiveLookup` only executes once (for a given id) and subsequent executions receive the cached value?
suspend fun expensiveLookup(id: String): ExpensiveResult {
  val cached = cache.get(id)
  if (cached != null) {
    return cached
  }

  val result = // logic here
  
  cache.put(id, result)
  return result
}
This blog post from Doordash in 2018 has an interesting solution. Is there a 'better' way of doing this now? https://doordash.engineering/2018/08/03/avoiding-cache-stampede-at-doordash/
m
What I did recently was store a `Deferred<T>` in the cache, and then have multiple calls to lookup wait for the same response. It has some complications:
1. The owner of the cache needs its own scope so that it can call `async`. This is what keeps the deferred running until all the callers are no longer using it.
2. I needed to implement some ref counting, so I could cancel the work if all the calls to `expensiveLookup` were canceled.
There are probably other ways to solve the above issues, but having a scope owned by the cache worked for me. Another thought would be for the first caller creating the `Deferred` to do the work and, if it gets canceled, to signal that in the `Deferred` so another caller can try to start the work.
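Roughly what that looked like, as a sketch (the `DeferredCache` name and shape are invented here, and the ref counting is only hinted at in a comment):
```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Deferred
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.async
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock

class DeferredCache(private val load: suspend (String) -> ExpensiveResult) {
    // The cache owns this scope, so a Deferred keeps running even if the caller
    // that created it goes away.
    private val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)

    private val mutex = Mutex()
    private val inFlight = mutableMapOf<String, Deferred<ExpensiveResult>>()

    suspend fun get(id: String): ExpensiveResult {
        val deferred = mutex.withLock {
            inFlight.getOrPut(id) { scope.async { load(id) } }
        }
        // Ref counting would go here: track waiters per id and cancel the deferred
        // (and remove it from the map) when the last waiter is canceled.
        return deferred.await()
    }
}
```
One caveat with this shape: a failed lookup stays in the map, so later callers see the same exception unless you remove the entry.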
t
a simple `Mutex.withLock` around the function (`suspend fun expensiveLookup(id: String): ExpensiveResult = myCacheMutex.withLock {`) will achieve the same thing. Using a map will give you a little more control over the locking, but `computeIfAbsent` will need some kind of lock mechanism (`ConcurrentHashMap` just uses `synchronized` for this method, so it will be no faster than `Mutex`, and with the deferred overhead (which also uses locks, I suspect) probably slower). They are using the wrong pattern to avoid a "cache miss storm", however, as they say:
> and the remaining coroutines should wait on the returned `Deferred` rather than spinning up their own read
While this fixes one half of the cache storm (exhausting your upstream layer with many requests), now suddenly, for a short while, all your requests that were blindingly fast before are almost as slow as the one request refreshing the cache.

The real solution is to always keep serving the cached content and to update it asynchronously when it gets out of date. Ironically, this is super easy to implement: always serve the cached value using lock-free reads; only then check whether it's out of date (this can be done probabilistically, i.e. not on every request, if you have really heavy loads); and only if it is out of date, lock, check no one else is refreshing the cache, start the update asynchronously (like @mkrussel suggests, you don't want to use the request's scope for this), and unlock.

Of course this still leaves the case when your cache is simply empty (e.g. on startup, or if you have some cache pressure mechanism); then you fall back on the above methods. The way you manage this in your infra is to first "warm up" an instance before you pelt it with requests. The Doordash solution is terrible for this, because your load balancer will think the server is warmed up, but every time a cache entry disappears the average request time shoots up a bit. They probably just paper over this with larger margins / more infra, or have such low loads that it's just a few requests. Or maybe they just drop some requests every now and then 🤷
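A minimal sketch of that serve-stale, refresh-in-the-background idea (the `StaleCache` name, the TTL check, and the flag-based refresh guard are just one illustrative way to wire it up; it skips the probabilistic check and the cold-cache case):
```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.launch
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicBoolean

class StaleCache<V : Any>(
    private val scope: CoroutineScope,   // owned by the cache, not by any request
    private val ttlMillis: Long,
    private val load: suspend (String) -> V,
) {
    private data class Entry<V>(val value: V, val loadedAt: Long)

    private val entries = ConcurrentHashMap<String, Entry<V>>()
    private val refreshing = ConcurrentHashMap<String, AtomicBoolean>()

    // Returns the cached value if present (lock-free read), kicking off a background
    // refresh when it is stale. Returns null when the cache is cold; callers then
    // fall back to a blocking fill (e.g. the Deferred approach above).
    fun getOrNull(id: String, now: Long = System.currentTimeMillis()): V? {
        val entry = entries[id] ?: return null
        if (now - entry.loadedAt > ttlMillis) {
            val flag = refreshing.computeIfAbsent(id) { AtomicBoolean(false) }
            if (flag.compareAndSet(false, true)) {      // only one refresher per id
                scope.launch {
                    try {
                        entries[id] = Entry(load(id), System.currentTimeMillis())
                    } finally {
                        flag.set(false)
                    }
                }
            }
        }
        return entry.value   // always serve what we have, even if stale
    }
}
```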
j
You could also use `StateFlow`
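Something along these lines, as a rough sketch (the `StateFlowCache` name, the null sentinel, and the cache-owned scope are assumptions, not a standard API):
```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.filterNotNull
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.launch
import java.util.concurrent.ConcurrentHashMap

// One MutableStateFlow per id; the first caller starts the lookup,
// everyone else suspends until a value is published.
class StateFlowCache(
    private val scope: CoroutineScope,
    private val load: suspend (String) -> ExpensiveResult,
) {
    private val flows = ConcurrentHashMap<String, MutableStateFlow<ExpensiveResult?>>()

    suspend fun get(id: String): ExpensiveResult {
        val flow = flows.computeIfAbsent(id) { key ->
            MutableStateFlow<ExpensiveResult?>(null).also { state ->
                scope.launch { state.value = load(key) }
            }
        }
        // Note: if load() fails, this sketch never completes; real code would
        // publish a Result (or rethrow) instead of leaving the flow at null.
        return flow.filterNotNull().first()
    }
}
```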
r
I appreciate everyone's input. 🙂 @Tijl: I agree that the implementation is suboptimal; however, I don't think `Mutex.withLock` is semantically equivalent. The first method would only lock requests for the same `id` (forgive the terrible ad-hoc code):
launch {
    println("Starting A1...")
    debouncer.debounce("A", this) { expensiveLookup("A") }.await()
    println("Done A1")
}
launch { 
    println("Starting B1...")
    debouncer.debounce("B", this) { expensiveLookup("B") }.await()
    println("Done B1")
}
launch {
    println("Starting A2...")
    debouncer.debounce("A", this) { expensiveLookup("A") }.await()
    println("Done A2")
}
---
Starting A1...
Starting B1...
Starting A2...
[A] Waiting..
[B] Waiting..
[A] Done!
Done A1
Done A2
[B] Done!
Done B1
Wouldn't wrapping the entire function in a Mutex cause it to synchronize for every request, regardless of the owner?
launch {
    println("Starting A1...")
    expensiveLookup("A")
    println("Done A1")
}
launch {
    println("Starting B1...")
    expensiveLookup("B")
    println("Done B1")
}
launch {
    println("Starting A2...")
    expensiveLookup("A")
    println("Done B1")
}
---
Starting A1...
[A] Waiting..
Starting B1...
Starting A2...
Exception in thread "main" java.lang.IllegalStateException: Already locked by A
 at kotlinx.coroutines.sync.MutexImpl.tryLock (Mutex.kt:177) 
 at kotlinx.coroutines.sync.MutexImpl.lock (Mutex.kt:188) 
 at FileKt.expensiveLookup (File.kt:73)
Your comments about properly managing caches and infrastructure are on point. That said, I don't actually care about cache stampedes.
> Of course this still leaves the case when your cache is simply empty
This is my use case: we're parsing 100GBs of logs, which requires external look-up for things like user info, IP info, resource info, etc., and want to avoid extraneous requests. e.g., if we have 10k unique users spread across 10M log lines, ideally we'd make 10k requests (at most).
t
the write lock (`computeIfAbsent`) in the linked implementation is on the map, so it's one global lock. Only on top of that do you get the additional deferred locking, which is indeed per id. It's true the lock I suggest should only be for filling the cache (not when there is a cache item), so after `val result` (not your whole method); the naming threw me off there
actually I'm probably incorrect: it seems that even for quite old versions, `computeIfAbsent` uses a key reservation system, so the locking is effectively per key rather than on the whole map. Using a deferred in combination with that kind of makes sense, even if the contrast with "approaches using locking" does not.
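A sketch of that combination, for what it's worth (names are mine; the retry-on-failure part is the "signal it so another caller can try" idea from above):
```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Deferred
import kotlinx.coroutines.async
import java.util.concurrent.ConcurrentHashMap

// computeIfAbsent reserves the key, so only the first caller for a given id
// creates the Deferred; everyone else awaits the same one.
class LookupDeduplicator(
    private val scope: CoroutineScope,   // owned by the cache, per the discussion above
    private val load: suspend (String) -> ExpensiveResult,
) {
    private val inFlight = ConcurrentHashMap<String, Deferred<ExpensiveResult>>()

    suspend fun get(id: String): ExpensiveResult {
        val deferred = inFlight.computeIfAbsent(id) { key ->
            scope.async { load(key) }
        }
        return try {
            deferred.await()
        } catch (e: Throwable) {
            // Drop a failed/canceled deferred so a later caller can retry.
            inFlight.remove(id, deferred)
            throw e
        }
    }
}
```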