`collect()` vs `collectLatest()` : I see the docs ...
# coroutines
m
`collect()` vs `collectLatest()`: I see the docs for `collectLatest()` say: when the original flow emits a new value, action block for previous value is cancelled. But specifically for `StateFlow` just about all the examples I see use `collect()`. So I’m wondering if the decision whether to use `collect()` or `collectLatest()` can/should be made on the basis of the nature of the action block (for example, for an action block of non-suspending code I suppose the choice is irrelevant - or more generally, is there a problem with cancelling the action block before completion) or whether the nature of `StateFlow` also impacts the decision somehow?
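For reference, a minimal sketch of the behaviour the docs describe, assuming kotlinx.coroutines 1.6 or later on the classpath. The names `twoValues`, `logWithCollect` and `logWithCollectLatest` are made up for illustration; the action block takes longer than the gap between emissions, so `collectLatest` cancels the block for the first value.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.collectLatest
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.runBlocking

// Hypothetical source: emits 1, then 2 about 50 ms later.
fun twoValues(): Flow<Int> = flow {
    emit(1)
    delay(50)
    emit(2)
}

// collect: the upstream waits until each action block has finished.
suspend fun logWithCollect(): List<String> {
    val log = mutableListOf<String>()
    twoValues().collect { value ->
        log += "start $value"
        delay(100) // simulated work, longer than the gap between emissions
        log += "done $value"
    }
    return log
}

// collectLatest: a new emission cancels the block still running for the old one.
suspend fun logWithCollectLatest(): List<String> {
    val log = mutableListOf<String>()
    twoValues().collectLatest { value ->
        log += "start $value"
        delay(100) // cancelled here for value 1 when 2 arrives
        log += "done $value"
    }
    return log
}

fun main() = runBlocking {
    println(logWithCollect())       // [start 1, done 1, start 2, done 2]
    println(logWithCollectLatest()) // [start 1, start 2, done 2]
}
```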
w
I’d say for `StateFlow` you’d usually be concerned about the latest value only, so `collectLatest` would be typically correct. I think this is the correct answer:
> should be made on the basis of the nature of the action block
If we’re collecting values from somewhere to show on the UI for example, you should use `collectLatest`. If you have a flow with messages that you want to show one by one (with an animation perhaps), then you should use `collect` because it’s important that each emission is handled.
m
Thanks, but in your last example, wouldn’t you not be using `StateFlow` since you care about all values?
👌 1
g
> If we’re collecting values from somewhere to show on the UI for example, you should use `collectLatest`.
Not sure that such a broad separation is correct. Whether to use `collect` or `collectLatest` should be decided by the nature of the collect block: should it be cancelled or not? UI also has different cases.
☝️ 1
> wouldn’t you not be using `StateFlow` since you care about all values
Of course. `StateFlow` has a buffer of size 1, so it’s possible that a new value will overwrite a previous one that has not been collected yet.
👍 1
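A small sketch of that overwriting, assuming kotlinx.coroutines 1.6 or later. `observeSlowly` is a made-up name; the collector uses plain `collect` but is slower than the writer, so the intermediate values are never delivered.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Returns the values a slow collector actually receives from a StateFlow.
fun observeSlowly(): List<Int> = runBlocking {
    val state = MutableStateFlow(0)
    val seen = mutableListOf<Int>()

    val collector = launch {
        state.collect { value ->
            seen += value
            delay(100) // slow collector: still busy while new values are written
        }
    }

    delay(10)       // give the collector time to receive the initial value 0
    state.value = 1 // overwritten before the collector looks again
    state.value = 2 // overwritten as well
    state.value = 3 // the only one of the three the collector will see
    delay(300)
    collector.cancel() // collecting a StateFlow never completes on its own
    seen.toList()
}

fun main() {
    println(observeSlowly()) // [0, 3]
}
```

So even with `collect`, a `StateFlow` does not guarantee that every written value reaches the action block.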
w
> Not sure that such broad separation is correct,
That’s what I meant, the second example with messages is also UI 🙂
> wouldn’t you not be using `StateFlow` since you care about all values?
Yep, that’s why I think usually with `StateFlow` I’d use `collectLatest`.
g
I don’t see why you would usually use `collectLatest` with `StateFlow`.
m
To cancel unnecessary work?
g
But how is that different between `StateFlow` and any other `Flow`?
w
Because `StateFlow` represents state, and with state I’m usually concerned with the latest value only. Regular `Flow`, just in my experience, represents values that are usually important individually.
m
The question of what (if anything) is special about collecting `StateFlow` is really a side question. I’m currently just working out a general default strategy for collecting `StateFlow`. `StateFlow.collectLatest` seems to be a reasonable default. Only switch to using `collect` if there is a good reason for the action block not to be cancelled (an uncommon scenario, I suspect).
1️⃣ 1
g
You may not even know that you are collecting a `StateFlow`; usually it’s just an implementation detail.
m
Fair point, but let’s assume you are given a `StateFlow`.
w
My default is simply `collectLatest` (for any flow), unless it would be incorrect to ignore a value.
👍 2
g
Sounds reasonable, no issue with it, but this is my point: `StateFlow` is not very different in this case compared to any other flow.
👍 2
a
Picking `collectLatest` as a default without explicitly needing the cancellation behavior in that particular case is kind of silly. It introduces some rather nontrivial overhead over a collect: https://github.com/Kotlin/kotlinx.coroutines/blob/master/kotlinx-coroutines-core/common/src/flow/internal/Merge.kt#L13
👍 1
I would reject any code review that included an unnecessary usage of it
w
@Adam Powell many thanks for pointing this out! The overhead is not so obvious from the documentation, does it come specifically from creating another channel, or from the need to cancel some previous work (even if it has completed)? What’s your suggestion for using `*latest` operators, then? Only use them if the work inside the action block is expensive and worth cancelling?
I also realised that if my collect block doesn’t even support cancellation (e.g. doesn’t yield or use cancellable operators) then `collectLatest` doesn’t actually do anything except add the overhead, is that right? Is that the kind of situation you mean by unnecessary usage?
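A quick sketch to check the non-cancellable case, assuming kotlinx.coroutines 1.6 or later. `handledWithCollectLatest` is a made-up name. The action block has no suspension point, so there is nothing for `collectLatest` to cancel and every value is handled, just as it would be with `collect`.

```kotlin
import kotlinx.coroutines.flow.collectLatest
import kotlinx.coroutines.flow.flowOf
import kotlinx.coroutines.runBlocking

// Collects three values with a non-suspending action block.
fun handledWithCollectLatest(): List<Int> = runBlocking {
    val handled = mutableListOf<Int>()
    flowOf(1, 2, 3).collectLatest { value ->
        // No delay, yield or other suspending call in here, so the block
        // always runs to completion before the next value can interrupt it.
        handled += value
    }
    handled.toList()
}

fun main() {
    println(handledWithCollectLatest()) // [1, 2, 3]
}
```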
a
Yes, that would be one kind of unnecessary usage. It's more than just whether the action is expensive, there are always semantics of what you're doing at play too. If you have an operation that suspends to do work before reaching a result, it's possible for the upstream of a `collectLatest` to emit at regular enough intervals that none of the iterations complete for a long time, because each one gets cancelled before it can finish. There isn't a one size fits all approach to backpressure.
In some situations it's better to reach slightly stale intermediate results before computing the next result than it is for the whole thing to spin in confusion until the upstream quiesces
In that regard, plain collect has a safer set of tradeoffs
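A sketch of that tradeoff, assuming kotlinx.coroutines 1.6 or later. `busyUpstream`, `completedWithCollect` and `completedWithCollectLatest` are made-up names, and the timings are arbitrary; the only thing that matters is that the upstream emits faster than the action block can finish.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.collectLatest
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.runBlocking

// Hypothetical busy upstream: five values, 20 ms apart.
fun busyUpstream(): Flow<Int> = flow {
    for (i in 1..5) {
        emit(i)
        delay(20)
    }
}

// collectLatest: every block except the last is cancelled mid-work.
suspend fun completedWithCollectLatest(): List<Int> {
    val completed = mutableListOf<Int>()
    busyUpstream().collectLatest { value ->
        delay(100)         // work that takes longer than the emission interval
        completed += value // only reached if no newer value arrived meanwhile
    }
    return completed
}

// collect: every value is processed, at the cost of slowing the upstream down.
suspend fun completedWithCollect(): List<Int> {
    val completed = mutableListOf<Int>()
    busyUpstream().collect { value ->
        delay(100)
        completed += value
    }
    return completed
}

fun main() = runBlocking {
    println(completedWithCollectLatest()) // [5]
    println(completedWithCollect())       // [1, 2, 3, 4, 5]
}
```

With an upstream that never goes quiet, the `collectLatest` version would produce no completed results at all, while `collect` keeps producing slightly stale ones.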
w
Right. So, do I understand correctly that `collectLatest` is less efficient even if there’s no backpressure, when the collect block always completes before the next event comes? I understand `collectLatest` doesn’t give any benefits then, so it’s only slower than `collect`?
a
Basically yes
w
Got it. Thanks for the insight 🙂
👍 1
m
I also didn’t realise `collectLatest` had an automatic cost, so this is really good to know. Use `collect` by default.