regarding `List` vs `MutableList`: first, I unders...
# getting-started
y
regarding
List
vs `MutableList`: first, I understand that these are interfaces and not concrete classes. my question is, how cheap is it to go from a
List
to a
MutableList
, and if the answer is not "pretty cheap", is there anything I can do to make it cheaper? operations on collections (more specifically, on
Iterable
) return
List
and that's rather irritating if I want to continue editing the thing. is it at least cheap to then call
toMutableList()
?
h
Just look at the source code of the functions:
toMutableList
has a fast path if the runtime instance already is a MutableList.
y
but is it usually going to stay a
MutableList
after calling some std method on it?
or am I 1) collecting to a new list (not cheap) and then 2) constructing a new
MutableList
from this list (not cheap)
in general, do I need to read every single method I'm calling to understand whether or not an allocation is reused
w
The best way is indeed to look at source code. Otherwise it's safe to assume that vast majority of operations will create a new list. If you want micro optimizations with mutable state, you have to be explicit. Most of the time this means that you have to write the loops or utility functions yourself. It's usually not worth it, but for these situations it can be interesting to look at
xxxTo
functions.
Copy code
.map { ... }.toMutableList()
.filter { ... }.toMutableList()
// And some others are the same as:
.mapTo(mutableListOf()) { ... }
.filterTo(mutableListOf()) { ... }
y
thanks for the answer, as usual the answers here are top-notch however I disagree that wishing to avoid an entire allocation is a micro optimization in cases where the buffer can be re-used. this just feels very opaque and the source code is often interface/delegation salad, so it's not always trivial to tell the behavior at a glance. also, this pattern of chaining
.somethingTo(mutableListOf()) { ... }
sure looks like it's allocating a new list...?
w
Sorry you are right. You were asking for facts, not opinions. It's just hard for experts who've burned their fingers on this to stay only factual
y
I'm here to blow off steam and thank you for being my target 🙂
😁 1
w
also, this pattern of chaining
.somethingTo(mutableListOf()) { ... }
sure looks like it's allocating a new list...?
Correct. compared to
something { ... }.toMutableList()
it only saves a single allocation
h
If you want to chain many operations and do care about the allocation, you can also test/benchmark if you could use a sequence instead.
☝️ 2
y
right. maybe the overhead here ("enough" operations chained together) would warrant a sequence, good point
e
creating a new list is cheap and the transformations being inlineable often makes them faster than sequence which always requires calls to non-inline functions
thank you color 1
y
@ephemient this one's new to me, thanks, so is this info
creating a new list is cheap yes, but isn't re-allocating expensive?
e
not sure what you mean by that
in the standard generational GC, nursery is cheap, and mutation is (a bit) expensive (since it requires additional tracking)
💯 1
y
I was gonna say, this is a benefit of a garbage collected language?
looks like I gotta do some more reading here.
l
Individual allocations are incredibly cheap in modern Java. Before being worried about them, I recommend benchmarking. I'm constantly being surprised by just how fast it is.
y
it is apparent to me that I have a knowledge gap here, so (when free time allows) I intend to re-evaluate my understanding here
e
quick summary: nursery is typically just a bump allocator. that's as fast as stack allocation if nothing is mutable, then there are no pointers into the nursery from older objects, so incremental GC only needs to look at roots that point directly into the nursery, which makes it also very fast unfortunately we do have mutation, so when an object in an older generation is mutated to have a reference to an object in a younger generation, we need to know about it. to minimize those cases, this is typically accomplished with a card table - basically a bitmap of memory zones which contain additional objects to scan (at lower resolution to save on memory; it's safe to over-scan) that means that all reference mutations require a write barrier to ensure that any changes to the card table observable at the same time or before as mutations to the object itself, which is not free
l
@ephemient that only happens when you have mutation though, no? When you have an immutable object that is used purely as a holder, it should be almost free.
e
exactly
to be clear: the cost of GC depends on the behavior of the whole program, not just the part of it creating these allocations
y
@ephemient @loke just wanted to say that I really appreciate this.