#coroutines

jsijsling

07/23/2019, 10:19 AM
Did some basic performance evaluation of file loading on Android with different scheduling methods. I found that Dispatchers.IO produced significant garbage overhead. Log snippet:
```
Background concurrent copying GC freed 1692(904KB) AllocSpace objects, 127(2MB) LOS objects, 49% free, 3MB/7MB, paused 390us total 103.292ms
Background concurrent copying GC freed 1965(1129KB) AllocSpace objects, 163(3MB) LOS objects, 50% free, 3MB/7MB, paused 552us total 118.408ms
Long wait of 8.836ms for Thread[20,tid=16153,Native,Thread*=0xe0c63c00,peer=0x12c40078,"DefaultDispatcher-worker-6"] suspension!
Background concurrent copying GC freed 2006(1273KB) AllocSpace objects, 197(4MB) LOS objects, 50% free, 3MB/7MB, paused 613us total 120.900ms
Background concurrent copying GC freed 2420(1437KB) AllocSpace objects, 216(4MB) LOS objects, 37% free, 10MB/16MB, paused 1.202ms total 256.912ms
Waiting for a blocking GC ProfileSaver
WaitForGcToComplete blocked ProfileSaver on HeapTrim for 29.549ms
Background concurrent copying GC freed 2242(1268KB) AllocSpace objects, 189(4MB) LOS objects, 49% free, 3MB/6MB, paused 520us total 102.964ms
Background concurrent copying GC freed 1782(1071KB) AllocSpace objects, 154(3MB) LOS objects, 49% free, 5MB/10MB, paused 386us total 112.534ms
```
Any clue why this might be?

Marko Mitic

07/23/2019, 10:24 AM
The IO thread pool is dynamic; it can grow to 64 threads, so it's probably new thread allocations (but there's probably something else too)
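A stdlib-only sketch of the pool-shape difference described here (plain `java.util.concurrent` executors as a stand-in, not the real kotlinx.coroutines scheduler — that's an assumption of this analogy): an elastic pool like Dispatchers.IO spawns a new thread whenever all workers are busy, while a fixed pool like Dispatchers.Default stays at its cap.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.CountDownLatch
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors

// Submits `tasks` busy jobs and counts how many distinct worker threads
// the pool used to run them.
fun distinctThreads(pool: ExecutorService, tasks: Int): Int {
    val names = ConcurrentHashMap.newKeySet<String>()
    val latch = CountDownLatch(tasks)
    repeat(tasks) {
        pool.execute {
            names += Thread.currentThread().name
            Thread.sleep(50) // keep the worker busy so new tasks force new threads
            latch.countDown()
        }
    }
    latch.await()
    pool.shutdown()
    return names.size
}

fun main() {
    // Elastic pool (IO-like): one thread per concurrent task.
    println(distinctThreads(Executors.newCachedThreadPool(), 8))
    // Fixed pool (Default-like): capped, threads are reused.
    println(distinctThreads(Executors.newFixedThreadPool(2), 8))
}
```

Each of those freshly created threads carries its own stack and bookkeeping, which is the "new thread allocations" cost in question.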

jsijsling

07/23/2019, 10:55 AM
Thread count is indeed higher when using Dispatchers.IO, but I still think the amount of garbage is too much. In my test the number of distinct threads spawned for 1000 iterations is 5~6 on IO versus 2 on the default dispatcher.

Marko Mitic

07/23/2019, 10:56 AM
could you provide the test code?

jsijsling

07/23/2019, 11:00 AM
```kotlin
val timeline = mutableListOf<Long>()
repeat(1000) {
  val ms = measureTimeMillis {
    runBlocking {
      withContext(Dispatchers.IO) {
        ApplicationProvider.getApplicationContext<Application>().assets
          .open("testAssetFile.json").use { it.reader().readText() }
      }
    }
  }
  timeline += ms
}

println("Avg=${timeline.average()}ms, max=${timeline.max()}, min=${timeline.min()}")
```
Only with Dispatchers.IO do I see the GC working overtime.
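One way to put numbers on that impression off-device is per-thread allocation counters. This is a sketch assuming a HotSpot desktop JVM; `com.sun.management.ThreadMXBean` is a HotSpot extension and does not exist on Android ART, so it only helps when reproducing the benchmark on a workstation.

```kotlin
import java.lang.management.ManagementFactory

// Measures bytes allocated by the *current* thread while running `block`.
// com.sun.management.ThreadMXBean is HotSpot-only, not available on ART.
fun allocatedBytes(block: () -> Unit): Long {
    val mx = ManagementFactory.getThreadMXBean() as com.sun.management.ThreadMXBean
    val id = Thread.currentThread().id
    val before = mx.getThreadAllocatedBytes(id)
    block()
    return mx.getThreadAllocatedBytes(id) - before
}

fun main() {
    // Deliberately allocate ~1 MB and see roughly that figure reported.
    val bytes = allocatedBytes {
        val a = ByteArray(1_000_000)
        check(a.size == 1_000_000) // keep the allocation live
    }
    println("allocated ~$bytes bytes on this thread")
}
```

Note the counter is per thread: to attribute garbage to the IO workers, the measurement would have to run on the IO thread itself, not on the thread calling `runBlocking`.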

Marko Mitic

07/23/2019, 11:03 AM
maybe if you put `runBlocking` around `repeat`

jsijsling

07/23/2019, 11:03 AM
That wouldn’t make the test representative. I’m comparing directly blocking the main thread versus offloading to background threads.
I measure time until the result is delivered to the main thread, hence `runBlocking`.
Scratch that. On second thought it should be equivalent. Tried it, same result.
In this case it doesn’t spawn extra threads. Just one. But the garbage overhead remains the same.

Marko Mitic

07/23/2019, 11:33 AM
what would happen if you started one coroutine that would do the json reading 1000x times?

jsijsling

07/23/2019, 11:34 AM
You mean this? That’s what I just described.
```kotlin
runBlocking {
  repeat(1000) {
    val ms = measureTimeMillis {
      withContext(Dispatchers.IO) {
        loadAsset("testAssetFile.json")
      }
    }
    timeline += ms
  }
}
```

Marko Mitic

07/23/2019, 11:34 AM
is there a chance that reading assets on a background thread in the test is what's causing the garbage?

jsijsling

07/23/2019, 11:36 AM
Repeating the reads within the same context doesn’t produce the garbage overhead, but that’s to be expected.
```kotlin
runBlocking {
  withContext(Dispatchers.IO) {
    repeat(1000) {
      val ms = measureTimeMillis {
        loadAsset("testAssetFile.json")
      }
      timeline += ms
    }
  }
}
```

Marko Mitic

07/23/2019, 11:37 AM
@elizarov is this GC expected on 1000 `withContext` switches?

Dominaezzz

07/23/2019, 11:39 AM
Hmm, what if you do `runBlocking(Dispatchers.Default)`?

jsijsling

07/23/2019, 11:45 AM
This produces far less overhead.
```kotlin
runBlocking(Dispatchers.Default) {
  repeat(1000) {
    val ms = measureTimeMillis {
      withContext(Dispatchers.IO) {
        loadAsset("testAssetFile.json")
      }
    }
    timeline += ms
  }
}
```
It produces more overhead than `runBlocking` with `withContext(Dispatchers.Default)`, but not as much as with just the IO dispatcher. I see ~6 concurrent sweeps with `runBlocking(Dispatchers.Default)`, versus 1~2 concurrent sweeps with `runBlocking` plus `withContext(Dispatchers.Default)`.

Dominaezzz

07/23/2019, 11:50 AM
Default and IO share thread pools, so I guess the overhead is in spawning new threads in the pool.
I see this as a wontfix. Not much can be done about thread creation but perhaps the threads could be kept alive for a bit.
Using an actor is an option but not a very nice one.
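A plain-JVM sketch of the "keep the thread alive" idea mentioned above, without kotlinx.coroutines (`AssetLoader` and its `loadAsset` parameter are hypothetical names for this example): one long-lived worker services every read, so 1000 iterations never pay for thread spin-up again.

```kotlin
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors

// One long-lived loader thread instead of dispatching each read anew.
class AssetLoader(private val loadAsset: (String) -> String) : AutoCloseable {
    private val worker: ExecutorService = Executors.newSingleThreadExecutor { r ->
        Thread(r, "asset-loader").apply { isDaemon = true }
    }

    // Runs the read on the dedicated worker and blocks for the result,
    // mirroring the shape of withContext(Dispatchers.IO) { ... }.
    fun load(name: String): String = worker.submit<String> { loadAsset(name) }.get()

    override fun close() { worker.shutdown() }
}

fun main() {
    val loader = AssetLoader { name -> "contents of $name" } // stand-in read
    repeat(3) { println(loader.load("testAssetFile.json")) }
    loader.close()
}
```

This is essentially what an actor would give you, minus suspension: the caller still blocks on `get()`, which is why the actor variant is the nicer fit inside coroutine code.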

jsijsling

07/23/2019, 11:59 AM
As far as I could tell, the number of threads spawned was too low to explain this amount of garbage purely by thread creation. If I find the time I’ll do some memory profiling and share the results.

gildor

07/24/2019, 1:38 AM
withContext for sure does allocations (a lambda, a new context, a coroutine scope), and the code above is suboptimal. The bigger question is what it is actually allocating and how big this GC pressure is. Also, 64 threads by themselves are 32 MB of memory.

louiscad

07/24/2019, 1:57 PM
@gildor I thought it was about 1 MB per thread, so 64 MB

gildor

07/24/2019, 2:34 PM
0.5 MB is the default for desktop as I remember, but yeah, it may also be 64 MB for sure
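The two figures above are just the worst-case stack reservation for a fully grown 64-thread IO pool under the two common default stack sizes; the actual per-thread default depends on the platform and the `-Xss` setting.

```kotlin
// Stack memory reserved by a pool: thread count times per-thread stack.
fun poolStackMb(threads: Int, stackKb: Int): Int = threads * stackKb / 1024

fun main() {
    println(poolStackMb(64, 512))  // 32 MB with 512 KB stacks
    println(poolStackMb(64, 1024)) // 64 MB with 1 MB stacks
}
```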