# ktor
r
I reported this originally on #coroutines but I think it might be ktor specific, as ktor is in the stack. Basically, I've got a ktor thread that is in a tight loop, using 100% CPU. It seems to get into this state randomly, and I don't have a reproducer, but it has happened multiple times on various versions of ktor, up to and including 1.1.3. Here is the stack:
"DefaultDispatcher-worker-2" #33 daemon prio=5 os_prio=0 cpu=31665720.38ms elapsed=34858.54s tid=0x00007febc1c2a800 nid=0x3d runnable  [0x00007feb809a8000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPoll.wait(java.base@11.0.1/Native Method)
        at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@11.0.1/EPollSelectorImpl.java:120)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@11.0.1/SelectorImpl.java:124)
        - locked <0x000000060a49d520> (a sun.nio.ch.Util$2)
        - locked <0x000000060a49d3d0> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.selectNow(java.base@11.0.1/SelectorImpl.java:146)
        at io.ktor.network.selector.ActorSelectorManager.process(ActorSelectorManager.kt:77)
        at io.ktor.network.selector.ActorSelectorManager$process$1.invokeSuspend(ActorSelectorManager.kt)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:32)
        at kotlinx.coroutines.DispatchedTask.run(Dispatched.kt:233)
        at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:594)
        at kotlinx.coroutines.scheduling.CoroutineScheduler.access$runSafely(CoroutineScheduler.kt:60)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:742)
s
I saw exactly the same thing (100% CPU on socket epoll), but in my case it was related to HttpClient. I had a server that returns 204 No Content, and I used response.receive<Void>(). Replacing it with response.receive<ByteReadChannel>().discard(Long.MAX_VALUE) or response.receive<ByteArray>() solved it. I also switched everywhere to the completely manual client.call.use { } pattern to make sure all responses, including error responses, are properly read and closed. Without that I also observed random memory leaks of request objects (but not responses) that survived garbage collection. Creating and closing the HttpClient object every time also solves it.
There was no reliable repro to isolate into a single test case, so I didn't bother creating issues.
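A minimal sketch of that workaround, assuming the ktor 1.x-era client API discussed in this thread (HttpResponse from io.ktor.client.response, which was still Closeable in those versions); fetchAndDrain and its url parameter are illustrative names, not from the original messages:

```kotlin
import io.ktor.client.HttpClient
import io.ktor.client.call.receive
import io.ktor.client.request.get
import io.ktor.client.response.HttpResponse

// Hypothetical helper following the advice above: read the body explicitly
// (even for 204 No Content) and close the response when done, instead of
// calling receive<Void>(), which reportedly left the selector spinning.
suspend fun fetchAndDrain(client: HttpClient, url: String): ByteArray =
    client.get<HttpResponse>(url).use { response ->
        // receive<ByteArray>() (or receive<ByteReadChannel>().discard(...))
        // consumes the body so the underlying connection is released.
        response.receive<ByteArray>()
    }
```

The manual client.call ... use { } pattern mentioned above works toward the same goal: every call is fully read and closed, error responses included.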
r
Hmm, this service actually doesn't use HttpClient at all, so the cause is definitely something else in my case...