# ktor
Hey everyone, I'm experiencing a super weird bug that I can't find the answer to, so that's why I'm here asking about it. When I'm querying my Ktor server behind an nginx reverse proxy with the Ktor client, sometimes the request fails with:
```
Exception in thread "main" java.io.EOFException: Chunked stream has ended unexpectedly: no chunk size
	at io.ktor.http.cio.ChunkedTransferEncodingKt.decodeChunked(ChunkedTransferEncoding.kt:77)
	at io.ktor.http.cio.ChunkedTransferEncodingKt$decodeChunked$3.invokeSuspend(ChunkedTransferEncoding.kt)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at kotlinx.coroutines.internal.LimitedDispatcher.run(LimitedDispatcher.kt:42)
	at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:95)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:570)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:677)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:664)
```
and nginx logs:

```
upstream prematurely closed connection while reading upstream
```
Keep in mind the "sometimes": it only happens randomly, with no rhyme or reason as to why. But here's the catch: I'm not able to reproduce this bug with curl! If I attempt to replicate it with curl, nginx does still log `upstream prematurely closed connection while reading upstream` when this happens, BUT curl successfully returns the response HTTP status code (200) and I can also download the response body without any issues! So the error log doesn't even make sense, since the response is correct. I'm still attempting to debug this, but the weird part is that Ktor fails to parse the response while curl parses it without any issues.
There is a fix that I found: if I add this to the nginx location block
```nginx
proxy_http_version 1.1;
proxy_set_header Connection "";
```
the issue goes away altogether, but why??? It doesn't make any sense! I thought that maybe it was something related to keep-alive, but nope: nginx forwards an HTTP/1.0 request plus `Connection: close` to the webapp, so it couldn't be a keep-alive connection causing issues.
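For context, a minimal location block with the fix applied might look like this (the `proxy_pass` address is a placeholder for the actual upstream):

```nginx
location / {
    # placeholder: replace with the real Ktor server address
    proxy_pass http://127.0.0.1:8080;

    # Talk to the upstream with HTTP/1.1 and clear the Connection header,
    # so nginx doesn't send its default "HTTP/1.0 + Connection: close"
    proxy_http_version 1.1;
    proxy_set_header Connection "";
}
```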
There is this Stack Overflow question that also talks about the "Chunked stream has ended unexpectedly: no chunk size" error, but I don't know if it's related or not: https://stackoverflow.com/questions/75758637/ktor-chunked-stream-has-ended-unexpectedly-no-chunk-size
Another thing: the server is using the Netty engine. After testing a bit more, I cannot reproduce the bug with the CIO engine, while with Netty it randomly fails. I have a theory: I think the Netty engine has a race condition that causes the connection to be closed BEFORE nginx has received all of the data. This would only affect HTTP/1.0 clients that send `Connection: close` (which nginx does by default).
By default, nginx talks to the upstream with HTTP/1.0 and the `Connection: close` header. That header means the connection will be closed after the response is received. Maybe there's a race condition in Ktor's Netty engine where the server closes the connection before the client (nginx) has fully read the response. When nginx detects that the upstream closed the connection prematurely, it thinks "oh shit" and flushes the data it has received downstream. It does know the response status is complete (after the error is thrown, nginx logs that it sent an HTTP 200 response to the client); it just didn't receive the entire body before flushing downstream, and that incomplete response borks most HTTP clients. curl is unaffected because curl is curl, so I guess curl has some failsafes.

The reason why using HTTP/1.1 and removing the `Connection` header works is that, according to the HTTP/1.1 spec, the connection defaults to `keep-alive` when the `Connection` header isn't present. The server then doesn't close the connection, because it thinks nginx will reuse it later, making the client (in this case, nginx) handle the connection-close step, which avoids the issue. This is the only plausible theory I have right now, and it would explain why setting HTTP/1.1 + removing the `Connection` header OR switching HTTP engines fixes the issue.
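The failure mode in the theory above can be simulated with plain JDK sockets, no Ktor involved. The "server" below is a deliberately broken stand-in: it sends a chunked response but closes the connection before the terminating zero-length chunk, which is exactly the kind of truncated stream a strict chunked parser must reject:

```kotlin
import java.net.ServerSocket
import java.net.Socket
import kotlin.concurrent.thread

fun main() {
    val server = ServerSocket(0) // bind an ephemeral port
    val port = server.localPort

    // Fake upstream: answer with a chunked response, then close the socket
    // BEFORE sending the final "0\r\n\r\n" chunk (the suspected premature close).
    thread {
        server.accept().use { sock ->
            val reader = sock.getInputStream().bufferedReader()
            while (reader.readLine()?.isNotEmpty() == true) {
                // drain the request line and headers until the blank line
            }
            val out = sock.getOutputStream()
            out.write((
                "HTTP/1.1 200 OK\r\n" +
                "Transfer-Encoding: chunked\r\n" +
                "\r\n" +
                "5\r\nhello\r\n" // one chunk, but no terminating 0-length chunk
            ).toByteArray())
            out.flush()
        } // socket closed here, mid-stream
        server.close()
    }

    // Client: read the raw bytes until EOF. A compliant chunked decoder must
    // treat EOF before the 0-length chunk as an error ("no chunk size").
    Socket("127.0.0.1", port).use { sock ->
        sock.getOutputStream().write("GET / HTTP/1.1\r\nHost: x\r\n\r\n".toByteArray())
        val raw = sock.getInputStream().readBytes().decodeToString()
        val terminated = raw.endsWith("0\r\n\r\n")
        println("terminated=$terminated") // prints terminated=false
    }
}
```

So Ktor's client is arguably doing the spec-correct thing by throwing `EOFException` here; curl is just more lenient about a stream that ends after a complete chunk.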