I'm having problem configuring `HttpRequestRetry` ...
# ktor
l
I'm having a problem configuring `HttpRequestRetry` correctly. This is the current configuration:
fun HttpClientConfig<*>.configureRetries(
        methods: List<HttpMethod> = listOf(
            HttpMethod.Get,
            HttpMethod.Put,
            HttpMethod.Delete
        )
    ) {
        install(HttpRequestRetry) {
            retryIf(10) { request, response ->
                if (request.method in methods) {
                    response.status.value in listOf(502, 503, 504)
                } else {
                    false
                }.also {
                    if (it) {
                        println("Retry request because of status ${response.status.value}")
                    }
                }
            }
            retryOnExceptionIf(10) { _, cause ->
                if (cause is CancellationException) {
                    return@retryOnExceptionIf false
                }
                val shouldRetry = when (cause) {
                    is HttpRequestTimeoutException,
                    is ConnectTimeoutException,
                    is SocketTimeoutException,
                    is java.net.SocketTimeoutException,
                    is java.net.SocketException,
                    is java.io.EOFException -> true

                    is java.io.IOException -> cause.message == "HTTP/1.1 header parser received no bytes"
                            || cause.message == "chunked transfer encoding, state: READING_LENGTH"

                    else -> false
                }
                if (shouldRetry) {
                    println("Retry request because of exception $cause")
                } else {
                    println("Don't retry request because of exception $cause")
                }

                shouldRetry
            }
            exponentialDelay()
        }
    }
However, in our production environment requests fail without being retried, with a `java.io.IOException: chunked transfer encoding, state: READING_LENGTH` error. Neither `Retry request because of exception $cause` nor `Don't retry request because of exception $cause` is printed to the console, so it looks like the exception isn't caught at all by `HttpRequestRetry`. We are using the `Java` client and making GET requests. The Ktor version is `2.3.7`.
a
Can you please tell me how I can reproduce the `java.io.IOException: chunked transfer encoding, state: READING_LENGTH`?
l
This happens randomly, multiple times a day, in our production workloads. We make millions of requests every day; thousands fail and are retried, but a few errors like this one are not retried, which is very annoying. It appears to be some kind of transient network failure that produces this particular exception. Here is the full stack trace:
java.io.IOException: chunked transfer encoding, state: READING_LENGTH
	at java.net.http/jdk.internal.net.http.common.Utils.wrapWithExtraDetail(Utils.java:351)
	at java.net.http/jdk.internal.net.http.Http1Response$BodyReader.onReadError(Http1Response.java:760)
	at java.net.http/jdk.internal.net.http.Http1AsyncReceiver.checkForErrors(Http1AsyncReceiver.java:302)
	at java.net.http/jdk.internal.net.http.Http1AsyncReceiver.flush(Http1AsyncReceiver.java:268)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$LockingRestartableTask.run(SequentialScheduler.java:205)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$CompleteRestartableTask.run(SequentialScheduler.java:149)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SchedulableTask.run(SequentialScheduler.java:230)
	at java.net.http/jdk.internal.net.http.HttpClientImpl$DelegatingExecutor.execute(HttpClientImpl.java:157)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:305)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:274)
	at java.net.http/jdk.internal.net.http.Http1AsyncReceiver.onReadError(Http1AsyncReceiver.java:511)
	at java.net.http/jdk.internal.net.http.Http1AsyncReceiver$Http1TubeSubscriber.onComplete(Http1AsyncReceiver.java:596)
	at java.net.http/jdk.internal.net.http.common.SSLTube$DelegateWrapper.onComplete(SSLTube.java:276)
	at java.net.http/jdk.internal.net.http.common.SSLTube$SSLSubscriberWrapper.complete(SSLTube.java:440)
	at java.net.http/jdk.internal.net.http.common.SSLTube$SSLSubscriberWrapper.onComplete(SSLTube.java:541)
	at java.net.http/jdk.internal.net.http.common.SubscriberWrapper.checkCompletion(SubscriberWrapper.java:472)
	at java.net.http/jdk.internal.net.http.common.SubscriberWrapper$DownstreamPusher.run1(SubscriberWrapper.java:334)
	at java.net.http/jdk.internal.net.http.common.SubscriberWrapper$DownstreamPusher.run(SubscriberWrapper.java:259)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$LockingRestartableTask.run(SequentialScheduler.java:205)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$CompleteRestartableTask.run(SequentialScheduler.java:149)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SchedulableTask.run(SequentialScheduler.java:230)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:303)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:256)
	at java.net.http/jdk.internal.net.http.common.SubscriberWrapper.outgoing(SubscriberWrapper.java:232)
	at java.net.http/jdk.internal.net.http.common.SSLFlowDelegate$Reader.processData(SSLFlowDelegate.java:513)
	at java.net.http/jdk.internal.net.http.common.SSLFlowDelegate$Reader$ReaderDownstreamPusher.run(SSLFlowDelegate.java:268)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$LockingRestartableTask.run(SequentialScheduler.java:205)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$CompleteRestartableTask.run(SequentialScheduler.java:149)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SchedulableTask.run(SequentialScheduler.java:230)
	at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:115)
	at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:103)
	... 4 more
Caused by: java.io.EOFException: EOF reached while reading
	... 24 more
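The trace above ends in `Caused by: java.io.EOFException`, so one option is to classify failures by walking the cause chain instead of comparing message strings. The following is a minimal sketch under that assumption. `hasRetryableCause` is a hypothetical helper, not a Ktor API, and it can only help on code paths where the retry predicate is actually invoked, which the missing log output suggests is not the case for this failure.

```kotlin
import java.io.EOFException
import java.io.IOException
import java.net.SocketException
import java.net.SocketTimeoutException

// Hypothetical helper (not part of Ktor): inspects the exception and its causes,
// so the decision does not depend on the wording of the outermost message.
fun hasRetryableCause(cause: Throwable): Boolean =
    generateSequence(cause) { it.cause }
        .take(16) // guard against cyclic cause chains
        .any { it is EOFException || it is SocketException || it is SocketTimeoutException }

fun main() {
    // Same shape as the production trace: an IOException wrapping an EOFException.
    val wrapped = IOException(
        "chunked transfer encoding, state: READING_LENGTH",
        EOFException("EOF reached while reading")
    )
    println(hasRetryableCause(wrapped))                       // prints "true"
    println(hasRetryableCause(IllegalStateException("boom"))) // prints "false"
}
```

Such a function could be called from the `retryOnExceptionIf` block in place of the two message comparisons; the list of exception types here is illustrative and would need to match the failures you consider transient.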
a
Can you switch to the `OkHttp` engine to see whether this is a `Java` engine-specific issue?
l
I tried several other engines before (CIO, OkHttp, Apache), but all of them had performance issues with concurrent requests and very low throughput. With CIO, for example, we had this issue
@Aleksei Tirman [JB] I switched to the OkHttp engine, which degraded request throughput. We do batch processing and need to make a huge number of requests to an endpoint that scales horizontally, so we issue concurrent requests with coroutines (max 200 open requests via a Semaphore). With the Java engine the throughput was 2000 requests/second; with OkHttp it dropped to 200 requests/second, a 10x increase in the runtime of the batch processing job.
a
I understand. Are you able to isolate the Java engine issue and make it reproducible?
l
I can't isolate it, but here is a link to the code that throws the exception: https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/19fb8f93c59dfd791f62d41f332db9e[…]ttp/share/classes/jdk/internal/net/http/Http1AsyncReceiver.java. I don't care about the exception itself; I just want to be able to retry it in Ktor. How can it be that some exception is not caught by the `HttpRequestRetry` plugin?
a
Unfortunately, I cannot reproduce your problem with the following code:
val client = HttpClient(Java) {
    install(HttpRequestRetry) {
        retryOnExceptionIf(maxRetries = 3) { _, cause ->
            cause is IOException
        }

        modifyRequest {
            println("Retried...")
        }
    }
}

val r = client.get("http://localhost:5555")
println(r.bodyAsText())
Server:
val selectorManager = SelectorManager(Dispatchers.IO)
val server = aSocket(selectorManager).tcp().bind("127.0.0.1", 5555)

while (true) {
    val socket = server.accept()
    launch {
        val sendChannel = socket.openWriteChannel(autoFlush = true)
        sendChannel.writeStringUtf8("""
            HTTP/1.1 200 OK
            Content-Length: 9
            Content-Type: text/plain; charset=utf-8
        """.trimIndent())
        sendChannel.flush()
        socket.close()
    }
}
The output:
Retried...
Retried...
Retried...
Exception in thread "main" java.io.IOException: parsing HTTP/1.1 header, receiving [Content-Type:...], parser state [HEADER]
	at java.net.http/jdk.internal.net.http.common.Utils.wrapWithExtraDetail(Utils.java:330)
	at java.net.http/jdk.internal.net.http.Http1Response$HeadersReader.onReadError(Http1Response.java:673)
	at java.net.http/jdk.internal.net.http.Http1AsyncReceiver.checkForErrors(Http1AsyncReceiver.java:297)
	at java.net.http/jdk.internal.net.http.Http1AsyncReceiver.flush(Http1AsyncReceiver.java:263)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SynchronizedRestartableTask.run(SequentialScheduler.java:175)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$CompleteRestartableTask.run(SequentialScheduler.java:147)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SchedulableTask.run(SequentialScheduler.java:198)
	at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:115)
	at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:103)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:584)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:793)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:697)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:684)
Caused by: java.io.EOFException: EOF reached while reading
	at java.net.http/jdk.internal.net.http.Http1AsyncReceiver$Http1TubeSubscriber.onComplete(Http1AsyncReceiver.java:591)
	at java.net.http/jdk.internal.net.http.SocketTube$InternalReadPublisher$ReadSubscription.signalCompletion(SocketTube.java:632)
	at java.net.http/jdk.internal.net.http.SocketTube$InternalReadPublisher$InternalReadSubscription.read(SocketTube.java:833)
	at java.net.http/jdk.internal.net.http.SocketTube$SocketFlowTask.run(SocketTube.java:175)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SchedulableTask.run(SequentialScheduler.java:198)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:271)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:224)
	at java.net.http/jdk.internal.net.http.SocketTube$InternalReadPublisher$InternalReadSubscription.signalReadable(SocketTube.java:763)
	at java.net.http/jdk.internal.net.http.SocketTube$InternalReadPublisher$ReadEvent.signalEvent(SocketTube.java:941)
	at java.net.http/jdk.internal.net.http.SocketTube$SocketFlowEvent.handle(SocketTube.java:245)
	at java.net.http/jdk.internal.net.http.HttpClientImpl$SelectorManager.handleEvent(HttpClientImpl.java:957)
	at java.net.http/jdk.internal.net.http.HttpClientImpl$SelectorManager.lambda$run$3(HttpClientImpl.java:912)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
	at java.net.http/jdk.internal.net.http.HttpClientImpl$SelectorManager.run(HttpClientImpl.java:912)
As you can see, the underlying exception is the one you pointed to.
l
Thanks for your efforts, I will try to reproduce the issue with your code. The exception you produced has a different stack trace than the exception I see.
@Aleksei Tirman [JB] I was able to reproduce the issue with this server code:
suspend fun main() {
    withContext(Dispatchers.Default) {
        val selectorManager = SelectorManager(Dispatchers.IO)
        val server = aSocket(selectorManager).tcp().bind("127.0.0.1", 5555)

        while (true) {
            val socket = server.accept()
            println("accepted connection")
            launch {
                val sendChannel = socket.openWriteChannel(autoFlush = true)
                sendChannel.writeStringUtf8("""
                    HTTP/1.1 200 OK
                    content-type: application/json;charset=UTF-8
                    transfer-encoding: chunked
                    
                    
                """.trimIndent().replace("\n", "\r\n"))
                sendChannel.flush()
                socket.close()
            }
        }
    }
}
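Unlike the earlier test server, this one sends a complete header block before closing the connection. A plausible reading, not confirmed anywhere in this thread, is that the failure then surfaces while the body is being read, after the request has already been sent successfully. The sketch below is a hypothetical probe that uses only the JDK `java.net.http` client (which the Ktor `Java` engine builds on) to report the stage at which the error shows up. `probeTruncatedChunkedResponse` and the 200 ms pause are assumptions made for illustration, and the observed stage may depend on the JDK version and on timing.

```kotlin
import java.io.IOException
import java.net.ServerSocket
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse
import kotlin.concurrent.thread

// Hypothetical probe (JDK only, no Ktor): returns the stage at which the
// failure surfaced ("send", "body", or "none") together with the exception.
fun probeTruncatedChunkedResponse(): Pair<String, IOException?> =
    ServerSocket(0).use { server ->
        thread(isDaemon = true) {
            server.accept().use { socket ->
                // Read the request head first, so that closing the socket
                // does not discard unread data and reset the connection.
                val reader = socket.getInputStream().bufferedReader()
                while (!reader.readLine().isNullOrEmpty()) { /* skip request line and headers */ }
                val head = "HTTP/1.1 200 OK\r\n" +
                    "content-type: application/json;charset=UTF-8\r\n" +
                    "transfer-encoding: chunked\r\n\r\n"
                socket.getOutputStream().run { write(head.toByteArray()); flush() }
                // Assumption: a short pause lets the client process the headers before EOF.
                Thread.sleep(200)
            }
        }
        val client = HttpClient.newBuilder().version(HttpClient.Version.HTTP_1_1).build()
        val request = HttpRequest.newBuilder(URI.create("http://127.0.0.1:${server.localPort}/")).build()
        val response = try {
            // With ofInputStream(), send() can return before the body has been read.
            client.send(request, HttpResponse.BodyHandlers.ofInputStream())
        } catch (e: IOException) {
            return@use "send" to e
        }
        try {
            response.body().use { it.readAllBytes() }
            "none" to null
        } catch (e: IOException) {
            "body" to e
        }
    }

fun main() {
    val (stage, failure) = probeTruncatedChunkedResponse()
    println("failure surfaced at stage '$stage': $failure")
}
```

If the reported stage is `body`, that would be consistent with the retry predicate never being called: a component that only wraps the send step has no chance to observe an exception raised later while the response body is consumed.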
a
Thank you. I've created an issue.
j
Just ran into this with the latest ktor-client. This is still an issue. I left a comment on the issue.
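While the issue is open, one possible workaround is to retry around the whole call, including the body read, instead of relying on the plugin alone. This is a minimal sketch that assumes the exception is thrown while the response body is consumed; that assumption is not confirmed in this thread. `retryOnIOException` is a hypothetical helper, not a Ktor API. It has no backoff and should only wrap idempotent requests.

```kotlin
import java.io.IOException

// Hypothetical helper (not a Ktor API). Because it is `inline`, the lambda may
// contain suspending calls when the helper is used inside a suspend function.
inline fun <T> retryOnIOException(maxRetries: Int, block: (attempt: Int) -> T): T {
    var attempt = 0
    while (true) {
        try {
            return block(attempt)
        } catch (e: IOException) {
            // Give up once the retry budget is spent; other exception types
            // (including CancellationException) are never caught here.
            if (attempt >= maxRetries) throw e
            attempt++
        }
    }
}

fun main() {
    var calls = 0
    // Simulated request: fails twice with the message from this thread, then succeeds.
    val body = retryOnIOException(maxRetries = 3) {
        calls++
        if (calls < 3) throw IOException("chunked transfer encoding, state: READING_LENGTH")
        "payload"
    }
    println("$body after $calls calls") // prints "payload after 3 calls"
}
```

In a Ktor client this would wrap both steps, for example `retryOnIOException(3) { client.get(url).bodyAsText() }`, so that a failure during the body read triggers a fresh request. A `delay(...)` before the next attempt could be added inside the lambda to approximate the plugin's `exponentialDelay()`.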