Hey all I m investigating an issue between OkHttp and a remo kotlinlang #squarelibraries

Hey all. I'm investigating an issue between OkHttp...

Ian Botsford

10/17/2024, 5:24 PM

Hey all. I'm investigating an issue between OkHttp and a remote server that seems to be closing connections pretty quickly—faster than the connection pool's configured

keepAliveDuration

. From the TCP/SSL dumps I've examined, it appears the server is properly sending an SSL

close_notify

message followed by a TCP FIN packet but it seems OkHttp doesn't recognize those when they happen. Much later, it may retrieve that connection from the pool, send a request, then attempt to read a response, upon which it throws an

IOException: unexpected end of stream on <server-name>

. So broad questions to start with: Is it expected that OkHttp ignores remote SSL/TCP closure events? And only discovers them after sending a request and attempting to read the response?

yschimke

10/17/2024, 6:47 PM

It's not surprising to be honest. More generally given the java blocking Socket API, those read or write operations are generally where you will detect it.

yschimke

10/17/2024, 6:48 PM

But especially on Android the network stacks can delay suppress socket closing.

Ian Botsford

10/17/2024, 6:54 PM

I see. Do you know why sending the request appears to succeed but reading the response is where it fails? Is

retryOnConnectionFailure(true)

the only way to handle this?

yschimke

10/17/2024, 6:55 PM

That should be true by default

Ian Botsford

10/17/2024, 6:56 PM

That's true but we disabled it for other reasons.

yschimke

10/17/2024, 6:57 PM

It should hopefully be safe, retrying requests that are either safe to retry, or it's clear they can't have started on the server.

yschimke

10/17/2024, 6:58 PM

The health checks should minimise the failures, but maybe not guaranteed

Ian Botsford

10/17/2024, 6:59 PM

In this particular case, the call can't have started on the server because the socket is closed but OkHttp doesn't know that and appears to send the request anyway. How does OkHttp determine this is "safe" to retry?

yschimke

10/17/2024, 7:21 PM

https://github.com/square/okhttp/blob/master/okhttp%2Fsrc%2Fmain%2Fkotlin%2Fokhttp3%2FOkHttpClient.kt#L739

yschimke

10/17/2024, 7:22 PM

And one more I'll dig out

yschimke

10/17/2024, 7:26 PM

Some more logic here https://github.com/square/okhttp/blob/master/okhttp%2Fsrc%2Fmain%2Fkotlin%2Fokhttp3%2Finternal%2Fhttp%2FRetryAndFollowUpInterceptor.kt#L135

Ian Botsford

10/17/2024, 7:53 PM

Thanks for the links. My concern is about non-idempotent calls to the server—say, mutating some server-side state. The end-of-stream error that

retryOnConnectionFailure(true)

would retry is only triggered after the request has been "sent". Now, because in this case the server has closed the connection, the data haven't really been _sent across the wire—_merely sent to some buffer somewhere, which presumably would timeout at some point. But at that point in the call execution, it doesn't seem like OkHttp knows whether the data have actually made it to the wire. If the connection was broken after the request was sent and the call is not idempotent, it would not be safe to retry. Am I missing some other validation OkHttp uses to be safe about this? Or would setting

retryOnConnectionFailure(true)

potentially expose us to re-executing non-idempotent calls?

yschimke

10/17/2024, 8:02 PM

The errors that okhttp treats as connection errors are meant to account for whether it could be executing. But I'm a bit rusty on this.

yschimke

10/17/2024, 8:04 PM

If you can reproduce it doing the wrong thing, it's easy to diagnose. I'd have to check for that particular error you mention.

Ian Botsford

10/17/2024, 8:09 PM

OK, let me see if I can assemble a minimal repro. Thanks for the help so far Yuri. 😊

yschimke

10/17/2024, 8:16 PM

Some discussion from @jessewilson https://publicobject.com/2023/11/20/idempotent-apis/ that indicates some caution needed

🫡 1

jessewilson

10/17/2024, 10:32 PM

OkHttp special cases GET requests on the assumption that retrying them is safer

Ian Botsford

10/18/2024, 11:24 PM

I don't see that logic in

RetryAndFollowUpInterceptor

...where is the special casing for GETs?

jessewilson

10/19/2024, 12:20 AM

It’s here: https://github.com/square/okhttp/blob/f2771425cb714a5b0b27238bd081b2516b4d640f/okhttp/src/main/kotlin/okhttp3/internal/connection/RealConnectionPool.kt#L127

jessewilson

10/19/2024, 12:21 AM

When it takes a connection from the pool it does an expensive connection health check unless it’s a GET

Ian Botsford

10/21/2024, 4:48 PM

OK, so it appears that

doExtensiveHealthChecks

returns false for GET. Then

RealConnection.isHealthy

uses that plus idle time to determine whether to invoke an additional socket-level test. So I can see how non-GET requests get extra verification of socket health on connection acquisition...is that the thing which prevents resending a non-idempotent request? Because I don't yet see how that affects the retry logic in

RetryAndFollowUpInterceptor

. 🤔

jessewilson

10/21/2024, 5:19 PM

It’s intended to reduce the opportunities for non-GET calls to require retries

5 Views

Open in Slack

Previous Next