Somewhat philosophical question why is it `body body InputSt kotlinlang #http4k

mbonnin

09/07/2023, 10:58 AM

Somewhat philosophical question: why is it `body(body: InputStream, length: Long?)` and not `body(wire: OutputStream, length: Long?)`? My body produces bytes, so I would expect to be given something I can write these bytes into (a socket?). I can buffer, but it feels somewhat unneeded?

fredrik.nordin

09/07/2023, 11:31 AM

If all you have is an array of bytes, then a straight-forward way is to do:

val bytes = ....
response.body(Body(bytes.asByteBuffer()))

fredrik.nordin

09/07/2023, 11:34 AM

Or maybe I've misunderstood the question 😄

mbonnin

09/07/2023, 11:44 AM

I want to avoid allocating the array of bytes in memory

mbonnin

09/07/2023, 11:46 AM

the API I'm working with (or more exactly currently designing) takes a "sink":

generateResponse(sink: OutputStream)

The response might be several GB

s4nchez

09/07/2023, 11:55 AM

The design choice there is that `Body` holds a reference to the actual content, and writing to the wire is the responsibility of the different client/server integrations. @mbonnin I'd be curious to see an example of what you're trying to achieve. We've designed APIs to deal with large payloads (both as input and output, including streaming content to/from services like s3) without issues using the current model.

mbonnin

09/07/2023, 11:58 AM

writing to the wire is the responsibility of the different client/server integrations

This implies buffering, right?

I'd be curious to see an example of what you're trying to achieve

It's mostly curiosity at this point, I'm most likely going to return JSON payloads that are no more than a few 100kBs so certainly ok. But on the other hand if I can get a few % more requests/s by avoiding an extra memcpy, it's always less costs for me and better for the planet.

s4nchez

09/07/2023, 12:04 PM

This implies buffering, right?

Yes, although the implementation may vary from client to client and from server to server. For instance, in Jetty we use a simple `input.copyTo(output)` with the default JVM buffer size to transfer the content (see `Http4kJakartaServletAdapter`).

👍 1
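Editor's note: to illustrate the point above, here is a minimal sketch of a copy loop with a fixed-size buffer, using only the JDK and the Kotlin standard library. The `transfer` function is a hypothetical stand-in for a server integration, not http4k code. What it shows is that the "buffering" involved is one small reusable buffer, so memory use should not grow with the payload size.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.InputStream
import java.io.OutputStream

// Hypothetical stand-in for a server integration: pull bytes from the body's
// InputStream and push them to the connection's OutputStream.
// Returns the number of bytes copied.
fun transfer(body: InputStream, wire: OutputStream): Long =
    body.use { input ->
        // copyTo reuses a single buffer of DEFAULT_BUFFER_SIZE bytes (8 KiB in kotlin.io),
        // so only one chunk of the payload is held in memory at a time.
        input.copyTo(wire, bufferSize = DEFAULT_BUFFER_SIZE)
    }
```

Calling `transfer(ByteArrayInputStream(payload), ByteArrayOutputStream())` copies the whole payload in 8 KiB chunks; with a file or socket stream on either side the same loop applies.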

Jordan Stewart

09/07/2023, 9:57 PM

It would have to be `body(wire: (OutputStream) -> Unit, length: Long?)`, right? The output stream is provided by the underlying HTTP server and isn't available at the point you construct the response. That would be more efficient, but it would also mean losing the "it's just data" aspect. An InputStream already takes this away a little, but you can at least buffer it if you want to read it twice, whereas a function could give you a different answer the second time. If you're not against starting a new thread, you could do something like this, which uses only a minimal buffer:

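...
    // `in` is a reserved word in Kotlin, so the pipe ends are named input/output
    val input = PipedInputStream()
    val output = PipedOutputStream(input)
    thread {
        // closes the pipe when generateResponse returns or throws
        output.use { generateResponse(it) }
    }
    return Response(OK).body(StreamBody(input))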

👍 3
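Editor's note: the piped-stream idea above can be tried without http4k. The sketch below uses only JDK classes; `pipedBody` is a hypothetical helper name, and the producer lambda plays the role of `generateResponse`. It turns a push-style producer, `(OutputStream) -> Unit`, into a pull-style `InputStream` that could then be handed to something like `StreamBody`.

```kotlin
import java.io.InputStream
import java.io.OutputStream
import java.io.PipedInputStream
import java.io.PipedOutputStream
import kotlin.concurrent.thread

// Hypothetical helper: exposes a push-style producer as a pull-style InputStream.
fun pipedBody(produce: (OutputStream) -> Unit): InputStream {
    // The pipe's internal buffer is small (1 KiB by default in the JDK),
    // so the writer blocks until the reader catches up.
    val input = PipedInputStream()
    val output = PipedOutputStream(input)
    // One dedicated writer thread per stream.
    thread(name = "body-writer") {
        // use {} closes the write end once the producer returns.
        output.use { produce(it) }
    }
    return input
}
```

Reading the returned stream from the calling thread, for example with `readBytes()`, drains whatever the producer writes, while the writer never gets more than the pipe's buffer ahead of the reader.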

mbonnin

09/08/2023, 10:03 AM

body(wire: (OutputStream) -> Unit, length: Long?)

Right, good call 👍

mbonnin

09/08/2023, 10:08 AM

if you're not against starting a new thread

Sounds like there's no way around threads. How would you collect them? Use daemon threads and just let them die? I guess I could also launch a global coroutine or something.

mbonnin

09/08/2023, 10:09 AM

In that specific case, though, unless the response becomes huge (several MB), buffering everything sounds like the solution.

👍 1
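Editor's note: for comparison, a minimal sketch of the "buffer everything" option, again JDK only. `bufferedBody` is a hypothetical helper, not part of http4k. The producer writes into an in-memory sink, and the result comes back as an `InputStream` together with its exact length, which is the kind of value the `length: Long?` parameter from the original question takes.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.InputStream
import java.io.OutputStream

// Hypothetical helper: run the producer against an in-memory sink, then expose
// the buffered bytes as an InputStream plus their exact length.
fun bufferedBody(produce: (OutputStream) -> Unit): Pair<InputStream, Long> {
    val buffer = ByteArrayOutputStream()
    produce(buffer)                      // the whole payload is held in memory here
    val bytes = buffer.toByteArray()     // note: this makes one more copy of the payload
    return ByteArrayInputStream(bytes) to bytes.size.toLong()
}
```

This needs no extra thread, at the cost of holding the full payload in memory, which seems a reasonable trade for payloads of a few hundred kB.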

mbonnin

09/08/2023, 10:10 AM

I'll do this for now 👍 One day I might hack together a benchmark to get a feeling of what we could gain from response streaming

💯 1

Mikael Ståldal

09/08/2023, 3:33 PM

You should have a thread pool and reuse the threads.

👍 1

Jordan Stewart

09/09/2023, 6:30 PM

If you do go down the route of using piped streams, please be careful about using thread pools or coroutines. Piped streams use the death of the writer thread for failure detection: if it dies without closing the stream, the reader knows something's gone wrong and fails too. So I would generally just create a new thread for each stream. If you are doing this, the overhead of a single thread should be minimal compared to the data transfer costs, and it's not worth doing something this complex for small response bodies 🤷

👍 1

💙 1
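Editor's note: the failure detection described above can be observed with plain JDK piped streams. In this sketch (all names are hypothetical) the writer runs on its own thread, writes part of a payload, then crashes without closing the pipe. The reader is expected to get an `IOException` rather than a clean end-of-stream. Detection may take a moment, since the reader only notices once the writer thread is no longer alive.

```kotlin
import java.io.IOException
import java.io.PipedInputStream
import java.io.PipedOutputStream
import kotlin.concurrent.thread

// Returns true when the reader notices that the writer thread died
// without closing its end of the pipe.
fun readerDetectsDeadWriter(): Boolean {
    val input = PipedInputStream()
    val output = PipedOutputStream(input)
    val writer = thread(start = false, name = "doomed-writer") {
        output.write("partial".toByteArray())
        // Simulated crash: the thread ends here and the pipe is never closed.
        throw IllegalStateException("simulated failure before close()")
    }
    // Keep the simulated crash from printing a stack trace.
    writer.uncaughtExceptionHandler = Thread.UncaughtExceptionHandler { _, _ -> }
    writer.start()
    return try {
        input.readBytes()   // reads "partial", then waits for more data
        false               // a clean end-of-stream would land here
    } catch (e: IOException) {
        true                // the pipe reported the dead writer
    }
}
```

With a pooled thread, the worker would stay alive after the task failed, so this liveness check would not fire, which is the hazard the message above warns about.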
