# http4k
m
Somewhat philosophical question: why is it
body(body: InputStream, length: Long?)
and not
body(wire: OutputStream, length: Long?)
? My body produces bytes, so I would expect to be given something I can write these bytes into (a socket?). I can buffer, but it feels somewhat unneeded?
f
If all you have is an array of bytes, then a straight-forward way is to do:
val bytes = ....
response.body(Body(bytes.asByteBuffer()))
Or maybe I've misunderstood the question 😄
m
I want to avoid allocating the array of bytes in memory
the API I'm working with (or more exactly currently designing) takes a "sink":
generateResponse(sink: OutputStream)
The response might be several GB
s
The design choice there is that
Body
holds a reference to the actual content, and writing to the wire is the responsibility of the different client/server integrations. @mbonnin I'd be curious to see an example of what you're trying to achieve. We've designed APIs to deal with large payloads (both as input and output, including streaming content to/from services like s3) without issues using the current model.
m
writing to the wire is the responsibility of the different client/server integrations
This implies buffering, right?
I'd be curious to see an example of what you're trying to achieve
It's mostly curiosity at this point. I'm most likely going to return JSON payloads of no more than a few hundred kB, so that's certainly OK. But on the other hand, if I can get a few % more requests/s by avoiding an extra memcpy, that's lower cost for me and better for the planet.
s
This implies buffering, right?
Yes, although the implementation may vary from client to client and from server to server. For instance, in
Jetty
we use a simple
input.copyTo(output)
with the default JVM buffer size to transfer the content (see
Http4kJakartaServletAdapter
)
👍 1
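To make the `input.copyTo(output)` remark concrete, here is a minimal sketch of a buffered copy loop using only the JDK and the Kotlin standard library. `transfer` is a hypothetical helper written for illustration, not http4k's or Jetty's code; it only shows why the extra memory stays at one buffer regardless of body size.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.InputStream
import java.io.OutputStream

// Hypothetical helper: moves bytes from input to output through one reusable buffer,
// so memory use is bounded by bufferSize even for a multi-GB body.
fun transfer(input: InputStream, output: OutputStream, bufferSize: Int = DEFAULT_BUFFER_SIZE): Long {
    val buffer = ByteArray(bufferSize)
    var total = 0L
    while (true) {
        val read = input.read(buffer)   // returns -1 at end of stream
        if (read < 0) break
        output.write(buffer, 0, read)   // one copy out of the buffer per chunk
        total += read
    }
    return total
}
```

Each chunk is copied once into the buffer and once out of it, which is the "extra memcpy" discussed above; the body as a whole is never held in memory.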
j
it would have to be
body(wire: (OutputStream) -> Unit, length: Long?)
right? (the output stream is provided by the underlying http server, and isn't available at the point you construct the response)
that would be more efficient, but it would also mean that the "it's just data" aspect would be lost (I mean, an InputStream already takes this away a little bit, but you can at least buffer that if you're interested in reading it twice. A function could give you a different answer the second time...)
if you're not against starting a new thread you could do something like this, which uses only a minimal buffer:
...
    // `in` is a reserved word in Kotlin, so the reader side is named `input`
    val input = PipedInputStream()
    val out = PipedOutputStream(input)
    thread {
        out.use(generateResponse)
    }
    return Response(OK).body(StreamBody(input))
👍 3
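A self-contained version of the piped-stream idea above, as a sketch using only `java.io` and `kotlin.concurrent.thread`. Both `generateResponse` (a stand-in for the producer mentioned earlier in the thread) and `streamOf` are hypothetical names introduced for illustration.

```kotlin
import java.io.InputStream
import java.io.OutputStream
import java.io.PipedInputStream
import java.io.PipedOutputStream
import kotlin.concurrent.thread

// Hypothetical producer standing in for generateResponse(sink: OutputStream).
fun generateResponse(sink: OutputStream) {
    repeat(1_000) { i -> sink.write("line $i\n".toByteArray()) }
}

// Runs the producer on its own thread and exposes whatever it writes as an InputStream.
// The pipe holds only a small internal buffer, so the writer blocks until the reader catches up.
fun streamOf(producer: (OutputStream) -> Unit): InputStream {
    val input = PipedInputStream()
    val output = PipedOutputStream(input)
    thread(isDaemon = true) {
        // use { } closes the writer side when the producer returns or throws
        output.use(producer)
    }
    return input
}
```

The stream returned by `streamOf(::generateResponse)` is what would then be handed to the `body(body: InputStream, length: Long?)` overload quoted at the top of the thread.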
m
body(wire: (OutputStream) -> Unit, length: Long?)
Right, good call 👍
if you're not against starting a new thread
Sounds like there's no way around threads. How would you collect them? Use daemon threads and just let them die? I guess I could also launch a global coroutine or something similar
In that specific case though unless the response becomes huge (several MB), buffering everything sounds like the solution
👍 1
I'll do this for now 👍 One day I might hack together a benchmark to get a feeling of what we could gain from response streaming
💯 1
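For comparison, the "buffer everything" option can be sketched as below. `buffered` is a hypothetical helper, and the producer parameter stands in for `generateResponse(sink: OutputStream)`; the point is that the whole payload lives in memory once (twice briefly, because `toByteArray()` copies), in exchange for knowing the length up front and needing no extra thread.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.OutputStream

// Hypothetical helper: lets the producer write into an in-memory sink, then returns
// a replayable stream together with the exact content length.
fun buffered(producer: (OutputStream) -> Unit): Pair<ByteArrayInputStream, Long> {
    val buffer = ByteArrayOutputStream()
    producer(buffer)                      // the whole response is materialised here
    val bytes = buffer.toByteArray()      // copies the internal array
    return ByteArrayInputStream(bytes) to bytes.size.toLong()
}
```

For payloads of a few hundred kB this is likely the simplest choice; the stream and length map directly onto the two parameters of `body(body: InputStream, length: Long?)`.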
m
You should have a thread pool and reuse the threads.
👍 1
j
If you do go down the route of using piped streams, please be careful about using thread pools or coroutines: piped streams use the death of the writer thread for failure detection (if it dies without closing the stream, then the reader knows something’s gone wrong and fails too), so I would generally just create a new thread for each stream (if you are doing this, then the overhead of a single thread should be minimal compared to the data transfer costs; not worth doing something this complex for small response bodies 🤷 )
👍 1
💙 1
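The failure-detection behaviour described above can be observed with a small sketch using only JDK classes. `readerDetectsDeadWriter` is a hypothetical name; the writer thread here simply returns without closing its end, which stands in for a producer that died part-way through.

```kotlin
import java.io.IOException
import java.io.PipedInputStream
import java.io.PipedOutputStream
import kotlin.concurrent.thread

// Returns true if the reader gets an IOException once the writer thread has ended
// without closing its side of the pipe.
fun readerDetectsDeadWriter(): Boolean {
    val input = PipedInputStream()
    val output = PipedOutputStream(input)
    // Dedicated writer thread: writes a few bytes, then ends WITHOUT calling close().
    val writer = thread {
        output.write(byteArrayOf(1, 2, 3))
    }
    writer.join()
    return try {
        input.readBytes()   // drains the 3 buffered bytes, then notices the dead writer
        false
    } catch (e: IOException) {
        true
    }
}
```

With a pooled thread the writer thread would still be alive after the task finished, so the reader would presumably keep waiting instead of failing, which is the reason given above for using one fresh thread per stream.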