Hi all I m trying to find the best practice for uploading la kotlinlang #ktor

Dzmitry Neviadomski

08/08/2025, 12:47 PM

Hi all, I'm trying to find the best practice for uploading large files via `multipart/form-data` with `HttpClient` without running out of memory. I've summarized all the different methods I could find as of Ktor `3.2.3` in the first message in the thread. The official documentation primarily highlights the first method (using a `ByteArray`), which seems unsuitable for large files due to its high memory consumption. Of the streaming options I've listed, which is considered the most reliable and efficient? Also, would it be helpful to file a documentation issue to add examples for this use case? Any insights would be greatly appreciated. Thanks!

🧵 1

Dzmitry Neviadomski

08/08/2025, 1:14 PM

Code sample:

import io.ktor.client.HttpClient
import io.ktor.client.request.forms.ChannelProvider
import io.ktor.client.request.forms.InputProvider
import io.ktor.client.request.forms.append
import io.ktor.client.request.forms.formData
import io.ktor.client.request.forms.submitFormWithBinaryData
import io.ktor.client.statement.bodyAsText
import io.ktor.http.HttpHeaders
import io.ktor.http.escapeIfNeeded
import io.ktor.http.headers
import io.ktor.http.isSuccess
import io.ktor.util.cio.readChannel
import io.ktor.utils.io.ByteReadChannel
import kotlinx.coroutines.runBlocking
import kotlinx.io.asSource
import kotlinx.io.buffered
import java.io.File
import kotlin.use

private const val FILE_KEY = "file"

fun main() = runBlocking {
    val client = HttpClient()
    val url = "http://example.com/api/v1/multiPartUploadPath" // Some external API.
    val uploadFilePath = "/path/to/file/upload" // Some huge (500+ MB) artifact to upload.

    // Explicit part size calculation is omitted for the sake of simplicity of the examples.
    val response = client.submitFormWithBinaryData(
        url = url,
        formData =  formData {
            // 1. With ByteArray, from the Ktor documentation: https://ktor.io/docs/client-requests.html#upload_file
            // Loads the whole file into a ByteArray, then copies it into a Buffer; can be slow and lead to OOMs.
            // Internals: https://github.com/ktorio/ktor/blob/3.2.3/ktor-client/ktor-client-core/common/src/io/ktor/client/request/forms/formDsl.kt#L50
            append(
                key = FILE_KEY,
                // Can be done in KMP with Kotlinx IO like this:
                // value = SystemFileSystem.source(Path(uploadFilePath)).buffered().readByteArray(),
                value = File(uploadFilePath).readBytes(),
                headers = headers {
                    append(HttpHeaders.ContentDisposition, "filename=${uploadFilePath.escapeIfNeeded()}")
                }
            )


            // 2. With InputProvider of Source
            // Loads content as needed, but possibly blocks under the hood when new bytes are requested.
            // Internals: https://github.com/ktorio/ktor/blob/3.2.3/ktor-client/ktor-client-core/common/src/io/ktor/client/request/forms/formDsl.kt#L63
            append(
                key = FILE_KEY,
                // Can be done in KMP with Kotlinx IO like this:
                // value = InputProvider { SystemFileSystem.source(Path(uploadFilePath)).buffered() },
                value = InputProvider { File(uploadFilePath).inputStream().asSource().buffered() },
                headers = headers {
                    append(HttpHeaders.ContentDisposition, "filename=${uploadFilePath.escapeIfNeeded()}")
                }
            )
            // Or the same (InputProvider of Source is built inside .appendInput(...))
            appendInput(
                key = FILE_KEY,
                headers = headers {
                    append(HttpHeaders.ContentDisposition, "filename=${uploadFilePath.escapeIfNeeded()}")
                }
            ) {
                // Can be done in KMP with Kotlinx IO like this:
                // SystemFileSystem.source(Path(uploadFilePath)).buffered()
                File(uploadFilePath).inputStream().asSource().buffered()
            }


            // 3. With Source directly
            // Loads content as needed, but possibly blocks under the hood when new bytes are requested.
            // Internals: https://github.com/ktorio/ktor/blob/3.2.3/ktor-client/ktor-client-core/common/src/io/ktor/client/request/forms/formDsl.kt#L56
            append(
                key = FILE_KEY,
                // Can be done in KMP with Kotlinx IO like this:
                // value = SystemFileSystem.source(Path(uploadFilePath)).buffered(),
                value = File(uploadFilePath).inputStream().asSource().buffered(),
                headers = headers {
                    append(HttpHeaders.ContentDisposition, "filename=${uploadFilePath.escapeIfNeeded()}")
                }
            )


            // 4. With ChannelProvider of ByteReadChannel of Source
            // Loads content as needed, but possibly blocks under the hood when new bytes are requested.
            // Internals: https://github.com/ktorio/ktor/blob/3.2.3/ktor-client/ktor-client-core/common/src/io/ktor/client/request/forms/formDsl.kt#L70
            append(
                key = FILE_KEY,
                // Can be done in KMP with Kotlinx IO like this:
                // ChannelProvider { ByteReadChannel(SystemFileSystem.source(Path(uploadFilePath)).buffered()) }
                value = ChannelProvider { ByteReadChannel(File(uploadFilePath).inputStream().asSource().buffered()) },
                headers = headers {
                    append(HttpHeaders.ContentDisposition, "filename=${uploadFilePath.escapeIfNeeded()}")
                }
            )


            // 5. With ChannelProvider of File.readChannel(...): ByteReadChannel
            // Loads content as needed, but possibly blocks under the hood when new bytes are requested.
            // Internals: https://github.com/ktorio/ktor/blob/3.2.3/ktor-client/ktor-client-core/common/src/io/ktor/client/request/forms/formDsl.kt#L70
            append(
                key = FILE_KEY,
                // Cannot be done in KMP, as there's no alternative to readChannel(...) for kotlinx.io.files.Path
                value = ChannelProvider { File(uploadFilePath).readChannel() },
                headers = headers {
                    append(HttpHeaders.ContentDisposition, "filename=${uploadFilePath.escapeIfNeeded()}")
                }
            )


            // 6. With convenient .append(...) of `Sink.() -> Unit` builder
            // Loads everything entirely into Buffer under the hood, can be slow and lead to OOMs.
            // Internals: https://github.com/ktorio/ktor/blob/3.2.3/ktor-client/ktor-client-core/common/src/io/ktor/client/request/forms/formDsl.kt#L231
            append(
                key = FILE_KEY,
                filename = uploadFilePath, // `Content-Disposition: filename="..."` is calculated from this conveniently.
            ) {
                // Can be done in KMP with Kotlinx IO like this:
                // SystemFileSystem.source(Path(uploadFilePath)).use { transferFrom(it) }
                File(uploadFilePath).inputStream().asSource().use { transferFrom(it) }
            }
        },
    )

    check(response.status.isSuccess())

    println(response.bodyAsText())
}

Aleksei Tirman [JB]

08/11/2025, 9:47 AM

I would go with `value = SystemFileSystem.source(Path(uploadFilePath)).buffered()` because this API can be used in KMP, and the source returned from `buffered` buffers reads from the original source, so the file will be read in chunks.

Dzmitry Neviadomski

08/11/2025, 10:07 AM

That would be number 3? I spent some time benchmarking after posting the initial question, and this is the slowest one by far.

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `curl -X POST -H "UPLOAD_FILE_PATH: /home/nevack/work/test_data/100M" "http://0.0.0.0:4040/uploadFileByMultiPart1"` | 167.3 ± 5.2 | 159.0 | 178.4 | 1.15 ± 0.04 |
| `curl -X POST -H "UPLOAD_FILE_PATH: /home/nevack/work/test_data/100M" "http://0.0.0.0:4040/uploadFileByMultiPart2"` | 148.4 ± 7.6 | 139.5 | 165.4 | 1.02 ± 0.06 |
| `curl -X POST -H "UPLOAD_FILE_PATH: /home/nevack/work/test_data/100M" "http://0.0.0.0:4040/uploadFileByMultiPart3"` | 1254.9 ± 33.9 | 1215.8 | 1341.0 | 8.59 ± 0.31 |
| `curl -X POST -H "UPLOAD_FILE_PATH: /home/nevack/work/test_data/100M" "http://0.0.0.0:4040/uploadFileByMultiPart4"` | 146.0 ± 3.4 | 140.5 | 156.0 | 1.00 |
| `curl -X POST -H "UPLOAD_FILE_PATH: /home/nevack/work/test_data/100M" "http://0.0.0.0:4040/uploadFileByMultiPart5"` | 146.1 ± 5.8 | 140.0 | 165.8 | 1.00 ± 0.05 |
| `curl -X POST -H "UPLOAD_FILE_PATH: /home/nevack/work/test_data/100M" "http://0.0.0.0:4040/uploadFileByMultiPart6"` | 174.5 ± 5.8 | 164.5 | 187.0 | 1.20 ± 0.05 |

Aleksei Tirman [JB]

08/11/2025, 10:11 AM

Yes, number 3. Does your benchmarking code use `SystemFileSystem.source(Path(uploadFilePath)).buffered()` or `File(uploadFilePath).inputStream().asSource().buffered()`?

Dzmitry Neviadomski

08/11/2025, 10:15 AM

I have just tried both, head to head, and the results are the same. The time for number 3 also grows much faster than linearly with payload size (a 10× larger file takes roughly 166× longer):

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `curl -X POST -H "UPLOAD_FILE_PATH: /home/nevack/work/test_data/1000M" "http://0.0.0.0:4040/uploadFileByMultiPart1"` | 1.621 ± 0.025 | 1.577 | 1.663 | 1.22 ± 0.02 |
| `curl -X POST -H "UPLOAD_FILE_PATH: /home/nevack/work/test_data/1000M" "http://0.0.0.0:4040/uploadFileByMultiPart2"` | 1.325 ± 0.015 | 1.307 | 1.350 | 1.00 |
| `curl -X POST -H "UPLOAD_FILE_PATH: /home/nevack/work/test_data/1000M" "http://0.0.0.0:4040/uploadFileByMultiPart3"` | 208.400 ± 19.233 | 194.784 | 257.638 | 157.25 ± 14.62 |
| `curl -X POST -H "UPLOAD_FILE_PATH: /home/nevack/work/test_data/1000M" "http://0.0.0.0:4040/uploadFileByMultiPart4"` | 1.429 ± 0.025 | 1.387 | 1.476 | 1.08 ± 0.02 |
| `curl -X POST -H "UPLOAD_FILE_PATH: /home/nevack/work/test_data/1000M" "http://0.0.0.0:4040/uploadFileByMultiPart5"` | 1.620 ± 0.359 | 1.417 | 2.536 | 1.22 ± 0.27 |
| `curl -X POST -H "UPLOAD_FILE_PATH: /home/nevack/work/test_data/1000M" "http://0.0.0.0:4040/uploadFileByMultiPart6"` | 1.956 ± 0.062 | 1.846 | 2.065 | 1.48 ± 0.05 |
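
As a rough check on how number 3 scales, the two mean timings reported for it (1254.9 ms for the 100M file, 208.400 s for the 1000M file) can be fitted to a power law. The sketch below is only arithmetic on those two figures; the function name `scalingExponent` is made up for illustration, and two data points cannot tell a polynomial from an exponential curve, so the result is an estimate and nothing more.

```kotlin
import kotlin.math.ln

// Mean timings for method 3, copied from the two benchmark tables in this thread.
const val MEAN_100M_SECONDS = 1.2549    // 1254.9 ms
const val MEAN_1000M_SECONDS = 208.400
const val SIZE_RATIO = 10.0             // 1000M / 100M

/** Exponent k such that time ~ size^k, fitted through exactly two measurements. */
fun scalingExponent(timeSmall: Double, timeLarge: Double, sizeRatio: Double): Double =
    ln(timeLarge / timeSmall) / ln(sizeRatio)

fun main() {
    val slowdown = MEAN_1000M_SECONDS / MEAN_100M_SECONDS
    val exponent = scalingExponent(MEAN_100M_SECONDS, MEAN_1000M_SECONDS, SIZE_RATIO)
    println("slowdown = $slowdown, exponent = $exponent")
}
```

The slowdown comes out at about 166× for a 10× larger file, which corresponds to an exponent of about 2.2, so closer to quadratic than to linear. For comparison, number 2 goes from 148.4 ms to 1.325 s, about 8.9×, which is roughly linear.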

Dzmitry Neviadomski

08/11/2025, 10:22 AM

I do not have proof, as I have not profiled the code yet, but my hypothesis is that `{ value.peek() }`, which is called on the passed `Source`, makes an inefficient copy of the original `Source`. `RealSource::peek` creates `PeekSource(this).buffered()`: https://github.com/Kotlin/kotlinx-io/blob/0.7.0/core/common/src/RealSource.kt#L145
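
For readers unfamiliar with `peek`, here is a minimal sketch of what it does in kotlinx-io: it returns a second `Source` that reads the same bytes without consuming them from the original. The example uses an in-memory `Buffer` instead of a file, and the helper name `peekThenRead` is hypothetical. It illustrates only the semantics; it does not measure or confirm the performance hypothesis above, which remains unprofiled.

```kotlin
import kotlinx.io.Buffer
import kotlinx.io.readString
import kotlinx.io.writeString

/**
 * Reads the same content twice: first through a peek source (non-consuming),
 * then from the original buffer (consuming).
 */
fun peekThenRead(text: String): Pair<String, String> {
    val buffer = Buffer().apply { writeString(text) }
    val peeked = buffer.peek().readString()   // does not advance `buffer`
    val consumed = buffer.readString()        // drains `buffer`
    return peeked to consumed
}

fun main() {
    val (peeked, consumed) = peekThenRead("multipart")
    println("peeked=$peeked consumed=$consumed")
}
```

Both values are equal to the input, because every byte read through the peek source is still available on the original afterwards. For a large upload this means the same data may be buffered more than once, which is consistent with the hypothesis but is not proof of it.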

Aleksei Tirman [JB]

08/11/2025, 10:31 AM

Then number 5 (`ChannelProvider { File(uploadFilePath).readChannel() }`). Which platforms do you need to support?

Dzmitry Neviadomski

08/11/2025, 10:37 AM

I build CLIs with Kotlin/Native and backends with Kotlin/JVM. I lean toward 2 or 4, as they are similar performance-wise and can be used in `common` sources. Returning to the original question: would it be helpful to file a documentation issue to add examples for multipart uploads of large files?
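
A minimal sketch of number 2 written for `common` sources, assuming Ktor 3.2.3 and kotlinx-io as used earlier in the thread. It also passes the file size to `InputProvider`, which the original sample omitted for simplicity. The helper name `streamingFilePart` is hypothetical, and using only the last path segment as the filename is a choice made for this sketch.

```kotlin
import io.ktor.client.request.forms.InputProvider
import io.ktor.client.request.forms.formData
import io.ktor.http.HttpHeaders
import io.ktor.http.content.PartData
import io.ktor.http.escapeIfNeeded
import io.ktor.http.headers
import kotlinx.io.buffered
import kotlinx.io.files.Path
import kotlinx.io.files.SystemFileSystem

/**
 * Builds a single multipart file part that is read lazily from [path].
 * The source is opened only when the provider's block is invoked, not here.
 */
fun streamingFilePart(key: String, path: Path): List<PartData> {
    // Size is optional; null if the file's metadata cannot be read.
    val size = SystemFileSystem.metadataOrNull(path)?.size
    return formData {
        append(
            key = key,
            value = InputProvider(size) { SystemFileSystem.source(path).buffered() },
            headers = headers {
                append(HttpHeaders.ContentDisposition, "filename=${path.name.escapeIfNeeded()}")
            },
        )
    }
}
```

The result can be passed as `formData = streamingFilePart(FILE_KEY, Path(uploadFilePath))` to `client.submitFormWithBinaryData(...)`. Whether the declared size changes how the request is sent depends on the engine, so treat that part as something to verify against your setup.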

Aleksei Tirman [JB]

08/11/2025, 10:39 AM

It would be greatly appreciated if you do so.

👍 1

Dzmitry Neviadomski

08/13/2025, 2:09 PM

Before filing the issue, I took a look at the Ktor documentation repo. To my surprise, I found this PR: https://github.com/ktorio/ktor-documentation/pull/659. I have no access to the issue https://youtrack.jetbrains.com/issue/KTOR-7365, but it seems this was already addressed recently.

Aleksei Tirman [JB]

08/18/2025, 11:15 AM

Yes, but it doesn't show all the different methods of adding a file part and their respective pros and cons.
