Adam Miskiewicz (10/22/2020, 4:56 PM)
At Airbnb, we use Kotlin and coroutines in some of our server-side code. The workload for each request is fully async; there's no blocking code that runs during the life of the request. For the most part, things work just fine, as you'd expect. However, as we load test these services and bring them to a breaking point, we see latency skyrocket, queuing start to occur at the web server layer (we use Dropwizard and non-blocking Jetty APIs, so under normal operation there is no queuing at all), and eventually heap exhaustion as all of this work queues up and requests never complete. Of course, some of this behavior is expected: every service has its breaking point 🙂. In our Java services, however, which have a more mature service framework, we use standard patterns such as CoDel (controlled delay, https://en.wikipedia.org/wiki/CoDel) and adaptive queuing to perform load shedding, ensuring that even under heavy load the service may degrade but never gets into an unrecoverable state. This has worked well for us, and I'd like to do similar things in our Kotlin services that use coroutines. Does anyone have insight into how to apply a load-shedding algorithm (bulkheading, controlled delay, etc.) to coroutines, while still leveraging the default work-stealing dispatcher?
`CoroutineScheduler` doesn't expose any public APIs to get its current "state", so there's no way to get much insight into the state of the default dispatcher. TL;DR: I want to figure out how to structure my application such that, when we're saturating the CPU (and thus all the default-dispatcher threads are busy doing "stuff"), I can start to apply backpressure and reject some requests, rather than continuing to accept hundreds or thousands of requests and letting the server become completely unresponsive. Notably, I want a mechanism that doesn't require a ton of per-service tuning; I don't want to do this strictly with a "request rate limit" or something like that. Ideally it would be much more adaptive.
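One of the simplest shedding primitives mentioned above, bulkheading, can be sketched with kotlinx.coroutines' `Semaphore`; the class name and limit below are placeholders, not anything Airbnb uses:

```kotlin
import kotlinx.coroutines.sync.Semaphore

// Cap concurrent work and reject instead of queueing when full.
class Bulkhead(maxConcurrent: Int = 256) {
    private val permits = Semaphore(maxConcurrent)

    /** Runs [block] if capacity allows, or returns null so the caller can shed. */
    suspend fun <T> withPermitOrNull(block: suspend () -> T): T? {
        if (!permits.tryAcquire()) return null // shed: e.g. map to HTTP 503
        try {
            return block()
        } finally {
            permits.release()
        }
    }
}
```

A fixed limit like this is exactly the kind of per-service magic number the rest of the thread tries to avoid, which is why the discussion moves toward adaptive signals.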

louiscad (10/22/2020, 7:02 PM)
Do you have any idea regarding the API surface you would want to use for your use case?

Adam Miskiewicz (10/22/2020, 7:05 PM)
I think I'd probably want some insight into the queue size inside `CoroutineScheduler` and whether or not it's "filling up" (e.g., by looking at how quickly things are leaving the queue)
but honestly i’m not sure
open to ideas

louiscad (10/22/2020, 7:06 PM)
What would be the unit of "how quickly things are leaving the queue"?
I think you could make your own `CoroutineDispatcher` wrapping `Dispatchers.Default` and adding enter/exit listeners (with `try`/`finally`). With that, you could experiment with the API and its usage and report back on how it's going, so we can see whether it could be a good addition or inspiration for first-party support in kotlinx.coroutines.
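A minimal sketch of such a wrapping dispatcher; the class and property names are made up, and it only measures dispatch-to-run delay rather than the scheduler's internal queues:

```kotlin
import java.util.concurrent.atomic.AtomicInteger
import java.util.concurrent.atomic.AtomicLong
import kotlin.coroutines.CoroutineContext
import kotlinx.coroutines.CoroutineDispatcher
import kotlinx.coroutines.Dispatchers

class InstrumentedDispatcher(
    private val delegate: CoroutineDispatcher = Dispatchers.Default,
) : CoroutineDispatcher() {
    private val inFlight = AtomicInteger(0)
    private val lastQueueDelayNanos = AtomicLong(0)

    val pendingOrRunning: Int get() = inFlight.get()
    val lastQueueDelayMillis: Long get() = lastQueueDelayNanos.get() / 1_000_000

    override fun dispatch(context: CoroutineContext, block: Runnable) {
        val enqueuedAt = System.nanoTime()
        inFlight.incrementAndGet()
        delegate.dispatch(context, Runnable {
            // "enter": time between dispatch and run approximates queue delay
            lastQueueDelayNanos.set(System.nanoTime() - enqueuedAt)
            try {
                block.run()
            } finally {
                inFlight.decrementAndGet() // "exit" listener
            }
        })
    }
}
```

Sampling `lastQueueDelayMillis` periodically would give a concrete unit for "how quickly things are leaving the queue".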

Adam Miskiewicz (10/22/2020, 7:10 PM)
If we wanted to implement something similar to the CoDel mechanism I mentioned, you'd basically have two timeouts on how long something may sit in the queue before actually running: a steady-state timeout, and a stricter "overloaded" timeout. Once you know you're starting to be overloaded (because things aren't being taken off the queue within the steady-state timeout), you switch to the stricter timeout and cancel tasks that wait longer than that to be scheduled, so as to shrink the queue and shed the load.
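That two-timeout policy can be sketched as plain state, independent of any dispatcher; the thresholds and names below are illustrative placeholders, not tuned values:

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

class CodelStylePolicy(
    private val targetDelayMillis: Long = 5,     // steady-state budget
    private val overloadedDelayMillis: Long = 1, // stricter budget once overloaded
) {
    private val overloaded = AtomicBoolean(false)

    /** Returns true if a task that waited [queueDelayMillis] should still run. */
    fun admit(queueDelayMillis: Long): Boolean =
        if (overloaded.get()) {
            if (queueDelayMillis <= overloadedDelayMillis) {
                overloaded.set(false) // queue is draining quickly again
                true
            } else {
                false // shed: cancel instead of running
            }
        } else {
            if (queueDelayMillis <= targetDelayMillis) {
                true
            } else {
                overloaded.set(true) // delay exceeded the steady-state budget
                false
            }
        }
}
```

Real CoDel tracks the minimum delay over a sliding interval rather than reacting to single samples, so this is only the skeleton of the idea.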
So I actually thought about this dispatcher approach. When you say `try`/`finally`, you mean around the runnable itself, right? Like wrap the runnable to know when it started and ended?

louiscad (10/22/2020, 7:15 PM)
Yes, but now that I've read about your use case, I don't think you want to do it at the dispatcher level. That's too low-level, and not running queued runnables in the default dispatcher might (or rather would) cause issues that prevent proper operation of the program.
How is the machine overloaded? RAM because of too many in-flight tasks, CPU because of computations happening on that machine, blocking I/O like file reads/writes (which should run on `Dispatchers.IO`), or something else?

Adam Miskiewicz (10/22/2020, 7:20 PM)
Not blocking I/O; there's no blocking I/O in these services. CPU saturation happens first: we can saturate the CPU with X requests. As the CPU is saturated, as you'd expect, we see higher latency. We run in Kubernetes, so eventually the CPU starts to get heavily throttled. As that happens, the service starts really misbehaving, and requests take longer and longer. The service uses Y amount of heap per request, so if you end up with a ton of in-flight requests that never complete, you eventually hit heap exhaustion.
This isn't terribly unique to our services, of course; I think this pattern is common for most services whose workloads are CPU-bound.
The absolute simplest approach is to just rate-limit the number of requests coming into the service at some fixed level,
but that has obvious downsides,
and in the case of the server I have in mind, it's a GraphQL server, so each request differs considerably from the next. One query might fetch 10 fields; another might fetch 1,000 fields and run some business logic.

louiscad (10/22/2020, 7:23 PM)
I understand why you want to limit the requests dynamically instead of setting a magic number.

Adam Miskiewicz (10/22/2020, 7:23 PM)
yep
(just being explicit 😛 haha)

louiscad (10/22/2020, 7:23 PM)
Or verbose? 😜

Adam Miskiewicz (10/22/2020, 7:24 PM)
that too
🙂

louiscad (10/22/2020, 7:24 PM)
😁
I think you need a wrapper for these requests, so it's agnostic to the number of suspension points. You can have it hold a pair of these timeouts if they depend on the work requested. Then you have a cancellable coroutine that waits for the first timeout to pass. If the work finishes first, that coroutine is cancelled while suspended on `delay(…)`; otherwise, it signals that the work is slow, so other requests can be throttled, or whatever is needed is done to avoid making the situation worse. After signaling, the work is still allowed to run until the second timeout (mind the subtraction of the first timeout if needed). If the second delay completes without being cancelled by the work's completion, it cancels the work (and also sends a signal somewhere, if needed).
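A minimal sketch of that wrapper using only kotlinx.coroutines primitives; the function name, exception type, and parameters are made up:

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.async
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.withTimeoutOrNull

class LoadShedException(message: String) : RuntimeException(message)

suspend fun <T> withTwoPhaseTimeout(
    steadyMillis: Long,
    totalMillis: Long, // overall budget, must exceed steadyMillis
    onSlow: () -> Unit = {},
    work: suspend CoroutineScope.() -> T,
): T = coroutineScope {
    val job = async { work() }
    // Phase 1: the steady-state window. join() sidesteps nullability of T.
    withTimeoutOrNull(steadyMillis) { job.join() }
    if (!job.isCompleted) {
        onSlow() // signal overload so other requests can be throttled
        // Phase 2: only the remaining, stricter budget (the subtraction above).
        withTimeoutOrNull(totalMillis - steadyMillis) { job.join() }
        if (!job.isCompleted) {
            job.cancel()
            throw LoadShedException("work exceeded ${totalMillis}ms; shedding")
        }
    }
    job.await()
}
```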

Adam Miskiewicz (10/22/2020, 7:33 PM)
Yeah, this makes sense. One worry I have with this is that a request could take a long time not because the service is overloaded, but because some downstream service has elevated latency.

louiscad (10/22/2020, 7:40 PM)
Here's an example, if you were to launch these in a fire-and-forget fashion (beyond knowing the work took longer than expected). You can make a suspending version that wraps only the parts involving CPU work and no external service calls. You could also wrap CPU operations differently from external-service operations. And you could throw or return whatever you want in the coroutine racing for the expected execution-time window plus the allowed extra time.
BTW, `raceOf` is from my open source library #splitties; you can find it documented here: https://github.com/LouisCAD/Splitties/tree/main/modules/coroutines#racing-coroutines
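The snippet itself didn't survive this export; a sketch of the fire-and-forget racing pattern described, assuming splitties' `raceOf` (the first racer to finish wins and the other is cancelled) and hypothetical `onSlow`/`onShed` callbacks:

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import splitties.coroutines.raceOf

fun CoroutineScope.launchShedding(
    expectedMillis: Long,
    extraMillis: Long,
    onSlow: () -> Unit,
    onShed: () -> Unit,
    work: suspend () -> Unit,
) = launch {
    raceOf({
        work() // finishing here cancels the timing racer mid-delay
    }, {
        delay(expectedMillis)
        onSlow() // the work exceeded its expected window; start throttling
        delay(extraMillis)
        onShed() // winning the race here cancels the still-running work
    })
}
```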

elizarov (10/23/2020, 7:37 AM)
Please drop an issue at http://github.com/kotlin/kotlinx.coroutines/issues with your use case. Internally, the Default dispatcher implementation actually knows how long requests have been waiting in the queue (it keeps a timestamp for each task for its own needs), so it could expose this information in some sort of API. In the near future we plan to work on the design of an API for "sliceable" dispatchers, and this new API might be a good entry point to also expose additional "dispatcher load" information to the outside world.

Jilles van Gurp (10/23/2020, 9:48 AM)
I'd recommend exposing some telemetry from your server and looking at the metrics instead of making assumptions and drawing conclusions about where you think the bottlenecks are. It might be as simple as spinning up more servers. If you are not maxing out the CPU, there are usually two explanations: 1) you are blocking on something. I once profiled a server and found our DNS lookups were taking hundreds of milliseconds and weren't cached; the solution was to use a cloud-provided DNS. 2) You are saturating disk or network I/O somehow. Bottom line: profile and measure where the time is spent before you start tinkering with your architecture. Maybe do some load and stress testing, and autoscale when you hit certain connection thresholds.

kenkyee (10/23/2020, 11:17 AM)
I'd also ask whether you should be using k8s or some service-mesh health monitoring so traffic isn't sent to services that are slowing down. You can expose queue length or response time as a health metric. I'd echo what Jilles said as well.

Adam Miskiewicz (10/23/2020, 4:29 PM)
@Jilles van Gurp @kenkyee thanks for your thoughts here. Rest assured that a) we've been doing a bunch of load and stress testing to figure out where the limits are, and b) we do use Kubernetes and a service mesh with Envoy, so we can do some of this at the mesh layer. We're moving to Istio at Airbnb over the course of the next year or so, and as we do, it unlocks a few more advanced Envoy features that can help with this problem and move it out of the application.
@louiscad thanks for your thoughts! This approach inspired me to do more research, specifically around measuring latency and in-flight requests, and I stumbled upon this Netflix blog post: https://medium.com/@NetflixTechBlog/performance-under-load-3e6fa9a60581 and this repo: https://github.com/Netflix/concurrency-limits. I implemented this in a service last night, and it's been pretty successful. At a high level, after thinking about it, I like this approach better than anything that ties deeper into the dispatcher. While I do think it would be good to expose an API from the dispatcher to tell whether it's healthy (and perhaps use that as an input to load shedding), leveraging the latency of a particular workload as the primary input to a resilience mechanism makes a ton of sense.
👌 1
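The core idea behind that library, adapting a concurrency limit from observed latency, can be sketched in a few lines; this illustrates the AIMD (additive-increase/multiplicative-decrease) flavor and is not the concurrency-limits API:

```kotlin
import java.util.concurrent.atomic.AtomicInteger

class AimdLimiter(
    initialLimit: Int = 20,
    private val maxLimit: Int = 200,
    private val latencyThresholdMillis: Long = 100, // placeholder threshold
) {
    @Volatile private var limit = initialLimit
    private val inFlight = AtomicInteger(0)

    /** Try to admit a request; false means shed it (e.g. respond 503). */
    fun tryAcquire(): Boolean {
        while (true) {
            val current = inFlight.get()
            if (current >= limit) return false
            if (inFlight.compareAndSet(current, current + 1)) return true
        }
    }

    /** Record a completed request's latency and adapt the limit. */
    @Synchronized
    fun onComplete(latencyMillis: Long) {
        inFlight.decrementAndGet()
        limit = if (latencyMillis > latencyThresholdMillis) {
            maxOf(1, (limit * 9) / 10) // multiplicative decrease under load
        } else {
            minOf(maxLimit, limit + 1) // additive increase to probe capacity
        }
    }
}
```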
@elizarov I'd love to know more about this "sliceable" dispatchers concept! Is there anything written publicly about it yet?