That's a bit of a hard thing to answer without knowing a decent amount more about your setup! Are you in k8s, your own DC, Heroku, Fargate? How beefy are the boxes - cores/memory, or request/CPU limits? Number of nodes? Can you horizontally scale and not worry about it if the load is just bursty? What's the model of data access (caches/DB/network)? If a DB, are you using write and read replicas? How beefy is that DB - how many connections can it handle?
And so on 🙃. It's all a bit of a juggling act: if you pull one lever over here, then something over there will react. It's no good turning on the traffic hose and killing your DB for every node.
Servers like Undertow use worker pools that are sized directly from the number of cores, so they're a good fit for general use without modification.
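As a rough sketch of that sizing scheme: Undertow's documented defaults derive I/O threads from the core count and worker threads from the I/O thread count. The class and method names below are my own illustration, not Undertow's API; the multipliers mirror its documented defaults.

```java
// Hedged sketch of core-based thread pool sizing in the style of
// Undertow's defaults: I/O threads = max(cores, 2), worker threads =
// I/O threads * 8. Names here are illustrative, not Undertow's API.
public class CoreBasedSizing {

    // At least 2 I/O threads even on a single-core box.
    public static int ioThreads(int cores) {
        return Math.max(cores, 2);
    }

    // Blocking worker pool scaled off the I/O pool.
    public static int workerThreads(int cores) {
        return ioThreads(cores) * 8;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("I/O threads:    " + ioThreads(cores));
        System.out.println("Worker threads: " + workerThreads(cores));
    }
}
```

The point being: because the pools track the hardware, the same config behaves sensibly on a 2-core pod and a 32-core bare-metal box without hand-tuning.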
I tend to believe that if you have "those problems", then you'll know, because your observability should show you - and you have to measure for your scenario and no one else's. Until then, engineers should concentrate on building stuff that will make the business more money 🙃
Or to put it another way: engineers are expensive, compute is cheap 🙃