That's a bit of a hard thing to answer without knowing a decent amount more about your setup! Are you in k8s, your own DC, Heroku, Fargate? How beefy are the boxes - cores/memory, or request/CPU limits? Number of nodes? Can you horizontally scale and not worry about it if the load is just bursty? What's the model of data access (caches/DB/network)? If a DB, are you using write and read replicas? How beefy is that DB - how many connections can it handle?
And so on 🙃. It's all a bit of a juggling act: if you pull one lever over here, then something over there will react. It's no good turning on the traffic hose and killing your DB for every node.
Servers like Undertow use worker pools that are sized directly from the number of cores, so they're a good fit for general use without modification.
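As a rough sketch of that sizing scheme: Undertow's documented defaults derive I/O threads from the core count and worker threads from the I/O thread count. The class and method names below are my own illustration, not Undertow's API; the multipliers mirror its documented defaults.

```java
// Hedged sketch of core-based thread pool sizing in the style of
// Undertow's defaults: I/O threads = max(cores, 2), worker threads =
// I/O threads * 8. Names here are illustrative, not Undertow's API.
public class CoreBasedSizing {

    // At least 2 I/O threads even on a single-core box.
    public static int ioThreads(int cores) {
        return Math.max(cores, 2);
    }

    // Blocking worker pool scaled off the I/O pool.
    public static int workerThreads(int cores) {
        return ioThreads(cores) * 8;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("I/O threads:    " + ioThreads(cores));
        System.out.println("Worker threads: " + workerThreads(cores));
    }
}
```

The point being: because the pools track the hardware, the same config behaves sensibly on a 2-core pod and a 32-core bare-metal box without hand-tuning.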
I tend to believe that if you have "those problems", then you'll know, because your observability should show you - and you have to measure for your scenario and no one else's. Until then, engineers should concentrate on building stuff that will make the business more money 🙃
Or to put it another way: engineers are expensive, compute is cheap 🙃