dave08 — 02/20/2023, 10:43 AM
`Runtime.getRuntime().addShutdownHook(hook)`, is that only SIGINT? It seems like in your video it was clearly using SIGINT, but in my current version it isn't...
simon.vergauwen — 02/20/2023, 10:46 AM
`ShutdownHook` also works on SIGTERM and other termination signals.
If you prefer using `addShutdownHook { }` instead of `Resource`, then also be sure to add `Thread.sleep(30_000)` to take into account the delay for the LoadBalancer/Ingress, and to manually close all your resources like `PrometheusMetricRegistry`, `HikariDataSource`, Kafka, etc.
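A minimal sketch of what simon describes above: sleep first so the LoadBalancer/Ingress stops routing to the pod, then close everything manually. The helper name, the resource list, and the default delay are illustrative, not from the thread:

```kotlin
// Sketch of the addShutdownHook approach (instead of Resource): sleep for the
// LoadBalancer/Ingress deregistration delay, then close resources yourself.
// `registerShutdownHook` is a hypothetical helper; names are illustrative.
fun registerShutdownHook(
    resources: List<AutoCloseable>, // e.g. metrics registry, HikariDataSource, Kafka clients
    delayMillis: Long = 30_000,     // the 30s from the thread; tune to your ingress
): Thread {
    val hook = Thread {
        Thread.sleep(delayMillis)              // keep serving while the LB drains traffic
        resources.asReversed().forEach { r ->  // close in reverse acquisition order
            runCatching { r.close() }          // one failing close shouldn't skip the rest
        }
    }
    // Runs on SIGTERM and SIGINT (normal JVM shutdown), but not on SIGKILL.
    Runtime.getRuntime().addShutdownHook(hook)
    return hook
}
```

This is essentially what `Resource`/SuspendApp automates for you: ordered, guaranteed release on termination.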
dave08 — 02/20/2023, 10:48 AM
`Resource` doesn't need to sleep? I mean, k8s wouldn't put up the new instance until the old one is down, so for a bunch of instances it would be another 30 sec. per service to wait...
simon.vergauwen — 02/20/2023, 10:50 AM
The `suspendapp-ktor` integration takes care of that delay, like I showed and explained in the webinar.

> I mean, k8s wouldn't put up the new instance until the old one is down

With RollingUpdates or auto-scaling the problem is that the load balancer will still send requests to your terminating pods, and without the delay that will result in 502 Bad Gateway.
You can find the full example with more details, and references to other Kubernetes resources, in the repo:
https://github.com/nomisRev/ktor-k8s-zero-downtime
dave08 — 02/20/2023, 10:51 AM
If I use an `application.environment.monitor.subscribe(ApplicationStopped)` hook plus that `addShutdownHook`, would I need that wait?

> the problem is that the load balancer will still send requests to your terminating pods

I thought the readinessProbe takes care of that? I'm surprised k8s isn't smart enough to stop sending requests to terminating pods... thanks for that piece of knowledge!
simon.vergauwen — 02/20/2023, 10:57 AM
> I thought the readinessProbe takes care of that?

The readiness probe is used for start-up, not shutdown. The health probe is also not sufficient for this concern, at least not in my experience in practice, and all the resources I've come across and the open issues I linked in the repo seem to confirm that.
dave08 — 02/20/2023, 11:03 AM

simon.vergauwen — 02/20/2023, 11:15 AM
`bin/sleep` in the `preStop` hook of the K8s yaml files, which delays the SIGTERM going to the pod, instead of delaying the SIGTERM inside the pod.
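That `preStop` approach might look like the following Deployment fragment; the image name and the sleep/grace values are illustrative, not from the thread. Kubernetes runs the `preStop` hook to completion before sending SIGTERM to the container:

```yaml
# Illustrative Deployment fragment: /bin/sleep in preStop delays SIGTERM
# outside the app, so the pod keeps serving while the LoadBalancer/Ingress
# deregisters it.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60   # must cover preStop sleep + in-app cleanup
      containers:
        - name: app
          image: my-app:latest            # illustrative image name
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sleep", "30"]
```

Note that the grace period countdown includes the `preStop` time, so it has to be longer than the sleep plus whatever shutdown work the app does afterwards.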
Doing a quick search on their GitHub, it seems there are some open & closed issues related to this on Istio. Some were closed in December 2022, so it's possibly fixed.
If you see 502 Bad Gateway around the times you're doing rolling updates / up- and down-scaling, then it's probably not fixed. That might be a good indicator to keep an eye out for in your metrics system.