Edoardo Luppi
03/04/2024, 11:03 AM
Dmitry Khalanskiy [JB]
03/04/2024, 11:08 AM
Sam
03/04/2024, 12:01 PM
launch inside another coroutine. Since we're operating in a highly concurrent environment, that means we could end up in a situation where we're trying to add a new child to a parent job at the same time that the parent job is in the middle of being cancelled. We need to make sure we always end up with a consistent result, such that both the parent and child end up properly cancelled, and get a chance to run any completion handlers they might have registered.
DCSS (or RDCSS), short for (restricted) double-compare single-swap, is a multi-word compare-and-set algorithm, which is a fancy way of saying that it lets us interact with several pieces of state at the same time in an atomic way. In this case it's being used to add a child job to the parent's list of children, but only if the parent hasn't already been cancelled. If the parent's state changes during the operation, its list of children won't be modified at all and we'll have to try again.
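To make the "add a child only if the parent hasn't been cancelled, otherwise retry" behaviour concrete, here's a minimal sketch. It is not RDCSS and not how kotlinx.coroutines implements it: as a simpler stand-in, it packs the cancelled flag and the child list into one immutable snapshot and swaps that snapshot with an ordinary single-word compare-and-set. `ParentState`, `ToyParent`, `tryAddChild` and `cancel` are hypothetical names made up for the illustration.

```kotlin
import java.util.concurrent.atomic.AtomicReference

// Hypothetical stand-in for a job's internal state (assumption, not the real Job internals).
data class ParentState(val cancelled: Boolean, val children: List<String>)

class ToyParent {
    // Both pieces of state live in one snapshot, so a single CAS covers them together.
    private val state = AtomicReference(ParentState(cancelled = false, children = emptyList()))

    // Adds the child only if the parent is still active.
    // If another thread changed the state in the meantime, the CAS fails and we retry.
    fun tryAddChild(child: String): Boolean {
        while (true) {
            val current = state.get()
            if (current.cancelled) return false
            val updated = current.copy(children = current.children + child)
            if (state.compareAndSet(current, updated)) return true
        }
    }

    // Marks the parent as cancelled and returns the children that were registered
    // before cancellation, so the caller can cancel each of them.
    fun cancel(): List<String> {
        while (true) {
            val current = state.get()
            if (current.cancelled) return emptyList()
            if (state.compareAndSet(current, current.copy(cancelled = true))) {
                return current.children
            }
        }
    }
}
```

With this sketch, `tryAddChild("a")` on a fresh `ToyParent` succeeds, `cancel()` then hands back that one child, and any later `tryAddChild` is refused, so a child can never slip in after the parent was cancelled. The trade-off is that every update allocates a new snapshot, which is part of why a real implementation might reach for something like RDCSS instead.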
Without that extra atomicity (or linearizability), we could end up in a situation like this: we add a child to a parent job, thinking the parent is still active, but by the time we're done, the parent job has actually become cancelled. But avoiding that by making the whole operation atomic is pretty complicated. If we just remove the RDCSS and allow the state changes to overlap, the code will be simpler and probably faster. The downside is that we'll have to go back afterwards and check that we don't have any child jobs that snuck in during the cancellation.
Sam
03/04/2024, 12:01 PM
Dmitry Khalanskiy [JB]
03/04/2024, 12:22 PM
Dmitry Khalanskiy [JB]
03/04/2024, 12:48 PM
Sam
03/04/2024, 2:10 PM
kevin.cianfarini
03/04/2024, 11:23 PM