The paper demonstrates that both automated trigger architectures and the human annotations used to train and evaluate them are fundamentally unreliable for the intervention-timing problem, a finding that undermines the validity of current benchmarking approaches for autonomous agent safety layers.