strict1 and strict2 degraded under anchor pressure; strict3 partially recovered.
The harness remained stable, so the instability appears target-dependent.
chat_patch failed structural resistance under the strict1 and strict2 injection modes and partially recovered under strict3. The harness scored stable across all three modes, which indicates that the instability is localized to the target file's structural risk sites rather than to the evaluation protocol itself.
python tools/forced_structural_injection_test.py --focused --b-mode strict3 --repeat 3
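To compare all three modes in one pass, the command above could be wrapped in a small driver. The following is only a sketch: the script path and the `--focused`, `--b-mode`, and `--repeat` flags are taken from the command above, while `build_command`, `run_all`, and the assumption that `strict1` and `strict2` are accepted as `--b-mode` values are hypothetical. It defaults to a dry run that prints the commands without executing them.

```python
import shlex
import subprocess
import sys

# Script path and flags are copied from the documented command.
SCRIPT = "tools/forced_structural_injection_test.py"

# Assumption: strict1 and strict2 are valid --b-mode values, like strict3.
MODES = ("strict1", "strict2", "strict3")


def build_command(mode, repeat=3):
    """Return the argv list for one focused run in the given mode."""
    return [
        sys.executable, SCRIPT,
        "--focused",
        "--b-mode", mode,
        "--repeat", str(repeat),
    ]


def run_all(modes=MODES, repeat=3, dry_run=True):
    """Print (dry run) or execute one command per mode.

    Returns a dict mapping mode -> exit code, or None for a dry run.
    """
    results = {}
    for mode in modes:
        cmd = build_command(mode, repeat)
        if dry_run:
            # Show the exact shell command without running anything.
            print(shlex.join(cmd))
            results[mode] = None
        else:
            results[mode] = subprocess.run(cmd).returncode
    return results


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` from the repository root would execute the three runs in sequence. Whether the script reports degradation through its exit code is not stated here, so the returned codes should be read alongside the script's own output.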
model-agnostic protocol · swap API target to run against any structured reasoning model