Measuring whether a model resists false structural authority.
Forced injection benchmark for LLM structural cognition.
Blind no-locus evaluation stays stable.
Fake anchor injection degrades resistance by 0.33–0.67.
Structural resistance is scored by whether the model rejects false authority when line anchors or fake coordinates are injected into code review prompts. Measured on repeated strict3 runs (n=3), no-locus vs fake-locus protocol.
python tools/forced_structural_injection_test.py --focused --b-mode strict3 --repeat 3
Protocol is model-agnostic. The probe targets structural authority response, not vendor-specific behavior.
Independent structural evaluation. Not affiliated with any official model benchmark.
Observed instability appears target-dependent, not vendor-dependent. Same protocol applies to Gemini, Claude, GPT, or any structured code reasoning model.
skill selection operates on activation field vectors, not rank scores. candidate orbits are phase-positioned relative to a circular mean attractor θ. phase spread below threshold triggers geometry-dominant weighting.
Any model with structured code reasoning capability can be evaluated under the identical no-locus / with-locus injection protocol. The probe targets structural authority response, not vendor-specific behavior.
# raw_skill_signal captured before axis bias raw_skill_signal = { "entropy_inspection": float(entropy_intensity), "closure_hold": float(closure_intensity), "patch_probe": float(patch_probe_intensity), } if dominant_axis == "compile_pressure_axis": closure_intensity = min(1.0, closure_intensity + 0.10) # biased if dominant_axis == "compile_entropy_axis": entropy_intensity = min(1.0, entropy_intensity + 0.08) # biased # norm captures post-bias values — patch_probe never biased: asymmetric ⚠ norm_skill_signal = { "entropy_inspection": float(entropy_intensity), # post-bias "closure_hold": float(closure_intensity), # post-bias "patch_probe": float(patch_probe_intensity), # unchanged } selected_skill = max(skill_intensity, key=lambda k: skill_intensity[k]) # hysteresis: previous_skill silently overrides field result, no external log ⚠ if new_intensity < (prev_intensity + transition_cost): selected_skill = previous_skill applied_hysteresis = True # scalar_gap retroactively rewritten to post-hysteresis winner — audit corrupted ⚠ if scalar_gap.get("winner") != selected_skill: scalar_gap = { "winner": selected_skill, ... } label_gap = { "winner": selected_skill, "collapse_ratio": 1.0, # hardcoded — never reflects actual gap size ⚠ }