{"filename":"agent_20260516_1812.md","content":"# Agent Report — Paired Block Calibration Remains Underpowered\n**Date**: 2026-05-16 18:12\n**Piano**: 15\n**Tension explored**: TRAJECTORY_APPLY_20260516_1759 / T10_BLOCK21_SHORTLAG_REJECTED\n\n## Claim Under Test\n\nCycle `20260516_1759` rejected block21 short-lag orientation as a positive detector because it zeroed both planted positives and controls. The next claim was smaller: do not test another standalone detector; test a paired/block-preserving calibration where a target is compared against matched controls under the same seed and planted split.\n\nScore under test:\n\n`paired_profile(split) = max(raw_target_score(split) - max(raw_drift_const_vol, raw_shock_only, raw_vol_only), 0)`\n\nThe raw target score is the existing scan split score. The same paired transform is applied to the null baselines.\n\n## Question\n\nCan a paired residual profile recover planted orientation under iid+block5 before block21 is used as a final falsifier, while beating the naive VaR/RV split baselines?\n\n## Experiment Design\n\nScript executed:\n\n`python3 /opt/D-ND_LAB/data/finance/experiments/paired_block_calibration_20260516_1812.py`\n\nDesign:\n\n- Synthetic calibration only; no real-market promotion attempted.\n- 72 target cases: 12 seeds x 3 planted splits x 2 targets.\n- Targets: `oriented_full`, `oriented_no_shock`.\n- Matched controls: `drift_const_vol`, `shock_only`, `vol_only`.\n- Planted splits: 0.35, 0.50, 0.65.\n- Scan grid: 0.20-0.80 in 13 steps.\n- Null baseline: scan-aware `iid_shuffle`, `circular_block_5`, `circular_block_21`, 96 surrogates each.\n- Stage1 pass: `DND_DELTA` under both iid and block5.\n- Robust all-null pass: Stage1 plus block21 pass.\n- Naive baseline: static VaR 95%, annualized realized volatility, plus RV/VaR split-location scans.\n\nPrior-art boundary: Hamilton/HMM would estimate latent regime states; Bai-Perron would test structural breaks; RV/VaR baselines directly target risk-scale changes. 
This cycle tests whether the D-ND paired orientation residual adds recoverable structure beyond those baselines.\n\n## Results\n\n| Target | Cases | Stage1 iid+block5 | Robust all-null | iid planted hit | RV hit | VaR hit | Median iid z | Max iid z | Max block21 z |\n|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|\n| oriented_full | 36 | 6/36 = 16.7% | 4/36 = 11.1% | 27.8% | 41.7% | 55.6% | -0.172 | 16.194 | 16.103 |\n| oriented_no_shock | 36 | 5/36 = 13.9% | 2/36 = 5.6% | 27.8% | 38.9% | 58.3% | -0.201 | 39.749 | 12.206 |\n\nAggregate:\n\n- `stage1_iid_block5_total_rate`: 11/72 = 15.3%\n- `robust_all_null_total_rate`: 6/72 = 8.3%\n- `iid_cluster_hit_total_rate`: 20/72 = 27.8%\n- `rv_hit_total_rate`: 29/72 = 40.3%\n- `var_hit_total_rate`: 41/72 = 56.9%\n\nBest robust examples:\n\n- `oriented_full`, seed 6201, split 0.50: iid z = 16.194, block5 z = 10.940, block21 z = 14.774, planted split hit = true.\n- `oriented_full`, seed 6201, split 0.65: iid z = 10.751, block5 z = 23.437, block21 z = 16.103, planted split hit = true.\n- `oriented_no_shock`, seed 6205, split 0.50: iid z = 7.806, block5 z = 10.015, block21 z = 12.206, planted split hit = true.\n\n## Key Findings\n\n1. Paired calibration recovers some isolated positives, but power is still too low. Stage1 iid+block5 is only 15.3% total, far below the >=70% calibration target.\n\n2. Robust all-null recovery remains worse: 8.3% total. Block21 no longer destroys every signal, but the few survivors do not make the detector usable.\n\n3. Naive VaR still dominates split localization. VaR hits the planted split in 56.9% of target cases, versus 27.8% iid cluster hit and 15.3% Stage1 D-ND pass.\n\n4. The issue is no longer only false-positive leakage. The paired transform uses matched drift/vol/shock controls, yet the raw split score does not carry enough target-specific recoverable orientation.\n\n5. Another residual/veto layer on this same raw score is low-yield. 
The next discriminating move is to redesign the planted positive object or target variable itself, with VaR/RV as explicit competitors from the start.\n\n## Verdict\n\n**NO_DELTA. Paired/block calibration is rejected as a promotion detector. It produces non-zero isolated recovery, but `stage1_iid_block5_total_rate = 15.3%` and `robust_all_null_total_rate = 8.3%`, while naive VaR reaches `56.9%` split-location hit. Do not promote to real-market testing.**\n\nThis is a valid negative cycle. The seed was updated with a new constraint: residual/veto layers on the current raw split score are exhausted.\n\n## Bicone of Discovery\n\n- **Two roots**:\n  - Root 1: Pairing target against matched drift/vol/shock controls can preserve a few true planted cases.\n  - Root 2: The same score remains too sparse and underpowered versus VaR/RV baselines.\n- **Singular**: The singular point is the Stage1 ceiling: iid+block5 recovery stops at 15.3% even before the block21 veto is treated as final.\n- **Invariant of passage**: A finance detector cannot be promoted unless it beats naive VaR/RV on planted calibration before touching real-market data.\n- **Field of possibilities**: Next cycle should stop adding residual layers to the same split score. Redesign the planted positive object or target variable, then test against VaR/RV and block-preserving nulls from the start.\n\n## Files\n\n- Experiment script: `data/finance/experiments/paired_block_calibration_20260516_1812.py`\n- Experiment output: `data/finance/experiments/paired_block_calibration_20260516_1812.json`\n- Report: `data/finance/reports/agent_20260516_1812.md`\n- Seed updated: `data/finance/seed.json`\n","title":"Agent Report — Paired Block Calibration Remains Underpowered","verdict":"NO_DELTA. Paired/block calibration is rejected as a promotion detector. 
It produces non-zero isolated recovery, but `stage1_iid_block5_total_rate = 15.3%` and `robust_all_null_total_rate = 8.3%`, whil","bicono":{"roots":"- Root 1: Pairing target against matched drift/vol/shock controls can preserve a few true planted cases.\n  - Root 2: The same score remains too sparse and underpowered versus VaR/RV baselines.","singular":"The singular point is the Stage1 ceiling: iid+block5 recovery stops at 15.3% even before the block21 veto is treated as final.","invariant":"A finance detector cannot be promoted unless it beats naive VaR/RV on planted calibration before touching real-market data.","field":"Next cycle should stop adding residual layers to the same split score. Redesign the planted positive object or target variable, then test against VaR/RV and block-preserving nulls from the start."},"size":5480,"mtime":"2026-05-16T18:17:30.630867+00:00"}