Drawdown Focus: BTC/USDT 1h Analysis | Freqtrade | Kiploks

WHY THIS MATTERS

Backtests show what worked.

Kiploks shows what can survive.

This page answers one question:

"Can I safely deploy capital here?"

Strategy: SampleStrategy (Freqtrade) · 03 Mar 2026, 20:53

Asset: BTC/USDT · Timeframe: 1h · Exchange: binance

Test Period: 2024-01-01 → 2026-02-08

Freqtrade config: symbol: BTCUSDT, endDate: 2026-02-08, exchange: binance, fee_open: 0.0011, fee_close: 0.0011, startDate: 2024-01-01, timeframe: 1h, initialBalance: 1000, Trades: 507

FINAL VERDICT

DO NOT DEPLOY - Tail Risk

Diagnostic case: The Black Swan Magnet (© Kiploks)

Kiploks Robustness Score: 0 / 100

Bayesian pass probability: 50%

One or more hard gates failed. DO NOT DEPLOY until blocking modules are fixed.

Deployment Gate

Validation Gates

Execution Buffer - Net Edge (Net Profit > 15 bps, period-level) (-32.56 bps vs 15 bps) ESTIMATED - Net edge below 15 bps or edge deficit after fees
Stability (WFE > 0.5) (0.05 vs 0.5) - WFE below 0.5 (OOS/IS ratio too low)
Data Quality Guard (test period ≥ 2 years) (769 vs 730 days)

Statistical Confidence

Statistical Significance (t-Stat > 1.96)
t-Stat (OOS Edge) > 2.0 (same metric as above, stricter threshold)(-1.13 vs 2) - t-Stat below 2.0 (OOS edge not significant). Same metric as Statistical Significance; 2.0 is stricter than 1.96. Single OOS window - interpret with caution.

Critical Failures

Deployment is blocked because the following hard gate(s) failed: Execution Buffer - Net Edge (Net Profit > 15 bps, period-level); Stability (WFE > 0.5).
Kurtosis > 15 indicates fat tails.
Negative skewness - losses cluster in extreme events.
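
The three Validation Gates above can be re-checked by hand. The following is a minimal sketch, assuming the thresholds quoted in this block (net edge > 15 bps, WFE > 0.5, test period ≥ 730 days); the function name and dictionary keys are hypothetical and are not part of Kiploks or Freqtrade.

```python
from datetime import date

def evaluate_gates(net_edge_bps, wfe, start, end):
    """Return (gate results, test period in days) using the thresholds quoted above."""
    test_days = (end - start).days
    gates = {
        "net_edge": net_edge_bps > 15.0,   # Execution Buffer - Net Edge
        "stability": wfe > 0.5,            # Stability (WFE)
        "data_quality": test_days >= 730,  # Data Quality Guard (>= 2 years)
    }
    return gates, test_days

# Values taken from this report.
gates, test_days = evaluate_gates(-32.56, 0.05, date(2024, 1, 1), date(2026, 2, 8))
failed = [name for name, passed in gates.items() if not passed]
print(test_days, failed)
```

With the report's inputs the test period comes out at 769 days and the two failing gates are the ones listed under Critical Failures.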

Execution Note: Backtest execution settings were missing. The system applied a standard Safety Buffer (0.05% slippage, 0.1% fee). For Institutional Grade (AAA), provide exact exchange API fees and liquidity-based slippage. You can set slippage and commission in your backtest or integration config to use exact values and remove this note.

Operational Insight: Net edge is negative; no slippage headroom. Status: Edge Deficit.

Recommended Action

DO NOT DEPLOY. Address failing hard gate(s): Execution Buffer - Net Edge (Net Profit > 15 bps, period-level); Stability (WFE > 0.5). Then re-run analysis.

Robustness score is 0 because a module blocks (e.g. Risk, Execution, or Stability). Potential score if unblocked: 3. Fix blocking modules first. Even unblocked, score remains in TRASH range (0-20) - no meaningful improvement.

ROBUSTNESS SCORE

Formula (multiplicative penalties)

0 / 100 (FAIL)

Blocked by Walk-Forward & OOS, Risk Profile, Execution Realism modules

Diagnosis

Execution score (5/100) is below the blocking threshold of 10. Edge does not survive 10 bps slippage - strategy may not be realizable in live conditions. Review transaction costs, reduce turnover, or improve edge.

Breakdown (contributing factors)

Data Quality Guard: ████████████ 100 → Adequate data period for full audit

Walk-Forward & OOS (40%) (blocking): ░░░░░░░░░░░░ → BLOCKED

Risk Profile (30%) (blocking): ░░░░░░░░░░░░ → BLOCKED

Parameter Stability (20%): █████████░░░ → Parameters stable across sensitivity tests

Execution Realism (10%) (blocking): ░░░░░░░░░░░░ → BLOCKED (raw 5, threshold 10)

DATA QUALITY GUARD

Data Quality Guard (DQG)

Score: 100% · PASS · DQG Factor: 1.00 · Contribution: 40.0

Robust Net Edge (Safe Edge): No net profit - outlier check N/A

Module	Score	Verdict
Gap Density	100%	PASS
Outlier Influence	n/a	N/A
Look-Ahead Bias	100%	PASS
Spread/Liquidity	100%	PASS
Sampling & Over-fitting	100%	PASS
Price Integrity	100%	PASS

BENCHMARK METRICS

[Benchmark methodology – included when copying for context]

Benchmark methodology:

• OOS Retention = sum(OOS)/sum(IS) over all WFA windows. When both IS and OOS are negative, ratio >100% means OOS losses larger than IS (Relative Loss Magnitude); verdict n/a in Kill Switch.
• Relative Change = (mean OOS - mean IS)/|mean IS|; when mean(IS)<0, Relative Change = -(Retention - 1).
• WFE (median OOS/IS) uses only windows with IS>0; min/median/max and variance same N. At small N (e.g. 3 windows), median WFE and Dominance are low-signal.
• Strategy Max DD = full backtest; OOS Max DD = validation windows only.
• Alpha = geometric excess (1+Rs)/(1+Rb)-1.
• Sharpe = return/volatility; Calmar = return/max DD.
• OOS Dominance = share of windows where OOS > 90% of IS; Profitable Windows = share with OOS > 0.

Definitions / tooltips:

• OOS Sharpe [A]: Same metric as Avg OOS Sharpe (window-level); both use canonical window-level OOS Sharpe from WFA.
• WFE distribution: Median of per-window OOS/IS ratios over windows with IS>0 only. Min/median/max same N; median middle value (odd N) or average of two middle (even N). Variance is population. When overall OOS negative, median >1 means loss driven by subset of windows; large spread suggests outliers.
• OOS Retention / Relative Loss Magnitude: sum(OOS)/sum(IS). When both negative: label shows Relative Loss Magnitude; >100% = OOS losses larger than IS; not interpretable as profit retention. Kill Switch verdict n/a.
• Relative Change: (mean OOS - mean IS)/|mean IS|. When mean(IS)<0: Relative Change = -(Retention - 1). Same N as Retention.
• Avg OOS Sharpe: Mean of per-window OOS Sharpe. Sharpe uses volatility; Calmar uses max DD - they can diverge. OOS much worse than Full Sharpe may indicate overfitting.
• Avg OOS Calmar: return/|Max DD|. Different denominator than Sharpe; can be more negative when DD is large.
• Full Sharpe / Full Calmar: Full backtest. Sharpe and Calmar use different denominators (vol vs max DD).
• Profitable Windows: Share of WFA windows with OOS return > 0. Not the same as OOS Dominance (OOS > 90% of IS).
• OOS Dominance Ratio: Share of windows where validation return exceeded 90% of optimization return. Not the same as Profitable Windows (OOS > 0). Can be high even when total OOS negative if one window has extreme loss.
• Small N: At 3 or fewer valid windows (or 5 total), median WFE and Dominance are low-signal (noise rather than reliable signal).
• Max Drawdown: Strategy Max DD = full backtest period. OOS Max DD = validation windows only; can be smaller than full-period DD.
• Tracking Error: Volatility of (strategy - benchmark) returns. Same aligned return series as volatility and correlation.
• Alpha (Excess CAGR): Geometric excess (1+R_strategy)/(1+R_benchmark)-1. Same as Excess in CAGR row. Differs from arithmetic difference.

Walk-Forward validation: [A] OOS-only metrics, [B] period-level (WFE, retention, profitable windows), [C] full backtest. Kill Switch and verdict use these values; at small N interpret with caution.

Quick Win (WFA summary)

[A] OOS equity-based: metrics from validation windows (canonical). From: WFA window OOS

OOS Sharpe: -0.56 (same as below) · OOS Calmar: -0.16 · OOS Max DD (validation only): -13.99%

[B] WFA period-level: WFE, retention, profitable windows, trend match, win rate degradation.

Relative Loss Magnitude: 166.0%

[C] Full backtest context: full backtest (IS); canonical full_backtest. From: full backtest (no WFA split)

Full Sharpe: -0.04 · Full Calmar: -0.34

Profitable Windows: 3/6 all windows (50%) FAIL

Profitable OOS among IS>0 windows (same base as WFE): 2/4

OOS/IS Trend Match: YES

Win Rate Change (OOS − IS, pp): -2.1 pp RED FLAG

IS Win Rate / OOS Win Rate: 23.7% / 21.6%

IS: full backtest; OOS: OOS trades

Statistical Robustness (OOS Validation)

WFE is conditional on IS>0 only - not a full-sample metric. Do not interpret WFE median as a full-strategy summary; the strategy may be loss-making overall. Min/median/max and variance use the same N (windows with IS > 0) (N=4): median is the middle value (odd N) or the average of the two middle values (even N). When overall OOS is negative, median WFE > 1 means the loss is driven by a subset of windows (in some windows OOS was better than IS). Large spread (min to max) with negative total OOS suggests a few windows or outliers dominate; interpret with caution. OOS Retention and Relative Change use all windows (N=6). Profitable counts use all windows. Retention over all windows reflects full P&L.

WFE (Median OOS/IS) distribution: n/a (median(OOS/IS per window))

WFE Variance: n/a

Parameter Stability Index (PSI): n/a

Edge Half-Life (T1/2, OOS): 2 windows (~240.0d)

WFA Windows: 6

Exp. OOS Return ± Vol: -2.2% ± 4.0%

Worst Window Return (N=6, 1 obs): -9.50%

Avg OOS Sharpe (window-level): -0.56

Avg OOS Calmar (approx): -0.16

Optimization Gain (IS): 5.24%

Relative Loss Magnitude (OOS/IS): 166.0% (N=6 windows)

When both IS and OOS are negative, this ratio is not interpretable as retention (share of profit preserved); it is shown for transparency only and indicates the relative magnitude of losses (|OOS| vs |IS|). >100% = OOS losses larger than IS (degradation); <100% = OOS losses smaller than IS. Retention 166.0% and Relative Change -66.0% describe the same thing: when mean(IS) < 0, Relative Change = -(Retention − 1), so both mean OOS is worse than IS. Raw: sum(IS) = -7.93%, sum(OOS) = -13.17% (both negative).

sum(OOS)/sum(IS) over all windows(N=6 windows)

Relative Change (OOS−IS)/|IS| (mean OOS vs mean IS): -66.0%

When mean(IS) < 0: Relative Change = -(Retention − 1).

Negative = OOS worse than IS (degradation).

(mean(OOS)-mean(IS))/|mean(IS)|(N=6 windows)
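
The two ratios above can be reproduced from the raw sums. This is a minimal sketch, assuming the rounded sums quoted in this block (sum(IS) = -7.93%, sum(OOS) = -13.17%, N = 6); the function name is hypothetical. Because the inputs are rounded, the result may differ from the displayed values in the last decimal.

```python
def retention_and_relative_change(sum_is, sum_oos, n_windows):
    """Retention = sum(OOS)/sum(IS); Relative Change = (mean OOS - mean IS)/|mean IS|."""
    retention = sum_oos / sum_is
    mean_is = sum_is / n_windows
    mean_oos = sum_oos / n_windows
    relative_change = (mean_oos - mean_is) / abs(mean_is)
    return retention, relative_change

# Rounded sums from this report, in percent.
retention, relative_change = retention_and_relative_change(-7.93, -13.17, 6)

# With mean(IS) < 0 the identity Relative Change = -(Retention - 1) holds.
identity_gap = abs(relative_change - (-(retention - 1)))
print(f"{retention:.1%}", f"{relative_change:.1%}", identity_gap < 1e-9)
```

Both figures describe the same degradation: retention is roughly 166% and relative change roughly -66%.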

Advanced Diagnostic Indicators

Verdict: 🔴 REJECT

Immediate Kill Switch triggered. Net Edge < 10 bps (current: -32.56 bps); Bayesian pass probability < 65% (current: 50%); Regime adaptability: 0/3 pass (min 1); Consecutive OOS drawdown windows: 2 (limit: 1)

Capital Kill Switch - The Red Line

2 consecutive (all windows) (limit: 1)

If the next OOS window is negative → turn off the bot
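
The Kill Switch conditions listed above can be sketched as simple threshold checks. The example assumes that a "drawdown window" is an OOS window with a negative validation return, which is an interpretation and not stated by the report; the OOS returns are the rounded Val figures from the Window Breakdown section of this report, and all names are hypothetical.

```python
def max_consecutive_losses(oos_returns):
    """Length of the longest run of consecutive negative OOS windows."""
    longest = current = 0
    for r in oos_returns:
        current = current + 1 if r < 0 else 0
        longest = max(longest, current)
    return longest

# Rounded validation returns (%) for Periods 1-6 of this report.
oos_returns = [1.0, -5.5, 1.0, -9.5, -0.4, 0.3]
streak = max_consecutive_losses(oos_returns)

# Thresholds as quoted in the Kill Switch message.
triggers = {
    "net_edge": -32.56 < 10.0,   # Net Edge < 10 bps
    "bayesian": 0.50 < 0.65,     # Bayesian pass probability < 65%
    "regime": 0 < 1,             # regime passes below the minimum of 1
    "streak": streak > 1,        # consecutive negative windows above the limit of 1
}
kill_switch = any(triggers.values())
print(streak, kill_switch)
```

Under that assumption the longest losing run is Periods 4 and 5, which matches the reported count of 2.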

Summary (possible)

• Regime Failure: Strategy failed across all tested market regimes (0/3 pass). Logic is not adapted to current market.

• OOS Retention 166.0% (OOS losses larger than IS - does not imply no overfitting). REJECT driven by regime failure (0/3) and low Bayesian pass probability - insufficient evidence for deployment.

• Statistical Confidence: Bayesian pass probability 50% with 6 WFA windows - REJECT driven by regime failure and insufficient evidence for deployment.

• Verdict: REJECT. Strategy is not viable.

WALK-FORWARD VALIDATION

[Walk-Forward Validation - single source of truth for this block] Verdict rule: Failure rate > 30% leads to verdict FAIL. When verdict is FAIL and failure rate > 30%, institutional grade is capped to BBB - RESEARCH ONLY at submit time (runProfessionalWfa) and stored in the payload; displayed grade matches the snapshot (no runtime re-grade on read). Walk-Forward Validation methodology (single source for this block): WFE (Efficiency) = median(OOS/IS) over windows with IS > 0 only (min 3 such windows; otherwise N/A). Consistency = share of windows with positive IS return that have positive OOS return; show as X% (Y/Z) when Z = positive-IS windows. Failed Windows = windows with validation return <= 0 or insufficient OOS trades; same rule for count and details. Performance Degradation = (mean OOS - mean IS) / |mean IS| aggregate over all windows (not per-window); when mean(IS) < 0, degradation = -(OOS Retention - 1). Value from full-precision period returns; hand calc from rounded display % may differ (e.g. -17% vs -18.9%). Good = OOS > 0 and OOS/IS > 0.7; Fragile = OOS > 0 and OOS/IS <= 0.7; Fail = OOS <= 0. WFE Advanced (rank WFE + permutation p-value) and other sub-scores are pre-verdict composites; overall verdict can still be FAIL. --- Definitions (same as tooltips) --- WFE (Efficiency): Median of per-window OOS/IS ratios over windows with IS > 0 only. Not mean over all windows. Requires at least 3 such windows. Value from full-precision period returns; hand calc from displayed % may give ~1.28 when report shows 1.26. Consistency: Share of windows with positive IS return that have positive OOS return. Denominator is only windows with IS > 0 (e.g. 67% = 2/3); different from Failed Windows (e.g. 4/6 = windows with OOS <= 0). Failed Windows: Windows where validation return <= 0 or insufficient OOS trades (below threshold). Same definition for count and Failed Windows Details list. 
In some failed windows OOS may be better than IS (smaller loss); classification is by OOS <= 0 only. Performance Degradation: Aggregate over all windows: (mean OOS - mean IS) / |mean IS|. When mean(IS) < 0: degradation = -(OOS Retention - 1). From full-precision period returns; hand calc from rounded % may differ (e.g. -17% vs -18.9%). Transfer % (Good windows): OOS return / IS return as percent, using same precision as displayed Opt/Val so hand calc matches (e.g. 2.3/1.8 = 128%). Only when both IS and OOS > 0. Can exceed 100%. Grade override: When verdict FAIL and failure rate > 30%, grade is capped to BBB - RESEARCH ONLY and recommendation set to research-only; WFE Advanced score is pre-verdict composite. WFE Advanced (why ROBUST vs FAIL): WFE Advanced uses rank-based walk-forward efficiency (mean OOS rank / mean IS rank per window, 1-based tie-aware ranks) and a one-sided permutation p-value versus random OOS shuffles. It does not use failure rate or failed-window count. Overall verdict can still be FAIL (e.g. failure rate > 30%), so ROBUST here can appear together with verdict FAIL. Monte Carlo P(positive): From simulation (bootstrap/distribution), not the observed proportion of positive OOS windows. P(positive) can differ significantly from observed (e.g. 11% vs 33%); large gaps may warrant reviewing simulation parameters. Stress / Recovery: From drawdown-recovery model; not from failure rate. Can show RESILIENT even when verdict is FAIL. --- Reconciliation with hand calc (copy of tooltip content) --- Reconciliation with hand calc from displayed %: WFE uses full-precision period returns so report may show 1.26 while hand calc from displayed Opt/Val gives ~1.28. Performance Degradation uses full-precision so report may show -18.9% while hand calc from mean IS/OOS in % gives -17%. Transfer % is computed from same precision as displayed Opt/Val (1 decimal) so hand calc matches (e.g. 2.3/1.8 = 128%, 2.9/0.9 = 322%). 
--- Current report values (for verification) --- Verdict: FAIL. Explanation: Failure rate exceeds 30% - verdict forced to FAIL. Failed Windows: 3/6 (failure rate 50%). Recommendation: Research only. Do not deploy to production without further validation. Performance Degradation: -66.0%.

Time Stability & Overfitting Control

Performance Transfer

In-Sample (IS) vs Out-of-Sample (OOS)

Walk-Forward Analysis - Continuous View

IS (In-Sample) + OOS (Out-of-Sample) equity on a single timeline

Total OOS Return

-2056.7%

OOS Win Rate

3 / 6

IS Avg Return

-132.2%

OOS Avg Return

-219.5%

Overfitting Score

MEDIUM

[Figure: Equity Curve - IS vs OOS segments (continuous). Legend: IS, OOS Good, OOS Fragile, OOS Failed]

[Figure: OOS performance per period (bar). Legend: IS, OOS]

WFE (Efficiency): 0.05

Consistency: 50% (2/4)

Performance Degradation: -66.0%

Failed Windows: 3 / 6

Consistency uses only windows with IS > 0. Failed Windows = windows with OOS ≤ 0 or insufficient OOS trades (in some, OOS may still be better than IS). Different denominators.

Overfitting Risk: MEDIUM (n/a)

When failure rate > 30%, overfitting risk may be understated; consider verdict and failure rate.

Professional WFA

This analysis was stored with an older formula version. Values are shown as saved. Re-run or re-submit the analysis for the latest institutional-grade rules and WFE Advanced fields.

Grade: BBB - RESEARCH ONLY (override: Verdict FAIL and failure rate > 30%; grade capped to BBB - RESEARCH ONLY.)

Research only. Do not deploy to production without further validation.

Pre-verdict module scores (composite; overall verdict and grade above)

WFE Advanced: ROBUST (score 100) (pre-verdict composite; overall verdict FAIL)

Overall verdict FAIL - do not rely on this score alone.

Regime: STABLE

Monte Carlo: DOUBTFUL (method: Legacy) (P(positive)=10%)

Stress: RESILIENT (recovery: HIGH)

Equity curve: WEAK

Window Breakdown

Period	Class	Opt	Val	Transfer	Diagnosis
1	Fragile	2.8% (63)	1.0% (25)	36%	-
2	Fail	2.7% (68)	-5.5% (16)	n/a	Alpha Reversal (Overfitted)
3	Fragile	3.0% (59)	1.0% (29)	33%	-
4	Fail	-4.7% (93)	-9.5% (23)	n/a	-
5	Fail	1.9% (71)	-0.4% (16)	n/a	Alpha Reversal (Overfitted)
6	Fragile	-13.7% (60)	0.3% (16)	n/a	-

Failed Windows Details

• Period 2: Validation return is non-positive

• Period 4: Validation return is non-positive

• Period 5: Validation return is non-positive

▶ Verdict: FAIL

Failure rate exceeds 30% - verdict forced to FAIL.
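
As a rough cross-check of this block, the sketch below applies the quoted rules (Good / Fragile / Fail, failure rate, Consistency, median WFE over IS > 0 windows) to the rounded Opt/Val percentages from the Window Breakdown. It is an illustration and not Kiploks' implementation: the names are hypothetical, the insufficient-OOS-trades rule is not modelled, and the "PASS" label for the non-failing case is an assumption. Because it uses rounded display values, the median WFE comes out near 0.06 rather than the reported 0.05, in line with the reconciliation note in the methodology text.

```python
from statistics import median

# (IS %, OOS %) per period, copied from the rounded Window Breakdown values.
WINDOWS = [(2.8, 1.0), (2.7, -5.5), (3.0, 1.0), (-4.7, -9.5), (1.9, -0.4), (-13.7, 0.3)]

def classify(is_ret, oos_ret):
    """Good / Fragile / Fail as quoted in the methodology text."""
    if oos_ret <= 0:
        return "Fail"
    # IS == 0 is not covered by the quoted rule; treated as Fragile here (assumption).
    if is_ret == 0:
        return "Fragile"
    return "Good" if oos_ret / is_ret > 0.7 else "Fragile"

labels = [classify(i, o) for i, o in WINDOWS]
failure_rate = labels.count("Fail") / len(WINDOWS)

# Consistency: share of IS > 0 windows that also have OOS > 0.
positive_is = [(i, o) for i, o in WINDOWS if i > 0]
consistent = sum(1 for _, o in positive_is if o > 0)

# WFE needs at least 3 windows with IS > 0, otherwise N/A (None here).
wfe = median(o / i for i, o in positive_is) if len(positive_is) >= 3 else None

# Failure rate above 30% forces FAIL; "PASS" is a placeholder label.
verdict = "FAIL" if failure_rate > 0.30 else "PASS"
print(labels, failure_rate, f"{consistent}/{len(positive_is)}", verdict)
```

The labels, the 3/6 failure count and the 2/4 Consistency match the values shown in this section.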

PARAMETER SENSITIVITY & STABILITY

[Parameter Sensitivity & Stability - single source of truth for this block] Parameter Sensitivity & Stability: single source from backend (optimization trials or WFA-derived). Sensitivity (R²), 2 decimals. Risk Score: Base = 100×(1 − maxSensitivity). Penalty = 2×needsTuningCount + 5×fragileCount (2 per Needs Tuning [0.40, 0.60), 5 per Fragile >=0.60). Raw = Base − Penalty. Ceiling = 100 − 5×penalisedCount. Final = max(0, floor(min(Raw, ceiling))). The number shown is Final. Order: round sensitivity to 2 decimals, then band, then penalisedCount, then Base, Penalty, Raw, Ceiling, Final. penalisedCount = Needs Tuning + Fragile. Bands: Stable [0, 0.30), Reliable [0.30, 0.40), Needs Tuning [0.40, 0.60), Fragile >=0.60. Example: Base 71 − Penalty 0 => Raw 71, Ceiling 100 => Final 71. Risk Class: LOW if score >= 65, MODERATE if 50<=score<65, HIGH if 20<=score<50, CRITICAL if <20. Deployment: (1) Data Quality Guard and sufficient OOS trades - if failed, REJECT; (2) Performance Decay - if >= 80%, REJECT; (3) Parameter Risk Score - if < 50, REJECT; (4) otherwise by Risk Class: MODERATE (50-64) -> APPROVED (Conditional), LOW (>= 65) -> APPROVED. When only Parameter Risk Score is used: 50 <= score < 65 -> APPROVED (Conditional); score >= 65 -> APPROVED. When overall robustness score is present, [50, 60) with PASS also yields APPROVED (Conditional). When Performance Decay is unavailable (< 3 periods), the Decay condition is omitted; deployment is then based only on DQG, Min OOS Trades, and Risk Score. Set HOLD_WHEN_DECAY_UNAVAILABLE to true (backend) to force HOLD when Decay is N/A and would otherwise APPROVE. When Performance Decay is N/A (< 3 periods), Decay gate is skipped; decision is (Risk Score verdict) AND (Min OOS Trades met). Set HOLD_WHEN_DECAY_UNAVAILABLE to true to force HOLD when Decay N/A. Governance metrics (Sharpe Drift, Tail-Risk, Coupling) do not affect Risk Score or Deployment; advisory only. Scale: round sensitivity to 2 decimals, then band. 
Stable [0, 0.30); Reliable [0.30, 0.40); Needs Tuning [0.40, 0.60); Fragile >= 0.6. Boundaries: 0.30 = Reliable (start); 0.40 = Needs Tuning (start); 0.60 = Fragile (start). penalisedCount = params with rounded sensitivity >= 0.4. Penalty: 2 per Needs Tuning, 5 per Fragile. Ceiling = 100 − 5×penalisedCount. Final = max(0, floor(min(Raw, ceiling))). Order: round → band → penalisedCount → Base → Penalty → Raw → Ceiling → Final. Score: integer (floor). --- Definitions (same as tooltips) --- Local Topology: Shape of the score-vs-parameter curve (when curve points available). Sharp peak = fragile; flat = stable. PSI, Surface Gini, Safety Margin, OOS variance attribution. Not used in Risk Score formula. Sharpe Retention: (OOS Sharpe/IS Sharpe)×100. Computed only when IS Sharpe > 0. Not the same as Benchmark OOS Retention (return ratio). When > 100%, displayed as improvement (OOS > IS); descriptive only, no significance test. May reflect regime, sample size, or IS underperformance; interpret with caution. Sharpe Drift (OOS vs IS): Sharpe Drift = Sharpe Retention − 100. Report in p.p. only (e.g. 31.3 p.p.). Positive = OOS > IS (improvement); negative = degradation. Defined when Sharpe Retention is defined (e.g. IS Sharpe > 0). Efficiency (Governance): Legacy alias for Sharpe Drift; same formula and value. Display one value only (Sharpe Drift). Result in p.p. only; do not use '%' for Drift. Max Tail-Risk Reduction: (OOS CVaR − IS CVaR) / |IS CVaR| × 100 (%). Name 'Reduction' is historical; result can be negative (risk increased). Use only negative values for loss. If IS CVaR >= 0: do not compute or display (data error); backend skips. Result > 0 = Risk Reduced; < 0 = Risk Increased. Relative change in tail risk (CVaR 95%). Governance Impact: Governance metrics: signal attenuation, Sharpe retention, Sharpe Drift, tail-risk reduction. Advisory only (manual review); not deployment gates. Performance Decay is a deployment gate (step 2), not a Governance metric. 
Efficiency = legacy alias for Sharpe Drift; same value, advisory only. Multi-Parameter Coupling: When parameters move together (high correlation), risk can multiply. Coupling Risk and Max Correlation Pair show linked parameters. Not included in Risk Score formula. --- Current report values (for verification) --- Deployment: APPROVED. Risk Score: 71/100 (LOW). maxSensitivity: 0.29 (same as table). penalisedCount (rounded sensitivity >= 0.4): 0. Risk Score: integer (floor). Pro-Note: Highest sensitivity: Buy_rsi (0.29, Stable).

Methodology: Sensitivity = R² (correlation²) between parameter value and trial score; we use it as a proxy for 'outcome strongly tied to parameter' (tuning matters). High R² = parameter significantly predicts outcome. Magnitude (slope per unit change) is a separate planned metric; Risk Score does not use slope. Sensitivity values: 2 decimal places. Risk Score: integer (floor). From optimization trials or WFA windows.
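
The Risk Score rules quoted above can be traced step by step. This is a minimal sketch, assuming the order given in the text (round, band, penalisedCount, Base, Penalty, Raw, Ceiling, Final); the function names are hypothetical. Sensitivities are converted to integer hundredths so the band boundaries are not affected by floating-point error, which also makes the final floor a no-op. Python's `round` is used for the two-decimal rounding, and values that fall exactly on a half may round differently than in the backend.

```python
def risk_score(sensitivities):
    """Parameter Risk Score from per-parameter sensitivities (R^2 values)."""
    hundredths = [round(s * 100) for s in sensitivities]     # round to 2 decimals
    needs_tuning = sum(40 <= h < 60 for h in hundredths)     # band [0.40, 0.60)
    fragile = sum(h >= 60 for h in hundredths)               # band >= 0.60
    penalised = needs_tuning + fragile
    base = 100 - max(hundredths)                             # 100 x (1 - maxSensitivity)
    penalty = 2 * needs_tuning + 5 * fragile
    raw = base - penalty
    ceiling = 100 - 5 * penalised
    return max(0, min(raw, ceiling))

def risk_class(score):
    """Risk Class bands as quoted in the text."""
    if score >= 65:
        return "LOW"
    if score >= 50:
        return "MODERATE"
    if score >= 20:
        return "HIGH"
    return "CRITICAL"

# This report: highest sensitivity is Buy_rsi at 0.29, with no penalised parameters.
report_score = risk_score([0.29])
# Hypothetical values, one Needs Tuning and one Fragile parameter.
example_score = risk_score([0.45, 0.65])
print(report_score, risk_class(report_score), example_score, risk_class(example_score))
```

The first call reproduces the 71/100 (LOW) shown in the current report values; the second uses made-up sensitivities only to exercise the penalty and ceiling terms.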

Suggested Mitigation: Risk Neutral

Parameter

Optimal[?]

Topology[?]

Sensitivity

Status

Buy_rsi

0.29

🟢 Stable

Suggested Mitigation: Risk Neutral

Exit_short_rsi

0.06

🟢 Stable

Suggested Mitigation: Risk Neutral

Roi_p1

0.03

0.01

🟢 Stable

Suggested Mitigation: Risk Neutral

Roi_p2

0.07

0.01

🟢 Stable

Suggested Mitigation: Risk Neutral

Roi_p3

0.07

🟢 Stable

Suggested Mitigation: Risk Neutral

Roi_t1

🟢 Stable

Suggested Mitigation: Risk Neutral

Roi_t2

0.02

🟢 Stable

Suggested Mitigation: Risk Neutral

Roi_t3

0.01

🟢 Stable

Suggested Mitigation: Risk Neutral

Sell_rsi

0.01

🟢 Stable

Suggested Mitigation: Risk Neutral

Short_rsi

0.01

🟢 Stable

Suggested Mitigation: Risk Neutral

Stoploss

-0.30

0.01

🟢 Stable

Suggested Mitigation: Risk Neutral

Scale (classification bands): Scale: round sensitivity to 2 decimals, then band. Stable [0, 0.30); Reliable [0.30, 0.40); Needs Tuning [0.40, 0.60); Fragile >= 0.6. Boundaries: 0.30 = Reliable (start); 0.40 = Needs Tuning (start); 0.60 = Fragile (start). penalisedCount = params with rounded sensitivity >= 0.4. Penalty: 2 per Needs Tuning, 5 per Fragile. Ceiling = 100 − 5×penalisedCount. Final = max(0, floor(min(Raw, ceiling))). Order: round → band → penalisedCount → Base → Penalty → Raw → Ceiling → Final. Score: integer (floor).

Sensitivity (R²): strength of linear relationship between parameter value and score (predictability), not magnitude of effect. Slope (impact per unit change) is a separate planned metric; Risk Score uses R² only. From optimization trials or WFA windows.

Topology (when available): curve shape from trials; flat = stable, sharp peak = fragile.

Sensitivity: implemented as R² (correlation²). R² measures strength of linear relationship (predictability), not magnitude of change; we use it as proxy for parameter-outcome tie (high R² = fragility). For true sensitivity (magnitude per unit parameter), derivative-based metric is planned. Values: 2 decimals; Risk Score: integer (floor).
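
The distinction between R² and magnitude can be shown with a few lines of code. This is a minimal sketch with made-up trial data (none of these numbers come from the report) and a hypothetical helper name; it computes the squared Pearson correlation by hand.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between parameter values and trial scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no linear relationship to measure
    return (sxy * sxy) / (sxx * syy)

params = [1.0, 2.0, 3.0, 4.0]

# Hypothetical scores: a tiny slope and a large slope both give R^2 close to 1.
small_slope = r_squared(params, [0.001 * p for p in params])
large_slope = r_squared(params, [1000.0 * p for p in params])

# Hypothetical symmetric peak: strong dependence on the parameter, but R^2 = 0.
peaked = r_squared([-1.0, 0.0, 1.0], [1.0, 0.0, 1.0])
print(small_slope, large_slope, peaked)
```

The slope examples illustrate why R² says nothing about impact per unit change, and the symmetric example shows why curve shape (Topology) is reported separately from Sensitivity.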

DIAGNOSTIC SUMMARY

1. Local Topology & Stability[?]

2. Governance Impact (Suggested Mitigation)[?]

Governance metrics below do not affect Risk Score or Deployment; advisory only.

Signal Attenuation: 63.8%

Sharpe Retention (IS ➔ OOS): 254.2% (OOS > IS; may indicate sample or regime bias - interpret with caution)

Sharpe Drift (OOS vs IS): 100.0 p.p. (OOS Sharpe > IS, improvement)

Max Tail-Risk Reduction: 30.7% (Risk Reduced)

3. Multi-Parameter Coupling[?]

Coupling analysis: No dominant unstable interactions detected.

AUDIT VERDICT

Deployment Status: APPROVED (no Decay check). Approval does not include the Decay gate (Decay N/A). Enable HOLD_WHEN_DECAY_UNAVAILABLE to require Decay before APPROVED.

Performance Decay: N/A (min 3 periods required for decay check). When N/A, Decay condition is omitted; HOLD_WHEN_DECAY_UNAVAILABLE (backend) can force HOLD instead of APPROVED. Warning: Decay N/A and Fail-Safe is off; deployment not based on decay.

Final Decision = (Risk Score Verdict) AND (Performance Decay < 80% when Decay is defined; when Decay is N/A this condition is omitted) AND (Min OOS Trades met). REJECTED when any applied condition fails. Performance Decay is a deployment gate (step 2). Governance (Sharpe Drift, Tail-Risk, etc.) is advisory only; does not change the result.

Risk Score: Base 71 − Penalty 0 → Risk Class: LOW (71/100) (passing)

Pro-Note: Highest sensitivity: Buy_rsi (0.29, Stable).

TRADING INTENSITY & COST DRAG

[Trading Intensity & Cost Drag – included when copying for context] Trading Intensity & Cost Drag: Annual Turnover (institutional) = min(Purchases, Sales) / AUM, annualized (SEC/Morningstar style). Used for cost and rebate. Position velocity (holding-period) = utilization × (365.25 / avg holding days); can overstate when positions overlap. Utilization = avg trade notional / initial balance. Avg Holding Time = open-to-close per round-trip (matched BUY-SELL). Total Cost Drag = fees + slippage (+ market impact when in range). When participation > 15% ADV, market impact and capacity show N/A; total is fees + slippage only. Rebate Capture is informational, not included in Total Cost Drag. Cost/Edge and Safety Margin show n/a when gross edge ≤ 0 (breakeven undefined). Required Alpha Boost: (1) When per-trade gross < 0: = |gross per trade| + cost per trade bps. (2) When period gross < 0 but per-trade gross > 0: = |Avg Net Profit / Trade| bps (execution offset only); strategy loses at calendar level because accumulated costs exceed profit. Win Rate Sensitivity n/a when base profit ≤ 0 (not meaningful). Z-Score indicative only; significance not claimed at small n. --- Definitions / tooltips --- Annual Turnover (institutional): Institutional turnover = min(Purchases, Sales) / AUM, annualized (SEC/Morningstar style). Used for cost drag and rebate. Can be lower than position velocity when positions overlap. Position velocity (holding-period): utilization × (365.25 / avg holding period in days). Can overstate true capital turnover when multiple positions overlap; shown for reference. Avg Holding Time: Average time from open to close per round-trip (matched BUY-SELL pairs). From trade timestamps. Can imply multiple overlapping positions when combined with trades/month. Avg position size: Average trade notional as share of AUM (utilization). Institutional turnover = min(Buy,Sell)/AUM; position velocity = utilization × (365.25 / avg holding days). 
Cost / Edge Ratio (n/a): Ratio not reported when gross edge ≤ 0: it would be >100% and is not interpretable as an execution constraint. Avg Net Profit / Trade (bps): Net = gross − cost per trade (bps of notional). Total Cost Drag is annual cost as % of AUM; cost per trade in bps can be larger when turnover > 1 (e.g. −14 bps net with ~28 bps cost/trade implies ~+14 bps gross). Required Alpha Boost = |net| bps per trade to breakeven. Safety Margin (n/a when negative edge): Safety Margin is undefined when net edge ≤ 0 (breakeven slippage not defined). Shown as n/a to avoid implying a valid ratio. Safety Margin is based on average profit per trade; Net Edge is based on calendar returns and can include hidden costs. Trades may be profitable per fill while the strategy as a business loses money. Safety Margin (Required Alpha for 5x): Required Alpha for 5x Margin. At 5 bps assumed slippage, 5x Safety Margin needs breakeven ≥ 25 bps. Add ~(2×(25−BES)) bps per trade (one-way) to reach 5x when BES < 25. Market Impact (est.): Estimated cost as % of initial balance per year (square-root model). Values > 100% mean cost exceeds capital per year; check ADV and volume assumptions. Top 5 traded pairs / ADV utilization: ADV utilization requires candle/volume data for the traded pairs. Not available for this backtest. Portfolio weighted: Portfolio-weighted ADV requires candle/volume data. Not available for this backtest. Slippage sensitivity row [Zombie]: Unrealistic scale for strategy capacity. Slippage sensitivity row [N/A]: Participation > 15% ADV at this AUM; model out of range. Alpha Half-Life: Alpha decay in time (e.g. from trades). Differs from Edge Half-Life (OOS window-to-window decay in Pro Metrics). Requires at least 30 trades; otherwise n/a. Win Rate Sensitivity (n/a): When base profit ≤ 0, sensitivity is not meaningful: strategy is already loss-making. Metric is not interpretable as payoff structure. 
Required Alpha Boost (bps per trade): Bps per trade needed to reach breakeven. Two cases: (1) Per-trade gross < 0: full amount = |gross edge per trade| + execution cost per trade. (2) Per-trade gross ≥ 0 but net < 0: execution offset only = |Avg Net Profit / Trade| bps. When status is EDGE DEFICIT and this value equals |Avg Net Profit / Trade|, period gross is negative but each trade is profitable before costs; strategy loses at calendar level. Gross edge (per trade, at institutional): Same cost base as Cost Drag and Required Alpha Boost (institutional turnover). When overlap is high, this value can be large because cost per trade uses the lower institutional denominator. Gross edge (period/CAGR) / Profit Factor (Gross): Profit Factor < 1 means total gross losses exceed total gross wins over the period; this is consistent with positive avg gross per trade when losing trades are larger or more frequent.

Execution: Simple (estimated fees)

Results use estimated fees/slippage. Provide exact exchange parameters for Institutional-grade analysis.

Position velocity (holding-period) (198.0x) is 2.36x institutional turnover (83.8x). Overlapping positions likely; institutional turnover is used for cost and rebate.

Market Impact (Layer 2.5)

ADV $458.757 is very low; model assumptions may not hold.

INTERPRETIVE SUMMARY

Period gross is negative (profit factor < 1) but per-trade gross is positive. Strategy loses at calendar level; accumulated costs exceed profit. Required Alpha Boost is execution offset only.

Baseline AUM: $1,000

Avg Trades / Month: 20

Annual Turnover (institutional): 83.8x

Position velocity (holding-period): 198.0x

Avg Holding Time: 30.8h

Avg position size: 69.6% of AUM

Cross-check (trades × utilization): ~167.6x

Implied overlap factor: 2.36x
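
The turnover figures above can be reproduced from the displayed inputs. This is a minimal sketch, assuming the position velocity formula quoted in this section (utilization × 365.25 / avg holding days). The cross-check formula (annualized trade count × utilization, using 507 trades over the 769-day test period) is an inference from the label, not something the report spells out, and all names are hypothetical.

```python
def position_velocity(utilization, avg_holding_hours):
    """utilization x (365.25 / average holding period in days)."""
    holding_days = avg_holding_hours / 24
    return utilization * (365.25 / holding_days)

UTILIZATION = 0.696             # Avg position size: 69.6% of AUM
INSTITUTIONAL_TURNOVER = 83.8   # Annual Turnover (institutional)

velocity = position_velocity(UTILIZATION, 30.8)   # Avg Holding Time: 30.8h
overlap = velocity / INSTITUTIONAL_TURNOVER       # Implied overlap factor

# Assumed cross-check: trades per year x utilization.
trades_per_year = 507 / 769 * 365.25
cross_check = trades_per_year * UTILIZATION
print(round(velocity, 1), round(overlap, 2), round(cross_check, 1))
```

The results land within rounding of the displayed 198.0x, 2.36x and ~167.6x; small differences are expected because the inputs are rounded display values.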

EFFICIENCY & COST LIMITS

Profit Factor (Gross, full backtest): 0.60

Profit Factor (Net, full backtest): 0.52

Cost / Edge Ratio: n/a (negative gross edge)

Avg Net Profit / Trade (bps): -7.46 bps

Break-even Slippage:

Tolerance: n/a (negative edge)

Margin of Safety: n/a

Safety Margin: n/a (negative edge)

BES Status: EDGE DEFICIT

Failure Mode: Negative period Net Edge (cost > profit)

COST DECOMPOSITION (CAGR)

Exchange Fees: -33.5%

Slippage: -16.8%

Market Impact (est.): N/A - participation ratio too high for model

Participation ratio exceeds 15% of ADV; square-root model out of range.

Total Cost Drag: -50.3%

Market impact not included (model out of range); total is fees + slippage only.

Rebate Capture: 0.48 bps/trade (≈ 0.40% CAGR at current turnover)

Rebate Capture is not included in Total Cost Drag; informational (potential savings with maker-heavy execution).

When gross edge is negative, cost decomposition shows cost allocation; improving execution alone cannot make the strategy profitable.

CAPACITY & MARKET IMPACT

Estimated AUM Capacity: N/A - model out of range (participation > 15% ADV)

ADV Utilization:

Top 5 traded pairs: n/a

Portfolio weighted: n/a

Market Impact Model:

Assumption: Square-root law

Liquidity regime: Micro / low liquidity
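To illustrate why the impact estimate is reported as N/A, here is a minimal sketch of a square-root impact model with the 15% participation guard described above. The generic form impact ≈ Y · σ · sqrt(Q / ADV) is the textbook square-root law; the coefficient `y`, the volatility input, and the function name are assumptions, not Kiploks parameters. The ADV figure is taken literally as printed in the report.

```python
import math
from typing import Optional

MAX_PARTICIPATION = 0.15  # report's stated model limit (15% of ADV)


def sqrt_impact_bps(order_usd: float, adv_usd: float, daily_vol: float,
                    y: float = 1.0) -> Optional[float]:
    """Square-root impact in bps, or None when the model is out of range.

    y and daily_vol are hypothetical inputs for illustration.
    """
    if adv_usd <= 0:
        return None
    participation = order_usd / adv_usd
    if participation > MAX_PARTICIPATION:
        return None  # model not trusted beyond 15% of ADV
    return y * daily_vol * math.sqrt(participation) * 1e4


# Avg position is 69.6% of $1,000 AUM; ADV read literally as $458.757.
order = 0.696 * 1000
print(sqrt_impact_bps(order, 458.757, daily_vol=0.03))   # out of range
print(sqrt_impact_bps(100.0, 10_000.0, daily_vol=0.02))  # in range
```

Returning `None` rather than a number mirrors the report's choice to suppress the estimate instead of extrapolating the model.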

SLIPPAGE SENSITIVITY (NON-LINEAR)

AUM Size | Slippage CAGR | Net CAGR
~$100k | N/A | N/A
~$1.0M | N/A | N/A
~$5.0M | N/A | N/A
~$10.0M | N/A | N/A

EXECUTION HARDENING

Order Type Bias: Limit-biased

Taker / Maker Ratio: 40 / 60

Limit Fill Probability: 21.0%

Opportunity Cost (Fill Decay): 17.80 bps

Adverse Selection (Cost): 4.00 bps

Latency Sensitivity: Medium

Toxic Flow Risk: Medium

Moderate adverse selection risk in fast markets.

SENSITIVITY TO ALPHA DECAY

Alpha Half-Life: 480.0 days

Long half-life from WFA validation; high-turnover strategies may have shorter effective decay in practice.

Win Rate Sensitivity: n/a (not meaningful)

RISK & CONTROLS

Primary Constraint: Gross edge negative (alpha-deficit)

Gross edge (per trade, at institutional 83.8x): +52.6 bps

High value reflects low institutional turnover denominator; most capital cost is in overlap periods.

Gross edge (period/CAGR): negative

Available Control Levers:

Reduce trading frequency: Low

Increase entry threshold (signal strength): Low

Shift to maker-only execution: Low

STATUS

Deployment Class: Micro-cap / Research-only

COST ADAPTABILITY: ❌ FAIL (Required: +7.5 bps)

Required Alpha Boost (bps per trade): 7.46 bps

CAPACITY GOVERNANCE: ⚠ SCALE-LIMITED

EXECUTION RISK: ⚠ WARNING

Confidence Level: Low

20 trades/mo, low signal-to-noise. Negative Z suggests loss-making; low confidence (small sample).

Z-Score: -1.48 (negative = loss-making tendency; low confidence (small sample). Indicative only; statistical significance not claimed at this n.)

STRATEGY ACTION PLAN

Strategy Action Plan: Institutional decision engine.

Slippage Sensitivity: Net Sharpe at each slippage level uses a nonlinear model (power 0.7 in slippage vs reference bps); degradation is not linear. When base OOS Sharpe < 0, strategy is NOT_VIABLE and table is suppressed. Baseline Sharpe from WFA OOS (window-level) or Risk (OOS trades).

Verdict bands: Safe when netSharpe >= baseSharpe*0.85; Margin erosion when >= 0.6*base or netSharpe > 0; UNTRADABLE when netSharpe <= 0.

Drawdown Delta = relative increase vs baseline DD; cap at 200% means DD >= 3x baseline displays as +200%, actual may be worse.

Phase 1 (Incubation): allocation 10-20%, monitoring focus on most sensitive param, Runtime Kill Switch = Armed (N/M OOS Fail) or TRIGGERED.

Kill Switch reset: OOS Sharpe > 0 (2 consecutive windows), fail ratio < 33%, WFE (all windows) above Phase 2 threshold, manual review.

Phase 2 trigger: at least 2 consecutive WFA windows with WFE (all windows) > threshold and Trend regime confirmed.

Conflict A (Critical): fail ratio > 50% - no Phase 2. Conflict B (Warning): low WFE - conservative allocation.

Bull/Bear: regime count 0/3 or X/3; fragile params; alpha decay.

--- Definitions (same as tooltips) ---

Slippage Sensitivity (formula): Estimated Net Sharpe at each slippage level uses a nonlinear model (power 0.7 in slippage vs reference bps); degradation is not linear in slippage. When slippage destroys edge, Net Sharpe can go strongly negative.

Baseline Sharpe source: From WFA OOS (window-level) when avgOosSharpe is used; otherwise from Risk (OOS trades). Same source as precomputed slippage table.

Drawdown Delta (vs baseline): Relative increase vs baseline max drawdown. Formula: DD_delta% = (DD_slippage - DD_baseline) / DD_baseline × 100. E.g. +108% means DD ≈ 2.08× baseline; +200% (cap) = 3× baseline - actual may be worse when capped.

Verdict (Safe / Margin erosion / UNTRADABLE): Safe: netSharpe >= baseSharpe*0.85. Margin erosion: netSharpe >= baseSharpe*0.6 or netSharpe > 0. UNTRADABLE: netSharpe <= 0.

Execution Efficiency (approx.): spread / breakeven tolerance (see Breakeven tolerance). Assumed spread: 2 bps major pair (BTC/ETH), 5 bps others, 3 bps if no symbol. If ratio >= 1, spread alone may erase edge.

Breakeven tolerance: Min average P&L per trade (bps) at which strategy breaks even after costs. From OOS stats or gross edge per trade.

Phase 2 trigger: At least 2 consecutive WFA windows with WFE > threshold (0.5-0.7 by base Sharpe) and Trend regime confirmed (Fragile to Stable). Conservative when Sharpe < 1. Condition is forward-looking; this report shows one WFE median across all windows.

System conflict: Conflict A (Critical): fail ratio > 50% - use pessimistic scenario, no Phase 2. Conflict B (Warning): WFE below threshold - conservative allocation and extended monitoring.

Runtime Kill Switch: Armed (N/M OOS Fail): N = count of windows with OOS return <= 0 or insufficient OOS trades, M = total windows.

Fragile params: Parameters with sensitivity >= 0.6 (Fragile zone). Same threshold as Parameter Sensitivity block.

Regime (0/3 or X/3): Count of regimes (Trend, Range, HighVol) that pass in regimeSurvivalMatrix. 0/3 = strategy not validated across market conditions. When matrix is not computed, regime validation shows as N/A.
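The verdict bands and the Drawdown Delta formula defined above can be sketched in a few lines. This is an illustrative reading, not the engine's implementation: function names are hypothetical, and the order in which overlapping bands are checked (NOT_VIABLE, then UNTRADABLE, then Safe) is an assumption. The power-0.7 slippage model itself is not reproduced because the report does not give its full form.

```python
def slippage_verdict(net_sharpe: float, base_sharpe: float) -> str:
    """Classify a slippage scenario using the bands quoted above."""
    if base_sharpe < 0:
        return "NOT_VIABLE"       # table is suppressed in this case
    if net_sharpe <= 0:
        return "UNTRADABLE"
    if net_sharpe >= 0.85 * base_sharpe:
        return "Safe"
    # Any remaining positive net Sharpe falls into this band
    # (>= 0.6 * base, or simply > 0).
    return "Margin erosion"


def drawdown_delta_pct(dd_slippage: float, dd_baseline: float,
                       cap_pct: float = 200.0) -> float:
    """DD_delta% = (DD_slippage - DD_baseline) / DD_baseline * 100, capped."""
    delta = (dd_slippage - dd_baseline) / dd_baseline * 100.0
    return min(delta, cap_pct)


print(slippage_verdict(0.9, 1.0))        # Safe
print(slippage_verdict(0.3, 1.0))        # Margin erosion
print(slippage_verdict(-0.2, 1.0))       # UNTRADABLE
print(slippage_verdict(0.1, -0.10))      # NOT_VIABLE
print(drawdown_delta_pct(0.208, 0.10))   # about 108, i.e. 2.08x baseline
print(drawdown_delta_pct(0.50, 0.10))    # capped at 200
```

Because the cap hides anything beyond 3x baseline, a displayed +200% should be read as "at least 3x", as the definition notes.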

Slippage Sensitivity Analysis

Strategy not viable - slippage sensitivity table suppressed (negative base Sharpe).

Baseline Sharpe: from WFA OOS (window-level).

WFE 0.00 (all windows, n=6)

Equity erodes as slippage increases.

The Decision Engine

Phase 1: NOT VIABLE

Allocation: 0% - strategy not viable (negative base Sharpe). Do not allocate.
Monitoring: Observation without capital. Track Buy_rsi to detect when OOS Sharpe becomes positive.
Runtime Kill Switch: TRIGGERED

Kill Switch Reset Conditions (ALL must be met):

OOS Sharpe > 0 across minimum 2 consecutive windows
Fail ratio drops below 33%
WFE (all windows) above Phase 2 threshold for this strategy
Manual review by risk manager
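The reset conditions listed above are conjunctive, which a short sketch makes explicit. Everything here is hypothetical scaffolding: the `Window` type, the function names, and the choice to test the two most recent windows for "consecutive" are assumptions. The fail-ratio definition follows the report's Runtime Kill Switch tooltip (OOS return <= 0 or insufficient OOS trades), and the Phase 2 WFE threshold is passed in because the report only gives a 0.5-0.7 range.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Window:
    oos_sharpe: float
    oos_return: float
    enough_oos_trades: bool = True


def fail_ratio(windows: Sequence[Window]) -> float:
    """N / M: share of windows with OOS return <= 0 or too few trades."""
    failed = sum(1 for w in windows
                 if w.oos_return <= 0 or not w.enough_oos_trades)
    return failed / len(windows)


def kill_switch_can_reset(windows: Sequence[Window],
                          wfe_all_windows: float,
                          phase2_wfe_threshold: float,
                          manual_review_passed: bool,
                          min_consecutive: int = 2,
                          max_fail_ratio: float = 0.33) -> bool:
    """True only if ALL four reset conditions hold."""
    if len(windows) < min_consecutive:
        return False
    # Assumption: "consecutive" means the most recent windows.
    recent = windows[-min_consecutive:]
    sharpe_ok = all(w.oos_sharpe > 0 for w in recent)
    return bool(sharpe_ok
                and fail_ratio(windows) < max_fail_ratio
                and wfe_all_windows > phase2_wfe_threshold
                and manual_review_passed)


losing = [Window(-0.1, -0.02) for _ in range(6)]
print(kill_switch_can_reset(losing, 0.0, 0.5, manual_review_passed=False))
```

With six losing windows and a WFE of 0.00, every automated condition fails, so the switch stays TRIGGERED regardless of the manual review.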

Phase 2: UNAVAILABLE

Strategy did not pass Phase 1 (NOT VIABLE). Phase 2 conditions do not apply until base Sharpe is positive.

Why This Works

Bull Case

Bull Case: N/A - strategy not valid (negative base Sharpe).

Bear Case (Risks)

OOS retention may reflect a single regime; 0/3 regimes pass - strategy not validated across market conditions

Recommended Fixes

Statistical Significance: Extend test by +2 years or add instruments with low cross-correlation (ρ < 0.3) to generate independent observations. Correlated instruments share the same market regime and do not increase effective sample size. (High)
Tail Risk: Add a hard tail stop or halve leverage. (High)

RISK METRICS (OUT-OF-SAMPLE)

Risk Metrics (OUT-OF-SAMPLE): computed from stitched OOS equity curve or window returns. VaR/CVaR from return distribution; PF and GtP from same series (GtP = Net profit / Gross loss, can be negative). Recovery Factor = total return / |max drawdown|. When CVaR is degenerate (single tail observation), CVaR is N/A and VaR should be treated as lower-bound only.

--- Definitions (same as tooltips) ---

Max Drawdown: Maximum peak-to-trough decline from OOS equity curve (cumulative product of 1+r). Stored negative in data; displayed as positive %.

Recovery Factor: Total return divided by |max drawdown|. Same sign as total return; negative when strategy is net losing.

Sharpe Ratio (OOS): Mean return / std return from OOS series. Same scale as Sortino in this block. Differs from window-level Avg OOS Sharpe in Pro Metrics.

Sortino Ratio: Mean return / downside deviation. Requires at least 5 negative-return observations. When PF < 1, positive Sortino may be artifact - do not interpret as positive signal.

VaR (95%): 5th percentile of return distribution (loss). When degenerate tail (single OOS window), treat as lower-bound estimate only.

VaR when degenerate: Degenerate tail - single or collapsed tail observation. VaR should be treated as lower-bound estimate only.

CVaR (ES): Expected Shortfall - mean of returns in worst 5% tail. Shown as N/A when degenerate (insufficient sample).

Profit Factor: Gross profit / gross loss. Capped at 20. PF < 1 = strategy is unprofitable.

Gain-to-Pain: Net profit / Gross loss. Can be negative when strategy loses. Equals PF - 1 when from same series.

Trade Win Rate: Share of trades with profit > 0. From backtest (e.g. Freqtrade). Shown when payload provides backtest.results.winRate.

Expectancy: E = WR×Payoff - (1-WR), in units of average loss. Displayed number is % of average loss per trade (not % of capital). Consistent with Win Rate and Payoff.

Expectancy (of avg loss): Not % of capital. Value is % of average loss per trade.

Period Win Rate (trades): OOS win rate: share of winning trades (from risk metrics). Not the same as Profitable Windows (share of windows with positive OOS return).

Tail Ratio: 95th percentile / |5th percentile|. < 0.5 = left-tail dominant; > 2 = right-tail dominant. Null when p5 ≈ 0.

Payoff Ratio: Average winning period / average losing period. Very low (< 0.2) means gains are small relative to losses.

Payoff > 100: Capped at 100 (no or negligible losses).

Edge Stability (t): t-statistic: mean return / (std/sqrt(n)) = Sharpe×sqrt(n). Same data as Sharpe; when single OOS window, t is computed but not statistically meaningful - interpret with caution.

Skewness: Distribution asymmetry. Negative = left tail risk.

Kurtosis: Excess kurtosis. > 0 = fatter tails than normal.

Kurtosis winsorized: Winsorized (1% tails); raw kurtosis was > 50.

Durbin-Watson: Residual autocorrelation. ≈ 2 = no autocorrelation; < 1.5 = overfitting risk. Not meaningful on small samples.

Sortino when PF<1: Sortino is elevated due to low downside deviation on period-level returns; inconsistent with Profit Factor. Do not interpret as positive signal.
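Several of the definitions above are simple closed-form ratios, sketched below for reference. Function names are hypothetical, and returning the cap when there are no losses is an assumption; the formulas themselves are the ones quoted in the definitions.

```python
import math


def profit_factor(gross_profit: float, gross_loss: float,
                  cap: float = 20.0) -> float:
    """Gross profit / gross loss, capped at 20 as in the definition."""
    loss = abs(gross_loss)
    if loss == 0:
        return cap  # assumption: no losses maps to the cap
    return min(gross_profit / loss, cap)


def gain_to_pain(gross_profit: float, gross_loss: float) -> float:
    """Net profit / gross loss; equals PF - 1 on the same series."""
    loss = abs(gross_loss)
    return (gross_profit - loss) / loss


def recovery_factor(total_return: float, max_drawdown: float) -> float:
    """Total return / |max drawdown|; negative when net losing."""
    return total_return / abs(max_drawdown)


def expectancy_loss_units(win_rate: float, payoff: float) -> float:
    """E = WR * Payoff - (1 - WR), in units of average loss."""
    return win_rate * payoff - (1.0 - win_rate)


def edge_t_stat(sharpe: float, n: int) -> float:
    """t = Sharpe * sqrt(n), per the Edge Stability definition."""
    return sharpe * math.sqrt(n)


# Illustrative inputs chosen to give PF = 0.36, as reported for OOS.
print(profit_factor(36.0, 100.0), gain_to_pain(36.0, 100.0))
# Reported Sharpe (-0.10, rounded) with N = 125 returns.
print(round(edge_t_stat(-0.10, 125), 2))
```

The t-statistic computed from the rounded Sharpe comes out near -1.12, close to the reported -1.13; the small gap is what rounding of the Sharpe input would produce.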

Out-of-sample risk metrics from Walk-Forward Analysis (stitched OOS equity curve or window returns).

Max Drawdown: 42.91%

Recovery Factor: -0.87

Sharpe Ratio (OOS): -0.10

Sortino Ratio: n/a

VaR (95%): -1.28%

CVaR (ES): n/a

Profit Factor (OOS): 0.36

Gain-to-Pain: -0.64

Trade Win Rate: n/a

Expectancy (loss units): -3% (of avg loss)

Period Win Rate (trades): 22%

Tail Ratio: n/a

Payoff Ratio: 0.08

Edge Stability (t): -1.13

Skewness: -0.06

Kurtosis: 67.25 (winsorized: 67.2)

Durbin-Watson: n/a

Diagnostic: Payoff Ratio is very low (avg win is only a fraction of avg loss); gains are small relative to losses. Negative Recovery Factor indicates the strategy has not recovered from max drawdown and is net negative.

Context: OOS metrics from 1 window (N=125 returns) (small sample - interpret with caution).

Regime Context: High drawdown (Max DD: 42.9%). Consider regime-dependent risk; do not infer volatility expectations without explicit volatility estimate.

Tail Risk Profile: Fat-tailed distribution (Kurtosis: 67.2). Elevated probability of extreme events. Tail Ratio may be unreliable on small sample.

Tail Authority: CVaR (ES) is unreliable - insufficient sample for robust ES estimation (e.g. single OOS window). VaR is reported; CVaR is not. In degenerate tail cases ES would equal VaR; we do not report a ratio when CVaR is unavailable. Both VaR and any reported tail metrics should be treated as lower-bound estimates only.
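The VaR/CVaR behaviour described above (VaR as the 5th percentile, CVaR as the mean of the worst tail, and CVaR withheld when the tail is degenerate) might look like the following. This is a sketch under stated assumptions: linear-interpolation percentiles and a minimum of two tail observations are illustrative choices, not thresholds documented in the report.

```python
import math
from typing import Optional, Sequence, Tuple


def var_cvar(returns: Sequence[float], alpha: float = 0.05,
             min_tail_obs: int = 2) -> Tuple[float, Optional[float]]:
    """Return (VaR, CVaR); CVaR is None when the tail is degenerate."""
    xs = sorted(returns)
    n = len(xs)
    if n == 0:
        raise ValueError("returns must not be empty")
    # alpha-quantile with linear interpolation between order statistics.
    pos = alpha * (n - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    var = xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)
    # Expected Shortfall: mean of returns at or below VaR.
    tail = [x for x in xs if x <= var]
    cvar = sum(tail) / len(tail) if len(tail) >= min_tail_obs else None
    return var, cvar


# Well-populated tail: both metrics are available.
wide = [i / 100 for i in range(-50, 50)]
print(var_cvar(wide))

# Single tail observation: CVaR is withheld, VaR is a lower bound only.
print(var_cvar([-0.2, 0.01, 0.02]))
```

Withholding CVaR instead of echoing VaR avoids implying tail information that a single observation cannot provide, which matches the report's stated policy.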

Risk Attribution: Edge profile is mixed with limited payoff buffer. Payoff Ratio (0.08) is very low - gains are small relative to losses.

Risk Verdict: Insufficient data - OOS metrics from 1 window are not statistically meaningful. Collect more walk-forward windows before interpreting.

❌ UNSTABLE: Insufficient data - single OOS window. Collect more walk-forward windows before interpreting.

Max Leverage: 1x

This analysis is for informational purposes only and does not constitute investment advice. Past performance is not indicative of future results. All metrics are model-based and subject to assumptions (slippage, fees, liquidity).