Drawdown Focus: BTC/USDT 1h Analysis | Freqtrade | Kiploks

WHY THIS MATTERS

Backtests show what worked.

Kiploks shows what can survive.

This page answers one question:

"Can I safely deploy capital here?"

Strategy: SampleStrategy (Freqtrade) · 03 Mar 2026, 20:53
Asset: BTC/USDT · Timeframe: 1h · Exchange: binance
Test Period: 2024-01-01 to 2026-02-08
Freqtrade config symbol: BTCUSDT, endDate: 2026-02-08, exchange: binance, fee_open: 0.0011, fee_close: 0.0011, startDate: 2024-01-01, timeframe: 1h, initialBalance: 1000, Trades: 507

FINAL VERDICT

DO NOT DEPLOY - Tail Risk

Diagnostic case: The Black Swan Magnet (© Kiploks)

Kiploks Robustness Score: 0 / 100
Bayesian pass probability: 50%

One or more hard gates failed. DO NOT DEPLOY until blocking modules are fixed.

Deployment Gate
Validation Gates
  • Execution Buffer - Net Edge (Net Profit > 15 bps, period-level) (-32.56 bps vs 15 bps) ESTIMATED - Net edge below 15 bps or edge deficit after fees
  • Stability (WFE > 0.5) (0.05 vs 0.5) - WFE below 0.5 (OOS/IS ratio too low)
  • Data Quality Guard (test period ≥ 2 years) (769 vs 730 days)
Statistical Confidence
  • Statistical Significance (t-Stat > 1.96)
  • t-Stat (OOS Edge) > 2.0 (same metric as above, stricter threshold) (-1.13 vs 2) - t-Stat below 2.0 (OOS edge not significant). Same metric as Statistical Significance; 2.0 is stricter than 1.96. Single OOS window - interpret with caution.
Critical Failures
  • Deployment is blocked because the following hard gate(s) failed: Execution Buffer - Net Edge (Net Profit > 15 bps, period-level); Stability (WFE > 0.5).
  • Kurtosis > 15 indicates fat tails.
  • Negative skewness - losses cluster in extreme events.
Execution Note: Backtest execution settings were missing. The system applied a standard Safety Buffer (0.05% slippage, 0.1% fee). For Institutional Grade (AAA), provide exact exchange API fees and liquidity-based slippage. You can set slippage and commission in your backtest or integration config to use exact values and remove this note.
Operational Insight: Net edge is negative; no slippage headroom. Status: Edge Deficit.
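
The gate logic above can be sketched as follows. This is a minimal illustration using the values shown in this report, not Kiploks' actual implementation; the `Gate` class, the `hard` flag assignment (validation gates treated as hard, the t-Stat check as advisory) and the verdict strings are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Gate:
    name: str
    value: float
    threshold: float
    hard: bool               # assumption: hard gates block deployment
    inclusive: bool = False  # True for "at least" checks (>=)

    @property
    def passed(self) -> bool:
        if self.inclusive:
            return self.value >= self.threshold
        return self.value > self.threshold


# Values copied from the Deployment Gate section of this report.
gates = [
    Gate("Net edge (bps)", -32.56, 15.0, hard=True),
    Gate("WFE", 0.05, 0.5, hard=True),
    Gate("Test period (days)", 769, 730, hard=True, inclusive=True),
    Gate("t-Stat (OOS edge)", -1.13, 1.96, hard=False),
]

failed_hard = [g.name for g in gates if g.hard and not g.passed]
verdict = "DO NOT DEPLOY" if failed_hard else "PASS"
print(verdict, failed_hard)
```

With these inputs the two failing hard gates are net edge and WFE, which matches the Critical Failures list.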

Robustness score is 0 because a module blocks (e.g. Risk, Execution, or Stability). Potential score if unblocked: 3. Fix blocking modules first. Even unblocked, score remains in TRASH range (0-20) - no meaningful improvement.

ROBUSTNESS SCORE

Formula (multiplicative penalties)
0 / 100 (FAIL)
Blocked by Walk-Forward & OOS, Risk Profile, Execution Realism modules

Diagnosis

Execution score (5/100) is below the blocking threshold of 10. Edge does not survive 10 bps slippage - strategy may not be realizable in live conditions. Review transaction costs, reduce turnover, or improve edge.

Breakdown (contributing factors)
Module                 Weight            Score   Note
Data Quality Guard     -                 100     Adequate data period for full audit
Walk-Forward & OOS     40% (blocking)    0       BLOCKED
Risk Profile           30% (blocking)    0       BLOCKED
Parameter Stability    20%               75      Parameters stable across sensitivity tests
Execution Realism      10% (blocking)    0       BLOCKED (raw 5, threshold 10)

DATA QUALITY GUARD

Data Quality Guard (DQG)
Score: 100% (PASS) · DQG Factor: 1.00 · Contribution: 40.0

Robust Net Edge (Safe Edge): No net profit - outlier check N/A

Module                     Score   Verdict
Gap Density                100%    PASS
Outlier Influence          n/a     N/A
Look-Ahead Bias            100%    PASS
Spread/Liquidity           100%    PASS
Sampling & Over-fitting    100%    PASS
Price Integrity            100%    PASS

BENCHMARK METRICS

Walk-Forward validation: [A] OOS-only metrics, [B] period-level (WFE, retention, profitable windows), [C] full backtest. Kill Switch and verdict use these values; at small N interpret with caution.
Quick Win (WFA summary)
[A] OOS equity-based: metrics from validation windows (canonical). From: WFA window OOS
OOS Sharpe: -0.56 (same as below) · OOS Calmar: -0.16 · OOS Max DD (validation only): -13.99%
[B] WFA period-level: WFE, retention, profitable windows, trend match, win rate degradation.
Relative Loss Magnitude: 166.0%
[C] Full backtest context: full backtest (IS); canonical full_backtest. From: full backtest (no WFA split)
Full Sharpe: -0.04 · Full Calmar: -0.34
Profitable Windows: 3/6 all windows (50%) FAIL
Profitable OOS among IS>0 windows (same base as WFE): 2/4
OOS/IS Trend Match: YES
Win Rate Change (OOS − IS, pp): -2.1 pp RED FLAG
IS Win Rate / OOS Win Rate: 23.7% / 21.6%
IS: full backtest; OOS: OOS trades
Statistical Robustness (OOS Validation)

WFE is conditional on IS>0 only - not a full-sample metric. Do not interpret WFE median as a full-strategy summary; the strategy may be loss-making overall. Min/median/max and variance use the same N (windows with IS > 0) (N=4): median is the middle value (odd N) or the average of the two middle values (even N). When overall OOS is negative, median WFE > 1 means the loss is driven by a subset of windows (in some windows OOS was better than IS). Large spread (min to max) with negative total OOS suggests a few windows or outliers dominate; interpret with caution. OOS Retention and Relative Change use all windows (N=6). Profitable counts use all windows. Retention over all windows reflects full P&L.
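
The median rule described above can be illustrated with the per-window returns listed in the Window Breakdown of this report. This is a sketch under assumptions: per-window WFE is taken as OOS return divided by IS return, and the inputs are the rounded one-decimal values, so the result lands near, not exactly on, the reported 0.05.

```python
from statistics import median

# (IS return %, OOS return %) per window, rounded values from the
# Window Breakdown section (Opt / Val).
windows = [
    (2.8, 1.0),
    (2.7, -5.5),
    (3.0, 1.0),
    (-4.7, -9.5),
    (1.9, -0.4),
    (-13.7, 0.3),
]

# WFE is only defined for windows with IS > 0 (N=4 here).
wfe = [oos / is_ret for is_ret, oos in windows if is_ret > 0]

# Even N: statistics.median averages the two middle values.
wfe_median = median(wfe)
print(len(wfe), round(wfe_median, 2))
```

The two windows with negative IS returns are excluded, which is why WFE cannot be read as a full-strategy summary.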

WFE (Median OOS/IS) distribution: n/a

WFE Variance: n/a
Parameter Stability Index (PSI): n/a
Edge Half-Life (T1/2, OOS): 2 windows (~240.0d)
WFA Windows: 6
Exp. OOS Return ± Vol: -2.2% ± 4.0%
Worst Window Return (N=6, 1 obs): -9.50%
Avg OOS Sharpe (window-level): -0.56
Avg OOS Calmar (approx): -0.16
Optimization Gain (IS): 5.24%
Relative Loss Magnitude (OOS/IS): 166.0% (N=6 windows)

When both IS and OOS are negative, this ratio is not interpretable as retention (share of profit preserved); it is shown for transparency only and indicates the relative magnitude of losses (|OOS| vs |IS|). >100% = OOS losses larger than IS (degradation); <100% = OOS losses smaller than IS. Retention 166% and Relative Change -66% describe the same thing: when mean(IS) < 0, Relative Change = -(Retention − 1), so both mean OOS is worse than IS. Raw: sum(IS) = -7.93%, sum(OOS) = -13.17% (both negative).

sum(OOS)/sum(IS) over all windows (N=6 windows)

Relative Change (OOS−IS)/|IS| (mean OOS vs mean IS): -66.0%

When mean(IS) < 0: Relative Change = -(Retention − 1).

Negative = OOS worse than IS (degradation).

(mean(OOS)-mean(IS))/|mean(IS)| (N=6 windows)
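
The two formulas above can be checked against the raw sums quoted in this section. The snippet is a plain arithmetic sketch; variable names are illustrative.

```python
# Raw sums from this report (percent, both negative).
sum_is = -7.93
sum_oos = -13.17
n_windows = 6

# Relative Loss Magnitude: sum(OOS) / sum(IS) over all windows.
ratio = sum_oos / sum_is

# Relative Change: (mean(OOS) - mean(IS)) / |mean(IS)|.
mean_is = sum_is / n_windows
mean_oos = sum_oos / n_windows
relative_change = (mean_oos - mean_is) / abs(mean_is)

print(f"ratio = {ratio:.1%}, relative change = {relative_change:.1%}")

# When mean(IS) < 0 the two numbers carry the same information.
assert abs(relative_change - (-(ratio - 1))) < 1e-9
```

Both results agree with the reported 166.0% and -66.0% to within rounding of the inputs.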

Advanced Diagnostic Indicators
Verdict: 🔴 REJECT

Immediate Kill Switch triggered. Net Edge < 10 bps (current: -32.56 bps); Bayesian pass probability < 65% (current: 50%); Regime adaptability: 0/3 pass (min 1); Consecutive OOS drawdown windows: 2 (limit: 1)

Capital Kill Switch - The Red Line
2 consecutive (all windows) (limit: 1)

If the next OOS window is negative, turn off the bot.
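
A minimal sketch of the four kill-switch conditions listed above. It assumes that a "drawdown window" means a window with a negative OOS return and uses the rounded per-window OOS returns from the Window Breakdown; the function names are hypothetical.

```python
def max_consecutive_losses(oos_returns):
    """Longest run of consecutive windows with a negative OOS return."""
    longest = current = 0
    for r in oos_returns:
        current = current + 1 if r < 0 else 0
        longest = max(longest, current)
    return longest


def kill_switch_reasons(net_edge_bps, bayes_pass, regimes_passed, oos_returns):
    """Return the list of triggered conditions (empty list = not triggered)."""
    reasons = []
    if net_edge_bps < 10:
        reasons.append("net edge < 10 bps")
    if bayes_pass < 0.65:
        reasons.append("Bayesian pass probability < 65%")
    if regimes_passed < 1:
        reasons.append("no regime passed")
    if max_consecutive_losses(oos_returns) > 1:
        reasons.append("consecutive OOS loss windows > 1")
    return reasons


oos = [1.0, -5.5, 1.0, -9.5, -0.4, 0.3]  # OOS return % per window
reasons = kill_switch_reasons(-32.56, 0.50, 0, oos)
print(reasons)
```

With the report's values all four conditions fire, and the longest losing run is 2 windows (periods 4 and 5).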

Summary (possible)
Regime Failure: Strategy failed across all tested market regimes (0/3 pass). Logic is not adapted to current market.
OOS Retention 166.0% (OOS losses larger than IS - does not imply no overfitting). REJECT driven by regime failure (0/3) and low Bayesian pass probability - insufficient evidence for deployment.
Statistical Confidence: Bayesian pass probability 50% with 6 WFA windows - REJECT driven by regime failure and insufficient evidence for deployment.
Verdict: REJECT. Strategy is not viable.

WALK-FORWARD VALIDATION

Time Stability & Overfitting Control
Performance Transfer
In-Sample (IS) vs Out-of-Sample (OOS)

Walk-Forward Analysis - Continuous View

IS (In-Sample) + OOS (Out-of-Sample) equity on a single timeline

Total OOS Return: -2056.7%
OOS Win Rate: 3 / 6
IS Avg Return: -132.2%
OOS Avg Return: -219.5%
Overfitting Score: MEDIUM
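[Figure: Equity Curve - IS vs OOS segments (continuous). Y-axis ticks from 0.79 to 1.09. Legend: IS, OOS Good, OOS Fragile, OOS Failed.]
[Figure: OOS performance per period (bar), IS vs OOS, axis range -14% to 14%. Values as displayed (IS / OOS): P1 282.5% / 99.2%; P2 266.6% / -549.2%; P3 304.1% / 100.0%; P4 -467.6% / -950.1%; P5 188.4% / -43.8%; P6 -1367.0% / 27.2%.]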
WFE (Efficiency): 0.05
Consistency: 50% (2/4)
Performance Degradation: -66.0%
Failed Windows: 3 / 6
Consistency uses only windows with IS > 0. Failed Windows = windows with OOS ≤ 0 or insufficient OOS trades (in some, OOS may still be better than IS). Different denominators.
Overfitting Risk: MEDIUM (n/a)
When failure rate > 30%, overfitting risk may be understated; consider verdict and failure rate.
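
The different denominators behind Consistency and Failed Windows can be made explicit with a short sketch. It uses the rounded per-window returns from the Window Breakdown and does not model the "insufficient OOS trades" failure condition, which would need trade-count thresholds not stated in this report.

```python
# (IS return %, OOS return %) per window, from the Window Breakdown.
windows = [
    (2.8, 1.0),
    (2.7, -5.5),
    (3.0, 1.0),
    (-4.7, -9.5),
    (1.9, -0.4),
    (-13.7, 0.3),
]

# Consistency: share of profitable OOS windows among windows with IS > 0.
is_positive = [(is_ret, oos) for is_ret, oos in windows if is_ret > 0]
consistency = sum(oos > 0 for _, oos in is_positive) / len(is_positive)

# Failed windows: OOS <= 0, counted over all windows.
failed = sum(oos <= 0 for _, oos in windows)
failure_rate = failed / len(windows)

print(f"consistency = {consistency:.0%} ({len(is_positive)} windows)")
print(f"failed = {failed}/{len(windows)} ({failure_rate:.0%})")
```

The failure rate of 50% is above the 30% level at which the report forces the verdict to FAIL.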
Professional WFA
This analysis was stored with an older formula version. Values are shown as saved. Re-run or re-submit the analysis for the latest institutional-grade rules and WFE Advanced fields.
Grade: BBB - RESEARCH ONLY (override: Verdict FAIL and failure rate > 30%; grade capped to BBB - RESEARCH ONLY.)
Research only. Do not deploy to production without further validation.
Pre-verdict module scores (composite; overall verdict and grade above)
WFE Advanced: ROBUST (score 100) (pre-verdict composite; overall verdict FAIL)
Overall verdict FAIL - do not rely on this score alone.
Regime: STABLE
Monte Carlo: DOUBTFUL (method: Legacy) (P(positive)=10%)
Stress: RESILIENT (recovery: HIGH)
Equity curve: WEAK
Window Breakdown
Period   Status    Opt          Val           Bar   Diagnosis
1        Fragile   2.8% (63)    1.0% (25)     36%   -
2        Fail      2.7% (68)    -5.5% (16)    n/a   Alpha Reversal (Overfitted)
3        Fragile   3.0% (59)    1.0% (29)     33%   -
4        Fail      -4.7% (93)   -9.5% (23)    n/a   -
5        Fail      1.9% (71)    -0.4% (16)    n/a   Alpha Reversal (Overfitted)
6        Fragile   -13.7% (60)  0.3% (16)     n/a   -
Failed Windows Details
Period 2: Validation return is non-positive
Period 4: Validation return is non-positive
Period 5: Validation return is non-positive
▶ Verdict: FAIL
Failure rate exceeds 30% - verdict forced to FAIL.

PARAMETER SENSITIVITY & STABILITY

Methodology: Sensitivity = R² (correlation²) between parameter value and trial score; we use it as a proxy for 'outcome strongly tied to parameter' (tuning matters). High R² = parameter significantly predicts outcome. Magnitude (slope per unit change) is a separate planned metric; Risk Score does not use slope. Sensitivity values: 2 decimal places. Risk Score: integer (floor). From optimization trials or WFA windows.
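
To make the R² proxy concrete, here is a small sketch that computes squared Pearson correlation between a parameter and a trial score. The trial data is hypothetical and the function is an illustration of the stated definition, not the report's own code.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between parameter values and scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # constant input: no linear relationship to measure
    return (sxy * sxy) / (sxx * syy)


# Hypothetical optimization trials for one parameter.
buy_rsi = [30, 35, 40, 45, 50]
linear_scores = [1.0, 2.0, 3.0, 4.0, 5.0]  # score rises steadily
peaked_scores = [1.0, 2.0, 3.0, 2.0, 1.0]  # sharp symmetric peak at 40

print(r_squared(buy_rsi, linear_scores))
print(r_squared(buy_rsi, peaked_scores))
```

The peaked case is the reason the methodology calls R² a proxy: a symmetric peak is a strong dependence on the parameter, yet its linear R² is essentially zero, so topology has to be read separately.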
Suggested Mitigation: Risk Neutral
Parameter        Optimal   Topology   Sensitivity   Status      Suggested Mitigation
Buy_rsi          42        -          0.29          🟢 Stable   Risk Neutral
Exit_short_rsi   22        -          0.06          🟢 Stable   Risk Neutral
Roi_p1           0.03      -          0.01          🟢 Stable   Risk Neutral
Roi_p2           0.07      -          0.01          🟢 Stable   Risk Neutral
Roi_p3           0.07      -          0             🟢 Stable   Risk Neutral
Roi_t1           87        -          0             🟢 Stable   Risk Neutral
Roi_t2           44        -          0.02          🟢 Stable   Risk Neutral
Roi_t3           17        -          0.01          🟢 Stable   Risk Neutral
Sell_rsi         95        -          0.01          🟢 Stable   Risk Neutral
Short_rsi        89        -          0.01          🟢 Stable   Risk Neutral
Stoploss         -0.30     -          0.01          🟢 Stable   Risk Neutral
Scale (classification bands): round sensitivity to 2 decimals, then band. Stable [0, 0.30); Reliable [0.30, 0.40); Needs Tuning [0.40, 0.60); Fragile >= 0.60. Boundaries: 0.30 = Reliable (start); 0.40 = Needs Tuning (start); 0.60 = Fragile (start). penalisedCount = params with rounded sensitivity >= 0.40. Penalty: 2 per Needs Tuning, 5 per Fragile. Raw = Base − Penalty. Ceiling = 100 − 5×penalisedCount. Final = max(0, floor(min(Raw, Ceiling))). Order: round → band → penalisedCount → Base → Penalty → Raw → Ceiling → Final. Score: integer (floor).
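
The banding and scoring order above can be sketched as below. How Base is derived is not stated in this report, so it is passed in as an input (71, per the Audit Verdict); the function names are illustrative.

```python
import math

PENALTY = {"Needs Tuning": 2, "Fragile": 5}


def band(sensitivity):
    """Round to 2 decimals first, then classify into a band."""
    s = round(sensitivity, 2)
    if s >= 0.60:
        return "Fragile"
    if s >= 0.40:
        return "Needs Tuning"
    if s >= 0.30:
        return "Reliable"
    return "Stable"


def risk_score(base, sensitivities):
    """round -> band -> penalisedCount -> penalty -> raw -> ceiling -> final."""
    bands = [band(s) for s in sensitivities]
    penalised_count = sum(b in PENALTY for b in bands)
    penalty = sum(PENALTY.get(b, 0) for b in bands)
    raw = base - penalty
    ceiling = 100 - 5 * penalised_count
    return max(0, math.floor(min(raw, ceiling)))


# Sensitivity column of the parameter table in this report.
sens = [0.29, 0.06, 0.01, 0.01, 0, 0, 0.02, 0.01, 0.01, 0.01, 0.01]
print(risk_score(71, sens))
```

Every parameter here falls in the Stable band, so no penalty applies and the final score equals the base. Note that Python's `round` works on binary floats, so values sitting exactly on a band boundary may deserve decimal arithmetic in a real implementation.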
Sensitivity (R²): strength of linear relationship between parameter value and score (predictability), not magnitude of effect. Slope (impact per unit change) is a separate planned metric; Risk Score uses R² only. From optimization trials or WFA windows.
Topology (when available): curve shape from trials; flat = stable, sharp peak = fragile.
Sensitivity: implemented as R² (correlation²). R² measures strength of linear relationship (predictability), not magnitude of change; we use it as proxy for parameter-outcome tie (high R² = fragility). For true sensitivity (magnitude per unit parameter), derivative-based metric is planned. Values: 2 decimals; Risk Score: integer (floor).
DIAGNOSTIC SUMMARY
1. Local Topology & Stability
2. Governance Impact (Suggested Mitigation)
Governance metrics below do not affect Risk Score or Deployment; advisory only.
Signal Attenuation: 63.8%
Sharpe Retention (IS ➔ OOS): 254.2% (OOS > IS; may indicate sample or regime bias - interpret with caution)
Sharpe Drift (OOS vs IS): 100.0 p.p. (OOS Sharpe > IS, improvement)
Max Tail-Risk Reduction: 30.7% (Risk Reduced)
3. Multi-Parameter Coupling
Coupling analysis: No dominant unstable interactions detected.
AUDIT VERDICT
Deployment Status: APPROVED (no Decay check). Approval does not include the Decay gate (Decay N/A). Enable HOLD_WHEN_DECAY_UNAVAILABLE to require Decay before APPROVED.
Performance Decay: N/A (min 3 periods required for decay check). When N/A, Decay condition is omitted; HOLD_WHEN_DECAY_UNAVAILABLE (backend) can force HOLD instead of APPROVED. Warning: Decay N/A and Fail-Safe is off; deployment not based on decay.
Final Decision = (Risk Score Verdict) AND (Performance Decay < 80% when Decay is defined; when Decay is N/A this condition is omitted) AND (Min OOS Trades met). REJECTED when any applied condition fails. Performance Decay is a deployment gate (step 2). Governance (Sharpe Drift, Tail-Risk, etc.) is advisory only; does not change the result.
Risk Score: Base 71 − Penalty 0 · Risk Class: LOW (71/100) (passing)
Pro-Note: Highest sensitivity: Buy_rsi (0.29, Stable).
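
The Final Decision rule of this parameter audit can be sketched as a small function. This is an illustration of the stated boolean logic only; the function name and return strings are assumptions, and the order in which HOLD is evaluated relative to the other conditions is a guess.

```python
from typing import Optional


def final_decision(
    risk_verdict_pass: bool,
    decay_pct: Optional[float],          # None when Decay is N/A
    min_oos_trades_met: bool,
    hold_when_decay_unavailable: bool = False,
) -> str:
    # Any applied condition failing means REJECTED.
    if not risk_verdict_pass or not min_oos_trades_met:
        return "REJECTED"
    if decay_pct is None:
        # Decay condition is omitted unless the fail-safe flag is on.
        if hold_when_decay_unavailable:
            return "HOLD"
        return "APPROVED (no Decay check)"
    return "APPROVED" if decay_pct < 80.0 else "REJECTED"


# This report: Risk Score passing, Decay N/A, fail-safe off.
# min_oos_trades_met=True is assumed; the report does not state it.
print(final_decision(True, None, True))
```

Turning the fail-safe on changes the same inputs to HOLD, which is the behaviour the Deployment Status line describes for HOLD_WHEN_DECAY_UNAVAILABLE.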

TRADING INTENSITY & COST DRAG

Execution: Simple (estimated fees)

Results use estimated fees/slippage. Provide exact exchange parameters for Institutional-grade analysis.
Position velocity (holding-period) (198.0x) is 2.36x institutional turnover (83.8x). Overlapping positions likely; institutional turnover is used for cost and rebate.
Market Impact (Layer 2.5)
  • ADV $458.757 is very low; model assumptions may not hold.
INTERPRETIVE SUMMARY
Period gross is negative (profit factor < 1) but per-trade gross is positive. Strategy loses at calendar level; accumulated costs exceed profit. Required Alpha Boost is execution offset only.
Baseline AUM: $1,000
Avg Trades / Month: 20
Annual Turnover (institutional): 83.8x
Position velocity (holding-period): 198.0x
Avg Holding Time: 30.8h
Avg position size: 69.6% of AUM
Cross-check (trades × utilization): ~167.6x
Implied overlap factor: 2.36x
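
Two of the figures above can be reproduced with simple arithmetic. The formulas are inferred from the labels rather than documented: the overlap factor is taken as position velocity divided by institutional turnover, and the cross-check as annualized trade count times average position size, using a 365.25-day year (an assumption).

```python
# Figures from this report.
position_velocity = 198.0       # x per year
institutional_turnover = 83.8   # x per year
trades = 507
test_days = 769
avg_position_size = 0.696       # 69.6% of AUM

# Implied overlap factor.
overlap_factor = position_velocity / institutional_turnover

# Cross-check (trades x utilization).
trades_per_year = trades / test_days * 365.25
cross_check = trades_per_year * avg_position_size

print(f"overlap factor ~ {overlap_factor:.2f}x")
print(f"cross-check    ~ {cross_check:.1f}x")
```

Under these assumptions both numbers agree with the report (2.36x and about 167.6x), which suggests, but does not prove, that this is how they were derived.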
EFFICIENCY & COST LIMITS
Profit Factor (Gross, full backtest): 0.60
Profit Factor (Net, full backtest): 0.52
Cost / Edge Ratio: n/a (negative gross edge)
Avg Net Profit / Trade (bps): -7.46 bps
Break-even Slippage:
Tolerance: n/a (negative edge)
Margin of Safety: n/a
Safety Margin: n/a (negative edge)
BES Status: EDGE DEFICIT
Failure Mode: Negative period Net Edge (cost > profit)
COST DECOMPOSITION (CAGR)
Exchange Fees: -33.5%
Slippage: -16.8%
Market Impact (est.): N/A - participation ratio too high for model

Participation ratio exceeds 15% of ADV; square-root model out of range.

Total Cost Drag: -50.3%

Market impact not included (model out of range); total is fees + slippage only.

Rebate Capture: 0.48 bps/trade (≈ 0.40% CAGR at current turnover)

Rebate Capture is not included in Total Cost Drag; informational (potential savings with maker-heavy execution).

When gross edge is negative, cost decomposition shows cost allocation; improving execution alone cannot make the strategy profitable.
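
The cost figures in this section can be tied together with a short arithmetic sketch. The rebate conversion (bps per trade times institutional turnover) and the Required Alpha Boost definition (the per-trade deficit needed to reach zero net edge) are assumptions that happen to reproduce the reported numbers.

```python
# Cost decomposition (CAGR, percent) from this report.
fees_cagr = -33.5
slippage_cagr = -16.8
total_cost_drag = fees_cagr + slippage_cagr  # market impact excluded

# Rebate capture: bps per trade scaled by annual turnover, then bps -> %.
rebate_bps_per_trade = 0.48
institutional_turnover = 83.8
rebate_cagr_pct = rebate_bps_per_trade * institutional_turnover / 100

# Required alpha boost: bps per trade needed to bring net edge to zero.
avg_net_profit_bps = -7.46
required_alpha_boost = max(0.0, -avg_net_profit_bps)

print(f"total cost drag  = {total_cost_drag:.1f}%")
print(f"rebate capture   ~ {rebate_cagr_pct:.2f}% CAGR")
print(f"required boost   = {required_alpha_boost:.2f} bps/trade")
```

Even full rebate capture (about 0.40% CAGR) is small next to a 50.3% cost drag, which is consistent with the note that better execution alone cannot rescue the strategy.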

CAPACITY & MARKET IMPACT
Estimated AUM Capacity: N/A - model out of range (participation > 15% ADV)
ADV Utilization:
Top 5 traded pairs: n/a
Portfolio weighted: n/a
Market Impact Model:
Assumption: Square-root law
Liquidity regime: Micro / low liquidity
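
A generic square-root impact model with the 15% participation cut-off can be sketched as follows. The coefficient `k`, the daily volatility input and the function name are hypothetical, and the ADV figure is read literally as $458.757; none of these are confirmed by the report.

```python
import math
from typing import Optional

MAX_PARTICIPATION = 0.15  # model considered out of range above 15% of ADV


def sqrt_impact_bps(
    order_usd: float,
    adv_usd: float,
    daily_vol_bps: float,
    k: float = 1.0,  # hypothetical calibration constant
) -> Optional[float]:
    """Square-root law: impact ~ k * sigma * sqrt(order / ADV).

    Returns None when the participation ratio is outside the model range.
    """
    participation = order_usd / adv_usd
    if participation > MAX_PARTICIPATION:
        return None
    return k * daily_vol_bps * math.sqrt(participation)


# Average position (69.6% of $1,000) against the reported ADV.
print(sqrt_impact_bps(696.0, 458.757, daily_vol_bps=300.0))

# An in-range example: 1% participation.
print(sqrt_impact_bps(1_000.0, 100_000.0, daily_vol_bps=300.0))
```

With the report's inputs the order is larger than the ADV itself, so the function declines to estimate, mirroring the N/A entries in this section.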
SLIPPAGE SENSITIVITY (NON-LINEAR)
AUM Size          Slippage CAGR   Net CAGR
~$100k [N/A]      N/A             N/A
~$1.0M [N/A]      N/A             N/A
~$5.0M [N/A]      N/A             N/A
~$10.0M [N/A]     N/A             N/A
EXECUTION HARDENING
Order Type Bias: Limit-biased
Taker / Maker Ratio: 40 / 60
Limit Fill Probability: 21.0%
Opportunity Cost (Fill Decay): 17.80 bps
Adverse Selection (Cost): 4.00 bps
Latency Sensitivity: Medium
Toxic Flow Risk: Medium
Moderate adverse selection risk in fast markets.
SENSITIVITY TO ALPHA DECAY
Alpha Half-Life: 480.0 days

Long half-life from WFA validation; high-turnover strategies may have shorter effective decay in practice.

Win Rate Sensitivity: n/a (not meaningful)
RISK & CONTROLS
Primary Constraint: Gross edge negative (alpha-deficit)
Gross edge (per trade, at institutional 83.8x): +52.6 bps

High value reflects low institutional turnover denominator; most capital cost is in overlap periods.

Gross edge (period/CAGR): negative
Available Control Levers:
Reduce trading frequency: Low
Increase entry threshold (signal strength): Low
Shift to maker-only execution: Low
STATUS
Deployment Class: Micro-cap / Research-only
COST ADAPTABILITY: ❌ FAIL (Required: +7.5 bps)
Required Alpha Boost (bps per trade): 7.46 bps
CAPACITY GOVERNANCE: ⚠ SCALE-LIMITED
EXECUTION RISK: ⚠ WARNING
Confidence Level: Low
20 trades/mo, low signal-to-noise. Negative Z suggests loss-making; low confidence (small sample).
Z-Score: -1.48 (negative = loss-making tendency; low confidence (small sample). Indicative only; statistical significance not claimed at this n.)

STRATEGY ACTION PLAN

Slippage Sensitivity Analysis
Strategy not viable - slippage sensitivity table suppressed (negative base Sharpe).

Baseline Sharpe: from WFA OOS (window-level).

WFE 0.00 (all windows, n=6)

Equity erodes as slippage increases.

The Decision Engine
Phase 1: NOT VIABLE
  • Allocation: 0% - strategy not viable (negative base Sharpe). Do not allocate.
  • Monitoring: Observation without capital. Track Buy_rsi to detect when OOS Sharpe becomes positive.
  • Runtime Kill Switch: TRIGGERED
Kill Switch Reset Conditions (ALL must be met):
  • OOS Sharpe > 0 across minimum 2 consecutive windows
  • Fail ratio drops below 33%
  • WFE (all windows) above Phase 2 threshold for this strategy
  • Manual review by risk manager
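
The reset conditions above are conjunctive, which a short sketch makes explicit. It reads "minimum 2 consecutive windows" as the two most recent windows, and takes the Phase 2 WFE threshold as a parameter because its value is not given in this report; both are assumptions.

```python
def can_reset_kill_switch(
    oos_sharpes,        # per-window OOS Sharpe, oldest first
    fail_ratio,         # failed windows / all windows
    wfe_all_windows,
    wfe_threshold,      # hypothetical: Phase 2 threshold, not stated here
    manual_review_ok,
):
    """All four conditions must hold before the kill switch may be reset."""
    recent_positive = len(oos_sharpes) >= 2 and all(
        s > 0 for s in oos_sharpes[-2:]
    )
    return (
        recent_positive
        and fail_ratio < 0.33
        and wfe_all_windows > wfe_threshold
        and manual_review_ok
    )


# Current state: fail ratio 3/6, WFE (all windows) 0.00, no review sign-off.
# The per-window Sharpe values below are illustrative only.
print(can_reset_kill_switch([-0.8, 0.2, -0.5], 0.5, 0.0, 0.5, False))
```

Because the conditions are combined with AND, a single positive window or a manual override alone cannot reset the switch.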
Phase 2: UNAVAILABLE

Strategy did not pass Phase 1 (NOT VIABLE). Phase 2 conditions do not apply until base Sharpe is positive.

Why This Works
Bull Case

Bull Case: N/A - strategy not valid (negative base Sharpe).

Bear Case (Risks)
  • OOS retention may reflect a single regime; 0/3 regimes pass - strategy not validated across market conditions
Recommended Fixes
  • Statistical Significance: Extend test by +2 years or add instruments with low cross-correlation (ρ < 0.3) to generate independent observations. Correlated instruments share the same market regime and do not increase effective sample size.(High)
  • Tail Risk: Add a hard tail stop or halve leverage.(High)

RISK METRICS (OUT-OF-SAMPLE)

Out-of-sample risk metrics from Walk-Forward Analysis (stitched OOS equity curve or window returns).

Max Drawdown: 42.91%
Recovery Factor: -0.87
Sharpe Ratio (OOS): -0.10
Sortino Ratio: n/a
VaR (95%): -1.28% [!]
CVaR (ES): n/a
Profit Factor (OOS): 0.36
Gain-to-Pain: -0.64
Trade Win Rate: n/a
Expectancy (loss units): -3% (of avg loss)
Period Win Rate (trades): 22%
Tail Ratio: n/a
Payoff Ratio: 0.08
Edge Stability (t): -1.13
Skewness: -0.06
Kurtosis: 67.25 (win. 67.2)
Durbin-Watson: n/a
Diagnostic: Payoff Ratio is very low (avg win is only a fraction of avg loss); gains are small relative to losses. Negative Recovery Factor indicates the strategy has not recovered from max drawdown and is net negative.
Context: OOS metrics from 1 window (N=125 returns) (small sample - interpret with caution).
Regime Context: High drawdown (Max DD: 42.9%). Consider regime-dependent risk; do not infer volatility expectations without explicit volatility estimate.
Tail Risk Profile: Fat-tailed distribution (Kurtosis: 67.2). Elevated probability of extreme events. Tail Ratio may be unreliable on small sample.
Tail Authority: CVaR (ES) is unreliable - insufficient sample for robust ES estimation (e.g. single OOS window). VaR is reported; CVaR is not. In degenerate tail cases ES would equal VaR; we do not report a ratio when CVaR is unavailable. Both VaR and any reported tail metrics should be treated as lower-bound estimates only.
Risk Attribution: Edge profile is mixed with limited payoff buffer. Payoff Ratio (0.08) is very low - gains are small relative to losses.
Risk Verdict: Insufficient data - OOS metrics from 1 window are not statistically meaningful. Collect more walk-forward windows before interpreting.
UNSTABLE: Insufficient data - single OOS window. Collect more walk-forward windows before interpreting.
Max Leverage: 1x
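
To illustrate why a fat-tailed sample makes VaR a lower-bound estimate, the sketch below computes skewness, kurtosis and a historical 95% VaR on a hypothetical return series of 125 observations with one extreme loss. The data is invented for illustration, and the kurtosis is the plain fourth standardized moment (normal = 3); the report does not say which convention it uses.

```python
def moments(returns):
    """Population skewness and (non-excess) kurtosis."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def historical_var(returns, level=0.95):
    """Empirical quantile: the return at the (1 - level) rank."""
    ordered = sorted(returns)
    idx = int((1 - level) * len(ordered))
    return ordered[idx]


# Hypothetical series: 124 small moves of +/-0.1% and one -10% shock.
returns = [0.1, -0.1] * 62 + [-10.0]

skew, kurt = moments(returns)
var_95 = historical_var(returns)
print(f"skew = {skew:.2f}, kurtosis = {kurt:.1f}, VaR(95%) = {var_95}%")
```

In this toy series the single shock dominates kurtosis while the 95% VaR stays at -0.1%, far from the worst loss, which is the pattern the Tail Authority note warns about.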

This analysis is for informational purposes only and does not constitute investment advice. Past performance is not indicative of future results. All metrics are model-based and subject to assumptions (slippage, fees, liquidity).