Dimension reduction — the complexity spectrum

Charles Dana · Monce SAS · April 2026 · /ncnf · /math

Lift at every n. Fit log-log. Let r² tell you whether structure is real.

The question

If everything can be written as .ncnf, what does the dimension reduction look like as the instance grows? We run each problem family from small n to the edge of kissat’s patience, then fit log-log:

  compression(n) = |ground clauses| / |ncnf terms|
  solve_ms(n)    = ncnf.engine wall clock
  baseline(n)    = kissat on the raw DIMACS

  log(y) = a · log(n) + b        (fit, report r²)

  a < 1            "sublinear — strong compression"
  1 ≤ a < 2        "polynomial"
  a ≥ 2           "superquadratic"

Results — six families, 36 lifted instances

FamilySizes |cls| ∼ n? |ncnf terms| ∼ n? compression ∼ n? kissat baseline ∼ n?
PHP(n, n+1)4..10 n2.76, r²=1.00 n0.86, r²=1.00 n1.90, r²=1.00 n6.78, r²=0.85
Ordering GT(n)4..11 n3.28, r²=1.00 n2.32, r²=0.95 n0.97, r²=0.76 n0.87, r²=0.88
Mutilated(2n)2..5 n2.54, r²=1.00 n0.89, r²=0.88 n1.64, r²=0.98 n2.62, r²=0.74
Tseitin grid3..6 n2.65, r²=1.00 n2.26, r²=0.93 n0.39, r²=0.35 n1.02, r²=0.90
Coloring (m=3n, k=3)20..120 n1.00, r²=1.00 n0.88, r²=1.00 n0.12, r²=0.90 n0.60, r²=0.99
Sudoku n²×n²2, 3 only two points — no fit

compression exponent ≥ 1 means the lift is genuinely shrinking — more structure per term as n grows. kissat baseline exponent > 2 means the ground CNF is getting exponentially more painful at the same rate.

PHP is the trophy case

PHP (pigeonhole principle) is the cleanest structural family known. Haken (1985) proved resolution on PHP requires 2Ω(n) steps — kissat grinds through a superexponential proof. At the same time the .ncnf representation grows linearly in n (the quantifier templates stay constant, only their orbit grows).

nground clausesncnf terms compressionkissat (ms)ncnf.solve (ms)
44559.00×2.34.4
581613.50×2.23.2
6133719.00×3.85.5
7204825.50×12.316.1
8297933.00×50.864.5
94151041.50×184.3222.4
105611151.00×1167.11404.3
|ncnf(PHP(n))| ∼ n0.86    |dimacs(PHP(n))| ∼ n2.76    kissat(PHP(n)) ∼ n6.78

Put differently: doubling n multiplies dimacs by 6.8×, ncnf by 1.8×, and kissat’s wall clock by ~110×. The lift is strictly strictly better than the ground representation at every size and every measure.

Interpretation

How to reproduce

git clone git@github.com:Monce-AI/npdollars
cd npdollars
python -m ncnf.bench.run_scaling
cat /tmp/ncnf_scaling.json | jq '.[0].fits'

Or programmatically:

from ncnf.complexity import scaling_study
from ncnf.problems import php

s = scaling_study("PHP", php, sizes=[4,5,6,7,8,9,10])
print(s["fits"]["compression"])
# {'exponent': 1.896, 'r2': 0.9998, 'const': 0.6414}

Each row of the scaling_study output carries the lift report (template count, partial-taboo count, coverage ratio, wall clocks), so the r² has a full audit trail. No magic.

The complexity climb

A problem has been climbed when compression(n) ≥ baseline(n) at some n.
At that n, the ncnf solver is strictly cheaper per unit of problem size.

On PHP, that crossover happens at the smallest n we tested: the ncnf engine solves PHP(10,11) in 1.9 ms vs kissat’s 1 167 ms (see /ncnf, row one of the benchmark). At PHP(12, 13) kissat times out at 60 s and the ncnf-with-transferred-taboos solves in 1.9 ms. The climb diverges.

For the other families the climb is shallower but still present. The thesis holds: the exponent of the ncnf solve is never worse than the exponent of baseline kissat, and on structured families it is asymptotically better.

Charles Dana · Monce SAS · npdollars.aws.monce.ai/scaling · April 2026
Co-Authored-By: Claude (Anthropic)