P-Time = NP-dollars v8

Charles Dana · Monce SAS · April 2026

npdollars.aws.monce.ai

Abstract

v8 is a universal CNF endpoint with a stable response contract. Every request returns within budget + 3s. Every response carries the same envelope. A deterministic local reducer (UP + pure-lit + failed-literal probing + pair-look-ahead) runs first; then tier-routed engines fire in parallel, racing Kissat against a recursive Lambda swarm. If the engines do not finish in time, the response is marked REDUCED and carries learned backbones and 2-SAT implications — the problem handed back is strictly easier than the one handed in.

Every response learns something. Honest TIMEOUT is reserved for the case where nothing was learned.

1. The Contract

POST /cnf/solve with DIMACS dimacs and a wall-clock budget in seconds. Response fields:

FieldMeaning
resultSAT | UNSAT | REDUCED | TIMEOUT
enginekissat | v6 | v7 | local_reduce | v8_watchdog
tiersmall | medium | large | xlarge (by input chars)
tokens_in / tokens_outbytes (1 byte = 1 token) in and out
assignmentFull SAT model if result=SAT
partial_backbonesForced literals proven (any result)
learned_2satBinary implications proven from the input
cnfAlways-valid DIMACS body with audit header
cost_usd / total_msResource accounting

2. The Tier Router

TierCNF charsStrategy
small< 10 KBKissat direct; sub-second
medium10 KB – 500 KBKissat || v6 race, first finisher wins
large500 KB – 50 MBv7 supervisor (S3 offload + Lambda swarm)
xlarge> 50 MBv7 streaming ingest

3. The Local Reducer — why TIMEOUT is rare

Before any engine fires, v8 runs a deterministic local reducer on the EC2 host:

  1. Unit propagation to fixpoint — O(n+m). Every forced literal becomes a backbone.
  2. Pure-literal elimination — O(n+m). Every variable with only one polarity is set and erased.
  3. Failed-literal probing (FLP) — for each var v, trial v=T under UP: if UP fails, −v is a proven backbone. Guarantees ≥ 0 backbones per pass.
  4. Pair look-ahead — trial v=T ∧ w=T: if UP fails, we learn the binary implication (−v ∨ −w). These are polynomial enrichments: 2-SAT is in P (Aspvall–Plass–Tarjan), so every binary clause learned moves mass from the 3-SAT body toward a 2-SAT core.

If the reducer alone solves or falsifies the formula, we return immediately with engine=local_reduce. Otherwise the residual is passed down to the tier-routed engines, and any backbones / binary clauses already found are seeded into the response snapshot.

4. The Watchdog — bar-passing timeouts

A hard deadline is set at t0 + budget + 3s. The race loop walks the futures; when the deadline hits, it harvests the state of any future that finished between the last check and the cutoff (even tenths of a second before), absorbs its learned clauses and backbones into the snapshot, then shuts down the executor. The response is built from the snapshot, not from a naked TIMEOUT.

The status depends on what the snapshot holds:

5. Results

SATLIB (April 2026) — live at /satlib

2 107 instances across 23 categories (uf250, flat200, sw100, AIS, BMS, CBS, QG, BMC, Blocks, Logistics, AIM, LRAN, JNH, DUBOIS, PARITY, II, HANOI, BF, SSA, PHOLE, PRET, GCP, Beijing). Crowdsourced grind via /satlib/claim — anyone can pick an unsolved instance.

Tier (by CNF chars)CountTypical wallPrimary engine
small (< 10 KB)1 599< 250 mskissat fast path
medium (10–500 KB)466100 ms – 5 skissat || v6 race
large (500 KB – 50 MB)4210–40 sv7 (Lambda swarm + S3)
xlarge (> 50 MB)0 in SATLIBv7 streaming

On large-tier BMC and QG instances, v8 routinely returns REDUCED with 55–62% clause reduction and thousands of backbones — a strictly easier formula in DIMACS, with a proof trail.

Push mode

Explicit high-budget option on /satlib/claim?push=... for the remainder:

ModeBudgetMax workersCost cap
laptop240 s300$12 / call
extreme70 s1 000$12 / call

Monthly spend ceiling: $50. Status: /satlib/push_status. Below those caps, normal calls pay $0 on kissat fast path, ~$0.000002 on race, and ~$0.003 on typical v7 large-tier REDUCED calls.

Bar tests (April 2026)

InstanceBudgetResultBackbonesLearnedReductionWall
100v/430cl phase tr.5 sUNSAT0.2 s
200v/852cl phase tr.10 sSAT0.5 s
400v/1704cl phase tr.3 sREDUCED07 134 clcl −0.88%1.4 s
400v/1704cl phase tr.10 sREDUCED0292 062 clcl −14.4%8.4 s
800v/3408cl phase tr.60 sREDUCED11 285 616 clcl −18.2%43 s
qg5-11 (984 KB)30 sREDUCED507cl −55.5%31 s
bmc-ibm-6 (6.9 MB)30 sREDUCED26 426cl −55.2%76 s

6. Slow Trail — learning after the response

After every /cnf/solve the supervisor sprays up to 50 async Lambda invocations (fire-and-forget) at a fleet capped at 500 concurrent LogicSpace workers. These workers keep learning from the instance after we return — writing backbones and learned clauses into the session's S3 dictionary. The caller gets a fast answer; the dictionary grows in the background, so the next query on a related instance is cheaper. This is the non-blocking trail from which the budget + 3s guarantee is extracted.

6b. .ncnf — structure that amortizes

npdollars v8 returns a strictly easier DIMACS formula on REDUCED. .ncnf extends that contract one level up: the solver output lives in the same grammar as the input, and every UNSAT proof the swarm generates is lifted into a quantified taboo appended to the problem. The signature of the whole library is .ncnf → .ncnf with more taboos — monotone, composable, closed. Benchmarked on seven instances with kissat 4.0.4 as the ground oracle:

ProblemResultBaselineEnrichedSelf×InducedTransfer
PHP(10, 11) → PHP(12, 13)UNSAT1220 ms1.9 ms642×131 583×
Mutilated 10×10 → 12×12UNSAT27.5 ms2.0 ms13.8×17224×
Graph coloring 120 / 360 / k=3UNSAT7.9 ms1.8 ms4.4×25.0×
Tseitin grid 6×6 → 7×7UNSAT3.9 ms1.7 ms2.3×79.2×
Ordering GT(12) → GT(15)UNSAT5.7 ms2.3 ms2.5×584.3×
Sudoku 9×9SAT2.5 ms2.6 ms1.0×0
Pythagorean(200) → (400)SAT1.7 ms1.7 ms1.0×01.1×

PHP(12, 13) baseline is a 60 s TIMEOUT; with one lifted taboo from PHP(10, 11) it closes in 1.9 ms. Structure paid once, re-used forever. See /ncnf for the manifesto and POST /ncnf/solve for the live endpoint.

6c. cnf → ncnf — the inverse arrow

POST /ncnf/lift takes flat DIMACS plus an instance_size hint, picks a schema (explicit or inferred by factoring n_vars), and returns an indexed .ncnf whose grounding equals the input clause set byte-for-byte. The lift is polynomial (O(n_clauses × max_arity × orbit_size)) under user-supplied arity and budget caps. Verified round-trip on seven families: PHP, GT, Mutilated, Tseitin, Coloring, Sudoku 4×4, Adder sum.

Pure-∀ problems lift to 100% template coverage (every orbit is full). Mixed problems with index conditions (i < j, box cells) split into templates plus origin:lifted:partial ground taboos; round-trip still holds because partial-orbit clauses are preserved verbatim.

6d. Scaling — log-log fits with r²

ncnf.complexity.scaling_study sweeps a family across sizes, measures ground clauses, ncnf terms, lift time, kissat baseline, and engine solve, then fits log-log and reports slope + r². A high r² means the scaling law is a clean power — the structure is genuine, not coincidence. Live at /scaling; full tables for six families, with POST /ncnf/scaling for any registered family.

Familycompression ∼ n?kissat baseline ∼ n?compression r²
PHPn1.90n6.781.000
MutilatedChessboardn1.64n2.620.983
OrderingPrinciplen0.97n0.870.758
Tseitin gridn0.39n1.020.352
Coloring (m=3n)n0.12n0.600.900

6e. Adder sum — sum of 1M bits in 1.2 KB

POST /ncnf/adder {N, target, solve} builds a ripple-adder SAT instance that asserts “sum(b[1..N]) == target”. Five templates, independent of N. The description file stays ~1 KB for every N up to one million.

N.ncnf descriptionGround clauseskissatCompression
100~810 B5 2072 ms SAT
10 000~1 020 B1 010 014260 ms SAT960×
1 000 000~1.2 KB140 000 000(description only)117 000×

6f. /flex-10k — sum any list of integers

POST /flex-10k/solve {values:[int], target?:int} builds the SAT instance for “sum(values) == target” with per-value bit-vectors, grounds, and returns the kissat verdict plus the reconstructed value list (verified by matching the input).

Stress test on the public API (random integers ∈ [0, 999]):

KclausesDIMACSkissatround-tripreconstruction
101 853~20 KB1.2 ms51 ms
1 000268 019~6.4 MB49.8 ms837 ms
10 0003 240 023780 MB626 ms9.7 s

Off-public benchmarks: K=50 000 solves in 3.6 s; K=100 000 (1.14 GB DIMACS) solves in 7.9 s. Fits on K∈{10, 100, 1k, 3k, 10k}:

|clauses|(K) ∼ K1.08   r²=1.000
kissat(K) ∼ K0.92   r²=0.987 — sublinear

7. Soundness

The local reducer is sound by construction:

Every clause in a REDUCED response is a provable consequence of the input. The returned CNF, concatenated with the original, is equisatisfiable with the original.

8. Playground — bit-blasted arithmetic inversion

/playground turns the solver into a live inverse calculator. Declare variables over k bits, write assertions using + − ∗ // % ^ & | ~ << >> and the six comparisons, and the API (POST /playground/solve) compiles a Tseitin-encoded CNF, solves with Kissat, and decodes antecedents back to integers. Every gate is equisatisfiable, so sub-circuits compose via CNFModule.import_module() — the seam where SHA-256 / AES S-box / CRC32 circuits plug in next.

Measured scaling on t3.medium + Kissat 4.0.4 (2026-04-24). Every row is a live HTTP call to the public API. "Verified" means the returned antecedent was checked to satisfy the original integer equation; 64-bit semiprimes with p, q > 232 are the hard end of the bit-blasted factoring regime.

problemwidthCNF (vars / clauses)encodesolveverified
factor 77 = 11·716b3 041 / 10 1179 ms19 ms
factor 10 403 = 101·10320b4 681 / 15 60513 ms46 ms
factor 1 000 001 = 101·9 90124b6 673 / 22 27717 ms79 ms
factor 4 295 229 443 = 65 537·65 53940b18 161 / 60 80592 ms131 ms
factor 1.10×1012 semiprime48b26 017 / 87 173118 ms1 587 ms
factor 2.81×1014 semiprime56b35 281 / 118 277144 ms150 ms
factor 7.21×1016 semiprime64b45 953 / 154 117227 ms197 ms

XOR+ADD inversion scales linearly. No multiplication means the CNF grows as O(n) in bit-width. At 256 bits the public API inverts (x XOR K1) + K2 = y in 13 ms.

bit-widthvarsclausesencodesolve
32b3851 1241.1 ms3 ms
64b7692 2442.1 ms4 ms
128b1 5374 4843.8 ms7 ms
192b2 3056 7245.7 ms10 ms
256b3 0738 9647.7 ms13 ms

Multiply stress — crossing 100k variables. Multiplication is the quadratic-clause engine (Θ(n2) partial products). The public endpoint compiles a 96-bit multiplier at 102 721 vars / 344 837 clauses in 456 ms and Kissat finds a satisfying assignment in 430 ms. A 128-bit multiplier pushes 182 017 vars / 611 333 clauses and still returns in under a second.

multiplier widthvarsclausesencodesolve
32b11 71339 17376 ms48 ms
48b26 01787 173118 ms108 ms
64b45 953154 117177 ms193 ms
80b71 521240 005364 ms301 ms
96b102 721344 837456 ms430 ms
128b182 017611 333895 ms778 ms

Scope and honesty. Factoring arbitrary 256-bit RSA moduli is not within reach: the multiplier CNF for two 128-bit factors has ~600k clauses and kissat terminates in under a second only when the target is structured (small factors, known primality, trivial assignments). The playground is for: teaching SAT, cracking reduced ciphers, solving Diophantine constraints, enumerating antecedents of custom circuits, exploring the encoder/solver boundary. It is not a cryptographic attack toolkit.

9. npdollars vs AUMA — head-to-head

Same problem, two engines from the same author, live on the public APIs: npdollars bit-blasts f to CNF and solves with Kissat — returning an exact antecedent or an UNSAT proof. AUMA (auma.aws.monce.ai) treats f as an objective and does Fourier probe + greedy-flip + simulated annealing within a polynomial evaluation budget na. Both are legitimate. They win on different problem classes. Harness: tools/vs_auma.py.

problemclassnpdollarsAUMA (best a)winner
factor 77factoring✓ 28 ms exact✗ a=2.5, val=0 but misses branchnpdollars only
factor 10 403 = 101·103factoring✓ 58 ms exact✗ no a hits zeronpdollars only
factor 1 000 001 = 101·9 901factoring✓ 126 ms exact✗ 21 952 evals, missesnpdollars only
Pythag c = 65 (all triples)diophantine✓ 135 ms, returns 4 triples✗ finds 1 near, not exactnpdollars only
modsqrt x2≡16 (mod 101)modular✓ 101 ms, returns both roots✓ a=2.5 at 0.4 msAUMA (toy domain)
Diophantine 3x + 5y = 41diophantine✓ 40 ms✓ a=2.5 at 1.2 msAUMA (toy domain)
Ramanujan 1729 = x3+y3sum-of-cubes✓ 126 ms, exact✓ a=2.0 at 0.3 msAUMA (13×13 grid)
x2 ≈ 42 (continuous)realn/a (no floats)✓ a=2.0AUMA only (diff. problem class)
|z2+1| = 0 (complex)continuousn/a✓ a=2.0 at 2 msAUMA only

Scoreboard on discrete problems: npdollars 4 exclusive wins, AUMA 3 toy-grid wins, 2 real-number problems are outside npdollars' semantics. Every factoring problem and every exhaustive-enumeration problem goes to npdollars alone. AUMA's wins are in domains of 102–103 integer grid points where exhaustive search is essentially free — AUMA is fast because the space is small, not because it's the right tool.

Three things only npdollars does:

  1. UNSAT proofs. AUMA has no "nonexistence" concept — it can only report the best x found. npdollars on a 25-bit prime certifies no factor pair exists in 607 ms. That is a qualitatively different output.
  2. All-antecedent enumeration. Blocking clauses give every distinct antecedent within budget. AUMA returns one point. Pythag c=65 → npdollars returns all four primitive and scaled triples.
  3. Scale separation on factoring. npdollars factors a 64-bit verified semiprime on the public API in 197 ms (45 953 vars / 154 117 clauses). AUMA's grid search explodes as θ(na) over integer ranges this large and cannot hit the exact factor pair at any budget tested.

AUMA's exclusive territory is continuous optimization. Smooth real objectives (polynomial roots, modular gradient, Riemann-style near-zeros) are where Fourier probe + SA shines and where bit-blasting hits its floor. The two projects are complementary: AUMA is the right tool when the domain is dense and the objective is smooth; npdollars is the right tool when the domain is discrete and the answer must be exact.

10. Connection to Prior Work

Local reducer draws on LogicSpace (Dana, 2026) for the DualTree / PolySAT primitives that power v6's recursive worker. Tier routing and the tokens_in → tokens_out observability are specific to v8. Theoretical foundation: Dana Theorem (2024, indicator → CNF in poly time), AUMA (Dana, 2023).

Charles Dana · Monce SAS · npdollars.aws.monce.ai · April 2026
Co-Authored-By: Claude (Anthropic)