v8 is a universal CNF endpoint with a stable response contract. Every
request returns within budget + 3s. Every response carries the
same envelope. A deterministic local reducer (UP + pure-lit + failed-literal
probing + pair-look-ahead) runs first; then tier-routed engines fire
in parallel, racing Kissat against a recursive Lambda swarm. If the
engines do not finish in time, the response is marked
REDUCED and carries learned backbones and 2-SAT
implications — the problem handed back is strictly easier
than the one handed in.
POST /cnf/solve with DIMACS dimacs and a
wall-clock budget in seconds. Response fields:
| Field | Meaning |
|---|---|
result | SAT | UNSAT | REDUCED | TIMEOUT |
engine | kissat | v6 | v7 | local_reduce | v8_watchdog |
tier | small | medium | large | xlarge (by input chars) |
tokens_in / tokens_out | bytes (1 byte = 1 token) in and out |
assignment | Full SAT model if result=SAT |
partial_backbones | Forced literals proven (any result) |
learned_2sat | Binary implications proven from the input |
cnf | Always-valid DIMACS body with audit header |
cost_usd / total_ms | Resource accounting |
| Tier | CNF chars | Strategy |
|---|---|---|
| small | < 10 KB | Kissat direct; sub-second |
| medium | 10 KB – 500 KB | Kissat || v6 race, first finisher wins |
| large | 500 KB – 50 MB | v7 supervisor (S3 offload + Lambda swarm) |
| xlarge | > 50 MB | v7 streaming ingest |
Before any engine fires, v8 runs a deterministic local reducer on the EC2 host:
If the reducer alone solves or falsifies the formula, we return immediately
with engine=local_reduce. Otherwise the residual is passed down
to the tier-routed engines, and any backbones / binary clauses already found
are seeded into the response snapshot.
A hard deadline is set at t0 + budget + 3s. The race loop
walks the futures; when the deadline hits, it harvests the state of any
future that finished between the last check and the cutoff (even
tenths of a second before), absorbs its learned clauses and backbones
into the snapshot, then shuts down the executor. The response is built
from the snapshot, not from a naked TIMEOUT.
The status depends on what the snapshot holds:
engines_tried and last_engine_error exposed.2 107 instances across 23 categories (uf250, flat200, sw100, AIS,
BMS, CBS, QG, BMC, Blocks, Logistics, AIM, LRAN, JNH, DUBOIS, PARITY,
II, HANOI, BF, SSA, PHOLE, PRET, GCP, Beijing). Crowdsourced grind via
/satlib/claim — anyone can pick an unsolved instance.
| Tier (by CNF chars) | Count | Typical wall | Primary engine |
|---|---|---|---|
| small (< 10 KB) | 1 599 | < 250 ms | kissat fast path |
| medium (10–500 KB) | 466 | 100 ms – 5 s | kissat || v6 race |
| large (500 KB – 50 MB) | 42 | 10–40 s | v7 (Lambda swarm + S3) |
| xlarge (> 50 MB) | 0 in SATLIB | — | v7 streaming |
On large-tier BMC and QG instances, v8 routinely returns REDUCED with 55–62% clause reduction and thousands of backbones — a strictly easier formula in DIMACS, with a proof trail.
Explicit high-budget option on /satlib/claim?push=... for the
remainder:
| Mode | Budget | Max workers | Cost cap |
|---|---|---|---|
| laptop | 240 s | 300 | $12 / call |
| extreme | 70 s | 1 000 | $12 / call |
Monthly spend ceiling: $50. Status: /satlib/push_status. Below those caps, normal calls pay $0 on kissat fast path, ~$0.000002 on race, and ~$0.003 on typical v7 large-tier REDUCED calls.
| Instance | Budget | Result | Backbones | Learned | Reduction | Wall |
|---|---|---|---|---|---|---|
| 100v/430cl phase tr. | 5 s | UNSAT | — | — | — | 0.2 s |
| 200v/852cl phase tr. | 10 s | SAT | — | — | — | 0.5 s |
| 400v/1704cl phase tr. | 3 s | REDUCED | 0 | 7 134 cl | cl −0.88% | 1.4 s |
| 400v/1704cl phase tr. | 10 s | REDUCED | 0 | 292 062 cl | cl −14.4% | 8.4 s |
| 800v/3408cl phase tr. | 60 s | REDUCED | 1 | 1 285 616 cl | cl −18.2% | 43 s |
| qg5-11 (984 KB) | 30 s | REDUCED | 507 | — | cl −55.5% | 31 s |
| bmc-ibm-6 (6.9 MB) | 30 s | REDUCED | 26 426 | — | cl −55.2% | 76 s |
After every /cnf/solve the supervisor sprays up to
50 async Lambda invocations (fire-and-forget) at a fleet capped at
500 concurrent LogicSpace workers. These workers keep learning
from the instance after we return — writing backbones and learned
clauses into the session's S3 dictionary. The caller gets a fast answer;
the dictionary grows in the background, so the next query on a related
instance is cheaper. This is the non-blocking trail from which the
budget + 3s guarantee is extracted.
npdollars v8 returns a strictly easier DIMACS formula on
REDUCED. .ncnf extends that contract one level up: the solver
output lives in the same grammar as the input, and every UNSAT
proof the swarm generates is lifted into a quantified taboo
appended to the problem. The signature of the whole library is
.ncnf → .ncnf with more taboos — monotone, composable,
closed. Benchmarked on seven instances with kissat 4.0.4 as the ground
oracle:
| Problem | Result | Baseline | Enriched | Self× | Induced | Transfer |
|---|---|---|---|---|---|---|
| PHP(10, 11) → PHP(12, 13) | UNSAT | 1220 ms | 1.9 ms | 642× | 1 | 31 583× |
| Mutilated 10×10 → 12×12 | UNSAT | 27.5 ms | 2.0 ms | 13.8× | 17 | 224× |
| Graph coloring 120 / 360 / k=3 | UNSAT | 7.9 ms | 1.8 ms | 4.4× | 2 | 5.0× |
| Tseitin grid 6×6 → 7×7 | UNSAT | 3.9 ms | 1.7 ms | 2.3× | 7 | 9.2× |
| Ordering GT(12) → GT(15) | UNSAT | 5.7 ms | 2.3 ms | 2.5× | 58 | 4.3× |
| Sudoku 9×9 | SAT | 2.5 ms | 2.6 ms | 1.0× | 0 | — |
| Pythagorean(200) → (400) | SAT | 1.7 ms | 1.7 ms | 1.0× | 0 | 1.1× |
PHP(12, 13) baseline is a 60 s TIMEOUT; with one lifted taboo from
PHP(10, 11) it closes in 1.9 ms. Structure paid once, re-used forever.
See /ncnf for the manifesto and
POST /ncnf/solve for the live endpoint.
POST /ncnf/lift takes flat DIMACS plus an instance_size
hint, picks a schema (explicit or inferred by factoring n_vars), and
returns an indexed .ncnf whose grounding equals the input clause set
byte-for-byte. The lift is polynomial
(O(n_clauses × max_arity × orbit_size)) under
user-supplied arity and budget caps. Verified round-trip on seven
families: PHP, GT, Mutilated, Tseitin, Coloring, Sudoku 4×4,
Adder sum.
Pure-∀ problems lift to 100% template coverage (every
orbit is full). Mixed problems with index conditions (i < j,
box cells) split into templates plus origin:lifted:partial
ground taboos; round-trip still holds because partial-orbit clauses are
preserved verbatim.
ncnf.complexity.scaling_study sweeps a family across
sizes, measures ground clauses, ncnf terms, lift time, kissat baseline,
and engine solve, then fits log-log and reports slope + r².
A high r² means the scaling law is a clean power — the
structure is genuine, not coincidence. Live at /scaling;
full tables for six families, with POST /ncnf/scaling for
any registered family.
| Family | compression ∼ n? | kissat baseline ∼ n? | compression r² |
|---|---|---|---|
| PHP | n1.90 | n6.78 | 1.000 |
| MutilatedChessboard | n1.64 | n2.62 | 0.983 |
| OrderingPrinciple | n0.97 | n0.87 | 0.758 |
| Tseitin grid | n0.39 | n1.02 | 0.352 |
| Coloring (m=3n) | n0.12 | n0.60 | 0.900 |
POST /ncnf/adder {N, target, solve} builds a ripple-adder
SAT instance that asserts “sum(b[1..N]) == target”.
Five templates, independent of N. The description file stays ~1 KB
for every N up to one million.
| N | .ncnf description | Ground clauses | kissat | Compression |
|---|---|---|---|---|
| 100 | ~810 B | 5 207 | 2 ms SAT | 6× |
| 10 000 | ~1 020 B | 1 010 014 | 260 ms SAT | 960× |
| 1 000 000 | ~1.2 KB | 140 000 000 | (description only) | 117 000× |
POST /flex-10k/solve {values:[int], target?:int} builds
the SAT instance for “sum(values) == target” with
per-value bit-vectors, grounds, and returns the kissat verdict plus the
reconstructed value list (verified by matching the input).
Stress test on the public API (random integers ∈ [0, 999]):
| K | clauses | DIMACS | kissat | round-trip | reconstruction |
|---|---|---|---|---|---|
| 10 | 1 853 | ~20 KB | 1.2 ms | 51 ms | ✓ |
| 1 000 | 268 019 | ~6.4 MB | 49.8 ms | 837 ms | ✓ |
| 10 000 | 3 240 023 | 780 MB | 626 ms | 9.7 s | ✓ |
Off-public benchmarks: K=50 000 solves in 3.6 s; K=100 000 (1.14 GB DIMACS) solves in 7.9 s. Fits on K∈{10, 100, 1k, 3k, 10k}:
The local reducer is sound by construction:
Every clause in a REDUCED response is a provable consequence of the input. The returned CNF, concatenated with the original, is equisatisfiable with the original.
/playground turns the solver into a live
inverse calculator. Declare variables over k bits, write
assertions using + − ∗ // % ^ & | ~ << >>
and the six comparisons, and the API (POST /playground/solve)
compiles a Tseitin-encoded CNF, solves with Kissat, and decodes
antecedents back to integers. Every gate is equisatisfiable, so
sub-circuits compose via CNFModule.import_module() —
the seam where SHA-256 / AES S-box / CRC32 circuits plug in next.
Measured scaling on t3.medium + Kissat 4.0.4 (2026-04-24). Every row is a live HTTP call to the public API. "Verified" means the returned antecedent was checked to satisfy the original integer equation; 64-bit semiprimes with p, q > 232 are the hard end of the bit-blasted factoring regime.
| problem | width | CNF (vars / clauses) | encode | solve | verified |
|---|---|---|---|---|---|
| factor 77 = 11·7 | 16b | 3 041 / 10 117 | 9 ms | 19 ms | ✓ |
| factor 10 403 = 101·103 | 20b | 4 681 / 15 605 | 13 ms | 46 ms | ✓ |
| factor 1 000 001 = 101·9 901 | 24b | 6 673 / 22 277 | 17 ms | 79 ms | ✓ |
| factor 4 295 229 443 = 65 537·65 539 | 40b | 18 161 / 60 805 | 92 ms | 131 ms | ✓ |
| factor 1.10×1012 semiprime | 48b | 26 017 / 87 173 | 118 ms | 1 587 ms | ✓ |
| factor 2.81×1014 semiprime | 56b | 35 281 / 118 277 | 144 ms | 150 ms | ✓ |
| factor 7.21×1016 semiprime | 64b | 45 953 / 154 117 | 227 ms | 197 ms | ✓ |
XOR+ADD inversion scales linearly. No multiplication means
the CNF grows as O(n) in bit-width. At 256 bits the public
API inverts (x XOR K1) + K2 = y in
13 ms.
| bit-width | vars | clauses | encode | solve |
|---|---|---|---|---|
| 32b | 385 | 1 124 | 1.1 ms | 3 ms |
| 64b | 769 | 2 244 | 2.1 ms | 4 ms |
| 128b | 1 537 | 4 484 | 3.8 ms | 7 ms |
| 192b | 2 305 | 6 724 | 5.7 ms | 10 ms |
| 256b | 3 073 | 8 964 | 7.7 ms | 13 ms |
Multiply stress — crossing 100k variables. Multiplication is the quadratic-clause engine (Θ(n2) partial products). The public endpoint compiles a 96-bit multiplier at 102 721 vars / 344 837 clauses in 456 ms and Kissat finds a satisfying assignment in 430 ms. A 128-bit multiplier pushes 182 017 vars / 611 333 clauses and still returns in under a second.
| multiplier width | vars | clauses | encode | solve |
|---|---|---|---|---|
| 32b | 11 713 | 39 173 | 76 ms | 48 ms |
| 48b | 26 017 | 87 173 | 118 ms | 108 ms |
| 64b | 45 953 | 154 117 | 177 ms | 193 ms |
| 80b | 71 521 | 240 005 | 364 ms | 301 ms |
| 96b | 102 721 | 344 837 | 456 ms | 430 ms |
| 128b | 182 017 | 611 333 | 895 ms | 778 ms |
Scope and honesty. Factoring arbitrary 256-bit RSA moduli is not within reach: the multiplier CNF for two 128-bit factors has ~600k clauses and kissat terminates in under a second only when the target is structured (small factors, known primality, trivial assignments). The playground is for: teaching SAT, cracking reduced ciphers, solving Diophantine constraints, enumerating antecedents of custom circuits, exploring the encoder/solver boundary. It is not a cryptographic attack toolkit.
Same problem, two engines from the same author, live on the public
APIs: npdollars bit-blasts f to CNF and solves with
Kissat — returning an exact antecedent or an UNSAT
proof. AUMA (auma.aws.monce.ai)
treats f as an objective and does Fourier probe +
greedy-flip + simulated annealing within a polynomial evaluation
budget na. Both are legitimate. They win on
different problem classes. Harness:
tools/vs_auma.py.
| problem | class | npdollars | AUMA (best a) | winner |
|---|---|---|---|---|
| factor 77 | factoring | ✓ 28 ms exact | ✗ a=2.5, val=0 but misses branch | npdollars only |
| factor 10 403 = 101·103 | factoring | ✓ 58 ms exact | ✗ no a hits zero | npdollars only |
| factor 1 000 001 = 101·9 901 | factoring | ✓ 126 ms exact | ✗ 21 952 evals, misses | npdollars only |
| Pythag c = 65 (all triples) | diophantine | ✓ 135 ms, returns 4 triples | ✗ finds 1 near, not exact | npdollars only |
| modsqrt x2≡16 (mod 101) | modular | ✓ 101 ms, returns both roots | ✓ a=2.5 at 0.4 ms | AUMA (toy domain) |
| Diophantine 3x + 5y = 41 | diophantine | ✓ 40 ms | ✓ a=2.5 at 1.2 ms | AUMA (toy domain) |
| Ramanujan 1729 = x3+y3 | sum-of-cubes | ✓ 126 ms, exact | ✓ a=2.0 at 0.3 ms | AUMA (13×13 grid) |
| x2 ≈ 42 (continuous) | real | n/a (no floats) | ✓ a=2.0 | AUMA only (diff. problem class) |
| |z2+1| = 0 (complex) | continuous | n/a | ✓ a=2.0 at 2 ms | AUMA only |
Scoreboard on discrete problems: npdollars 4 exclusive wins, AUMA 3 toy-grid wins, 2 real-number problems are outside npdollars' semantics. Every factoring problem and every exhaustive-enumeration problem goes to npdollars alone. AUMA's wins are in domains of 102–103 integer grid points where exhaustive search is essentially free — AUMA is fast because the space is small, not because it's the right tool.
Three things only npdollars does:
AUMA's exclusive territory is continuous optimization. Smooth real objectives (polynomial roots, modular gradient, Riemann-style near-zeros) are where Fourier probe + SA shines and where bit-blasting hits its floor. The two projects are complementary: AUMA is the right tool when the domain is dense and the objective is smooth; npdollars is the right tool when the domain is discrete and the answer must be exact.
Local reducer draws on
LogicSpace
(Dana, 2026) for the DualTree / PolySAT primitives that power v6's
recursive worker. Tier routing and the
tokens_in → tokens_out observability are specific to v8.
Theoretical foundation:
Dana Theorem (2024, indicator → CNF in poly time),
AUMA (Dana, 2023).
Charles Dana · Monce SAS · npdollars.aws.monce.ai · April 2026
Co-Authored-By: Claude (Anthropic)