CSP: Universal Constraint Satisfaction via the P/NP/P Sandwich

Charles Dana & Claude — April 2026
csp.aws.monce.ai
Abstract. We present a universal CSP solver built on indexed constraint collections (ONES/ZEROS) with LIFO propagation, failed value probing, budgeted backtracking with Markov chain branch evaluation, and a stochastic walk adapted from the Logistics framework. The solver decomposes the solving process into three zones — P-Left (polynomial deduction), NP-Strip (budgeted search + walk), P-Right (structural shortcuts) — following the P/NP/P sandwich architecture. Zero dependencies. Pure Python.

1. Framework

A CSP instance is a tuple (X, Φ, ONES, ZEROS) where:

ComponentDefinition
X = {x1, ..., xn}Variables
Φ = (Φ1, ..., Φn)Domains — candidate value sets, mutable during solving
ONES[(i,v)]Forbidden partners: if xi=v then xj≠w
ZEROS[(i,v)]Compatible partners: xi=v and xj=w can coexist

The conflict predicate C(i,v,j,w)→{0,1} is called once to build the ONES/ZEROS indices, then discarded. All solving uses indexed lookups — no black-box queries.

1.1 Why indexed collections, not a predicate

Constraints are data, not queries. The ONES index can be walked, filtered, and updated. Propagation becomes set arithmetic on the index. Learning is native — new forbidden pairs are added directly to ONES. The vulnerability ratio |ONES[(i,v)]| / (|ZEROS[(i,v)]| + 1) gives a per-value fragility score for variable/value ordering.

2. The P/NP/P Sandwich

The solver decomposes into three zones, following the architecture established in the Sudoku and SAT papers:

ZoneTierTechniqueType
P-Left0Singleton propagation (LIFO BCP)Polynomial
1Hidden singles (value unique in constraint group)Polynomial
2Naked subsets (k vars sharing k values → lock)Polynomial
3Failed value probing (try → propagate → eliminate)Polynomial
P-Right4Symmetry breaking (identical vars → fix one)Polynomial
NP-Strip5cdcl(a) — budgeted backtracking + chain evalO(na)
6Markov chain walk (from Logistics framework)Stochastic

P-Left techniques run in a fixpoint loop — when any technique makes progress, the loop restarts from Tier 0. This maximizes deductive reach before entering the NP-Strip.

3. LIFO Propagation (Tier 0)

When a domain becomes singleton {v}, all forbidden partners from ONES[(i,v)] are removed from neighbor domains. Removals that create new singletons are processed depth-first (LIFO stack), catching contradictions early and maximizing cascade depth.

propagate():
    while stack not empty:
        (i, v) = stack.pop()           # LIFO
        if v ∉ Φi or |Φi| ≠ 1: continue
        for (j, w) in ONES[(i, v)]:
            if w ∈ Φj:
                Φj.remove(w)
                trail.append((j, w))
                if Φj = ∅: return CONTRADICTION
                if |Φj| = 1: stack.push((j, sole(Φj)))
    return OK

4. Failed Value Probing (Tier 3)

For each variable i with |Φi| > 1, for each value v ∈ Φi: temporarily assign xi=v, propagate. If contradiction: v is dead — permanently remove from Φi. Uses snapshot/restore for clean undo.

This is BCP-1 generalized from SAT to arbitrary CSP domains.

5. Markov Chain Branch Evaluation (Tier 5)

Before committing to a decision in cdcl, a short Markov chain run from the partial assignment reveals the effective difficulty of completing it:

evaluate_branch(i, v):
    assignment = partial ∪ {xi = v} ∪ random_fill(undecided)
    for step in 1..K:
        j = random(conflicted_variables(assignment))
        assignment[j] = argminw ∈ Φj conflicts(j, w)
    return total_conflicts(assignment)  # residual = branch quality

Residual = 0: branch leads to a solution. Low residual: easy subproblem. High residual: likely dead end. This is a Monte Carlo estimate of the remaining search depth, information that propagation alone cannot provide.

6. Markov Chain Walk (Tier 6)

Adapted directly from the Logistics framework (Dana, 2023). The walk operates on the reduced domains from P-Left:

1. Greedy init: assign each variable to min-conflict value
2. Repeat:
   a. Pick random conflicted variable (from incremental counter)
   b. Sample K candidates from Φi, pick min-conflicts
   c. Accept: improvement (always), lateral (always), worse (SA probability)
   d. Adaptive restart when stale

Conflict tracking is incremental using the ONES index: O(degree) per update instead of O(n²). The walk's convergence bound from the Logistics paper: O(na+1 M² ln M).

7. Connection to the Dana Theorem

The Dana Theorem for CSP (2024, extended 2026) states: any indicator function over a finite dataset can be encoded as a CSP in polynomial time. Snake implements this construction; this solver operates on the resulting instances.

Data ⟶ [Dana Theorem / Snake] ⟶ CSP ⟶ [Solver / Sandwich] ⟶ Assignment

The ONES/ZEROS collections are the CSP analogue of AUMA's weighted clause set W. The conflict predicate is the analogue of clause evaluation. The Markov chain walk is the same framework that originated in the Logistics bin packing paper.

8. Experimental Results

ProblemScaleTechniqueTimeStatus
Graph coloringPetersen (χ=3)cdcl<1msSAT
Graph coloringPetersen, 2 colorsprobing<1msUNSAT (proven)
Graph coloringG(100, 0.1), 5 colorscdcl5msSAT
SudokuEasy (30+ clues)propagation<1msSAT
SudokuHard (17+ clues)cdcl4msSAT
N-Queensn=50cdcl1.6sSAT
N-Queensn=100walk4.0sSAT

8.1 Technique attribution

The sandwich correctly routes problems to the right tier:

9. CSP-Classification Equivalence

Any CSP is a boolean classification problem on the Cartesian product of its domains. Any boolean classification problem is a CSP. They are the same object.

9.1 CSP → Classification

Let a CSP have variables X1, ..., Xm with finite domains D1, ..., Dm and constraints C. Define:

Ω = D1 × D2 × ... × Dm

Every ω ∈ Ω is a complete assignment. The constraint predicate defines an indicator:

f(ω) = 1 if ω satisfies all constraints (licit), 0 otherwise

The CSP IS a boolean classification on Ω. Licit = positive class. Illicit = negative class. Solving = finding a positive.

9.2 Classification → CSP (Dana Theorem)

Any indicator function over a finite dataset can be encoded as a CSP in polynomial time (Dana Theorem for CSP, 2024/2026). The boolean tests on features become constraints. Snake implements this construction: oppose() finds separating tests, the bucket chain builds constraints, the 30 literal types provide the test library.

9.3 The equivalence

CSP
Classification
f: Ω → {0,1}
licit or not
Dana Theorem
indicator → CSP in poly time
Boolean classification
on the Cartesian product
Constraint satisfaction
on the feature space

The difference is operational:

Same function, different question. Every tool from one domain applies to the other:

Classification toolCSP application
Snake lookalike probabilityValue ordering — "candidates like this tend to be licit"
Oppose profilesConstraint-aware test selection — which boolean tests generalize
Core/noise divergenceStructural vs base-rate signal for candidate quality
Audit trailExplanation of why a partial assignment is likely licit or illicit
CSP toolClassification application
Arc consistencyFeature domain reduction — eliminate provably irrelevant values
Constraint propagationDeductive feature pruning — if feature_k=v then feature_j≠w
Failed value probingFeature testing — "if I assume this literal, contradiction?"
BacktrackingDecision tree with rollback — explore classification hypotheses

10. Research Leads

10.1 Markov chain mixing time as difficulty certificate

The walk's convergence rate on a partial assignment is related to the mixing time of the chain restricted to compatible solutions. Fast mixing = many solutions = easy subproblem. The /landscape endpoint exposes this: the trajectory_shape field classifies instances as exponential_decay (easy), linear_decay (moderate), plateau (hard), or oscillation (chaotic).

Open question: Can we prove that mixing time < poly(n) implies the instance belongs to a tractable CSP class? The Jerrum-Sinclair framework covers matchings and colorings; a general CSP version relating ONES index density to mixing time would be new.

10.2 Vulnerability ratio as ordering heuristic

The ratio |ONES[(i,v)]| / (|ZEROS[(i,v)]| + 1) is a value ordering heuristic derived from the indexed collection structure. We compared it against LCV (least constraining value), most-constraining (fail-fast), and random on 67 instances across coloring, N-Queens, Sudoku, and SAT.

HeuristicWinsWin%Strength
vulnerability3257%Structured asymmetric problems (Sudoku, SAT)
most_constraining1730%Dense coloring (fail-fast pruning)
random611%N-Queens (symmetric — ordering doesn't matter)
lcv12%Rarely best, often tied with vulnerability

Key finding: vulnerability is not universal. On Sudoku (hard): 20 nodes vs LCV's 29. On N-Queens (n=16): 1246 nodes vs LCV's 84. The ONES density ratio captures asymmetry in the constraint structure — when constraints affect values unequally, vulnerability exploits the difference. When constraints are symmetric (N-Queens: all column/diagonal conflicts look alike), vulnerability degenerates to uniform scoring and wastes nodes.

The deeper result: the divergence between vulnerability and LCV is itself informative. When vulnerability ≈ LCV, the problem has symmetric ONES density — use fail-fast or random. When they diverge, asymmetric structure exists — vulnerability should be preferred. The fingerprint (section 10.7) could detect which regime applies and select the heuristic adaptively.

10.3 PAC-learning transfer via the equivalence

The CSP-Classification equivalence (Section 9) implies that PAC-learning results transfer. If a CSP's constraint structure has low VC dimension (as a classification problem on Ω), then a polynomial number of random samples suffice to learn the decision boundary. This means: random walk trajectories can efficiently learn which regions of Ω contain solutions. The connection between VC dimension of the ONES structure and constraint propagation depth is unexplored.

10.4 P-Left as a standalone preprocessor

The /probe endpoint runs P-Left (propagation + hidden singles + naked subsets + probing) and returns tightened domains. This preprocessor is solver-agnostic — it could feed into Gecode, OR-Tools, or any CSP backend. Measuring domain reduction across MiniZinc benchmark instances would quantify the preprocessor's standalone value.

10.5 Snake-guided value ordering

Train a Snake model on the walk's trajectories during solving: features = local constraint structure around each decision, label = did the branch succeed. The model learns which variable-value combinations tend to appear in low-conflict states for this specific instance. The crossover point where the learned model outperforms min-conflicts — and whether that crossover is polynomial in n — is the key empirical question.

10.6 Fourier probing on CSP domains

AUMA's order-1 Fourier probe (2n+1 evaluations) identifies which bits "want" to be 1 in the optimal solution. The CSP analogue: for each (variable, value) pair, the compatibility score measures what fraction of each neighbor's domain is conflict-free. This is the order-1 CSP coefficient. Can we define order-2 (pairwise value interactions)? If so, the full AUMA pipeline applies natively to CSP without binarization.

10.7 Conflict trajectory as problem fingerprint

The /landscape endpoint records conflict count over time across multiple walks. The shape of this trajectory is a fingerprint of the problem's structure.

Experiment (75 instances: coloring, queens, sudoku, SAT). Run 5 short walks (200 steps each, ~3ms total), classify the trajectory shape, then solve fully and record which tier succeeded.

ShapeMeaningBest tierInstances
trivial3+ walks solve in 200 stepscdcl52 (69%)
exponential_decayConflicts drop fastpropagation1
linear_decaySteady improvement, no convergencewalk (more budget)10
plateauNo improvementwalk or cdcl12

Results:

Problem typeInstancesRouting accuracyNotes
Graph coloring4843/48 (90%)Trivial detection perfect for sparse/high-k
SAT (3-SAT)2020/20 (100%)All trivial with clause watching
N-Queens42/4Small n trivial, large n plateau
Sudoku31/3Mixed shapes across difficulties
Overall7566/75 (88.0%)3.3ms overhead (0.9% of solve time)

The fingerprint fast path is integrated into the solver: 3 micro-walks (100 steps, ~2ms) detect trivial instances. If 2+ solve, skip P-Left fixpoint and go straight to cdcl. Catches 69% of instances correctly with sub-1% overhead.

10.8 ONES-native nogood learning

When cdcl backtracks because all values for a variable fail, the current partial assignment is a nogood. We extract the decided (variable, value) pairs and add pairwise ONES entries. This is lighter than traditional CSP nogood learning (which stores full tuples) and integrates directly with the ONES index — no separate data structure. The question: what's the right granularity? SAT learns 1-UIP clauses. The CSP analogue of 1-UIP on the ONES index is an open formalization problem.

11. Lineage

AUMA (2023)         — {f: {0,1}n → R} ⊂ W, everything is weighted MAXSAT
Dana Theorem (2024) — indicator → CNF in poly time
Snake               — SAT classifier, 30 literal types, oppose profiles
PolySAT             — 16 polynomial techniques on SAT
Sudoku sandwich     — P/NP/P decomposition for Sudoku
Logistics           — Markov chain CSP, superposition + random walk
CSP                 — universal solver, indexed ONES/ZEROS, P/NP/P sandwich

11. API Reference

MethodEndpointDescription
POST/cspUniversal CSP (variables + domains + ONES constraints)
POST/colorGraph coloring (adjacency + n_colors)
POST/nqueensN-Queens (board size n)
POST/sudoku9×9 Sudoku (81-int grid, 0=empty)
POST/satBoolean satisfiability (DIMACS clauses)
POST/scheduleTask scheduling (conflicts + time slots)
POST/analyzeCSP characterization without solving (difficulty, vulnerability)
POST/probeP-Left only — tighten domains, no search
POST/landscapeMarkov chain conflict trajectory + fingerprint
GET/healthHealth check
GET/docsOpenAPI interactive documentation