CSP: Universal Constraint Satisfaction via the P/NP/P Sandwich

Charles Dana & Claude — April 2026
csp.aws.monce.ai

Abstract. We present a universal CSP solver built on indexed constraint collections (ONES/ZEROS) with LIFO propagation, failed value probing, budgeted backtracking with Markov chain branch evaluation, and a stochastic walk adapted from the Logistics framework. The solver decomposes the solving process into three zones — P-Left (polynomial deduction), NP-Strip (budgeted search + walk), P-Right (structural shortcuts) — following the P/NP/P sandwich architecture. Zero dependencies. Pure Python.

1. Framework

A CSP instance is a tuple (X, Φ, ONES, ZEROS) where:

Component	Definition
`X = {x₁, ..., x_n}`	Variables
`Φ = (Φ₁, ..., Φ_n)`	Domains — candidate value sets, mutable during solving
`ONES[(i,v)]`	Forbidden partners: if x_i=v then x_j≠w
`ZEROS[(i,v)]`	Compatible partners: x_i=v and x_j=w can coexist

The conflict predicate C(i,v,j,w)→{0,1} is called once to build the ONES/ZEROS indices, then discarded. All solving uses indexed lookups — no black-box queries.

1.1 Why indexed collections, not a predicate

Constraints are data, not queries. The ONES index can be walked, filtered, and updated. Propagation becomes set arithmetic on the index. Learning is native — new forbidden pairs are added directly to ONES. The vulnerability ratio |ONES[(i,v)]| / (|ZEROS[(i,v)]| + 1) gives a per-value fragility score for variable/value ordering.

2. The P/NP/P Sandwich

The solver decomposes into three zones, following the architecture established in the Sudoku and SAT papers:

Zone	Tier	Technique	Type
P-Left	0	Singleton propagation (LIFO BCP)	Polynomial
	1	Hidden singles (value unique in constraint group)	Polynomial
	2	Naked subsets (k vars sharing k values → lock)	Polynomial
	3	Failed value probing (try → propagate → eliminate)	Polynomial
P-Right	4	Symmetry breaking (identical vars → fix one)	Polynomial
NP-Strip	5	cdcl(a) — budgeted backtracking + chain eval	O(n^a)
NP-Strip	6	Markov chain walk (from Logistics framework)	Stochastic

P-Left techniques run in a fixpoint loop — when any technique makes progress, the loop restarts from Tier 0. This maximizes deductive reach before entering the NP-Strip.

3. LIFO Propagation (Tier 0)

When a domain becomes singleton {v}, all forbidden partners from ONES[(i,v)] are removed from neighbor domains. Removals that create new singletons are processed depth-first (LIFO stack), catching contradictions early and maximizing cascade depth.

propagate():
    while stack not empty:
        (i, v) = stack.pop()           # LIFO
        if v ∉ Φ_i or |Φ_i| ≠ 1: continue
        for (j, w) in ONES[(i, v)]:
            if w ∈ Φ_j:
                Φ_j.remove(w)
                trail.append((j, w))
                if Φ_j = ∅: return CONTRADICTION
                if |Φ_j| = 1: stack.push((j, sole(Φ_j)))
    return OK

4. Failed Value Probing (Tier 3)

For each variable i with |Φ_i| > 1, for each value v ∈ Φ_i: temporarily assign x_i=v, propagate. If contradiction: v is dead — permanently remove from Φ_i. Uses snapshot/restore for clean undo.

This is BCP-1 generalized from SAT to arbitrary CSP domains.

5. Markov Chain Branch Evaluation (Tier 5)

Before committing to a decision in cdcl, a short Markov chain run from the partial assignment reveals the effective difficulty of completing it:

evaluate_branch(i, v):
    assignment = partial ∪ {x_i = v} ∪ random_fill(undecided)
    for step in 1..K:
        j = random(conflicted_variables(assignment))
        assignment[j] = argmin_{w ∈ Φ_j} conflicts(j, w)
    return total_conflicts(assignment)  # residual = branch quality

Residual = 0: branch leads to a solution. Low residual: easy subproblem. High residual: likely dead end. This is a Monte Carlo estimate of the remaining search depth, information that propagation alone cannot provide.

6. Markov Chain Walk (Tier 6)

Adapted directly from the Logistics framework (Dana, 2023). The walk operates on the reduced domains from P-Left:

1. Greedy init: assign each variable to min-conflict value
2. Repeat:
   a. Pick random conflicted variable (from incremental counter)
   b. Sample K candidates from Φ_i, pick min-conflicts
   c. Accept: improvement (always), lateral (always), worse (SA probability)
   d. Adaptive restart when stale

Conflict tracking is incremental using the ONES index: O(degree) per update instead of O(n²). The walk's convergence bound from the Logistics paper: O(n^a+1 M² ln M).

7. Connection to the Dana Theorem

The Dana Theorem for CSP (2024, extended 2026) states: any indicator function over a finite dataset can be encoded as a CSP in polynomial time. Snake implements this construction; this solver operates on the resulting instances.

Data ⟶ [Dana Theorem / Snake] ⟶ CSP ⟶ [Solver / Sandwich] ⟶ Assignment

The ONES/ZEROS collections are the CSP analogue of AUMA's weighted clause set W. The conflict predicate is the analogue of clause evaluation. The Markov chain walk is the same framework that originated in the Logistics bin packing paper.

8. Experimental Results

Problem	Scale	Technique	Time	Status
Graph coloring	Petersen (χ=3)	cdcl	<1ms	SAT
Graph coloring	Petersen, 2 colors	probing	<1ms	UNSAT (proven)
Graph coloring	G(100, 0.1), 5 colors	cdcl	5ms	SAT
Sudoku	Easy (30+ clues)	propagation	<1ms	SAT
Sudoku	Hard (17+ clues)	cdcl	4ms	SAT
N-Queens	n=50	cdcl	1.6s	SAT
N-Queens	n=100	walk	4.0s	SAT

8.1 Technique attribution

The sandwich correctly routes problems to the right tier:

Easy Sudoku → P-Left Tier 0 (propagation alone solves it)
Hard Sudoku → P-Left eliminates 8 values via probing, cdcl finishes in 4ms
Petersen 2-color → P-Left Tier 3 proves UNSAT (probing finds all values dead)
100-Queens → NP-Strip walk (search space too large for cdcl, walk converges in 162 restarts)

9. CSP-Classification Equivalence

Any CSP is a boolean classification problem on the Cartesian product of its domains. Any boolean classification problem is a CSP. They are the same object.

9.1 CSP → Classification

Let a CSP have variables X₁, ..., X_m with finite domains D₁, ..., D_m and constraints C. Define:

Ω = D₁ × D₂ × ... × D_m

Every ω ∈ Ω is a complete assignment. The constraint predicate defines an indicator:

f(ω) = 1 if ω satisfies all constraints (licit), 0 otherwise

The CSP IS a boolean classification on Ω. Licit = positive class. Illicit = negative class. Solving = finding a positive.

9.2 Classification → CSP (Dana Theorem)

Any indicator function over a finite dataset can be encoded as a CSP in polynomial time (Dana Theorem for CSP, 2024/2026). The boolean tests on features become constraints. Snake implements this construction: oppose() finds separating tests, the bucket chain builds constraints, the 30 literal types provide the test library.

9.3 The equivalence

CSP

Classification

f: Ω → {0,1}
licit or not

↔

Dana Theorem
indicator → CSP in poly time

Boolean classification
on the Cartesian product

≡

Constraint satisfaction
on the feature space

The difference is operational:

Classification asks: given an element, is it positive or negative? (evaluation — polynomial)
CSP asks: find me a positive element. (search — NP-hard)

Same function, different question. Every tool from one domain applies to the other:

Classification tool	CSP application
Snake lookalike probability	Value ordering — "candidates like this tend to be licit"
Oppose profiles	Constraint-aware test selection — which boolean tests generalize
Core/noise divergence	Structural vs base-rate signal for candidate quality
Audit trail	Explanation of why a partial assignment is likely licit or illicit

CSP tool	Classification application
Arc consistency	Feature domain reduction — eliminate provably irrelevant values
Constraint propagation	Deductive feature pruning — if feature_k=v then feature_j≠w
Failed value probing	Feature testing — "if I assume this literal, contradiction?"
Backtracking	Decision tree with rollback — explore classification hypotheses

10. Research Leads

10.1 Markov chain mixing time as difficulty certificate

The walk's convergence rate on a partial assignment is related to the mixing time of the chain restricted to compatible solutions. Fast mixing = many solutions = easy subproblem. The /landscape endpoint exposes this: the trajectory_shape field classifies instances as exponential_decay (easy), linear_decay (moderate), plateau (hard), or oscillation (chaotic).

Open question: Can we prove that mixing time < poly(n) implies the instance belongs to a tractable CSP class? The Jerrum-Sinclair framework covers matchings and colorings; a general CSP version relating ONES index density to mixing time would be new.

10.2 Vulnerability ratio as ordering heuristic

The ratio |ONES[(i,v)]| / (|ZEROS[(i,v)]| + 1) is a value ordering heuristic derived from the indexed collection structure. We compared it against LCV (least constraining value), most-constraining (fail-fast), and random on 67 instances across coloring, N-Queens, Sudoku, and SAT.

Heuristic	Wins	Win%	Strength
vulnerability	32	57%	Structured asymmetric problems (Sudoku, SAT)
most_constraining	17	30%	Dense coloring (fail-fast pruning)
random	6	11%	N-Queens (symmetric — ordering doesn't matter)
lcv	1	2%	Rarely best, often tied with vulnerability

Key finding: vulnerability is not universal. On Sudoku (hard): 20 nodes vs LCV's 29. On N-Queens (n=16): 1246 nodes vs LCV's 84. The ONES density ratio captures asymmetry in the constraint structure — when constraints affect values unequally, vulnerability exploits the difference. When constraints are symmetric (N-Queens: all column/diagonal conflicts look alike), vulnerability degenerates to uniform scoring and wastes nodes.

The deeper result: the divergence between vulnerability and LCV is itself informative. When vulnerability ≈ LCV, the problem has symmetric ONES density — use fail-fast or random. When they diverge, asymmetric structure exists — vulnerability should be preferred. The fingerprint (section 10.7) could detect which regime applies and select the heuristic adaptively.

10.3 PAC-learning transfer via the equivalence

The CSP-Classification equivalence (Section 9) implies that PAC-learning results transfer. If a CSP's constraint structure has low VC dimension (as a classification problem on Ω), then a polynomial number of random samples suffice to learn the decision boundary. This means: random walk trajectories can efficiently learn which regions of Ω contain solutions. The connection between VC dimension of the ONES structure and constraint propagation depth is unexplored.

10.4 P-Left as a standalone preprocessor

The /probe endpoint runs P-Left (propagation + hidden singles + naked subsets + probing) and returns tightened domains. This preprocessor is solver-agnostic — it could feed into Gecode, OR-Tools, or any CSP backend. Measuring domain reduction across MiniZinc benchmark instances would quantify the preprocessor's standalone value.

10.5 Snake-guided value ordering

Train a Snake model on the walk's trajectories during solving: features = local constraint structure around each decision, label = did the branch succeed. The model learns which variable-value combinations tend to appear in low-conflict states for this specific instance. The crossover point where the learned model outperforms min-conflicts — and whether that crossover is polynomial in n — is the key empirical question.

10.6 Fourier probing on CSP domains

AUMA's order-1 Fourier probe (2n+1 evaluations) identifies which bits "want" to be 1 in the optimal solution. The CSP analogue: for each (variable, value) pair, the compatibility score measures what fraction of each neighbor's domain is conflict-free. This is the order-1 CSP coefficient. Can we define order-2 (pairwise value interactions)? If so, the full AUMA pipeline applies natively to CSP without binarization.

10.7 Conflict trajectory as problem fingerprint

The /landscape endpoint records conflict count over time across multiple walks. The shape of this trajectory is a fingerprint of the problem's structure.

Experiment (75 instances: coloring, queens, sudoku, SAT). Run 5 short walks (200 steps each, ~3ms total), classify the trajectory shape, then solve fully and record which tier succeeded.

Shape	Meaning	Best tier	Instances
trivial	3+ walks solve in 200 steps	cdcl	52 (69%)
exponential_decay	Conflicts drop fast	propagation	1
linear_decay	Steady improvement, no convergence	walk (more budget)	10
plateau	No improvement	walk or cdcl	12

Results:

Problem type	Instances	Routing accuracy	Notes
Graph coloring	48	43/48 (90%)	Trivial detection perfect for sparse/high-k
SAT (3-SAT)	20	20/20 (100%)	All trivial with clause watching
N-Queens	4	2/4	Small n trivial, large n plateau
Sudoku	3	1/3	Mixed shapes across difficulties
Overall	75	66/75 (88.0%)	3.3ms overhead (0.9% of solve time)

The fingerprint fast path is integrated into the solver: 3 micro-walks (100 steps, ~2ms) detect trivial instances. If 2+ solve, skip P-Left fixpoint and go straight to cdcl. Catches 69% of instances correctly with sub-1% overhead.

10.8 ONES-native nogood learning

When cdcl backtracks because all values for a variable fail, the current partial assignment is a nogood. We extract the decided (variable, value) pairs and add pairwise ONES entries. This is lighter than traditional CSP nogood learning (which stores full tuples) and integrates directly with the ONES index — no separate data structure. The question: what's the right granularity? SAT learns 1-UIP clauses. The CSP analogue of 1-UIP on the ONES index is an open formalization problem.

11. Lineage

AUMA (2023)         — {f: {0,1}ⁿ → R} ⊂ W, everything is weighted MAXSAT
Dana Theorem (2024) — indicator → CNF in poly time
Snake               — SAT classifier, 30 literal types, oppose profiles
PolySAT             — 16 polynomial techniques on SAT
Sudoku sandwich     — P/NP/P decomposition for Sudoku
Logistics           — Markov chain CSP, superposition + random walk
CSP                 — universal solver, indexed ONES/ZEROS, P/NP/P sandwich

11. API Reference

Method	Endpoint	Description
POST	`/csp`	Universal CSP (variables + domains + ONES constraints)
POST	`/color`	Graph coloring (adjacency + n_colors)
POST	`/nqueens`	N-Queens (board size n)
POST	`/sudoku`	9×9 Sudoku (81-int grid, 0=empty)
POST	`/sat`	Boolean satisfiability (DIMACS clauses)
POST	`/schedule`	Task scheduling (conflicts + time slots)
POST	`/analyze`	CSP characterization without solving (difficulty, vulnerability)
POST	`/probe`	P-Left only — tighten domains, no search
POST	`/landscape`	Markov chain conflict trajectory + fingerprint
GET	`/health`	Health check
GET	`/docs`	OpenAPI interactive documentation