Grantor-Specific Rule Taxonomies

A deterministic, sandboxed rule-evaluation layer for nonprofit grant automation: version-controlled grantor taxonomies, frozen-context execution, immutable audit verdicts, and explicit handoffs to federal and state compliance pipelines.

This guide is part of the Core Architecture & Compliance Mapping framework and defines the discrete evaluation layer that turns funder-imposed conditions into deterministic, auditable verdicts. The scope is deliberately narrow: it specifies how grantor rule definitions are loaded, validated, compiled into a frozen execution graph, and evaluated against a normalized payload to emit a structured pass/fail verdict with a cryptographic audit trail. It is written for nonprofit operations leads, grant program managers, Python automation developers, and compliance officers who need a stateless, reproducible contract for funder adjudication.

This module does not parse raw documents, convert currencies, reconcile ledgers, or generate regulatory filings. Structural normalization is owned upstream by Field Mapping & Normalization; the metadata envelope and provenance fields are sealed by Compliance Metadata Standards; and the actual federal and state artifacts are produced downstream by IRS 990 Data Schema Mapping and State Charity Registration Compliance. This layer consumes a pre-validated canonical payload and emits a verdict. Everything in scope is the adjudication of grantor conditions, never the financial payload itself and never the filing it later informs.

Architectural Positioning & Stage Boundaries

The rule taxonomy engine accepts only reconciled, schema-validated payloads in which monetary values are already converted to base currency units and dates are standardized to UTC. It performs no raw parsing, no I/O, and no report generation. Outputs are strictly compliance verdicts, triggered-rule identifiers, exception severity, and structured audit metadata. Blending ingestion, reconciliation, evaluation, or reporting logic into a single execution context violates pipeline isolation and introduces non-deterministic state drift, so each stage maintains explicit interface contracts, versioned schemas, and independent error boundaries.

Boundary enforcement is non-negotiable:

Boundary	Contract	Guarantee
Upstream	Payloads arrive pre-normalized; monetary values in base units, dates in ISO-8601 UTC	No coercion or parsing happens inside the engine
Internal	Conditions evaluate against a frozen mapping; no network, filesystem, or mutable global state	Identical inputs produce identical verdicts on any worker
Downstream	Engine emits a deterministic verdict: status, triggered `rule_id`s, severity, and a signed audit trail	Downstream filings reconstruct state without re-execution

Prerequisites

The reference implementation targets a reproducible, pinned environment. Floating dependency ranges are prohibited because validation and sandboxing semantics — especially Pydantic strict mode and ast node membership — change across versions and would silently alter rejection behaviour.

Python: 3.11 or later (required for datetime.fromisoformat offset parsing and strict ast constant handling).
Pinned packages:

bash

pip install "pydantic==2.6.4" "ruamel.yaml==0.18.6" "structlog==24.1.0"

Environment variables:

Variable	Purpose	Example
`TAXONOMY_REGISTRY_URL`	Version-controlled source of grantor rule definitions	`https://registry.internal/grantor-rules`
`TAXONOMY_SCHEMA_VERSION`	Pins the active rule-contract version	`1.0.0`
`AUDIT_LOG_SINK`	Append-only destination for signed verdict records	`s3://audit-ledger/verdicts/`
`DECIMAL_PRECISION`	Global `decimal` context precision for threshold math	`28`

Upstream stage dependencies: payloads must already be type-normalized and the metadata envelope sealed. Monetary fields and line-item identifiers must be canonical against the IRS 990 Data Schema Mapping contract before any rule references them, and provenance hashes must already be attached per Compliance Metadata Standards.

Core Implementation

The engine separates three concerns: a Pydantic schema that rejects malformed grantor rules at the boundary, a sandboxed AST evaluator that forbids arbitrary execution, and a deterministic execution graph that emits an immutable audit record per rule. All monetary comparisons use decimal.Decimal; floats are never permitted in compliance thresholds because IEEE 754 rounding would make a verdict irreproducible.

python

import ast
import uuid
import hashlib
import datetime
from decimal import Decimal, InvalidOperation
from typing import Any

import structlog
from pydantic import BaseModel, Field, field_validator

logger = structlog.get_logger(__name__)


class TaxonomyError(Exception):
    """Structured, machine-readable failure carrying a stable error code."""

    def __init__(self, code: str, detail: str) -> None:
        super().__init__(f"{code}: {detail}")
        self.code = code
        self.detail = detail


# --- Schema: rejects malformed grantor rules at the boundary ---
class GrantorRule(BaseModel):
    rule_id: str = Field(pattern=r"^[A-Z0-9_]{4,32}$")
    grantor_code: str = Field(min_length=3, max_length=12)
    condition_expression: str
    compliance_threshold: Decimal
    severity: str = Field(pattern=r"^(BLOCKING|WARNING|INFO)$")
    effective_date: datetime.datetime
    expiration_date: datetime.datetime
    priority: int = Field(ge=1, le=100)

    @field_validator("effective_date", "expiration_date", mode="before")
    @classmethod
    def enforce_utc(cls, v: Any) -> datetime.datetime:
        if isinstance(v, str):
            dt = datetime.datetime.fromisoformat(v.replace("Z", "+00:00"))
            if dt.tzinfo is None or dt.utcoffset() != datetime.timedelta(0):
                raise ValueError("dates must be ISO 8601 UTC")
            return dt
        return v

    @field_validator("compliance_threshold", mode="before")
    @classmethod
    def coerce_decimal(cls, v: Any) -> Decimal:
        try:
            return Decimal(str(v))
        except InvalidOperation as exc:
            raise ValueError("threshold must be a valid numeric string") from exc


# --- Sandbox: only comparison/arithmetic nodes survive the walk ---
class SandboxedEvaluator:
    ALLOWED_NODES = {
        ast.Expression, ast.BoolOp, ast.BinOp, ast.UnaryOp, ast.Compare,
        ast.Constant, ast.Name, ast.Load, ast.And, ast.Or,
        ast.Add, ast.Sub, ast.Mult, ast.Div,
        ast.Lt, ast.LtE, ast.Gt, ast.GtE, ast.Eq, ast.NotEq,
    }

    @classmethod
    def validate_ast(cls, source: str) -> ast.Expression:
        try:
            tree = ast.parse(source, mode="eval")
        except SyntaxError as exc:
            raise TaxonomyError("ERR_TAXONOMY_SCHEMA_INVALID", str(exc)) from exc
        for node in ast.walk(tree):
            if type(node) not in cls.ALLOWED_NODES:
                raise TaxonomyError(
                    "ERR_UNSAFE_EXPRESSION",
                    f"forbidden node {type(node).__name__}",
                )
        return tree

    @classmethod
    def evaluate(cls, code: ast.Expression, context: dict[str, Decimal]) -> bool:
        compiled = compile(code, "<rule_expr>", "eval")
        return bool(eval(compiled, {"__builtins__": {}}, dict(context)))


class AuditRecord(BaseModel):
    rule_id: str
    expression: str
    triggered: bool
    severity: str
    evaluated_at: datetime.datetime
    context_hash: str  # SHA-256 of the frozen input context


# --- Execution graph: deterministic ordering + immutable verdict ---
class TaxonomyEngine:
    def __init__(self, rules: list[GrantorRule]) -> None:
        self._rules = sorted(rules, key=lambda r: (-r.priority, r.effective_date, r.rule_id))
        self._compiled: list[tuple[GrantorRule, ast.Expression]] = []
        for rule in self._rules:
            self._compiled.append((rule, SandboxedEvaluator.validate_ast(rule.condition_expression)))

    @staticmethod
    def _hash_context(payload: dict[str, Decimal]) -> str:
        canonical = ";".join(f"{k}={payload[k]}" for k in sorted(payload))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def evaluate(self, payload: dict[str, Decimal], evaluation_id: str | None = None) -> dict[str, Any]:
        eval_id = evaluation_id or str(uuid.uuid4())
        now = datetime.datetime.now(datetime.timezone.utc)
        context_hash = self._hash_context(payload)
        triggered: list[str] = []
        audit_trail: list[dict[str, Any]] = []
        blocking = False

        for rule, tree in self._compiled:
            if not (rule.effective_date <= now < rule.expiration_date):
                continue
            try:
                fired = SandboxedEvaluator.evaluate(tree, payload)
            except Exception as exc:  # noqa: BLE001 - re-raised as structured error
                logger.error("rule_eval_failure", rule_id=rule.rule_id, error=str(exc))
                raise TaxonomyError("ERR_EVALUATION_FAILURE", f"{rule.rule_id}: {exc}") from exc
            if fired:
                triggered.append(rule.rule_id)
                blocking = blocking or rule.severity == "BLOCKING"
            audit_trail.append(
                AuditRecord(
                    rule_id=rule.rule_id,
                    expression=rule.condition_expression,
                    triggered=fired,
                    severity=rule.severity,
                    evaluated_at=now,
                    context_hash=context_hash,
                ).model_dump()
            )

        verdict = "PASS" if not triggered else ("FAIL" if blocking else "PARTIAL")
        logger.info("verdict_emitted", evaluation_id=eval_id, verdict=verdict, triggered=triggered)
        return {
            "evaluation_id": eval_id,
            "grantor_code": self._rules[0].grantor_code if self._rules else "UNKNOWN",
            "verdict": verdict,
            "triggered_rules": triggered,
            "context_hash": context_hash,
            "audit_trail": audit_trail,
            "taxonomy_version": "v1.0.0",
        }

A BLOCKING trigger yields FAIL; a non-blocking trigger yields PARTIAL; an empty trigger set yields PASS. Severity is a first-class field rather than an inferred property so that the same condition can be advisory for one grantor and disqualifying for another without forking the expression.

Field Mapping & Schema Contract

Grantor rule definitions arrive from the registry under varying key names. They must resolve to the canonical contract below before compilation; unresolved or duplicated aliases are rejected at the validation gate rather than silently coerced.

Canonical field	Accepted aliases	Type / coercion
`rule_id`	`id`, `rule`, `code`	`str`, regex `^[A-Z0-9_]{4,32}$`
`grantor_code`	`funder`, `grantor`, `source_code`	`str`, 3–12 chars
`condition_expression`	`condition`, `expr`, `predicate`	`str`, parsed to a sandboxed AST
`compliance_threshold`	`threshold`, `limit`, `cap`	`decimal.Decimal` (never `float`)
`severity`	`level`, `enforcement`	enum `BLOCKING` / `WARNING` / `INFO`
`effective_date`	`start`, `valid_from`	ISO-8601 UTC `datetime`
`expiration_date`	`end`, `valid_to`, `sunset`	ISO-8601 UTC `datetime`
`priority`	`weight`, `rank`, `order`	`int`, 1–100

Two source keys resolving to one canonical field raise ERR_ALIAS_COLLISION; the offending alias must be dropped at the registry, not picked arbitrarily. Field semantics that reference monetary line items must already match the IRS 990 Data Schema Mapping contract so a threshold compares against the same canonical amount the filing will later report.

Validation & Testing

Determinism is the property under test: the same payload against the same taxonomy version must always yield the same verdict and the same context_hash. The suite asserts known pass/fail payloads, sandbox rejection of hostile expressions, and audit-trail integrity.

python

import datetime
from decimal import Decimal

import pytest

from taxonomy_engine import GrantorRule, TaxonomyEngine, TaxonomyError

UTC = datetime.timezone.utc


def _rule(expr: str, severity: str = "BLOCKING", priority: int = 50) -> GrantorRule:
    return GrantorRule(
        rule_id="MAX_INDIRECT_RATE",
        grantor_code="NSF01",
        condition_expression=expr,
        compliance_threshold=Decimal("0.10"),
        severity=severity,
        effective_date=datetime.datetime(2025, 1, 1, tzinfo=UTC),
        expiration_date=datetime.datetime(2030, 1, 1, tzinfo=UTC),
        priority=priority,
    )


def test_blocking_overage_yields_fail() -> None:
    engine = TaxonomyEngine([_rule("indirect_rate > 0.10")])
    result = engine.evaluate({"indirect_rate": Decimal("0.15")})
    assert result["verdict"] == "FAIL"
    assert result["triggered_rules"] == ["MAX_INDIRECT_RATE"]


def test_warning_severity_yields_partial() -> None:
    engine = TaxonomyEngine([_rule("indirect_rate > 0.10", severity="WARNING")])
    result = engine.evaluate({"indirect_rate": Decimal("0.15")})
    assert result["verdict"] == "PARTIAL"


def test_compliant_payload_yields_pass() -> None:
    engine = TaxonomyEngine([_rule("indirect_rate > 0.10")])
    assert engine.evaluate({"indirect_rate": Decimal("0.08")})["verdict"] == "PASS"


def test_identical_inputs_are_reproducible() -> None:
    engine = TaxonomyEngine([_rule("indirect_rate > 0.10")])
    payload = {"indirect_rate": Decimal("0.12")}
    assert engine.evaluate(payload)["context_hash"] == engine.evaluate(payload)["context_hash"]


def test_sandbox_rejects_function_calls() -> None:
    with pytest.raises(TaxonomyError) as exc:
        TaxonomyEngine([_rule("__import__('os').system('echo x')")])
    assert exc.value.code == "ERR_UNSAFE_EXPRESSION"

Each verdict carries a per-rule AuditRecord, so tests should also assert that len(result["audit_trail"]) equals the count of active rules and that every record shares the run’s context_hash — a divergent hash inside a single run signals context mutation and must fail the build.

Performance & Scale Considerations

Nonprofit adjudication is bursty — concentrated at grant-cycle deadlines — rather than continuously high-volume, so the engine optimizes for predictable memory and reproducibility over raw throughput.

Compile once, evaluate many: taxonomy compilation (AST validation) happens at construction. Reuse a single TaxonomyEngine across a batch instead of rebuilding it per payload; compilation, not evaluation, is the expensive step.
Batch sizing: evaluate in batches of 500–1,000 payloads. Verdict envelopes are small, so a 1,000-payload batch with full audit trails stays comfortably under a 64 MB working-set ceiling.
Concurrency: evaluation is CPU-bound and pure, so it parallelizes cleanly across processes. Because each call is stateless against a frozen context, sharding payloads across workers needs no locking. Producer fan-out is governed upstream by Async Batch Processing Pipelines.
Audit trail growth: the trail grows by one fixed-size record per active rule. Store a hash reference rather than inlining large payload diffs so record size stays bounded as taxonomies grow.

Failure Modes & Troubleshooting

Error category	Root cause	Remediation
`ERR_TAXONOMY_SCHEMA_INVALID`	Rule field missing, mistyped, or expression unparseable	Fix the definition at the registry; reject the whole batch, never partially load
`ERR_UNSAFE_EXPRESSION`	Expression contains a call, attribute, or import node	Rewrite the condition using only comparisons and arithmetic; the sandbox is intentionally narrow
`ERR_ALIAS_COLLISION`	Two source keys map to one canonical field	Pin a single accepted alias and drop the duplicate at the registry
`ERR_DATE_RANGE_OVERLAP`	Overlapping windows for identical `grantor_code` and `priority`	Resolve precedence at the registry; overlaps make ordering non-deterministic
`ERR_EVALUATION_FAILURE`	Payload missing a name the expression references	Validate payload completeness upstream; quarantine the batch for manual review

Failures are terminal at the batch level: a single invalid rule rejects, logs, and quarantines the entire taxonomy rather than evaluating a partial set. Transient dependency failures — a registry fetch timeout, for example — are never retried in place; they are handed to Pipeline Fallback & Retry Logic, which applies bounded backoff and preserves state so re-evaluation runs against the same frozen context and produces an idempotent verdict.

Compliance Mapping & Handoff Protocols

This layer produces verdicts; it does not file. Handoffs are explicitly bounded to preserve audit integrity:

Federal reporting handoff: a FAIL or PARTIAL verdict routes to the IRS 990 Data Schema Mapping pipeline. The audit trail supplies the exact rule expressions and threshold comparisons needed for Form 990 Part IX (Statement of Functional Expenses) allocation and Schedule O narrative justification.
State-level handoff: verdicts whose grantor_code maps to multi-state jurisdictions route to State Charity Registration Compliance. Deterministic ordering resolves conflicting state thresholds — California’s RRF-1 and New York’s CHAR500 registration gates among them — before any jurisdictional filing is generated.
Access boundaries: read access to the evaluation graph and write access to the taxonomy registry are enforced by Data Security & Access Boundaries; registry writes require cryptographic signing and merge gates.

Compliance Alignment

These taxonomies exist to satisfy concrete obligations, not abstract best practice:

2 CFR §200.302 (financial management): signed, hash-verified verdicts provide records that identify the application of federal award funds against funder conditions and support reconciliation.
2 CFR §200.303 (internal controls): deterministic, sandboxed evaluation is the documented internal control that enforces award terms uniformly and is reproducible under audit.
2 CFR §200.414 (indirect costs): threshold rules adjudicate negotiated indirect cost rates and the 10% de minimis rate, blocking submissions that exceed the rate a grantor will honor.
2 CFR §200.334 (record retention): verdict payloads and audit trails are retained a minimum of three years from final report submission; the context_hash lets reviewers reconstruct an evaluation without re-running it.
IRS Form 990: triggered-rule metadata maps to Part IX line items so derived disclosures remain reconstructable from sealed verdicts rather than recomputed from raw payloads.

Core Architecture & Compliance Mapping — parent framework and system topology
Compliance Metadata Standards — the sealed envelope and provenance fields every payload carries into this engine
IRS 990 Data Schema Mapping — canonical field semantics that thresholds compare against
State Charity Registration Compliance — jurisdictional filing layer that consumes routed verdicts
Pipeline Fallback & Retry Logic — deterministic recovery for transient registry and evaluation failures
Async Batch Processing Pipelines — concurrent producers that feed normalized payloads into this evaluation boundary

Architectural Positioning & Stage Boundaries #

Prerequisites #

Core Implementation #

Field Mapping & Schema Contract #

Validation & Testing #

Performance & Scale Considerations #

Failure Modes & Troubleshooting #

Compliance Mapping & Handoff Protocols #

Compliance Alignment #

Related #