Validating JSON Schemas for Inventory Updates

You receive inventory-update payloads as raw JSON off a message queue or REST handoff, and you need a single, language-neutral contract that declares exactly what a valid record looks like — one you can publish to an external producer and enforce at runtime before any velocity math touches the data. This page implements that contract as a Draft 2020-12 JSON Schema and a precompiled Python validator that turns a malformed payload into a structured, routable rejection instead of a silent corruption. It is the raw-schema counterpart to the pydantic enforcer described in Schema Validation for Inventory Feeds, the parent guide within the Velocity Data Ingestion & WMS Sync Pipelines architecture. Reach for JSON Schema over pydantic when the contract has to be shared across languages, versioned as a document, or handed to an upstream team as the authoritative spec of the feed.

Prerequisites

Confirm each of these before wiring the validator into a live feed:

Python 3.10+ — the implementation uses X | None union syntax and list[...] generics.
jsonschema 4.18+ installed with the format extra — pip install "jsonschema[format]". The format extra pulls in rfc3339-validator, without which format: "date-time" is a no-op annotation, not an enforced check.
The canonical inventory record already defined. This schema validates the same six load-bearing fields the parent contract does — sku_id, location_code, on_hand_qty, reserved_qty, velocity_class, last_sync_ts — so the JSON Schema document and the pydantic model stay in lockstep and a field added to one is added to the other.
A durable dead-letter sink (a queue topic, table, or object prefix) that is separate from your transport-failure path, so a data reject and a pipe failure are diagnosed independently.
A version string for the contract, stamped into the schema $id and carried on every rejection, so a post-change reconciliation can tell a genuine bad row from one that only failed because the contract moved.

Configuration Block

The schema itself is the configuration. Externalize it to a versioned .json file — never inline it in code — so a producer can consume the exact same document and a contract change is a reviewable diff rather than a code deploy. Every tunable bound (the sku_id length, the location_code aisle pattern, the velocity_class enum) lives here.

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "urn:warehouse-slotting:inventory-update:2.3.0",
  "title": "InventoryUpdate",
  "type": "object",
  "additionalProperties": false,
  "required": ["sku_id", "location_code", "on_hand_qty", "last_sync_ts"],
  "properties": {
    "sku_id":        {"type": "string", "minLength": 3, "maxLength": 50, "pattern": "^[A-Z0-9\\-]+$"},
    "location_code": {"type": "string", "pattern": "^[A-D]-\\d{2}-\\d{2}$"},
    "on_hand_qty":   {"type": "integer", "minimum": 0},
    "reserved_qty":  {"type": "integer", "minimum": 0, "default": 0},
    "velocity_class":{"type": ["string", "null"], "enum": ["FAST", "MED", "SLOW", "DEAD", null]},
    "last_sync_ts":  {"type": "string", "format": "date-time"}
  }
}

additionalProperties: false is the single most important line: it turns an unannounced upstream field — the classic signature of a producer changing the contract without telling the consumer — into a loud rejection instead of a silently dropped column. "type": "integer" (not "number") rejects a fractional 12.4 quantity outright rather than letting it drift into scoring, and the velocity_class enum keeps an out-of-taxonomy label from ever reaching the tiering layer.
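
Before publishing the schema, it can help to exercise the two pattern bounds directly. The following is a minimal sketch using only Python's re module, assuming the patterns are copied verbatim from the schema above. The matches helper and the sample values are illustrative and not part of the contract. It uses search rather than fullmatch because JSON Schema does not anchor patterns implicitly.

```python
import re

# Patterns copied from the schema above. JSON Schema does not anchor
# patterns implicitly, so the explicit ^ and $ are doing real work.
SKU_PATTERN = re.compile(r"^[A-Z0-9\-]+$")
LOCATION_PATTERN = re.compile(r"^[A-D]-\d{2}-\d{2}$")


def matches(pattern: re.Pattern[str], value: str) -> bool:
    """Return True when value satisfies the pattern, using an unanchored search."""
    return pattern.search(value) is not None


# Illustrative sample values, not taken from a real feed.
samples = {
    "WIDGET-001": matches(SKU_PATTERN, "WIDGET-001"),  # uppercase SKU
    "widget-001": matches(SKU_PATTERN, "widget-001"),  # lowercase is out of contract
    "A-04-12": matches(LOCATION_PATTERN, "A-04-12"),   # aisle A is in scheme
    "Z-99-99": matches(LOCATION_PATTERN, "Z-99-99"),   # aisle Z is outside A-D
}
for value, ok in samples.items():
    print(f"{value}: {'accepted' if ok else 'rejected'}")
```

A check like this is only a quick sanity pass on the regexes; the schema document remains the source of truth for what the validator enforces.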

A handful of operational knobs sit around the schema rather than in it. Keep them in a small profile so a noisier facility can loosen an alert without touching the contract, shown here as YAML and its equivalent Python dict:

# inventory_update_validation.yaml — one profile per feed source
contract:
  schema_path: "schemas/inventory-update/2.3.0.json"
  enforce_formats: true        # activate the RFC 3339 date-time checker
quarantine:
  reject_ratio_alert: 0.05     # page when >5% of a batch is dead-lettered
  collect_all_errors: true     # report every violation, not just the first

INVENTORY_UPDATE_VALIDATION = {
    "contract": {
        "schema_path": "schemas/inventory-update/2.3.0.json",
        "enforce_formats": True,
    },
    "quarantine": {"reject_ratio_alert": 0.05, "collect_all_errors": True},
}
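
The reject_ratio_alert knob only matters if something compares it against a batch. One way that comparison might look is sketched below, assuming a strict "greater than" reading of the YAML comment. The should_page name and the choice not to alert on an empty batch are assumptions, not part of the profile.

```python
from typing import Any

# Mirrors the quarantine block of the profile above.
PROFILE: dict[str, Any] = {
    "quarantine": {"reject_ratio_alert": 0.05, "collect_all_errors": True},
}


def should_page(rejected: int, total: int, profile: dict[str, Any] = PROFILE) -> bool:
    """Return True when the dead-lettered share of a batch exceeds the threshold."""
    if total <= 0:
        return False  # assumption: an empty batch has no ratio to alert on
    threshold = profile["quarantine"]["reject_ratio_alert"]
    return rejected / total > threshold


print(should_page(6, 100))  # 6% of the batch was dead-lettered
print(should_page(5, 100))  # exactly at the threshold, not above it
```

Keeping the comparison in one small function means a noisier facility can change the threshold in its profile without anyone touching the alerting code.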

Implementation

Two functions do the work: one compiles the versioned schema once at startup, the other validates a single payload and returns a structured result. Building the validator once matters: reading the file, running check_schema, and constructing Draft202012Validator are per-contract costs, so repeating them per payload is the most common way this layer becomes a latency bottleneck.

from __future__ import annotations

import json
import logging
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any

from jsonschema import Draft202012Validator, FormatChecker

logger = logging.getLogger("velocity.jsonschema")


@dataclass
class ValidationResult:
    """Outcome of validating one payload; errors is empty when is_valid is True."""
    is_valid: bool
    errors: list[dict[str, Any]] = field(default_factory=list)


def load_validator(schema_path: Path) -> Draft202012Validator:
    """Compile a versioned JSON Schema once, with RFC 3339 date-time enforcement."""
    schema = json.loads(schema_path.read_text())
    Draft202012Validator.check_schema(schema)   # fail fast on a malformed contract
    return Draft202012Validator(schema, format_checker=FormatChecker())


def validate_inventory_update(
    payload: dict[str, Any], validator: Draft202012Validator
) -> ValidationResult:
    """Validate one inventory-update payload against the compiled contract.

    Collects EVERY violation (not just the first) as a JSON-path-keyed error dict
    so a dead-letter consumer can reconcile a row without replaying the feed.
    """
    errors = [
        {
            "field": "/".join(map(str, e.absolute_path)) or "<root>",
            "message": e.message,
            "constraint": e.validator,
        }
        for e in sorted(validator.iter_errors(payload), key=lambda e: list(e.absolute_path))
    ]
    if errors:
        logger.warning(
            "sku=%s rejected with %d violation(s): %s",
            payload.get("sku_id", "<missing>"), len(errors), errors[0]["message"],
        )
    return ValidationResult(is_valid=not errors, errors=errors)

Figure (diagram not reproduced here): the rejected payload from the Verification section traced through the compiled validator. iter_errors does not stop at the first failure, so all six fields cross the gate and each surfaces as one structured error keyed by its JSON path and the exact constraint that fired.

Step-by-Step Walkthrough

Load the contract from disk, not from a literal. load_validator is handed the schema_path from the config profile, so the running validator and the published spec are byte-identical. Serving the schema from one versioned file is what keeps producer and consumer from drifting apart.
Self-check the schema before trusting it. check_schema validates the schema document itself against the Draft 2020-12 meta-schema. A typo like "minimun" would otherwise be silently ignored — meta-validation makes a broken contract fail at startup instead of passing bad data for weeks.
Attach a FormatChecker. Passing format_checker=FormatChecker() is what makes format: "date-time" a real RFC 3339 check, provided the format extra from the prerequisites is installed; this is the behaviour the profile's enforce_formats flag describes. Omit it and a last_sync_ts of "not-a-date" sails straight through.
Collect every violation with iter_errors. Unlike validate(), which raises on the first failure, iter_errors yields all of them. Sorting by absolute_path gives a stable, field-ordered error list so the same bad payload always produces the same dead-letter record.
Key each error by its JSON path. absolute_path joined with / names the exact offending field (on_hand_qty, or <root> for a whole-object violation like a forbidden extra property), and e.validator records which constraint fired (minimum, enum, pattern). That pair is everything a reconciliation job needs.
Return a typed result, log once. A ValidationResult dataclass carries the verdict and the structured errors to the caller, which routes valid payloads onward and quarantined ones — with their errors and the contract version — to the dead-letter sink. The single logger.warning gives an audit trail without flooding the log per field.
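
The routing described in the last step could be sketched roughly as follows. route_batch is a hypothetical helper, not part of the implementation above. It accepts any callable that returns an object with is_valid and errors attributes, so it can be fed a partial application of validate_inventory_update or, as here, a stand-in.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterable

Payload = dict[str, Any]
Errors = list[dict[str, Any]]


def route_batch(
    payloads: Iterable[Payload],
    validate: Callable[[Payload], Any],
) -> tuple[list[Payload], list[tuple[Payload, Errors]]]:
    """Split a batch into clean payloads and (payload, errors) rejects."""
    clean: list[Payload] = []
    rejected: list[tuple[Payload, Errors]] = []
    for payload in payloads:
        result = validate(payload)
        if result.is_valid:
            clean.append(payload)
        else:
            # The original payload is kept untouched next to its errors.
            rejected.append((payload, result.errors))
    return clean, rejected


def stand_in_validate(payload: Payload) -> SimpleNamespace:
    """Stand-in for the real validator: rejects negative on_hand_qty only."""
    ok = payload.get("on_hand_qty", -1) >= 0
    errors = [] if ok else [{"field": "on_hand_qty", "constraint": "minimum"}]
    return SimpleNamespace(is_valid=ok, errors=errors)


clean, rejected = route_batch(
    [{"on_hand_qty": 3}, {"on_hand_qty": -2}], stand_in_validate
)
print(len(clean), "clean;", len(rejected), "rejected")
```

With the functions from this page, validate would be something like functools.partial(validate_inventory_update, validator=validator), so the compiled validator is still built only once.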

Verification

Assert the validator’s invariants in CI before it ever meets a live feed: a clean payload passes, each class of malformed payload is caught with a legible field-keyed error, and an unexpected extra field is rejected.

from pathlib import Path

validator = load_validator(Path("schemas/inventory-update/2.3.0.json"))

good = {
    "sku_id": "WIDGET-001", "location_code": "A-04-12",
    "on_hand_qty": 120, "reserved_qty": 5,
    "velocity_class": "FAST", "last_sync_ts": "2026-07-02T09:15:00Z",
}
assert validate_inventory_update(good, validator).is_valid

# Lowercase SKU, out-of-scheme location, fractional quantity, bad enum, naive timestamp, phantom field.
bad = {
    "sku_id": "widget-001", "location_code": "Z-99-99",
    "on_hand_qty": 12.4, "velocity_class": "TURBO",
    "last_sync_ts": "2026-07-02 09:15", "warehouse_temp": 4.0,
}
result = validate_inventory_update(bad, validator)
assert not result.is_valid
fields = {e["field"] for e in result.errors}
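# "<root>" is the additionalProperties reject; last_sync_ts needs the format extra installed.
assert {"<root>", "sku_id", "location_code", "on_hand_qty", "velocity_class", "last_sync_ts"} <= fields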
print(f"{len(result.errors)} violations: {sorted(fields)}")

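A healthy run logs the rejection once and prints the collected field set (the WARNING prefix assumes the default logging.basicConfig() format). The root-level error has an empty path, so it sorts first and supplies the logged message:

WARNING:velocity.jsonschema:sku=widget-001 rejected with 6 violation(s): Additional properties are not allowed ('warehouse_temp' was unexpected)
6 violations: ['<root>', 'last_sync_ts', 'location_code', 'on_hand_qty', 'sku_id', 'velocity_class']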

The <root> entry is the additionalProperties failure for warehouse_temp — proof the contract is catching upstream drift, not just field-level typos.

Common Pitfalls

Forgetting the FormatChecker. format: "date-time" is annotation-only in JSON Schema by default. Without format_checker=FormatChecker() and the jsonschema[format] install, "2026-07-02 09:15" validates as a plain string and a naive timestamp corrupts every downstream window comparison.
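Using "number" where you mean "integer". "type": "number" accepts 12.4. Whole-number floats are a separate case: under Draft 2020-12 a value such as 12.0 already satisfies "type": "integer", and the jsonschema library follows that rule, but the value still reaches Python as a float after json.loads. If a legacy ERP exports quantities that way, decide deliberately: keep "integer" and normalise 12.0→12 in a pre-filter, or treat it as a producer bug and fix it at the source. Never silently truncate a genuinely fractional value.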
Loosening additionalProperties to stop the noise. Flipping it to true to quiet a reject-ratio spike hides the exact contract change you needed to see. Bump the schema $id version and reconcile the new field instead.
Recompiling the validator per payload. Building Draft202012Validator(schema) inside the hot path repeats schema loading and validator setup on every record. Build it once with load_validator at module init and reuse the instance; p99 validation should stay well under a couple of milliseconds per payload.
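
For the whole-number float case, a pre-filter along these lines is one option. This is a sketch under assumptions: normalise_whole_floats and QUANTITY_FIELDS are hypothetical names, and the choice to touch only the two quantity fields is illustrative. The point is that 12.0 becomes 12 while 12.4 is left alone for the schema to reject.

```python
from typing import Any

# Assumption: only the two quantity fields from the schema are normalised.
QUANTITY_FIELDS = ("on_hand_qty", "reserved_qty")


def normalise_whole_floats(payload: dict[str, Any]) -> dict[str, Any]:
    """Return a copy where whole-number float quantities (12.0) become ints (12).

    Fractional values such as 12.4 are left untouched so the schema's
    "integer" constraint still rejects them; nothing is truncated.
    """
    cleaned = dict(payload)  # shallow copy, the caller's payload is not mutated
    for name in QUANTITY_FIELDS:
        value = cleaned.get(name)
        if isinstance(value, float) and value.is_integer():
            cleaned[name] = int(value)
    return cleaned


print(normalise_whole_floats({"on_hand_qty": 12.0, "reserved_qty": 12.4}))
```

Running the filter before validation keeps the decision explicit: the original payload can still be dead-lettered verbatim if the normalised copy fails the contract.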

FAQ

When should I use JSON Schema instead of the pydantic model?

Use JSON Schema when the contract must live outside Python — shared with a Java or Go producer, published as the authoritative spec, or version-controlled as a standalone document. Use the pydantic model from Schema Validation for Inventory Feeds when the validated records are consumed by Python and you want typed objects for free. Many pipelines run both: the JSON Schema as the language-neutral contract of record, pydantic as the runtime enforcer inside the consumer.

How do I version the schema without breaking live feeds?

Encode the version in the $id (inventory-update:2.3.0) and bump it on any field add, removal, or retype. Stamp that version onto every dead-letter record so a post-change reconciliation can distinguish a genuine bad row from one that only failed because the contract moved. For a planned change, run the old and new schema documents in parallel over a sampled feed, confirm the reject ratio stays flat, then cut over.
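
The parallel run can be approximated with a small comparison harness. The sketch below is an assumption about how such a run might be structured rather than a prescribed procedure. compare_contracts is a hypothetical name, and the two verdict callables stand in for the old and new compiled validators.

```python
from typing import Any, Callable, Iterable

Payload = dict[str, Any]
Verdict = Callable[[Payload], bool]


def compare_contracts(
    sample: Iterable[Payload], old_ok: Verdict, new_ok: Verdict
) -> dict[str, Any]:
    """Run two contract versions over the same sample and summarise the difference."""
    total = old_rejects = new_rejects = 0
    newly_rejected: list[Payload] = []
    for payload in sample:
        total += 1
        passed_old, passed_new = bool(old_ok(payload)), bool(new_ok(payload))
        old_rejects += not passed_old
        new_rejects += not passed_new
        if passed_old and not passed_new:
            # Failed only because the contract moved, not because the row got worse.
            newly_rejected.append(payload)
    return {
        "old_reject_ratio": old_rejects / total if total else 0.0,
        "new_reject_ratio": new_rejects / total if total else 0.0,
        "newly_rejected": newly_rejected,
    }


# Stand-in verdicts; real ones would wrap the old and new validators.
sample = [{"on_hand_qty": 1}, {"on_hand_qty": -1}, {"on_hand_qty": 5, "extra": 1}]
report = compare_contracts(
    sample,
    old_ok=lambda p: p["on_hand_qty"] >= 0,
    new_ok=lambda p: p["on_hand_qty"] >= 0 and "extra" not in p,
)
print(report["old_reject_ratio"], report["new_reject_ratio"])
```

With the functions from this page, each verdict could be a lambda that calls validate_inventory_update against one version's validator and returns is_valid. A non-empty newly_rejected list is the signal to hold the cutover.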

Where do rejected payloads go after validation?

To a dead-letter sink, carrying the original payload, the field-keyed error list, and the contract version — never dropped and never coerced into passing. The clean payloads advance to the batch scoring path in Async Batch Processing for Velocity, which assumes contract-clean input; keep schema rejects on a separate path from transport failures so the two are diagnosed independently.
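
A dead-letter record carrying those three pieces might be assembled as in the sketch below. The field names, the rejected_at timestamp, and the build_dead_letter_record helper are all assumptions for illustration. It also assumes the version is the last colon-separated segment of the schema $id, as in the $id shown in the configuration block.

```python
import json
from datetime import datetime, timezone
from typing import Any


def build_dead_letter_record(
    payload: dict[str, Any],
    errors: list[dict[str, Any]],
    schema_id: str,
) -> dict[str, Any]:
    """Wrap a rejected payload with its errors and the contract that rejected it."""
    return {
        "contract_id": schema_id,
        # Assumes the version is the last colon-separated segment of the $id.
        "contract_version": schema_id.rsplit(":", 1)[-1],
        "rejected_at": datetime.now(timezone.utc).isoformat(),
        "errors": errors,
        "payload": payload,  # kept verbatim, never coerced into passing
    }


record = build_dead_letter_record(
    {"sku_id": "widget-001"},
    [{"field": "sku_id", "message": "pattern mismatch", "constraint": "pattern"}],
    "urn:warehouse-slotting:inventory-update:2.3.0",
)
print(json.dumps(record, indent=2))
```

Because the error dicts hold only strings, the record serialises with json.dumps as-is and can be written to whichever sink the pipeline uses.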

Schema Validation for Inventory Feeds — the parent guide and its pydantic enforcer, which this JSON Schema contract complements field-for-field.
WMS & ERP Polling Strategies — the extractors that emit these payloads, and whether they arrive as delta events or snapshot batches.
Async Batch Processing for Velocity — the scoring layer that consumes only the contract-clean payloads this validator lets through.
SKU Velocity Taxonomy Design — the source of the velocity_class enumeration this schema pins the payload to.