PIF v0.1 — Preflight Interchange Format
Status: Draft. Initial public release.
Authors: RegCheck (initial author of the spec; open to additional contributors)
License: Apache 2.0 (spec and reference implementation)
What PIF is
PIF (Preflight Interchange Format) is an open JSON specification for representing the inputs and outputs of an AI workflow preflight assessment in regulated industries — pharma being the initial target.
PIF defines the shape of three artifacts:
- WorkflowDescription: what an AI workflow is intended to do
- PreflightAssertion: the assessment of that workflow against applicable regulatory and operational considerations
- PreflightSession: the durable record wrapping description, assertion, tool invocations, and reviewer actions
PIF does NOT define:
- The content of any specific regulatory requirement
- How a preflight should reason about applicability
- Confidence calibration methodology
- Acceptable thresholds for any field
These are deliberately left to implementations. PIF describes the shape of preflight artifacts so they can flow between tools, audit systems, and reviewers — not the substance of regulatory interpretation.
PIF v0.1 is tuned for pharma and GxP contexts — that is the initial author's working domain. The artifact shape is general enough to apply to medtech, diagnostics, banking, and critical-infrastructure preflight scenarios; future minor versions may add domain-specific vocabularies. Experiments in adjacent regulated sectors are encouraged.
Why a shared format
Pharma teams deploying AI workflows currently document compliance considerations in inconsistent ways: Confluence pages, Word documents in change controls, fields in RIM systems, sometimes nowhere. When an auditor or inspector asks "how did you assess whether this AI workflow was appropriate?", there is no shared shape to the answer.
PIF aims to be to AI workflow preflight what SBOM is to software supply chains, OpenAPI is to APIs, or SPDX is to licensing metadata: a portable, validatable, machine-readable artifact format that organizations and tools can produce and consume consistently.
Conceptual model
PIF is workflow-first. The primary object is the workflow being assessed, not the regulatory document being searched. Requirements are projected onto workflows; they are not the starting point.
This matters because it forces preflight outputs to answer the question the workflow owner actually has — "is this workflow appropriate given what it's trying to do?" — rather than the question search engines answer — "what regulations exist about topic X?"
Three principles run through the design:
- Honesty by architecture. Required fields like assumptions_made, clarifying_questions, out_of_scope, and applicability_basis make uncertainty and limits structural, not optional disclaimer text.
- Composable primitives. A PreflightAssertion is built from multiple tool invocations recorded in the PreflightSession. Producers may decompose the work however they choose, but the artifact shape is consistent.
- Replayability. Combined with corpus_snapshot and produced_by.methodology, a PIF document carries enough metadata that the assessment can be reasoned about, and ideally reproduced, months or years later.
Lifecycle
The three artifact types form a forward flow. A WorkflowDescription is authored by the workflow owner (or generated from intake) and describes intent. A PreflightAssertion is produced by a preflight tool against that description and records the assessment. A PreflightSession wraps both, plus the tool invocations and reviewer actions in between, into a single durable record that can be archived, audited, and replayed.
WorkflowDescription       PreflightAssertion       PreflightSession
──────────────────        ──────────────────       ────────────────
designer authors     ▸    tool produces       ▸    both wrapped +
intent + scope            findings + basis         tool calls + reviews
                                                   stored durably
The boundary is deliberate: descriptions and assertions can flow independently (e.g., a description can be re-assessed by a different tool), while the session is the system-of-record envelope an audit function holds onto.
Required fields
PIF v0.1 is strict on a minimal core, permissive on extensions.
WorkflowDescription — required
- pif_version
- workflow_id
- intent
- ai_role
- output_destination
- human_gate
Everything else is optional but recommended. A WorkflowDescription with only required fields is valid PIF; it will produce a less precise PreflightAssertion than one with more fields populated.
PreflightAssertion — required
- pif_version
- assertion_id
- workflow_ref
- produced_by (with at minimum tool and version)
- produced_at
- risk_classification (with level and rationale)
- applicable_requirements (with applicability basis and source per requirement)
- missing_controls
- assumptions_made
- verification_steps
- status
The audit-defensibility argument hinges on missing_controls, assumptions_made, and verification_steps being mandatory. A preflight artifact that doesn't tell you what assumptions it made, what controls it identified as missing, and how a human can verify it independently is not an audit artifact; it is just an answer.
PreflightSession — required
- pif_version
- session_id
- workflow_description
- preflight_assertion
- created_at
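As a rough illustration of the three required-field lists above, the sketch below checks top-level required keys by hand and wraps a description and an assertion into a minimal session. It is not a substitute for validating against the published JSON Schemas. The helper names (`missing_required`, `wrap_session`), the example identifiers, and the assumption that a session embeds the two documents directly under `workflow_description` and `preflight_assertion` are illustrative, not taken from the spec.

```python
from datetime import datetime, timezone

# Required top-level fields per artifact type, copied from the lists above.
REQUIRED = {
    "WorkflowDescription": [
        "pif_version", "workflow_id", "intent",
        "ai_role", "output_destination", "human_gate",
    ],
    "PreflightAssertion": [
        "pif_version", "assertion_id", "workflow_ref", "produced_by",
        "produced_at", "risk_classification", "applicable_requirements",
        "missing_controls", "assumptions_made", "verification_steps", "status",
    ],
    "PreflightSession": [
        "pif_version", "session_id", "workflow_description",
        "preflight_assertion", "created_at",
    ],
}


def missing_required(artifact_type: str, doc: dict) -> list[str]:
    """Return required top-level fields absent from doc (hypothetical helper)."""
    return [name for name in REQUIRED[artifact_type] if name not in doc]


def wrap_session(session_id: str, description: dict, assertion: dict) -> dict:
    """Wrap both documents in a minimal PreflightSession.

    ASSUMPTION: the session embeds the documents themselves rather than
    references to them.
    """
    return {
        "pif_version": "0.1",
        "session_id": session_id,
        "workflow_description": description,
        "preflight_assertion": assertion,
        "created_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


description = {
    "pif_version": "0.1",
    "workflow_id": "wf_minimal_example_001",
    "intent": "Summarize incoming customer complaints for triage.",
    "ai_role": "summarization",
    "output_destination": "draft_for_review",
    "human_gate": "review",
}
# Deliberately incomplete, to show what the check reports.
incomplete_assertion = {"pif_version": "0.1", "assertion_id": "pa_example"}

session = wrap_session("ps_example_001", description, incomplete_assertion)
print(missing_required("WorkflowDescription", description))
print(missing_required("PreflightSession", session))
print(missing_required("PreflightAssertion", incomplete_assertion))
```

The check only looks at key presence; nested requirements such as produced_by.tool or risk_classification.level are left to schema validation.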
Recommended shapes for corpus_snapshot and produced_by.methodology
corpus_snapshot and produced_by.methodology are optional in v0.1 but central to replayability. Implementations are encouraged to populate them with a stable shape so consumers can reason about provenance without per-tool special cases.
Recommended fields for corpus_snapshot:
- name: human-readable corpus identifier (e.g., preclari-eu-gmp)
- version: corpus version string, monotonic
- snapshot_date: ISO-8601 timestamp of the snapshot
- source_count: integer count of source documents in the snapshot
- hash: content hash of the snapshot (e.g., sha256:...) so consumers can detect drift
Recommended fields for produced_by.methodology:
- name: methodology identifier (e.g., preclari-method)
- version: methodology version (e.g., 1.0)
- reference_url: URL to the published methodology document
These shapes are not enforced by the v0.1 JSON Schemas (the fields accept a free string today for backward compatibility). They are documented here to keep implementations from diverging on what "snapshot" and "methodology" mean in practice.
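One way a producer might populate the recommended corpus_snapshot fields is sketched below. PIF v0.1 does not say how the content hash is computed, so the scheme here (SHA-256 over document ids and per-document digests, visited in sorted order) is purely an assumption, as are the function name and the example corpus.

```python
import hashlib
from datetime import datetime, timezone


def build_corpus_snapshot(name: str, version: str, documents: dict[str, bytes]) -> dict:
    """Build a corpus_snapshot object with the recommended fields.

    ASSUMPTION: the hash covers each document id and the SHA-256 digest of
    its content, in sorted id order. The spec does not prescribe a scheme.
    """
    digest = hashlib.sha256()
    for doc_id in sorted(documents):
        digest.update(doc_id.encode("utf-8"))
        digest.update(b"\x00")  # separator between id and content digest
        digest.update(hashlib.sha256(documents[doc_id]).digest())
    return {
        "name": name,
        "version": version,
        "snapshot_date": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "source_count": len(documents),
        "hash": "sha256:" + digest.hexdigest(),
    }


# Hypothetical two-document corpus.
corpus = {"annex-11.pdf": b"first document", "part-11.pdf": b"second document"}
snapshot = build_corpus_snapshot("example-corpus", "1", corpus)
print(snapshot["source_count"], snapshot["hash"][:7])
```

Because documents are visited in sorted order, the hash does not depend on insertion order, so a consumer can detect drift by recomputing it over the same corpus and comparing.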
Examples
Minimal valid WorkflowDescription
The smallest PIF artifact: a WorkflowDescription populated with only the required fields. Valid against workflow-description.schema.json but will produce a thin assertion.
{
  "pif_version": "0.1",
  "workflow_id": "wf_minimal_example_001",
  "intent": "Summarize incoming customer complaints into a structured report for the quality team to triage.",
  "ai_role": "summarization",
  "output_destination": "draft_for_review",
  "human_gate": "review"
}
Rich PreflightAssertion (excerpted)
A PreflightAssertion with most optional fields populated. Excerpted for readability; the full document lives at spec/v0.1/examples/preflight-assertion.example.json.
{
  "pif_version": "0.1",
  "assertion_id": "pa_2026_001_a",
  "workflow_ref": "wf_qd_triage_2026_001",
  "produced_by": {
    "tool": "preclari",
    "version": "0.4.2",
    "methodology": "preclari-method-v1.0",
    "provider": "anthropic",
    "model_used": "claude-sonnet-4.6",
    "model_assignments": {
      "risk_classification": "claude-sonnet-4.6",
      "requirement_projection": "claude-sonnet-4.6"
      // + 3 more steps
    },
    "entitlements": {
      "jurisdictions": ["EU", "US", "CH", "UK"],
      "gxp_domains": ["GMP", "GDP", "GCP", "GLP", "GVP", "CSV", "data_integrity", "quality_systems"]
    }
  },
  "produced_at": "2026-05-17T14:30:00Z",
  "corpus_snapshot": {
    "snapshot_id": "corpus_2026_w19",
    "snapshot_hash": "sha256:7f3e8a2b...4f",
    "snapshot_date": "2026-05-12T00:00:00Z",
    "source_count": 247
  },
  "risk_classification": {
    "level": "medium",
    "rationale": "Workflow influences GxP decisions but human approval gate is required before any action. Lifecycle stage is pilot, narrowing the operational scope.",
    "drivers": [
      "data_classes includes gxp_record",
      "human_gate=approve_each (every output reviewed)",
      "lifecycle_stage=pilot (limited blast radius)"
    ]
  },
  "applicable_requirements": [
    {
      "requirement_id": "req_001",
      "requirement_text": "Computerized systems that influence GxP decisions are subject to validation expectations proportionate to their risk and intended use.",
      "source": {
        "url": "https://example-regulator.eu/annex-11",
        "canonical_document_id": "EU-GMP-Annex-11",
        "issuing_authority": "European Commission",
        "jurisdiction": "EU",
        "document_type": "annex",
        "effective_date": "2011-06-30",
        "content_hash": "sha256:a1b2c3d4...b2"
      },
      "applicability_basis": "The workflow uses an AI system to produce structured recommendations that feed into GxP decisions. Annex 11 applies because the system is computerised, used in GxP context, and influences regulated decisions.",
      "confidence": "high",
      "jurisdictional_scope": ["EU"]
    }
    // + 2 more requirements
  ],
  "missing_controls": [
    {
      "control": "documented_user_requirements_specification",
      "rationale": "Annex 11 expects URS for computerised systems influencing GxP decisions. WorkflowDescription does not indicate whether a URS exists.",
      "criticality": "required",
      "related_requirements": ["req_001"]
    }
    // + 5 more controls
  ],
  "assumptions_made": [
    {
      "assumption": "The Basel facility holds a Swiss GMP manufacturing authorization from Swissmedic.",
      "impact_if_wrong": "Swiss-specific requirements would not apply; assessment narrows to EU-only.",
      "basis": "Inferred from context_notes mention of Basel manufacturing without explicit confirmation."
    }
    // + 2 more assumptions
  ],
  "verification_steps": [
    {
      "step": "Confirm the EU GMP Annex 11 citation matches the requirement text in req_001 and that the retrieved document version remains current.",
      "type": "source_check",
      "applies_to": "applicable_requirements[0]"
    }
    // + 2 more steps
  ],
  "recommendation": "Treat as a controlled pilot with missing_controls addressed before first live use. Document the qualification approach proportionate to medium risk classification.",
  "status": "draft",
  "notice": "This assertion is informational and does not constitute regulatory advice."
}
The // + N more markers indicate elision for readability. The on-disk fixture is pure JSON — comments are not part of the format.
Controlled vocabularies
PIF v0.1 defines closed enums for the following fields. Extension via the extensions object is permitted but the core vocabularies are normative. Several enums encode distinctions that are non-obvious to non-pharma readers; the values below carry short glosses so producers and consumers agree on meaning.
ai_role — what the AI is asked to do
- decision: AI makes the final call without human approval in the workflow.
- recommendation: AI proposes; a human approves or rejects before action.
- draft: AI produces content for a human to edit and own.
- classification: AI assigns categories or labels to inputs.
- extraction: AI pulls structured data from unstructured sources.
- copilot: AI suggests within a human-driven workflow, no separate approval step.
- summarization: AI condenses content for human consumption.
output_destination — where the AI's output flows
- advisory: informs a human; not stored as a record.
- regulated_decision: becomes part of a GxP decision or regulatory submission.
- system_of_record: written to a controlled system (QMS, RIM, LIMS, etc.).
- draft_for_review: staged for human edit before any downstream use.
- automated_action: triggers downstream action without human review.
- archive: stored but not actioned.
human_gate — how a human stays in the loop
- none: no human in the loop; the AI's output flows directly downstream.
- review: optional human review; the human may or may not look at any given output.
- approve_each: reviewer approves every model output individually before downstream use.
- approve_batch: reviewer approves outputs in batches (e.g., a daily review of the day's outputs).
- post_hoc_audit: outputs flow without prior review; a sample is audited after the fact.
reversibility — can the workflow's action be undone
- reversible: the downstream action can be reversed without material consequence.
- partial: some effects can be reversed; some cannot (e.g., a draft is reversible but downstream perception is not).
- irreversible: the action cannot be undone once taken (e.g., batch release, regulatory submission).
lifecycle_stage — where the workflow sits
- design: pre-implementation; design and feasibility work.
- pilot: limited deployment with controlled scope and elevated monitoring.
- production: full deployment under steady-state controls.
- retirement: being decommissioned; included so PIF can describe sunsetted workflows.
risk_tolerance — the workflow owner's stated tolerance
- very_low: zero-defect target; controls scoped for worst-case scenarios.
- low: conservative; controls scoped for likely failure modes.
- medium: balanced; standard industry practice.
- high: exploratory or low-stakes; lighter controls accepted.
Other closed enums
- data_classes: gxp_record, pii, phi, manufacturing_data, clinical_data, regulatory_submission, quality_data, safety_data, supply_chain_data, commercial_data, other
- gxp_domains: GMP, GDP, GCP, GLP, GVP, CSV, data_integrity, quality_systems, regulatory_affairs, pharmacovigilance, labeling
- risk_level (assertion output): low, medium, high, critical
- confidence: high, medium, low, contested
- status: draft, in_review, approved, contested, superseded
Jurisdictions use ISO 3166-1 alpha-2 codes plus recognized supranational codes (EU, ICH, WHO, PICS).
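To make the closed-enum rule concrete, here is a minimal sketch of a check over the WorkflowDescription vocabularies listed above. The value sets are copied from this section; the function name and the message wording are illustrative, and a real implementation would more likely rely on the published JSON Schemas.

```python
# Closed vocabularies for WorkflowDescription fields, copied from the spec text.
CLOSED_ENUMS = {
    "ai_role": {
        "decision", "recommendation", "draft", "classification",
        "extraction", "copilot", "summarization",
    },
    "output_destination": {
        "advisory", "regulated_decision", "system_of_record",
        "draft_for_review", "automated_action", "archive",
    },
    "human_gate": {"none", "review", "approve_each", "approve_batch", "post_hoc_audit"},
    "reversibility": {"reversible", "partial", "irreversible"},
    "lifecycle_stage": {"design", "pilot", "production", "retirement"},
    "risk_tolerance": {"very_low", "low", "medium", "high"},
}


def enum_violations(description: dict) -> list[str]:
    """List fields whose value is outside the v0.1 closed vocabulary.

    Fields that are absent are skipped; presence is a separate check.
    """
    problems = []
    for field, allowed in CLOSED_ENUMS.items():
        if field not in description:
            continue
        value = description[field]
        if not isinstance(value, str) or value not in allowed:
            problems.append(f"{field}: {value!r} is not a v0.1 value")
    return problems


print(enum_violations({"ai_role": "summarization", "human_gate": "review"}))
print(enum_violations({"ai_role": "summarization", "human_gate": "approve_some"}))
```

The second call reports one problem, because approve_some is not among the five human_gate values.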
Extensions
Implementations may add fields via the extensions object on any top-level type. Field names within extensions should be prefixed with the implementing tool's identifier (e.g., preclari:custom_field).
Implementations MUST NOT reject documents containing unknown extensions. Forward compatibility is a hard requirement.
Example: a vendor adding internal scoring fields under its own namespace.
"extensions": {
  "preclari:internal_risk_score": 0.74,
  "preclari:calibration_method": "isotonic_v2"
}
A consumer that does not recognise the preclari: namespace MUST still accept and pass through the document; the values can be ignored, surfaced as unknown, or carried forward, but the document itself remains valid PIF.
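A consumer-side sketch of this pass-through rule follows. It sorts extension entries into those whose namespace the consumer understands and those it merely carries forward, and never rejects or mutates the document. The function name and the `acme` namespace are hypothetical.

```python
def partition_extensions(doc: dict, understood_namespaces: set[str]) -> tuple[dict, dict]:
    """Split doc['extensions'] into (understood, carried) without rejecting anything.

    Keys are expected to look like 'namespace:field'. Keys with no namespace
    prefix are treated as unknown and carried forward.
    """
    understood, carried = {}, {}
    for key, value in doc.get("extensions", {}).items():
        namespace = key.partition(":")[0] if ":" in key else ""
        target = understood if namespace in understood_namespaces else carried
        target[key] = value
    return understood, carried


document = {
    "pif_version": "0.1",
    "extensions": {
        "preclari:internal_risk_score": 0.74,
        "preclari:calibration_method": "isotonic_v2",
    },
}

# This hypothetical consumer only understands its own 'acme' namespace.
understood, carried = partition_extensions(document, {"acme"})
print(understood)
print(sorted(carried))
```

Here both preclari: entries land in the carried set, and the document itself is left untouched, so it can be stored or forwarded as received.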
Extending closed enums
Closed enums (ai_role, output_destination, human_gate, reversibility, lifecycle_stage, risk_tolerance, risk_level, confidence, status, data_classes, gxp_domains, document_type) MUST NOT be extended ad-hoc by implementations. A producer that emits an enum value not defined in this version of the spec is producing a non-conformant document.
To propose a new enum value:
- Preferred — open a pull request against the spec repository describing the use case, why no existing value fits, and the backward-compatibility implication. Accepted proposals ship in the next minor version.
- Otherwise, carry the implementation-specific value under the extensions object with a vendor namespace (e.g., "extensions": { "preclari:custom_risk_level": "elevated" }) rather than reusing the core enum field.
This rule exists so that a tool consuming PIF can rely on the core enums having a fixed shape. Forward compatibility lives in extensions, not in the closed vocabularies.
Signatures
PIF v0.1 defines a signature field on PreflightAssertion. The field is optional in the spec; implementations may require it for specific use cases (e.g., paid tiers, audit submissions).
Recommended:
- Sign over the canonicalized form of the document with the signature field removed
- Canonicalization: JCS (JSON Canonicalization Scheme, RFC 8785)
- Algorithms: Ed25519 (recommended), RSA-PSS-SHA256, ECDSA-P256-SHA256
The verification model is intentionally simple. There is no PIF-level PKI. Implementations publish their own public keys and own the trust relationship with their users.
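The sketch below shows the "remove the signature field, canonicalize, then sign" order of operations. It is only an approximation of JCS: sorting keys and using compact separators with Python's json module matches RFC 8785 for simple documents (ASCII keys, strings, integers, booleans, null), but not for floating-point numbers or non-ASCII key ordering, so production code should use a real RFC 8785 implementation. It also assumes signature is a top-level field; the function name is hypothetical, and the signing step itself (for example Ed25519 via a cryptography library) is not shown.

```python
import json


def signing_payload(assertion: dict) -> bytes:
    """Return the bytes to sign: the document without 'signature', canonicalized.

    ASSUMPTION: json.dumps with sorted keys and compact separators stands in
    for JCS here. It is NOT a full RFC 8785 implementation.
    """
    unsigned = {key: value for key, value in assertion.items() if key != "signature"}
    text = json.dumps(unsigned, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


assertion = {
    "status": "draft",
    "assertion_id": "pa_2026_001_a",
    "pif_version": "0.1",
    "signature": "placeholder-not-a-real-signature",
}
payload = signing_payload(assertion)
print(payload.decode("utf-8"))
```

A verifier would rebuild the same payload from the received document and check the signature against it, which is why the payload must not depend on key order or on whether the signature field is already present.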
JSON Schema URLs
The JSON Schemas for the three PIF artifact types are published at stable versioned URLs. Each schema's $id is its public URL, and $schema is https://json-schema.org/draft/2020-12/schema.
- https://preclari.com/pif/v0.1/workflow-description.schema.json
- https://preclari.com/pif/v0.1/preflight-assertion.schema.json
- https://preclari.com/pif/v0.1/preflight-session.schema.json
The JSON-LD context for the same vocabulary is published at:
- https://preclari.com/pif/v0.1/context.jsonld
- Also mirrored at the .well-known path: https://preclari.com/.well-known/pif/v0.1/context.jsonld
These URLs are stable for the lifetime of v0.1. Future minor versions (v0.2, etc.) publish at their own paths; v0.1 URLs do not move.
Validation
The repository includes JSON Schema files in spec/v0.1/. Any standards-compliant JSON Schema validator (Draft 2020-12) can validate PIF documents.
A CLI validator is provided in validator/:
pif-validate path/to/document.json
The validator exits with status 0 when the document conforms; otherwise it exits non-zero and reports line-level errors.
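For pipelines that gate on the validator's exit status, a thin wrapper might look like the sketch below. Only the 0 / non-zero contract comes from the text above; whether errors are written to stdout or stderr is not stated, so the sketch captures both. The wrapper name is hypothetical, and the demonstration uses a stand-in command so it runs without pif-validate installed.

```python
import subprocess
import sys


def run_validator(path: str, command: tuple[str, ...] = ("pif-validate",)) -> tuple[bool, str]:
    """Run the validator on one file and return (conformant, combined output).

    The exit status decides conformance: 0 means conformant.
    """
    result = subprocess.run([*command, path], capture_output=True, text=True)
    return result.returncode == 0, (result.stdout + result.stderr).strip()


# Stand-in for pif-validate: a Python one-liner that prints a made-up
# error message and exits with status 1.
stand_in = (sys.executable, "-c", "import sys; print('example error'); sys.exit(1)")
ok, output = run_validator("path/to/document.json", command=stand_in)
print(ok, output)
```

With the real CLI on the PATH, the default command applies and the call reduces to run_validator("path/to/document.json").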
Conformance test suite
A conformance test bundle for PIF v0.1 lives at spec/v0.1/conformance/. It carries a manifest of 25 test cases — valid examples across the risk tiers, plus 19 single-violation invalid fixtures each testing exactly one schema rule (missing required fields, closed-enum violations, pattern violations, length violations, type violations, additionalProperties: false violations). A reference TypeScript harness validates every case against the declared expectation and asserts the validator surfaced the expected violation; CI runs the suite on every PR. Implementers of PIF validators should run the suite against their tool — a validator that disagrees on any case is not v0.1-conformant.
JSON and JSON-LD
PIF documents are valid JSON. The repository also includes a JSON-LD @context at spec/v0.1/context.jsonld that maps PIF fields to semantic IRIs.
The same field shapes serve both modes. A flat JSON PIF document can be made JSON-LD-compatible by adding "@context": "https://preclari.com/pif/v0.1/context.jsonld" at the top level. A JSON-LD PIF document with the @context removed is valid flat JSON.
Implementations are encouraged to support both. Most consumers will use flat JSON. Knowledge graph, provenance, and semantic reasoning integrations benefit from JSON-LD.
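The two directions described above amount to adding or removing one top-level key. A minimal sketch, with hypothetical function names and the context URL taken from this document:

```python
PIF_CONTEXT = "https://preclari.com/pif/v0.1/context.jsonld"


def to_jsonld(doc: dict) -> dict:
    """Return a copy of a flat PIF document with the v0.1 @context added first."""
    rest = {key: value for key, value in doc.items() if key != "@context"}
    return {"@context": PIF_CONTEXT, **rest}


def to_flat_json(doc: dict) -> dict:
    """Return a copy of a JSON-LD PIF document with @context removed."""
    return {key: value for key, value in doc.items() if key != "@context"}


flat = {"pif_version": "0.1", "workflow_id": "wf_minimal_example_001"}
linked = to_jsonld(flat)
print(list(linked))
print(to_flat_json(linked) == flat)
```

Both helpers return new dictionaries, so the caller's original document is never modified, and converting to JSON-LD and back yields the document you started with.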
Example: PIF inside a provenance graph
A common JSON-LD use case is linking a PreflightAssertion to the workflow it assesses inside a broader provenance graph (e.g., a QMS or RIM system that already exposes its records as linked data):
{
  "@context": "https://preclari.com/pif/v0.1/context.jsonld",
  "@type": "PreflightAssertion",
  "assertion_id": "pa_2026_001_a",
  "workflow_ref": "wf_qd_triage_2026_001",
  "describes": {
    "@id": "urn:qms:workflow:wf_qd_triage_2026_001"
  },
  "produced_by": {
    "tool": "preclari",
    "version": "0.4.2"
  },
  "produced_at": "2026-05-17T14:30:00Z",
  "status": "approved"
}
The describes link is an extension pattern using a urn: identifier so the assertion can resolve into a host system's namespace (urn:qms:..., urn:rim:..., https://example.org/qms/...) without PIF taking a position on which QMS or RIM the consumer happens to use.
Versioning
PIF follows semantic versioning at the spec level:
- Patch (0.1.x): documentation clarifications, non-normative additions
- Minor (0.x): additive changes that preserve backward compatibility (new optional fields, expanded enum values that are gracefully ignored by older validators)
- Major (x.0): breaking changes
Older spec versions remain available at versioned URLs indefinitely.
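A consumer may want to classify an incoming document's pif_version against the version it was built for before deciding how to handle it. The sketch below only reports the relationship; what to do with a newer minor version is left to the implementation. The function names and return labels are hypothetical, and the parser assumes version strings of the form major.minor with an optional patch component.

```python
def parse_pif_version(value: str) -> tuple[int, int]:
    """Parse 'major.minor' or 'major.minor.patch' into (major, minor)."""
    parts = value.split(".")
    if len(parts) not in (2, 3) or not all(part.isdigit() for part in parts):
        raise ValueError(f"not a PIF version string: {value!r}")
    return int(parts[0]), int(parts[1])


def version_relation(document_version: str, supported_version: str) -> str:
    """Describe how a document's version relates to the one a consumer supports."""
    doc = parse_pif_version(document_version)
    supported = parse_pif_version(supported_version)
    if doc[0] != supported[0]:
        return "different_major"  # breaking changes are possible
    if doc[1] == supported[1]:
        return "same_minor"       # patch-level differences only
    return "newer_minor" if doc[1] > supported[1] else "older_minor"


for candidate in ("0.1", "0.1.3", "0.2", "1.0"):
    print(candidate, version_relation(candidate, "0.1"))
```

The patch component is ignored on purpose, since patch releases are described as documentation clarifications and non-normative additions.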
Cadence
v0.1 is intended to remain stable for at least 6 months from publication so implementers can ship against it without chasing the spec. v0.2 will focus on a formal vocabulary for control names (currently free-text in missing_controls.control) and on RIM integration shapes — both already listed under "What's not in v0.1". Breaking changes between minor versions are not permitted; anything breaking ships under a major version with a migration story.
What's not in v0.1
Deliberately deferred to v0.2 or later:
- A formal vocabulary for control names (currently free-text in missing_controls.control)
- Schema for jurisdictional override rules
- Bidirectional links between WorkflowDescription and policy systems (e.g., RIM integration shapes)
- Multi-language support for intent and other free-text fields
- A digital ledger / transparency log for signed assertions
These are valuable but not necessary for v0.1 to be useful.
Interoperability
PIF is designed to flow between systems already in place at organisations running regulated AI workflows. Implementers from any of these categories are welcome:
- eQMS / QMS systems: the natural system-of-record for PreflightSession documents and for downstream change control of missing_controls.
- RIM (Regulatory Information Management) systems: for linking PreflightAssertion documents to product, submission, and authorisation context.
- CSV (Computerised Systems Validation) toolchains: for capturing PIF artifacts as part of qualification evidence for AI-impacting computerised systems.
- Agent frameworks: MCP servers, LangChain, LlamaIndex, Bedrock Agents, and similar runtimes that produce WorkflowDescription documents from agent definitions and consume PreflightAssertion documents as preflight gating.
- Audit / SIEM systems: for ingesting signed PreflightSession documents into the audit trail for inspections and internal periodic review.
PIF takes no position on which of these systems an organisation runs. The format aims to be the portable artifact that flows between them.
Governance
PIF is maintained by RegCheck as the initial author, with the intent to transition to an open working group as adoption broadens. Decisions are documented in GOVERNANCE.md at the repository root: change types, the review window for minor releases, the criteria for adding fields and enum values, and the deprecation policy. Breaking changes require a major version bump and a 30-day public RFC.
Once v0.1 sees adoption beyond RegCheck, the spec and reference implementation are intended to move to a neutral pif-spec GitHub organisation under continued Apache 2.0 licensing. This commitment is recorded here so future contributors can hold the maintainers to it.
How to contribute
PIF is currently maintained by the RegCheck team but is intended to be a community spec. Contributions are welcome via the repository's issues and pull requests. See GOVERNANCE.md for the decision process.