Locked & verifiable.
A pre-registered claim, anchored to a SHA-256 hash before the run. Anyone can re-derive it from the canonical bytes below.
SHA-256
9d028ef6fc3af63277921ceaab146359589298e734b146c57a4158d72676f39c
Registered 2026-05-15T18:36:42.165Z
Submitted by @falsify-seed
Manifest preview
version: "prml/0.1"
claim_id: "01897d80-0000-7a01-8000-00000000004a"
created_at: "2023-08-14T10:00:00Z"
metric: "accuracy"
metric_args:
  shots: 5
comparator: ">="
threshold: 0.864
dataset:
  id: "mmlu-test"
  hash: "c1f9b6d6a3e7d4b2f0a18c5e7d2b9f4a6c8e1d3b5a7c9e2f4d6b8a0c2e4f6a8b"
  uri: "https://huggingface.co/datasets/cais/mmlu"
model:
  id: "gpt-4-0314"
seed: 0
producer:
  id: "falsify.dev"
notes: "Retroactive demo lock — original claim is OpenAI's, not falsify.dev's."
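The manifest reduces the claim to a metric, a comparator, and a threshold, so a verdict comes down to one comparison. The following is a minimal sketch of that check, not the registry's own implementation. The function name `claim_holds`, the comparator table beyond `">="`, and the observed scores are all assumptions made for illustration.

```python
import operator

# Comparator strings mapped to Python operators. Only ">=" appears in the
# manifest above; the other entries are assumptions for illustration.
COMPARATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
    "==": operator.eq,
}


def claim_holds(observed: float, comparator: str, threshold: float) -> bool:
    """Return True if `observed <comparator> threshold` is satisfied."""
    try:
        compare = COMPARATORS[comparator]
    except KeyError:
        raise ValueError(f"unsupported comparator: {comparator!r}")
    return compare(observed, threshold)


# Comparator and threshold come from the manifest preview.
# The observed accuracies are made-up inputs, not measured results.
print(claim_holds(0.870, ">=", 0.864))  # True
print(claim_holds(0.850, ">=", 0.864))  # False
```

Because the comparator and threshold are fixed by the locked manifest, the only free input at verification time is the observed metric.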
README badge
[](https://registry.falsify.dev/9d028ef6fc3af63277921ceaab146359589298e734b146c57a4158d72676f39c)
Verify in CI
- uses: studio-11-co/prml-verify-action@v1
  with:
    mode: verdict
    expected-hash: 9d028ef6fc3af63277921ceaab146359589298e734b146c57a4158d72676f39c
github.com/studio-11-co/prml-verify-action
Verify this hash yourself
Paste your manifest YAML. The canonical hash must match 9d028ef6fc3a….
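The same comparison can be run offline. This is a rough sketch that assumes you already hold the canonical bytes of the manifest. The canonicalization rules are not stated on this page, so the sketch does not attempt them, and hashing the formatted preview text is not guaranteed to reproduce the digest. The helper names `sha256_hex` and `matches` are hypothetical.

```python
import hashlib
import hmac

# Digest shown on this page for the locked claim.
EXPECTED = "9d028ef6fc3af63277921ceaab146359589298e734b146c57a4158d72676f39c"


def sha256_hex(canonical_bytes: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(canonical_bytes).hexdigest()


def matches(canonical_bytes: bytes, expected_hex: str = EXPECTED) -> bool:
    """Compare the digest of `canonical_bytes` with `expected_hex`.

    Assumption: the caller supplies bytes that are already in the
    registry's canonical form; this function does no canonicalization.
    """
    return hmac.compare_digest(sha256_hex(canonical_bytes), expected_hex.lower())
```

A typical call would read the canonical manifest from disk in binary mode, for example `matches(open(path, "rb").read())`, where `path` points at a file you produced with the registry's canonicalizer.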
v0.2 RFC open for comment until 2026-05-22 23:59 UTC — comment on github.com/studio-11-co/falsify/issues. How to reach the editor: spec.falsify.dev/editor.