Trust and verification

This page explains what our public labels mean, what we benchmark internally, and where human review is still required. The short version: not every output is equal, and we label that openly.

Latest full gate

7/7

Full fixture checks passed in the latest published quality gate run.

Active recommendation

gemini-3.1-flash-lite-preview

Candidate meets the quality and latency/runtime guardrails.

Runtime comparison

7.97s

The candidate model was faster than the baseline across the current benchmark run.

Public status labels

Every public example uses one of these labels. The label is part of the product, not decoration.

Draft

Draft output

Useful for checking workflow fit, but not yet human-reviewed by the AutoDots team.

Current public count: 2

Reviewed

Reviewed sample

A team member has checked the sample against the source and recorded known limitations.

Current public count: 3

Verified

Verified catalogue entry

Approved against the current checklist version and ready for catalogue publication.

Current public count: 0

How the evidence is produced

Public trust data is generated from internal benchmark and translation quality gate reports, then paired with reviewed sample metadata before publication.

  • Benchmark summaries are generated from repository-controlled fixtures rather than customer files.
  • We publish the active model recommendation, high-level pass/fail counts, and runtime summaries, not raw prompts or private outputs.
  • Document AI is the default PDF extraction path, with Gemini used selectively for chunk repair, visual descriptions, and internal model benchmarking.
  • Fixture set: Full translation quality gate
  • Thinking level: MINIMAL
  • Runtime: 33.43 seconds
  • Checklist version: 2026.03
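The published figures (pass count, runtime, checklist version) summarise per-fixture results from the gate report. A minimal sketch of that aggregation, using hypothetical names (`FixtureResult`, `summarize`) and illustrative runtimes rather than the real report schema:

```python
from dataclasses import dataclass

@dataclass
class FixtureResult:
    name: str        # fixture identifier
    passed: bool     # did this fixture clear the gate?
    runtime_s: float # wall-clock time for this fixture

def summarize(results, checklist_version):
    """Collapse per-fixture results into the published headline numbers."""
    passed = sum(r.passed for r in results)
    return {
        "gate": f"{passed}/{len(results)}",
        "runtime_s": round(sum(r.runtime_s for r in results), 2),
        "checklist_version": checklist_version,
    }

# Illustrative data only; not the actual fixture set or timings.
results = [
    FixtureResult(f"fixture_{i}", True, t)
    for i, t in enumerate([5.2, 4.1, 6.0, 3.9, 4.7, 5.5, 4.0])
]
summary = summarize(results, "2026.03")
```

A report shaped like this yields the headline `7/7` pass count directly from the fixture list.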

Promotion rules

Reviewed

  • A team member compares the sample against the source and checks the main reading flow.
  • Known limitations are written down before the sample is published.

Verified

  • The reviewed checks pass again on the current pipeline version.
  • The sample clears the current checklist version and is approved for catalogue publication.
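The promotion rules above form a simple one-way ladder: Draft becomes Reviewed once limitations are recorded, and Reviewed becomes Verified only when the checks pass again on the current pipeline and checklist. A minimal sketch, with hypothetical names (`Label`, `promote`) standing in for whatever the real system uses:

```python
from enum import Enum

class Label(Enum):
    DRAFT = "draft"
    REVIEWED = "reviewed"
    VERIFIED = "verified"

def promote(label, *, has_review_notes=False, gate_passed=False,
            checklist_current=False):
    """Return the next label if the promotion conditions hold, else stay put."""
    if label is Label.DRAFT and has_review_notes:
        # A team member compared the sample against the source
        # and wrote down known limitations.
        return Label.REVIEWED
    if label is Label.REVIEWED and gate_passed and checklist_current:
        # Checks pass again on the current pipeline version and
        # the sample clears the current checklist version.
        return Label.VERIFIED
    return label
```

Note that a Reviewed sample stays Reviewed if the checklist version has moved on since its last pass; it must clear the current checklist to be Verified.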

Known limitations

  • Poor scans, handwriting, and dense layout PDFs can still require manual correction.
  • Mathematics, chemistry, and exam papers remain review-required even when the technical path succeeds.
  • Visual descriptions use source context where available, but labelled diagrams may still need human checking.

When human review is required

  • Any document that will be issued as an official exam, compliance artifact, or embosser-ready final.
  • Outputs containing dense notation, multi-part diagrams, or large tables.
  • Any case where the source itself is ambiguous, incomplete, or image-heavy.