QEAR API documentation.

Diagnostic uncertainty for AI you can trust. Use one HTTP endpoint to find out whether an answer from any LLM is grounded in real knowledge, or whether the model is making it up. Get a calibrated confidence score and a human-readable diagnosis you can act on.

Quickstart

QEAR is a single HTTPS endpoint. You send a question and (for verification) the candidate answer; QEAR returns a confidence score plus a diagnosis of the kind of uncertainty involved.

1. Get an API key

Sign up at app.qear.ai/signup with your email. After clicking the magic link, you can generate your first API key on the dashboard. The key is shown once — save it in a password manager.

2. Make your first call

Use the /v1/verify endpoint to check whether an answer is trustworthy. Replace qe_YOUR_KEY with the key you just generated.

Shell
curl https://api.qear.ai/v1/verify \
  -H "Authorization: Bearer qe_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What year did Napoleon die?",
    "answer": "1821"
  }'

Python
import requests

response = requests.post(
    "https://api.qear.ai/v1/verify",
    headers={"Authorization": "Bearer qe_YOUR_KEY"},
    json={
        "prompt": "What year did Napoleon die?",
        "answer": "1821",
    },
)

result = response.json()
print(result["verdict"], result["confidence"])
print(result["diagnosis"])

JavaScript
const response = await fetch(
  "https://api.qear.ai/v1/verify",
  {
    method: "POST",
    headers: {
      "Authorization": "Bearer qe_YOUR_KEY",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      prompt: "What year did Napoleon die?",
      answer: "1821",
    }),
  }
);

const result = await response.json();
console.log(result.verdict, result.confidence);
console.log(result.diagnosis);

3. Read the response

A typical successful response looks like this:

JSON
{
  "confidence": 0.92,
  "verdict": "high_confidence",
  "uncertainty_class": "none",
  "diagnosis": "All 5 valid candidates agreed semantically. High confidence in the consensus answer.",
  "recommended_action": "Trust the answer.",
  "consensus_answer": "1821",
  "customer_answer_relation": "ENTAILMENT",
  ...
}

The four fields you'll almost always use:

  • verdict — one of high_confidence, low_confidence, insufficient_information. Use this to decide whether to surface the answer to a user.
  • confidence — a number in [0.0, 1.0] for fine-grained thresholding.
  • uncertainty_class — the kind of uncertainty. Distinguishes "model doesn't know" from "model is making it up" from "candidates disagree." See Response schemas.
  • diagnosis — a human-readable explanation of why the score came out the way it did. Surface this to your engineers or to end users.
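
As a rough illustration of how these fields might drive a display decision, the sketch below gates on verdict first and then applies a numeric cutoff on confidence. The helper name should_surface and the 0.8 threshold are assumptions made for this example, not values QEAR prescribes.

```python
def should_surface(result: dict, min_confidence: float = 0.8) -> bool:
    """Return True when a QEAR result looks safe to show to a user.

    `min_confidence` is an illustrative threshold, not a QEAR default.
    """
    # Only high_confidence verdicts are candidates for display.
    if result.get("verdict") != "high_confidence":
        return False
    # Apply a stricter, application-specific cutoff on the numeric score.
    return float(result.get("confidence", 0.0)) >= min_confidence


# Sample shaped like the quickstart response.
sample = {
    "confidence": 0.92,
    "verdict": "high_confidence",
    "uncertainty_class": "none",
    "diagnosis": "All 5 valid candidates agreed semantically.",
}
print(should_surface(sample))        # True
print(should_surface(sample, 0.95))  # False
```

Keeping the threshold as a parameter lets each product surface pick its own risk tolerance.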

Tip: The free tier includes 1,000 verifications and 200 generations per month, enough to integrate and evaluate without paying. See Rate limits & quotas.

Authentication

Every request must include your API key in the Authorization header as a Bearer token. Keys look like qe_ followed by ~42 random characters.

HTTP
Authorization: Bearer qe_YOUR_KEY_HERE
Content-Type: application/json

Missing or invalid keys return 401 unauthorized. The most common causes are a typo in the key, a revoked key, or an omitted Bearer prefix.

Security: Never put your API key in client-side code, public repositories, or chat windows. Keys grant access to your quota and your billing. Store them in environment variables on your server. If a key is exposed, rotate it immediately from the dashboard.
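
One way to follow this advice is to read the key from the environment at request time and fail early if it is missing. This is a minimal sketch; the variable name QEAR_API_KEY and the helper auth_headers are assumptions for the example, not names QEAR defines.

```python
import os


def auth_headers(env_var: str = "QEAR_API_KEY") -> dict:
    """Build request headers from a key stored in the environment.

    The variable name QEAR_API_KEY is an assumption for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key.startswith("qe_"):
        # Fail early with a clear message instead of sending a bad request.
        raise RuntimeError(f"{env_var} is unset or does not look like a QEAR key")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Because the key is looked up on each call rather than at import time, a rotated key takes effect as soon as the environment is updated and the process restarted.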

Rotating a key

Go to the dashboard and click the rotate (↻) button next to your masked key. The old key is revoked instantly and a new one is shown once. Any code still using the old key starts receiving 401 immediately, so be ready to deploy the new key as soon as it is shown.

POST /v1/verify

Given a prompt (the question) and a candidate answer, returns a confidence score and diagnosis. This is the primary endpoint — use it whenever you have an AI-generated answer and want to know whether to trust it.

POST https://api.qear.ai/v1/verify

Request body

| Field | Type | Description |
|---|---|---|
| prompt (required) | string, max 8000 chars | The question or instruction. This is what an LLM would have been asked. |
| answer (required) | string | The candidate answer you want to verify. Typically, the output of an LLM call you already made. |
| source_model | string, optional | The model QEAR should use internally for verification (e.g. llama-3.3-70b-versatile). Defaults to a balanced model. Higher tiers can use more capable models. |
| confidence_target | number, optional, 0–1 | How confident QEAR should aim to be before returning. Higher values cascade more samples and cost more. Default 0.7. |
| byok_key | string, optional, Pro+ | Bring your own provider API key. When set, QEAR uses it to call the model on your behalf, bypassing QEAR's compute cost. See requires_byok on models. |
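
The constraints in this table can be checked locally before a request is sent, which avoids spending a call on a body that would be rejected. The sketch below is one possible approach; build_verify_body is a hypothetical helper, and it only enforces the limits stated above.

```python
from typing import Optional

MAX_PROMPT_CHARS = 8000  # limit stated for the prompt field


def build_verify_body(
    prompt: str,
    answer: str,
    confidence_target: Optional[float] = None,
    source_model: Optional[str] = None,
) -> dict:
    """Validate inputs locally and return a JSON-ready body for /v1/verify."""
    if not prompt or not answer:
        raise ValueError("prompt and answer are both required")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    body = {"prompt": prompt, "answer": answer}
    if confidence_target is not None:
        if not 0.0 <= confidence_target <= 1.0:
            raise ValueError("confidence_target must be between 0 and 1")
        body["confidence_target"] = confidence_target
    # Optional fields are omitted when unset so the server defaults apply.
    if source_model is not None:
        body["source_model"] = source_model
    return body
```

The returned dict can be passed directly as the json argument of requests.post.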

Example request

Shell
curl https://api.qear.ai/v1/verify \
  -H "Authorization: Bearer qe_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Who wrote the novel '\''The Master and Margarita'\''?",
    "answer": "Mikhail Bulgakov",
    "confidence_target": 0.8
  }'
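
Python
import os

import requests

# Assumes the key is exported as the QEAR_KEY environment variable.
QEAR_KEY = os.environ["QEAR_KEY"]

r = requests.post(
    "https://api.qear.ai/v1/verify",
    headers={"Authorization": f"Bearer {QEAR_KEY}"},
    json={
        "prompt": "Who wrote 'The Master and Margarita'?",
        "answer": "Mikhail Bulgakov",
        "confidence_target": 0.8,
    },
    timeout=30,  # requests has no default timeout; set one explicitly
)
data = r.json()  # parsed response body (see the example response)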
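
JavaScript
// Assumes a server-side runtime (e.g. Node.js) with the key in the
// QEAR_KEY environment variable. Never ship the key to a browser.
const QEAR_KEY = process.env.QEAR_KEY;

const r = await fetch("https://api.qear.ai/v1/verify", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${QEAR_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    prompt: "Who wrote 'The Master and Margarita'?",
    answer: "Mikhail Bulgakov",
    confidence_target: 0.8,
  }),
});
const data = await r.json();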

Example response

JSON
{
  "confidence": 0.94,
  "verdict": "high_confidence",
  "customer_answer_relation": "ENTAILMENT",
  "uncertainty_class": "none",
  "diagnosis": "All 5 valid candidates agreed semantically. High confidence in the consensus answer.",
  "recommended_action": "Trust the answer.",
  "consensus_answer": "Mikhail Bulgakov",
  "alternative_answers": [],
  "semantic_entropy": 0.0,
  "cascade": {
    "rounds": 1,
    "cumulative_samples": 5,
    "cluster_count": 1,
    "nli_calls": 0
  },
  "auxiliary_signals": {
    "default_fraction": 0.0,
    "refusal_fraction": 0.0
  },
  "usage": {
    "input_tokens": 142,
    "output_tokens": 38,
    "latency_ms": 812
  },
  "model": {
    "provider": "groq",
    "id": "llama-3.3-70b-versatile"
  }
}

See Response schemas for the full field reference, including all possible verdict and uncertainty_class values and what each means.

POST /v1/generate

Asks QEAR to generate an answer to a prompt and assess its own confidence in that answer. Unlike /v1/verify, you don't supply a candidate answer — QEAR produces one and returns it alongside the diagnostic. Use this when you don't have an LLM call of your own to verify.

POST https://api.qear.ai/v1/generate

Request body

| Field | Type | Description |
|---|---|---|
| prompt (required) | string, max 8000 chars | The question. QEAR will both answer it and assess its confidence in that answer. |
| source_model | string, optional | Model to use for generation. Defaults to llama-3.3-70b-versatile. |
| confidence_target | number, optional, 0–1 | How confident QEAR should aim to be before returning. Higher values cascade more samples. Default 0.7. |
| byok_key | string, optional, Pro+ | Bring your own provider API key for the underlying model calls. |

Example request

Shell
curl https://api.qear.ai/v1/generate \
  -H "Authorization: Bearer qe_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is the capital of Australia?"
  }'

Example response

The response shape is the same as /v1/verify, except there is no customer_answer_relation field (you didn't provide an answer). The consensus_answer is QEAR's best answer, and the verdict applies to that answer.

JSON
{
  "confidence": 0.96,
  "verdict": "high_confidence",
  "uncertainty_class": "none",
  "diagnosis": "All 5 valid candidates agreed semantically. High confidence in the consensus answer.",
  "recommended_action": "Trust the answer.",
  "consensus_answer": "Canberra",
  "alternative_answers": [],
  "semantic_entropy": 0.0,
  ...
}
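
For readers working in Python, the same /v1/generate call could be assembled roughly as follows. This sketch uses only the standard library and builds the request without sending it, so it runs without network access; make_generate_request is a hypothetical helper name, and the equivalent requests.post call from the earlier examples would work just as well.

```python
import json
import urllib.request


def make_generate_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/generate."""
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        "https://api.qear.ai/v1/generate",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = make_generate_request("What is the capital of Australia?", "qe_YOUR_KEY")

# Sending is left commented out so the sketch runs offline:
# with urllib.request.urlopen(req, timeout=30) as resp:
#     result = json.load(resp)
#     print(result["consensus_answer"], result["verdict"])
```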

Errors

All errors return a JSON body with an error.code machine-readable identifier and a human-readable error.message. The HTTP status code reflects the error class.

JSON (typical error)
{
  "error": {
    "code": "quota_exceeded",
    "message": "Monthly quota exceeded for this endpoint",
    "quota": {
      "current": 200,
      "limit": 200,
      "remaining": 0,
      "is_over_limit": true
    }
  },
  "request_id": "req_eb50bd77341e474b"
}

| HTTP | Code | Meaning |
|---|---|---|
| 400 | bad_request | The request body is malformed or missing required fields. Check the details field for specifics. |
| 400 | byok_required | The model you requested requires a BYOK (bring-your-own-key) configuration that wasn't provided. |
| 400 | byok_invalid | The BYOK key you provided was rejected by the underlying provider. |
| 401 | unauthorized | Missing or invalid API key. Check the Bearer header. |
| 403 | forbidden | Your tier doesn't include access to this model or feature. Upgrade to access it. |
| 413 | payload_too_large | Prompt or answer exceeds the 8000 character limit. |
| 429 | quota_exceeded | You've hit your monthly verify or generate quota. See Rate limits. |
| 500 | internal_error | Something unexpected went wrong on QEAR's side. Retries are safe. |
| 502 | provider_error | The upstream model provider (Groq, OpenAI, etc.) failed. Retries are safe. |

Errors include a request_id. Include it in any support email — it lets us look up exactly what happened.
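
A client might fold the error body into a small summary for logging and retry decisions. The sketch below is one way to do that under the error shape shown above; classify_error is a hypothetical helper, and it retries only the two codes the table describes as safe to retry.

```python
RETRYABLE_CODES = {"internal_error", "provider_error"}  # marked "Retries are safe"


def classify_error(status: int, body: dict) -> dict:
    """Summarize a QEAR error response for logging and retry decisions."""
    err = body.get("error", {})
    code = err.get("code", "unknown")
    return {
        "status": status,
        "code": code,
        "message": err.get("message", ""),
        # Keep the request_id so it can be quoted in a support email.
        "request_id": body.get("request_id"),
        "retry": code in RETRYABLE_CODES,
    }


sample = {
    "error": {
        "code": "quota_exceeded",
        "message": "Monthly quota exceeded for this endpoint",
    },
    "request_id": "req_eb50bd77341e474b",
}
info = classify_error(429, sample)
print(info["code"], info["retry"], info["request_id"])
# quota_exceeded False req_eb50bd77341e474b
```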

Rate limits & quotas

QEAR enforces two kinds of limits: per-minute rate limits to prevent abuse, and monthly quotas set by your tier. Both apply at the user level (not per API key), so generating more keys doesn't increase your allotment.

| Tier | /v1/verify | /v1/generate | Requests/min |
|---|---|---|---|
| free | 1,000 / mo | 200 / mo | 10 |
| indie ($19/mo) | 25,000 / mo | 5,000 / mo | 60 |
| pro ($79/mo) | 100,000 / mo | 25,000 / mo | 300 |
| scale ($299/mo) | 500,000 / mo | 150,000 / mo | 1,000 |
| enterprise (contact us) | unlimited | unlimited | 5,000 |

When you exceed a monthly quota, QEAR returns 429 quota_exceeded and stops processing requests until the start of the next calendar month (UTC). Upgrade your tier at the dashboard to lift the limit immediately.

No per-key limits: Quotas are computed per user, not per API key. You can create as many keys as you need (one per service, one per environment, etc.) without splitting your quota.
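
Since an exhausted monthly quota resets at the start of the next calendar month in UTC, a client that receives 429 quota_exceeded can compute when requests will be accepted again. The helper below is a small sketch of that date arithmetic; next_quota_reset is a hypothetical name, and it expects a timezone-aware datetime.

```python
from datetime import datetime, timezone


def next_quota_reset(now: datetime) -> datetime:
    """Return 00:00 UTC on the first day of the month after `now`."""
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        year, month = now.year + 1, 1  # December rolls over to January
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


print(next_quota_reset(datetime(2025, 12, 31, 23, 59, tzinfo=timezone.utc)))
# 2026-01-01 00:00:00+00:00
```

Subtracting datetime.now(timezone.utc) from the result gives the remaining wait, which can feed a scheduler or an alert.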

Response schemas

Both /v1/verify and /v1/generate return the same shape (verify additionally includes customer_answer_relation). The fields that drive integration decisions:

verdict

A single token summarizing whether the answer should be trusted.

| Value | What it means |
|---|---|
| high_confidence | The model is consistent and the customer answer is supported. Safe to surface. |
| low_confidence | The model is uncertain or the customer answer contradicts the consensus. Flag for human review. |
| insufficient_information | The model didn't produce enough usable candidates to assess. Often signals an out-of-scope or malformed prompt. |

uncertainty_class

The kind of uncertainty involved. This is the diagnostic signal that distinguishes "model doesn't know" from "model is making it up" — most use cases should branch on this field.

| Value | What it means |
|---|---|
| none | All candidates agreed. No uncertainty detected. |
| surface_variation | Candidates split into clusters, but the disagreement is just phrasing on the same core fact. |
| factual_disagreement | Candidates disagree on substance. The model has multiple incompatible answers. |
| knowledge_gap | The model is consistently signaling it doesn't have this information (refusals, "I don't know"). |
| degenerate_default | The model is hedging with defaults like 0, N/A, undefined. Not actually answering. |
| insufficient_information | No valid answers generated. The prompt may be malformed or outside the model's capability. |
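
Branching on uncertainty_class could look something like the sketch below. The keys come from the table above, but the action names on the right are purely illustrative and should be replaced with whatever your application does. Unknown values fall back to human review, since new uncertainty_class values may be added over time.

```python
# Illustrative mapping from uncertainty_class to an application-level action.
# The action names are hypothetical; only the keys come from the QEAR docs.
ACTIONS = {
    "none": "show",
    "surface_variation": "show",
    "factual_disagreement": "human_review",
    "knowledge_gap": "fallback_source",
    "degenerate_default": "retry_with_rephrased_prompt",
    "insufficient_information": "reject_prompt",
}


def action_for(result: dict) -> str:
    """Pick an action, treating missing or unknown classes conservatively."""
    return ACTIONS.get(result.get("uncertainty_class"), "human_review")


print(action_for({"uncertainty_class": "knowledge_gap"}))  # fallback_source
```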

customer_answer_relation

(Only on /v1/verify.) How the customer's submitted answer relates to QEAR's consensus answer.

| Value | What it means |
|---|---|
| ENTAILMENT | The customer answer matches or follows from the consensus. Likely correct. |
| NEUTRAL | The customer answer is neither clearly supported nor contradicted. Ambiguous relationship. |
| CONTRADICTION | The customer answer contradicts the consensus. Likely wrong. |
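
One possible policy built on this field is sketched below: keep the submitted answer on ENTAILMENT, propose the consensus answer on CONTRADICTION, and flag everything else for review. The helper resolve_answer and the policy itself are assumptions for illustration, not behavior QEAR defines.

```python
def resolve_answer(result: dict, submitted: str) -> tuple:
    """Return (answer_to_use, needs_review) based on customer_answer_relation."""
    relation = result.get("customer_answer_relation")
    consensus = result.get("consensus_answer")
    if relation == "ENTAILMENT":
        return submitted, False
    if relation == "CONTRADICTION" and consensus is not None:
        # Offer the consensus as a candidate replacement, but still flag it.
        return consensus, True
    # NEUTRAL, missing, or unrecognized relation: keep the answer, ask a human.
    return submitted, True


print(resolve_answer(
    {"customer_answer_relation": "CONTRADICTION", "consensus_answer": "1821"},
    "1815",
))  # ('1821', True)
```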

Diagnostic fields

| Field | Type | Description |
|---|---|---|
| confidence | number, 0–1 | Calibrated confidence in the answer. Higher = more reliable. Use for fine-grained thresholding. |
| diagnosis | string | Human-readable explanation of how the verdict was reached. Surfaces well in error messages and audit logs. |
| recommended_action | string | What QEAR thinks you should do with this answer. |
| consensus_answer | string | QEAR's best-guess answer, derived from the candidate consensus. May be null if no candidates agreed. |
| alternative_answers | string[] | Up to 3 alternative answers from other clusters (when candidates disagreed). |
| semantic_entropy | number, 0–1 | Normalized entropy across answer clusters. 0 = unanimous, 1 = maximally split. |
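
To build intuition for semantic_entropy, the sketch below computes one common form of normalized entropy: Shannon entropy over cluster proportions, divided by the log of the number of clusters. The exact formula QEAR uses is not stated here, so treat this normalization as an assumption; it merely reproduces the documented endpoints (0 for unanimous, 1 for an even split).

```python
import math


def normalized_entropy(cluster_sizes: list) -> float:
    """Shannon entropy of cluster proportions, scaled to [0, 1].

    Assumption: normalize by log(number of clusters). QEAR's actual
    formula may differ.
    """
    sizes = [s for s in cluster_sizes if s > 0]
    if len(sizes) <= 1:
        return 0.0  # unanimous (or empty): no disagreement to measure
    total = sum(sizes)
    entropy = -sum((s / total) * math.log(s / total) for s in sizes)
    return entropy / math.log(len(sizes))


print(normalized_entropy([5]))  # 0.0 (all five candidates in one cluster)
```

An uneven split such as four candidates in one cluster and one in another lands strictly between the two endpoints.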

Cascade & usage metadata

Auxiliary fields useful for debugging and cost tracking.

| Field | Description |
|---|---|
| cascade.rounds | How many sampling rounds were needed to reach the confidence target. Higher = harder question. |
| cascade.cumulative_samples | Total candidate answers generated across all rounds. |
| cascade.cluster_count | Number of semantically distinct answer clusters. |
| cascade.nli_calls | Number of NLI (entailment) classifier calls used to merge clusters. |
| usage.input_tokens / output_tokens | Token usage from the underlying model. |
| usage.latency_ms | End-to-end response time. |
| model.provider / id | The model that did the work (e.g. groq / llama-3.3-70b-versatile). |

Stable surface: Field names and value enums are stable. We may add new fields or new uncertainty_class values, but won't remove or rename existing ones without a major version bump.
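
For cost tracking, the cascade and usage fields can be rolled up across a batch of responses. The following is a minimal sketch; summarize_usage is a hypothetical helper, and it tolerates responses where a field is absent.

```python
def summarize_usage(results: list) -> dict:
    """Total token counts, samples, and mean latency across QEAR responses."""
    usages = [r.get("usage", {}) for r in results]
    latencies = [u["latency_ms"] for u in usages if "latency_ms" in u]
    return {
        "requests": len(results),
        "input_tokens": sum(u.get("input_tokens", 0) for u in usages),
        "output_tokens": sum(u.get("output_tokens", 0) for u in usages),
        "total_samples": sum(
            r.get("cascade", {}).get("cumulative_samples", 0) for r in results
        ),
        # None when no response carried a latency figure.
        "mean_latency_ms": sum(latencies) / len(latencies) if latencies else None,
    }


batch = [
    {"usage": {"input_tokens": 142, "output_tokens": 38, "latency_ms": 812},
     "cascade": {"cumulative_samples": 5}},
    {"usage": {"input_tokens": 100, "output_tokens": 20, "latency_ms": 600},
     "cascade": {"cumulative_samples": 10}},
]
print(summarize_usage(batch))
```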