QEAR API documentation.
Diagnostic uncertainty for AI you can trust. Use one HTTP endpoint to find out whether an answer from any LLM is grounded in real knowledge, or whether the model is making it up. Get a calibrated confidence score and a human-readable diagnosis you can act on.
Quickstart
QEAR is a single HTTPS endpoint. You send a question and (for verification) the candidate answer; QEAR returns a confidence score plus a diagnosis of the kind of uncertainty involved.
1. Get an API key
Sign up at app.qear.ai/signup with your email. After clicking the magic link, you can generate your first API key on the dashboard. The key is shown once — save it in a password manager.
2. Make your first call
Use the /v1/verify endpoint to check whether an answer is
trustworthy. Replace qe_YOUR_KEY with the key you just generated.
curl https://api.qear.ai/v1/verify \
  -H "Authorization: Bearer qe_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What year did Napoleon die?",
    "answer": "1821"
  }'

import requests

response = requests.post(
    "https://api.qear.ai/v1/verify",
    headers={"Authorization": "Bearer qe_YOUR_KEY"},
    json={
        "prompt": "What year did Napoleon die?",
        "answer": "1821",
    },
)
result = response.json()
print(result["verdict"], result["confidence"])
print(result["diagnosis"])

const response = await fetch(
  "https://api.qear.ai/v1/verify",
  {
    method: "POST",
    headers: {
      "Authorization": "Bearer qe_YOUR_KEY",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      prompt: "What year did Napoleon die?",
      answer: "1821",
    }),
  }
);
const result = await response.json();
console.log(result.verdict, result.confidence);
console.log(result.diagnosis);
3. Read the response
A typical successful response looks like this:
{
"confidence": 0.92,
"verdict": "high_confidence",
"uncertainty_class": "none",
"diagnosis": "All 5 valid candidates agreed semantically. High confidence in the consensus answer.",
"recommended_action": "Trust the answer.",
"consensus_answer": "1821",
"customer_answer_relation": "ENTAILMENT",
...
}
The four fields you'll almost always use:
- verdict: one of high_confidence, low_confidence, insufficient_information. Use this to decide whether to surface the answer to a user.
- confidence: a number in [0.0, 1.0] for fine-grained thresholding.
- uncertainty_class: the kind of uncertainty. Distinguishes "model doesn't know" from "model is making it up" from "candidates disagree." See Response schemas.
- diagnosis: a human-readable explanation of why the score came out the way it did. Surface this to your engineers or to end users.
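As a rough illustration of how these fields might drive a decision, the sketch below gates an answer on verdict plus a confidence threshold. The function name should_surface and the 0.8 threshold are assumptions made for this example; they are not part of the QEAR API.

```python
def should_surface(result: dict, min_confidence: float = 0.8) -> bool:
    """Return True if the answer looks safe to show to a user.

    `result` is the parsed JSON body of a /v1/verify response. The 0.8
    default is an arbitrary choice for this sketch, not a QEAR default.
    """
    if result.get("verdict") != "high_confidence":
        return False
    return float(result.get("confidence", 0.0)) >= min_confidence


# Field values copied from the quickstart response example.
example = {
    "confidence": 0.92,
    "verdict": "high_confidence",
    "uncertainty_class": "none",
    "diagnosis": "All 5 valid candidates agreed semantically. "
                 "High confidence in the consensus answer.",
}

if should_surface(example):
    print("show answer")
else:
    print("hold for review:", example["diagnosis"])
```

Checking verdict first and confidence second keeps the coarse decision in QEAR's hands while still letting you tighten the bar for sensitive use cases.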
Authentication
Every request must include your API key in the Authorization
header as a Bearer token. Keys look like qe_ followed by ~42
random characters.
Authorization: Bearer qe_YOUR_KEY_HERE
Content-Type: application/json
Missing or invalid keys return 401 unauthorized. The most common
causes are: typo in the key, key was revoked, or the Bearer
prefix was omitted.
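Two of those causes (a stray typo and a missing Bearer prefix) can be caught before a request is sent. The helper below is a minimal sketch; build_headers is a hypothetical name, and the prefix check is only a local sanity check, since the server remains the authority on whether a key is valid.

```python
def build_headers(api_key: str) -> dict:
    """Build the request headers described above for a QEAR call."""
    key = api_key.strip()  # drop whitespace picked up by copy/paste
    if not key.startswith("qe_"):
        # Local sanity check only; a key that passes may still be revoked.
        raise ValueError("QEAR keys are expected to start with 'qe_'")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Building the header in one place also means the Bearer prefix cannot be forgotten at individual call sites.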
Rotating a key
Go to the dashboard and click the rotate (↻) button next to your masked key. The old key is revoked instantly and a new one is shown once. Any code still using the old key will start receiving 401 immediately, so be ready to deploy the new key as soon as you rotate.
POST /v1/verify
Given a prompt (the question) and a candidate answer, returns a confidence score and diagnosis. This is the primary endpoint — use it whenever you have an AI-generated answer and want to know whether to trust it.
Request body
| Field | Description |
|---|---|
| prompt (required; string, max 8000 chars) | The question or instruction. This is what an LLM would have been asked. |
| answer (required; string) | The candidate answer you want to verify. Typically, the output of an LLM call you already made. |
| source_model (optional; string) | The model QEAR should use internally for verification (e.g. llama-3.3-70b-versatile). Defaults to a balanced model. Higher tiers can use more capable models. |
| confidence_target (optional; number, 0–1) | How confident QEAR should aim to be before returning. Higher values cascade more samples and cost more. Default 0.7. |
| byok_key (optional; string, Pro+) | Bring your own provider API key. When set, QEAR uses it to call the model on your behalf, bypassing QEAR's compute cost. See requires_byok on models. |
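The constraints in this table can be checked client-side before a request is sent. The following is a minimal sketch under that assumption; build_verify_payload is a hypothetical helper, and the server's own validation remains authoritative.

```python
MAX_PROMPT_CHARS = 8000  # prompt limit stated in the table above


def build_verify_payload(prompt, answer, source_model=None,
                         confidence_target=None, byok_key=None):
    """Assemble a /v1/verify request body, omitting unset optional fields."""
    if not prompt or not answer:
        raise ValueError("prompt and answer are both required")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if confidence_target is not None and not 0.0 <= confidence_target <= 1.0:
        raise ValueError("confidence_target must be between 0 and 1")

    payload = {"prompt": prompt, "answer": answer}
    optional = {
        "source_model": source_model,
        "confidence_target": confidence_target,
        "byok_key": byok_key,
    }
    # Leave out anything the caller did not set so server defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Omitting unset fields, rather than sending null, lets the documented defaults (such as a confidence_target of 0.7) take effect.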
Example request
curl https://api.qear.ai/v1/verify \
  -H "Authorization: Bearer qe_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Who wrote the novel '\''The Master and Margarita'\''?",
    "answer": "Mikhail Bulgakov",
    "confidence_target": 0.8
  }'

import os
import requests

# QEAR_API_KEY is an assumed environment variable name; use whatever you store the key under.
QEAR_KEY = os.environ["QEAR_API_KEY"]

r = requests.post(
    "https://api.qear.ai/v1/verify",
    headers={"Authorization": f"Bearer {QEAR_KEY}"},
    json={
        "prompt": "Who wrote 'The Master and Margarita'?",
        "answer": "Mikhail Bulgakov",
        "confidence_target": 0.8,
    },
)
data = r.json()

// QEAR_API_KEY is an assumed environment variable name (Node.js).
const QEAR_KEY = process.env.QEAR_API_KEY;

const r = await fetch("https://api.qear.ai/v1/verify", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${QEAR_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    prompt: "Who wrote 'The Master and Margarita'?",
    answer: "Mikhail Bulgakov",
    confidence_target: 0.8,
  }),
});
const data = await r.json();
Example response
{
"confidence": 0.94,
"verdict": "high_confidence",
"customer_answer_relation": "ENTAILMENT",
"uncertainty_class": "none",
"diagnosis": "All 5 valid candidates agreed semantically. High confidence in the consensus answer.",
"recommended_action": "Trust the answer.",
"consensus_answer": "Mikhail Bulgakov",
"alternative_answers": [],
"semantic_entropy": 0.0,
"cascade": {
"rounds": 1,
"cumulative_samples": 5,
"cluster_count": 1,
"nli_calls": 0
},
"auxiliary_signals": {
"default_fraction": 0.0,
"refusal_fraction": 0.0
},
"usage": {
"input_tokens": 142,
"output_tokens": 38,
"latency_ms": 812
},
"model": {
"provider": "groq",
"id": "llama-3.3-70b-versatile"
}
}
See Response schemas for the full field reference,
including all possible verdict and uncertainty_class
values and what each means.
POST /v1/generate
Asks QEAR to generate an answer to a prompt and assess its own
confidence in that answer. Unlike /v1/verify, you don't supply a
candidate answer — QEAR produces one and returns it alongside the diagnostic.
Use this when you don't have an LLM call of your own to verify.
Request body
| Field | Description |
|---|---|
| prompt (required; string, max 8000 chars) | The question. QEAR will both answer it and assess its confidence in that answer. |
| source_model (optional; string) | Model to use for generation. Defaults to llama-3.3-70b-versatile. |
| confidence_target (optional; number, 0–1) | How confident QEAR should aim to be before returning. Higher values cascade more samples. Default 0.7. |
| byok_key (optional; string, Pro+) | Bring your own provider API key for the underlying model calls. |
Example request
curl https://api.qear.ai/v1/generate \
  -H "Authorization: Bearer qe_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is the capital of Australia?"
  }'
Example response
The response shape is the same as /v1/verify, except there is no
customer_answer_relation field (you didn't provide an answer).
The consensus_answer is QEAR's best answer, and the verdict
applies to that answer.
{
"confidence": 0.96,
"verdict": "high_confidence",
"uncertainty_class": "none",
"diagnosis": "All 5 valid candidates agreed semantically. High confidence in the consensus answer.",
"recommended_action": "Trust the answer.",
"consensus_answer": "Canberra",
"alternative_answers": [],
"semantic_entropy": 0.0,
...
}
Errors
All errors return a JSON body with a machine-readable error.code and a
human-readable error.message. The HTTP status code reflects the error
class.
{
"error": {
"code": "quota_exceeded",
"message": "Monthly quota exceeded for this endpoint",
"quota": {
"current": 200,
"limit": 200,
"remaining": 0,
"is_over_limit": true
}
},
"request_id": "req_eb50bd77341e474b"
}
| HTTP | Code | Meaning |
|---|---|---|
| 400 | bad_request | The request body is malformed or missing required fields. Check the details field for specifics. |
| 400 | byok_required | The model you requested requires a BYOK (bring-your-own-key) configuration that wasn't provided. |
| 400 | byok_invalid | The BYOK key you provided was rejected by the underlying provider. |
| 401 | unauthorized | Missing or invalid API key. Check the Bearer header. |
| 403 | forbidden | Your tier doesn't include access to this model or feature. Upgrade to access it. |
| 413 | payload_too_large | Prompt or answer exceeds the 8000 character limit. |
| 429 | quota_exceeded | You've hit your monthly verify or generate quota. See Rate limits. |
| 500 | internal_error | Something unexpected went wrong on QEAR's side. Retries are safe. |
| 502 | provider_error | The upstream model provider (Groq, OpenAI, etc.) failed. Retries are safe. |
Errors include a request_id. Include it in any support email — it
lets us look up exactly what happened.
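One way to act on this error format is to reduce each failure to a small record that says whether a retry is worthwhile and carries the request_id for support. The sketch below assumes the body shape shown in the example above; summarize_error is a hypothetical helper, and the retry set contains only the two codes the table describes as safe to retry.

```python
# Codes the error table documents as safe to retry.
RETRYABLE_CODES = {"internal_error", "provider_error"}


def summarize_error(status: int, body: dict) -> dict:
    """Flatten a QEAR error response into a record suitable for logging."""
    err = body.get("error", {})
    code = err.get("code", "unknown")
    return {
        "status": status,
        "code": code,
        "message": err.get("message", ""),
        "request_id": body.get("request_id"),  # quote this in support emails
        "retryable": code in RETRYABLE_CODES,
    }


# Error body copied from the example above.
quota_error = {
    "error": {
        "code": "quota_exceeded",
        "message": "Monthly quota exceeded for this endpoint",
        "quota": {"current": 200, "limit": 200, "remaining": 0, "is_over_limit": True},
    },
    "request_id": "req_eb50bd77341e474b",
}
print(summarize_error(429, quota_error))
```

A quota_exceeded error is deliberately not treated as retryable here, since retrying cannot succeed until the quota resets or the tier changes.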
Rate limits & quotas
QEAR enforces two kinds of limits: per-minute rate limits to prevent abuse, and monthly quotas set by your tier. Both apply at the user level (not per API key), so generating more keys doesn't increase your allotment.
| Tier | /v1/verify | /v1/generate | Requests/min |
|---|---|---|---|
| free | 1,000 / mo | 200 / mo | 10 |
| indie ($19/mo) | 25,000 / mo | 5,000 / mo | 60 |
| pro ($79/mo) | 100,000 / mo | 25,000 / mo | 300 |
| scale ($299/mo) | 500,000 / mo | 150,000 / mo | 1,000 |
| enterprise (contact) | unlimited | unlimited | 5,000 |
When you exceed a monthly quota, QEAR returns 429 quota_exceeded
and stops processing requests until the start of the next calendar month (UTC).
Upgrade your tier on the dashboard to lift the limit immediately.
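For client-side planning, the two limits translate into two small calculations: when a monthly quota resets, and how far apart to space requests to stay under the per-minute limit. The sketch below is built on assumptions: that "start of the next calendar month (UTC)" means 00:00 UTC on the 1st, and that spacing requests evenly is an acceptable pacing strategy. Both function names are hypothetical.

```python
from datetime import datetime, timezone

# Requests per minute, copied from the tier table above.
REQUESTS_PER_MINUTE = {
    "free": 10, "indie": 60, "pro": 300, "scale": 1000, "enterprise": 5000,
}


def next_quota_reset(now: datetime) -> datetime:
    """Return 00:00 UTC on the first day of the month after `now`.

    `now` must be timezone-aware; it is converted to UTC first.
    """
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        return datetime(now.year + 1, 1, 1, tzinfo=timezone.utc)
    return datetime(now.year, now.month + 1, 1, tzinfo=timezone.utc)


def min_interval_seconds(tier: str) -> float:
    """Seconds to wait between requests when spacing them evenly."""
    return 60.0 / REQUESTS_PER_MINUTE[tier]


print(next_quota_reset(datetime.now(timezone.utc)).isoformat())
print(min_interval_seconds("free"))
```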
Response schemas
Both /v1/verify and /v1/generate return the same
shape (verify additionally includes customer_answer_relation).
The fields that drive integration decisions:
verdict
A single token summarizing whether the answer should be trusted.
| Value | What it means |
|---|---|
| high_confidence | The model is consistent and the customer answer is supported. Safe to surface. |
| low_confidence | The model is uncertain or the customer answer contradicts the consensus. Flag for human review. |
| insufficient_information | The model didn't produce enough usable candidates to assess. Often signals an out-of-scope or malformed prompt. |
uncertainty_class
The kind of uncertainty involved. This is the diagnostic signal that distinguishes "model doesn't know" from "model is making it up" — most use cases should branch on this field.
| Value | What it means |
|---|---|
| none | All candidates agreed. No uncertainty detected. |
| surface_variation | Candidates split into clusters, but the disagreement is just phrasing on the same core fact. |
| factual_disagreement | Candidates disagree on substance. The model has multiple incompatible answers. |
| knowledge_gap | The model is consistently signaling it doesn't have this information (refusals, "I don't know"). |
| degenerate_default | The model is hedging with defaults like 0, N/A, undefined. Not actually answering. |
| insufficient_information | No valid answers generated. The prompt may be malformed or outside the model's capability. |
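Branching on this field can be as simple as a lookup table. In the sketch below, the action names on the right are hypothetical placeholders for whatever your application does in each case; only the keys on the left come from the table above. Unrecognized values fall back to human review so that a value added later does not crash the integration.

```python
# Keys are the documented uncertainty_class values; the action names are
# placeholders chosen for this example, not QEAR recommendations.
ACTIONS = {
    "none": "accept",
    "surface_variation": "accept",
    "factual_disagreement": "human_review",
    "knowledge_gap": "use_fallback_source",
    "degenerate_default": "rephrase_and_retry",
    "insufficient_information": "reject_prompt",
}


def route(result: dict) -> str:
    """Map a QEAR response to an application-level action."""
    return ACTIONS.get(result.get("uncertainty_class"), "human_review")


print(route({"uncertainty_class": "knowledge_gap"}))
```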
customer_answer_relation
(Only on /v1/verify.) How the customer's submitted answer
relates to QEAR's consensus answer.
| Value | What it means |
|---|---|
| ENTAILMENT | The customer answer matches or follows from the consensus. Likely correct. |
| NEUTRAL | The customer answer is neither clearly supported nor contradicted. Ambiguous relationship. |
| CONTRADICTION | The customer answer contradicts the consensus. Likely wrong. |
Diagnostic fields
| Field | Description |
|---|---|
| confidence (number, 0–1) | Calibrated confidence in the answer. Higher = more reliable. Use for fine-grained thresholding. |
| diagnosis (string) | Human-readable explanation of how the verdict was reached. Surfaces well in error messages and audit logs. |
| recommended_action (string) | What QEAR thinks you should do with this answer. |
| consensus_answer (string) | QEAR's best-guess answer, derived from the candidate consensus. May be null if no candidates agreed. |
| alternative_answers (string[]) | Up to 3 alternative answers from other clusters (when candidates disagreed). |
| semantic_entropy (number, 0–1) | Normalized entropy across answer clusters. 0 = unanimous, 1 = maximally split. |
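To build intuition for semantic_entropy, the sketch below computes one common form of normalized entropy from cluster sizes: Shannon entropy divided by the log of the number of clusters. This documentation does not state QEAR's exact formula, so the normalization used here is an assumption and the numbers may not match what the API returns.

```python
import math


def normalized_entropy(cluster_sizes: list[int]) -> float:
    """Shannon entropy of the cluster distribution, scaled to [0, 1].

    Assumption: normalizes by log(number of non-empty clusters). QEAR's
    actual normalization is not documented here and may differ.
    """
    sizes = [s for s in cluster_sizes if s > 0]
    if len(sizes) <= 1:
        return 0.0  # a single cluster means unanimous agreement
    total = sum(sizes)
    entropy = -sum((s / total) * math.log(s / total) for s in sizes)
    return entropy / math.log(len(sizes))


print(normalized_entropy([5]))        # all five candidates in one cluster
print(normalized_entropy([4, 1]))     # one dissenting candidate
print(normalized_entropy([1, 1, 1, 1, 1]))  # every candidate disagrees
```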
Cascade & usage metadata
Auxiliary fields useful for debugging and cost tracking.
| Field | Description |
|---|---|
| cascade.rounds | How many sampling rounds were needed to reach the confidence target. Higher = harder question. |
| cascade.cumulative_samples | Total candidate answers generated across all rounds. |
| cascade.cluster_count | Number of semantically distinct answer clusters. |
| cascade.nli_calls | Number of NLI (entailment) classifier calls used to merge clusters. |
| usage.input_tokens / output_tokens | Token usage from the underlying model. |
| usage.latency_ms | End-to-end response time. |
| model.provider / id | The model that did the work (e.g. groq / llama-3.3-70b-versatile). |
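For cost tracking, these fields can be summed over a batch of responses. The following is a minimal sketch, assuming each item is a parsed response body with the cascade and usage objects shown in the /v1/verify example; summarize_usage is a hypothetical helper, and missing fields are treated as zero.

```python
def summarize_usage(results: list[dict]) -> dict:
    """Total token, sample, and latency figures across QEAR responses."""
    totals = {"calls": 0, "input_tokens": 0, "output_tokens": 0, "samples": 0}
    latencies = []
    for r in results:
        usage = r.get("usage", {})
        totals["calls"] += 1
        totals["input_tokens"] += usage.get("input_tokens", 0)
        totals["output_tokens"] += usage.get("output_tokens", 0)
        totals["samples"] += r.get("cascade", {}).get("cumulative_samples", 0)
        if "latency_ms" in usage:
            latencies.append(usage["latency_ms"])
    # Mean latency is None when no response reported a latency.
    totals["mean_latency_ms"] = sum(latencies) / len(latencies) if latencies else None
    return totals


# Values copied from the /v1/verify example response.
sample = {
    "cascade": {"rounds": 1, "cumulative_samples": 5, "cluster_count": 1, "nli_calls": 0},
    "usage": {"input_tokens": 142, "output_tokens": 38, "latency_ms": 812},
}
print(summarize_usage([sample, sample]))
```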
QEAR may add new uncertainty_class values, but won't remove or rename
existing ones without a major version bump.