GET /health

Checks whether the server is running and returns its status and active model version. Requests authenticated with the account's operator key receive an extended diagnostic view.

Public response (no API key)

curl https://dish-embed.latimal.com/health
{
  "status": "healthy",
  "model": "latimal-food-embed-v1"
}

Operator diagnostics

Calls made with the account's operator key return an extended view with model load state, inference backend, and cache utilization. Regular customer keys receive the standard {status, model} response shown above.

curl https://dish-embed.latimal.com/health -H "X-API-Key: OPERATOR_KEY"
| Field | Type | Description |
|---|---|---|
| status | string | Server status ("healthy") |
| model | string | Embedding model version (e.g. latimal-food-embed-v1) |
| biencoder | string | Embedding service status |
| backend | string | Inference backend (torch or onnx) |
| reranker | string or null | Dedup precision-stage status, null if not loaded |
| search_reranker | string or null | Search precision-stage status, null if not loaded |
| classifier | string or null | Cuisine classifier status, null if not loaded |
| supported_dimensions | int[] | Available embedding dimensions |
| embedding_cache_size | int | Number of cached embeddings (server-side) |
| load_time_seconds | float | Server startup time in seconds |
{
  "status": "healthy",
  "model": "latimal-food-embed-v1",
  "biencoder": "loaded",
  "backend": "onnx",
  "reranker": "loaded",
  "search_reranker": "loaded",
  "classifier": "loaded",
  "supported_dimensions": [128, 256, 384],
  "embedding_cache_size": 12847,
  "load_time_seconds": 51.3
}
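
As a rough illustration of how a monitoring script might consume the operator payload, the Python sketch below lists which optional components are reported as null. The helper names (missing_components, is_operator_view) are hypothetical and not part of the API, and using the presence of the biencoder key to tell the extended view from the public one is only an assumed heuristic.

```python
from __future__ import annotations

import json

# Fields the table above documents as "string or null".
OPTIONAL_COMPONENTS = ("reranker", "search_reranker", "classifier")


def missing_components(diagnostics: dict) -> list[str]:
    """Return optional components that are null (or absent) in the payload.

    Hypothetical helper. An absent key is treated like null, so a public
    {status, model} response reports all three components as missing.
    """
    return [name for name in OPTIONAL_COMPONENTS if diagnostics.get(name) is None]


def is_operator_view(payload: dict) -> bool:
    """Assumed heuristic: only the extended view carries a biencoder field."""
    return "biencoder" in payload


# Sample payload copied from the operator response shown above.
sample = json.loads("""
{
  "status": "healthy",
  "model": "latimal-food-embed-v1",
  "biencoder": "loaded",
  "backend": "onnx",
  "reranker": "loaded",
  "search_reranker": "loaded",
  "classifier": "loaded",
  "supported_dimensions": [128, 256, 384],
  "embedding_cache_size": 12847,
  "load_time_seconds": 51.3
}
""")

if __name__ == "__main__":
    print(is_operator_view(sample))    # True
    print(missing_components(sample))  # []
```

A script like this could alert when a component expected in production, such as the classifier, shows up as null after a deploy.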

Availability

Free and public. This request is not metered and does not count against your monthly quota.

Notes

  • Use the public endpoint for uptime monitoring and health checks.
  • The operator diagnostics view is useful for confirming which models are loaded and checking cache utilization.
  • A 503 response means the server is still loading models at startup. Retry after a few seconds.
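
The retry advice for 503 responses can be sketched as a small polling loop. This is a minimal example using only the Python standard library; the function names, the attempt count, and the delay are assumptions chosen for illustration, not values the service prescribes. Connection errors (urllib.error.URLError) are deliberately left uncaught so that a genuinely unreachable host fails loudly.

```python
import json
import time
import urllib.error
import urllib.request

HEALTH_URL = "https://dish-embed.latimal.com/health"


def fetch_health(url=HEALTH_URL, api_key=None, timeout=5.0):
    """Perform one GET /health call and return (http_status, body_or_None)."""
    # The X-API-Key header is only sent when a key is supplied.
    headers = {"X-API-Key": api_key} if api_key else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; hand the status code back to the caller.
        return err.code, None


def wait_until_healthy(fetch=fetch_health, attempts=30, delay_seconds=3.0,
                       sleep=time.sleep):
    """Poll until the server reports healthy, retrying only on HTTP 503.

    The attempts and delay_seconds defaults are illustrative assumptions.
    """
    for attempt in range(attempts):
        status_code, body = fetch()
        if status_code == 200 and body and body.get("status") == "healthy":
            return body
        if status_code != 503:
            raise RuntimeError(
                f"unexpected /health response: HTTP {status_code}, body={body!r}"
            )
        # 503: models are still loading, so wait before the next attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"server still loading after {attempts} attempts")


if __name__ == "__main__":
    print(wait_until_healthy())
```

Passing fetch and sleep as parameters keeps the loop easy to exercise without network access. To poll the operator view instead, one option is to pass fetch=lambda: fetch_health(api_key="OPERATOR_KEY").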