Embedding Dimensions
Variable-size output
The /embed endpoint can return vectors at 128, 256, or 384 dimensions, chosen per request with no need to re-run anything. Smaller vectors cost less to store and search, for a small and measurable accuracy tradeoff (see Quality comparison). Pick the size that fits your storage and accuracy budget.
Choosing a dimension
| Dimension | Storage per item | Use case |
|---|---|---|
| 128 | 0.5 KB | Cost-sensitive dedup, large catalogs, fast retrieval |
| 256 | 1.0 KB | Balanced quality and cost |
| 384 | 1.5 KB | Highest quality, fine-grained distinction |
128 dimensions
Good enough for most dedup and search tasks. Catches obvious duplicates ("Chiken Biryani" vs "Chicken Biryani") and handles broad search queries well. Use this when you have millions of items and storage or latency matters.
256 dimensions
Middle ground. Slightly better at distinguishing similar items, without a meaningful cost increase for most applications.
384 dimensions
Best accuracy. Use this when precision matters, for example distinguishing "Latte" from "Mocha" or "Butter Chicken" from "Chicken Butter Masala".
Storage math
For a catalog of 1 million items:
- 128d: ~512 MB
- 256d: ~1 GB
- 384d: ~1.5 GB
These are raw vector sizes, which work out to 4 bytes per dimension (consistent with 32-bit floats). Your vector database adds overhead for indexing (typically 20-50% more).
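The arithmetic above can be reproduced for any catalog size. The following is a minimal sketch, assuming 4 bytes per value (inferred from the 0.5 KB figure for 128 dimensions, not a documented guarantee). The helper names `raw_vector_bytes` and `with_index_overhead` are illustrative and not part of the API.

```python
# Assumption: each dimension is stored as a 4-byte value (e.g. float32),
# which matches 0.5 KB per item at 128 dimensions.
BYTES_PER_VALUE = 4


def raw_vector_bytes(n_items: int, dimension: int) -> int:
    """Raw storage for n_items vectors, before any index overhead."""
    return n_items * dimension * BYTES_PER_VALUE


def with_index_overhead(raw_bytes: int, overhead_fraction: float) -> int:
    """Add index overhead expressed as a fraction of raw size (0.2 means 20%)."""
    return round(raw_bytes * (1 + overhead_fraction))


if __name__ == "__main__":
    for dim in (128, 256, 384):
        raw = raw_vector_bytes(1_000_000, dim)
        low = with_index_overhead(raw, 0.2)
        high = with_index_overhead(raw, 0.5)
        print(f"{dim}d: {raw / 1e6:.0f} MB raw, "
              f"roughly {low / 1e6:.0f}-{high / 1e6:.0f} MB with index overhead")
```

The actual overhead depends on your vector database and index type, so treat the 20-50% range as a planning estimate rather than a measurement.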
How to specify dimension
Pass the dimension parameter when calling /embed:
# Compact embeddings for large-scale dedup
resp = requests.post(f"{BASE}/embed", headers=headers,
                     json={"items": menu_items, "dimension": 128})
# High-quality embeddings for precise matching
resp = requests.post(f"{BASE}/embed", headers=headers,
                     json={"items": menu_items, "dimension": 384})

The dimension parameter applies to /embed, where you store the vectors yourself. The /search, /match, /dedup, and /report endpoints handle retrieval for you, so there is no dimension to set on those.
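To catch an unsupported size before a request goes out, the request body can be built by a small client-side helper. This is only a sketch: `build_embed_payload` and `SUPPORTED_DIMENSIONS` are hypothetical names, and the only facts taken from this page are the `items` and `dimension` fields and the three supported sizes. How the server itself responds to an invalid dimension is not covered here.

```python
# The three output sizes listed on this page.
SUPPORTED_DIMENSIONS = (128, 256, 384)


def build_embed_payload(items, dimension):
    """Return the JSON body for /embed, rejecting unsupported dimensions early."""
    if dimension not in SUPPORTED_DIMENSIONS:
        raise ValueError(
            f"dimension must be one of {SUPPORTED_DIMENSIONS}, got {dimension!r}"
        )
    return {"items": list(items), "dimension": dimension}


if __name__ == "__main__":
    payload = build_embed_payload(["Chicken Biryani", "Butter Chicken"], 128)
    print(payload)
    # The payload can then be sent as in the example above, e.g.:
    # resp = requests.post(f"{BASE}/embed", headers=headers, json=payload)
```

Validating on the client keeps a typo such as 348 from turning into a failed or wasted API call.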
Quality comparison
Reducing the output dimension costs a small, measurable amount of accuracy:
- 384 to 256: ~1-2% drop on fine-grained benchmarks
- 384 to 128: ~3-5% drop on fine-grained benchmarks
- All sizes perform equally well on obvious duplicates
If your items are clearly distinct (pizza vs sushi vs biryani), 128 is sufficient. If you need to distinguish closely related items (Flat White vs Cappuccino vs Latte), use 384.
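Because accuracy only decreases as the dimension shrinks, one reasonable heuristic is to take the largest dimension whose storage fits your budget. The sketch below illustrates that idea. It assumes 4 bytes per value and a configurable index overhead, and `pick_dimension` is a hypothetical helper, not part of the API.

```python
SUPPORTED_DIMENSIONS = (128, 256, 384)
BYTES_PER_VALUE = 4  # assumption: 4-byte values, matching 0.5 KB at 128 dimensions


def pick_dimension(n_items: int, budget_bytes: float, index_overhead: float = 0.5) -> int:
    """Return the largest supported dimension whose estimated storage fits the budget.

    index_overhead is a fraction of raw size (0.5 means 50% extra for indexing).
    """
    for dim in sorted(SUPPORTED_DIMENSIONS, reverse=True):
        needed = n_items * dim * BYTES_PER_VALUE * (1 + index_overhead)
        if needed <= budget_bytes:
            return dim
    raise ValueError("budget is too small even for the smallest dimension")


if __name__ == "__main__":
    # 1 million items with a 1.6 GB budget and 50% index overhead
    print(pick_dimension(1_000_000, 1.6e9))
```

If your items are broadly distinct, a smaller size than the budget allows may still be the better choice, since it also reduces search cost.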
How Food Embeddings Work
How food embeddings work and why generic models fail on menus. Covers cosine similarity ranges, transliteration, noise, and cross-lingual mapping.
Built-in Preprocessing
dish-embed strips noise (promo text, prices, sizes) and normalizes spelling on every input automatically. No client-side cleanup required before sending.