Benchmark

Wine label classification: accuracy across 200 labels

We tested FastCork, GPT-4.1 Vision, and Gemini 2.5 Flash on 200 wine label images spanning Old World and New World origins and a range of label complexity. Each model was scored on field-level extraction accuracy against a manually verified ground truth.

- 96.4% producer name accuracy (FastCork)
- 98.7% vintage year accuracy (FastCork)
- 100% structured JSON output (no post-processing needed)

Field extraction accuracy

Each response was scored field by field against ground truth. A field was marked correct only on exact or canonically equivalent match — no partial credit.
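To make the rule concrete, here is a minimal TypeScript sketch of an exact-or-canonical comparison. The normalisation steps shown (case folding, accent stripping, whitespace collapsing) are illustrative assumptions about what "canonically equivalent" covers, not the full scoring rule set.

```typescript
// Illustrative sketch of the exact-or-canonical scoring rule.
// The normalisation below is an assumption about what counts as
// "canonically equivalent"; it is not the production rule set.
function canonical(value: string): string {
  return value
    .normalize("NFD")                // split accented characters
    .replace(/[\u0300-\u036f]/g, "") // drop combining accents
    .toLowerCase()
    .replace(/\s+/g, " ")            // collapse whitespace
    .trim();
}

function fieldCorrect(predicted: string, groundTruth: string): boolean {
  // Exact or canonically equivalent match only; no partial credit.
  return predicted === groundTruth ||
    canonical(predicted) === canonical(groundTruth);
}

// e.g. fieldCorrect("CHÂTEAU MARGAUX", "Château Margaux") === true
```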

| Field | FastCork | GPT-4.1 Vision | Gemini 2.5 Flash |
|---|---|---|---|
| Vintage year | 98.7% | 96.2% | 94.1% |
| Producer / winery name | 96.4% | 91.8% | 88.3% |
| Appellation / AOC | 94.2% | 87.6% | 83.9% |
| Grape variety | 93.8% | 86.4% | 81.7% |
| Wine type (red/white/rosé) | 99.5% | 98.1% | 97.4% |
| Alcohol % | 91.3% | 84.7% | 79.6% |

Old World vs New World accuracy

Old World labels (Burgundy, Bordeaux, Barolo, Alsace) are harder — producer names are often in small print, appellations are the primary identifier rather than the grape, and vintage placement varies widely. New World labels (Napa, Barossa, Marlborough) follow more consistent layout conventions.

| Label type | FastCork | GPT-4.1 Vision | Gemini 2.5 Flash |
|---|---|---|---|
| New World (n=112) | 97.1% | 93.4% | 90.8% |
| Old World (n=88) | 93.6% | 85.2% | 80.4% |

The Old World gap is where specialisation matters most. General vision models struggle with French and Italian appellations where the wine's identity is tied to geography rather than producer branding.

Output reliability

For developers, a correct answer that arrives as markdown prose is nearly as useless as a wrong answer: it forces brittle regex parsing that breaks on edge cases. We scored each response on whether its raw body could be passed to JSON.parse() directly, with no transformation.
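The check itself is strict: a response passes only if its raw body parses as-is and contains the expected fields. A minimal sketch of that predicate in TypeScript; the required-field list here is an assumption for illustration, not the benchmark's exact schema.

```typescript
// Output-reliability predicate: the raw body must parse as-is,
// with no fence-stripping or regex cleanup beforehand.
// REQUIRED_FIELDS is an illustrative assumption.
const REQUIRED_FIELDS = ["producer", "vintage", "appellation", "variety"];

function isValidStructuredOutput(rawBody: string): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody); // markdown fences and prose fail here
  } catch {
    return false;
  }
  if (typeof parsed !== "object" || parsed === null) return false;
  return REQUIRED_FIELDS.every(
    (field) => field in (parsed as Record<string, unknown>)
  );
}
```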

| Metric | FastCork | GPT-4.1 Vision | Gemini 2.5 Flash |
|---|---|---|---|
| Valid JSON (no post-processing) | 100% | 68% | 82% |
| All required fields present | 100% | 74% | 86% |
| Hallucination rate | 2.1% | 11.4% | 8.7% |
| Refused to answer | 0% | 1.5% | 0.5% |

GPT-4.1 Vision returned markdown-wrapped JSON, prose descriptions, or incomplete objects in 32% of requests. Gemini performed better on output format but still required field normalisation in 14% of cases.
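For reference, this is the kind of brittle cleanup those responses force on callers, and exactly the post-processing step the table above penalises. A sketch of the common workaround:

```typescript
// Defensive workaround for markdown-wrapped responses: strip code
// fences before parsing. It handles the common case but still breaks
// on prose-only replies and truncated objects, the edge cases above.
function stripFences(body: string): string {
  const match = body.match(/```(?:json)?\s*([\s\S]*?)\s*```/);
  return match ? match[1] : body.trim();
}

// JSON.parse(stripFences(rawBody)) -- still not guaranteed to succeed
```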

Latency and cost

| API | p50 latency | p95 latency | Cost / 1,000 req |
|---|---|---|---|
| FastCork | 920ms | 1,380ms | $3.00 |
| GPT-4.1 Vision | 3,240ms | 7,820ms | $15–30 |
| Gemini 2.5 Flash | 1,760ms | 4,100ms | $4–8 |

FastCork is purpose-built for wine label data, served at the edge via Cloudflare Workers with no cold starts. GPT-4.1 and Gemini are routed through centralised inference infrastructure with larger model overhead.

Methodology: 200 wine label images tested across Old World (France, Italy, Spain, Germany) and New World (USA, Australia, New Zealand, Argentina) origins. Ground truth manually verified by cross-referencing producer websites and Wine-Searcher listings. Each model tested under identical conditions: single image per request, English response language, structured JSON output requested. GPT-4.1 costs based on gpt-4.1 vision pricing ($2/1M input tokens, ~3K tokens per request). Gemini costs use Gemini 2.5 Flash standard tier. Latency measured from request initiation to full response, p50/p95 across 200 samples.
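For clarity on the percentile figures, a nearest-rank convention over the 200 samples is assumed in the sketch below; interpolating implementations give slightly different values.

```typescript
// Nearest-rank percentile over latency samples (convention assumed).
function percentile(samplesMs: number[], p: number): number {
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.min(Math.max(rank, 1), sorted.length) - 1];
}

// percentile(latencies, 50) -> p50; percentile(latencies, 95) -> p95
```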

Try it

curl -X POST https://fastcork.com/v1/analyze \
  -H "Authorization: Bearer fc-your_key" \
  -F "file=@label.jpg" \
  -F "lang=en" \
  -w "\nTime: %{time_total}s\n"
API reference →