leaderboard-pilot-smoke

2026-03-12 07:32:36 UTC · 35.4s · 2 models · 1 prompt · Total cost: $0.01
#1 z-image-turbo 5.0/5 WINNER | Best Value: flux-schnell
2 models · 1 prompt (n=1) · $0.01 · 35s · ⚠ Small sample: scores may vary with more prompts

Model Rankings

Rank | Model                  | Visual Quality | Prompt Adherence | Text Rendering | CLIP  | sharpness | Overall (weighted) | Conf
1    | z-image-turbo (WINNER) | 5.0            | 5.0              | 5.0            | 0.358 | 0.943     | 5.0 (n=1)          | 98%
2    | flux-schnell           | 4.0            | 5.0              | 5.0            | 0.313 | 0.945     | 4.5 (n=1)          | 97%

Dimension Profile

Cost Efficiency

Rank | Model                     | Score | Cost/Image | Score/$ | vs Winner
1    | flux-schnell (BEST VALUE) | 4.5   | $0.0030    | 1507    | -0.5 quality, 40% cheaper
2    | z-image-turbo (WINNER)    | 5.0   | $0.0050    | 1000    | baseline

flux-schnell is 40% cheaper per image than z-image-turbo and scores only 0.5 points lower overall.
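The cost-efficiency figures above can be reproduced with simple arithmetic. The sketch below assumes that Score/$ is the overall score divided by cost per image and that "% cheaper" is measured relative to the winner's cost; the function names are illustrative and not part of Evalytic. With the rounded values shown in the table, flux-schnell works out to about 1500 rather than the reported 1507, which suggests the report divides unrounded values (an assumption, not something the report states).

```python
def score_per_dollar(score: float, cost_per_image: float) -> float:
    """Overall score divided by the cost of one image (assumed definition)."""
    return score / cost_per_image


def percent_cheaper(cost: float, baseline_cost: float) -> float:
    """How much cheaper `cost` is than `baseline_cost`, in percent."""
    return (1 - cost / baseline_cost) * 100


# Rounded values copied from the Cost Efficiency table.
models = {
    "flux-schnell": {"score": 4.5, "cost": 0.0030},
    "z-image-turbo": {"score": 5.0, "cost": 0.0050},
}
baseline = models["z-image-turbo"]  # the winner is the baseline

for name, m in models.items():
    ratio = score_per_dollar(m["score"], m["cost"])
    saving = percent_cheaper(m["cost"], baseline["cost"])
    delta = m["score"] - baseline["score"]
    print(f"{name}: score/$ = {ratio:.0f}, {saving:.0f}% cheaper, {delta:+.1f} quality")
```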

A glowing neon sign reading EVALYTIC in a dark rainy alley, reflections on wet cobblestones, cinematic lighting

flux-schnell
4.7/5
1354ms · $0.0030
Score Details
Visual Quality: 4.0/5 (90% conf)
The image exhibits high visual quality with excellent atmospheric rendering, realistic lighting, and convincing reflections on the wet ground. The neon sign is sharp and legible, creating a strong focal point and contributing to a cohesive scene. The only minor flaw is some illegible text on a sign on the right building.
Prompt Adherence: 5.0/5 (100% conf)
The image perfectly matches all aspects of the prompt, featuring a clearly legible neon sign 'EVALYTIC' in a dark, rainy alley. The reflections on the wet cobblestones are prominent and well-rendered, contributing to the overall cinematic lighting and atmosphere.
Text Rendering: 5.0/5 (100% conf)
The text 'EVALYTIC' is perfectly rendered on the neon sign. The letters are crisp, clear, and accurately spelled, with no visible rendering issues or artifacts.
CLIP: 0.3127
sharpness: 0.9446
z-image-turbo
5.0/5
2204ms · $0.0050
Score Details
Visual Quality: 5.0/5 (95% conf)
The image exhibits excellent visual quality, effectively capturing a moody, rainy night scene with realistic lighting and atmospheric effects. The neon sign is perfectly rendered, and its reflections on the wet cobblestones are highly convincing, contributing to a professional-grade aesthetic.
Prompt Adherence: 5.0/5 (100% conf)
The image perfectly captures every element described in the prompt. The neon sign clearly reads 'EVALYTIC' and glows brightly in a dark, rainy alley, with excellent reflections on the wet cobblestones and a strong cinematic atmosphere.
Text Rendering: 5.0/5 (100% conf)
The text 'EVALYTIC' is perfectly rendered on the neon sign. It is crisp, accurate, and clearly legible with no spelling errors or visual artifacts.
CLIP: 0.3581
sharpness: 0.9432
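The per-image headline scores (4.7/5 and 5.0/5) are consistent with an unweighted mean of the three judge dimensions. The sketch below checks that reading; it is an inference from the numbers, not a documented formula. Note that flux-schnell's Overall (weighted) score in the rankings is 4.5 rather than 4.7, so the weighted figure presumably also folds in the local metrics.

```python
from statistics import mean

# Judge scores copied from the Score Details sections above.
judge_scores = {
    "flux-schnell": {
        "visual_quality": 4.0,
        "prompt_adherence": 5.0,
        "text_rendering": 5.0,
    },
    "z-image-turbo": {
        "visual_quality": 5.0,
        "prompt_adherence": 5.0,
        "text_rendering": 5.0,
    },
}


def judge_mean(scores: dict[str, float]) -> float:
    """Unweighted mean across judge dimensions (assumed headline formula)."""
    return mean(scores.values())


for model, scores in judge_scores.items():
    print(f"{model}: {judge_mean(scores):.1f}/5")
```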

Cost Summary

Category               | Requests | Cost
fal.ai generation      | 2        | $0.0080
gemini-2.5-flash judge | 6        | $0.0030
Local metrics          |          | $0.0000
Total                  |          | $0.0110
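The totals above can be cross-checked against the rest of the report. This sketch assumes one generation request per model per prompt and one judge request per model, prompt, and dimension, which matches the 2 and 6 requests listed; the variable names are illustrative.

```python
from decimal import Decimal

# Line items copied from the Cost Summary table (requests, cost in USD).
line_items = [
    ("fal.ai generation", 2, Decimal("0.0080")),
    ("gemini-2.5-flash judge", 6, Decimal("0.0030")),
    ("Local metrics", None, Decimal("0.0000")),  # no request count listed
]

total = sum((cost for _, _, cost in line_items), Decimal("0"))

# Expected request counts under the stated assumption.
n_models, n_prompts, n_dimensions = 2, 1, 3
expected_generation_requests = n_models * n_prompts
expected_judge_requests = n_models * n_prompts * n_dimensions

# Generation cost should equal the sum of the per-image costs reported above.
per_image_costs = [Decimal("0.0030"), Decimal("0.0050")]

print(f"Total: ${total}")
print(f"Generation requests: {expected_generation_requests}")
print(f"Judge requests: {expected_judge_requests}")
print(f"Sum of per-image costs: ${sum(per_image_costs, Decimal('0'))}")
```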

Configuration

Models: flux-schnell, z-image-turbo
Judge: gemini-2.5-flash
Dimensions: visual_quality, prompt_adherence, text_rendering
Pipeline: text2img
Evalytic Version: 0.3.10
Platform: darwin-arm64
Metric Scoring: clip_score (threshold=0.18, weight=0.2); lpips (threshold=0.4, weight=0.2); face_similarity (threshold=0.6, weight=0.2)
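The Metric Scoring settings can be held as a plain mapping for inspection. The sketch below assumes that the clip_score threshold is a minimum acceptable value (higher is better); the report does not say how thresholds or weights enter the overall score, so this only checks the reported CLIP values against the threshold. Only CLIP is checked here because this report does not list lpips or face_similarity values.

```python
# Settings copied from the Configuration section.
metric_config = {
    "clip_score": {"threshold": 0.18, "weight": 0.2},
    "lpips": {"threshold": 0.4, "weight": 0.2},
    "face_similarity": {"threshold": 0.6, "weight": 0.2},
}

# CLIP values copied from the per-model Score Details.
reported_clip = {"flux-schnell": 0.3127, "z-image-turbo": 0.3581}


def clears_threshold(value: float, metric: str) -> bool:
    """True if `value` meets the configured threshold.

    Assumes the threshold is a lower bound, which is an interpretation
    and not something the report documents.
    """
    return value >= metric_config[metric]["threshold"]


for model, clip in reported_clip.items():
    status = "passes" if clears_threshold(clip, "clip_score") else "fails"
    print(f"{model}: CLIP {clip:.4f} {status} threshold 0.18")
```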