leaderboard-test-3x5

2026-03-12 10:46:21 UTC · 135.7s · 5 models · 3 prompts · Total cost: $0.29
#1 nano-banana 4.2/5 WINNER | Best Value: flux-schnell
⚠ Small sample (n=3 prompts): scores may vary with more prompts

Model Rankings

Rank | Model | Visual Quality | Prompt Adherence | Text Rendering | CLIP | sharpness | aesthetic_score | Overall (weighted) | Conf
1 | nano-banana (WINNER) | 5.0 | 5.0 | 4.3 | 0.314 | 0.864 | 0.531 | 4.2 | n=3, 97%
2 | z-image-turbo | 4.0 | 4.7 | 4.0 | 0.346 | 0.901 | 0.501 | 4.0 | n=3, 93%
3 | flux-2-dev | 4.0 | 4.7 | 3.7 | 0.351 | 0.870 | 0.523 | 4.0 | n=3, 95%
4 | ideogram-v3 | 4.0 | 4.7 | 4.0 | 0.300 | 0.940 | 0.529 | 3.8 | n=3, 94%
5 | flux-schnell | 3.3 | 4.7 | 3.7 | 0.305 | 0.928 | 0.503 | 3.6 | n=3, 94%
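
The three judge columns and the Conf column above appear to be plain averages of the per-image judge scores and confidences listed later in this report. The sketch below checks that reading. The names `SCORES`, `CONF`, and `summarize` are illustrative and not part of Evalytic, and the Overall (weighted) column is not reproduced because the report does not state its weighting formula.

```python
from statistics import mean

# Judge scores per prompt as (visual_quality, prompt_adherence, text_rendering),
# copied from the Score Details entries in this report.
SCORES = {
    "nano-banana":   [(5, 5, 5), (5, 5, 5), (5, 5, 3)],
    "z-image-turbo": [(5, 5, 5), (5, 5, 5), (2, 4, 2)],
    "flux-2-dev":    [(5, 5, 5), (5, 5, 5), (2, 4, 1)],
    "ideogram-v3":   [(5, 5, 5), (5, 5, 5), (2, 4, 2)],
    "flux-schnell":  [(4, 5, 5), (4, 4, 5), (2, 5, 1)],
}

# Judge confidences in percent, same layout as SCORES.
CONF = {
    "nano-banana":   [(100, 100, 100), (90, 100, 100), (90, 100, 90)],
    "z-image-turbo": [(95, 100, 100), (95, 100, 100), (85, 85, 80)],
    "flux-2-dev":    [(95, 100, 100), (90, 100, 100), (90, 90, 90)],
    "ideogram-v3":   [(95, 95, 100), (90, 100, 100), (90, 90, 90)],
    "flux-schnell":  [(90, 100, 100), (80, 90, 100), (90, 95, 100)],
}


def summarize(model):
    """Return ((vq, pa, tr) means rounded to 1 decimal, mean confidence in %)."""
    per_dimension = zip(*SCORES[model])  # regroup scores by dimension
    dim_means = tuple(round(float(mean(d)), 1) for d in per_dimension)
    conf = round(mean(c for row in CONF[model] for c in row))
    return dim_means, conf


for name in SCORES:
    print(name, *summarize(name))
```

Under this assumption the output lines up with the table, for example 5.0 / 5.0 / 4.3 at 97% for nano-banana.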

Cost Efficiency

Rank | Model | Score | Cost/Image | Score/$ | vs Winner
1 | flux-schnell (BEST VALUE) | 3.6 | $0.0030 | 1190 | -0.7 quality, 92% cheaper
2 | z-image-turbo | 4.0 | $0.0050 | 804 | -0.2 quality, 87% cheaper
3 | flux-2-dev | 4.0 | $0.0120 | 334 | -0.2 quality, 70% cheaper
4 | ideogram-v3 | 3.8 | $0.0300 | 126 | -0.5 quality, 25% cheaper
5 | nano-banana (WINNER) | 4.2 | $0.0398 | 106 | baseline

flux-schnell is 92% cheaper than nano-banana with only 0.7 points less quality
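
As a rough check on the cost figures, the sketch below recomputes Score/$ and the "% cheaper" values from the rounded numbers in the Cost Efficiency table. The names `MODELS`, `WINNER`, and `cost_efficiency` are illustrative. Because the inputs are the rounded scores shown here, Score/$ comes out slightly different from the report (about 1200 instead of 1190 for flux-schnell), which presumably uses unrounded scores.

```python
WINNER = "nano-banana"

# model: (overall score, cost per image in USD), from the Cost Efficiency table
MODELS = {
    "flux-schnell":  (3.6, 0.0030),
    "z-image-turbo": (4.0, 0.0050),
    "flux-2-dev":    (4.0, 0.0120),
    "ideogram-v3":   (3.8, 0.0300),
    "nano-banana":   (4.2, 0.0398),
}


def cost_efficiency(model):
    """Score per dollar and percentage saving relative to the winner."""
    score, cost = MODELS[model]
    _, winner_cost = MODELS[WINNER]
    return {
        "score_per_dollar": score / cost,
        "pct_cheaper": round((1 - cost / winner_cost) * 100),
    }


# The best-value model is the one with the highest score per dollar.
best_value = max(MODELS, key=lambda m: cost_efficiency(m)["score_per_dollar"])

for name in MODELS:
    eff = cost_efficiency(name)
    print(f"{name}: {eff['score_per_dollar']:.0f} score/$, "
          f"{eff['pct_cheaper']}% cheaper than {WINNER}")
print("best value:", best_value)
```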

Metric-VLM Correlation

Pair | Pearson r | p-value | Agreement
clip score vs prompt adherence | -0.08 | 0.9578 | low agreement
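
For readers who want to see what the correlation measures, here is a minimal sketch of Pearson's r computed over the 15 per-image (CLIP, prompt adherence) pairs listed in this report. Pairing the values per image is an assumption; the report does not say which sample it correlates, and the p-value is not recomputed here. The helper `pearson_r` is illustrative, not an Evalytic function.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Per-image values in report order: flux-schnell, flux-2-dev, z-image-turbo,
# ideogram-v3, nano-banana for each of the three prompts.
clip = [
    0.3127, 0.3409, 0.3547, 0.3036, 0.3206,  # neon sign
    0.3366, 0.3561, 0.3278, 0.3369, 0.3298,  # black cat
    0.2655, 0.3557, 0.3568, 0.2595, 0.2906,  # coffee shop
]
adherence = [
    5, 5, 5, 5, 5,
    4, 5, 5, 5, 5,
    5, 4, 4, 4, 5,
]

r = pearson_r(clip, adherence)
print(f"Pearson r = {r:.2f}")
```

With these inputs r should come out close to the reported -0.08, that is, essentially no linear relationship between CLIP score and the judge's prompt-adherence rating in this small sample.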

A glowing neon sign reading EVALYTIC in a dark rainy alley, reflections on wet cobblestones, cinematic lighting

flux-schnell · 4.7/5 · 1417ms · $0.0030
Score Details
Visual Quality: 4.0/5 (90% conf)
The image exhibits high visual quality with excellent atmospheric rendering, particularly the neon sign and its reflections on the wet ground. However, there is a minor artifact in the form of illegible text on a sign on the right-hand building.
Prompt Adherence: 5.0/5 (100% conf)
The image perfectly captures all elements of the prompt, featuring a clearly legible neon sign 'EVALYTIC' in a dark, rainy alley. The reflections on the wet cobblestones are prominent and well-rendered, contributing to the overall cinematic lighting and atmosphere.
Text Rendering: 5.0/5 (100% conf)
The text 'EVALYTIC' is perfectly rendered on the neon sign. Each letter is crisp, correctly formed, and highly legible, matching the expected content precisely.
CLIP: 0.3127
sharpness: 0.9446
aesthetic_score: 0.4613
flux-2-dev · 5.0/5 · 3285ms · $0.0120
Score Details
Visual Quality: 5.0/5 (95% conf)
The image demonstrates professional quality with sharp, coherent rendering of the neon sign and highly realistic reflections on the wet cobblestone street. The atmospheric rain and moody lighting are consistently well-executed, creating a compelling scene without any visible artifacts or distortions.
Prompt Adherence: 5.0/5 (100% conf)
The image perfectly captures all elements of the prompt, featuring a clear neon sign with the specified text, set in a dark, rainy alley with excellent reflections on wet cobblestones and a strong cinematic atmosphere.
Text Rendering: 5.0/5 (100% conf)
The text 'EVALYTIC' is perfectly rendered as a neon sign. Each letter is clear, correctly spelled, and the neon effect is well-executed with appropriate glow and reflections.
CLIP: 0.3409
sharpness: 0.8577
aesthetic_score: 0.5005
z-image-turbo · 5.0/5 · 2217ms · $0.0050
Score Details
Visual Quality: 5.0/5 (95% conf)
The image exhibits excellent visual quality with no discernible artifacts or inconsistencies. The lighting, reflections, and atmospheric effects are rendered with professional precision, creating a highly coherent and immersive scene.
Prompt Adherence: 5.0/5 (100% conf)
The image is an excellent representation of the prompt, perfectly capturing every detail requested. The neon sign, rainy alley, wet cobblestone reflections, and cinematic lighting are all accurately and beautifully rendered.
Text Rendering: 5.0/5 (100% conf)
The text 'EVALYTIC' is perfectly rendered as a glowing neon sign. Each letter is crisp, correctly spelled, and highly legible, with no artifacts or distortions.
CLIP: 0.3547
sharpness: 0.9440
aesthetic_score: 0.4901
ideogram-v3 · 5.0/5 · 57167ms · $0.0300
Score Details
Visual Quality: 5.0/5 (95% conf)
The image exhibits excellent visual quality with sharp details, realistic lighting, and accurate reflections. The neon sign is perfectly rendered with a convincing glow, and the wet ground accurately mirrors the scene's elements without artifacts.
Prompt Adherence: 5.0/5 (95% conf)
The image perfectly matches all elements of the prompt. It features a clear neon sign reading 'EVALYTIC' in a dark alley, with excellent reflections on wet cobblestones, and a strong sense of cinematic lighting.
Text Rendering: 5.0/5 (100% conf)
The text 'EVALYTIC' is perfectly rendered on the neon sign. Each letter is crisp, correctly formed, and clearly legible, including its reflection on the wet ground.
CLIP: 0.3036
sharpness: 0.9552
aesthetic_score: 0.4452
nano-banana · 5.0/5 · 53802ms · $0.0398
Score Details
Visual Quality: 5.0/5 (100% conf)
The image exhibits excellent visual quality with professional-grade rendering. All elements, from the sharp neon sign and realistic rain to the detailed wet ground reflections and atmospheric mist, are rendered flawlessly without any visible artifacts or inconsistencies.
Prompt Adherence: 5.0/5 (100% conf)
The image perfectly captures all elements of the prompt. It features a glowing neon sign reading 'EVALYTIC' in a dark, rainy alley, with excellent reflections on the wet ground and a strong sense of cinematic lighting.
Text Rendering: 5.0/5 (100% conf)
The text 'EVALYTIC' is perfectly rendered as a neon sign. It is crisp, clearly legible, and accurately spelled, with excellent reflections on the wet ground.
CLIP: 0.3206
sharpness: 0.8651
aesthetic_score: 0.4912
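
The headline score shown next to each model name is consistent with an unweighted mean of the three judge dimensions, rounded to one decimal. This is an inference from the numbers in this report, not a documented formula, and `image_score` is an illustrative name.

```python
def image_score(visual_quality, prompt_adherence, text_rendering):
    """Unweighted mean of the three judge scores, rounded to one decimal."""
    return round((visual_quality + prompt_adherence + text_rendering) / 3, 1)


# flux-schnell on the neon-sign prompt: 4.0, 5.0, 5.0
print(image_score(4.0, 5.0, 5.0))  # 4.7
# every other model on the neon-sign prompt: 5.0, 5.0, 5.0
print(image_score(5.0, 5.0, 5.0))  # 5.0
```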

A black cat sitting on a stack of old books in a cozy library, warm golden light from a desk lamp, dust particles in the air

flux-schnell · 4.3/5 · 1137ms · $0.0030
Score Details
Visual Quality: 4.0/5 (80% conf)
The image exhibits good visual quality with a well-rendered main subject and atmospheric lighting. The black cat and the stack of books it sits on are detailed and realistic, contributing to a strong foreground focus. However, the background bookshelves are noticeably blurry and lack distinct book details, appearing somewhat like a textured backdrop rather than individual books.
Prompt Adherence: 4.0/5 (90% conf)
The image strongly matches most aspects of the prompt, featuring a black cat on old books in a library with warm golden light from a lamp. The only minor deviation is the absence of clearly visible 'dust particles in the air'.
Text Rendering: 5.0/5 (100% conf)
No text was expected in the image based on the prompt, and no text is present in the generated image. The image successfully avoids generating any garbled or incorrect text.
CLIP: 0.3366
sharpness: 0.8650
aesthetic_score: 0.5734
flux-2-dev · 5.0/5 · 3319ms · $0.0120
Score Details
Visual Quality: 5.0/5 (90% conf)
The image exhibits excellent visual quality with sharp details on the main subjects (cat, lamp, foreground books) and a well-executed depth of field. The dramatic and realistic lighting creates a strong atmosphere without any noticeable artifacts or inconsistencies.
Prompt Adherence: 5.0/5 (100% conf)
The image perfectly captures every element described in the prompt. The black cat, stack of old books, cozy library setting, warm golden light from the lamp, and dust particles are all clearly and accurately depicted.
Text Rendering: 5.0/5 (100% conf)
No text was expected in the image based on the provided prompt, and indeed, no text is present in the generated image. The image accurately reflects the absence of text.
CLIP: 0.3561
sharpness: 0.8551
aesthetic_score: 0.5742
z-image-turbo · 5.0/5 · 2471ms · $0.0050
Score Details
Visual Quality: 5.0/5 (95% conf)
The image exhibits professional-level quality with excellent atmospheric lighting, realistic textures, and a well-executed depth of field. The subtle details, such as the illuminated dust particles and the soft fur of the cat, significantly enhance the realism and overall aesthetic.
Prompt Adherence: 5.0/5 (100% conf)
The image perfectly captures all elements described in the prompt, including the black cat, old books, cozy library setting, warm golden light from a desk lamp, and visible dust particles in the air. Every detail is accurately represented.
Text Rendering: 5.0/5 (100% conf)
No text was expected to be rendered in this image based on the provided prompt, which describes a scene rather than specific text content. The image successfully depicts the scene without any text elements.
CLIP: 0.3278
sharpness: 0.7843
aesthetic_score: 0.5808
ideogram-v3 · 5.0/5 · 56379ms · $0.0300
Score Details
Visual Quality: 5.0/5 (90% conf)
The image demonstrates excellent visual quality with sharp details, realistic textures, and atmospheric lighting. There are no visible artifacts or anatomical inconsistencies, and the depth of field is well-executed, contributing to a professional-quality render.
Prompt Adherence: 5.0/5 (100% conf)
The image perfectly captures all elements of the prompt. A black cat is sitting on a stack of old books in a library, illuminated by warm golden light from a desk lamp, with visible dust particles in the air.
Text Rendering: 5.0/5 (100% conf)
No text was expected in the image based on the provided prompt, and indeed, no text is present. The image successfully avoids generating any garbled or unintended text.
CLIP: 0.3369
sharpness: 0.8983
aesthetic_score: 0.6036
nano-banana · 5.0/5 · 7046ms · $0.0398
Score Details
Visual Quality: 5.0/5 (90% conf)
The image exhibits excellent visual quality with sharp details, realistic textures, and masterful lighting. The composition is well-balanced, and elements like the cat's fur and the metallic lamp are rendered with high fidelity, creating a coherent and appealing scene.
Prompt Adherence: 5.0/5 (100% conf)
The image perfectly matches every detail specified in the prompt. All elements, from the subject to the setting and atmospheric effects, are accurately represented.
Text Rendering: 5.0/5 (100% conf)
The prompt did not specify any text to be included in the image. While there are open books with patterns resembling text, these are not intended to be readable or convey specific information, thus no text rendering quality issues are present.
CLIP: 0.3298
sharpness: 0.7948
aesthetic_score: 0.6110

A vintage coffee shop storefront with hand-lettered chalkboard menu, morning sunlight, 35mm film photography style

flux-schnell · 2.7/5 · 1146ms · $0.0030
Score Details
Visual Quality: 2.0/5 (90% conf)
The image suffers from significant text incoherence, with all prominent signs and menus displaying garbled and unreadable words. While the overall composition, lighting, and general object rendering are acceptable, the failure to render legible text is a major quality issue.
Prompt Adherence: 5.0/5 (95% conf)
The image perfectly captures all aspects of the prompt, featuring a vintage-style coffee shop storefront with clear morning sunlight and a hand-lettered chalkboard menu. The overall aesthetic strongly evokes a 35mm film photography style.
Text Rendering: 1.0/5 (100% conf)
The text throughout the image is largely unreadable, garbled, or contains significant misspellings. The main sign 'Coffeee Chop' is misspelled, and the extensive text on both chalkboard menus is nonsensical gibberish, failing to convey any coherent information.
CLIP: 0.2655
sharpness: 0.9759
aesthetic_score: 0.4752
flux-2-dev · 2.3/5 · 3269ms · $0.0120
Score Details
Visual Quality: 2.0/5 (90% conf)
The image suffers from significant quality issues, primarily the illegible and nonsensical text on the chalkboard menu, which is a common AI artifact. Additionally, a strong light flare on the left side overexposes a large portion of the image, obscuring details and contributing to an overall lack of clarity.
Prompt Adherence: 4.0/5 (90% conf)
The image strongly adheres to the prompt, featuring a vintage coffee shop storefront with prominent morning sunlight and a clear 35mm film photography style. While a hand-lettered chalkboard menu is present, the text on it is illegible, which is a minor deviation.
Text Rendering: 1.0/5 (90% conf)
While the main title 'COFFEE SHOP' is legible, the vast majority of the text on the chalkboard menu consists of garbled, unreadable gibberish. The text does not form coherent words or prices, making the menu content incomprehensible.
CLIP: 0.3557
sharpness: 0.8976
aesthetic_score: 0.4947
z-image-turbo · 2.7/5 · 2107ms · $0.0050
Score Details
Visual Quality: 2.0/5 (85% conf)
While the image exhibits good lighting, sharpness, and composition, the primary visual elements—the menu items written on the coffee shop window—are largely incoherent. The text appears as garbled, nonsensical words, significantly detracting from the overall quality and purpose of the displayed information.
Prompt Adherence: 4.0/5 (85% conf)
The image strongly matches most aspects of the prompt, including the vintage coffee shop storefront, morning sunlight, and 35mm film photography style. The main deviation is the illegible text on the hand-lettered chalkboard menus.
Text Rendering: 2.0/5 (80% conf)
While the main headings like 'COFFEE' and 'Morning' are mostly readable and the hand-lettered style is well-rendered, the vast majority of the menu items listed below are garbled and nonsensical words. This makes the core content of the 'chalkboard menu' largely unreadable.
CLIP: 0.3568
sharpness: 0.9745
aesthetic_score: 0.4312
ideogram-v3 · 2.7/5 · 11662ms · $0.0300
Score Details
Visual Quality: 2.0/5 (90% conf)
The image suffers from significant text artifacts, with prominent signs displaying nonsensical or misspelled words. While the overall composition and lighting are acceptable, these text issues severely detract from the realism and professional quality of the image.
Prompt Adherence: 4.0/5 (90% conf)
The image strongly matches the prompt's request for a vintage coffee shop storefront with morning sunlight and a 35mm film photography style. However, the hand-lettered chalkboard menu and door sign contain several noticeable typos and nonsensical words, which detract from the overall quality.
Text Rendering: 2.0/5 (90% conf)
The text rendering is poor, with significant errors and garbled words. While some words like 'Fresh Brews' and 'PASTRIES' are clear, others are misspelled or completely unreadable.
CLIP: 0.2595
sharpness: 0.9661
aesthetic_score: 0.5377
nano-banana · 4.3/5 · 56429ms · $0.0398
Score Details
Visual Quality: 5.0/5 (90% conf)
The image exhibits excellent visual quality with sharp focus, clear details, and realistic rendering of all elements. There are no visible artifacts, distortions, or inconsistencies in lighting or textures.
Prompt Adherence: 5.0/5 (100% conf)
The image perfectly captures all aspects of the prompt, depicting a vintage coffee shop storefront with a hand-lettered chalkboard menu, bathed in morning sunlight, and rendered in a convincing 35mm film photography style.
Text Rendering: 3.0/5 (90% conf)
While most of the text, such as 'THE MORNING JOE' and several menu items, is clearly rendered and readable, there is a significant misspelling on the chalkboard menu. The word 'CAPPUCUCE' is incorrectly spelled instead of 'CAPPUCCINO'.
CLIP: 0.2906
sharpness: 0.9329
aesthetic_score: 0.4912

Cost Summary

Category | Requests | Cost
fal.ai generation | 15 | $0.2694
gemini-2.5-flash judge | 45 | $0.0225
Local metrics | - | $0.0000
Total | - | $0.2919
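
The totals can be cross-checked from the per-image prices listed elsewhere in this report. The sketch below assumes one generation request per model per prompt and one judge request per image per dimension, which matches the 15 and 45 request counts. Variable names are illustrative.

```python
from decimal import Decimal

PROMPTS = 3
DIMENSIONS = 3  # visual_quality, prompt_adherence, text_rendering

# Cost per generated image in USD, as listed for each model in this report.
COST_PER_IMAGE = {
    "flux-schnell":  Decimal("0.0030"),
    "flux-2-dev":    Decimal("0.0120"),
    "z-image-turbo": Decimal("0.0050"),
    "ideogram-v3":   Decimal("0.0300"),
    "nano-banana":   Decimal("0.0398"),
}
JUDGE_COST = Decimal("0.0225")  # reported total for the gemini-2.5-flash judge

generation_requests = PROMPTS * len(COST_PER_IMAGE)
judge_requests = generation_requests * DIMENSIONS
generation_cost = PROMPTS * sum(COST_PER_IMAGE.values())
total_cost = generation_cost + JUDGE_COST
judge_cost_per_request = JUDGE_COST / judge_requests

print(generation_requests, generation_cost)  # 15 0.2694
print(judge_requests, judge_cost_per_request)  # 45 0.0005
print(total_cost)  # 0.2919
```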

Configuration

Models: flux-schnell, flux-2-dev, z-image-turbo, ideogram-v3, nano-banana
Judge: gemini-2.5-flash
Dimensions: visual_quality, prompt_adherence, text_rendering
Pipeline: text2img
Evalytic Version: 0.3.10
Platform: darwin-arm64
Metric Scoring: clip_score (threshold=0.18, weight=0.2); lpips (threshold=0.4, weight=0.2); face_similarity (threshold=0.6, weight=0.2); aesthetic_score (threshold=0.3, weight=0.15)