Text-to-Speech (TTS)

Data as of May 24, 2026

#	Model	Vendor	Released	Voices	Languages	Price	Open weights	Arena	Arena rank
1.	Vocu V3.0	Vocu	-	200	20	-	○	1582	1
2.	Inworld TTS MAX	Inworld AI	2025/6	20	11	$5	○	1575	2
3.	CastleFlow v1.0	CastleFlow	-	50	10	-	○	1574	3
4.	Orpheus 3B	Canopy Labs	2025/3	8	7	$22	●	1570	4
5.	Hume Octave	Hume AI	2025/1	100	11	$50	○	1563	5
6.	Papla P1	Papla	-	30	15	-	○	1562	6
7.	MiniMax Speech-02-HD	MiniMax	2025/4	300	32	$30	○	1544	7
8.	Fish Speech S2 Pro	Fish Audio	2026/3	-	80	free	●	-	8
9.	Voxtral TTS	Mistral	2026/3	-	9	$16	●	-	8
10.	Dia 1.6B	Nari Labs	2025/4	-	1	free	●	-	8
11.	Piper	Rhasspy / community	-	-	30	free	●	-	8
12.	GPT-Realtime-2	OpenAI	2026/5	-	-	-	○	-	8
13.	ElevenLabs v3	ElevenLabs	2025/6	-	70	-	○	-	8
14.	Sesame CSM-1B	Sesame AI Labs	2025/3	1	1	free	●	-	8
15.	Eleven Turbo v2.5	ElevenLabs	2024/7	5000	32	$150	○	1544	8
16.	Ming-Omni-TTS	InclusionAI	2026/3	100	2	free	●	-	8
17.	Cartesia Sonic 2	Cartesia	2025/2	200	25	$20	○	1513	9
18.	Chatterbox Multilingual	Resemble AI	2025/5	100	23	free	●	1506	10
19.	Kokoro v1.0	Community	2025/1	54	8	free	●	1500	11
20.	NeuTTS Max	NeuTTS	-	40	8	-	○	1479	12
21.	PlayHT 2.0	PlayHT	2023/8	600	28	$50	○	1405	13
22.	StyleTTS 2	Community	2023/6	1	2	free	●	1369	14
23.	CosyVoice 3	Alibaba	2025/5	200	9	free	●	1358	15
24.	Spark TTS	iFlytek	-	80	10	-	○	1342	16

Music Generation

Data as of June 9, 2026

#	Model	Vendor	Released	Max length	Price	Open weights	Quality	Notes
1.	Suno v5.5	Suno	2026/3	8 min	$0.08	○	94.0	Voices (voice cloning), Custom Models (fine-tuning), My Taste; Suno Studio DAW. Quality score is an editorial estimate.
2.	Suno v5	Suno	2025/10	8 min	$0.08	○	92.0	Most popular consumer music generator
3.	Udio 2	Udio	2025/6	15 min	$0.10	○	90.0	Longer form; a16z-backed
4.	Lyria 3 Pro	Google DeepMind	2026/3	3 min	-	○	90.0	Full vocals, image-guided generation, negative prompts, structural control (intro/verse/chorus); licensed training data. Quality score is an editorial estimate.
5.	MiniMax Music 2.5	MiniMax	2026/1	4 min	-	○	90.0	Studio-grade humanized vocals, 100+ instrument tones, 14 composition tags for structural control. Quality score is an editorial estimate.
6.	Suno v4.5	Suno	2025/4	4 min	$0.08	○	88.0	Widely used; MIT lawsuit pending
7.	Lyria 2	Google DeepMind	2025/5	5 min	-	○	87.0	YouTube Music AI; restricted API
8.	Stable Audio 2.5	Stability AI	2025/11	4 min	free	●	80.0	Open weights; instrumental focus
9.	Riffusion 3	Riffusion	2025/8	3 min	$0.05	○	78.0	YC-backed; fast generation
10.	MusicLM	Google	2023/1	5 min	-	○	72.0	Research preview; superseded by Lyria
11.	MusicGen Large	Meta	2024/6	30s	free	●	70.0	3.3B params; open research baseline

Speech-to-Text (ASR)

Data as of May 11, 2026

#	Model	Vendor	Released	Params	Languages	Price	Open weights	WER %	Notes
1.	Cohere Transcribe	Cohere	2026/3	2B	14	free	●	5.4	Open-source; tops the Hugging Face Open ASR Leaderboard; 525x real-time on consumer GPUs; free API
2.	NVIDIA Canary 1B	NVIDIA	2024/2	1B	4	free	●	6.5	EN/ES/FR/DE only; near the top of the Hugging Face Open ASR Leaderboard
3.	Deepgram Nova-3	Deepgram	2025/2	-	36	$4	○	6.8	Lowest WER in English; 54% lower than Whisper
4.	ElevenLabs Scribe	ElevenLabs	2025/2	-	99	$3	○	7.2	Multi-speaker; ElevenLabs' first ASR model
5.	AssemblyAI Universal-2	AssemblyAI	2024/10	-	25	$6	○	7.7	Best speaker diarization
6.	Whisper Large v3 Turbo	OpenAI	2024/10	809M	99	$6	●	8.4	Best open baseline; 8x faster than v3
7.	Whisper Large v3	OpenAI	2023/11	1550M	99	$6	●	8.5	Gold standard; 680k hours of training data
8.	Fireworks Whisper-v3	Fireworks AI	2024/9	-	99	$3	○	8.5	Cheapest Whisper API; 300x real-time
9.	Groq Whisper Large v3	Groq	2024/7	-	99	$3	○	8.5	Fastest inference via LPU
10.	Gladia Whisper-Zero	Gladia	2024/9	-	100	$5	○	9.1	Hallucination-resistant; EU-hosted
11.	Moonshine Tiny	Useful Sensors	2024/10	27M	1	free	●	9.8	On-device; English only
12.	GPT-Realtime-Whisper	OpenAI	2026/5	-	99	-	○	-	Live transcription model for the Realtime API. Pricing: $0.017/minute. Same launch as GPT-Realtime-2.
13.	GPT-Realtime-Translate	OpenAI	2026/5	-	-	-	○	-	Live translation model for the Realtime API. Pricing: $0.034/minute.
14.	MiMo-V2.5-ASR	Xiaomi MiMo	2026/4	?	1	-	●	-	Newest Xiaomi MiMo release; Mandarin-focused; flagged by daily audit
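
The table above ranks models by WER % (word error rate). As a rough illustration of what that number measures, here is a minimal Python sketch of the standard definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` is hypothetical, and the sketch assumes plain whitespace tokenization with no text normalization. How this leaderboard actually normalizes and scores transcripts is not stated in the source, so treat this only as an approximation of the metric.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage, using word-level Levenshtein distance.

    Assumption: words are split on whitespace and compared exactly;
    real evaluations usually normalize case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One word dropped out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

In the usage example, the hypothesis drops one of six reference words, so the result is about 16.7. Because insertions count as errors, WER can exceed 100 when the hypothesis is much longer than the reference.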