🚀 Fun-ASR-Nano

LLM-Powered Speech Recognition — 31 Languages, Dialects & Accents

End-to-end ASR trained on tens of millions of hours of data. Supports Chinese (+ dialects), English, Japanese, Korean, French, German, Spanish, and 24 more languages.

⭐ GitHub (Fun-ASR) · 🛠️ FunASR Toolkit · 🎙️ SenseVoice · 🤗 Model Card

| Model | Languages | Architecture | Best For |
| --- | --- | --- | --- |
| Fun-ASR-Nano | 31 | LLM-based | Multi-language, dialects, highest accuracy |
| SenseVoice | 5 | CTC (non-AR) | Speed + Emotion + Audio events |

Supported Languages (Fun-ASR-Nano)

Chinese (Mandarin, Cantonese, Sichuan, Shanghai, Minnan, Wenzhou, Hakka, Gan, and more), English, Japanese, Korean, French, German, Spanish, Italian, Portuguese, Russian, Arabic, Hindi, Thai, Vietnamese, Indonesian, Malay, Turkish, Polish, Dutch, Swedish, Hebrew, Greek, Czech, Romanian, Hungarian, Finnish, Danish, Norwegian, Ukrainian.

Tips

  • Fun-ASR-Nano: Best for multi-language & Chinese dialects. Outputs punctuation natively.
  • SenseVoice: Ultra-fast (7x faster than Whisper-small), also detects emotions & audio events.
  • For long audio (over 5 minutes), consider running FunASR locally on a GPU.
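
The tips above amount to a small decision rule, sketched here in Python. This is an illustrative helper only, not part of FunASR or this demo: the names `recommend`, `Recommendation`, and `LONG_AUDIO_SECONDS` are hypothetical, and giving language coverage priority over emotion detection when both are requested is an assumption.

```python
from dataclasses import dataclass

# The ">5min" threshold from the tips, in seconds.
LONG_AUDIO_SECONDS = 5 * 60


@dataclass(frozen=True)
class Recommendation:
    model: str                  # "Fun-ASR-Nano" or "SenseVoice"
    run_locally_with_gpu: bool  # True when the audio exceeds the long-audio threshold


def recommend(
    duration_s: float,
    need_emotion_or_events: bool = False,
    need_dialects_or_many_languages: bool = False,
    prefer_speed: bool = False,
) -> Recommendation:
    """Hypothetical helper that encodes the tips as a decision rule."""
    if need_dialects_or_many_languages:
        # Assumption: language/dialect coverage wins over other preferences.
        model = "Fun-ASR-Nano"
    elif need_emotion_or_events or prefer_speed:
        # SenseVoice is the fast option and also reports emotions and audio events.
        model = "SenseVoice"
    else:
        # Default to the model the comparison table lists for highest accuracy.
        model = "Fun-ASR-Nano"

    # Only audio strictly longer than 5 minutes triggers the local-GPU suggestion.
    return Recommendation(model, duration_s > LONG_AUDIO_SECONDS)


if __name__ == "__main__":
    print(recommend(600, need_emotion_or_events=True))
    print(recommend(30, need_dialects_or_many_languages=True))
```

For example, a 10-minute recording that needs emotion tags maps to SenseVoice run locally with a GPU, while a 30-second dialect clip maps to Fun-ASR-Nano with no local-GPU suggestion.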