yesterday
Source DeCrypt

StepFun's Voice AI Topped Every Benchmark. It Also Hears Your Sighs

Summary

StepFun, a Shanghai-based AI lab, has launched StepAudio 2.5 Realtime, a real-time, end-to-end voice model that processes audio directly rather than first converting speech to text. It supports Chinese and English. On the reported benchmarks, the model scored 82.18 out of 100 in paralinguistic comprehension, ahead of GPT Realtime 1.5 and other competitors, and 80.41 in human evaluation tests, also the highest in the field. The model can interpret non-verbal cues in input audio, such as emotion, speech rate, and age. To address persona stability problems common to AI characters, such as out-of-character (OOC) drift, StepFun uses roleplay-specific reinforcement learning from human feedback (RLHF) and a large, diverse dataset, aiming for consistent character behavior even in unusual conversations. StepFun, founded in 2023 by Jiang Daxin and backed by $1.7 billion in funding, positions its technology as a direct competitor to OpenAI's advanced voice mode and claims superior results. The launch includes Xiao Yue, a highly customizable AI persona, and an API for developers to build custom characters. The model is available at platform.stepfun.com.
