Browsing: speech
Mistral AI Releases Voxtral TTS: A 4B Open-Weight Streaming Speech Model for Low-Latency Multilingual Voice Generation
Mistral AI has released Voxtral TTS, an open-weight text-to-speech model that marks the company’s first major move into audio generation. Following the release of its transcription…
Introducing Amazon Polly Bidirectional Streaming: Real-time speech synthesis for conversational AI
Building natural conversational experiences requires speech synthesis that keeps pace with real-time interactions. Today, we’re excited to announce the new Bidirectional Streaming API for Amazon Polly,…
Cohere AI Releases Cohere Transcribe: A SOTA Automatic Speech Recognition (ASR) Model Powering Enterprise Speech Intelligence
In the landscape of enterprise AI, the bridge between unstructured audio and actionable text has often been a bottleneck of proprietary APIs and complex cascaded pipelines.…
Tencent AI Open Sources Covo-Audio: A 7B Speech Language Model and Inference Pipeline for Real-Time Audio Conversations and Reasoning
Tencent AI Lab has released Covo-Audio, a 7B-parameter end-to-end Large Audio Language Model (LALM). The model is designed to unify speech processing and language intelligence by…
Google AI Releases WAXAL: A Multilingual African Speech Dataset for Training Automatic Speech Recognition and Text-to-Speech Models
Speech technology still has a data distribution problem. Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) systems have improved rapidly for high-resource languages, but many African languages…
IBM AI Releases Granite 4.0 1B Speech as a Compact Multilingual Speech Model for Edge AI and Translation Pipelines
IBM has released Granite 4.0 1B Speech, a compact speech-language model designed for multilingual automatic speech recognition (ASR) and bidirectional automatic speech translation (AST). The release…
This post is a collaboration between AWS, NVIDIA and Heidi. Automatic speech recognition (ASR), often called speech-to-text (STT), is becoming increasingly critical across industries like healthcare,…
Before we start anything, I want you to watch this video: Isn’t this…
The bipartisan Kids Online Safety Act, designed to protect minors from age-inappropriate online content, will head to the House floor for a vote. But critics say…
Anchoring in the African AI ecosystem: Crucial to the WAXAL project was our commitment to working with, and contributing directly to, the African AI ecosystem. The…
