Speaker Diarization Python

wcvessels/transcription-pipeline-plugin

Local, token-free audio and video transcription with speaker diarization and screenshot curation. Runs entirely on your machine using WhisperX large-v3 and a token-free pyannote clone. No HuggingFace account ...

GitHub

sherpa-onnx-offline-speaker-diarization.cc

#include "sherpa-onnx/csrc/parse-options.h"
#include "sherpa-onnx/csrc/wave-reader.h"
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker ...

note

I added my own voice to the away-from-desk summary tool, and the audio part broke

Last time, I wrote about a locally running "away-from-desk summary" tool. It transcribes the audio from a study session and summarizes only the parts where I was away from my seat. At the end of the ...

Frontiers

AI-assisted vocal emotion analysis in forensic interview with children: an exploratory study

To isolate children’s speech, a preprocessing pipeline combining automatic speaker diarization and manual verification was applied. Initially, speaker segmentation and timestamp-based utterance ...

note

Gemma 4 12B Practical Use & Hardware Verification: Running on 16GB Laptops / Agent ...

In our previous article (Gemma 4 12B In-Depth: A New Model Bringing Full-Scale Multimodality to Laptops via Encoder-Free Architecture), we focused on the architecture and specifications. As a sequel, ...

Memeburn

Google's Gemma 4 12B Runs AI Natively on Your Laptop — No Cloud Needed

Gemma 4 12B is Google DeepMind's first medium-sized open model with native audio in 2026 — and it runs entirely on a 16GB laptop. You don't need cloud credits or a data center. The model processes ...
