feat: Add support for quantized Whisper model and update audio transcription workflow #1508

EzraEllette · 2025-02-28T03:18:30Z

Integrate whisper-rs library for improved audio transcription
Add WhisperLargeV3TurboQuantized transcription engine
Modify STT processing to use whisper-rs context and state
Update Cargo.toml to include whisper-rs with GPU support
Refactor transcription methods to work with new whisper-rs workflow
Add download function for quantized Whisper model
Update CLI and core audio transcription engine to support new quantized model

description

brief description of the changes in this pr.

related issue: #587

how to test

add a few steps to test the pr in the most time efficient way.

run accuracy example
run screenpipe with -a whisper-large-v3-turbo-quantized
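
For readers unfamiliar with how an engine name passed to `-a` might reach the code, here is a minimal sketch of parsing that string into an engine enum. Only the `WhisperLargeV3TurboQuantized` variant name and the `whisper-large-v3-turbo-quantized` string come from this PR; the enum name, the second variant, and the error type are assumptions made for illustration and may not match screenpipe's actual definitions.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical engine enum. Only `WhisperLargeV3TurboQuantized` is named in
/// this PR; the other variant is an assumption added for contrast.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AudioTranscriptionEngine {
    WhisperLargeV3Turbo,
    WhisperLargeV3TurboQuantized,
}

impl FromStr for AudioTranscriptionEngine {
    type Err = String;

    /// Map the CLI string to an engine, rejecting unknown names.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "whisper-large-v3-turbo" => Ok(Self::WhisperLargeV3Turbo),
            "whisper-large-v3-turbo-quantized" => Ok(Self::WhisperLargeV3TurboQuantized),
            other => Err(format!("unknown audio transcription engine: {other}")),
        }
    }
}

impl fmt::Display for AudioTranscriptionEngine {
    /// Print the same string that `from_str` accepts, so the two round-trip.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            Self::WhisperLargeV3Turbo => "whisper-large-v3-turbo",
            Self::WhisperLargeV3TurboQuantized => "whisper-large-v3-turbo-quantized",
        };
        f.write_str(name)
    }
}
```

Keeping `FromStr` and `Display` symmetric means the value shown in logs or help text is the same one a user can pass back on the command line.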

if relevant add screenshots or screen captures to prove that this PR works to save us time (check Cap).

if you are not the author of this PR and you see it and you think it can take more than 30 mins for maintainers to review, we will tip you between $20 and $200 for you to review and test it for us.

vercel · 2025-02-28T03:18:33Z

@EzraEllette is attempting to deploy a commit to the louis030195's projects Team on Vercel.

A member of the Team first needs to authorize it.

louis030195 · 2025-02-28T03:24:31Z

screenpipe-audio/src/whisper/process_chunk.rs

+    // Enable translation.
+    params.set_translate(true);
+    // Set the language to translate to to English.
+    params.set_language(Some("en"));


multi language?

I need to remove this line; the language setting is just below it. This was quick and dirty.

louis030195 · 2025-02-28T03:24:50Z

screenpipe-audio/src/whisper/process_chunk.rs

+    whisper_model.pcm_to_mel(audio, 2)?;
+    let (_, lang_tokens) = whisper_model.lang_detect(0, 4)?;
+    let lang_token = get_lang_token(lang_tokens, languages)?;
+    params.set_language(get_lang_str(lang_token));
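
The bodies of `get_lang_token` and `get_lang_str` are not shown in this thread, so the following is only a sketch of the general idea behind multi-language support: choose the most probable detected language, optionally restricted to the languages the user configured. The `LangCandidate` type and `pick_language` function are hypothetical names introduced here; they do not call whisper-rs and are not the PR's implementation.

```rust
/// Hypothetical detection result: a language code and its probability.
#[derive(Debug, Clone, PartialEq)]
pub struct LangCandidate {
    pub code: &'static str,
    pub probability: f32,
}

/// Return the most probable candidate's code.
/// If `allowed` is non-empty, only candidates listed in it are considered;
/// if nothing matches, the result is `None` and the caller decides the fallback.
pub fn pick_language(candidates: &[LangCandidate], allowed: &[&str]) -> Option<&'static str> {
    candidates
        .iter()
        .filter(|c| allowed.is_empty() || allowed.contains(&c.code))
        .max_by(|a, b| a.probability.total_cmp(&b.probability))
        .map(|c| c.code)
}
```

Returning `Option` rather than defaulting to `"en"` keeps the fallback decision visible at the call site, which is the concern behind the "multi language?" comment above.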


louis030195 · 2025-02-28T03:25:32Z

cool

AntonIXO · 2025-02-28T17:43:52Z

whisper-rs should support Vulkan hardware acceleration? Would you include Vulkan, hipblas and other features?
Waiting for merge!

EzraEllette · 2025-02-28T17:55:37Z

> whisper-rs should support Vulkan hardware acceleration? Would you include Vulkan, hipblas and other features?
>
> Waiting for merge!

Yes, will you collaborate with me to test on Linux?

AntonIXO · 2025-02-28T19:15:37Z

> > whisper-rs should support Vulkan hardware acceleration? Would you include Vulkan, hipblas and other features?
> > Waiting for merge!
>
> Yes, will you collaborate with me to test on Linux?

Of course!
What Linux are you running? Do you use X.org or Wayland, and do all screenpipe features work fine for you?
I am on Arch with KDE on Wayland, and I could only use screen recording (no app recognition) with this patch: #1496
