refactor: optimize audio recording in `record_and_transcribe` #1492

EzraEllette · 2025-02-26T03:29:13Z

#DRAFT

Features

Audio Manager API
- start audio processing pipeline
- stop pipeline
- change settings without restarting the program (WIP)
- start/stop device recording
- more

Device Manager
- track all device connections (WIP)
- manage running states for audio devices


This commit introduces several improvements to the audio transcription and processing modules:
- Restructured whisper transcription code for better modularity
- Extracted language constants to a separate module
- Enhanced language detection and token processing logic
- Simplified audio stream and device handling
- Improved error handling and code readability
- Extracted utility functions for better separation of concerns

This commit introduces significant refactoring of the project's database and type management:
- Created a new `screenpipe-db` module to centralize database-related types
- Removed redundant `db_types.rs` and `db.rs` from `screenpipe-server`
- Added conversion traits between different module types
- Updated imports and references across multiple modules
- Simplified type conversions and reduced code duplication
- Improved overall project structure and modularity
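For reviewers, a minimal sketch of what "conversion traits between different module types" could look like. The type names `DbAudioDevice`, `AudioDevice`, and `DeviceKind` and their fields are hypothetical stand-ins, not the actual types in `screenpipe-db` or `screenpipe-audio`.

```rust
/// Hypothetical database-side record (stand-in for a type owned by a db crate).
#[derive(Debug, Clone, PartialEq)]
pub struct DbAudioDevice {
    pub name: String,
    pub is_input: bool,
}

/// Hypothetical audio-side device kind.
#[derive(Debug, Clone, PartialEq)]
pub enum DeviceKind {
    Input,
    Output,
}

/// Hypothetical audio-side device type.
#[derive(Debug, Clone, PartialEq)]
pub struct AudioDevice {
    pub name: String,
    pub kind: DeviceKind,
}

// Database record -> audio type.
impl From<DbAudioDevice> for AudioDevice {
    fn from(db: DbAudioDevice) -> Self {
        let kind = if db.is_input {
            DeviceKind::Input
        } else {
            DeviceKind::Output
        };
        AudioDevice { name: db.name, kind }
    }
}

// Audio type -> database record.
impl From<AudioDevice> for DbAudioDevice {
    fn from(device: AudioDevice) -> Self {
        DbAudioDevice {
            name: device.name,
            is_input: device.kind == DeviceKind::Input,
        }
    }
}
```

With both `From` impls in place, call sites can use `.into()` instead of hand-written field copying, which is one way to get the reduced duplication the commit describes.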

This commit introduces new management modules in the screenpipe-audio crate:
- Added audio_manager module
- Introduced device_manager module
- Created segmentation_manager module
- Updated core module imports and structure

The changes improve the organization and modularity of audio-related functionality.

This commit introduces several improvements to the audio recording and processing workflow:
- Refactored audio segment collection to use a more memory-efficient approach
- Added dynamic buffer management with overlap handling
- Moved segmentation manager to a dedicated module
- Updated embedding extractor to defer session creation
- Improved error handling in audio stream processing

The changes enhance the robustness and memory efficiency of audio recording and segmentation.
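To illustrate the idea of buffer management with overlap handling, here is a rough sketch under stated assumptions. `OverlapBuffer`, `segment_len`, and `overlap` are hypothetical names chosen for this example; the actual buffer in the PR may be structured quite differently.

```rust
/// Collects incoming samples and emits fixed-size segments. The last
/// `overlap` samples of each emitted segment are kept so that they also
/// start the next segment.
pub struct OverlapBuffer {
    buf: Vec<f32>,
    segment_len: usize,
    overlap: usize,
}

impl OverlapBuffer {
    /// Creates an empty buffer. Panics if `overlap >= segment_len`,
    /// because the buffer could then never make progress.
    pub fn new(segment_len: usize, overlap: usize) -> Self {
        assert!(overlap < segment_len, "overlap must be smaller than segment_len");
        Self {
            buf: Vec::with_capacity(segment_len),
            segment_len,
            overlap,
        }
    }

    /// Appends samples and returns every complete segment now available.
    pub fn push(&mut self, samples: &[f32]) -> Vec<Vec<f32>> {
        self.buf.extend_from_slice(samples);
        let mut segments = Vec::new();
        while self.buf.len() >= self.segment_len {
            segments.push(self.buf[..self.segment_len].to_vec());
            // Drop the emitted samples except for the overlap tail.
            self.buf.drain(..self.segment_len - self.overlap);
        }
        segments
    }

    /// Number of samples currently held back for the next segment.
    pub fn pending(&self) -> usize {
        self.buf.len()
    }
}
```

Because emitted samples are drained on every push, the buffer holds at most one segment plus the latest incoming chunk rather than the whole recording, which is the kind of memory behavior the commit is aiming for.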

This commit introduces several improvements to the speech-to-text and embedding modules:
- Converted STT functions to async to improve concurrency
- Updated embedding extractor to cache ONNX session
- Fixed audio overlap buffer calculation in core module
- Enhanced error handling and async processing in transcription logic
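A minimal sketch of deferring and caching an expensive session, assuming a lazily initialized cell from the standard library. `Session` here is a hypothetical placeholder, not the ONNX runtime type, and `EmbeddingExtractor` is only an illustration of the pattern, not the struct in this PR.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for an expensive inference session.
pub struct Session {
    pub model_path: String,
}

/// Illustrative extractor that builds its session lazily and reuses it.
pub struct EmbeddingExtractor {
    model_path: String,
    session: OnceLock<Session>,
    builds: AtomicUsize, // counts session constructions, for demonstration only
}

impl EmbeddingExtractor {
    /// Cheap constructor: no session is created here.
    pub fn new(model_path: &str) -> Self {
        Self {
            model_path: model_path.to_string(),
            session: OnceLock::new(),
            builds: AtomicUsize::new(0),
        }
    }

    /// Builds the session on first use and returns the cached one afterwards.
    pub fn session(&self) -> &Session {
        self.session.get_or_init(|| {
            self.builds.fetch_add(1, Ordering::SeqCst);
            Session {
                model_path: self.model_path.clone(),
            }
        })
    }

    /// How many times a session has actually been built.
    pub fn builds(&self) -> usize {
        self.builds.load(Ordering::SeqCst)
    }
}
```

Loading a real model is fallible, so real code would need to surface an error from the first call; this sketch ignores that to keep the caching pattern visible.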

This commit introduces several improvements to audio processing modules:
- Converted `pcm_to_mel` and related functions to async for better concurrency
- Enhanced error handling in segment preparation
- Updated thread-based processing to use tokio async tasks
- Improved error propagation in audio segment processing

This commit introduces several improvements to the AudioManager and server integration:
- Refactored AudioManager to use Arc and improve thread safety
- Added device disabling functionality
- Simplified audio device handling in server startup
- Updated CLI and server initialization to work with new AudioManager
- Removed redundant device control mechanisms
- Improved error handling for audio device management
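As a rough illustration of sharing a manager across threads with `Arc`, the sketch below keeps the set of disabled devices behind an `Arc<Mutex<..>>` so every method can take `&self`. The fields and the methods `disable_device`, `is_enabled`, and `disable_from_thread` are assumptions made for this example and are not taken from the PR's `AudioManager`.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};
use std::thread;

/// Illustrative manager handle. Cloning it is cheap and every clone
/// points at the same shared state.
#[derive(Clone, Default)]
pub struct AudioManager {
    disabled: Arc<Mutex<HashSet<String>>>,
}

impl AudioManager {
    pub fn new() -> Self {
        Self::default()
    }

    /// Marks a device as disabled. Returns `false` if it was already disabled.
    pub fn disable_device(&self, name: &str) -> bool {
        self.disabled.lock().unwrap().insert(name.to_string())
    }

    /// Reports whether a device is currently allowed to record.
    pub fn is_enabled(&self, name: &str) -> bool {
        !self.disabled.lock().unwrap().contains(name)
    }
}

/// Disables `name` from another thread through a cloned handle, to show
/// that the change is visible to the original handle as well.
pub fn disable_from_thread(manager: &AudioManager, name: &'static str) -> bool {
    let handle = manager.clone();
    thread::spawn(move || handle.disable_device(name))
        .join()
        .unwrap()
}
```

Taking `&self` everywhere means a server handler and the recording loop can hold the same manager without either needing exclusive access.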

vercel · 2025-02-26T03:29:17Z

@EzraEllette is attempting to deploy a commit to the louis030195's projects Team on Vercel.

A member of the Team first needs to authorize it.

This commit introduces several improvements to audio device handling:
- Updated DeviceManager to dynamically list and start audio devices
- Modified AudioManagerBuilder to automatically select default devices
- Added server endpoint for starting audio devices dynamically
- Improved error handling and device initialization in AudioManager
- Refactored device management to be more flexible and robust

- Updated `stop_device` method in AudioManager to be immutable
- Added new `/audio/device/stop` endpoint for stopping audio recording devices
- Implemented error handling and response formatting for device stop operation
- Added TODO comment for future device start method refactoring

- Implemented `/audio/start` and `/audio/stop` endpoints for global audio processing
- Updated AudioManager to support starting and stopping audio processing
- Added status checks to prevent redundant start/stop operations
- Improved error handling and response formatting for audio processing endpoints
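One way the status checks that prevent redundant start/stop operations could work is sketched below, assuming a single atomic running flag. `Pipeline` and `PipelineError` are hypothetical names used only for this example; the PR's actual state handling may differ.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical errors returned when a request would be redundant.
#[derive(Debug, PartialEq)]
pub enum PipelineError {
    AlreadyRunning,
    AlreadyStopped,
}

/// Illustrative pipeline state guarded by one atomic flag.
#[derive(Default)]
pub struct Pipeline {
    running: AtomicBool,
}

impl Pipeline {
    /// Starts processing, or reports that it is already running.
    pub fn start(&self) -> Result<(), PipelineError> {
        // Atomically flip false -> true; fails if the flag is already true.
        self.running
            .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
            .map(|_| ())
            .map_err(|_| PipelineError::AlreadyRunning)
    }

    /// Stops processing, or reports that it is already stopped.
    pub fn stop(&self) -> Result<(), PipelineError> {
        // Atomically flip true -> false; fails if the flag is already false.
        self.running
            .compare_exchange(true, false, Ordering::SeqCst, Ordering::SeqCst)
            .map(|_| ())
            .map_err(|_| PipelineError::AlreadyStopped)
    }

    pub fn is_running(&self) -> bool {
        self.running.load(Ordering::SeqCst)
    }
}
```

An endpoint handler could then map `AlreadyRunning` or `AlreadyStopped` to an error response, and because the check and the state change happen in one atomic step, two concurrent `/audio/start` requests cannot both succeed.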

EzraEllette added 18 commits February 17, 2025 15:31

initial refactor

2655f6a

refactor: Improve audio file path generation and writing in transcrip…

51dbf77

…tion module

refactor: Replace log with tracing in screenpipe-audio module

b594f09

feat: fix stt example

3328b77

start audio manager

3a23735

move pcm_to_mel to audio crate

8124c32

fix run_record_and_transcribe memory usage

c1559a8

add LAST_AUDIO_CAPTURE to recording pipeline

ba09846

move run_record_and_transcribe to own function

fb3130b

add transcript insertions to screenpipe audio

eea39da

minor updates

c2c90de

EzraEllette added 4 commits February 26, 2025 12:47

add realtime to audio_manager

597abb1
