refactor: optimize audio recording in record_and_transcribe
#1492
Draft
EzraEllette wants to merge 22 commits into mediar-ai:main from EzraEllette:audio-manager0
This commit introduces several improvements to the audio transcription and processing modules:
- Restructured whisper transcription code for better modularity
- Extracted language constants to a separate module
- Enhanced language detection and token processing logic
- Simplified audio stream and device handling
- Improved error handling and code readability
- Extracted utility functions for better separation of concerns

This commit introduces significant refactoring of the project's database and type management:
- Created a new `screenpipe-db` module to centralize database-related types
- Removed redundant `db_types.rs` and `db.rs` from `screenpipe-server`
- Added conversion traits between different module types
- Updated imports and references across multiple modules
- Simplified type conversions and reduced code duplication
- Improved overall project structure and modularity

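The "conversion traits between different module types" item can be pictured with a minimal sketch like the one below. The type names `DbAudioDevice` and `ApiAudioDevice` and their fields are hypothetical and are not taken from the PR; the sketch only shows the general shape of a `From` implementation between a database-layer type and a server-layer type.

```rust
/// Hypothetical database-layer type (illustrative only, not from the PR).
#[derive(Debug, Clone, PartialEq)]
pub struct DbAudioDevice {
    pub name: String,
    pub is_input: bool,
}

/// Hypothetical server-layer type (illustrative only, not from the PR).
#[derive(Debug, Clone, PartialEq)]
pub struct ApiAudioDevice {
    pub name: String,
    pub kind: String,
}

/// Implementing `From` once gives callers `.into()` for free,
/// so the conversion logic lives in a single place.
impl From<DbAudioDevice> for ApiAudioDevice {
    fn from(device: DbAudioDevice) -> Self {
        let kind = if device.is_input { "input" } else { "output" };
        ApiAudioDevice {
            name: device.name,
            kind: kind.to_string(),
        }
    }
}
```

With an implementation like this, a call site can write `let api: ApiAudioDevice = db_device.into();` instead of repeating field-by-field copies.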
This commit introduces new management modules in the screenpipe-audio crate:
- Added audio_manager module
- Introduced device_manager module
- Created segmentation_manager module
- Updated core module imports and structure

The changes improve the organization and modularity of audio-related functionality.

This commit introduces several improvements to the audio recording and processing workflow:
- Refactored audio segment collection to use a more memory-efficient approach
- Added dynamic buffer management with overlap handling
- Moved segmentation manager to a dedicated module
- Updated embedding extractor to defer session creation
- Improved error handling in audio stream processing

The changes enhance the robustness and memory efficiency of audio recording and segmentation.

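As a rough illustration of "buffer management with overlap handling", the sketch below accumulates samples and emits fixed-length chunks, carrying the tail of each chunk over into the next one. The `OverlapBuffer` type, its fields, and the chunk and overlap sizes are assumptions made for this example; the PR's actual buffer logic may differ.

```rust
/// Hypothetical overlap buffer (illustrative only, not the PR's implementation).
pub struct OverlapBuffer {
    samples: Vec<f32>,
    chunk_len: usize,
    overlap: usize,
}

impl OverlapBuffer {
    /// `overlap` must be smaller than `chunk_len`, otherwise the buffer would never drain.
    pub fn new(chunk_len: usize, overlap: usize) -> Self {
        assert!(overlap < chunk_len, "overlap must be smaller than the chunk length");
        Self {
            samples: Vec::with_capacity(chunk_len),
            chunk_len,
            overlap,
        }
    }

    /// Appends incoming samples and returns every complete chunk now available.
    pub fn push(&mut self, incoming: &[f32]) -> Vec<Vec<f32>> {
        self.samples.extend_from_slice(incoming);
        let mut chunks = Vec::new();
        while self.samples.len() >= self.chunk_len {
            chunks.push(self.samples[..self.chunk_len].to_vec());
            // Drop the consumed samples but keep the last `overlap` of them,
            // so the next chunk starts with the end of this one.
            self.samples.drain(..self.chunk_len - self.overlap);
        }
        chunks
    }

    /// Number of samples still waiting for a full chunk.
    pub fn pending(&self) -> usize {
        self.samples.len()
    }
}
```

Because consumed samples are drained as chunks are emitted, memory use stays bounded by roughly one chunk plus whatever the latest `push` delivered.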
This commit introduces several improvements to the speech-to-text and embedding modules:
- Converted STT functions to async to improve concurrency
- Updated embedding extractor to cache ONNX session
- Fixed audio overlap buffer calculation in core module
- Enhanced error handling and async processing in transcription logic

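One common way to defer creating an expensive session and then cache it is a lazily initialized cell. The sketch below uses `std::sync::OnceLock` with a placeholder `Session` struct; it does not use any real ONNX runtime API, and the names `EmbeddingExtractorSketch`, `Session`, and `load_count` are hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Placeholder for an expensive-to-build inference session (not a real ONNX type).
pub struct Session {
    pub model_path: String,
}

/// Hypothetical extractor that builds its session on first use and reuses it afterwards.
pub struct EmbeddingExtractorSketch {
    model_path: String,
    session: OnceLock<Session>,
    loads: AtomicUsize,
}

impl EmbeddingExtractorSketch {
    /// Construction is cheap: no session is created here.
    pub fn new(model_path: &str) -> Self {
        Self {
            model_path: model_path.to_string(),
            session: OnceLock::new(),
            loads: AtomicUsize::new(0),
        }
    }

    /// Returns the cached session, creating it only on the first call.
    fn session(&self) -> &Session {
        self.session.get_or_init(|| {
            self.loads.fetch_add(1, Ordering::SeqCst);
            Session {
                model_path: self.model_path.clone(),
            }
        })
    }

    /// Stand-in for inference: returns the mean of the samples.
    pub fn embed(&self, samples: &[f32]) -> f32 {
        let _session = self.session();
        if samples.is_empty() {
            0.0
        } else {
            samples.iter().sum::<f32>() / samples.len() as f32
        }
    }

    /// How many times the session was actually built.
    pub fn load_count(&self) -> usize {
        self.loads.load(Ordering::SeqCst)
    }
}
```

Calling `embed` repeatedly builds the session once; `load_count` exists only so the example can demonstrate that.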
This commit introduces several improvements to audio processing modules:
- Converted `pcm_to_mel` and related functions to async for better concurrency
- Enhanced error handling in segment preparation
- Updated thread-based processing to use tokio async tasks
- Improved error propagation in audio segment processing

This commit introduces several improvements to the AudioManager and server integration:
- Refactored AudioManager to use Arc and improve thread safety
- Added device disabling functionality
- Simplified audio device handling in server startup
- Updated CLI and server initialization to work with new AudioManager
- Removed redundant device control mechanisms
- Improved error handling for audio device management

This commit introduces several improvements to audio device handling:
- Updated DeviceManager to dynamically list and start audio devices
- Modified AudioManagerBuilder to automatically select default devices
- Added server endpoint for starting audio devices dynamically
- Improved error handling and device initialization in AudioManager
- Refactored device management to be more flexible and robust

- Updated `stop_device` method in AudioManager to be immutable
- Added new `/audio/device/stop` endpoint for stopping audio recording devices
- Implemented error handling and response formatting for device stop operation
- Added TODO comment for future device start method refactoring

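A method that changes state can still take `&self` if the state sits behind interior mutability, which is what lets one manager be shared through an `Arc` across request handlers. The sketch below shows that pattern with a hypothetical `DeviceRegistry`; it is an assumption about the general approach, not the PR's `AudioManager` code, and it leaves out the HTTP endpoint entirely.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

/// Hypothetical registry of running devices (illustrative only, not the PR's AudioManager).
pub struct DeviceRegistry {
    active: Mutex<HashSet<String>>,
}

impl DeviceRegistry {
    /// Builds a registry that is already wrapped in an `Arc` for sharing.
    pub fn shared(devices: &[&str]) -> Arc<Self> {
        let active = devices.iter().map(|d| d.to_string()).collect();
        Arc::new(Self {
            active: Mutex::new(active),
        })
    }

    /// Takes `&self`: the mutation happens behind the mutex, so no `&mut` is needed.
    pub fn stop_device(&self, name: &str) -> Result<(), String> {
        let mut active = self
            .active
            .lock()
            .map_err(|_| "device registry lock poisoned".to_string())?;
        if active.remove(name) {
            Ok(())
        } else {
            Err(format!("device not running: {name}"))
        }
    }

    /// Reports whether the device is currently marked as running.
    pub fn is_active(&self, name: &str) -> bool {
        self.active
            .lock()
            .map(|active| active.contains(name))
            .unwrap_or(false)
    }
}
```

A handler for an endpoint such as `/audio/device/stop` could hold a clone of the `Arc` and map the `Err` string to an error response; that wiring is omitted here.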
- Implemented `/audio/start` and `/audio/stop` endpoints for global audio processing
- Updated AudioManager to support starting and stopping audio processing
- Added status checks to prevent redundant start/stop operations
- Improved error handling and response formatting for audio processing endpoints
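The "status checks to prevent redundant start/stop operations" can be sketched with an atomic flag whose transition succeeds only from the expected state. The `AudioProcessingState` type and `AudioStateError` enum below are hypothetical names chosen for this example, and the sketch covers only the state check, not the endpoints or the audio pipeline.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical error type for redundant requests (illustrative only).
#[derive(Debug, PartialEq)]
pub enum AudioStateError {
    AlreadyRunning,
    AlreadyStopped,
}

/// Hypothetical running/stopped flag guarding global audio processing.
pub struct AudioProcessingState {
    running: AtomicBool,
}

impl AudioProcessingState {
    pub fn new() -> Self {
        Self {
            running: AtomicBool::new(false),
        }
    }

    /// Succeeds only when transitioning from stopped to running.
    pub fn start(&self) -> Result<(), AudioStateError> {
        self.running
            .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
            .map(|_| ())
            .map_err(|_| AudioStateError::AlreadyRunning)
    }

    /// Succeeds only when transitioning from running to stopped.
    pub fn stop(&self) -> Result<(), AudioStateError> {
        self.running
            .compare_exchange(true, false, Ordering::SeqCst, Ordering::SeqCst)
            .map(|_| ())
            .map_err(|_| AudioStateError::AlreadyStopped)
    }

    pub fn is_running(&self) -> bool {
        self.running.load(Ordering::SeqCst)
    }
}
```

Using `compare_exchange` makes the check and the state change a single atomic step, so two concurrent `/audio/start` requests cannot both succeed.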