Input output, playback and recording
At the moment, PCM WAV is the only audio format supported by the NWaves library. To transcode files of any other format (mp3/ogg/flac) to WAV, external tools such as FFmpeg can be used.
For the sake of universality, NWaves does not work with files per se; instead, it provides the special WaveFile class for reading from and writing to general .NET Stream objects. It can be any stream, in particular a FileStream or a MemoryStream. As of ver. 0.9.5, signals can also be read from and written to byte[] arrays.
Example of loading a signal from a wave file:
DiscreteSignal signal;
using (var stream = new FileStream("sample.wav", FileMode.Open))
{
    var waveFile = new WaveFile(stream);
    signal = waveFile[Channels.Left];
}
Note. WaveFile is not intended to be a "wrapper around the stream" or to acquire any resource (thus it does not implement the IDisposable interface, for example, and it does not affect the underlying stream). It is more like a "constructor of signals in memory based on data from the stream", and its lifetime is not tied to the stream in any way. Perhaps a better name for it would be something like WaveContainer. The following code is functionally identical to the previous one and shows that no resource is acquired by WaveFile:
WaveFile waveFile;
using (var stream = new FileStream("sample.wav", FileMode.Open))
{
    waveFile = new WaveFile(stream);
}
var signal = waveFile[Channels.Left];
A WAV container can store several signals (1 - mono, 2 - stereo, etc.).
To address these signals, the Channels enum is used:
// address signals with Channels enum (Left, Right, Average, Sum, Interleave):
var signalLeft = waveFile[Channels.Left];
var signalRight = waveFile[Channels.Right];
var signalSum = waveFile[Channels.Sum];
var signalAverage = waveFile[Channels.Average];
var signalInterleaved = waveFile[Channels.Interleave];
// or simply like this:
signalLeft = waveFile.Signals[0]; // Channels.Left
signalRight = waveFile.Signals[1]; // Channels.Right
var signalThird = waveFile.Signals[2]; // third channel, if it exists
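To make the meaning of the combined channels more concrete, here is a minimal sketch in plain C# (no NWaves types) of how sum, average, and interleaved signals could be derived from two channels of equal length. The class and method names are hypothetical, and the library's own implementation may differ in details.

```csharp
using System;

// Hypothetical helper, not part of NWaves: combines two channels of equal length.
public static class ChannelMixSketch
{
    // Sample-wise sum: result[i] = left[i] + right[i]
    public static float[] Sum(float[] left, float[] right)
    {
        CheckLengths(left, right);
        var result = new float[left.Length];
        for (var i = 0; i < left.Length; i++)
        {
            result[i] = left[i] + right[i];
        }
        return result;
    }

    // Sample-wise average: result[i] = (left[i] + right[i]) / 2
    public static float[] Average(float[] left, float[] right)
    {
        CheckLengths(left, right);
        var result = new float[left.Length];
        for (var i = 0; i < left.Length; i++)
        {
            result[i] = (left[i] + right[i]) / 2;
        }
        return result;
    }

    // Interleaving: L0, R0, L1, R1, ...
    public static float[] Interleave(float[] left, float[] right)
    {
        CheckLengths(left, right);
        var result = new float[left.Length * 2];
        for (var i = 0; i < left.Length; i++)
        {
            result[2 * i] = left[i];
            result[2 * i + 1] = right[i];
        }
        return result;
    }

    private static void CheckLengths(float[] left, float[] right)
    {
        if (left.Length != right.Length)
        {
            throw new ArgumentException("Channels must have the same length.");
        }
    }
}
```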
By default, sample values are normalized to the range [-1, 1], i.e. scaled by the maximum value for the given bit depth. If you need to work with the original sample values, set the normalized parameter of the WaveFile constructor to false:
var waveFile = new WaveFile(stream, false);
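As an illustration of what normalization means, the sketch below maps raw signed 16-bit samples to floats by dividing by 32768 (2^15), a common convention for 16-bit PCM. This is an assumption made for illustration only: the exact scaling used inside NWaves is not stated here, and the helper name is hypothetical.

```csharp
// Hypothetical helper, not part of NWaves.
public static class NormalizationSketch
{
    // Full-scale magnitude of a signed 16-bit sample (2^15); an assumed convention.
    private const float FullScale16Bit = 32768f;

    // Maps raw 16-bit samples onto floats in [-1, 1).
    public static float[] Normalize16Bit(short[] raw)
    {
        var result = new float[raw.Length];
        for (var i = 0; i < raw.Length; i++)
        {
            result[i] = raw[i] / FullScale16Bit;
        }
        return result;
    }
}
```

With this convention the most negative sample (-32768) maps to exactly -1, while the most positive one (32767) maps to a value slightly below 1.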
WaveFile properties:
- SupportedBitDepths
- WaveFmt
- Signals
Supported bit depths are { 8, 16, 24, 32 }.
WaveFmt is the conventional WAV structure containing the following information:
Property | Meaning |
---|---|
AudioFormat | 1 (PCM) |
ChannelCount | 1 - mono, 2 - stereo, etc. |
SamplingRate | Sampling rate (frequency) |
BitsPerSample | Bit depth (8, 16, 24, 32) |
Align | ChannelCount * BitsPerSample / 8 |
ByteRate | SamplingRate * ChannelCount * BitsPerSample / 8 |
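The Align and ByteRate formulas from the table can be checked with a few lines of arithmetic. The sketch below is plain C# with hypothetical names; it only evaluates the formulas and does not use the WaveFmt structure itself.

```csharp
// Hypothetical helper, not part of NWaves: evaluates the table formulas.
public static class WaveFmtSketch
{
    // Bytes per frame (one sample for every channel).
    public static int Align(int channelCount, int bitsPerSample)
    {
        return channelCount * bitsPerSample / 8;
    }

    // Bytes of audio data per second.
    public static int ByteRate(int samplingRate, int channelCount, int bitsPerSample)
    {
        return samplingRate * channelCount * bitsPerSample / 8;
    }
}
```

For example, 16-bit stereo audio sampled at 44100 Hz gives Align = 2 * 16 / 8 = 4 bytes and ByteRate = 44100 * 2 * 16 / 8 = 176400 bytes per second.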
Example of saving one signal to a wave file (mono):
var waveFile = new WaveFile(signal, 24);
using (var stream = new FileStream("saved_mono.wav", FileMode.Create))
{
    waveFile.SaveTo(stream);
}
The second parameter is optional and represents the bit depth, or number of bits per sample. By default it's 16.
Saving two signals to a wave file (stereo):
var waveFile = new WaveFile(new [] { signal1, signal2 });
using (var stream = new FileStream("saved_stereo.wav", FileMode.Create))
{
    waveFile.SaveTo(stream);
}
In NWaves all processing is done on floating-point numbers. However, depending on the particular audio-capturing technology, the audio data may arrive as arrays of bytes. Usually the byte order is little-endian and samples from different channels are interleaved. The static class ByteConverter takes care of all these nuances and provides methods for converting between arrays of bytes and floats:
ToFloats8Bit(bytes, floats, normalize)
FromFloats8Bit(floats, bytes, normalize)
ToFloats16Bit(bytes, floats, normalize, bigEndian)
FromFloats16Bit(floats, bytes, normalize, bigEndian)
By default, float samples are normalized onto [-1, 1].
var sizeInBytes = 8192;
var sizeInFloats = sizeInBytes / sizeof(short); // for Pcm 16bit
byte[] _bytes = new byte[sizeInBytes];
float[][] _data = new float[numChannels][];
for (var i = 0; i < numChannels; i++)
    _data[i] = new float[sizeInFloats];
//...
while (_isRecording)
{
    // read bytes in Pcm16bit format:
    await _recorder.ReadAsync(_bytes, 0, sizeInBytes);
    // convert to float[] arrays (for each channel); normalized, little-endian
    ByteConverter.ToFloats16Bit(_bytes, _data);
    // ... process _data
}
See the code of the Xamarin demo application.
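To show what such a conversion involves, here is a minimal sketch in plain C# that decodes little-endian, interleaved 16-bit PCM bytes into one float array per channel. It is not the ByteConverter implementation: the class and method names are hypothetical, and dividing by 32768 is an assumed normalization convention.

```csharp
using System;

// Hypothetical helper, not part of NWaves.
public static class Pcm16Sketch
{
    // bytes: little-endian 16-bit PCM, channels interleaved (L0 R0 L1 R1 ...).
    // Returns result[channel][frame] with values in [-1, 1).
    public static float[][] ToFloats(byte[] bytes, int channelCount)
    {
        if (channelCount <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(channelCount));
        }

        const int bytesPerSample = sizeof(short);
        var frameCount = bytes.Length / (bytesPerSample * channelCount);

        var channels = new float[channelCount][];
        for (var ch = 0; ch < channelCount; ch++)
        {
            channels[ch] = new float[frameCount];
        }

        var pos = 0;
        for (var frame = 0; frame < frameCount; frame++)
        {
            for (var ch = 0; ch < channelCount; ch++)
            {
                // little-endian: low byte first, then high byte
                var sample = unchecked((short)(bytes[pos] | (bytes[pos + 1] << 8)));
                channels[ch][frame] = sample / 32768f;
                pos += bytesPerSample;
            }
        }

        return channels;
    }
}
```

Any trailing bytes that do not form a complete frame are ignored in this sketch.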
MciAudioPlayer and MciAudioRecorder are the classes responsible for playback and recording. They work only on Windows, since they use winmm.dll and MCI commands.
Only very basic recording is supported. During recording, no audio data can be processed online. After the StopRecording method is called, a new file containing the recorded sound is created:
IAudioRecorder recorder = new MciAudioRecorder();
// ...in some event handler
recorder.StartRecording(16000);
// ...in some event handler
recorder.StopRecording("temp.wav");
IAudioPlayer player = new MciAudioPlayer();
// play entire file
await player.PlayAsync("temp.wav");
// play file from 16000th sample to 32000th sample
await player.PlayAsync("temp.wav", 16000, 32000);
// ...in some event handler
player.Pause();
// ...in some event handler
player.Resume();
// ...in some event handler
player.Stop();
Playing audio from buffers in memory is implied by the design, but it is not implemented in MciAudioPlayer.
// this won't work, unfortunately:
// await player.PlayAsync(signal);
// await player.PlayAsync(signal, 16000, 32000);
However, on Windows a very simple wrapper around System.Media.SoundPlayer can be made: the MemoryStreamPlayer class. This class should implement the same IAudioPlayer interface as MciAudioPlayer. Possible code can look like this:
using System.IO;
using System.Media;
using System.Threading.Tasks;
// plus the NWaves namespaces that define WaveFile, DiscreteSignal and IAudioPlayer

/// <summary>
/// Simple player wrapped around System.Media.SoundPlayer
/// </summary>
public class MemoryStreamPlayer : IAudioPlayer
{
    private SoundPlayer _player;

    public async Task PlayAsync(string location, int startPos = 0, int endPos = -1)
    {
        _player?.Dispose();
        _player = new SoundPlayer(location);
        _player.Play();
    }

    public async Task PlayAsync(DiscreteSignal signal, int startPos = 0, int endPos = -1, short bitDepth = 16)
    {
        var stream = new MemoryStream();
        var wave = new WaveFile(signal, bitDepth);
        wave.SaveTo(stream);
        stream = new MemoryStream(stream.ToArray());
        _player?.Dispose();
        _player = new SoundPlayer(stream);
        _player.Stream.Seek(0, SeekOrigin.Begin);
        _player.Play();
    }

    public void Pause()
    {
        // SoundPlayer has no pause, so playback is simply stopped
        _player?.Stop();
    }

    public void Resume()
    {
        // restarts playback from the beginning
        _player?.Play();
    }

    public void Stop()
    {
        _player?.Stop();
    }

    public float Volume { get; set; }
}
With MemoryStreamPlayer, playing a signal from a memory stream is very easy:
var player = new MemoryStreamPlayer();
await player.PlayAsync(signal);
There is also another possible workaround: the calling code can save the signal to a temporary wave file, and then the player can play this file.
// looks not so cool, but at least it works:
// create temporary file
var filename = string.Format("{0}.wav", Guid.NewGuid());
using (var stream = new FileStream(filename, FileMode.Create))
{
    var waveFile = new WaveFile(signal);
    waveFile.SaveTo(stream);
}
await player.PlayAsync(filename);
// cleanup temporary file
File.Delete(filename);
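If playback throws, the temporary file in the snippet above is never deleted. One possible way to make the cleanup more robust is to wrap the write-use-delete sequence in a small helper with a finally block. The sketch below is plain C#; the helper name and its parameters are hypothetical and not part of NWaves.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper, not part of NWaves.
public static class TempFileSketch
{
    // Creates a uniquely named file in the system temp folder, lets 'write' fill it,
    // passes its path to 'use', and deletes the file even if 'use' throws.
    public static async Task UseTempFileAsync(
        string extension, Action<Stream> write, Func<string, Task> use)
    {
        var filename = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}{extension}");
        try
        {
            using (var stream = new FileStream(filename, FileMode.Create))
            {
                write(stream);
            }
            await use(filename);
        }
        finally
        {
            File.Delete(filename);
        }
    }
}

// Possible usage with the objects from the snippet above:
// await TempFileSketch.UseTempFileAsync(
//     ".wav",
//     stream => new WaveFile(signal).SaveTo(stream),
//     filename => player.PlayAsync(filename));
```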