
Input output, playback and recording

Tim Sharii edited this page Aug 31, 2021 · 5 revisions

At the moment, PCM WAV is the only audio format supported by the NWaves library. Files in other formats (mp3/ogg/flac) can be transcoded to WAV with external tools such as FFmpeg.
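As a rough illustration of the transcoding step, the sketch below launches FFmpeg from C# before the file is handed to NWaves. It assumes an `ffmpeg` executable is available on PATH; the helper names `BuildFfmpegArgs` and `Transcode` are hypothetical and not part of NWaves.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: FFmpeg arguments for converting any input file to 16-bit PCM WAV.
// -y overwrites the output file; -acodec pcm_s16le selects signed 16-bit little-endian PCM.
static string BuildFfmpegArgs(string input, string output) =>
    $"-y -i \"{input}\" -acodec pcm_s16le \"{output}\"";

// Hypothetical helper: run FFmpeg and return its exit code (0 means success).
// Assumes the "ffmpeg" executable can be found on PATH.
static int Transcode(string input, string output)
{
    var startInfo = new ProcessStartInfo("ffmpeg", BuildFfmpegArgs(input, output))
    {
        UseShellExecute = false,
        CreateNoWindow = true
    };

    using var process = Process.Start(startInfo)
        ?? throw new InvalidOperationException("Could not start ffmpeg");

    process.WaitForExit();
    return process.ExitCode;
}

if (args.Length == 2)
{
    // usage: <program> input.mp3 output.wav
    Environment.ExitCode = Transcode(args[0], args[1]);
}
else
{
    // without arguments, just show the command line that would be used
    Console.WriteLine("ffmpeg " + BuildFfmpegArgs("song.mp3", "song.wav"));
}
```

The resulting WAV file can then be opened with a FileStream and passed to WaveFile as usual.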

For the sake of generality, NWaves does not work with files directly. Instead, it provides the WaveFile class for reading from and writing to general .NET Stream objects. Any stream will do, in particular FileStream or MemoryStream. As of version 0.9.5, signals can also be read from and written to byte[].

Example of loading a signal from a wave file:

DiscreteSignal signal;

using (var stream = new FileStream("sample.wav", FileMode.Open))
{
    var waveFile = new WaveFile(stream);
    signal = waveFile[Channels.Left];
}

Note. WaveFile is not intended to be a "wrapper around the stream", and it does not acquire any resources (so it does not implement the IDisposable interface, and it does not affect the underlying stream). It is better thought of as a "constructor of in-memory signals from the data in the stream", and its lifetime is not tied to the stream in any way. Perhaps a better name for it would be WaveContainer. The following code is functionally equivalent to the previous example and shows that WaveFile acquires no resource:

WaveFile waveFile;

using (var stream = new FileStream("sample.wav", FileMode.Open))
{
    waveFile = new WaveFile(stream);
}

var signal = waveFile[Channels.Left];

A WAV container can store several signals (1 - mono, 2 - stereo, etc.).

The Channels enum is used to address these signals:

// address signals with Channels enum (Left, Right, Average, Sum, Interleave):

var signalLeft = waveFile[Channels.Left];
var signalRight = waveFile[Channels.Right];
var signalSum = waveFile[Channels.Sum];
var signalAverage = waveFile[Channels.Average];
var signalInterleaved = waveFile[Channels.Interleave];
	
// or simply like this:

signalLeft = waveFile.Signals[0];   // Channels.Left
signalRight = waveFile.Signals[1];  // Channels.Right
var signalThird = waveFile.Signals[2];  // third channel, if it exists

By default, sample values are normalized by the maximum value for the given bit depth onto the range [-1, 1]. To work with the original sample values, set the normalized parameter of the WaveFile constructor to false:

var waveFile = new WaveFile(stream, false);
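To make the effect of normalization concrete, here is a minimal sketch of the arithmetic. It assumes that a signed sample is divided by 2^(bitDepth - 1); this is an assumption used for illustration, not a quote from the NWaves source, and it covers only the signed bit depths (8-bit WAV samples are stored unsigned). The helper `Normalize` is hypothetical.

```csharp
using System;

// Hypothetical helper (not part of NWaves): scale a signed integer sample
// of the given bit depth by 2^(bitDepth - 1).
static float Normalize(int sample, int bitDepth) =>
    sample / (float)(1L << (bitDepth - 1));

Console.WriteLine(Normalize(16384, 16));    // half of full scale
Console.WriteLine(Normalize(-32768, 16));   // most negative 16-bit value maps to -1
Console.WriteLine(Normalize(32767, 16));    // most positive 16-bit value, slightly below 1
```

With normalized set to false, the values stay in the integer range of the bit depth (for example, -32768 to 32767 for 16 bits).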

WaveFile properties:

  • SupportedBitDepths
  • WaveFmt
  • Signals

Supported bit depths are { 8, 16, 24, 32 }.

WaveFmt is the conventional WAV-structure containing the following information:

Property        Meaning
AudioFormat     1 (PCM)
ChannelCount    1 - mono, 2 - stereo, etc.
SamplingRate    Sampling rate (frequency)
BitsPerSample   Bit depth (8, 16, 24, 32)
Align           ChannelCount * BitsPerSample / 8
ByteRate        SamplingRate * ChannelCount * BitsPerSample / 8
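The two derived fields follow directly from the formulas in the table. The worked example below uses illustrative values (CD-quality stereo) rather than values read from a real file, and the variable names are chosen for this sketch only.

```csharp
using System;

// Illustrative values: CD-quality stereo
int channelCount = 2;
int samplingRate = 44100;
int bitsPerSample = 16;

int align = channelCount * bitsPerSample / 8;                    // bytes per sample frame
int byteRate = samplingRate * channelCount * bitsPerSample / 8;  // bytes per second

Console.WriteLine($"Align = {align}, ByteRate = {byteRate}");    // Align = 4, ByteRate = 176400

// The duration of the audio data follows from ByteRate
int dataSizeInBytes = 1_764_000;
double seconds = (double)dataSizeInBytes / byteRate;

Console.WriteLine($"Duration = {seconds} s");                    // Duration = 10 s
```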

Example of saving one signal to a wave file (mono):

var waveFile = new WaveFile(signal, 24);

using (var stream = new FileStream("saved_mono.wav", FileMode.Create))
{
    waveFile.SaveTo(stream);
}

The second parameter is optional and specifies the bit depth (the number of bits per sample). The default is 16.

Saving two signals to wave file (stereo):

var waveFile = new WaveFile(new [] { signal1, signal2 });

using (var stream = new FileStream("saved_stereo.wav", FileMode.Create))
{
    waveFile.SaveTo(stream);
}

Converting bytes

In NWaves, all processing is done on floating-point numbers. However, depending on the audio-capturing technology, audio data may arrive as arrays of bytes. Usually the byte order is little-endian and samples from different channels are interleaved. The static class ByteConverter takes care of these nuances and provides methods for converting between arrays of bytes and floats:

  • ToFloats8Bit(bytes, floats, normalize)
  • FromFloats8Bit(floats, bytes, normalize)
  • ToFloats16Bit(bytes, floats, normalize, bigEndian)
  • FromFloats16Bit(floats, bytes, normalize, bigEndian)

By default, float samples are normalized onto [-1, 1].

var sizeInBytes = 8192;
var sizeInFloats = sizeInBytes / sizeof(short);  // for Pcm 16bit

byte[] _bytes = new byte[sizeInBytes];
float[][] _data = new float[numChannels][];
for (var i = 0; i < numChannels; i++)
    _data[i] = new float[sizeInFloats];

//...

while (_isRecording)
{
    // read bytes in Pcm16bit format:
    await _recorder.ReadAsync(_bytes, 0, sizeInBytes);

    // convert to float[] arrays (for each channel); normalized, little-endian
    ByteConverter.ToFloats16Bit(_bytes, _data);

    // ... process _data
}

See the code of the Xamarin demo application.
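For readers who want to see what such a conversion involves, the following is an illustrative re-implementation of the 16-bit, little-endian, interleaved case. It is a sketch written for this page, not the NWaves source; the name `ToFloats16BitSketch` is hypothetical, and the scale factor 32768 is an assumption about how normalization onto [-1, 1] is done.

```csharp
using System;

// Illustrative re-implementation (not the NWaves source): de-interleave
// little-endian 16-bit PCM bytes into one float array per channel.
static void ToFloats16BitSketch(byte[] bytes, float[][] channels, bool normalize = true)
{
    int channelCount = channels.Length;
    int frameCount = bytes.Length / (2 * channelCount);

    for (int frame = 0; frame < frameCount; frame++)
    {
        for (int ch = 0; ch < channelCount; ch++)
        {
            int offset = 2 * (frame * channelCount + ch);

            // low byte first, then high byte (little-endian)
            short sample = (short)(bytes[offset] | (bytes[offset + 1] << 8));

            // assumed scale factor: 2^15
            channels[ch][frame] = normalize ? sample / 32768f : sample;
        }
    }
}

// Two stereo frames: (L = 16384, R = -16384), (L = 0, R = 32767)
byte[] pcm = { 0x00, 0x40, 0x00, 0xC0, 0x00, 0x00, 0xFF, 0x7F };
float[][] data = { new float[2], new float[2] };

ToFloats16BitSketch(pcm, data);

Console.WriteLine(string.Join("; ", data[0]));   // left channel
Console.WriteLine(string.Join("; ", data[1]));   // right channel
```

Here the left channel starts with 0.5 and the right channel with -0.5, since 16384 is half of the assumed full-scale value.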

Playing and recording (Windows only)

MciAudioPlayer and MciAudioRecorder are classes responsible for recording and playback, and they work only on Windows, since they use winmm.dll and MCI commands.

Recording

Only very basic recording is supported. Audio data cannot be processed in real time during recording. After the StopRecording method is called, a new file containing the recorded sound is created:

IAudioRecorder recorder = new MciAudioRecorder();

// ...in some event handler
recorder.StartRecording(16000);

// ...in some event handler
recorder.StopRecording("temp.wav");

Playback

IAudioPlayer player = new MciAudioPlayer();

// play entire file
await player.PlayAsync("temp.wav");

// play file from 16000th sample to 32000th sample
await player.PlayAsync("temp.wav", 16000, 32000);


// ...in some event handler
player.Pause();

// ...in some event handler
player.Resume();

// ...in some event handler
player.Stop();

Playing audio from in-memory buffers is provided for in the design, but it is not implemented in MciAudioPlayer.

// this won't work, unfortunately:

// await player.PlayAsync(signal);
// await player.PlayAsync(signal, 16000, 32000);

However, on Windows a very simple wrapper around System.Media.SoundPlayer can be written: the MemoryStreamPlayer class. It should implement the same IAudioPlayer interface as MciAudioPlayer. A possible implementation looks like this:

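using System.IO;
using System.Media;
using System.Threading.Tasks;
using NWaves.Audio;
using NWaves.Audio.Interfaces;
using NWaves.Signals;

/// <summary>
/// Simple player wrapped around System.Media.SoundPlayer
/// </summary>
public class MemoryStreamPlayer : IAudioPlayer
{
    private SoundPlayer _player;

    // Note: startPos and endPos are ignored; SoundPlayer always plays the whole sound.
    // SoundPlayer.Play() starts playback on another thread and returns immediately,
    // so the returned task does not wait for the sound to finish.
    public Task PlayAsync(string location, int startPos = 0, int endPos = -1)
    {
        _player?.Dispose();
        _player = new SoundPlayer(location);
        _player.Play();
        return Task.CompletedTask;
    }

    public Task PlayAsync(DiscreteSignal signal, int startPos = 0, int endPos = -1, short bitDepth = 16)
    {
        var stream = new MemoryStream();
        var wave = new WaveFile(signal, bitDepth);
        wave.SaveTo(stream);

        // copy the WAV bytes into a fresh stream positioned at the beginning
        stream = new MemoryStream(stream.ToArray());

        _player?.Dispose();
        _player = new SoundPlayer(stream);
        _player.Play();
        return Task.CompletedTask;
    }

    // SoundPlayer cannot pause, so Pause() simply stops playback...
    public void Pause()
    {
        _player?.Stop();
    }

    // ...and Resume() starts again from the beginning
    public void Resume()
    {
        _player?.Play();
    }

    public void Stop()
    {
        _player?.Stop();
    }

    // required by IAudioPlayer; not used, since SoundPlayer has no volume control
    public float Volume { get; set; }
}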

With MemoryStreamPlayer, playing a signal from a memory stream is very easy:

var player = new MemoryStreamPlayer();
await player.PlayAsync(signal);

There is another possible workaround: the calling code can save the signal to a temporary wave file and then have the player play that file.

// not very elegant, but at least it works:

// create temporary file
var filename = string.Format("{0}.wav", Guid.NewGuid());
using (var stream = new FileStream(filename, FileMode.Create))
{
    var waveFile = new WaveFile(signal);
    waveFile.SaveTo(stream);
}

await player.PlayAsync(filename);

// cleanup temporary file
File.Delete(filename);
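One weakness of the temporary-file workaround is that the file is left behind if playback throws. A possible refinement is sketched below: the file is created in the system temp directory and removed in a finally block. The helper `WithTempWavAsync` is hypothetical, and the two callbacks stand in for the NWaves calls (`waveFile.SaveTo(stream)` and `player.PlayAsync(path)`) so that the sketch runs on its own.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper: write data to a uniquely named temporary file, hand the
// path to a callback, and delete the file even if the callback throws.
static async Task WithTempWavAsync(Action<Stream> save, Func<string, Task> play)
{
    var filename = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.wav");

    try
    {
        using (var stream = new FileStream(filename, FileMode.Create))
        {
            save(stream);      // e.g. stream => waveFile.SaveTo(stream)
        }

        await play(filename);  // e.g. path => player.PlayAsync(path)
    }
    finally
    {
        File.Delete(filename); // does nothing if the file was never created
    }
}

// Stand-in callbacks so that the sketch runs without NWaves or audio hardware
var seenPath = "";

await WithTempWavAsync(
    stream => stream.WriteByte(0),
    path => { seenPath = path; return Task.CompletedTask; });

Console.WriteLine(File.Exists(seenPath));   // False: the temporary file has been removed
```

Note that the cleanup is only safe if the play callback completes after playback has finished; a player that returns immediately would have its file deleted while still in use.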