
WAV-file output


Overview

WAV is a very popular format for storing uncompressed audio in a file. Currently, WaveBeans supports only single-channel (mono) files. The sampling rate and bit depth can vary. The output can store only a stream of samples, BeanStream<Sample>, or a buffered stream of samples, BeanStream<SampleVector>.

Writing to a wav-file is a two-step process:

  1. While the stream is being processed, the samples are stored in a temporary file. That means you can run the stream for as long as you need without defining the length of the output beforehand.
  2. The final wav-file is formed when you close the output. It is very important to remember this: always close the output before looking for the file in the file system.
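
One likely reason for the two steps is that a WAV header stores the size of the audio payload, which is unknown while the stream is still running. The following is a minimal sketch of that idea using only the Kotlin standard library. The helpers `wavHeader` and `finalizeWav` are hypothetical names for illustration and assume the canonical 44-byte PCM header; this is not the WaveBeans implementation.

```kotlin
import java.io.File
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper: builds a canonical 44-byte PCM WAV header.
// The payload size it stores is only known once the stream has ended.
fun wavHeader(dataSize: Int, sampleRate: Int, bitDepth: Int, channels: Int = 1): ByteArray {
    val blockAlign = channels * bitDepth / 8
    return ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN)
        .put("RIFF".toByteArray(Charsets.US_ASCII))
        .putInt(36 + dataSize)               // size of everything after this field
        .put("WAVE".toByteArray(Charsets.US_ASCII))
        .put("fmt ".toByteArray(Charsets.US_ASCII))
        .putInt(16)                          // size of the fmt chunk
        .putShort(1.toShort())               // audio format: 1 = PCM
        .putShort(channels.toShort())
        .putInt(sampleRate)
        .putInt(sampleRate * blockAlign)     // byte rate
        .putShort(blockAlign.toShort())
        .putShort(bitDepth.toShort())
        .put("data".toByteArray(Charsets.US_ASCII))
        .putInt(dataSize)
        .array()
}

// Hypothetical helper for step 2: prepend the header to the raw samples on close.
fun finalizeWav(tempPcm: File, target: File, sampleRate: Int, bitDepth: Int) {
    target.outputStream().use { out ->
        out.write(wavHeader(tempPcm.length().toInt(), sampleRate, bitDepth))
        tempPcm.inputStream().use { it.copyTo(out) }
    }
    tempPcm.delete()
}

fun main() {
    // Step 1: append raw 16-bit samples to a temporary file while the stream runs.
    val temp = File.createTempFile("samples", ".pcm")
    temp.outputStream().use { out -> repeat(1000) { out.write(byteArrayOf(0, 0)) } }

    // Step 2: only now is the length known, so the final file can be formed.
    val target = File.createTempFile("result", ".wav")
    finalizeWav(temp, target, sampleRate = 44100, bitDepth = 16)
    println(target.length()) // 2044 = 44-byte header + 2000 bytes of samples
}
```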

To store the stream into a wav-file, call one of the following functions; each defines specific parameters of the container:

  1. Mono 8 bit – toMono8bitWav("file:///path/to/file.wav")
  2. Mono 16 bit – toMono16bitWav("file:///path/to/file.wav")
  3. Mono 24 bit – toMono24bitWav("file:///path/to/file.wav")
  4. Mono 32 bit – toMono32bitWav("file:///path/to/file.wav")

For example, to store a 440 Hz sine trimmed to 1000 ms as a 16-bit file:

440.sine()
    .trim(1000)
    .toMono16bitWav("file:///path/to/file.wav")

Note: Don’t forget to follow general rules to execute the stream.

Controlling output

Parts of the same stream can be stored into different files if needed. Examples of such cases: you want to cut the signal into, say, equal-sized parts, or detect silence and store the samples into multiple files, removing the silence on the way.

To do that, wrap the Sample or SampleVector in the Managed class and send the flush signal whenever the time is right. For convenience, you may use the function io.wavebeans.lib.io.AbstractWriterKt.withOutputSignal on any non-nullable type, though sometimes the compiler can’t infer the types and you need to specify them explicitly:

// from Sample type and no arguments for `NoopOutputSignal` signal
sample.withOutputSignal<Sample, Unit>(NoopOutputSignal)

// from Int type and a new instance of custom type `ArgumentType` as the argument for `FlushOutputSignal` signal
int.withOutputSignal<Int, ArgumentType>(FlushOutputSignal, ArgumentType("some-value"))

To output a Managed stream into a wav-file, call one of the wav output functions (see above), specifying a suffix function that translates the argument into a string:

managedStream.toMono8bitWav("file:///path/to/file.wav") { argument -> "-${format(argument)}" } 
managedStream.toMono16bitWav("file:///path/to/file.wav") { argument -> "-${format(argument)}" } 
managedStream.toMono24bitWav("file:///path/to/file.wav") { argument -> "-${format(argument)}" } 
managedStream.toMono32bitWav("file:///path/to/file.wav") { argument -> "-${format(argument)}" } 

The argument is provided at the moment the signal is fired.

Note: it is recommended to apply buffering to avoid excessive object creation for each sample, even small buffers help.

Noop signal

Once the stream becomes “managed”, a signal must be specified with every element, so even if nothing needs to be performed you are still forced to specify something. io.wavebeans.lib.io.AbstractWriterKt.NoopOutputSignal is not handled by the output and is completely ignored; use it whenever you don’t want to affect the stream.

Flush signal

The flush signal io.wavebeans.lib.io.AbstractWriterKt.FlushOutputSignal tells the output to immediately flush and close the current buffer and start a new one. The file name is augmented with a suffix: file:///path/to.wav becomes file:///path/to${suffix}.wav, where the suffix is generated by the provided function.
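
To make the naming rule concrete, here is a small sketch of how a suffix could be spliced in before the file extension. The function `withSuffix` is a hypothetical helper written for illustration; WaveBeans performs this step internally and its actual code may differ.

```kotlin
// Hypothetical helper for illustration only, not part of the WaveBeans API.
fun withSuffix(uri: String, suffix: String): String {
    val dot = uri.lastIndexOf('.')
    val slash = uri.lastIndexOf('/')
    // Treat the dot as an extension separator only if it belongs to the file name.
    return if (dot > slash) uri.substring(0, dot) + suffix + uri.substring(dot)
    else uri + suffix
}

fun main() {
    println(withSuffix("file:///path/to.wav", "-0"))    // file:///path/to-0.wav
    println(withSuffix("file:///path/to.wav", "-1001")) // file:///path/to-1001.wav
}
```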

Here is an example that cuts the signal into ~1 sec pieces. The idea is to mark every sample with its time marker, group together all samples within a ~2 ms time range to make sure the time markers are unique, and flush every time the ~2 ms time range crosses a 1 sec notch:

val timeStreamMs = input { (i, sampleRate) -> i / (sampleRate / 1000.0).toLong() }
440.sine()
        .merge(timeStreamMs) { (signal, time) ->
            checkNotNull(signal)
            checkNotNull(time)
            signal to time
        }
        // ~2ms windows within desired sample rate 44100Hz
        .window(89) { ZeroSample to 0 }
        .map { window ->
            val samples = window.elements.map { it.first }
            val timeMarker = window.elements.first().second
            sampleVectorOf(samples).withOutputSignal(
                    if (timeMarker > 0 // ignore the first marker to avoid flushing empty file
                            && timeMarker % 1000 < 2 // target every second notch with 2 ms precision
                    ) FlushOutputSignal else NoopOutputSignal,
                    ZonedDateTime.now() to timeMarker
            )
        }
        .trim(20, TimeUnit.SECONDS)
        .toMono16bitWav("file:///home/user/sine.wav") { a ->
            val dtf = DateTimeFormatter.ofPattern("yyyy-MM-dd-HH-mm-ss-SSS")
            "-${dtf.format(a?.first ?: ZonedDateTime.now())}-${a?.second ?: 0}"
        }

If you run the example above, you’ll get something like this in your output directory:

$ la    
  total 3536
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-18-805-0.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-213-1001.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-254-2000.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-337-3001.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-385-4000.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-411-5000.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-438-6001.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-463-7000.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-488-8001.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-517-9001.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-541-10000.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-561-11001.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-578-12000.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-596-13000.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-610-14001.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-623-15000.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-637-16001.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-652-17001.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-670-18000.wav
  -rw-r--r--  1 user  staff    86K Oct 20 17:06 sine-2020-10-20-17-06-19-684-19001.wav
  -rw-r--r--  1 user  staff   4.0K Oct 20 17:06 sine-2020-10-20-17-06-19-697-20000.wav
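
The numeric suffixes in the listing (1001, 2000, 3001, ...) follow from the integer arithmetic in the example. The sketch below replays only that arithmetic, without WaveBeans: it assumes 20 seconds of samples at 44100 Hz, windows of 89 samples, and a time marker equal to the index of the first sample in the window divided by 44. The function name `flushMarkers` is made up for this illustration.

```kotlin
// Replays the time-marker arithmetic of the example above (illustrative sketch).
fun flushMarkers(sampleRate: Int = 44100, windowSize: Int = 89, seconds: Int = 20): List<Long> {
    val samplesPerMs = (sampleRate / 1000.0).toLong() // 44 for 44100 Hz
    val totalSamples = sampleRate.toLong() * seconds
    val markers = mutableListOf<Long>()
    var i = 0L                                        // index of the first sample in the window
    while (i < totalSamples) {
        val timeMarker = i / samplesPerMs
        // Same condition as in the example: skip marker 0, hit every 1 sec notch within 2 ms.
        if (timeMarker > 0 && timeMarker % 1000 < 2) markers.add(timeMarker)
        i += windowSize
    }
    return markers
}

fun main() {
    val markers = flushMarkers()
    println(markers.size)    // 20
    println(markers.take(5)) // [1001, 2000, 3001, 4000, 5000]
}
```

Because a window advances the marker by 2 or 3 ms, a notch is sometimes hit at exactly N000 and sometimes at N001, which is why both appear in the file names.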

Open and close gate signals

The gate lets you define whether the output should be stored or ignored. While the gate is open, all incoming samples are stored in the buffer. When the gate is closed, the current buffer (if it is not empty) is flushed to disk, and subsequent samples are ignored until the next open gate signal is emitted. When the output is created, the gate is already open.

To open the gate, send io.wavebeans.lib.io.AbstractWriterKt.OpenGateOutputSignal; to close it, send io.wavebeans.lib.io.AbstractWriterKt.CloseGateOutputSignal. Repeating the same signal has no effect: if the gate is already open, an open gate signal leaves it open and is completely ignored, so even monitoring metrics are not affected. The same is true for the close gate signal.
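
The gate rules can be pictured as a tiny state machine. The sketch below is a simplified model, not the WaveBeans writer: `Signal` and `GateSketch` are hypothetical names, flushed files are modelled as in-memory lists, and it assumes that the sample carrying a close gate signal is dropped together with the samples that follow it.

```kotlin
enum class Signal { NOOP, OPEN_GATE, CLOSE_GATE }

// Simplified, hypothetical model of the gate behaviour described above.
class GateSketch {
    private var open = true                      // the gate starts opened
    private val buffer = mutableListOf<Double>()
    val files = mutableListOf<List<Double>>()    // stands in for files flushed to disk

    fun accept(sample: Double, signal: Signal) {
        when (signal) {
            Signal.OPEN_GATE -> open = true      // no effect if already opened
            Signal.CLOSE_GATE -> {
                if (open && buffer.isNotEmpty()) flush()
                open = false                     // no effect if already closed
            }
            Signal.NOOP -> {}
        }
        // Assumption: a sample arriving while the gate is closed is ignored.
        if (open) buffer.add(sample)
    }

    // Closing the output flushes whatever is left in the buffer.
    fun close() {
        if (buffer.isNotEmpty()) flush()
    }

    private fun flush() {
        files.add(buffer.toList())
        buffer.clear()
    }
}

fun main() {
    val gate = GateSketch()
    gate.accept(0.5, Signal.NOOP)
    gate.accept(0.6, Signal.OPEN_GATE)  // already opened: nothing changes
    gate.accept(0.0, Signal.CLOSE_GATE) // flushes [0.5, 0.6]
    gate.accept(0.0, Signal.CLOSE_GATE) // already closed: nothing changes
    gate.accept(0.7, Signal.OPEN_GATE)  // reopens the gate, 0.7 is stored
    gate.close()
    println(gate.files) // [[0.5, 0.6], [0.7]]
}
```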

As an example application, assume we have a stream of signals separated by silence, and the signal overall is noisy. The task is to split the signal into separate files, removing the silence. The idea is to chunk the signal and check whether each chunk represents silence. While we see silence we send CloseGateOutputSignal, the first of which flushes the output; when the signal gets back to normal we send OpenGateOutputSignal, which opens the gate and starts filling the buffer again.

val noise = input { sampleOf(Random.nextInt()) }
val silence = (noise * 0.1).trim(100)
val sample1 = (440.sine() + noise * 0.01).trim(500)
val sample2 = (220.sine() + noise * 0.01).trim(500)
val sample3 = (880.sine() + noise * 0.01).trim(500)

(sample1..silence..sample2..silence..sample2..sample3) // represents the signal
        .window(20)
        .map {
            val noiseLevel = 0.11 
            val signal =
                    if (
                        it.elements.map(::abs).average() // The "level" for the chunk as an average function 
                                                         // of all absolute values of the waveform
                             < noiseLevel                // should be less than defined noise level
                                                         // to call it "silence".
                    ) {
                        CloseGateOutputSignal
                    } else {
                        OpenGateOutputSignal
                    }

            sampleVectorOf(it).withOutputSignal(signal, ZonedDateTime.now())
        }
        .toMono16bitWav("file:///home/user/sine.wav") { a ->
            val dtf = DateTimeFormatter.ofPattern("yyyy-MM-dd-HH-mm-ss-SSS")
            "-${dtf.format(a ?: ZonedDateTime.now())}-${Random.nextInt(Int.MAX_VALUE).toString(36)}"
        }

This results in something like the following, where each file contains a specific piece of the input signal:

$ la 
total 352
-rw-r--r--  1 user  staff    43K Oct 27 16:35 sine-2020-10-27-16-35-16-216-wkimkn.wav
-rw-r--r--  1 user  staff    43K Oct 27 16:35 sine-2020-10-27-16-35-16-483-l0rgtv.wav
-rw-r--r--  1 user  staff    86K Oct 27 16:35 sine-2020-10-27-16-35-16-572-h9d60l.wav
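
The threshold of 0.11 works because the two kinds of chunks have clearly different levels. The standalone sketch below checks this with plain Kotlin, without WaveBeans. It assumes the noise is uniformly distributed over [-1.0, 1.0) and a sample rate of 44100 Hz; `level` is a hypothetical helper that mirrors the average-of-absolute-values measure from the example.

```kotlin
import kotlin.math.PI
import kotlin.math.abs
import kotlin.math.sin
import kotlin.random.Random

// Mean absolute value of a chunk, the same "level" measure as in the example above.
fun level(chunk: List<Double>): Double = chunk.map { abs(it) }.average()

fun main() {
    val rnd = Random(42)
    val sampleRate = 44100
    val noiseLevel = 0.11

    // Assumption: noise is uniform over [-1.0, 1.0), so "silence" never exceeds 0.1.
    val silence = List(sampleRate) { rnd.nextDouble(-1.0, 1.0) * 0.1 }
    // A 440 Hz tone with a little noise on top, as in sample1.
    val tone = List(sampleRate) { i ->
        sin(2 * PI * 440 * i / sampleRate) + rnd.nextDouble(-1.0, 1.0) * 0.01
    }

    // Every 20-sample chunk of silence stays below the threshold...
    println(silence.chunked(20).all { level(it) < noiseLevel }) // true
    // ...and even the quietest 20-sample chunk of the tone stays above it.
    println(tone.chunked(20).minOf { level(it) } > noiseLevel)  // true
}
```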

Close output signal

The close output signal io.wavebeans.lib.io.AbstractWriterKt.CloseOutputSignal allows you to end the stream even if the actual stream is not over. Technically, it forces the writer to tell the executor that it has finished. When the CloseOutputSignal is handled, the current buffer, including the current sample, is also flushed, respecting the gate.

For example, this lets you embed a control sequence in the input signal and act on it at the output, i.e. close the stream as soon as a certain sample sequence arrives. In the code below, every window is checked for the endSequence; if it is found, the function returns the wave before the endSequence along with CloseOutputSignal:

val endSequence = listOf(1, 2, 1, 3, 1, 4, 1, 5, 1, 6).map { sampleOf(it * 1000) }
val endSignal = endSequence.input()
val signal = 440.sine().trim(1000)
val noise = input { sampleOf(Random.nextInt()) }

class SequenceDetectFn(initParameters: FnInitParameters) : Fn<Window<Sample>, Managed<OutputSignal, Unit, SampleVector>>(initParameters) {

    constructor(endSequence: List<Sample>) : this(FnInitParameters().addDoubles("endSequence", endSequence))

    override fun apply(argument: Window<Sample>): Managed<OutputSignal, Unit, SampleVector> {
        val es = initParams.doubles("endSequence")
        val ei = argument.elements.iterator()
        var ai = es.iterator()
        var startedAt = -1
        var i = 0
        while (ei.hasNext() && ai.hasNext()) {
            val e = ei.next()
            val a = ai.next()
            if (a != e) {
                ai = es.iterator()
                startedAt = -1
            } else if (startedAt == -1) {
                startedAt = i
            }
            i++
        }

        if (ai.hasNext()) startedAt = -1

        return if (startedAt == -1) {
            sampleVectorOf(argument).withOutputSignal(NoopOutputSignal)
        } else {
            sampleVectorOf(argument.elements.subList(0, startedAt)).withOutputSignal(CloseOutputSignal)
        }
    }
}

(signal..endSignal..noise)
        .window(endSequence.size * 10)
        .map(SequenceDetectFn(endSequence))
        .toMono16bitWav("file:///home/user/sine.wav") { "-${Random.nextInt().toString(36)}" }

The code above generates only one file, which contains the signal. One thing to point out: the initial stream does not have to end. Here the trailing noise is an infinite stream, yet the output still works as expected.
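
The search inside a window can also be expressed with the JDK's `Collections.indexOfSubList`. The sketch below swaps that call in for the hand-written loop and works on plain lists of doubles rather than WaveBeans types; `cutBeforeEndSequence` is a hypothetical helper name. Unlike a scan that restarts after a mismatch, it also finds a sequence whose start overlaps a failed partial match.

```kotlin
import java.util.Collections

// Returns the part of the window before the end sequence, or null if the sequence is absent.
// Illustrative alternative to the loop above, not the code used by the example.
fun cutBeforeEndSequence(window: List<Double>, endSequence: List<Double>): List<Double>? {
    val startedAt = Collections.indexOfSubList(window, endSequence)
    return if (startedAt == -1) null else window.subList(0, startedAt)
}

fun main() {
    val endSequence = listOf(1.0, 2.0, 1.0, 3.0)
    // The extra 1.0 at index 2 looks like the start of the sequence but is not.
    val window = listOf(0.5, 0.25, 1.0, 1.0, 2.0, 1.0, 3.0, 0.75)
    println(cutBeforeEndSequence(window, endSequence))            // [0.5, 0.25, 1.0]
    println(cutBeforeEndSequence(listOf(0.5, 0.25), endSequence)) // null
}
```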

Performance boost

In some cases it is possible to keep a small buffer while processing the stream. Using SampleVector may then help to reduce the overhead on the processing pipeline. The wav output supports this type out of the box, so you won’t need to flatten it back. To use this approach, window the stream with the desired length, remap it to SampleVector, and store it directly to the wav-file:

stream
    .window(128).map { sampleVectorOf(it) }
    .trim(20, TimeUnit.SECONDS)
    .toMono16bitWav("file:///home/user/sine.wav")

Low-level API

Like any other API within the WaveBeans framework, the WAV output is just a wrapper around a class. You may create an instance of this class by specifying the stream it needs to read from and a set of parameters.

The parameters are created by instantiating the class io.wavebeans.lib.io.WavFileOutputParams:

  • uri – the location of the file to write to. Should be a valid URI; for a file in the local file system, use the file:// scheme followed by the absolute path.
  • bitDepth – the number of bits per sample to use when storing into the file. The type is io.wavebeans.lib.BitDepth. Supported bit depths are 8, 16, 24, and 32. A depth of 64 can also be written, but as far as we know it is not officially supported, so other software is unlikely to understand the resulting file.
  • numberOfChannels – the number of channels to store; should be greater than or equal to 1. Currently, the stream can work with only one channel, so any value other than 1 (mono) will produce unexpected results.
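
Together with the sample rate, bitDepth and numberOfChannels determine how large the stored audio is. The following sketch shows the arithmetic for uncompressed PCM; `pcmDataSize` is a hypothetical helper, and the 44100 Hz sample rate is an assumption chosen for the example.

```kotlin
// Hypothetical helper: payload size in bytes of uncompressed PCM audio (header excluded).
fun pcmDataSize(sampleRate: Int, bitDepth: Int, numberOfChannels: Int, durationMs: Long): Long =
    sampleRate * durationMs / 1000 * numberOfChannels * (bitDepth / 8)

fun main() {
    println(pcmDataSize(44100, 16, 1, 1000)) // 88200
    println(pcmDataSize(44100, 32, 1, 1000)) // 176400
    println(pcmDataSize(44100, 8, 1, 500))   // 22050
}
```

One second of 16-bit mono audio at 44100 Hz therefore takes 88200 bytes, roughly 86 KiB, before the header is added.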

The input stream should be of type SampleBeanStream<Sample>.

For example, to store a 440 Hz sine you would write code like this:

val stream = 440.sine()
        .trim(1000)

WavFileOutput(stream, WavFileOutputParams(
    uri = "file:///path/to/file.wav", 
    bitDepth = BitDepth.BIT_32,
    numberOfChannels = 1
))