An introduction to the wrassp package

Lasse Bombien & Raphael Winkelmann

2016-05-30

Introduction

This document is meant as an introduction to the wrassp package. wrassp is a wrapper for R around Michel Scheffers’s libassp (Advanced Speech Signal Processor). The libassp library aims at providing functionality for handling speech signal files in most common audio formats and for performing analyses common in phonetic science/speech science. This includes the calculation of formants, fundamental frequency, root mean square, auto correlation, a variety of spectral analyses, zero crossing rate, filtering etc. This wrapper provides R with a large subset of libassp’s signal processing functions and provides them to the user in a (hopefully) user-friendly manner.

File I/0 and the AsspDataObj

Let’s get started by locating some example material distributed with the package.

# load the package
library(wrassp)
# get the path to the data that comes with the package
wavPath = system.file('extdata', package='wrassp')
# now list the .wav files so we have some audio files to play with
wavFiles = list.files(wavPath, pattern=glob2rx('*.wav'), full.names=TRUE)

One of the aims of wrassp is to provide mechanisms to handle speech-related files such as sound files and parametric data files. wrassp therefore comes with a class called AsspDataObj which does just that.

# load an audio file, e.g. the first one in the list above
au = read.AsspDataObj(wavFiles[1])
# show class
class(au)
## [1] "AsspDataObj"
# print object description
print(au)
## Assp Data Object of file /private/var/folders/ll/pw5k9y6x64q5xqys350rz2ph0000gn/T/RtmpWIINp0/Rinstbeb418737d29/wrassp/extdata/lbo001.wav.
## Format: WAVE (binary)
## 19983 records at 16000 Hz
## Duration: 1.248938 s
## Number of tracks: 1 
##   audio (1 fields)

au is an object of the class AsspDataObj and, using print, we can get some information about the object, such as its sampling rate, its duration and what kind of data are stored in what form. Since the file we loaded is audio only, the object contains exactly one track. And since it’s a mono file, this track only has one field. We will later encounter different types of data with more than one track and more fields per track.

Here are some more ways of extracting attributes from the object, such as duration, sampling rate and the number of records:

# extract duration
dur.AsspDataObj(au)
## [1] 1.248938
# extract sampling rate
rate.AsspDataObj(au)
## [1] 16000
# extract number of records/samples
numRecs.AsspDataObj(au)
## [1] 19983
# extract additional attributes
attributes(au)
## $names
## [1] "audio"
## 
## $trackFormats
## [1] "INT16"
## 
## $sampleRate
## [1] 16000
## 
## $filePath
## [1] "/private/var/folders/ll/pw5k9y6x64q5xqys350rz2ph0000gn/T/RtmpWIINp0/Rinstbeb418737d29/wrassp/extdata/lbo001.wav"
## 
## $origFreq
## [1] 0
## 
## $startTime
## [1] 0
## 
## $startRecord
## [1] 1
## 
## $endRecord
## [1] 19983
## 
## $class
## [1] "AsspDataObj"
## 
## $fileInfo
## [1] 21  2

An important property of AsspDataObj is of course that it contains data tracks, or at least one data track. As mentioned above, the currently loaded object contains a single mono audio track. Accessing the data is easy: AsspDataObj stores data in simple matrices, one matrix for each track. Broadly speaking, AsspDataObj is nothing but a list of at least one matrix. All of them have the same number of rows (number of records) but each can have a different number of columns (number of fields). Each track has a name and we can access the track using that name.

# extract track names
tracks.AsspDataObj(au)
## [1] "audio"
# or an alternative way to extract track names
names(au)
## [1] "audio"
# show head of samples
head(au$audio)
##      [,1]
## [1,]    5
## [2,]   -2
## [3,]   17
## [4,]   -5
## [5,]   -5
## [6,]   -2
# and we can of course also plot these samples 
# (only plot every 10th element to accelerate plotting)
plot(seq(0,numRecs.AsspDataObj(au) - 1, 10) / rate.AsspDataObj(au), 
     au$audio[c(TRUE, rep(FALSE,9))], 
     type='l', 
     xlab='time (s)', 
     ylab='Audio samples')

Now, purely to give us something unequal to the original au object to write to disc, let’s manipulate the audio data by simply multiplying all the sample values by a factor of 0.5. The resulting AsspDataObj will then be saved to a temporary directory provided by R.

# manipulate the audio
au$audio = au$audio * 0.5
# write file to tempdir
dir = tempdir()
writeres = write.AsspDataObj(au, file.path(dir, 'newau.wav'))

Signal processing

wrassp is of course capable of more than just the mere reading and writing of specific signal file formats. We will now use wrassp to calculate the formant values, their corresponding bandwidths, the fundamental frequency contour and the RMS energy contour of the audio file wavFiles[1].

Formants and their bandwidths

# calculate formants and corresponding bandwidth values
fmBwVals = forest(wavFiles[1], toFile=F)
# due to toFile=F this returns an object of the type AsspDataObj and 
# prevents the result being saved to disc as an SSFF file
class(fmBwVals)
## [1] "AsspDataObj"
# extract track names
# this time the object contains muliple tracks (formants + their bandwidths)
tracks.AsspDataObj(fmBwVals)
## [1] "fm" "bw"
# with more than one field (in this case 250 F1/F2/F3/F4 values)
dim(fmBwVals$fm)
## [1] 250   4
# plot the formant values
matplot(seq(0,numRecs.AsspDataObj(fmBwVals) - 1) / rate.AsspDataObj(fmBwVals) + 
          attr(fmBwVals, 'startTime'), 
        fmBwVals$fm, 
        type='l', 
        xlab='time (s)', 
        ylab='Formant frequency (Hz)')

Fundamental frequency contour

# calculate the fundamental frequency contour
f0vals = ksvF0(wavFiles[1], toFile=F)
# plot the fundamental frequency contour
plot(seq(0,numRecs.AsspDataObj(f0vals) - 1) / rate.AsspDataObj(f0vals) +
       attr(f0vals, 'startTime'),
     f0vals$F0, 
     type='l', 
     xlab='time (s)', 
     ylab='F0 frequency (Hz)')

RMS energy contour

Seeing as one might want to reuse some of the computed signals at a later stage, wrassp allows the user to write the result out to file by leaving the toFile parameter set to TRUE. This also allows users to process more than one file at once.

# calculate the RMS-energy contour for all wavFiles
rmsana(wavFiles, outputDirectory = tempdir())
## 
##   INFO: applying rmsana to 9 files
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |=======                                                          |  11%
  |                                                                       
  |==============                                                   |  22%
  |                                                                       
  |======================                                           |  33%
  |                                                                       
  |=============================                                    |  44%
  |                                                                       
  |====================================                             |  56%
  |                                                                       
  |===========================================                      |  67%
  |                                                                       
  |===================================================              |  78%
  |                                                                       
  |==========================================================       |  89%
  |                                                                       
  |=================================================================| 100%
# list new files using wrasspOutputInfos$rmsana$ext (see below)
rmsFilePaths = list.files(tempdir(), 
                          pattern = paste0('*.',wrasspOutputInfos$rmsana$ext), 
                          full.names = T)
# read first rms file 
rmsvals = read.AsspDataObj(rmsFilePaths[1])
# plot the RMS energy contour
plot(seq(0,numRecs.AsspDataObj(rmsvals) - 1) / rate.AsspDataObj(rmsvals) +
       attr(rmsvals, 'startTime'), 
     rmsvals$rms, 
     type='l', 
     xlab='time (s)', 
     ylab='RMS energy (dB)')

The wrasspOutputInfos object

wrasspOutputInfos stores meta information associated with the different signal processing functions wrassp provides.

# show all function names
names(wrasspOutputInfos)
##  [1] "acfana"      "afdiff"      "affilter"    "cepstrum"    "cssSpectrum"
##  [6] "dftSpectrum" "ksvF0"       "mhsF0"       "forest"      "lpsSpectrum"
## [11] "rfcana"      "rmsana"      "zcrana"

This object can be useful to get additional information about a specific wrassp function. It contains information about the default file extension ($ext), the tracks produced ($tracks) and the output file type ($outputType) of any given wrassp function.

# show output infos of function forest
wrasspOutputInfos$forest
## $ext
## [1] "fms"
## 
## $tracks
## [1] "fm" "bw"
## 
## $outputType
## [1] "SSFF"

For a list of the available signal processing function provided by wrassp simply open the package documentation:

# open wrassp package documentation
?wrassp

Conclusion

We hope this document gives you a rough idea of how to use the wrassp package and what it is capable of. For more information about the individual functions please consult the respective R documentations (e.g. ?dftSpectrum).

To find questions that might have already been answered or if you have an issue or a bug to report please use our GitHub issue tracker.