Spectrograms

The phonspec function plots spectrograms that look familiar to phoneticians. It uses the spectrogram function from DSP.jl to perform the short-time Fourier analysis. The plot specification is written with RecipesBase.jl so that the package does not depend on Plots.jl; as a result, Plots must be loaded with using Plots before spectrograms can be plotted.

Examples

A standard broadband spectrogram can be created with the default analysis parameters; only the axis labels are set here.

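using WAV
using Plots
using Phonetics
# Read the samples and the sampling frequency from the file
s, fs = wavread("assets/iwantaspectrogram.wav")
# wavread returns a samples-by-channels matrix; flatten it to a vector (assumes a mono file)
s = vec(s)
phonspec(s, fs, xlab="Time (s)", ylab="Frequency (Hz)")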

A color scheme more similar to the Praat grayscale can be achieved using the color argument from Plots.jl and the :binary color scheme. These spectrograms are created using the heatmap function from Plots.jl, so any color scheme available in the Plots package can be used, though not all of them produce legible spectrograms.

phonspec(s, fs, xlab="Time (s)", ylab="Frequency (Hz)", color=:binary)

A narrowband style spectrogram can be plotted using the winlen argument:

phonspec(s, fs, xlab="Time (s)", ylab="Frequency (Hz)", winlen=0.03)

Pre-emphasis can be disabled by passing a value of 0 for the pre_emph argument. Pre-emphasis boosts the higher frequencies relative to the lower frequencies.

phonspec(s, fs, pre_emph=0, xlab="Time (s)", ylab="Frequency (Hz)")
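To make the effect of pre_emph concrete, here is a minimal sketch of a first-order pre-emphasis filter, y[n] = x[n] - α x[n-1]. The helper name preemphasis is hypothetical and is not part of Phonetics.jl; this is the common textbook form of the filter and is only assumed to resemble what phonspec does internally.

```julia
# Hypothetical helper, not part of Phonetics.jl: first-order pre-emphasis,
# y[n] = x[n] - α * x[n-1], with the first sample passed through unchanged.
function preemphasis(x::AbstractVector{<:Real}, α::Real)
    y = float.(x)                  # new array, promoted to floating point
    for n in 2:length(x)
        y[n] = x[n] - α * x[n - 1]
    end
    return y
end

flat = ones(5)                               # constant (0 Hz) signal
alternating = [1.0, -1.0, 1.0, -1.0, 1.0]    # fastest possible oscillation

low  = preemphasis(flat, 0.97)         # after the first sample, values shrink to about 0.03
high = preemphasis(alternating, 0.97)  # after the first sample, magnitudes grow to about 1.97
same = preemphasis(flat, 0)            # α = 0 leaves the signal unchanged
```

The slowly varying signal is almost cancelled while the rapidly varying one is nearly doubled, which is why the higher frequencies appear stronger in the display when pre-emphasis is on.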

Function documentation

Phonetics.phonspec (Function)

phonspec(s, fs; pre_emph=0.97, dbr=55, win=:gaussian, winparam=nothing, winlen=0.005, winstep=0.002, kw...)

Rudimentary functionality to plot a spectrogram, with parameters familiar to phoneticians. Includes a pre-emphasis routine which helps increase the intensity of the higher frequencies in the display. Defaults to a Gaussian window with a standard deviation of 1/6.

For a broadband spectrogram, use a value around 0.005 for winlen. For a narrowband spectrogram, use a value around 0.03 for winlen.
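As a rough illustration of what these winlen values mean in samples, the sketch below converts them at an assumed sampling rate of 44100 Hz. The helper samples and the choice to round to the nearest integer are assumptions made for illustration; the doubling follows the note on winlen in the argument list, but the exact conversion inside phonspec may differ.

```julia
fs = 44100    # assumed sampling rate in Hz, for illustration only

# Hypothetical helper: convert a duration in seconds to a whole number of samples.
# Rounding to the nearest integer is an assumption, not taken from the package.
samples(seconds, fs) = round(Int, seconds * fs)

broadband  = samples(2 * 0.005, fs)   # winlen = 0.005 s, doubled: 441 samples
narrowband = samples(2 * 0.03, fs)    # winlen = 0.03 s, doubled: 2646 samples
step       = samples(0.002, fs)       # default winstep = 0.002 s: 88 samples
```

The longer narrowband window covers six times as many samples, trading time resolution for finer frequency resolution.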

The argument structure is inferred from the plot recipe. Parameters such as xlim, ylim, color, and size should be passed as keyword arguments, as with standard calls to plot.

Args

  • s A vector containing the samples of a sound
  • fs Sampling frequency of s in Hz
  • pre_emph The α coefficient for pre-emphasis; default value of 0.97 corresponds to a cutoff frequency of approximately 213 Hz before the 6 dB / octave increase begins
  • dbr The dynamic range; components more than dbr decibels quieter than the loudest component will not be displayed; this sets the clim argument
  • win The type of window to use; must be one of :gaussian or :kaiser
  • winparam The parameter affecting the scale of the window; if nothing passed, uses 1/6 for a Gaussian window or 3 for a Kaiser window
  • winlen The length of the window in seconds (note that this value gets doubled in the code)
  • winstep How far apart each window is in seconds
  • kw... extra named parameters to pass to heatmap
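
The cutoff frequency quoted for pre_emph depends on the sampling rate. One relation that reproduces the 213 Hz figure is the one documented for Praat's pre-emphasis, worked through below; that phonspec follows exactly this relation, and that the figure refers to a 44100 Hz recording, are assumptions.

```latex
\begin{align*}
y[n] &= x[n] - \alpha\, x[n-1] \\
\alpha &= \exp\!\left(-\frac{2\pi F}{f_s}\right)
  \quad\Longrightarrow\quad
  F = -\frac{f_s \ln \alpha}{2\pi} \\
F &\approx \frac{0.03046 \times 44100}{2\pi} \approx 213.8\ \text{Hz}
  \qquad (\alpha = 0.97,\ f_s = 44100\ \text{Hz})
\end{align*}
```

Under the same relation, a 16000 Hz recording with α = 0.97 would have a cutoff of roughly 78 Hz, so the same α gives a different cutoff at each sampling rate.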