Spectrograms
A basic function, phonspec, is provided to plot spectrograms in a style familiar to phoneticians. It uses the spectrogram function from DSP.jl to perform the short-time Fourier analysis. The plot specification is written with RecipesBase.jl so that the package does not depend on Plots.jl. As a result, you must run using Plots yourself before spectrograms can be plotted.
Examples
A standard broadband spectrogram can be created without using optional parameters.
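using Phonetics  # provides phonspec
using WAV        # provides wavread
using Plots      # must be loaded so the plot recipe can render
s, fs = wavread("assets/iwantaspectrogram.wav")
s = s[:,1]       # keep only the first channel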
phonspec(s, fs, xlab="Time (s)", ylab="Frequency (Hz)")
A color scheme closer to the Praat grayscale can be achieved using the color argument from Plots.jl and the :binary color scheme. These spectrograms are created using the heatmap function from Plots.jl, so any color scheme available in the Plots package can be used, though not all of them produce legible spectrograms.
phonspec(s, fs, xlab="Time (s)", ylab="Frequency (Hz)", color=:binary)
A narrowband style spectrogram can be plotted using the winlen argument:
phonspec(s, fs, xlab="Time (s)", ylab="Frequency (Hz)", winlen=0.03)
Pre-emphasis can be disabled by passing a value of 0 for the pre_emph argument. Pre-emphasis boosts the intensity of the higher frequencies relative to the lower frequencies.
phonspec(s, fs, pre_emph=0, xlab="Time (s)", ylab="Frequency (Hz)")
The dB scale can be made more similar to Praat's by setting the db argument to :spl. The spectrogram image does not change, and the dbr argument functions the same way, but the scale on the heatmap legend changes to resemble how Praat calculates dB.
phonspec(s, fs, xlab="Time (s)", ylab="Frequency (Hz)", db=:spl)
Function documentation
Phonetics.phonspec — Function
phonspec(s, fs; pre_emph=0.97, dbr=55, win=:gaussian, winparam=nothing, winlen=0.005, winstep=0.002, db=:rel, kw...)
Rudimentary functionality to plot a spectrogram, with parameters familiar to phoneticians. Includes a pre-emphasis routine which helps increase the intensity of the higher frequencies in the display. Defaults to a Gaussian window with a standard deviation of 1/6.
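As a rough illustration of what a pre-emphasis step does, the sketch below applies a first-order difference filter, y[n] = x[n] - α·x[n-1]. This is a common textbook form and is assumed here, not taken from the Phonetics.jl source; preemphasize and cutoff are hypothetical helper names. The cutoff relation α = exp(-2πF/fs) is also an assumption, following Praat's convention.

```julia
# Hypothetical helper (not part of Phonetics.jl): first-order pre-emphasis,
# y[n] = x[n] - α * x[n-1], with the first sample passed through unchanged.
function preemphasize(x::AbstractVector{<:Real}, α::Real)
    y = float.(x)                      # floating-point copy of the input
    for n in 2:length(x)
        y[n] = x[n] - α * x[n-1]
    end
    return y
end

# Assumed relation between α and the cutoff frequency F (Praat convention):
# α = exp(-2π * F / fs), so F = -log(α) * fs / (2π).
cutoff(α, fs) = -log(α) * fs / (2π)

x_dc  = ones(8)                        # constant signal (0 Hz)
x_nyq = [(-1.0)^n for n in 0:7]        # sign flips every sample (fs / 2)

y_dc  = preemphasize(x_dc, 0.97)       # low frequency: strongly attenuated
y_nyq = preemphasize(x_nyq, 0.97)      # high frequency: amplified

println("cutoff at 44.1 kHz: ", cutoff(0.97, 44_100), " Hz")
println("cutoff at 16 kHz:   ", cutoff(0.97, 16_000), " Hz")
```

Under these assumptions, α = 0.97 corresponds to a cutoff of roughly 214 Hz at a 44.1 kHz sampling rate and roughly 78 Hz at 16 kHz, so the cutoff depends on the sampling rate. With α = 0 the filter returns the signal unchanged.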
For a broadband spectrogram, use a value around 0.005 for winlen. For a narrowband spectrogram, use a value around 0.03 for winlen.
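To make the window settings more concrete, the sketch below converts the two suggested winlen values to sample counts and builds a Gaussian window whose standard deviation is 1/6 of the window length. The helper gaussian_window is hypothetical and written in plain Julia; the sampling rate of 16 kHz is an arbitrary choice for illustration, and the sample counts are shown before any internal adjustment that phonspec applies to winlen.

```julia
# Hypothetical helper (not part of Phonetics.jl): Gaussian window of n samples.
# σ is expressed as a fraction of the window length, with x running from
# -0.5 to 0.5, so σ = 1/6 places the window edges at ±3 standard deviations.
function gaussian_window(n::Integer, σ::Real=1/6)
    x = range(-0.5, 0.5; length=n)
    return exp.(-0.5 .* (x ./ σ) .^ 2)
end

fs = 16_000                                  # assumed sampling rate in Hz
for winlen in (0.005, 0.03)                  # broadband, narrowband
    n = round(Int, winlen * fs)              # samples spanned by winlen
    w = gaussian_window(n)
    println("winlen = ", winlen, " s: ", n, " samples, edge weight = ", w[1])
end
```

A shorter window gives finer time resolution and coarser frequency resolution (broadband), while a longer window gives the reverse (narrowband). With σ = 1/6 the window weight at the edges is exp(-4.5), close to zero, so the window tapers smoothly.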
The argument structure is inferred from the plot recipe. Parameters such as xlim, ylim, color, and size should be passed as keyword arguments, as with standard calls to plot.
Args
* `s` A vector containing the samples of a sound
* `fs` Sampling frequency of `s` in Hz
* `pre_emph` The α coefficient for pre-emphasis; default value of 0.97 corresponds to a cutoff frequency of approximately 213 Hz before the 6 dB / octave increase begins
* `dbr` The dynamic range; all frequencies that are `dbr` decibels quieter than the loudest frequency will not be displayed; will specify the `clim` argument
* `win` The type of window to use; must be one of `:gaussian` or `:kaiser`
* `winparam` The parameter affecting the scale of the window; if `nothing` is passed, uses 1/6 for a Gaussian window or 3 for a Kaiser window
* `winlen` The length of the window in seconds (note that this value gets doubled in the code)
* `winstep` How far apart each window is in seconds
* `db` How to calculate the scale for decibels; these options result in the same spectrogram image and same functionality of `dbr`, but the numbers on the heatmap scale will change
    * `:rel` will scale all intensities relative to the loudest frequency component
    * `:spl` will use a scale relative to Praat's normative threshold (that is, relative to (2e-5)^2 Pa^2), which produces a scale similar to Praat's
* `kw...` extra named parameters to pass to `heatmap`
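
The relationship between the two `db` scales can be sketched with a few made-up power values. The numbers below are hypothetical and the computation is a plain-Julia illustration of the two reference levels described above, not the package's internal code. It shows why the image and the effect of `dbr` stay the same: the two scales differ only by a constant offset.

```julia
# Hypothetical power values (Pa^2) for a handful of time-frequency bins.
power = [1e-9, 1e-6, 4e-5, 2.5e-3, 1e-2]
dbr = 55                                     # dynamic range in dB

# :rel style: decibels relative to the strongest component (maximum is 0 dB).
db_rel = 10 .* log10.(power ./ maximum(power))

# :spl style: decibels relative to (2e-5)^2 Pa^2.
p_ref = (2e-5)^2
db_spl = 10 .* log10.(power ./ p_ref)

# The two scales differ by the same amount in every bin.
offset = db_spl .- db_rel

# Bins within dbr of the maximum are the same on either scale.
shown_rel = db_rel .>= maximum(db_rel) - dbr
shown_spl = db_spl .>= maximum(db_spl) - dbr

println("offset per bin: ", offset)
println("same bins shown: ", shown_rel == shown_spl)
```

Because the offset is constant, only the numbers on the color scale change between the two options.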