Module grabdata2

This module computes the uncompressed feature vectors for music rhythm determination.

This can be run as a script for debugging/exploratory purposes. When run as a script, it produces plots of various stages of the signal processing sequence.

It is also imported as a module, in which case the plotting is off. The normal call is to make_vector().

Classes
  faxis
This is a logarithmically spaced frequency axis.
Functions
  s_preprocess(f, Dt=None, plotit=False)
First stage. Returns a 2-dimensional numpy.ndarray.
  plot_speech(smd)
Plot the acoustic data.
  spectral(sd, Dt, plotaudio=False)
Compute beat rates from a spectrogram. Returns a 2-dimensional numpy.ndarray of floats.
  stage2(s, fa, plotit=False)
  make_vector(f1, name, plotaudio=False)
The basic idea here is that we should not expect a song to have a consistent rhythmic pattern throughout. Returns a pickle of (name, feature vector).
Variables
  CACHEROOT = '/tmp/artdist'
Where to cache intermediate results.
  CWT = 1.0
  MEDRATE = 1.0
  __package__ = 'lib'

Imports: os, math, numpy, pylab, cPickle, subprocess, wavio, cache, VM, FV


Function Details

s_preprocess(f, Dt=None, plotit=False)

First stage. This does the signal processing to (essentially) compute a spectrogram of the music. The spectrogram is a two-dimensional representation of the sound: time is one axis, and the other axis is frequency.

It uses a "perceptual spectrum", which is more representative of what humans actually hear than the raw spectrogram. Essentially, it matches the spectral bandwidth to the ear: it uses a filterbank where each filter is 0.7 ERB (equivalent rectangular bandwidth) wide. Then it half-wave rectifies the signal and takes the cube root, in order to match the amplitude compression in the ear.

Parameters:
  • f (str) - filename for the input audio file.
  • Dt (float) - The desired sampling interval of the output feature vectors (e.g. 10 ms).
Returns: numpy.ndarray 2-dimensional.
A sequence of feature vectors that span the specified region. This is a 2-dimensional spectrogram.
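
The bandwidth and amplitude steps described for s_preprocess can be sketched as follows. This is a minimal illustration, not the module's actual code: the names erb_bandwidth and compress are hypothetical, and the ERB formula is the common Glasberg and Moore approximation, which the module may or may not use.

```python
import numpy as np

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at centre frequency f_hz.
    Assumption: Glasberg & Moore approximation, not confirmed by the module."""
    return 24.7 * (4.37 * np.asarray(f_hz, dtype=float) / 1000.0 + 1.0)

def compress(band_signal):
    """Half-wave rectify one filter's output, then take the cube root."""
    rectified = np.maximum(np.asarray(band_signal, dtype=float), 0.0)
    return np.cbrt(rectified)

# Width in Hz of a filter that is 0.7 ERB wide, centred at 1 kHz.
width_hz = 0.7 * erb_bandwidth(1000.0)

# Negative samples are clipped to zero; positive ones are compressed.
out = compress([-8.0, 0.0, 8.0, 27.0])
```

The cube root compresses large amplitudes much more than small ones, which is the point of the amplitude-matching step.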

spectral(sd, Dt, plotaudio=False)

Compute beat rates from a spectrogram.

Parameters:
  • sd (numpy.ndarray, 2-dimensional array of floats) - A spectrogram with one time axis and one frequency axis.
Returns: numpy.ndarray, 2-dimensional array of floats.
It computes an array where the time axis is converted to a beat-rate axis. The value of each element tells you how strong a particular beat is at a particular pitch. For example, an element might reveal the strength of a 60 BPM beat in the high pitch range near 3 kHz (e.g. snare drum). So, the frequency axis separates the track by instrument, and the beats-per-minute axis lets you look at the rhythm of each instrument. (Of course, that's a somewhat idealized view...)
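
One way to turn a time axis into a beat-rate axis is a Fourier transform along time in each frequency channel. The sketch below assumes that approach and assumes the spectrogram is laid out as (frames, channels); the module may do either differently, and beat_spectrum is a hypothetical name.

```python
import numpy as np

def beat_spectrum(sd, dt):
    """sd: spectrogram of shape (n_frames, n_channels); dt: frame interval in seconds.
    Returns (bpm, strength), where strength has shape (n_rates, n_channels)."""
    sd = np.asarray(sd, dtype=float)
    sd = sd - sd.mean(axis=0)                    # remove each channel's mean level
    strength = np.abs(np.fft.rfft(sd, axis=0))   # transform along the time axis
    bpm = 60.0 * np.fft.rfftfreq(sd.shape[0], d=dt)
    return bpm, strength

# Synthetic spectrogram: 20 s of 10 ms frames, 3 channels.
# Only channel 2 has energy, and it pulses at 2 Hz (120 BPM).
dt = 0.01
t = np.arange(2000) * dt
sd = np.zeros((t.size, 3))
sd[:, 2] = 1.0 + np.cos(2 * np.pi * 2.0 * t)

bpm, strength = beat_spectrum(sd, dt)
peak_bpm = bpm[np.argmax(strength[:, 2])]        # strongest beat rate in channel 2
```

Each column of strength is the rhythm of one frequency channel, which matches the "beat strength at a particular pitch" reading above.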

make_vector(f1, name, plotaudio=False)

The basic idea here is that we should not expect a song to have a consistent rhythmic pattern throughout. So, we divide the track into blocks, and inside each block, we assume a consistent pattern. (Obviously, this is not perfect: we chop it into blocks without regard for the song. A better scheme might be to try to find boundaries in the song where the rhythm changes.) If the rhythm changes within a block, you'll get a blurred rhythmic pattern.

At the end, we report the average rhythm pattern, over all the blocks.
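
The block-and-average scheme can be sketched like this. The function name average_over_blocks, the fixed block length, and the choice to drop a trailing partial block are all assumptions made for illustration; the module's own blocking may differ.

```python
import numpy as np

def average_over_blocks(sd, block_len, pattern_fn):
    """Split sd (frames x channels) into consecutive blocks of block_len frames,
    apply pattern_fn to each full block, and average the per-block results.
    Assumption: a trailing partial block is discarded."""
    sd = np.asarray(sd, dtype=float)
    n_blocks = sd.shape[0] // block_len
    if n_blocks == 0:
        raise ValueError("track is shorter than one block")
    patterns = [pattern_fn(sd[i * block_len:(i + 1) * block_len])
                for i in range(n_blocks)]
    return np.mean(patterns, axis=0)

# Toy example: the per-block "pattern" is just the time average of each channel.
sd = np.arange(20, dtype=float).reshape(10, 2)
avg = average_over_blocks(sd, block_len=4, pattern_fn=lambda b: b.mean(axis=0))
```

In practice pattern_fn would be whatever computes the rhythm pattern for one block; the averaging step is independent of that choice.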

Parameters:
  • f1 (str) - name of the audio file
  • name (normally str) - the track's label. (Note: this is just passed through to the output.)
  • plotaudio (boolean) - display plots (true) or run silently (false). This particular routine plots the averaged beat rate as a function of frequency, and also the averaged feature vector.
Returns: A pickle of (name, feature vector)
It returns an opaque blob (a pickle) that you can write to a file and easily read back. The pickle contains the track's label, along with the uncompressed feature vector. This feature vector contains the beat rate as a function of frequency, followed by the part of the feature vector that describes the averaged rhythmic pattern.
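
A round trip through such a blob might look like the sketch below. The module imports cPickle, which suggests Python 2; this example uses Python 3's pickle instead, and the helper names pack and unpack, the label, and the vector values are all made up for illustration.

```python
import pickle
import numpy as np

def pack(name, vector):
    """Serialize a (name, feature vector) pair to a bytes blob."""
    payload = (name, np.asarray(vector, dtype=float))
    return pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)

def unpack(blob):
    """Recover the label and the feature vector from a blob."""
    name, vector = pickle.loads(blob)
    return name, vector

blob = pack("track-01", [0.5, 1.5, 2.5])   # hypothetical label and values
name, vector = unpack(blob)
```

As with any pickle, only load blobs from a source you trust, since unpickling can execute arbitrary code.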