Polyphonic music segmentation using wavelet based tree-structured filter banks with improved time-frequency resolution
The logarithmic structure of the music signal spectrum requires a constant-Q transform for front-end signal processing in polyphonic music segmentation. The Short-Time Fourier Transform (STFT) offers a linear segmentation of the frequency space, i.e., a variable-Q analysis. On the other hand, the Dyadic Wavelet Transform performs a constant-Q analysis, but with a Q-factor that is much below the required level. Tree-structured filter banks that are related to wavelet packets offer a nearly-constant-Q analysis with an arbitrarily high Q-factor that can be increased using more stages in the tree structure.
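The Q-factor comparison above can be made concrete with a small numerical sketch. It assumes Q is defined as center frequency divided by bandwidth, that an STFT band is as wide as one bin spacing (window effects are ignored), and that a dyadic wavelet band spans exactly one octave. The function names, the 44.1 kHz sample rate, the 4096-point transform, and the use of semitone spacing as the "required level" are illustrative assumptions, not values taken from the thesis.

```python
import math


def q_factor(center_hz, bandwidth_hz):
    """Q-factor of a band: center frequency divided by bandwidth."""
    return center_hz / bandwidth_hz


def stft_q(center_hz, sample_rate_hz, n_fft):
    """Q of an STFT bin, assuming the band is one bin spacing wide."""
    return q_factor(center_hz, sample_rate_hz / n_fft)


def octave_band_q(f_low_hz):
    """Q of a one-octave band [f_low, 2 * f_low], using the geometric center."""
    f_high_hz = 2.0 * f_low_hz
    return q_factor(math.sqrt(f_low_hz * f_high_hz), f_high_hz - f_low_hz)


# Q needed to separate adjacent equal-tempered semitones (an assumed target).
SEMITONE_Q = 1.0 / (2.0 ** (1.0 / 12.0) - 1.0)

if __name__ == "__main__":
    for f in (110.0, 440.0, 1760.0):
        print(f, stft_q(f, 44100.0, 4096), octave_band_q(f))
    print("semitone Q:", SEMITONE_Q)
```

Under these assumptions the STFT Q grows in proportion to frequency, while the octave-band Q stays fixed at the square root of 2, far below the roughly 17 that semitone spacing would call for.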
To improve selectivity in tree-structured filter banks, a new family of Quadrature Mirror Filters (QMFs) with a narrow transition region and a low reconstruction error was designed and implemented. It is shown that these QMFs significantly reduce the frequency leakage that occurs in tree-structured filter banks when QMF or perfect-reconstruction filter pairs with a wide transition region are used. With the new QMFs, the total reconstruction error in the tree-structured filter banks is smaller than the round-off error introduced by the implementation.
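As a rough illustration of the QMF relationship, the sketch below builds the high-pass filter of a pair by mirroring a low-pass prototype, h1[n] = (-1)^n h0[n], and measures how far the summed power responses deviate from a constant, which is one common proxy for amplitude distortion in a QMF pair. The two-tap Haar prototype and the four-tap averaging filter are stand-ins chosen for simplicity; they are not the filters designed in the thesis, and all function names are hypothetical.

```python
import cmath
import math


def qmf_highpass(h0):
    """Mirror a low-pass prototype: h1[n] = (-1)**n * h0[n]."""
    return [((-1) ** n) * c for n, c in enumerate(h0)]


def freq_response(h, omega):
    """Frequency response of FIR filter h at angular frequency omega (rad/sample)."""
    return sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(h))


def power_complementarity_ripple(h0, n_points=256):
    """Largest deviation of |H0|^2 + |H1|^2 from its mean over [0, pi]."""
    h1 = qmf_highpass(h0)
    powers = []
    for k in range(n_points):
        omega = math.pi * k / (n_points - 1)
        p0 = abs(freq_response(h0, omega)) ** 2
        p1 = abs(freq_response(h1, omega)) ** 2
        powers.append(p0 + p1)
    mean_power = sum(powers) / len(powers)
    return max(abs(p - mean_power) for p in powers)


# Illustrative prototypes only (not the thesis filters).
HAAR = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]
BOXCAR4 = [0.5, 0.5, 0.5, 0.5]

if __name__ == "__main__":
    print("Haar ripple:", power_complementarity_ripple(HAAR))
    print("4-tap averaging ripple:", power_complementarity_ripple(BOXCAR4))
```

The Haar pair is power complementary but has a very wide transition region, which is the kind of filter that leaks energy between neighboring bands when cascaded in a tree. The four-tap averaging filter shows the opposite failure: its summed power response is far from flat.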
For music signal segmentation, an 11-stage tree structure was designed and implemented. The tree-structured filter bank has sufficient frequency resolution over a wide range of frequencies, while the time resolution at high frequencies satisfies the minimum time resolution for the human ear.
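The trade-off between frequency and time resolution in such a tree can be sketched with simple arithmetic. The example assumes a critically sampled tree in which every stage halves the bandwidth and decimates by two, so a band at depth d has bandwidth (fs / 2) / 2^d and an output sample period of 2^d / fs. The 44.1 kHz sample rate and the function name are assumptions for illustration; the abstract does not state the sample rate or which branches of the 11-stage tree are fully split.

```python
def band_resolution(depth, sample_rate_hz):
    """Bandwidth (Hz) and output sample period (s) of a band at a given tree depth.

    Assumes a critically sampled two-channel split at every stage.
    """
    bandwidth_hz = (sample_rate_hz / 2.0) / (2 ** depth)
    time_step_s = (2 ** depth) / sample_rate_hz
    return bandwidth_hz, time_step_s


if __name__ == "__main__":
    fs = 44100.0  # assumed sample rate, not taken from the thesis
    for depth in (1, 4, 8, 11):
        bw, dt = band_resolution(depth, fs)
        print(f"depth {depth:2d}: bandwidth {bw:10.2f} Hz, time step {dt * 1e3:8.3f} ms")
```

Under these assumptions the product of bandwidth and time step is 0.5 at every depth, so deep branches give fine frequency resolution with coarse timing, and shallow branches give the reverse. This is consistent with using fewer stages at high frequencies, where timing matters more.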
The tree-structured filter banks were implemented in two applications. The first is the transcription of polyphonic piano music, where the filter banks were used for front-end signal processing. In this application, synthesized as well as real-life polyphonic piano recordings were successfully coded into MIDI streams. In the second application, both the analysis and the synthesis parts of the tree structure were implemented in a specially designed editor that was used for voice separation in polyphonic piano music as well as for signal denoising.