Doctoral Dissertations

Author

Miljko Bobrek

Date of Award

8-1996

Degree Type

Dissertation

Degree Name

Doctor of Philosophy

Major

Electrical Engineering

Major Professor

Daniel B. Koch

Committee Members

Mongi A. Abidi, Mark E. Boling, Michael J. Roberts

Abstract

The logarithmic structure of the music signal spectrum requires a constant-Q transform for front-end signal processing in polyphonic music segmentation. The Short-Time Fourier Transform (STFT) offers a linear segmentation of the frequency space, i.e., a variable-Q analysis. On the other hand, the Dyadic Wavelet Transform performs a constant-Q analysis, but with a Q-factor that is much below the required level. Tree-structured filter banks that are related to wavelet packets offer a nearly-constant-Q analysis with an arbitrarily high Q-factor that can be increased using more stages in the tree structure.
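
The contrast between variable-Q and constant-Q analysis can be illustrated with a small sketch. It assumes the usual definition Q = center frequency / bandwidth, models a dyadic wavelet band as one octave [f, 2f] with a geometric-mean center, and takes "the required level" to mean resolving adjacent equal-tempered semitones. That last reading, the function names, and the example numbers are assumptions for illustration and are not taken from the dissertation.

```python
import math

def stft_q(center_hz, bin_width_hz):
    """Q of an STFT bin: the bin width is fixed, so Q grows with frequency."""
    return center_hz / bin_width_hz

def octave_band_q():
    """Q of an octave band [f, 2f]: center sqrt(2)*f, bandwidth f (independent of f)."""
    return math.sqrt(2.0)

def semitone_q():
    """Q needed to separate two notes one equal-tempered semitone apart (assumed target)."""
    return 1.0 / (2.0 ** (1.0 / 12.0) - 1.0)

if __name__ == "__main__":
    # Hypothetical 10 Hz STFT bin width: Q varies across the spectrum.
    for f in (110.0, 440.0, 1760.0):
        print(f, stft_q(f, 10.0))
    print(octave_band_q(), semitone_q())
```

Under these assumptions the octave-band Q is about 1.4 while semitone separation needs a Q of roughly 17, which is the kind of gap the extra tree stages are meant to close.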

To improve selectivity in tree-structured filter banks, a new family of Quadrature Mirror Filters (QMFs) with a narrow transition region and a low reconstruction error was designed and implemented. It is shown that these QMFs significantly reduce the frequency leakage that occurs in tree-structured filter banks when QMF or perfect-reconstruction filter pairs with a wide transition region are used. The total reconstruction error in the tree-structured filter banks when the new QMFs are used is smaller than the round-off error that occurs during the implementation.
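
As a reminder of what the "mirror" in QMF refers to, the sketch below builds a high-pass filter from a low-pass prototype by alternating the signs of its taps, which reflects the magnitude response about a quarter of the sampling rate. The 4-tap prototype is a hypothetical placeholder chosen for readability. It is not one of the filters designed in the dissertation and has a very wide transition region.

```python
import cmath
import math

def qmf_highpass(h0):
    """Classical QMF relation: h1[n] = (-1)^n * h0[n]."""
    return [((-1) ** n) * c for n, c in enumerate(h0)]

def magnitude(h, omega):
    """Magnitude of the frequency response of FIR filter h at omega (rad/sample)."""
    return abs(sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(h)))

if __name__ == "__main__":
    h0 = [0.125, 0.375, 0.375, 0.125]  # hypothetical low-pass prototype
    h1 = qmf_highpass(h0)
    for omega in (0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4, math.pi):
        # |H1(omega)| should match |H0(pi - omega)| at every frequency.
        print(omega, magnitude(h0, omega), magnitude(h1, omega))
```

The wider the transition region of the prototype, the more the two responses overlap around a quarter of the sampling rate, and in a multi-stage tree that overlap shows up as leakage between neighboring bands.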

For music signal segmentation, an 11-stage tree structure was designed and implemented. The tree-structured filter bank has sufficient frequency resolution over a wide range of frequencies, while the time resolution at high frequencies satisfies the minimum time resolution for the human ear.
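
The trade-off between stages and resolution can be sketched numerically. The example assumes a critically sampled two-channel split at every stage, so each stage halves both the bandwidth and the output sample rate, and it assumes a 44.1 kHz input. Neither the sampling rate nor the exact shape of the 11-stage tree is stated in this abstract, so the numbers are illustrative only.

```python
def band_resolution(sample_rate_hz, depth):
    """Bandwidth and output sample spacing of a band after `depth` two-channel splits."""
    bandwidth_hz = sample_rate_hz / 2.0 ** (depth + 1)
    time_step_s = 2.0 ** depth / sample_rate_hz
    return bandwidth_hz, time_step_s

if __name__ == "__main__":
    fs = 44100.0  # assumed sampling rate, not taken from the source
    for depth in (1, 6, 11):
        bw, dt = band_resolution(fs, depth)
        print(depth, bw, dt)
```

In this model the product of bandwidth and time step is always 0.5, so deep branches suit low frequencies, where fine frequency resolution matters, while shallower branches preserve time resolution at high frequencies.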

The tree-structured filter banks were implemented in two applications. The first is the transcription of polyphonic piano music, where the filter banks were used for front-end signal processing. In this application, both synthesized and real-life polyphonic piano recordings were successfully coded into MIDI streams. In the second application, both the analysis and the synthesis parts of the tree structure were implemented in a specially designed editor that was used for voice separation in polyphonic piano music as well as for signal denoising.
