Date of Award
Doctor of Philosophy
Hairong Qi, Jens Gregor, Amir Sadovnik, Chuanren Liu
Deep learning-based algorithms have remarkably improved the performance in many computer vision tasks. However, deep networks often demand a large-scale and carefully annotated dataset and sufficient sample coverage of every training category. However, it is not practical in many real-world applications where only a few examples may be available, or the data annotation is costly and require expert knowledge. To mitigate this issue, learning with limited data has gained considerable attention and is investigated thorough different learning methods, including few-shot learning, weakly/semi supervised learning, open-set learning, etc.
In this work, the classification problem is investigated under an open-world assumption to handle unpredictable categories in a long-tail distribution dataset. For the open-set recognition, a representative discriminative multi-task learning framework is presented which can characterize the categories more effectively for known vs. unknown categories recognition. Experiments on multiple satellite benchmarks and RGB image datasets demonstrate significant improvement over state-of-the-art open-set recognition algorithms
In addition, the generalization capability of a recognition system is studied where the goal is to build a model which is able to perform well across diverse datasets for a certain task. To bridge the gap between different but related datasets, a unmixing- based domain adaptation approach is proposed which extract the domain-invariant representations leading to stable deployment performance. The experimental results show stable deployment performance across multiple satellite datasets.
To extend this work to higher dimensional data, this work studies semi-supervised learning for video segmentation task as certain frames are often annotated in a pixel level. To take advantage of labeled frames more efficiently, a bidirectional framework is proposed to segment the unlabeled frames with the help of segmented ones. Besides, an occlusion estimation approach is introduced to improve the segmentation performance by effectively fusing the bidirectional propagated semantic labels. Extensive experiments on driving video benchmarks demonstrate superiority of the proposed method on segmentation accuracy, temporal consistency, and computation cost compared to the state-of-the-art methods.
Kaviani Baghbaderani, Razieh, "Learning with Limited Labeled Data for Image and Video Understanding. " PhD diss., University of Tennessee, 2022.