
Learning with Limited Labeled Data for Image and Video Understanding

Date Issued
August 1, 2022
Author(s)
Kaviani Baghbaderani, Razieh  
Advisor(s)
Hairong Qi
Additional Advisor(s)
Hairong Qi
Jens Gregor
Amir Sadovnik
Chuanren Liu
Permanent URI
https://trace.tennessee.edu/handle/20.500.14382/28534
Abstract

Deep learning-based algorithms have remarkably improved performance in many computer vision tasks. However, deep networks often demand a large-scale, carefully annotated dataset with sufficient sample coverage of every training category. This is impractical in many real-world applications, where only a few examples may be available or data annotation is costly and requires expert knowledge. To mitigate this issue, learning with limited data has gained considerable attention and is investigated through different learning methods, including few-shot learning, weakly/semi-supervised learning, and open-set learning.


In this work, the classification problem is investigated under an open-world assumption to handle unpredictable categories in a dataset with a long-tail distribution. For open-set recognition, a representative discriminative multi-task learning framework is presented that characterizes the categories more effectively for recognizing known vs. unknown categories. Experiments on multiple satellite benchmarks and RGB image datasets demonstrate significant improvement over state-of-the-art open-set recognition algorithms.

In addition, the generalization capability of a recognition system is studied, where the goal is to build a model that performs well across diverse datasets for a given task. To bridge the gap between different but related datasets, an unmixing-based domain adaptation approach is proposed that extracts domain-invariant representations, leading to stable deployment performance. The experimental results show stable deployment performance across multiple satellite datasets.

To extend this work to higher-dimensional data, semi-supervised learning is studied for the video segmentation task, in which certain frames are often annotated at the pixel level. To take advantage of the labeled frames more efficiently, a bidirectional framework is proposed to segment the unlabeled frames with the help of the segmented ones. In addition, an occlusion estimation approach is introduced to improve segmentation performance by effectively fusing the bidirectionally propagated semantic labels. Extensive experiments on driving video benchmarks demonstrate the superiority of the proposed method in segmentation accuracy, temporal consistency, and computation cost compared to state-of-the-art methods.

Subjects

Deep learning

Open-set recognition

Domain adaptation

Video semantic segmen...

Disciplines
Other Computer Engineering
Degree
Doctor of Philosophy
Major
Electrical Engineering
Embargo Date
August 15, 2023
File(s)
Name

PhD_Dissertation_Razieh_KavianiBaghbaderani_v4.pdf

Size

8.87 MB

Format

Adobe PDF

Checksum (MD5)

85da18e5129b961ae5baf7fc7205dbd0
