Doctoral Dissertations

Date of Award

12-2014

Degree Type

Dissertation

Degree Name

Doctor of Philosophy

Major

Computer Engineering

Major Professor

Jian Huang

Committee Members

Joshua Fu, James Plank, Michael Berry

Abstract

Data can create enormous value in both scientific and industrial fields, especially by providing access to new knowledge and inspiring innovation. With massive increases in computing power, data storage capacity, and the capability to generate and collect data, scientific research communities are confronting a transformation: exploiting advanced uses of large-scale, complex, and high-resolution data sets in situation awareness and decision-making projects. Comprehensively analyzing big data problems requires analyses of various aspects, which involve effectively selecting static and time-varying feature patterns that fulfill the interests of domain users. To fully utilize the benefits of the ever-growing size of data and computing power in real applications, we proposed a general feature analysis pipeline and an integrated system that is general, scalable, and reliable for interactive feature selection and visualization of large observational data for situation awareness.

The central challenge tackled in this dissertation was how to effectively identify and select meaningful features in a complex feature space. Our research efforts focused on three aspects:

1. Enable domain users to better define their analysis interests;

2. Accelerate the process of feature selection;

3. Comprehensively present the intermediate and final analysis results visually.

For static feature selection, we developed a series of quantitative metrics that relate user interest to the spatio-temporal characteristics of features. For time-varying feature selection, we proposed the concept of a generalized feature set and used a generalized time-varying feature to describe the selection interest. Additionally, we provided a scalable system framework that manages both data processing and interactive visualization and effectively exploits computation and analysis resources. Together, the methods and the system design enabled interactive feature selection on two representative large observational data sets, one with high spatial resolution and the other with high temporal resolution. The final results support applications of big data analysis that combine statistical methods with high performance computing techniques to visualize real events interactively.
