Date of Award


Degree Type


Degree Name

Doctor of Philosophy


Civil Engineering

Major Professor

Lee D. Han

Committee Members

Shashi Nambisan, Christopher Cherry, Hamparsum Bozdogan


Data aggregation, the process of combining information by defined groups for statistical analysis, summarization, data size reduction, or other purposes, poses fundamental challenges, such as loss of the original information. Improper data aggregation, such as sampling bias or an incorrect calculation of an average, may cause information to be misread. The first chapter reveals that the harmonic mean, which is used to calculate space mean speed over a fixed segment, has a sampling bias: it overestimates with small samples. Several impact analyses show that this sampling bias is affected by sampling rate, time interval, segment length, and distribution type.
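
The small-sample overestimation of the harmonic mean can be illustrated with a minimal simulation sketch. It is not the dissertation's analysis: the uniform speed population, the sample sizes, the trial count, and the helper name `harmonic_mean_bias` are all assumptions chosen for illustration. The sketch draws small samples from a fixed population of speeds and compares the average sample harmonic mean with the population harmonic mean.

```python
import random
import statistics


def harmonic_mean_bias(population, sample_size, trials, rng):
    """Average sample harmonic mean minus the population harmonic mean.

    Hypothetical helper for illustration only.
    """
    true_value = statistics.harmonic_mean(population)
    estimates = [
        # Draw a small sample without replacement and aggregate it.
        statistics.harmonic_mean(rng.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.fmean(estimates) - true_value


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Assumed population of individual vehicle speeds (arbitrary speed units).
population = [rng.uniform(20.0, 120.0) for _ in range(5000)]

# Estimated bias for a few assumed sample sizes.
biases = {n: harmonic_mean_bias(population, n, 4000, rng) for n in (2, 5, 30)}

for n, bias in biases.items():
    print(f"sample size {n:>2}: mean overestimation = {bias:.2f}")
```

Under these assumptions the estimated bias is positive and shrinks as the sample size grows. This is what Jensen's inequality predicts, because the reciprocal of a sample average of reciprocals is a convex transformation of an unbiased estimate.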

When data aggregation is used properly, it can improve analytical efficiency, help address critical problems, and reveal causality and other relevant information. The second and third chapters use the aggregation of multi-source data to estimate the error distributions of data sources and to improve the accuracy of their measurements. This is a leap forward in evaluating data sources, because the proposed model does not require ground truth data. The second chapter focuses on the methodology: a modified Approximate Bayesian Computation that constructs the error distributions from numerous simulations. In the simulated experiment, the proposed model outperformed the conventional alternative, which evaluates a data source by gathering error information through comparison with a ground truth data source. Several sensitivity analyses explore how model performance is affected by sample size, number of data sources, and distribution type. The model proposed in the second chapter is limited to a one-dimensional variable; the third chapter extends the application to improving position and distance measurements in a connected vehicle environment. The proposed model can be combined with other existing methods, such as simultaneous localization and mapping (SLAM), to further improve the accuracy of vehicle positioning. The estimation process can run in real time, and the learning process continues to improve the accuracy of the estimates. The results show that the proposed model noticeably improves the accuracy of position and distance measurements.
