Doctoral Dissertations
Date of Award
5-2025
Degree Type
Dissertation
Degree Name
Doctor of Philosophy
Major
Business Analytics
Major Professor
Yuanyang Liu
Committee Members
Chuanren Liu, Tingliang Huang, Xiaojia Guo
Abstract
This dissertation investigates job market dynamics through machine learning and natural language processing applied to job posting data. The research aims to enhance labor market transparency by providing accurate salary predictions, insights into skills’ monetary value, and identification of high-demand skill combinations. Analysis utilizes a comprehensive dataset of Data Scientist job postings across the USA, focusing on the technical labor market to deliver actionable insights.
The first essay develops a robust salary prediction model leveraging both unstructured and structured job posting data. Textual information from job descriptions is transformed using various embedding techniques (Word2Vec, Doc2Vec, BERT, and OpenAI embeddings), while structured variables are extracted directly from job attributes. These features are processed through H2O Automated Machine Learning (AutoML), which evaluates multiple model families (including linear models, tree-based algorithms, and multilayer perceptrons) to create an optimized ensemble model. Our best model achieves a Mean Absolute Percentage Error (MAPE) of 16%, demonstrating strong predictive performance.
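For readers unfamiliar with the metric, MAPE averages the absolute prediction error expressed as a fraction of the actual salary. The following is a minimal sketch of that calculation only; the function name `mape` and the salary figures are hypothetical illustrations and are not taken from the dissertation's data or code.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, returned as a percentage."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average of |actual - predicted| / |actual| over all postings.
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical annual salaries in USD (illustrative only).
actual = [100_000, 120_000, 80_000, 150_000]
predicted = [110_000, 102_000, 92_000, 126_000]
score = mape(actual, predicted)
print(f"MAPE: {score:.1f}%")
```

Under this definition, a MAPE of 16% would mean that predictions deviate from the posted salary by 16% on average.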
The second essay introduces a framework for estimating the monetary value of individual skills and skill combinations in Data Science. Using a quasi-experimental design, job postings are segmented into treatment and control groups based on the presence of specific skill terms, isolating each skill’s marginal impact on salary outcomes. The concept of skill complementarity captures synergistic effects where certain skill pairs yield higher salary premiums together than the sum of their individual contributions. These findings benefit multiple stakeholders: job seekers can target high-value skills, while employers can make informed decisions about recruitment requirements.
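The complementarity idea can be illustrated with a simple difference in group means: a skill pair is complementary when the premium for postings listing both skills exceeds the sum of the premiums for postings listing only one of them. The sketch below is a naive illustration under that assumption, not the dissertation's quasi-experimental estimator; the function `complementarity`, the skill names, and the salaries are all hypothetical.

```python
from statistics import mean


def complementarity(postings, skill_a, skill_b):
    """Joint premium minus the sum of individual premiums.

    Premiums are measured against postings that list neither skill.
    A positive result suggests the two skills are worth more together.
    """
    def avg_salary(has_a, has_b):
        group = [
            p["salary"] for p in postings
            if (skill_a in p["skills"]) == has_a
            and (skill_b in p["skills"]) == has_b
        ]
        if not group:
            raise ValueError("no postings in one of the comparison groups")
        return mean(group)

    baseline = avg_salary(False, False)
    premium_a = avg_salary(True, False) - baseline
    premium_b = avg_salary(False, True) - baseline
    premium_ab = avg_salary(True, True) - baseline
    return premium_ab - (premium_a + premium_b)


# Hypothetical postings (illustrative only).
postings = [
    {"skills": {"sql"}, "salary": 100_000},
    {"skills": {"excel"}, "salary": 100_000},
    {"skills": {"python", "sql"}, "salary": 110_000},
    {"skills": {"spark"}, "salary": 108_000},
    {"skills": {"python", "spark"}, "salary": 130_000},
]
synergy = complementarity(postings, "python", "spark")
print(f"Complementarity: ${synergy:,}")
```

A raw comparison like this ignores confounders such as location or seniority, which is why a treatment and control design is needed to isolate each skill's marginal effect.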
This dissertation advances labor market transparency by developing text-driven machine learning models for salary prediction and skill valuation. By integrating advanced analytics with large-scale labor market data, the research contributes to labor market analytics with data-driven insights for policymakers, employers, and job seekers, highlighting the importance of strategic skill development in our increasingly dynamic job market.
Recommended Citation
Narayane, Nikhil Bharat, "Enhancing Data Science Job Market Transparency: Salary Prediction and Skill Valuation." PhD diss., University of Tennessee, 2025.
https://trace.tennessee.edu/utk_graddiss/12397
Included in
Business Analytics Commons, Labor Relations Commons, Management Sciences and Quantitative Methods Commons