Enhancing Data Science Job Market Transparency: Salary Prediction and Skill Valuation

Date Issued
May 1, 2025
Author(s)
Narayane, Nikhil Bharat  
Advisor(s)
Yuanyang Liu
Additional Advisor(s)
Chuanren Liu, Tingliang Huang, Xiaojia Guo
Abstract

This dissertation investigates job market dynamics by applying machine learning and natural language processing to job posting data. The research aims to enhance labor market transparency by providing accurate salary predictions, insights into skills’ monetary value, and identification of high-demand skill combinations. The analysis uses a comprehensive dataset of Data Scientist job postings from across the United States, focusing on the technical labor market to deliver actionable insights.


The first essay develops a robust salary prediction model that leverages both unstructured and structured job posting data. Textual information from job descriptions is transformed using several embedding techniques (Word2Vec, Doc2Vec, BERT, and OpenAI embeddings), while structured variables are extracted directly from job attributes. These features are processed through H2O Automated Machine Learning (AutoML), which evaluates multiple model families (including linear models, tree-based algorithms, and multilayer perceptrons) to create an optimized ensemble model. Our best model achieves a Mean Absolute Percentage Error (MAPE) of 16%, demonstrating strong predictive performance.
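For readers unfamiliar with the metric, MAPE averages the absolute prediction error expressed as a fraction of the actual salary. The following is a minimal sketch of that calculation only. The function name `mape` and the salary figures are hypothetical illustrations, not taken from the dissertation's code or data.

```python
def mape(actual, predicted):
    """Return the Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is |actual - predicted| / |actual|; actual values must be nonzero.
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical annual salaries in USD (illustrative only, not real data).
actual = [100_000, 120_000, 150_000, 80_000]
predicted = [90_000, 138_000, 150_000, 92_000]

# Per-posting errors are 10%, 15%, 0%, and 15%.
print(f"{mape(actual, predicted):.1f}%")  # prints "10.0%"
```

Under this definition, a MAPE of 16% means that predictions deviate from the posted salary by 16% on average.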

The second essay introduces a framework for estimating the monetary value of individual skills and skill combinations in Data Science. Using a quasi-experimental design, job postings are segmented into treatment and control groups based on the presence of specific skill terms, isolating each skill’s marginal impact on salary outcomes. The concept of skill complementarity captures synergistic effects where certain skill pairs yield higher salary premiums together than the sum of their individual contributions. These findings benefit multiple stakeholders: job seekers can target high-value skills, while employers can make informed decisions about recruitment requirements.
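The idea of complementarity can be illustrated with a simple difference-in-means calculation. This sketch is an assumption-laden simplification: it compares raw group averages and does not reproduce the dissertation's quasi-experimental design, which is not detailed in this abstract. The postings, skill names, and the functions `mean_salary` and `complementarity` are all hypothetical.

```python
from statistics import mean

# Hypothetical postings: (skills mentioned, annual salary in USD). Not real data.
postings = [
    ({"python", "sql"}, 150_000),
    ({"python", "sql"}, 154_000),
    ({"python"}, 120_000),
    ({"python"}, 124_000),
    ({"sql"}, 110_000),
    ({"sql"}, 114_000),
    (set(), 100_000),
    (set(), 104_000),
]


def mean_salary(postings, has=(), lacks=()):
    """Average salary of postings that mention all of `has` and none of `lacks`."""
    salaries = [
        salary
        for skills, salary in postings
        if set(has) <= skills and not (set(lacks) & skills)
    ]
    if not salaries:
        raise ValueError("no postings match the requested skill pattern")
    return mean(salaries)


def complementarity(postings, a, b):
    """Premium of having both skills minus the sum of the single-skill premiums.

    Premiums are measured against postings that mention neither skill.
    """
    baseline = mean_salary(postings, lacks=(a, b))
    premium_a = mean_salary(postings, has=(a,), lacks=(b,)) - baseline
    premium_b = mean_salary(postings, has=(b,), lacks=(a,)) - baseline
    premium_both = mean_salary(postings, has=(a, b)) - baseline
    return premium_both - (premium_a + premium_b)


# Premiums here: python alone 20,000; sql alone 10,000; both together 50,000.
print(complementarity(postings, "python", "sql"))  # prints 20000
```

A positive result indicates that the pair earns more together than the two individual premiums would suggest, which is the synergy the abstract describes. A real analysis would also need to control for confounders such as seniority and location.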

This dissertation advances labor market transparency by developing text-driven machine learning models for salary prediction and skill valuation. By integrating advanced analytics with large-scale labor market data, the research contributes to labor market analytics with data-driven insights for policymakers, employers, and job seekers, highlighting the importance of strategic skill development in our increasingly dynamic job market.

Subjects

Salary Prediction

Natural Language Proc...

Skill Valuation

Machine Learning

Labor Market Analytic...

Artificial Intelligen...

Disciplines
Business Analytics
Labor Relations
Management Sciences and Quantitative Methods
Degree
Doctor of Philosophy
Major
Business Analytics
File(s)
Name
04_22_2025_Nikhil_Narayane_dissertation_draft.pdf
Size
5.7 MB
Format
Adobe PDF
Checksum (MD5)
bcdcec9daf061c145f2664e720b0b857
