Analyzing Census Income Data Using Machine Learning Models

Overview

This project explored supervised learning methods on the Census Income Dataset to predict whether an individual's annual income exceeds $50,000. It compared Naive Bayes and Decision Tree classifiers and used feature selection to identify the attributes most strongly associated with income level. We also analyzed the dataset's class imbalance and missing values, both of which affected model performance.

Course: Machine Learning for Media Technology
Date: January–March 2020
Collaborators: Beatrice Berg, Celine Helgesson Hallström, Ebba Rovig

Key Skills and Tools

  • Applied supervised learning methods (Naive Bayes, Decision Trees) for binary classification.

  • Used feature selection with SelectKBest (Chi-squared) to analyze feature importance.

  • Processed real-world datasets with Python libraries such as Pandas and scikit-learn.

  • Gained insights into data imbalances and the impact of missing values on model outcomes.
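The workflow described in the list above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is a small synthetic stand-in rather than the real Census Income records, the column names and the label rule are invented for the example, and the choice of GaussianNB, k=3, and dropping rows with missing values are all assumptions. The "?" marker for missing entries mirrors how the public Census Income (Adult) data encodes unknown values.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (hypothetical columns, NOT the real dataset).
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.integers(18, 70, n),
    "hours_per_week": rng.integers(10, 60, n),
    "education_num": rng.integers(1, 16, n),
    "workclass": rng.choice(["Private", "Self-emp", "Gov", "?"], n),
})

# Toy label rule (an assumption): roughly a quarter of rows are positive,
# so the classes are imbalanced.
score = df["education_num"] * 2 + df["hours_per_week"] * 0.5 + rng.normal(0, 5, n)
df["income_gt_50k"] = (score > score.quantile(0.75)).astype(int)

# Turn the "?" marker into a real missing value, then inspect the share of
# missing entries per column and the class balance.
df["workclass"] = df["workclass"].mask(df["workclass"] == "?")
missing_share = df.isna().mean()
class_share = df["income_gt_50k"].value_counts(normalize=True)

# Simplest handling for the sketch: drop incomplete rows and one-hot encode
# the categorical column. chi2 requires non-negative features, which holds here.
clean = df.dropna()
X = pd.get_dummies(clean.drop(columns="income_gt_50k"), dtype=int)
y = clean["income_gt_50k"]

# Stratified split keeps the class ratio similar in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit the chi-squared selector on the training split only, then keep the
# k highest-scoring features.
selector = SelectKBest(score_func=chi2, k=3).fit(X_train, y_train)
selected = list(X.columns[selector.get_support()])

# Compare the two classifier families on the selected features.
models = {
    "naive_bayes": GaussianNB(),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}
accuracies = {}
for name, model in models.items():
    model.fit(X_train[selected], y_train)
    accuracies[name] = model.score(X_test[selected], y_test)

print(missing_share.round(3).to_dict())
print(class_share.round(3).to_dict())
print(selected)
print(accuracies)
```

With imbalanced classes, accuracy alone can look good for a model that mostly predicts the majority class, so it is worth reading these scores alongside the class shares printed above.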
