Victor Johnson

Customer Segmentation Tool

October, 2017

Technologies used: Python (FastAPI) • scikit-learn • NumPy • Pandas
This project might be outdated as the repository is no longer maintained.

Python • Jupyter Notebook • scikit-learn • Pandas • NumPy • Plotly

This project segments customers using K-Means clustering and PCA (Principal Component Analysis). It generates synthetic customer data, runs clustering, and shows results as an interactive scatter plot using Plotly.


Features

  • Generate sample customer data
  • Perform K-Means clustering
  • Apply PCA for 2D visualization
  • Create interactive Plotly scatter plots
  • Includes a Jupyter notebook for exploration

This project is a practical introduction to customer segmentation with unsupervised machine learning. It uses K-Means clustering to group customers by behavioral patterns and applies Principal Component Analysis (PCA) to reduce the feature space to two dimensions for visualization. To keep the project accessible and self-contained, the customer data is synthetic and generated programmatically. The final output is an interactive Plotly scatter plot in which users can explore the customer clusters and see how the segments separate.
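The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names (`make_customers`, `segment`, `build_figure`), the column names, and the feature-scaling step are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def make_customers(n=300, seed=0):
    """Generate synthetic customers (hypothetical columns, not the project's schema)."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "age": rng.integers(18, 70, size=n),
        "annual_income": rng.normal(50_000, 15_000, size=n).round(2),
        "spending_score": rng.integers(1, 101, size=n),
        "visits_per_month": rng.poisson(4, size=n),
    })


def segment(df, n_clusters=4, seed=0):
    """Cluster with K-Means and project the same features to 2D with PCA."""
    # Assumption: features are standardized so income does not dominate distances.
    X = StandardScaler().fit_transform(df)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
    coords = PCA(n_components=2, random_state=seed).fit_transform(X)
    result = df.copy()
    result["cluster"] = labels
    result["pc1"] = coords[:, 0]
    result["pc2"] = coords[:, 1]
    return result


def build_figure(result):
    """Return an interactive scatter plot of the 2D projection, colored by cluster."""
    import plotly.express as px  # imported here so clustering works without Plotly

    # Cast labels to strings so Plotly treats clusters as categories, not a gradient.
    plot_df = result.assign(cluster=result["cluster"].astype(str))
    return px.scatter(plot_df, x="pc1", y="pc2", color="cluster",
                      hover_data=["age", "annual_income", "spending_score"])
```

Calling `build_figure(segment(make_customers()))` returns a Plotly figure that can be shown in a notebook or saved with `write_html`. Note that PCA is used here only for display: the clusters are computed on the full scaled feature set, then drawn in the two-component projection.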

Designed as a beginner-friendly learning project, it includes a modular code structure, a runnable command-line interface, and an exploratory Jupyter notebook for hands-on analysis. Users can adjust the number of clusters, run the tests to validate functionality, and view the results in a browser through the generated HTML visualizations. The project serves as a foundation for learning Python-based data analysis, clustering concepts, and interactive data visualization workflows.
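A command-line entry point along these lines might look like the sketch below. The flag names (`--clusters`, `--samples`, `--output`) and the default file name are hypothetical, and scikit-learn's `make_blobs` stands in for the project's own data generator; the real interface may differ.

```python
import argparse

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA


def parse_args(argv=None):
    """Parse and validate command-line options (flag names are illustrative)."""
    parser = argparse.ArgumentParser(
        description="Cluster synthetic customers and write an HTML scatter plot.")
    parser.add_argument("--clusters", type=int, default=4,
                        help="number of K-Means clusters")
    parser.add_argument("--samples", type=int, default=300,
                        help="number of synthetic customers")
    parser.add_argument("--output", default="segments.html",
                        help="path of the HTML file to write")
    args = parser.parse_args(argv)
    # K-Means needs at least as many samples as clusters.
    if not 2 <= args.clusters <= args.samples:
        parser.error("--clusters must be between 2 and --samples")
    return args


def main(argv=None):
    args = parse_args(argv)
    # Stand-in data: make_blobs replaces the project's own generator here.
    X, _ = make_blobs(n_samples=args.samples, n_features=4,
                      centers=args.clusters, random_state=0)
    labels = KMeans(n_clusters=args.clusters, n_init=10,
                    random_state=0).fit_predict(X)
    coords = PCA(n_components=2).fit_transform(X)

    import plotly.express as px  # imported here so argument parsing works without Plotly

    fig = px.scatter(x=coords[:, 0], y=coords[:, 1], color=labels.astype(str),
                     labels={"x": "PC 1", "y": "PC 2", "color": "cluster"})
    fig.write_html(args.output)  # standalone file that opens in any browser
    print(f"Wrote {args.output}")
    return 0
```

For example, `main(["--clusters", "5", "--output", "five.html"])` writes a five-cluster plot to `five.html`. In a script, `main()` would typically be called under an `if __name__ == "__main__":` guard.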