Victor Johnson | Senior AI & LLM Engineer | Production RAG & GenAI Systems

This project segments customers using K-Means clustering and PCA (Principal Component Analysis). It generates synthetic customer data, runs clustering, and shows results as an interactive scatter plot using Plotly.

Features

Generate sample customer data
Perform K-Means clustering
Apply PCA for 2D visualization
Create interactive Plotly scatter plots
Explore the results in a Jupyter notebook

This project is a practical introduction to customer segmentation with unsupervised machine learning. It uses K-Means clustering to group customers by behavioral patterns and applies Principal Component Analysis (PCA) to reduce the feature space to two dimensions for visualization. To keep the project accessible and self-contained, the customer data is synthetic and generated programmatically. The final output is an interactive Plotly scatter plot in which users can explore the customer clusters and see how segmentation works in real-world analytics scenarios.
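The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes NumPy and scikit-learn, and the function names (`make_customers`, `segment`), the four feature columns, and the standardization step are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def make_customers(n=300, seed=0):
    """Generate synthetic customers. The four columns are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    return np.column_stack([
        rng.integers(18, 70, n),         # age in years
        rng.normal(60_000, 15_000, n),   # annual income
        rng.uniform(1, 100, n),          # spending score
        rng.poisson(12, n),              # purchases per year
    ]).astype(float)


def segment(X, n_clusters=4, seed=0):
    """Return (cluster labels, 2-D PCA coordinates) for the feature matrix X."""
    # Standardize first so that large-valued columns (income) do not
    # dominate the Euclidean distances K-Means relies on.
    X_scaled = StandardScaler().fit_transform(X)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X_scaled)
    # PCA is used only to project the points to 2-D for plotting;
    # the clustering itself runs on all scaled features.
    coords = PCA(n_components=2).fit_transform(X_scaled)
    return labels, coords


X = make_customers()
labels, coords = segment(X, n_clusters=4)
print(coords.shape, np.bincount(labels))
```

Clustering on the full feature set and projecting afterwards keeps the 2-D view a visualization aid rather than an input to K-Means. Clustering on the PCA output instead is another reasonable design, and the project may do either.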

Designed as a beginner-friendly learning project, it has a modular structure, a command-line interface, and an exploratory Jupyter notebook for hands-on analysis. Users can adjust the number of clusters, run the tests to validate functionality, and view the results in a browser through the generated HTML visualizations. The project is a foundation for learning Python-based data analysis, clustering concepts, and interactive data visualization workflows.

Customer Segmentation Tool

Features