Victor Johnson
Data Scientist / AI Engineer | Production RAG & LLM Systems | Data Science, AI, ML | Azure, Python
Data Scientist / AI Engineer with 7+ years in ML/data and 2+ years specializing in production LLM/GenAI systems, including RAG, intelligent search, and LLM-powered applications in Azure.
I own LLM systems end-to-end: ingestion, embeddings, retrieval, prompt orchestration, evaluation, deployment, and monitoring, with a focus on reliability, performance, and cost in real products.
Core: LLMs, RAG, vector search, LLM evaluation, Python, Azure OpenAI, Spark, MLOps.
England, United Kingdom

Tech Stack
LLMs & Generative AI
- Large Language Models (LLMs)
- Generative AI
- Prompt Engineering
- Retrieval-Augmented Generation (RAG)
- Semantic Search
- Embeddings & Vector Search
- Vector Databases
- LLM Evaluation & Monitoring
- LLMOps
- Chatbot Development
- LlamaIndex
- Azure OpenAI
- Azure AI Foundry
Machine Learning & NLP
- Machine Learning
- Applied Machine Learning
- Natural Language Processing (NLP)
- Scikit-learn
- Artificial Intelligence (AI)
- Predictive Maintenance
Data Science & Analytics
- Data Science
- Python
- SQL
- Transact-SQL (T-SQL)
- Statistical Modeling
- Data Visualization
- Financial Modeling
Data Engineering & Big Data
- Apache Spark
- Azure Databricks
- Apache Kafka
- Kafka Streams
- Apache Airflow
- Azure Data Factory
- Elasticsearch
- Azure SQL
Knowledge Graphs & Search
- Knowledge Graphs
- Knowledge Graph Data Engineering
- Semantic Search & Information Retrieval
MLOps & Production AI
- MLOps
- MLflow
- Model Monitoring
- Model Drift & Bias Detection
- Docker
Cloud & Platforms
- Microsoft Azure
- Azure Machine Learning
- Oracle Cloud Infrastructure (AI Foundations)
Software & APIs
- FastAPI
- REST APIs
- Software Development
- Git
- GitHub
Experience
Data Scientist (AI/LLM Focus)
Feb 2021 – Present | Company - SYMEUS LTD
England, United Kingdom · Hybrid | Industry - Finance
GenAI & LLM Systems (Primary):
- Led end-to-end design and production deployment of LLM, RAG, and Agentic AI systems, covering ingestion, embedding pipelines, retrieval, orchestration, simulation, evaluation, deployment, and monitoring, reducing manual effort by ~50%.
- Built LLM-powered search, chatbots, and ranking workflows used in live products.
- Designed evaluation frameworks (precision/recall, F1, BLEU/ROUGE, relevance, faithfulness) to reduce hallucinations and improve grounding.
- Implemented prompt engineering, guardrails, output validation, and LoRA-based adaptation for domain-specific reliability and performance.
- Ran structured A/B testing across models, embeddings, retrievers, and prompts to optimize quality, latency, and cost.
- Contributed to DevOps and CI/CD workflows for deploying and monitoring GenAI / LLM solutions in production, supporting versioning, testing, and release automation.
Machine Learning & Applied Analytics (Secondary):
- Developed forecasting and statistical models for revenue, traffic, and user activity using Python and ML frameworks.
- Implemented Monte Carlo simulations for scenario planning and risk modelling, improving forecast robustness and reducing manual sensitivity analysis time by ~30%.
- Built automated analytics pipelines integrating GA4, GSC, and internal datasets.
Data Engineering, Platform & DevOps:
- Designed scalable ETL and data pipelines (Spark, Azure Synapse) supporting ML and near real-time analytics.
- Optimized complex SQL/T-SQL workloads and built KPI dashboards (Power BI, Streamlit).
- Strengthened data governance and validation, reducing reporting turnaround time by ~40%.
Data Scientist (ML Focus)
Dec 2018 – Dec 2020 | Company - Alstom
Bengaluru, India · On-site | Industry - Railways
Applied Machine Learning & Predictive Systems:
- Developed production machine learning models for predictive maintenance, including Remaining Useful Life (RUL) estimation for critical train components.
- Built time-series and survival analysis models to predict failures, degradation, and maintenance needs across multiple subsystems.
- Analyzed high-volume telemetry data to identify failure patterns, sensor drift, and anomalous behavior in operational environments.
- Designed component-level health indicators and engineered features that improved prediction accuracy and model stability.
- Applied Monte Carlo simulations to model failure uncertainty and maintenance scheduling scenarios, improving preventive maintenance planning and reducing unplanned downtime risk while accelerating decision analysis cycles.
Data Engineering & Model Integration:
- Implemented robust data validation, preprocessing, and feature pipelines to ensure reliability of sensor and operational data.
- Integrated predictive models into operational reporting and decision-support systems, enabling faster and more informed maintenance decisions.
- Supported condition-based maintenance strategies that improved fleet availability and reduced unplanned downtime.
Analytics Platforms & Visualization:
- Built dashboards in Shiny, Qlik Sense, and Tableau to visualize asset health, predictions, and maintenance KPIs.
- Contributed to the setup of the operations center by delivering KPI-driven visualizations and automated model outputs.
- Applied IEC 62541 standards to improve data acquisition consistency and interoperability across systems.
Junior Data Scientist
Dec 2017 – Dec 2018 | Company - Alstom
Bengaluru, India · On-site | Industry - Railways
- Developed an Elasticsearch-based search application leveraging BM25 relevance scoring, inverted indexes, and NLP techniques for information retrieval.
- Developed automated ETL pipelines using Apache Airflow.
- Contributed to Spark and Kafka streaming workflows for real-time telemetry data processing.
- Built RPA workflows to automate SAP-based maintenance data handling.
Data Science Intern
Sep 2017 – Nov 2017 | Company - Pi Revolutions
Bengaluru, India · On-site | Industry - Retail Tech
- Supported data analysis and automation workflows for NFC-enabled billing kiosks.
- Performed exploratory analysis to support process improvements.
Intern
Aug 2016 – Sep 2016 | Company - Alstom
Bengaluru, India · On-site | Industry - Railways
- Automated routine business processes using Excel VBA in line with internal data standards, saving more than 10 hours of reporting work each week.
Education
Master of Science in Data Science
2021 – 2022 | Grade: Distinction | University of East Anglia
Academic Projects:
- Depression Detection Using Machine Learning (2021)
Bachelor of Technology in Computer Science
2013 – 2017 | Grade: First Class | University of Calicut
Academic Projects:
- Weather Forecasting Using Data Mining (2017)
- Traffic Sign Board Detection and Alerting Using Computer Vision (2016)
Certifications
Agentic AI Design Patterns for GenAI and Predictive AI
Azure OpenAI: Advanced Topics
ISO/IEC 42001:2023: Understanding and Implementing the Artificial Intelligence Management System (AIMS) Standard
AI Security & Governance Certification
Skills: Artificial Intelligence (AI) · Governance · AI Security
Data Versioning, Lineage, and Quality Monitoring for AI
Introduction to MLSecOps
Skills: Machine Learning · MLOps · Artificial Intelligence (AI)
Knowledge Graph Data Engineering for Generative AI Use Cases
Skills: Generative AI · Knowledge Graphs · Retrieval-Augmented Generation (RAG) · Artificial Intelligence (AI)
MLOps Essentials: Monitoring Model Drift and Bias
Skills: MLOps · Artificial Intelligence (AI)
MLOps and Data Pipeline Orchestration for AI Systems
Semantic Search and Information Retrieval using GenAI
Skills: Generative AI · Semantic Search · Artificial Intelligence (AI)
Working with Data: Engineering, Integration, and MLOps for AI
Skills: Large Language Model Operations (LLMOps) · Vector Databases · MLOps · Artificial Intelligence (AI)