Victor Johnson
Data Scientist | AI Engineer | Production RAG & LLM Systems | Agentic AI | Data Science, AI, ML | Azure, Python | LangGraph, LangChain
AI Engineer with 7+ years of experience and deep expertise in Generative AI, Retrieval-Augmented Generation (RAG), and multi-agent LLM systems. Proven track record designing scalable AI platforms for semantic search, document QA, and agentic workflows using Azure OpenAI, vector databases, LangChain, LlamaIndex, and LangGraph.
Experienced in LLM evaluation, grounding, guardrails, and low-latency production deployments using Python, FastAPI, and cloud MLOps.
Core: LLMs, RAG, Vector Search, Agentic Systems, LLM Evaluation, Python, Azure OpenAI, Spark, LangChain, LangGraph, MLOps.
England, United Kingdom

Tech Stack
RAG & Knowledge Systems
- Retrieval-Augmented Generation (RAG)
- Semantic Search
- Embeddings & Vector Search
- Vector Databases
- Hybrid Retrieval
- Grounding & Guardrails
- LLM Evaluation & Monitoring
- RAGAS
- DeepEval
- LlamaIndex
Agentic AI & LLM Systems
- Large Language Models (LLMs)
- Generative AI
- Prompt Engineering
- Agentic Workflows
- Multi-Agent Systems
- Tool Use & Function Calling
- LangChain
- LangGraph
- Chatbot Development
- LLMOps
- Azure OpenAI
- Azure AI Foundry
Machine Learning & NLP
- Machine Learning
- Applied Machine Learning
- Natural Language Processing (NLP)
- Scikit-learn
- Artificial Intelligence (AI)
- Predictive Maintenance
- PyTorch
Data Science & Analytics
- Data Science
- Python
- SQL
- Transact-SQL (T-SQL)
- Statistical Modeling
- Data Visualization
- Financial Modeling
Data Engineering & Big Data
- Apache Spark
- Azure Databricks
- Apache Kafka
- Kafka Streams
- Apache Airflow
- Azure Data Factory
- Elasticsearch
- Azure SQL
Knowledge Graphs & Search
- Knowledge Graphs
- Knowledge Graph Data Engineering
- Semantic Search & Information Retrieval
MLOps & Production AI
- MLOps
- MLflow
- Model Monitoring
- Model Drift & Bias Detection
- Docker
Cloud & Platforms
- Microsoft Azure
- Azure Machine Learning
- Oracle Cloud Infrastructure (AI Foundations)
Software & APIs
- FastAPI
- REST APIs
- Software Development
- Git
- GitHub
Experience
Data Scientist (AI/LLM Focus)
Feb 2021 – Present | Company - SYMEUS LTD
England, United Kingdom · Hybrid | Industry - Finance
- Architected and deployed end-to-end RAG systems using Azure OpenAI (LLMs and embeddings), Azure Vector Search, Azure SQL, LlamaIndex, LangChain, and FastAPI, reducing manual research effort by ~50% and improving retrieval relevance by 25–35% while optimising latency to production-acceptable ranges.
- Delivered AI-driven semantic search and dynamic content ranking across international financial platforms serving 100K+ monthly users, enabling scalable, context-aware LLM-powered content experiences.
- Built agentic LLM workflows using LangGraph with tool orchestration, guardrails, and Monte Carlo simulations for probabilistic financial modelling and scenario analysis, contributing to ~20% revenue uplift.
- Applied statistical modelling and quantitative financial analysis techniques to support AI-driven decision systems and financial insight generation.
- Implemented automated LLM evaluation, regression testing, and drift monitoring (data, embeddings, retrieval) to detect hallucinations and silent quality degradation, reducing invalid responses by ~20–25%.
- Enhanced RAG retrieval by combining vector similarity with entity-aware ranking and structured relationship signals to improve contextual grounding and complex query handling.
- Developed scalable ingestion and preprocessing pipelines (chunking, tokenisation, embeddings, Mistral OCR) processing tens of thousands of structured and unstructured documents per run.
- Built and maintained Spark/Azure Synapse pipelines supporting ML/AI workloads and high-volume data processing, ensuring reliable training and inference pipelines.
Data Scientist (ML Focus)
Dec 2018 – Dec 2020 | Company - Alstom
Bengaluru, India · On-site | Industry - Railways
- Built and productionised machine learning models for predictive maintenance, including Remaining Useful Life (RUL), degradation modelling, time-series forecasting, and survival analysis for critical train components.
- Analysed millions of telemetry events per day to detect anomalies, sensor drift, and early failure signals, delivering ~5–10% maintenance cost savings.
- Extended component usable life by ~10–15% through improved failure prediction and optimised maintenance scheduling strategies.
- Applied Monte Carlo simulations to model failure uncertainty and maintenance scenarios, supporting data-driven preventive maintenance planning.
- Developed scalable data preprocessing and feature engineering pipelines and integrated ML model outputs into operational decision-support systems used by engineering and maintenance teams.
- Collaborated with cross-functional engineering and operations teams to deploy and monitor ML models in production environments.
Junior Data Scientist
Dec 2017 – Dec 2018 | Company - Alstom
Bengaluru, India · On-site | Industry - Railways
- Built a search system using Elasticsearch and Python (Flask), improving document retrieval efficiency by ~20%.
- Developed automated ETL pipelines using Apache Airflow and contributed to Spark and Kafka streaming workflows for high-frequency telemetry data.
- Created RPA workflows to automate SAP-based data processing, reducing manual effort and improving throughput.
Data Science Intern
Sep 2017 – Nov 2017 | Company - Pi Revolutions
Bengaluru, India · On-site | Industry - Retail Tech
- Supported data analysis and automation workflows for NFC-enabled billing kiosks.
- Performed exploratory analysis to support process improvements.
Intern
Aug 2016 – Sep 2016 | Company - Alstom
Bengaluru, India · On-site | Industry - Railways
- Automated routine business processes using VBA in Excel, ensuring compliance with internal data standards and saving more than 10 hours of reporting work each week.
Education
Master of Science in Data Science
2021–2022 | Grade: Distinction | University of East Anglia
Academic Projects:
- Depression Detection Using Machine Learning (2021)
Bachelor of Technology in Computer Science
2013–2017 | Grade: First Class | University of Calicut
Academic Projects:
- Weather Forecasting Using Data Mining (2017)
- Traffic Sign Board Detection and Alerting using Computer Vision (2016)
Certifications
- Agentic AI Design Patterns for GenAI and Predictive AI
- Azure OpenAI: Advanced Topics
- ISO/IEC 42001:2023: Understanding and Implementing the Artificial Intelligence Management System (AIMS) Standard
- AI Security & Governance Certification
  Skills: Artificial Intelligence (AI) · Governance · AI Security
- Data Versioning, Lineage, and Quality Monitoring for AI
- Introduction to MLSecOps
  Skills: Machine Learning · MLOps · Artificial Intelligence (AI)
- Knowledge Graph Data Engineering for Generative AI Use Cases
  Skills: Generative AI · Knowledge Graphs · Retrieval-Augmented Generation (RAG) · Artificial Intelligence (AI)
- MLOps Essentials: Monitoring Model Drift and Bias
  Skills: MLOps · Artificial Intelligence (AI)
- MLOps and Data Pipeline Orchestration for AI Systems
- Semantic Search and Information Retrieval using GenAI
  Skills: Generative AI · Semantic Search · Artificial Intelligence (AI)
- Working with Data: Engineering, Integration, and MLOps for AI
  Skills: Large Language Model Operations (LLMOps) · Vector Databases · MLOps · Artificial Intelligence (AI)