Cloud-agnostic data platforms | GenAI pipelines | Production deployments | 100% reliable delivery
I design and lead pragmatic data engineering solutions that turn raw data into reliable, production-grade insights. I have 6+ years of experience with organisations across the world, such as RAKEZ and BCG, building metadata-driven ingestion, quality frameworks, and scalable pipelines on AWS, Azure, and GCP.
"A professional Data AI Engineer and DevOps Enthusiast who is building data recipes using the coding hands topped with GenAI fingers to fill the tummy with insights so that the mind can take decisions and unlock new possibilities."
Data & AI Engineer Professional | DevOps Enthusiast | 8x Databricks | 1x Cloudera | HackerRank Gold - SQL & Python | @RAKEZ | Ex-BCG | 11k+ on LinkedIn
Cloud-agnostic, metadata-driven frameworks for ingestion, data quality, Delta/Lakehouse design, and robust orchestration.
Infrastructure-as-code and platform automation on AWS, Azure, and GCP using Kubernetes, Terraform, Helm, and GitHub Actions.
Customer 360 analytics, feedback analysis, PII detection/masking, automated reporting, and LLM-powered assistants.
Mentoring, best-practice enablement, and documentation to help teams scale delivery safely and predictably.
DAMA-aligned strategy covering modeling, cataloging, lineage, observability, and automated quality guardrails across SAP, Salesforce, and ERP sources.
1:1 mentoring, 200+ mock interviews, and enablement programs that accelerate team onboarding and keep engineering rigor consistent.
Python, Bash, HCL
Databricks, Snowflake, Hive, Vertica
PySpark, Hadoop, Kafka, Airflow, NiFi, Sqoop, Oozie, Delta Lake
EMR, Glue, Lambda, Redshift, AKS, ADLS, Dataproc, GKE
Docker, Kubernetes (EKS/AKS/GKE), Helm, Terraform, CI/CD (GitHub Actions)
LLMs (GPT/Llama), OpenAI APIs, LangChain, LangGraph, Streamlit
MySQL, Postgres
Draw.io, Mermaid, Lucidchart, Miro
Linux (scripting), Windows, macOS
GitHub, GitLab, Bitbucket, PyCharm, Sublime
VPC, NAT/SG/NACL, EC2, S3, IAM, EMR, Glue, Lambda, Athena, Redshift, EKS, ECR
VNet, NSG, VM, ADLS/ABFS, IAM, Azure Functions, AKS, ACR, AD
VPC, Firewall, VM, GCS, IAM, Cloud Functions, Dataproc, GKE, GCR, Cloud NAT
"Karan builds pragmatic and reliable data platforms with impressive speed and precision. He communicates clearly, mentors his peers, and consistently raises the bar for quality and delivery. His work is focused, structured, and strategically executed, even in complex scenarios. I remember working with him on a critical finance data logic project, despite the complexity, he took full ownership and delivered an exceptional solution on time."
"His cloud-agnostic approach and automation mindset significantly accelerated our delivery timelines while enhancing data quality and observability. We were able to deploy infrastructure for multiple clients across different cloud environments seamlessly within just 1-2 weeks. His strong technical and coding skills truly set him apart and made a measurable difference to our success."
RAKEZ
BCG
BCG
TTND
TTND
APJ Abdul Kalam Technical University, India
Core focus on software engineering, data structures, analytics, and systems design.
Science coursework emphasizing mathematics, physics, and problem-solving.
Databricks
Databricks
Cloudera
Databricks
Databricks
Databricks
Databricks
Databricks
Databricks
HackerRank
HackerRank
End-to-end Customer 360 model (5TB+) covering email engagement (CTR/CTOR/OR/CR), RFM, and omni-channel analysis. Built with scalable ETL and robust governance.
Python library to submit/monitor Spark jobs remotely on AWS/Azure/GCP/Databricks, managing full cluster lifecycle, CI/CD, and artifacts.
GenAI-assisted pipelines for Customer 360 insights, PII detection/masking, automated report generation, and feedback analysis.
Scrapes, processes, and visualizes US real estate data monthly, built with BeautifulSoup, PySpark, and reporting automation.
Streamlit app that aggregates and analyzes agri data using PySpark and GPT-4o to produce farmer-ready insights.
Conversational portfolio experience using OpenAI APIs with WhatsApp integration for interactive Q&A.
Flask API + React frontend + MySQL, deployed on AWS EKS with ALB, CloudWatch, IAM, VPC networking, and Kubernetes best practices.
Enterprise pivot to governed, reusable data pipelines: ingestion, data quality, standardized logging, and observability patterns for scale.
Helped 500+ students from tech and non-tech backgrounds with DevOps, Data Engineering, and GenAI across LinkedIn, GitHub, Fiverr, and more.
Conducted 200+ mock interviews for candidates targeting Data Engineering, DevOps, and AI roles at BCG, McKinsey, Bain, Walmart, EY, Deloitte, KPMG, TCS, Infosys, Wipro, and more.
Created 50+ Fiverr gigs and delivered for 350+ global clients with 5-star reviews, spanning AWS, Azure, GCP, Databricks, Snowflake, Docker, K8s, Terraform, and Python/PySpark/SQL.