All Projects

Open-source projects spanning LLMs, ML, and data engineering

Featured

AI Research Agent

Autonomous research agent that decomposes queries, executes multi-turn tool calling with web search and arXiv, and validates completeness through self-reflection. Features a real-time Streamlit UI.

LLM, AI Agents, RAG
Python, Google Gemini, Streamlit, +1 more
Featured

Weaviate MCP Inspector

Natural language interface for Weaviate vector databases through Claude using Model Context Protocol. Enables intuitive database exploration via conversation with 9 inspection tools.

Vector DB, MCP, Tools
Python, Weaviate, FastMCP, +1 more
Featured

Handwritten Text Recognizer

Deep learning OCR system with a ResNet encoder and a Transformer decoder (14M parameters). Achieves 70% error reduction via augmentation. Deployed as a FastAPI microservice on GCP with monitoring.

Computer Vision, Deep Learning, MLOps
PyTorch, FastAPI, Docker, +1 more

Movie Recommendation System

Netflix-inspired collaborative filtering system using matrix factorization (SVD, SVD++) on 480K+ users and 17K+ movies. Implements KNN-based recommendations with comprehensive feature engineering.

ML, Recommender Systems, Collaborative Filtering
Python, Scikit-learn, Surprise, +1 more
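To illustrate the matrix factorization idea behind this project, here is a minimal sketch of a Funk-style factorization trained with stochastic gradient descent. It is not the project's code: the function names (`train_mf`, `predict`, `rmse`), the toy ratings, and all hyperparameters are assumptions chosen for illustration, and the sketch omits the per-user and per-item bias terms that fuller SVD and SVD++ variants use.

```python
import math
import random


def train_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02, epochs=300, seed=0):
    """Learn user factors P and item factors Q so that mu + P[u].Q[i] ~ rating."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(mu, P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return mu, P, Q


def predict(mu, P, Q, u, i):
    """Predicted rating: global mean plus the user-item factor dot product."""
    return mu + sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def rmse(ratings, mu, P, Q):
    """Root mean squared error of the model over a list of (user, item, rating)."""
    se = sum((r - predict(mu, P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(se / len(ratings))


# Hypothetical toy data: (user_id, item_id, rating).
toy_ratings = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5),
    (2, 0, 1), (2, 1, 2), (2, 2, 5),
    (3, 0, 1), (3, 2, 4),
]
mu, P, Q = train_mf(toy_ratings, n_users=4, n_items=3)
print("training RMSE:", rmse(toy_ratings, mu, P, Q))
print("predicted rating for user 1, item 2:", predict(mu, P, Q, 1, 2))
```

On a dataset the size of the one described above, a library implementation such as the one in Surprise would be the practical choice; the loop here only shows what the factorization optimizes.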

Social Network Link Prediction

Graph ML solution for predicting missing links in a Facebook social network graph (1.86M nodes, 9.4M edges). Engineers 30+ graph features and reaches a 0.92 F1 score with a Random Forest.

Graph ML, Social Networks, Feature Engineering
Python, NetworkX, Scikit-learn
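As a rough illustration of what graph features for link prediction can look like, the sketch below computes four common pairwise similarity scores from an adjacency map. Which features the project actually uses is not stated here, so the feature set, the function names (`build_adjacency`, `pair_features`), and the treatment of the graph as undirected are all assumptions.

```python
import math


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def pair_features(adj, u, v):
    """Similarity features for a candidate link between distinct nodes u and v."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # Adamic-Adar: shared neighbors with fewer connections count for more.
        "adamic_adar": sum(
            1.0 / math.log(len(adj[w])) for w in common if len(adj[w]) > 1
        ),
        "preferential_attachment": len(nu) * len(nv),
    }


# Hypothetical toy graph; nodes 1 and 4 are not yet connected.
toy_edges = [(1, 2), (1, 3), (2, 3), (3, 4), (2, 4)]
adj = build_adjacency(toy_edges)
print(pair_features(adj, 1, 4))
```

Feature dictionaries like this one, computed for both existing and sampled non-existing edges, could then be stacked into a table and passed to a classifier such as a Random Forest.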

Tweet Sentiment Extractor

Transformer-based NLP model that extracts sentiment-bearing phrases from tweets. Fine-tuned question-answering implementation for a Kaggle competition, with comprehensive error analysis.

NLP, Transformers, Sentiment Analysis
Python, Transformers, TensorFlow

ETL Pipeline

End-to-end data pipeline orchestrating NYC taxi data from extraction to BigQuery loading. Uses Apache Spark on Dataproc with Prefect orchestration and Terraform for infrastructure as code.

Data Engineering, ETL, Cloud
Python, Apache Spark, GCP, +2 more
