Saumik Dana

Software Developer · Production ML Systems · PhD

US Green Card Holder

About

I'm a computational scientist who builds and deploys full-stack, production-grade machine learning systems. I hold a PhD from the University of Texas at Austin, where I developed a strong foundation in numerical methods, stochastic modeling, and scientific computing. Today I apply that rigor to quantitative finance, optimization, anomaly detection, and NLP systems that need to work reliably in production.

I've shipped automated trading systems, RAG-based analytics platforms, structured-data Q&A agents, and daily performance dashboards. I care about the full lifecycle: feature engineering, model development, experiment design, deployment, evaluation, and monitoring.

Experience
Software Developer · Feb 2024 – Dec 2025
Zebra Capital Management LLC · Stamford, CT

I built production ML systems and trading infrastructure for quantitative strategies. That included intraday volatility-arbitrage options strategies running on AWS Lambda, constrained nonlinear optimization for calibrating stochastic volatility models, and NSGA-II Pareto-front portfolio construction for sector ETF allocation with regime-adaptive rebalancing.

ML · Trading Systems · NLP · Analytics Apps · AWS · Serverless Deployment

I also worked extensively on model development and experimentation: optimizing options-chain backtesting pipelines with Pandas, deploying Hidden Markov Models with Bayesian Information Criterion (BIC) model selection for regime detection, and running A/B tests comparing Shapley-based and model-based feature attribution methods for feature selection. On the sequence-modeling side, I deployed Chronos embeddings, LoRA fine-tuning workflows, and AutoGluon-backed trading systems on Modal and AWS Lambda.

Beyond trading systems, I built multiple NLP analytics products. For unstructured data, I delivered a RAG-powered PDF Q&A application with Streamlit, Groq-hosted Llama models, and ChromaDB/Qdrant vector backends, including ensemble retrieval, deduplication, reranking, and dual-LLM answer refinement. For structured data, I built a Groq-based Q&A agent with a SQL-backed connector layer, LLM-based planning and evaluation, and domain-aware query correction.

I rounded this out with a FastAPI and Streamlit explainer microservice for US Treasury press releases using Groq and FAISS, and production safeguards including semantic injection detection, Presidio-based PII redaction, Guardrails AI output enforcement, LangSmith tracing, and offline evaluation. I also built React performance dashboards backed by DynamoDB for day-to-day monitoring.

Computational Engineer · Aug 2023 – Nov 2023
VISIE Inc. · Austin, TX

I joined during the early integration phase of a robotic surgical navigation platform that combined medical imaging with robotic actuation. My work focused on implementing TCP/UDP communication protocols for robotic arm control, contributing to end-to-end packaging with Poetry, and supporting deployment through Azure tooling. The team delivered a successful live system demonstration that supported the company's Series A fundraising effort.

Computational Lead · Aug 2022 – Mar 2023
Sophelio · Austin, TX

I adapted physics-informed modeling originally developed for fusion experiment data to financial time series. The resulting production pipeline combined sparse regression for PDE construction, automated signal generation, and automated execution. I also implemented a CAGR-maximizing Bayesian TPE optimizer for swing-trading thresholds and deployed the system on AWS Lambda with CloudFormation, Docker, GitHub Actions, and EventBridge scheduling.

Technical Skills

ML & Optimization

Model development, feature engineering, and optimization across classical ML, probabilistic models, and evolutionary search.

Scikit-learn · AutoGluon · Tree Models · SHAP · Optuna · Pymoo · TA-Lib · PEFT · SLSQP

Cloud & MLOps

Serverless-first production systems with CI/CD, infrastructure-as-code, containers, and model deployment workflows.

Lambda · DynamoDB · EventBridge · CloudFormation · Docker · GitHub CI · HuggingFace · Modal

Programming

Python-first engineering with scientific computing, fast data processing, and frontend work for internal analytics tools.

Python · Pandas · NumPy · SciPy · JAX · Numba · PyTorch · React · HTML · Linux/Bash

APIs & Web

Production APIs and app backends connecting market data, broker integrations, and interactive analytics interfaces.

Alpaca · OANDA · yfinance · FastAPI · Pydantic · uvicorn · Streamlit · Groq

NLP & LLMs

RAG systems and LLM tooling with guardrails, observability, vector retrieval, and evaluation for production use.

LangChain · LangSmith · OpenAIGuard · Presidio · Guardrails AI · FAISS · ChromaDB · Qdrant

Education
Doctor of Philosophy, Engineering Mechanics
University of Texas at Austin · Austin, TX

Advanced training in numerical simulation, scientific computing, and mathematical modeling that continues to inform my work in ML and quantitative systems.

Master of Engineering, Mechanical Engineering
Indian Institute of Science · Bangalore, India

Bachelor of Engineering, Mechanical Engineering
University of Mumbai · Mumbai, India

Life

Beyond the code

Beer Montage Road Trip Adventures