Saumik Dana

Computational Engineer · Full-Stack ML/LLM Systems · PhD

US Green Card Holder

About

I'm a computational engineer who designs, ships, and operates full-stack ML and LLM-based systems end-to-end. My PhD at the University of Texas at Austin grounded me in numerical methods, stochastic modeling, and scientific computing, the toolkit I now apply to quantitative finance, retrieval-augmented generation, and multimodal document intelligence.

In practice, that means systematic trading systems running in production, LLM-backed research and discovery tooling, hybrid retrieval pipelines, and the serverless cloud scaffolding around them. I'm comfortable owning research, infrastructure, and production code as a single workflow — feature engineering, model selection, calibration, deployment, evaluation, and the unglamorous infrastructure that keeps any of it from breaking at 3 a.m.

Experience
Quantitative Researcher Feb 2024 – Dec 2025
Asset Management Firm · Stamford, CT

Production trading systems. Shipped several systematic strategies: an interday equity options book (LoRA-tuned Chronos embeddings + TabPFN classifiers), a long-short equity portfolio driven by LightGBM alpha/beta forecasts and genetic-algorithm optimization that backtested to 20%+ CAGR with 10% annualized alpha, an intraday options strategy on anomaly-detection signals with Heston/FFT calibration, and an intraday FX system steered by ensemble LLM inference (Groq) over macro news and rate differentials.

Infrastructure & LLM tooling. Built it all on a serverless-first stack: AWS (Lambda, DynamoDB, S3, EventBridge, CloudFormation) plus Modal, with GitHub Actions CI/CD and digest-pinned ECR images. On the LLM side: a Treasury/macro RAG pipeline benchmarking FAISS, Whoosh, SQLite FTS5, and Postgres tsvector; a Llama-4-Scout multimodal PDF Q&A system with Qdrant indexing; and a Streamlit research tool that mines for trading rules that backtest above a 1.5 Sharpe ratio.

Research. Benchmarked time-series foundation models (Chronos vs. TimesFM, base vs. LoRA-adapted), ran XGBoost-vs-LightGBM alpha/beta studies with SHAP, and experimented with HMM volatility-regime detection, Black-Litterman shrinkage, and NSGA-II Pareto optimization across the alpha/beta tradeoff.

Computational Engineer Aug 2023 – Nov 2023
VISIE Inc. · Austin, TX

I joined during the early integration phase of a surgical navigation platform combining imaging and robotic actuation. My work centered on implementing TCP/UDP communication protocols for robotic arm motion control, supporting end-to-end product packaging and deployment, and helping stage a live demonstration that anchored the company's successful Series A round.

Computational Lead Aug 2022 – Mar 2023
Sophelio · Austin, TX

I adapted physics-informed modeling originally developed for fusion experiment data to financial time series. The resulting production pipeline used sparse regression for PDE construction and signal generation, paired with a CAGR-maximizing Bayesian TPE optimizer driving a swing-trading system deployed on AWS Lambda.

Technical Skills

ML & Optimization

Model development, feature engineering, and optimization across classical ML, probabilistic models, time-series foundation models, and evolutionary search.

PyTorch TabPFN Chronos PEFT/LoRA LightGBM XGBoost HMM SHAP Pymoo SciPy autograd

Cloud & MLOps

Serverless-first production systems with CI/CD, infrastructure-as-code, containers, and model deployment workflows.

Lambda DynamoDB S3 ECR EventBridge API Gateway CloudFormation Docker GitHub Actions Modal

Programming

Python-first engineering with scientific computing, fast data processing, and frontend work for internal analytics tools.

Python Pandas NumPy PyArrow DuckDB Polars scikit-learn JavaScript React/JSX HTML

APIs & Dashboards

Production APIs and app backends connecting market data, broker integrations, feeds, and interactive analytics interfaces.

Alpaca OANDA TradeStation yfinance FastAPI Mangum Next.js Streamlit Recharts RSS/XML

LLM & RAG

LLM tooling and hybrid retrieval systems combining lexical and vector search for production RAG, plus multimodal document intelligence.

Groq LangChain LangGraph FAISS Qdrant Whoosh SQLite FTS5 PostgreSQL tsvector sentence-transformers
Education
Doctor of Philosophy in Engineering Mechanics
University of Texas at Austin
Austin, TX. Advanced training in numerical simulation, scientific computing, and mathematical modeling that continues to inform my work in ML and quantitative systems.
Life

Beyond the code

Beer Montage Road Trip Adventures