Saumik Dana

Computational Engineer · Full-Stack ML/LLM Systems · PhD

US Green Card Holder

About

I'm a computational engineer who designs, ships, and operates full-stack ML and LLM-based systems end-to-end. My PhD at the University of Texas at Austin grounded me in numerical methods, stochastic modeling, and scientific computing, a toolkit I now apply to quantitative finance, retrieval-augmented generation, and multimodal document intelligence.

In practice, that means production trading strategies, LLM-backed research and discovery agents, hybrid retrieval pipelines, and the cloud-native scaffolding around them. I care about the whole lifecycle: feature engineering, model selection, calibration, deployment, evaluation, and the boring infrastructure that keeps any of it from breaking at 3 a.m.

Experience
Quantitative Researcher · Feb 2024 – Dec 2025
Asset Management Firm · Stamford, CT

I built and ran production ML and trading infrastructure on a serverless-first stack (AWS Lambda, DynamoDB, S3, EventBridge, CloudFormation, and Modal), wired together with a GitHub CI/CD pipeline that used Docker buildx with an ECR registry cache. The stack also included React dashboards over the DynamoDB/S3 state and lifecycle cleanup of stale ECR images. On top of that foundation I shipped an anomaly-detection-backed volatility arbitrage strategy for intraday options, an interday options system driven by LoRA-tuned Chronos and TabPFN, and a thematic equity portfolio backed by a bi-objective optimizer with regime-adaptive rebalancing.

On the LLM side, I evaluated Groq-backed agents with RSS/XML ingestion, guardrails, and structured trading-signal outputs; engineered intraday FX signal evaluation agents driven by FX and macro news; and benchmarked hand-rolled prompt architectures against LangGraph for FX trade position sizing. I also built a ticker alert pipeline with hybrid retrieval (SQLite FTS5, PostgreSQL tsvector, Whoosh, and FAISS) feeding RAG, along with a Streamlit app for query-based signal mining over backtested options strategies and a multimodal embedding model wired to a Next.js PDF Q&A app for market research.

Behind all of that was a steady stream of experimentation: bi-objective optimization with different Pareto-front selection rules, A/B tests comparing SHAP versus native feature importance, Black–Litterman portfolio construction, fast jump-diffusion-aware Heston model calibration via constrained optimization, and out-of-sample (OOS) benchmarks pitting Google TimesFM against Amazon Chronos and LoRA-adapted Chronos against the base model.

Computational Engineer · Aug 2023 – Nov 2023
VISIE Inc. · Austin, TX

I joined during the early integration phase of a surgical navigation platform combining imaging and robotic actuation. My work centered on implementing TCP/UDP communication protocols for robotic arm motion control, supporting end-to-end product packaging and deployment, and helping stage a live demonstration that anchored the company's successful Series A round.

Computational Lead · Aug 2022 – Mar 2023
Sophelio · Austin, TX

I adapted physics-informed modeling originally developed for fusion experiment data to financial time series. The resulting production pipeline used sparse regression for PDE construction and signal generation, paired with a CAGR-maximizing Bayesian TPE optimizer driving a swing-trading system deployed on AWS Lambda.

Technical Skills

ML & Optimization

Model development, feature engineering, and optimization across classical ML, probabilistic models, time-series foundation models, and evolutionary search.

PyTorch · TabPFN · Chronos · PEFT/LoRA · LightGBM · HMM · SHAP · Pymoo · SciPy · autograd

Cloud & MLOps

Serverless-first production systems with CI/CD, infrastructure-as-code, containers, and model deployment workflows.

Lambda · DynamoDB · S3 · ECR · EventBridge · CloudFormation · Docker · GitHub CI · Modal

Programming

Python-first engineering with scientific computing, fast data processing, and frontend work for internal analytics tools.

Python · Pandas · NumPy · PyArrow · scikit-learn · JavaScript · React/JSX · HTML

APIs & Dashboards

Production APIs and app backends connecting market data, broker integrations, feeds, and interactive analytics interfaces.

Alpaca · OANDA · Schwab · yfinance · FastAPI · Mangum · Recharts · Tailwind CDN · RSS/XML

LLM & RAG

LLM tooling and hybrid retrieval systems combining lexical and vector search for production RAG.

Groq · LangChain · LangGraph · FAISS · Whoosh · SQLite FTS5 · PostgreSQL tsvector

Education
Doctor of Philosophy in Engineering Mechanics
University of Texas at Austin · Austin, TX

Advanced training in numerical simulation, scientific computing, and mathematical modeling that continues to inform my work in ML and quantitative systems.

Life

Beyond the code

Beer Montage · Road Trip Adventures