Senior Data Scientist · S&P Global Ratings

CHINMAY GONDHALEKAR

What if the model is wrong?

6 published papers · 3 patents · 5 yrs building ML for financial intelligence

I build ML systems that turn noisy data into decisions — and I obsess over the question underneath all of it: what would have happened otherwise?

See the work · Read the research
Production ML
Research
Knowledge Graphs
Automation / Agents
/ now

Currently

A live snapshot — updated every few weeks. Proof I'm a person, not a PDF.

Now reading
The Immortals of Meluha
Amish Tripathi — the Shiva trilogy, mythology rebuilt as story
Now building
Production-ready knowledge-graph pipelines
the kind that survive contact with real data
Now learning
Guitar
slowly, badly, happily
Thinking about
How AI reshapes work in the next five years
jobs, judgment, what stays human
/ 01

Selected Work

Six projects — each one asking the same question from a different angle.

Featured · Production ML · Pricing

Pricing Intelligence for Ratings Surveillance

$20M+
cumulative revenue impact across global ratings products

Built simulation-driven pricing models on S&P Global Ratings' proprietary data — mapping the factors that actually drive willingness-to-pay across product, region, and currency, then identifying the best strategy per line.

Problem
Pricing decisions for long-term surveillance, short-term surveillance, and ICRs were made on intuition and stale benchmarks, across multiple regions and currencies.
Approach
Simulation models on proprietary data; willingness-to-pay analysis by product, region, currency; scenario testing per line.
Impact
Eight-figure cumulative revenue impact. Framework adopted across commercial teams.
Counterfactual
What would revenue have been under the price we didn't pick — and how do we make that question answerable instead of unknowable?
Production ML × Research

News Intelligence — CANAL & FANAL in production

10k+ articles/day · 200+ analysts

Layered pipeline classifying global cyber and financial news daily. CANAL (IEEE 2024) and FANAL — with my ORBERT-ORPO fine-tuning innovation — power the production system.

Automation · Vision-LLM

Document Extraction with Fine-Tuned Vision-LLMs

97% accuracy · Qwen-VL · LoRA

GPU-accelerated extraction pipeline on Qwen-VL. Compared LoRA, full fine-tuning, and instruction tuning; LoRA won. PDFs to structured fields in seconds.

Automation · Agents

Agentic Automations at S&P Ratings

200+ daily users

A portfolio of specialized production agents — engagement-letter legal automation, market-outreach briefs, sales workflows, and others — in daily use across teams.

Production ML

Subscription Churn Prediction

30 years of history · 3–6 month horizon

Forward-looking churn model trained on three decades of client data. Fed directly into retention campaigns built around the actual drivers of churn — not just demographic correlates.

Knowledge Graphs · Research

GNN for Bond-Issuance Prediction

Internal research · ongoing

Graph neural network in PyTorch Geometric — GraphSAGE with BPR loss and negative sampling — exploring whether graph structure unlocks signal that flat-feature ML misses.

/ 02

Research

Published work and drafts in progress — each one a walkthrough of how the production work and the research feed back into each other.

6+2

6 published papers, including two first-author works on multimodal RAG and structured spreadsheet retrieval. 2 drafts on incremental knowledge-graph maintenance.

Google Scholar
2025
MultiFinRAG — Optimized Multimodal RAG for Financial QA
Multimodal retrieval across 10-Ks, 10-Qs, and investor decks; 19 points higher accuracy than free-tier GPT-4o.
first author
2025
SQuARE — Structured Query & Adaptive Retrieval for Tabular Formats
Hybrid retrieval for messy real-world spreadsheets — multi-row headers, merged cells, units — with complexity-aware routing.
first author
2024
FANAL — Financial Activity News Alerting Language Modeling Framework
12-category financial news classification framework outperforming GPT-4o, Llama-3.1 8B, Phi-3.
led the ORBERT-ORPO fine-tuning innovation
2025
AVATAAR — Agentic Video QA via Temporal Adaptive Alignment
Long-form video question answering with a Rethink Module; +5.6% on CinePile temporal reasoning.
co-author
IEEE
CANAL — Cyber Activity News Alerting Language Model
IEEE ICAIC 2024 — fine-tuned BERT with silver labeling via Random Forest for cyber event detection.
co-author
2026
Provenance Verification of AI-Generated Images on Blockchain
Blockchain-anchored perceptual hash registry for AI-image attribution.
co-author
DRAFT
FinKG-Update
An incremental knowledge-graph maintenance pipeline for financial data.
lead author
DRAFT
FinGraph-Adapt
Temporal GraphRAG with ontology adaptation against FIBO.
lead author
/ 03

About

Chinmay Vivek Gondhalekar

Senior Data Scientist · S&P Global Ratings · Jersey City, NJ

I build ML systems that turn messy financial data into decisions — pricing intelligence, knowledge graphs, agentic platforms, and the research that feeds back into them.

Over the last five years I've shipped pricing models that drove eight-figure revenue impact, a document-extraction pipeline running at 97% field accuracy, multi-agent automations now used by 200+ analysts daily, and a news-classification system that processes cyber and financial events end-to-end every day. My research — six peer-reviewed papers, including first-author work on multimodal RAG and structured spreadsheet retrieval — sits one step ahead of the production work.

The question I keep coming back to is the one this site is named after: what would have happened otherwise? Most of what's interesting in ML lives in that gap — between the decision you made and the one you didn't.

→ email · → linkedin · → google scholar
/ 04

Notes

Working notes, paper takes, and counterfactual reflections on what's happening in ML.

First posts coming soon

The kind of writing that survives the LLM era is the kind only a real practitioner can write — opinionated takes on new research, lessons from production systems, and "what if this paper is wrong?" reflections. Subscribe below to catch the first one.

/ 05

Resources

Things I'm building to give back — reading lists, paper walkthroughs, and interactive explainers on the topics I work on most.

// in progress
Temporal Knowledge-Graph Updates

An interactive explainer on incremental KG maintenance — what changes, what stays, and how to know.

// in progress
Pricing Methodology

How simulation-driven pricing actually works in production — willingness-to-pay, scenario analysis, and the counterfactual at the heart of it.

// in progress
Temporal Churn Modeling

Building churn models that look back to look forward — what 30 years of history teaches a 3-month prediction.