Counterfactual Club — Chinmay Gondhalekar

/ 01

Selected Work

Six projects — each one asking the same question from a different angle.

Featured · Production ML · Pricing

Pricing Intelligence for Ratings Surveillance

$20M+

cumulative revenue impact across global ratings products

Built simulation-driven pricing models on S&P Global Ratings' proprietary data: mapping the factors that actually drive willingness-to-pay across product, region, and currency, then identifying the best pricing strategy for each product line.

Problem

Pricing decisions for long-term surveillance, short-term surveillance, and issuer credit ratings (ICRs) were made on intuition and stale benchmarks, across multiple regions and currencies.

Approach

Simulation models on proprietary data; willingness-to-pay analysis by product, region, currency; scenario testing per line.

Impact

Eight-figure cumulative revenue impact. Framework adopted across commercial teams.

Counterfactual

What would revenue have been under the price we didn't pick — and how do we make that question answerable instead of unknowable?
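One way to make that question answerable, sketched under loud assumptions: simulate willingness-to-pay, then score the chosen price and every alternative against the same simulated clients. The log-normal distribution, the price grid, and the function names here are hypothetical illustrations, not the production models.

```python
import random

def simulate_revenue(price, wtp_samples):
    """Revenue if every client whose willingness-to-pay is at least `price` buys."""
    buyers = sum(1 for w in wtp_samples if w >= price)
    return price * buyers

def best_price(candidate_prices, wtp_samples):
    """Return (price, revenue) for the candidate with the highest simulated revenue."""
    scored = ((p, simulate_revenue(p, wtp_samples)) for p in candidate_prices)
    return max(scored, key=lambda pair: pair[1])

# Hypothetical willingness-to-pay draws; the log-normal shape is an assumption.
rng = random.Random(0)
wtp = [rng.lognormvariate(4.6, 0.4) for _ in range(10_000)]

chosen = 90.0
alternatives = [70.0, 80.0, 90.0, 100.0, 110.0, 120.0]

# Counterfactual gap: simulated revenue at each alternative minus the chosen price.
revenue_chosen = simulate_revenue(chosen, wtp)
counterfactuals = {p: simulate_revenue(p, wtp) - revenue_chosen for p in alternatives}
```

Because every price is evaluated on the same simulated population, the gaps in `counterfactuals` compare like with like, which is what makes the price that was not picked something you can reason about.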

Production ML × Research

News Intelligence — CANAL & FANAL in production

10k+ articles/day · 200+ analysts

Layered pipeline classifying global cyber and financial news daily. CANAL (IEEE 2024) and FANAL — with my ORBERT-ORPO fine-tuning innovation — power the production system.

Automation · Vision-LLM

Document Extraction with Fine-Tuned Vision-LLMs

97% accuracy · Qwen-VL · LoRA

GPU-accelerated extraction pipeline on Qwen-VL. Compared LoRA, full fine-tuning, and instruction tuning; LoRA won. PDFs to structured fields in seconds.

Automation · Agents

Agentic Automations at S&P Ratings

200+ daily users

A portfolio of specialized production agents — engagement-letter legal automation, market-outreach briefs, sales workflows, and others — in daily use across teams.

Production ML

Subscription Churn Prediction

30 years history · 3–6 mo horizon

Forward-looking churn model trained on three decades of client data. Fed directly into retention campaigns built around the actual drivers of churn — not just demographic correlates.

Knowledge Graphs · Research

GNN for Bond-Issuance Prediction

Internal research · ongoing

Graph neural network in PyTorch Geometric — GraphSAGE with BPR loss and negative sampling — exploring whether graph structure unlocks signal that flat-feature ML misses.
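For readers new to the loss named above, here is a minimal sketch of BPR (Bayesian Personalized Ranking) with uniform negative sampling, in plain Python rather than PyTorch Geometric. The node names, the 2-d embeddings, and the helper functions are hypothetical stand-ins for what a GraphSAGE encoder would produce; this is not the research code.

```python
import math
import random

def bpr_loss(pos_score, neg_score):
    """-log(sigmoid(pos - neg)), computed as softplus(neg - pos) for stability."""
    x = neg_score - pos_score
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def dot(u, v):
    """Score a (node, item) pair as the dot product of their embeddings."""
    return sum(a * b for a, b in zip(u, v))

def sample_negative(node, all_items, observed, rng):
    """Uniformly draw an item the node has no observed edge to."""
    candidates = [i for i in all_items if i not in observed[node]]
    return rng.choice(candidates)

# Hypothetical embeddings, as if produced by a graph encoder.
emb = {
    "issuer_a": [1.0, 0.5],
    "bond_x": [0.9, 0.4],
    "bond_y": [-0.8, 0.1],
    "bond_z": [-0.2, -0.9],
}
observed = {"issuer_a": {"bond_x"}}  # known positive edges
items = ["bond_x", "bond_y", "bond_z"]

rng = random.Random(0)
neg = sample_negative("issuer_a", items, observed, rng)
loss = bpr_loss(dot(emb["issuer_a"], emb["bond_x"]),
                dot(emb["issuer_a"], emb[neg]))
```

The loss shrinks as the observed edge is scored further above the sampled non-edge, so training pushes the model to rank real links over random ones rather than to predict absolute labels.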

/ 02

Research

Published work and drafts in progress — each one a walkthrough of how the production work and the research feed back into each other.

6+2

6 published papers, including two first-author works on multimodal RAG and structured spreadsheet retrieval. 2 drafts on incremental knowledge-graph maintenance.

Google Scholar ↗

2025

MultiFinRAG — Optimized Multimodal RAG for Financial QA

Multimodal retrieval across 10-Ks, 10-Qs, investor decks; 19 percentage points higher accuracy than free-tier GPT-4o.

first author

2025

SQuARE — Structured Query & Adaptive Retrieval for Tabular Formats

Hybrid retrieval for messy real-world spreadsheets — multi-row headers, merged cells, units — with complexity-aware routing.

first author

2024

FANAL — Financial Activity News Alerting Language Modeling Framework

12-category financial news classification framework outperforming GPT-4o, Llama-3.1 8B, Phi-3.

led the ORBERT-ORPO fine-tuning innovation

2025

AVATAAR — Agentic Video QA via Temporal Adaptive Alignment

Long-form video question answering with a Rethink Module; +5.6% on CinePile temporal reasoning.

co-author

IEEE

CANAL — Cyber Activity News Alerting Language Model

IEEE ICAIC 2024 — fine-tuned BERT with silver labeling via Random Forest for cyber event detection.

co-author

2026

Provenance Verification of AI-Generated Images on Blockchain

Blockchain-anchored perceptual hash registry for AI-image attribution.
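A rough sketch of the registry idea, assuming an average-hash (aHash) fingerprint and a plain dictionary standing in for the on-chain store. The paper may use a different perceptual hash and a real smart contract; every name below is hypothetical.

```python
def average_hash(pixels):
    """aHash over a grayscale grid: one bit per pixel, 1 if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

def register(registry, pixels, creator):
    """Record the image fingerprint; a dict stands in for the blockchain here."""
    registry[average_hash(pixels)] = creator

def lookup(registry, pixels, max_distance=1):
    """Return the creator of the nearest registered hash within max_distance bits."""
    h = average_hash(pixels)
    best = min(registry, key=lambda k: hamming(k, h), default=None)
    if best is not None and hamming(best, h) <= max_distance:
        return registry[best]
    return None

registry = {}
original = [[10, 200], [220, 30]]
register(registry, original, "model-A")

slightly_edited = [[12, 205], [215, 35]]   # small brightness changes
unrelated = [[200, 10], [30, 220]]         # inverted layout

match = lookup(registry, slightly_edited)
no_match = lookup(registry, unrelated)
```

The point of a perceptual hash over a cryptographic one is visible here: small pixel changes leave the fingerprint intact, so attribution survives light edits, while a structurally different image falls outside the distance threshold.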

co-author

DRAFT

FinKG-Update

An incremental knowledge-graph maintenance pipeline for financial data.

lead author

DRAFT

FinGraph-Adapt

Temporal GraphRAG with ontology adaptation against FIBO.

lead author

/ 03

About

hello

Chinmay Vivek Gondhalekar

Senior Data Scientist · S&P Global Ratings · Jersey City, NJ

I build ML systems that turn messy financial data into decisions — pricing intelligence, knowledge graphs, agentic platforms, and the research that feeds back into them.

Over the last five years I've shipped pricing models that drove eight-figure revenue impact, a document-extraction pipeline running at 97% field accuracy, multi-agent automations now used by 200+ analysts daily, and a news-classification system that processes cyber and financial events end-to-end every day. My research — six peer-reviewed papers, including first-author work on multimodal RAG and structured spreadsheet retrieval — sits one step ahead of the production work.

The question I keep coming back to is the one this site is named after: what would have happened otherwise? Most of what's interesting in ML lives in that gap — between the decision you made and the one you didn't.

→ email → linkedin → google scholar

/ 05

Resources

Things I'm building to give back — reading lists, paper walkthroughs, and interactive explainers on the topics I work on most.

// in progress

Temporal Knowledge-Graph Updates

An interactive explainer on incremental KG maintenance — what changes, what stays, and how to know.

// in progress

Pricing Methodology

How simulation-driven pricing actually works in production — willingness-to-pay, scenario analysis, and the counterfactual at the heart of it.

// in progress

Temporal Churn Modeling

Building churn models that look back to look forward — what 30 years of history teaches a 3-month prediction.

Notes

First posts coming soon
