LLM-Powered News Sentiment for Indian Stocks: A Realistic Pipeline

News sentiment is the most over-promised and under-delivered application of AI to Indian markets. Everyone wants to “trade the headlines”; almost nobody describes the real pipeline honestly.

This post is the honest pipeline — what works, what doesn’t, and where LLMs help vs hurt.

The honest framing

LLMs are narrative-pattern recognizers. They’re useful for:

Classifying news as positive / neutral / negative for a specific stock.
Distinguishing material from non-material news.
Aggregating multiple noisy sources into a coherent score.

They are bad at:

Predicting price impact.
Reading numbers (use a real parser, not the LLM).
Real-time speed (your trader-bot competitor is millisecond-fast; your LLM is seconds-slow).

See “AI signal accuracy: reality check” for why “AI sentiment beats market” claims should be treated with suspicion.

The four-stage pipeline

[1. Ingest] → [2. Filter] → [3. Classify (LLM)] → [4. Aggregate + decide]

Skip any stage and the system degrades. Stage 2 is the most undervalued.

1. Ingest

Sources, in priority order:

BSE / NSE corporate announcements (the actual signal).
Reuters, Bloomberg India tickers.
Moneycontrol, ET Markets RSS feeds.
Twitter / X (very noisy; deprioritize).

Pull every 60–120 seconds. Store raw text + URL + timestamp + source.
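A minimal sketch of the storage side, assuming SQLite and a single table. The table name `raw_news`, the column names, and the helpers `open_store` / `store_item` are illustrative choices, not a fixed design. Keying each row on a hash of source + URL means that re-polling the same feed every 60–120 seconds does not create duplicates.

```python
import hashlib
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per raw item, keyed on a hash of source + URL.
SCHEMA = """
CREATE TABLE IF NOT EXISTS raw_news (
    id TEXT PRIMARY KEY,
    source TEXT NOT NULL,
    url TEXT NOT NULL,
    fetched_at TEXT NOT NULL,
    raw_text TEXT NOT NULL
)
"""


def open_store(path=":memory:"):
    """Open (or create) the store; the default is an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def store_item(conn, source, url, raw_text, fetched_at=None):
    """Insert one raw item. Returns True if it was new, False if already stored."""
    fetched_at = fetched_at or datetime.now(timezone.utc)
    item_id = hashlib.sha256(f"{source}|{url}".encode("utf-8")).hexdigest()
    cur = conn.execute(
        "INSERT OR IGNORE INTO raw_news (id, source, url, fetched_at, raw_text) "
        "VALUES (?, ?, ?, ?, ?)",
        (item_id, source, url, fetched_at.isoformat(), raw_text),
    )
    conn.commit()
    # rowcount is 1 for a fresh insert and 0 when the row was ignored.
    return cur.rowcount == 1


conn = open_store()
first = store_item(conn, "BSE", "https://example.com/filing-1", "Board approves buyback.")
again = store_item(conn, "BSE", "https://example.com/filing-1", "Board approves buyback.")
print(first, again)
```

The fetch step itself (RSS or exchange API) is left out on purpose; it depends on which feeds you have access to.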

2. Filter (do NOT skip this)

Roughly 90% of news is non-material. Before sending anything to an LLM, filter:

Drop articles that don’t mention the symbol or its sector.
Drop press-release rehashes (similarity dedup).
Drop opinion / commentary articles (use a classifier or a “is this analysis or news?” prompt).
Drop earnings previews (the actual result is the signal).

You’ll lose 80–95% of the volume. Good. That’s the point.
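The first two filters (symbol mention and similarity dedup) can be sketched with the standard library alone. Everything named here is an assumption for illustration: the `ALIASES` table is a toy stand-in for a real NSE master list, and the 0.9 similarity cutoff is a starting guess, not a tuned value. `difflib.SequenceMatcher` is slow on large volumes, so treat it as a placeholder for a proper near-duplicate method.

```python
import re
from difflib import SequenceMatcher

# Hypothetical alias table; a real one would be built from your NSE master list.
ALIASES = {
    "RELIANCE": ["reliance industries", "ril"],
    "TCS": ["tata consultancy services", "tcs"],
}


def match_symbols(text, aliases=ALIASES):
    """Return the set of symbols whose aliases appear as whole words in text."""
    lowered = text.lower()
    found = set()
    for symbol, names in aliases.items():
        for name in names:
            if re.search(r"\b" + re.escape(name) + r"\b", lowered):
                found.add(symbol)
                break
    return found


def is_near_duplicate(text, seen_texts, threshold=0.9):
    """True if text is at least `threshold` similar to any earlier article."""
    return any(
        SequenceMatcher(None, text, prev).ratio() >= threshold
        for prev in seen_texts
    )


def prefilter(articles):
    """Keep articles that mention a known symbol and are not near-duplicates."""
    kept = []
    for text in articles:
        if not match_symbols(text):
            continue  # no known symbol mentioned
        if is_near_duplicate(text, kept):
            continue  # press-release rehash of something already kept
        kept.append(text)
    return kept


articles = [
    "Reliance Industries announces buyback of shares worth Rs 10,000 crore.",
    "Reliance Industries announces buyback of shares worth Rs 10,000 crore!",
    "Monsoon arrives early in Kerala this year.",
    "TCS wins a large multi-year deal from a European bank.",
]
for text in prefilter(articles):
    print(text)
```

The opinion and earnings-preview filters are not shown; those need either a small classifier or a cheap LLM prompt, as noted in the list above.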

3. Classify (LLM)

The prompt:

You are a financial news classifier for Indian listed equities.
For the article below, output a JSON object with:
  symbol: NSE ticker or null
  materiality: "high" | "medium" | "low" | "non_material"
  sentiment: "positive" | "neutral" | "negative"
  rationale: one sentence
  confidence: 0.0 – 1.0
Do not infer information not in the article. If uncertain, set symbol to null and report a low confidence.

Article: {{text}}

Hard rules:

Force JSON output (or use structured outputs / function-calling).
Require confidence. Filter out anything < 0.65.
Validate symbol against your NSE master list. If unknown, drop.
Log everything including the raw response — you’ll need it for debugging hallucinations.
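The hard rules above map onto a small validation function. This is a sketch under stated assumptions: `validate_response` and `known_symbols` are hypothetical names, the field names come from the prompt above, and the raw-response logging is assumed to happen wherever you call the model, before this function runs.

```python
import json

MATERIALITY = ("high", "medium", "low", "non_material")
SENTIMENT = ("positive", "neutral", "negative")
MIN_CONFIDENCE = 0.65  # threshold from the hard rules above


def validate_response(raw, known_symbols, min_confidence=MIN_CONFIDENCE):
    """Return a cleaned dict if the response passes every check, else None."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None  # not parseable JSON: reject
    if not isinstance(data, dict):
        return None

    symbol = data.get("symbol")
    if not isinstance(symbol, str) or symbol not in known_symbols:
        return None  # null or unknown ticker: drop

    if data.get("materiality") not in MATERIALITY:
        return None
    if data.get("sentiment") not in SENTIMENT:
        return None

    confidence = data.get("confidence")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return None
    if not 0.0 <= confidence <= 1.0 or confidence < min_confidence:
        return None

    rationale = data.get("rationale")
    if not isinstance(rationale, str):
        return None

    return {
        "symbol": symbol,
        "materiality": data["materiality"],
        "sentiment": data["sentiment"],
        "rationale": rationale,
        "confidence": float(confidence),
    }


raw = (
    '{"symbol": "TCS", "materiality": "high", "sentiment": "positive", '
    '"rationale": "Large deal win.", "confidence": 0.8}'
)
print(validate_response(raw, known_symbols={"TCS", "INFY"}))
```

Returning `None` for every failure keeps the caller simple; if you want to know why a response was rejected, return a reason string alongside it instead.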

4. Aggregate + decide

Don’t trade off a single article. Aggregate over a rolling window (e.g., last 60 minutes for the same symbol):

score = Σ (sentiment_sign × materiality_weight × confidence) over articles

Materiality weights: high = 1.0, medium = 0.4, low = 0.1.

Then:

|score| > threshold AND consistent across ≥ 2 sources → alert.
Combine with technical confirmation (VWAP, RSI). Never trade a sentiment signal alone.
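A sketch of the rolling-window score and the alert gate, using the materiality weights given above. The item keys (`symbol`, `source`, `timestamp`, and so on), the zero weight for `non_material`, and the threshold of 1.0 are assumptions made for the example; the technical confirmation step (VWAP, RSI) is not shown.

```python
from datetime import datetime, timedelta

SENTIMENT_SIGN = {"positive": 1, "neutral": 0, "negative": -1}
# high / medium / low come from the post; non_material = 0.0 is an assumption.
MATERIALITY_WEIGHT = {"high": 1.0, "medium": 0.4, "low": 0.1, "non_material": 0.0}


def aggregate(items, symbol, now, window=timedelta(minutes=60)):
    """Return (score, n_sources) for one symbol over the rolling window.

    n_sources counts distinct sources that agree with the sign of the score.
    """
    score = 0.0
    sources_by_sign = {1: set(), -1: set()}
    for item in items:
        if item["symbol"] != symbol:
            continue
        if not (now - window <= item["timestamp"] <= now):
            continue
        sign = SENTIMENT_SIGN[item["sentiment"]]
        weight = MATERIALITY_WEIGHT[item["materiality"]]
        score += sign * weight * item["confidence"]
        if sign != 0 and weight > 0:
            sources_by_sign[sign].add(item["source"])
    if score == 0:
        return score, 0
    direction = 1 if score > 0 else -1
    return score, len(sources_by_sign[direction])


def should_alert(score, n_sources, threshold=1.0, min_sources=2):
    """|score| above threshold AND at least min_sources agreeing sources."""
    return abs(score) > threshold and n_sources >= min_sources


now = datetime(2024, 1, 1, 12, 0)
items = [
    {"symbol": "TCS", "source": "BSE", "timestamp": datetime(2024, 1, 1, 11, 30),
     "sentiment": "positive", "materiality": "high", "confidence": 0.9},
    {"symbol": "TCS", "source": "ET Markets", "timestamp": datetime(2024, 1, 1, 11, 50),
     "sentiment": "positive", "materiality": "medium", "confidence": 0.8},
    # Outside the 60-minute window, so it is ignored.
    {"symbol": "TCS", "source": "X", "timestamp": datetime(2024, 1, 1, 10, 0),
     "sentiment": "negative", "materiality": "high", "confidence": 0.9},
]
score, n_sources = aggregate(items, "TCS", now)
print(round(score, 2), n_sources, should_alert(score, n_sources))
```

Counting only the sources that agree with the net direction is one way to read "consistent across ≥ 2 sources"; a stricter reading would also require that no source disagrees.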

Hallucination controls

LLMs hallucinate. In a trading pipeline, hallucinations cost money. Mandatory controls:

Schema validation. Reject anything that doesn’t parse.
Symbol whitelist. Reject unrecognized tickers.
Cross-source corroboration. A “positive” signal from a single source is suspicious; the same signal from three independent sources is far more credible.
Sanity check on extreme claims. “Company X to be acquired” needs a corroborating exchange filing.
Weekly eval set. Hand-label 50 articles per week and track LLM accuracy against those labels. Re-evaluate the prompt and model when accuracy drops below your chosen threshold.
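The eval loop needs very little code. A minimal sketch, assuming hand labels and LLM outputs are both stored as `{article_id: sentiment}` dictionaries; the function names and the 0.8 accuracy floor are hypothetical, so pick a floor that matches your own tolerance.

```python
def accuracy(labels, predictions):
    """Fraction of hand-labeled articles where the LLM sentiment matches the label.

    Articles missing from predictions (for example, rejected by validation)
    count as misses.
    """
    if not labels:
        raise ValueError("empty eval set")
    hits = sum(
        1 for article_id, label in labels.items()
        if predictions.get(article_id) == label
    )
    return hits / len(labels)


def needs_review(labels, predictions, min_accuracy=0.8):
    """True when accuracy has fallen below the chosen floor."""
    return accuracy(labels, predictions) < min_accuracy


labels = {"a1": "positive", "a2": "negative", "a3": "neutral", "a4": "positive"}
predictions = {"a1": "positive", "a2": "negative", "a3": "positive"}
print(accuracy(labels, predictions), needs_review(labels, predictions))
```

Tracking this number week over week matters more than its absolute value: a sudden drop usually points to a model or prompt change, or to a new kind of article reaching the classifier.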

The architecture

[News sources (RSS / API)]
        ↓
[Ingestor service] — dedup, normalize
        ↓
[Pre-filter] — symbol matching, materiality heuristic
        ↓
[LLM classifier] — structured JSON, retry on parse fail
        ↓
[Aggregator] — rolling window per symbol
        ↓
[Signal store + alert]
        ↓
[Trader (human or automated)] with technical confirmation

Implementation stack used by most retail builders: Python + FastAPI + SQLite or Postgres + OpenAI/Anthropic API + a cron or background worker. See building a personal AI stock screener with Python for the broader build.

Cost realism

LLM API costs:

Per article: ~₹0.2–₹1 depending on length and model.
~500–2,000 material articles / day across Nifty 500 → ₹100–₹2,000 / day.
Cheaper models (GPT-4o-mini, Claude Haiku, Gemini Flash) for classification; reserve heavy models for summarization of aggregated context.

Aggressive cost cuts:

Pre-filter hard (kills 90% of cost).
Use a small open-source model (Llama 3.x, Mistral) self-hosted for the classification step; reserve hosted models for edge cases.
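The arithmetic behind these numbers is worth keeping in a script so you can rerun it when prices or volumes change. This sketch assumes the 500–2,000 articles per day figure is the post-filter volume and that the pre-filter drops 90% of raw articles; the function names are illustrative.

```python
def daily_cost_inr(articles_per_day, cost_per_article_inr):
    """Daily LLM spend in rupees for a given volume and per-article cost."""
    return articles_per_day * cost_per_article_inr


def unfiltered_volume(material_per_day, drop_rate):
    """Raw article volume implied by a post-filter volume and a filter drop rate."""
    return material_per_day / (1.0 - drop_rate)


# Post-filter range from the figures above: 500 articles at INR 0.2 each,
# up to 2,000 articles at INR 1 each.
low = daily_cost_inr(500, 0.2)
high = daily_cost_inr(2000, 1.0)
print(f"With pre-filter: INR {low:,.0f} to INR {high:,.0f} per day")

# What the same day would cost if every raw article went to the LLM,
# assuming the pre-filter drops 90% of volume.
raw_low = unfiltered_volume(500, 0.9)
raw_high = unfiltered_volume(2000, 0.9)
print(
    f"Without pre-filter: INR {daily_cost_inr(raw_low, 0.2):,.0f} "
    f"to INR {daily_cost_inr(raw_high, 1.0):,.0f} per day"
)
```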

Where LLM sentiment actually edges out

Honestly:

Earnings calls — LLM-summarized management tone is a genuine signal, particularly the change in tone QoQ.
Regulatory filings — extracting material disclosures from 80-page filings.
Sector-wide news — “Banking sector RBI announcement, who’s most affected?” type queries.

Where it doesn’t:

High-frequency news trading (latency).
“Will it go up tomorrow?” type predictions.
Trading on Twitter sentiment as a primary signal.

Compliance reminder

This pipeline is for personal trading or research use. Distributing signals or running an advisory off it requires SEBI Research Analyst or Investment Adviser registration. See SEBI’s view on AI in trading advisory.

A clean MVP

If you want to ship one in a weekend:

RSS pull from Moneycontrol + ET Markets every 2 minutes.
Symbol matcher against NSE 500 list.
LLM call (Gemini Flash or GPT-4o-mini) with the JSON prompt above.
SQLite store.
Telegram alert when |score| > threshold and RSI is in range.

That’s ~300 lines of Python. The real work is the eval loop that makes it usable.

FAQs

Does this work on Bank Nifty index trading? Marginally. Index moves are driven by macro flows; news sentiment is too slow.

Open-source models or hosted APIs? Hosted for the first version (faster to ship). Self-hosted Llama 3.x for cost optimization at scale.

What’s the realistic edge? Modest — maybe 5–10% improvement in entry timing on news-driven trades. Anyone selling more is selling a story.

For the broader AI-in-trading view, see AI stock analysis in India: an overview.