GitHub repository here

Introduction

This project combines an investment portfolio dashboard with a conversational AI chatbot that can answer questions about your holdings, fetch live market data, and retrieve information from policy documents — all through natural language. The AI layer is built on a multi-agent architecture using LangGraph, with Claude Sonnet as the reasoning engine and ChromaDB powering the retrieval-augmented generation (RAG) system.

System Architecture

The application is split into three layers that communicate over HTTP:

  • Frontend: React 19 + Vite with Recharts for interactive portfolio visualizations
  • Backend: .NET 8 Web API serving portfolio data from SQLite
  • AI Agent: Python FastAPI server (port 8000) running the LangGraph multi-agent system

The dashboard displays sector allocation, market cap breakdown, top 5 performers, and a portfolio summary header. An anchored chat button opens the AI assistant, which communicates with the FastAPI endpoint.

Multi-Agent Design (Supervisor Pattern)

The chatbot uses a supervisor pattern, where an orchestrating LLM routes each user query to the most appropriate specialized agent. The supervisor uses Claude with a structured-output Pydantic model (Route) that returns three fields: the target agent, the relevant RAG collection, and the routing reasoning. Forcing explicit, traceable routing decisions in this way reduces the risk of hallucinated or arbitrary routes.
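The routing contract can be sketched as follows. The three Route fields match those described above; the dataclass form, the agent/collection names, and the keyword heuristic (which stands in for the actual Claude structured-output call) are all illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class Route:
    agent: str       # e.g. "portfolio" | "market_data" | "rag" (names assumed)
    collection: str  # RAG collection to search, or "" when not applicable
    reasoning: str   # the supervisor's stated justification for the choice

def route_query(query: str) -> Route:
    """Stand-in for the LLM supervisor: a keyword heuristic for illustration.
    The real system asks Claude to emit a Route via structured output."""
    q = query.lower()
    if any(w in q for w in ("policy", "news", "article")):
        coll = "policy_collection" if "policy" in q else "news"
        return Route("rag", coll, "document question -> RAG agent")
    if any(w in q for w in ("price", "quote", "company")):
        return Route("market_data", "", "live market question")
    return Route("portfolio", "", "default: question about holdings")
```

Because the routing decision is a typed object rather than free text, it can be logged and audited per request.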

Three agents handle different query types:

Portfolio Agent

Reads transactions from SQLite and computes live portfolio metrics. Holdings are calculated using the average-cost method, with closed positions filtered out. Live prices are fetched in parallel using ThreadPoolExecutor to avoid sequential API bottlenecks. The agent calculates market value, unrealized P&L, and P&L percentage for each position, then passes the formatted table to Claude for a natural language response.
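The two computations described above can be sketched in a few lines: average-cost holdings with closed positions dropped, and parallel price fetching. The transaction tuple shape and helper names are assumptions, not the project's actual schema:

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def compute_holdings(transactions):
    """Average-cost basis per ticker from (ticker, side, qty, price) rows;
    positions whose quantity has gone to zero are filtered out."""
    pos = defaultdict(lambda: {"qty": 0.0, "cost": 0.0})
    for ticker, side, qty, price in transactions:
        p = pos[ticker]
        if side == "BUY":
            p["qty"] += qty
            p["cost"] += qty * price
        else:  # a SELL reduces quantity at the running average cost
            avg = p["cost"] / p["qty"] if p["qty"] else 0.0
            p["qty"] -= qty
            p["cost"] -= qty * avg
    return {t: {"qty": p["qty"], "avg_cost": p["cost"] / p["qty"]}
            for t, p in pos.items() if p["qty"] > 1e-9}

def fetch_prices(tickers, fetch_one, workers=8):
    """Parallel price lookups; fetch_one would wrap the yfinance call."""
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return dict(zip(tickers, ex.map(fetch_one, tickers)))
```

With the average-cost method, a sell never changes the per-share cost basis of the remaining shares, which is why the SELL branch subtracts quantity times the running average.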

Market Data Agent

Handles queries about current prices, company details, and recent news. Rather than naively fetching for any ticker mentioned, it first extracts stock symbols from the full conversation history to resolve contextual references like "my top stock" to an actual ticker. Data is pulled from yfinance and Finnhub, with fallback logic between the two sources.
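The context-resolution step might look like the sketch below: scan the whole conversation for symbols the portfolio actually knows about, so a follow-up like "how is it doing now?" still maps to a concrete ticker. The ticker set and regex are illustrative, not the project's actual values:

```python
import re

# Illustrative subset; the real agent would use the portfolio's ticker list.
KNOWN_TICKERS = {"AAPL", "MSFT", "NVDA", "TSLA", "JPM"}

def extract_tickers(history):
    """Collect known symbols from the full conversation (oldest message
    first), preserving first-seen order and skipping duplicates."""
    found = []
    for msg in history:
        for token in re.findall(r"\b[A-Z]{1,5}\b", msg):
            if token in KNOWN_TICKERS and token not in found:
                found.append(token)
    return found
```

Validating candidates against a known-ticker set is what keeps stray capitalized words (or hallucinated symbols) from triggering market-data API calls.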

RAG Agent

Answers questions about investment policy and financial news by performing semantic search against ChromaDB. The agent embeds the user query using all-MiniLM-L6-v2 (SentenceTransformers), retrieves the top 5 most relevant document chunks, and grounds Claude's response in the retrieved evidence. The system prompt enforces citation to prevent fabrication.
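Under the hood, retrieval reduces to similarity ranking between the query embedding and chunk embeddings. ChromaDB handles this at scale; the core operation can be sketched in plain Python with toy two-dimensional vectors standing in for all-MiniLM-L6-v2's real embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, docs, k=5):
    """docs: list of (chunk_text, embedding) pairs. Returns the k chunks
    most similar to the query -- the role ChromaDB plays in the real agent."""
    return sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)[:k]
```

The retrieved chunk texts are then placed in the prompt so Claude's answer is grounded in (and can cite) the actual documents.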

RAG Pipeline

Two separate ChromaDB collections are maintained:

  • policy_collection: Ingested from a PDF of investment policy documents. Text is chunked at 1,200 characters with 200-character overlap to preserve context at boundaries. Ingestion is idempotent — re-running does not create duplicate entries.
  • news collection: Populated at startup by fetching recent articles for every portfolio ticker via Finnhub (with yfinance as fallback), limited to 20 articles per ticker. An APScheduler job refreshes this collection every 60 minutes.

Both the embedding model and ChromaDB client are loaded using a singleton pattern to avoid redundant initialization across the request lifecycle.
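The chunking and idempotency rules above can be sketched as follows. Deterministic chunk IDs are one common way to make re-ingestion a no-op (the same content always maps to the same ID, so an upsert changes nothing); the project's exact ID scheme isn't shown here, so treat this as an assumption:

```python
import hashlib

def chunk_text(text, size=1200, overlap=200):
    """Fixed-size chunks with overlap so sentences at chunk boundaries
    appear in two chunks -- matching the 1,200/200 settings above."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def chunk_id(source, index, chunk):
    """Deterministic ID: re-ingesting the same document yields the same
    IDs, so re-running ingestion cannot create duplicate entries."""
    return hashlib.sha1(f"{source}:{index}:{chunk}".encode()).hexdigest()
```

Note the 200-character overlap: the tail of each chunk is repeated at the head of the next, which is what "preserves context at boundaries" in practice.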

Data Layer

The portfolio database is seeded with 80–100 synthetic trade records spanning January 2023 to December 2025 across 30 tickers. The seed script uses a weighted date distribution (60% recent, 25% mid-term, 15% older) to simulate realistic trading activity. Tickers cover 8 sectors and are pre-classified by market cap tier (Large/Mid/Small):

  • Technology: AAPL, MSFT, NVDA, GOOGL, META, AMD, INTC, CRM
  • Finance: JPM, BAC, GS, V
  • Healthcare: JNJ, UNH, ABBV
  • Energy: XOM, CVX, SLB
  • Consumer Discretionary: AMZN, TSLA, MCD
  • Plus Real Estate, Communication Services, and Industrials
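The weighted date distribution from the seed script can be sketched like this. The exact bucket boundaries (which years count as "recent", "mid-term", and "older" within the 2023–2025 window) are assumptions:

```python
import random
from datetime import date, timedelta

def random_trade_date(rng: random.Random) -> date:
    """Sample a trade date: 60% recent / 25% mid-term / 15% older,
    per the seed script's weighting. Year buckets are assumed."""
    r = rng.random()
    if r < 0.60:                                    # recent
        start, end = date(2025, 1, 1), date(2025, 12, 31)
    elif r < 0.85:                                  # mid-term
        start, end = date(2024, 1, 1), date(2024, 12, 31)
    else:                                           # older
        start, end = date(2023, 1, 1), date(2023, 12, 31)
    return start + timedelta(days=rng.randrange((end - start).days + 1))
```

Skewing trades toward the present makes the dashboard's recent-performance views look like an actively managed book rather than a uniform random history.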

API Design

The FastAPI server exposes two endpoints: POST /chat for conversational queries and GET /health for uptime checks. On startup, the server ingests the policy PDF, fetches initial news for all holdings, and schedules the 60-minute news refresh. Empty messages return a 400 error; generation failures return a 500 error.
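Stripped of the FastAPI wiring, the /chat handler's control flow amounts to the sketch below. generate_reply is a placeholder for invoking the LangGraph pipeline, and the function names and response shapes are illustrative:

```python
def generate_reply(message: str) -> str:
    # Placeholder for running the message through the multi-agent graph.
    return f"echo: {message}"

def handle_chat(message: str):
    """Mirror of the /chat endpoint's control flow: empty input -> 400,
    generation failure -> 500, otherwise 200 with the agent's reply."""
    if not message or not message.strip():
        return 400, {"detail": "message must not be empty"}
    try:
        reply = generate_reply(message)
    except Exception:
        return 500, {"detail": "generation failed"}
    return 200, {"reply": reply}
```

In the actual server these branches would be expressed as FastAPI HTTPException raises rather than returned tuples.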

Key Design Decisions

  • Supervisor routing over tool-calling: Using a dedicated routing LLM with structured output gives explicit, auditable routing decisions rather than relying on implicit tool selection
  • Dual RAG collections: Separating policy documents from news articles allows the supervisor to route to the right retrieval context, improving precision
  • Parallel price fetching: ThreadPoolExecutor in the portfolio agent reduces latency significantly when computing live P&L across many holdings
  • Singleton embeddings: Lazy-loading the SentenceTransformer model once avoids repeated initialization overhead on every request
  • Ticker extraction before API calls: Resolving contextual ticker references from conversation history before hitting market data APIs prevents hallucinated or irrelevant fetches
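The singleton-embeddings decision can be as simple as a cached factory. Here lru_cache provides the lazy, load-once behavior, with an object() placeholder standing in for the real SentenceTransformer so the sketch stays dependency-free:

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def get_embedder():
    """Construct the embedding model once per process; every later call
    returns the same cached instance instead of reloading model weights."""
    # Real code would be: return SentenceTransformer("all-MiniLM-L6-v2")
    return object()
```

The first request pays the model-load cost; every subsequent request reuses the same instance.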