Due Diligence Agent System: AG2 + TinyFish

Created by John Marshall
Last revision: 03/11/2026

A multi-agent due diligence pipeline that automatically researches a company from a single URL. It uses AG2 to orchestrate specialist agents in parallel threads, each powered by TinyFish for deep web scraping.

Overview

Given a company URL, the system runs a 4-stage pipeline:

1. Seed Crawler: Scrapes the company website to build an initial profile (name, description, team pages, press pages, job URLs, etc.)
2. Parallel Specialists: Spawns 6 specialist agents concurrently, each using TinyFish to deep-scrape relevant sources:
   - Founders & Team: LinkedIn, about/team pages
   - Investors & Funding: Crunchbase, investor pages
   - Press Coverage: Google News, company press pages
   - Financials: Yahoo Finance, Crunchbase
   - Technology Stack: BuiltWith, GitHub, job postings, engineering blogs
   - Social Signals: LinkedIn, Twitter/X, GitHub
3. Validator: Cross-checks all collected data for contradictions, gaps, and low-confidence fields
4. Synthesis: Produces a structured markdown due diligence report
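
The four stages can be pictured with a minimal sketch. It is not the project's actual code: every function below is a hypothetical stub, and only the stage order and the concurrent fan-out to six specialists come from the description above. The real specialists would be AG2 agents calling TinyFish rather than plain functions.

```python
from concurrent.futures import ThreadPoolExecutor

# Specialist names mirror the list above; the identifiers are assumptions.
SPECIALISTS = ["founders_team", "investors", "press", "financials", "tech_stack", "social"]


def seed_crawl(url: str) -> dict:
    """Stage 1 stub: would scrape the site and return an initial profile."""
    return {"url": url, "name": None}


def run_specialist(name: str, profile: dict) -> dict:
    """Stage 2 stub: would run one agent with its scraping tool."""
    return {"specialist": name, "source_url": profile["url"], "findings": []}


def validate(results: dict) -> list:
    """Stage 3 stub: flags specialists that returned no findings."""
    return [f"{name}: no findings" for name, r in results.items() if not r["findings"]]


def synthesize(profile: dict, results: dict, notes: list) -> str:
    """Stage 4 stub: renders a tiny markdown report."""
    lines = [f"# Due diligence: {profile['url']}", ""]
    lines += [f"- {name}: {len(r['findings'])} findings" for name, r in results.items()]
    lines += ["", f"Validation notes: {len(notes)}"]
    return "\n".join(lines)


def run_pipeline(url: str) -> str:
    profile = seed_crawl(url)
    # Fan out: one worker thread per specialist, all started before any result is awaited.
    with ThreadPoolExecutor(max_workers=len(SPECIALISTS)) as pool:
        futures = {name: pool.submit(run_specialist, name, profile) for name in SPECIALISTS}
        results = {name: fut.result() for name, fut in futures.items()}
    notes = validate(results)
    return synthesize(profile, results, notes)


report = run_pipeline("https://example.com")
```

Threads suit this shape because each specialist spends most of its time waiting on network calls, so the stages before and after the fan-out can stay simple sequential code.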

After the pipeline completes, an interactive Q&A mode lets you ask follow-up questions grounded in the collected data.
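
One way to ground follow-up questions in the collected data is to load every JSON output and place it in the prompt alongside the question. The sketch below assumes that approach; `load_outputs`, `build_qa_prompt`, the prompt wording, and the character limit are all hypothetical and not taken from the project.

```python
import json
from pathlib import Path


def load_outputs(report_dir: str) -> dict:
    """Read every JSON file under the report directory, keyed by relative path."""
    root = Path(report_dir)
    data = {}
    for path in sorted(root.rglob("*.json")):
        key = path.relative_to(root).as_posix()
        data[key] = json.loads(path.read_text(encoding="utf-8"))
    return data


def build_qa_prompt(question: str, outputs: dict, max_chars: int = 8000) -> str:
    """Build a prompt that restricts the answer to the collected data."""
    # Truncation is a crude guard against oversized prompts (assumed limit).
    context = json.dumps(outputs, indent=2)[:max_chars]
    return (
        "Answer using only the collected data below. "
        "If the data does not cover the question, say so.\n\n"
        f"DATA:\n{context}\n\nQUESTION: {question}"
    )
```

The resulting string would then be sent to whichever agent handles the Q&A turn.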

AG2 Features

ConversableAgent: Each specialist is an AssistantAgent (a ConversableAgent subclass) with a focused system prompt
TinyFishTool API: AG2’s built-in TinyFish integration, registered as a callable tool for agents
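
To illustrate what a "focused system prompt" might look like, the sketch below builds one prompt per specialist from the source lists given in the overview. The prompt wording and the `build_system_prompt` helper are assumptions, not the project's actual prompts. The resulting string is the kind of value that would be passed as `system_message` when constructing an `AssistantAgent`.

```python
# Roles and sources copied from the overview; everything else is illustrative.
SPECIALIST_SOURCES = {
    "Founders & Team": ["LinkedIn", "about/team pages"],
    "Investors & Funding": ["Crunchbase", "investor pages"],
    "Press Coverage": ["Google News", "company press pages"],
    "Financials": ["Yahoo Finance", "Crunchbase"],
    "Technology Stack": ["BuiltWith", "GitHub", "job postings", "engineering blogs"],
    "Social Signals": ["LinkedIn", "Twitter/X", "GitHub"],
}


def build_system_prompt(role: str, company_url: str) -> str:
    """Return a narrow prompt that names one role and its allowed sources."""
    sources = ", ".join(SPECIALIST_SOURCES[role])
    return (
        f"You are the {role} specialist in a due diligence pipeline. "
        f"Research the company at {company_url} using only these sources: {sources}. "
        "Return findings as JSON and mark anything you could not verify."
    )


prompts = {role: build_system_prompt(role, "https://example.com") for role in SPECIALIST_SOURCES}
```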

Workshop

This project is referenced in the AG2 Workshop:

Episode 13: Web Browsing Agents

Installation

Prerequisites

Python 3.10+
An OpenAI API key
A TinyFish API key

Setup

Clone and navigate to the folder:

git clone https://github.com/ag2ai/build-with-ag2.git
cd build-with-ag2/due-diligence-with-tinyfish

Install dependencies:
```
pip install -r requirements.txt
```

Set environment variables:

export OPENAI_API_KEY=your-openai-key
export TINYFISH_API_KEY=your-tinyfish-key
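
A quick check that both keys are visible to Python can save a failed run later. This is a small sketch rather than part of the project; only the two variable names come from the setup step above, and the helper names are hypothetical.

```python
import os

REQUIRED_KEYS = ("OPENAI_API_KEY", "TINYFISH_API_KEY")


def missing_keys(env=None) -> list:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


def require_keys(env=None) -> None:
    """Raise a readable error up front instead of failing inside an API call."""
    missing = missing_keys(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_keys()` with no arguments checks the current process environment.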

Usage

Run the full pipeline

python main.py --url https://example.com

This will:

Run all 4 stages of the pipeline
Save structured JSON outputs and a final report to a timestamped directory (e.g., due_diligence_acme_20260311_120000/)
Enter interactive Q&A mode
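
The example directory name suggests a `due_diligence_<company>_<YYYYMMDD>_<HHMMSS>` pattern. How the project derives the company part is not stated, so the sketch below simply assumes it is the first hostname label after any `www.` prefix; `output_dir_name` is a hypothetical helper.

```python
from datetime import datetime
from urllib.parse import urlparse


def output_dir_name(url: str, now: datetime) -> str:
    """Build a timestamped directory name from a company URL (assumed scheme)."""
    host = urlparse(url).hostname or "company"
    labels = [part for part in host.split(".") if part != "www"]
    slug = labels[0] if labels else "company"
    return f"due_diligence_{slug}_{now:%Y%m%d_%H%M%S}"


name = output_dir_name("https://www.acme.com", datetime(2026, 3, 11, 12, 0, 0))
# name == "due_diligence_acme_20260311_120000"
```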

Q&A on an existing report

python main.py --report-path ./due_diligence_acme_20260311_120000/

Skip the pipeline and jump straight into Q&A over a previously generated report.
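
The two invocations map naturally onto a pair of command-line flags. The sketch below shows one way to parse them with `argparse`; the flag names come from the commands above, but treating them as mutually exclusive and required is an assumption about how `main.py` behaves.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="main.py", description="Due diligence pipeline")
    # Assumed: exactly one of the two modes must be chosen.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--url", help="Company URL; runs the full pipeline")
    mode.add_argument("--report-path", help="Existing output directory; skips to Q&A")
    return parser


args = build_parser().parse_args(["--report-path", "./due_diligence_acme_20260311_120000/"])
mode = "qa_only" if args.report_path else "full_pipeline"
```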

Output Structure

due_diligence_acme_20260311_120000/
├── company_profile.json      # Seed crawl results
├── founders_team/
│   ├── founders.json
│   ├── executives.json
│   └── headcount.json
├── investors.json
├── press/
│   ├── articles.json
│   └── sentiment.json
├── financials.json
├── tech_stack.json
├── social.json
├── validation_notes.json
├── report.md                 # Final synthesized report
└── references.md             # Index of all output files
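
The `validation_notes.json` file holds the validator's findings. Its schema is not documented here, so the following sketch assumes a simple one: each specialist reports fields as `{"value": ..., "confidence": ...}`, and the check flags missing values, low confidence, and fields where specialists disagree. The function name, threshold, and record layout are all hypothetical.

```python
from collections import defaultdict


def cross_check(records: dict, min_confidence: float = 0.5) -> dict:
    """records maps specialist name -> {field: {"value": ..., "confidence": float}}.

    Values are assumed to be hashable scalars (str, int, float).
    """
    notes = {"contradictions": [], "gaps": [], "low_confidence": []}
    seen = defaultdict(dict)  # field -> {specialist: value}
    for specialist, fields in records.items():
        for field, entry in fields.items():
            value = entry.get("value")
            if value is None:
                notes["gaps"].append(f"{specialist}.{field}")
                continue
            if entry.get("confidence", 0.0) < min_confidence:
                notes["low_confidence"].append(f"{specialist}.{field}")
            seen[field][specialist] = value
    # A field is contradictory when two specialists report different values for it.
    for field, by_specialist in seen.items():
        if len(set(by_specialist.values())) > 1:
            notes["contradictions"].append({"field": field, "values": dict(by_specialist)})
    return notes


sample = {
    "founders_team": {"headcount": {"value": 120, "confidence": 0.9}},
    "social": {"headcount": {"value": 300, "confidence": 0.4}},
    "financials": {"revenue": {"value": None, "confidence": 0.0}},
}
notes = cross_check(sample)
```

With this sample, the two headcount figures are reported as a contradiction, the social figure is also flagged as low confidence, and the missing revenue is listed as a gap.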

Contact

For more information or any questions, please refer to the documentation or reach out to us!

View Documentation at: https://docs.ag2.ai/latest/
Find AG2 on GitHub: https://github.com/ag2ai/ag2
Join us on Discord: https://discord.gg/pAbnFJrkgZ

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.