AGI House AI for Science Hackathon

The Iterative AI Scientist

Autonomous research that compounds insights between human check-ins — powered by continuous reasoning-chain evolution

"LLMs can be very smart, but if they're not iterating on science, they won't discover science."

— Ekin Dogus, Periodic Labs

Led by Shan Rizvi

Why Standard Agents Stall

The Problem

Current AI systems operate in one-shot bursts. Without persistent reasoning chains, they forget past explorations, re-derive the same conclusions, and never reach the breakthrough questions scientists chase.

Build persistent reasoning chains that evolve over time

Generate hypotheses from gaps in existing knowledge

Critique and refine their own ideas

Form connections across disparate research domains

Remember and build upon previous explorations

Status Quo

One-shot completions with brittle focus. No memory. No critique. No accumulation.

What We Need

Reasoning memories that compound instead of resetting
Self-critique loops that refine weak ideas into stronger ones
Cross-domain graph search that spots hidden links
Human-style research workflows that stay in context

Persistent iteration turns scattered insights into a scientific discovery engine.

What Makes It Powerful

A platform for continuous reasoning that could be applied to any domain—once the right knowledge graph and specialist agents are in place.

Persistent Research Context

Unlike one-shot queries, the system maintains reasoning chains across sessions. Research from Monday informs Friday's hypotheses, creating compound insights over time.
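As a rough illustration of chains that survive between sessions, here is a minimal sketch that assumes a local JSON file stands in for the graph store. The `ChainStore` class, the file name, and the field names are hypothetical and are not the project's actual schema.

```python
import json
import tempfile
from pathlib import Path


class ChainStore:
    """Hypothetical file-backed store; the real project persists to Neo4j."""

    def __init__(self, path):
        self.path = Path(path)

    def load(self):
        # Return every step saved by earlier sessions, oldest first.
        if not self.path.exists():
            return []
        return json.loads(self.path.read_text())

    def append(self, session, claim, builds_on=None):
        # Each step records which earlier step it extends, so a later
        # session can continue the chain instead of starting over.
        steps = self.load()
        step = {"id": len(steps), "session": session,
                "claim": claim, "builds_on": builds_on}
        steps.append(step)
        self.path.write_text(json.dumps(steps, indent=2))
        return step["id"]


store_path = Path(tempfile.mkdtemp()) / "chains.json"

# Monday's session records an observation.
monday = ChainStore(store_path)
first = monday.append("monday", "Observation A looks unexplained")

# Friday's session is a fresh object, yet it still sees Monday's step.
friday = ChainStore(store_path)
friday.append("friday", "Hypothesis that could explain A", builds_on=first)
chain = friday.load()
```

The `builds_on` pointer is what lets a later session extend an earlier step rather than restate it.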

Cross-Domain Pattern Recognition

By traversing graph connections between disparate fields, it surfaces patterns that single-domain researchers might miss—the foundation for interdisciplinary breakthroughs.
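One way to picture this traversal is a breadth-first search that starts in one field and stops at the first node tagged with another. The sketch below assumes a tiny in-memory adjacency map; the `GRAPH` contents and the `cross_domain_path` helper are illustrative only and do not reflect the project's Neo4j schema.

```python
from collections import deque

# Toy graph: node -> (domain, neighbours). Contents are illustrative only.
GRAPH = {
    "protein folding": ("biology", ["energy landscape"]),
    "energy landscape": ("physics", ["protein folding", "spin glass"]),
    "spin glass": ("physics", ["energy landscape", "neural network"]),
    "neural network": ("computer science", ["spin glass"]),
}


def cross_domain_path(graph, start, target_domain):
    """Breadth-first search for the shortest path from `start` to any
    other node whose domain equals `target_domain`."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        domain, neighbours = graph[path[-1]]
        if domain == target_domain and path[-1] != start:
            return path
        for nxt in neighbours:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no node in the target domain is reachable


path = cross_domain_path(GRAPH, "protein folding", "computer science")
```

Because the search is breadth-first, the returned path is the shortest chain of links between the two fields in this toy graph.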

Hypothesis Generation Platform

With the right knowledge graph schema (proteins, materials, etc.), the continuous reasoning loop becomes applicable to specialized domains—from biology to quantum chemistry.

Autonomous Background Research

While you sleep, the system keeps exploring gaps, validating chains, and preparing insights for your next check-in—research that never stops compounding.

The infrastructure for autonomous research. The applications are limitless.

The Orchestration Flow

Coordinator → specialist handoffs with built-in guardrails and full trace observability

RECEIVE

User request routed to coordinator agent

DELEGATE

Coordinator hands off to Graph/Research/Outreach specialists

EXECUTE

Specialist runs MCP tools, browser automation, or outreach flows

GUARD

Shared guardrails block sensitive payloads and mask disclosures

TRACE

Every event tagged with traceId for replay and debugging

PERSIST

Graph writes and memories stored in Neo4j, streamed to dashboard

Runs between human check-ins
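The six stages above can be sketched as a single function. This is a simplified stand-in under stated assumptions: keyword routing replaces the LLM-driven handoffs, a one-line check replaces the shared guardrails, and a Python list replaces Neo4j. The names `route`, `guard`, `run`, and `SPECIALISTS` are hypothetical.

```python
import uuid

SPECIALISTS = {
    # Hypothetical stand-ins for the Graph/Research/Outreach specialists.
    "graph": lambda text: f"graph result for: {text}",
    "research": lambda text: f"research result for: {text}",
    "outreach": lambda text: f"outreach draft for: {text}",
}


def route(request):
    # DELEGATE: naive keyword routing, assumed here for illustration.
    lowered = request.lower()
    if "email" in lowered or "call" in lowered:
        return "outreach"
    if "graph" in lowered:
        return "graph"
    return "research"


def guard(text):
    # GUARD: placeholder check; returns True when the text may pass.
    return "password" not in text.lower()


def run(request, store):
    trace_id = str(uuid.uuid4())
    events = []

    def emit(stage, detail):
        # TRACE: every event carries the same traceId.
        events.append({"traceId": trace_id, "stage": stage, "detail": detail})

    emit("RECEIVE", request)
    name = route(request)
    emit("DELEGATE", name)
    result = SPECIALISTS[name](request)
    emit("EXECUTE", result)
    if not guard(result):
        emit("GUARD", "blocked")
        return events  # nothing is persisted for blocked output
    emit("GUARD", "passed")
    store.append({"traceId": trace_id, "result": result})
    emit("PERSIST", "stored")
    return events


store = []
events = run("Find papers on graph memory", store)
```

Tracing is treated here as a cross-cutting concern rather than a separate step, so each stage emits its own event under one traceId.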

Why It's Trustworthy

Built-in observability and guardrails make autonomous research safe and debuggable

Trace-First Observability

Every event carries a traceId logged in memories for complete replay and debugging. Full transparency into agent handoffs and reasoning chains.

Persistent Guardrails

Shared reject_sensitive_requests and mask_sensitive_disclosures functions block sensitive payloads across all specialist agents.
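The page names the two guardrail functions but does not show them. The sketch below is one possible shape: the function names come from the text above, while the signatures, regular expressions, and blocked terms are assumptions made for illustration.

```python
import re

# Assumed patterns; the project's real rules are not shown on this page.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
API_KEY = re.compile(r"\bsk-[A-Za-z0-9]{16,}\b")
BLOCKED_TERMS = ("password", "social security number")


def reject_sensitive_requests(text):
    """Return True when an incoming request should be blocked."""
    lowered = text.lower()
    has_term = any(term in lowered for term in BLOCKED_TERMS)
    return has_term or bool(API_KEY.search(text))


def mask_sensitive_disclosures(text):
    """Replace email addresses and key-like tokens in outgoing text."""
    text = EMAIL.sub("[email redacted]", text)
    return API_KEY.sub("[key redacted]", text)


masked = mask_sensitive_disclosures("Contact jane@example.org for data")
```

Keeping both functions in one shared module is what would let every specialist apply the same rules, in line with the "shared" wording above.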

Autonomous Background Research

Coordinator delegates work to specialists that run continuously between human check-ins, compounding insights over time.

Graph-Backed Memory

Neo4j MCP tools persist reasoning chains and research activities, creating a durable knowledge base that survives sessions.

Progress So Far

Now we need your help fixing bugs and closing feature gaps to make it production-ready

Trace & Memory Plumbing

Week 1

TraceId metadata flows through all SSE events and persists in memory logs for complete replay capability
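To make the plumbing concrete, here is a small sketch of a Server-Sent Events frame whose JSON body carries a traceId, plus a parser that writes the decoded events to an in-memory log. The helper names, the event types, and the `agent` field are hypothetical; only the `event:`/`data:` frame layout follows the SSE format.

```python
import json


def sse_frame(event_type, payload, trace_id):
    """Format one SSE frame whose JSON body carries the traceId."""
    body = json.dumps({"traceId": trace_id, **payload})
    return f"event: {event_type}\ndata: {body}\n\n"


def parse_frame(frame):
    """Recover the event type and JSON body from a single-data-line frame."""
    lines = frame.strip().split("\n")
    fields = dict(line.split(": ", 1) for line in lines)
    return fields["event"], json.loads(fields["data"])


memory_log = []
for stage in ("handoff", "tool_call"):
    frame = sse_frame(stage, {"agent": "graph_ops"}, trace_id="trace-123")
    memory_log.append(parse_frame(frame))
```

Filtering `memory_log` by traceId is then enough to reconstruct one run's events in order.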

Multi-Agent Orchestration Live

Week 2

Coordinator agent delegates to Graph Ops, Research Ops, and Outreach Ops using OpenAI Agents SDK handoffs

Guardrails & Dashboard Streaming

Week 3

Shared guardrail functions block sensitive payloads; Next.js dashboard streams orchestrator events in real-time

Recruiting 2–3 Teammates

Now

Seeking contributors to add regression tests, UI observability, guardrail tuning, and MCP resilience before the hackathon

Join us to push this across the finish line before the AGI House hackathon

What you'll build

The orchestration pipeline is live. You'll harden it with regression tests, observability UI, and production-grade resilience.

Multi-Agent Regression Prompts

Create deterministic test prompts for each specialist agent (Graph/Research/Outreach) to validate handoffs and outputs
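A regression suite of this kind could start as a table of prompts paired with the specialist expected to receive each one. In the sketch below, the prompts, the agent names, and `stub_coordinator` are all hypothetical; in practice the stub would be replaced by a call into the real coordinator that reports which agent handled the prompt.

```python
# Hypothetical regression cases: prompt -> specialist expected to handle it.
REGRESSION_CASES = [
    ("Add a node for this paper to the knowledge graph", "graph_ops"),
    ("Find recent preprints on solid-state electrolytes", "research_ops"),
    ("Draft an email to the lab that published this result", "outreach_ops"),
]


def stub_coordinator(prompt):
    """Stand-in for the real coordinator, using fixed keyword rules."""
    lowered = prompt.lower()
    if "email" in lowered or "draft" in lowered:
        return "outreach_ops"
    if "graph" in lowered or "node" in lowered:
        return "graph_ops"
    return "research_ops"


def run_regression(coordinator, cases):
    """Return the cases where the observed handoff differs from the expected one."""
    failures = []
    for prompt, expected in cases:
        actual = coordinator(prompt)
        if actual != expected:
            failures.append((prompt, expected, actual))
    return failures


failures = run_regression(stub_coordinator, REGRESSION_CASES)
```

Returning the list of mismatches, rather than stopping at the first one, shows every broken handoff in a single run.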

Python, Testing, Prompt Engineering

UI Trace + Handoff Surfacing

Build dashboard components that visualize trace lineage and agent handoff chains for full observability

React, Next.js, SSE Streaming

Guardrail Coverage Tests

Develop automated tests for guardrails with real research payloads, then tune sensitivity thresholds

Testing, Security, Python

Neo4j MCP Write Resilience

Add retries, idempotency, and telemetry to graph write operations for production reliability
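The three ingredients named above can be sketched together: a content-derived idempotency key, retries with exponential backoff, and a telemetry record per attempt. `FlakyGraph` is a fake backend invented for this example, and `resilient_write` and its parameters are assumptions rather than the project's Neo4j MCP interface.

```python
import hashlib
import json
import time


class FlakyGraph:
    """Fake graph backend that fails its first `failures` writes."""

    def __init__(self, failures):
        self.failures = failures
        self.rows = {}

    def write(self, key, payload):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("transient write failure")
        self.rows[key] = payload  # keyed upsert, so repeats do not duplicate


def idempotency_key(payload):
    # Same payload gives the same key, regardless of dict ordering.
    canonical = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()


def resilient_write(graph, payload, max_attempts=4, base_delay=0.01, telemetry=None):
    """Write with retries; append one telemetry record per attempt."""
    key = idempotency_key(payload)
    for attempt in range(1, max_attempts + 1):
        try:
            graph.write(key, payload)
        except ConnectionError as exc:
            if telemetry is not None:
                telemetry.append({"key": key, "attempt": attempt,
                                  "ok": False, "error": str(exc)})
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
        else:
            if telemetry is not None:
                telemetry.append({"key": key, "attempt": attempt, "ok": True})
            return key


graph = FlakyGraph(failures=2)
log = []
first_key = resilient_write(graph, {"claim": "A", "source": "s1"}, telemetry=log)
second_key = resilient_write(graph, {"source": "s1", "claim": "A"}, telemetry=log)
```

Deriving the key from the payload means a retried or repeated write lands on the same row instead of creating a duplicate.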

Neo4j, MCP Tools, Error Handling

This is a hackathon, not a job interview. We're looking for people who want to build something ambitious in a weekend.

Multi-Agent Orchestration

A coordinator delegates research, graph updates, and outreach to specialist agents running continuously in the background

Coordinator Agent

Multi-agent handoff orchestration
OpenAI Agents SDK routing
Trace-first observability (traceId)
Shared guardrail enforcement

Graph Ops Specialist

Neo4j MCP read/write tools
Reasoning-chain persistence
Knowledge graph maintenance
Subgraph retrieval

Research Ops Specialist

Browser Use + Hunter intelligence
Live data gathering
Structured research activities
Evidence compilation

Outreach Ops Specialist

Human-in-the-loop proposals
Composio Gmail drafts
Twilio voice calls
Contact management

Tested Reference Prototypes

We're not starting from scratch—leveraging prior prototypes to move fast

Semantic Graph Memory MCP

Cognitive knowledge graph with reasoning chain support and advanced graph operations

Multi-Agent Graph Deep Research

Multi-step orchestrated research system with iterative refinement capabilities

Voice Integration with Graph Context

Hume EVI voice interface with real-time interaction for natural scientific exploration

What Success Looks Like

Clear milestones for the hackathon and beyond

Minimum Viable Demo

Ingest 100+ papers from arXiv/PubMed
Identify gaps in literature
Generate a novel hypothesis
Grade with confidence scores
Real-time visualization

Stretch Goals

Cross-domain hypothesis generation
Experimental design suggestions
Multiple iteration cycles
Voice-guided exploration
Collaborative reasoning sessions

Team Roles

Join the Team

Looking for 2-3 passionate people to make this real. One spot already filled.

Neo4j/Graph Database

Optimize queries, design schema, and build efficient graph operations for scientific knowledge

Scientific Domain Knowledge

Understand research methodologies, reasoning patterns, and scientific validation processes

Frontend/Visualization

Build compelling graph visualizations and intuitive interfaces for exploring reasoning chains

Filled

AI/ML Engineering

Work on hypothesis generation, critique systems, and iterative reasoning algorithms

Bring your expertise, passion for science, and desire to push the boundaries of AI

Interested?

Drop your email and I'll reach out with more details.

Or reach out directly at shan@rizvi.nu

Let's Build the Future of Science

Ready to enable AI systems that truly discover? Get in touch.

Shan Rizvi

Project Lead

shan@rizvi.nu @ShanRizvi github.com/shuruheel

AGI House AI for Science Hackathon 2025
