10/26/2025
A Project Made By
Raul Valle
Engineer
Matheus Kunzler Maldaner
Engineer
Stephen Wormald
Engineer
Jimmy Woelke
Engineer
Kristian O’Connor
Engineer
Built At
Gator Hack IV
With the deployment of Artificial Intelligence (AI) and Large Language Models (LLMs), modern research is flooded with AI-generated and AI-assisted content, bringing with it a wave of unverifiable claims, unsupported evidence, and fabricated data. This erosion of research integrity directly affects students, educators, researchers, and publishers, who rely on trustworthy sources to advance their work.
Academics and organizations need better tools to evaluate the integrity of AI-generated content. Such tools keep humanity on a path toward scientific and cultural improvement by helping high-quality content remain relevant amidst a flood of superficial media.
The key idea is to create agentic software that decomposes source documents into factual and logical claims, then verifies those claims using a legion of AI agents that scour the internet for ground truth. Instead of forcing users to trust the accuracy of AI-generated reports, this software represents documents as visual flowcharts that users can explore to internalize the integrity of specific claims. These tools help users inspect and reject speculative statements made by AI.
If the key problem is trusting AI-generated content, this approach provides a solution by evaluating and ranking digital content with mathematically grounded scores of factual integrity. While the demo evaluates a single paper, high-throughput APIs could let language models (e.g., Claude, GPT-5, and Google Gemini) verify the integrity of millions of reports per second. At scale, this solution discourages AI from generating useless content and instead builds toward an "AI-Internet of Integrity."
We have three primary components:
Frontend (Gatsby/React): The frontend lets users input and explore document reports
Backend (Python - FastAPI): The backend processes requests from the frontend and evaluates document integrity
Docker Image (Python Web Agent): The Docker image lets a swarm of AI agents scour the internet
Together, these components form a pipeline that takes in research papers (via PDF upload or URL), extracts their content, builds a structured fact-graph, and verifies the accuracy of each claim through autonomous agents. More specifically, the frontend handles user interaction and visualization and sends requests to the backend, which coordinates analysis and verification. The backend uses the Dockerized web agent to provide a secure, sandboxed environment for browsing and data extraction. At this time, all memory is non-persistent, though long term the software could be updated to include a database that saves user-specific research projects.
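To make this flow concrete, here is a minimal sketch of what the backend's analysis endpoint could look like. The route name, request fields, and stubbed helpers are assumptions for illustration, not the project's actual API.

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class AnalyzeRequest(BaseModel):
    url: str  # paper URL (the PDF-upload path is omitted for brevity)

async def decompose(url: str) -> list[str]:
    # Stub: a real version would fetch the PDF or webpage and prompt
    # an LLM to split the document into factual and logical claims.
    return ["Example claim extracted from the paper."]

async def verify(claims: list[str]) -> list[dict]:
    # Stub: a real version would fan the claims out to Dockerized web agents.
    return [{"claim": c, "verdict": "supported"} for c in claims]

@app.post("/analyze")
async def analyze(req: AnalyzeRequest) -> dict:
    claims = await decompose(req.url)
    results = await verify(claims)
    return {"claims": claims, "results": results}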
One of our biggest challenges early on was figuring out how each part of the system should communicate. Because our project relied on multiple moving pieces (agents, backend services, and the visualizer), it was easy for inputs and outputs to get out of sync. Without clear definitions, even small mismatches caused major slowdowns.
We overcame this by setting clear expectations for what data each agent should produce and consume. Once we standardized the JSON output for every agent, it became much easier for our DAG engineer to connect everything together and avoid merge or logic conflicts. This structure let us move faster and made debugging far simpler as the system grew.
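As an illustration of this kind of contract, a standardized agent output could be defined with a Pydantic model like the one below; the field names are assumptions, not our exact schema.

from pydantic import BaseModel

class AgentResult(BaseModel):
    claim_id: int
    verdict: str        # e.g. "supported", "refuted", "inconclusive"
    confidence: float   # 0.0 to 1.0
    sources: list[str]  # URLs the agent used as evidence

# Example payload every agent is expected to emit:
example = AgentResult(
    claim_id=7,
    verdict="supported",
    confidence=0.82,
    sources=["https://example.org/study"],
)
print(example.model_dump_json())  # Pydantic v2 serialization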
We learned several key things: building with agentic AI systems, orchestrating asynchronous virtual machines, writing efficient prompts, and web development.
We effectively orchestrated asynchronous agent behavior in a distributed system, leveraging virtualized environments and communication protocols to enable concurrent task execution. This allows the backend to process large documents while maintaining responsiveness in the frontend.
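A minimal sketch of this fan-out pattern with Python's asyncio, where verify_claim stands in for a Dockerized web agent:

import asyncio

async def verify_claim(claim: str) -> dict:
    # Stand-in for a Dockerized web agent browsing for evidence.
    await asyncio.sleep(0.1)  # simulate network / browsing latency
    return {"claim": claim, "verdict": "supported"}

async def verify_all(claims: list[str]) -> list[dict]:
    # Fan out one agent task per claim and gather results concurrently,
    # so the backend stays responsive while agents browse.
    return await asyncio.gather(*(verify_claim(c) for c in claims))

if __name__ == "__main__":
    print(asyncio.run(verify_all(["claim A", "claim B"])))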
Designed a full pipeline: PDF -> LLM -> knowledge graph of claims -> agent swarm of fact verifiers -> mathematically grounded integrity score based on graph theory
Built a simple yet effective, intuitive UI with interactive sliders that modify weights in the scoring algorithm, allowing users to tailor the score to their needs (a rough sketch of this weighting appears after this list)
Deployed a scoring system that integrates the many unique qualities of research papers into a single mathematically grounded integrity score.
By utilizing browser-based agents, we enabled the system to analyze both static PDFs and live webpages, ultimately expanding its range of usable data sources.
Developed a dynamic visualization that clearly displays how each node of the selected media relates to other nodes and the corresponding strength of each edge.
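The sketch below illustrates the slider-weighted scoring idea under simplifying assumptions: the weight names, the per-claim fields, and the averaging formula are illustrative, not the exact algorithm.

def integrity_score(claims: list[dict], weights: dict[str, float]) -> float:
    """Combine per-claim verification results into one document score.

    claims: dicts with 'confidence' (0-1) and 'centrality' (0-1, how
    central the claim is in the fact graph); weights: user-tunable
    slider values, normalized here so they sum to 1.
    """
    total = sum(weights.values())
    w = {k: v / total for k, v in weights.items()}
    score = 0.0
    for c in claims:
        # Well-evidenced claims and graph-central claims count more.
        score += w["evidence"] * c["confidence"] + w["centrality"] * c["centrality"]
    return score / len(claims)

print(integrity_score(
    [{"confidence": 0.9, "centrality": 0.7},
     {"confidence": 0.4, "centrality": 0.2}],
    weights={"evidence": 2.0, "centrality": 1.0},
))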
Our current plan:
Create visualizations of "multi-paper analysis" showing how the credibility of one paper influences the integrity of downstream papers that cite it
Allow users to see the exact source/methodology behind integrity ratings. This would be the quickest improvement: all of the information is already neatly sorted in our backend; it's just a matter of displaying it to users.
Let users save and load analysis reports into user profiles. Implement a database for personal account storage features.
Allow Plato agents to reference previously reviewed papers.
Capture the reference sources used to verify information and run a similar level of verification on them. This could allow us to create a network of nodes that point outward to the sources that support them.
The largest improvements depend on a database implementation, which would transform Plato's Cave from a single-session tool into a long-term academic assistant. By saving papers and their fact graphs, Plato's Cave can reference prior findings. When a user analyzes a paper similar to one they've already viewed, we can surface overlaps and let them link related papers. This way, users can grow a personal "fact graph" across projects.
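As a rough sketch, persistence could start with a small SQLite schema like the following; the table and column names are illustrative assumptions, not a committed design.

import sqlite3

conn = sqlite3.connect("platos_cave.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS papers (
    id INTEGER PRIMARY KEY,
    title TEXT,
    url TEXT
);
CREATE TABLE IF NOT EXISTS claims (
    id INTEGER PRIMARY KEY,
    paper_id INTEGER REFERENCES papers(id),
    text TEXT,
    verdict TEXT,
    confidence REAL
);
""")
# Saving each paper's fact graph lets later sessions surface
# overlapping claims and link related papers.
conn.commit()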