3 modalities
Tabular, graph, and time-series anomaly detection
LLM-Driven Research Engineering
AD-AGENT turns natural-language requests into runnable anomaly detection pipelines across PyOD, PyGOD, and TSB-AD — with automated review, sandbox execution, and evaluation.
Supported by NSF POSE Phase II OpenAD (Award #2346158)
Processor, Selector, Generator, Reviewer, Evaluator
Parse, select, codegen, test, and run
Running anomaly detection across different data modalities usually means switching libraries, re-learning APIs, and wiring up evaluation code by hand. AD-AGENT collapses that loop: a prompt describing the task you want to solve — "Run IForest on cardio.mat", "Detect anomalies in my graph data", or "Try all PyOD models on this dataset" — is turned into a working script, executed inside a secure sandbox, and evaluated end-to-end.
Write commands like: Run IForest on ./data/glass_train.mat and ./data/glass_test.mat.
Works across pyod (tabular), pygod (graph), and tsb_ad (time-series) — one interface for multiple data modalities.
Generated code is reviewed on synthetic data before real execution.
Call api.pipeline stages directly from Python to embed AD-AGENT in notebooks, scripts, or larger workflows.
When no algorithm is specified the selector agent recommends competitive candidates based on data modality and shape.
Generated code runs inside an isolated sandbox — Modal (remote, default) or Docker (local) — never on the host process.
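As a toy illustration of how a request could be routed to one of the three libraries, the sample dataset extensions alone already hint at the modality. The heuristics below are invented for this sketch; AD-AGENT's actual selector is LLM-driven and also considers data shape:

```python
from pathlib import Path

# Hypothetical extension-based routing, mirroring the sample datasets
# (.mat for PyOD tabular, .pt for PyGOD graphs, .npy for TSB-AD time series).
EXT_TO_LIBRARY = {".mat": "pyod", ".pt": "pygod", ".npy": "tsb_ad"}

def route_library(dataset_path: str) -> str:
    """Guess the backing AD library from the dataset file extension."""
    ext = Path(dataset_path).suffix.lower()
    try:
        return EXT_TO_LIBRARY[ext]
    except KeyError:
        raise ValueError(f"unknown dataset format: {ext}")
```

In the real system this guess is only a starting point; the selector can override it based on inferred metadata.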
The optimizer (-o) tunes hyper-parameters with LLM guidance and re-evaluates.

git clone git@github.com:USC-FORTIS/AD-AGENT.git
cd AD-AGENT
python -m venv .venv
# macOS / Linux
source .venv/bin/activate
# Windows
.venv\Scripts\activate
pip install -r requirements.txt
export OPENAI_API_KEY=your-api-key-here # or set in src/config/config.py
python main.py
Then type a natural-language request, for example:
Run IForest on ./data/pyod_data/cardio.mat
Run DOMINANT on ./data/pygod_data/books.pt
Run IForest on ./data/SMAP/SMAP_train.npy
Run all on ./data/pyod_data/cardio.mat
Parallel: python main.py -p | Optimizer: python main.py -o | Sandbox: python main.py --sandbox docker
AD-AGENT runs its agent workflow on the host machine and executes generated model scripts
inside an isolated sandbox backend. This keeps the orchestration layer lightweight while
moving package-heavy model execution into containers. Workflow logs
([main], [selector], [reviewer] …) come from the
host; generated-script output is streamed back from the sandbox into the same terminal.
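The stage-tagged log format can be sketched as a minimal stand-in (this is not the project's actual logger; the separator behaviour follows the description in the api.pipeline docs below):

```python
_last_context = None

def log(stage: str, tool: str, message: str) -> None:
    """Print a [stage][tool]-tagged line, inserting a blank separator
    line whenever the stage/tool context changes."""
    global _last_context
    context = (stage, tool)
    if _last_context is not None and context != _last_context:
        print()
    _last_context = context
    print(f"[{stage}][{tool}] {message}")

log("selector", "IForest", "recommending candidates")
log("reviewer", "IForest", "synthetic check passed")
```

Generated-script output streamed back from the sandbox would simply be printed between these tagged host-side lines.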
Remote execution in a managed cloud sandbox. No local Docker required.
Prerequisite — install and authenticate once:
pip install modal
modal setup
Paths inside Modal: /workspace (script), /data (datasets).
Local container execution. Requires Docker Desktop to be running. Useful for offline or air-gapped environments.
Resource limits applied to every container:
--memory=4g --cpus=2 --rm
Dataset files are bind-mounted read-only at the requested in-container paths.
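Assembled as a command line, the limits and read-only mounts above look roughly like this. This is a sketch of the argument construction only, not the project's actual launcher; the script name is a placeholder:

```python
def docker_run_argv(image: str, script: str, datasets: list[str]) -> list[str]:
    """Build a `docker run` argument list with the resource limits
    described above and a read-only bind mount per dataset."""
    argv = ["docker", "run", "--rm", "--memory=4g", "--cpus=2"]
    for path in datasets:
        # bind-mount at the same path inside the container, read-only
        argv += ["-v", f"{path}:{path}:ro"]
    argv += [image, "python", script]
    return argv

argv = docker_run_argv("adagent-pyod:latest", "run.py",
                       ["./data/pyod_data/cardio.mat"])
```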
python main.py --sandbox docker
python main.py --sandbox modal
# with debug retention (Modal only — sandboxes kept for post-run inspection)
ADAGENT_SANDBOX_DEBUG=1 python main.py --sandbox modal
Sandbox mode is resolved in this priority order:
1. --sandbox flag passed to main.py
2. ADAGENT_SANDBOX environment variable
3. OPENAD_SANDBOX environment variable (backward compatibility)
4. src/config/settings.yaml if present
5. Default: modal

Additional environment variables:
ADAGENT_SANDBOX_DEBUG=1 # retain Modal sandboxes after run for inspection
ADAGENT_MODAL_APP_NAME=... # override Modal app name (default: adagent-sandbox)
ADAGENT_MODAL_VOLUME_NAME=... # override Modal volume name (default: adagent-data)
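The resolution order above can be sketched as a small function. This is illustrative only (the real resolver lives in the project source, and the settings.yaml key name here is an assumption):

```python
import os
from typing import Optional

def resolve_sandbox_mode(cli_flag: Optional[str] = None,
                         settings: Optional[dict] = None) -> str:
    """Resolve the sandbox backend: CLI flag first, then ADAGENT_SANDBOX,
    then OPENAD_SANDBOX (backward compatibility), then settings.yaml,
    then the default 'modal'."""
    if cli_flag:
        return cli_flag
    for var in ("ADAGENT_SANDBOX", "OPENAD_SANDBOX"):
        if os.environ.get(var):
            return os.environ[var]
    if settings and settings.get("sandbox"):  # assumed key name
        return settings["sandbox"]
    return "modal"
```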
Per-library sandbox images:
- adagent-pyod:latest
- adagent-pygod:latest
- adagent-tsb-ad:latest
Override with environment variables:
- PYOD_IMAGE
- PYGOD_IMAGE
- TSB_AD_IMAGE
Note: pygod requires PyG wheel deps (pyg_lib, torch_sparse, torch_scatter).
api.pipeline module
Step-by-step pipeline functions for anomaly detection. Each function can be called
individually (pass explicit arguments) or inside a graph (pass a FullToolState).
Bases: TypedDict
Shared state dictionary passed between all pipeline nodes.
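Based on the attributes documented for this state (field names such as package_name and metadata appear elsewhere in these docs; the exact types here are assumptions), a minimal sketch:

```python
from typing import Any, TypedDict

class FullToolState(TypedDict, total=False):
    # Library backing the run: "pyod", "pygod", or "tsb_ad".
    package_name: str
    # Dataset metadata (e.g. num_samples, has_labels) inferred by the Selector.
    metadata: dict[str, Any]
    # (tool, final_state) pairs collected after all tools are processed.
    results: list[tuple[str, dict]]

state: FullToolState = {"package_name": "pyod",
                        "metadata": {"num_samples": 1831, "has_labels": True}}
```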
Attributes
- Package name: "pyod", "pygod", or "tsb_ad".
- Dataset metadata (e.g. num_samples, has_labels) inferred by the Selector.
- Results: (tool, final_state) pairs after all tools are processed.

Create a default FullToolState with all agent instances initialised.
Returns
- A default FullToolState.
Launch the interactive chatbot to collect algorithm, dataset, and parameter information from the user and populate experiment_config.
Parameters
None.
Returns
- State with experiment_config populated from user input.

Resolve the AD library and tool list from experiment configuration or explicit arguments. Infers package_name, feature_dim, and dataset metadata.
Parameters
"all" to use every available algorithm for the detected library, or None to let the agent decide.state is None.experiment_config is used when provided.Returns
agent_selector, package_name, feature_dim, and metadata set.Query authoritative documentation for an algorithm and store the result in algorithm_doc. Results are cached to disk to avoid redundant API calls.
Parameters
- … state is None.
- … ("pyod", "pygod", "tsb_ad"). Required when state is None.
- … current_tool and package_name.
Returns
- State with algorithm_doc set to the retrieved documentation string.

Generate an initial runnable script for the algorithm, or revise an existing one when a previous CodeQuality with errors is supplied.
Parameters
- … state is None.
- … state is None.
- … (run_info_miner). Required when state is None.
- … state is None.
- … CodeQuality with error_message set, triggering a revision pass instead of fresh generation.
Returns
- State with code_quality.code containing the generated or revised script.

Execute the generated code against synthetic data to catch runtime errors before real-data evaluation. Updates code_quality.error_message and increments review_count on failure.
Parameters
- … state is None.
- … state is None.
Returns
- State with code_quality.error_message set (empty string on success).

Convenience loop that alternates code generation and synthetic review until the code passes or max_reviews is reached.
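The generate-and-review loop just described can be sketched as a generic retry helper. This is a hypothetical illustration, not AD-AGENT's implementation; the callables stand in for the code-generator and reviewer agents:

```python
def generate_and_review(generate, review, max_reviews: int = 3):
    """Alternate code generation and synthetic review until the code
    passes (reviewer returns an empty error) or max_reviews is reached."""
    error = None
    code = None
    for _ in range(max_reviews):
        code = generate(error)   # a prior error triggers a revision pass
        error = review(code)     # empty string means the synthetic run passed
        if not error:
            return code, ""
    return code, error
```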
Parameters
Returns
- … code_quality.

Execute the reviewed code on real training and testing data inside the configured sandbox and compute AUROC / AUPRC metrics.
Parameters
- … state is None.
- … state is None.
Returns
- State with code_quality.auroc and code_quality.auprc populated.

Use an LLM to propose improved hyper-parameters and re-evaluate. Only active when the -o flag is passed on the command line; returns the original code_quality unchanged otherwise.
Parameters
- … state is None.
- … state is None.
Returns
- State with code_quality after up to 8 optimisation steps.

Run an initial evaluation pass and then alternate optimizer and evaluator for optimizer_cycles iterations. Exits early if any step returns an error.
Parameters
Returns
- … code_quality achieved across all cycles.

Validate that dataset files exist on disk before the pipeline starts.
Parameters
Raises
Print a formatted stage-tagged log line. Inserts a blank separator line when the stage or tool context changes.
Parameters
"code_generator", "reviewer".[stage][tool].If this project helps your work, cite the paper:
@inproceedings{yang2025ad,
title={AD-AGENT: A Multi-agent Framework for End-to-end Anomaly Detection},
author={Yang, Tiankai and Liu, Junjun and Siu, Michael and Wang, Jiahang
and Qian, Zhuangzhuang and Song, Chanjuan and Cheng, Cheng
and Hu, Xiyang and Zhao, Yue},
booktitle={Proceedings of the 14th International Joint Conference on
Natural Language Processing and the 4th Conference of the
Asia-Pacific Chapter of the Association for Computational Linguistics},
pages={191--205},
year={2025}
}
This project is supported by the U.S. National Science Foundation (NSF), TIP POSE program: NSF POSE: Phase II: OpenAD: An Integrated Open-Source Ecosystem for Anomaly Detection.