The three main attack vectors discussed are: tool attacks, memory poisoning, and prompt injection.
Sentence-BERT is used to convert the semantic text data (communication history and message content) into numerical embeddings (vectors). These numerical embeddings are then used to build the graph's node and edge features, which are subsequently fed into the Graph Neural Network (GNN) for analysis.
This video discusses the recent discovery of "toxic" AI agents in the global financial sector and explores potential solutions using graph topology. The presenter highlights studies from Bloomberg, universities, and research papers revealing significant safety gaps in current AI systems, particularly concerning multi-agent systems, and proposes a novel approach leveraging Graph Neural Networks (GNNs) to enhance security.
Toxic AI Agents in Finance: Studies reveal unsafe generative AI in the global financial sector, posing substantial risks due to the potential for data breaches, misinformation, and system-wide disruptions.
Safety Gaps in Existing Systems: Current guardrails and risk taxonomies are insufficient to detect most content risks in the financial services domain.
Multi-Agent System Vulnerabilities: Multi-agent systems, with their interconnectedness and open access to databases and servers, present a significant security vulnerability. Attacks such as prompt injection, memory poisoning, and tool attacks are highlighted.
GNN-based Safeguard: The video proposes a novel safeguard using a two-part system: (1) building a multi-agent utterance graph representing agent communication, and (2) employing a GNN for node classification to detect malicious agents and prune their communication pathways ("topological intervention").
Practical Implementation: The video provides system prompts for various attack vectors and outlines the steps involved in building and utilizing the GNN-based safeguard. Benchmark data demonstrating the effectiveness of the proposed system is also shown.