Building an Agentic Platform — Ben Kus, CTO Box

About this Video

Video Title: Building an Agentic Platform — Ben Kus, CTO Box
Channel: AI Engineer
Speakers: Ben Kus
Duration: 00:19:06

Introduction

In this video, Ben Kus, CTO of Box, describes Box's journey in integrating AI, focusing on its shift to an "agentic" architecture for extracting data from unstructured content. He explains why LLMs alone fall short on complex data extraction tasks and how the agentic approach addresses those limitations.

Key Takeaways

Challenges of Pre-Generative AI Data Extraction: Traditional methods for extracting structured data from unstructured data (like contracts) were difficult, requiring specialized models and large training datasets. This limited the automation of unstructured data processing in enterprises.
Initial LLM Approach and its Limitations: Box initially used LLMs for data extraction, which worked well for simple tasks. However, complex documents and numerous fields overwhelmed the LLMs, leading to accuracy issues and challenges in handling diverse file formats and languages.
Agentic Architecture Solution: Box transitioned to an agentic architecture, employing AI agents that follow instructions, have objectives, use tools (like OCR), and possess memory, enabling more sophisticated, multi-step data extraction processes. This approach improved accuracy and adaptability to complex scenarios.
Benefits of Agentic Architecture: The agentic system offers improved accuracy, flexibility, and scalability. It allows for iterative improvements and better handles complex documents, diverse file formats, and multiple languages. The architecture is also easier to evolve than a purely LLM-based approach.
Recommendation: Build Agentic Architecture Early: Kus recommends building an agentic architecture early in AI development, as it provides a more robust and adaptable foundation compared to relying solely on LLMs, especially when dealing with the complexities of enterprise data.
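
To make the agent description above more concrete, here is a minimal Python sketch of an extraction loop with an objective, a tool, and memory. It is an illustration only, not Box's implementation: the names `ExtractionAgent`, `ocr_tool`, and `extract_field` are hypothetical, the "OCR tool" merely normalizes whitespace, and a regular expression stands in for the model call so the example runs without any external service.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


def ocr_tool(raw: str) -> str:
    """Hypothetical stand-in for an OCR/cleanup tool: collapses whitespace."""
    return " ".join(raw.split())


def extract_field(text: str, field_name: str) -> Optional[str]:
    """Hypothetical stand-in for a model call: finds 'Name: value' pairs."""
    pattern = rf"{re.escape(field_name)}:\s*([^;]+)"
    match = re.search(pattern, text, re.IGNORECASE)
    return match.group(1).strip() if match else None


@dataclass
class ExtractionAgent:
    objective: List[str]                    # fields the agent must fill
    tools: Dict[str, Callable[[str], str]]  # named tools the agent may call
    memory: Dict[str, str] = field(default_factory=dict)  # values found so far
    log: List[str] = field(default_factory=list)          # one entry per step

    def run(self, raw_document: str, max_steps: int = 3) -> Dict[str, str]:
        text = raw_document
        for step in range(1, max_steps + 1):
            missing = [n for n in self.objective if n not in self.memory]
            if not missing:
                break  # objective met, stop early
            for name in missing:
                value = extract_field(text, name)
                if value is not None:
                    self.memory[name] = value
            still_missing = [n for n in self.objective if n not in self.memory]
            self.log.append(f"step {step}: missing {still_missing}")
            if still_missing:
                # Use a tool to clean the input, then retry on the next step.
                text = self.tools["ocr"](raw_document)
        return dict(self.memory)


# The line break inside "Monthly rent" defeats the first pass on the raw text.
document = "Tenant:   Acme Corp;\nMonthly\n rent: 1200 USD;  Term: 5 years"
agent = ExtractionAgent(
    objective=["Tenant", "Monthly rent", "Term"],
    tools={"ocr": ocr_tool},
)
result = agent.run(document)
print(result)
print(agent.log)
```

In this toy run the first pass finds two of the three fields, the agent records what is still missing, calls the cleanup tool, and fills the remaining field on the second pass. The point of the sketch is the shape of the loop (check the objective, act, remember, retry with a tool), which a single one-shot prompt does not have.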

What specific pre-processing steps did Box use before it had LLMs for extraction?

Answer

The transcript mentions only that Box used pre-processing steps involving OCR (Optical Character Recognition). It gives no further detail on the specific OCR techniques or steps.

What were the primary reasons why the pure LLM approach proved insufficient for Box's data extraction needs?

Answer

The pure LLM approach proved insufficient for Box's data extraction needs primarily due to the complexity of the documents and the large number of fields involved. LLMs struggled with:

Complex documents: Longer documents (e.g., 300-page leases) with numerous complex fields and instructions overwhelmed the LLMs' capacity.
Accuracy limitations: LLMs could not consistently reach the high accuracy required in an enterprise setting. Confidence scores were unreliable, and the models sometimes lost track of the task.
OCR limitations: The accuracy of the initial OCR process significantly impacted the performance of the LLMs. Imperfect OCR results fed inaccurate information to the models.
Language and file format diversity: Handling multiple languages and different file formats (like PDFs) presented further challenges to the LLMs.
