The transcript indicates that Box initially used pre-processing steps involving OCR (Optical Character Recognition). More details on the specific OCR techniques or steps are not provided.
Sorry, the transcript only mentions that Box used pre-processing steps involving OCR (Optical Character Recognition). No further details about specific OCR techniques or steps are given.
This video presents Ben Kus, CTO of Box, discussing Box's journey in integrating AI, specifically focusing on their shift to an "agentic" architecture for data extraction from unstructured content. The video highlights the challenges of using LLMs alone for complex data extraction tasks and explains how the agentic approach addresses these limitations.
Challenges of Pre-Generative AI Data Extraction: Before generative AI, extracting structured data from unstructured content (like contracts) was difficult, requiring specialized models and large training datasets. This limited how much unstructured data processing enterprises could automate.
Initial LLM Approach and its Limitations: Box initially used LLMs for data extraction, which worked well for simple tasks. However, complex documents and numerous fields overwhelmed the LLMs, leading to accuracy issues and challenges in handling diverse file formats and languages.
Agentic Architecture Solution: Box transitioned to an agentic architecture, employing AI agents that follow instructions, have objectives, use tools (like OCR), and possess memory, enabling more sophisticated, multi-step data extraction processes. This approach improved accuracy and adaptability to complex scenarios.
Benefits of Agentic Architecture: The agentic system offers improved accuracy, flexibility, and scalability. It allows for iterative improvements and better handles complex documents, diverse file formats, and multiple languages. The architecture is also easier to evolve than a purely LLM-based approach.
Recommendation: Build Agentic Architecture Early: Kus recommends building an agentic architecture early in AI development, as it provides a more robust and adaptable foundation compared to relying solely on LLMs, especially when dealing with the complexities of enterprise data.
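As an illustration only, the agent described above (instructions, an objective, tools such as OCR, and memory, running a multi-step extraction) might be sketched as follows. This is a minimal sketch under stated assumptions, not Box's implementation: `ExtractionAgent`, `fake_ocr`, and `fake_extract` are hypothetical names, the OCR tool is a stub that just decodes bytes, and a regular expression stands in for the LLM call.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


def fake_ocr(document: bytes) -> str:
    """Hypothetical OCR tool: a stub that decodes bytes instead of reading an image."""
    return document.decode("utf-8")


def fake_extract(text: str, field_name: str) -> Optional[str]:
    """Hypothetical stand-in for an LLM call: finds 'Field: value' on a line."""
    pattern = rf"{re.escape(field_name)}:\s*(.+)"
    match = re.search(pattern, text, flags=re.IGNORECASE)
    return match.group(1).strip() if match else None


@dataclass
class ExtractionAgent:
    objective: List[str]                # fields the agent must extract
    tools: Dict[str, Callable]          # named tools the agent may call
    memory: List[str] = field(default_factory=list)  # log of steps taken
    max_steps: int = 3                  # upper bound on extraction passes

    def run(self, document: bytes) -> Dict[str, Optional[str]]:
        # Step 0: pre-process the document with the OCR tool.
        text = self.tools["ocr"](document)
        self.memory.append("ocr: converted document to text")

        results: Dict[str, Optional[str]] = {name: None for name in self.objective}
        for step in range(1, self.max_steps + 1):
            # Check progress against the objective; stop once every field is filled.
            missing = [name for name, value in results.items() if value is None]
            if not missing:
                break
            # Re-attempt only the fields that are still missing.
            # (With this deterministic stub a retry yields the same answer;
            # a real agent might refine its prompt or pick another tool here.)
            for name in missing:
                results[name] = self.tools["extract"](text, name)
            self.memory.append(f"step {step}: attempted {missing}")
        return results
```

A caller would construct the agent with a list of field names and a `tools` dictionary containing `"ocr"` and `"extract"` entries, then call `run` on the raw document bytes. Fields the agent cannot find stay `None`, and `memory` records which fields each pass attempted, which is the kind of hook a fuller system could use to decide on a different strategy.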
The pure LLM approach proved insufficient for Box's data extraction needs primarily due to the complexity of the documents and the large number of fields involved. LLMs struggled with: