The amount of unstructured data, including PDFs, images, videos, and audio files, is rapidly increasing within businesses today. However, documents, which make up a significant portion of this data and hold valuable information, are still being processed using inefficient and manual methods.
To help organizations extract more value from unstructured data, Snowflake has introduced Document AI. This new feature, currently in private preview, allows for easy extraction of content such as invoice amounts or contract terms from documents using a proprietary multimodal large language model (LLM) called Snowflake Arctic-TILT (Text Image Layout Transformer). Arctic-TILT, along with Snowflake Arctic, demonstrates the Snowflake AI Research Team’s dedication to enhancing LLM accuracy and creating tailored products for enterprises.
Snowflake is proud to announce a breakthrough achievement for Arctic-TILT. With only 0.8B parameters, the model has achieved a top score on the DocVQA benchmark, surpassing even GPT-4, a much larger model. The result shows that smaller, application-specific models can outperform larger, general-purpose ones while being more cost-effective and accessible.
What is Arctic-TILT?
Arctic-TILT is a Snowflake-developed LLM designed to extract data from documents using a unique transformer architecture. By combining multiple data modalities, Arctic-TILT offers exceptional performance in document understanding tasks. This model powers Snowflake’s Document AI feature, enabling users to interact with the model through a natural language interface, ask questions, evaluate responses, and fine-tune the model with ease. This solution allows users to structure unstructured data, integrate it with existing data, and perform automated analytics.
Key Features and Capabilities
- Multimodal Understanding: Arctic-TILT can analyze text, images, and spatial layouts simultaneously for a comprehensive understanding of content.
- State-of-the-Art Performance: Arctic-TILT excels in benchmarks like DocVQA, showcasing its capabilities in visual question answering.
- Extended Context Window: Arctic-TILT offers a large context window of 375,000 tokens, essential for understanding multimodal content.
- Efficient Inference: Designed for scalability, Arctic-TILT efficiently processes documents with high accuracy.
- Adaptability: Arctic-TILT is versatile and can be fine-tuned for various applications without prior knowledge of document formats.
Punching above its weight: Arctic-TILT on DocVQA
In recent evaluations, Arctic-TILT achieved an impressive ANLS score of 90% on the DocVQA data set, outperforming much larger models despite having far fewer parameters. This efficiency reflects Snowflake’s commitment to developing cost-effective, high-performing AI solutions.
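For readers unfamiliar with the metric, ANLS stands for Average Normalized Levenshtein Similarity. The post does not define it, so the following is only a minimal sketch of the definition commonly used for DocVQA: each predicted answer is compared with every accepted reference answer, the best similarity is kept, and any match whose normalized edit distance reaches the 0.5 threshold scores zero. The function names `levenshtein` and `anls` and the sample answers are illustrative assumptions, not Snowflake's or the benchmark's official evaluation code.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def anls(predictions, references, threshold=0.5):
    """Average Normalized Levenshtein Similarity.

    predictions: one predicted answer string per question.
    references:  one list of accepted answer strings per question.
    threshold:   normalized distances at or above this value score 0
                 (0.5 is the commonly used value; assumed here).
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0

    total = 0.0
    for predicted, accepted in zip(predictions, references):
        p = predicted.strip().lower()
        best = 0.0
        for answer in accepted:
            g = answer.strip().lower()
            longest = max(len(p), len(g))
            distance = levenshtein(p, g) / longest if longest else 0.0
            score = 1.0 - distance if distance < threshold else 0.0
            best = max(best, score)
        total += best
    return total / len(predictions)


# Made-up answers for three hypothetical questions about an invoice.
sample_predictions = ["$1,250.00", "acme corp", "2021"]
sample_references = [["$1,250.00"], ["ACME Corp."], ["2012"]]
sample_score = anls(sample_predictions, sample_references)
```

In this made-up example the first answer is an exact match (1.0), the second is off by one character out of ten (0.9), and the third sits exactly on the threshold and scores 0, so `sample_score` comes out at roughly 0.63. The metric is forgiving of small OCR-style slips but gives no credit for answers that are mostly wrong.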
Why DocVQA?
DocVQA is a benchmark for evaluating models’ abilities to handle document-centric questions. By excelling in DocVQA, Arctic-TILT demonstrates its effectiveness in complex document understanding scenarios.
Document AI use cases and applications
With Snowflake Document AI powered by Arctic-TILT, users can interact with unstructured data more efficiently. Document AI simplifies model setup and deployment, allowing users without machine learning expertise to leverage the power of Arctic-TILT for extraction tasks. This seamless integration enhances efficiency and expands AI applications within enterprises.
Private preview customers across various industries are already benefiting from Document AI for tasks such as processing patient records, insurance claims, tax filings, and more.
Looking ahead and getting started with Document AI
Attend the Data Cloud Summit from June 3-6 to learn more about Document AI and its applications. Snowflake customers can reach out to their account team for more information on trying Document AI. Alternatively, they can use other Snowflake Arctic models, such as the Arctic LLM or Arctic embed for text-embedding tasks.
The post Snowflake’s Arctic-TILT: A State-of-the-Art Document Intelligence LLM in a Single A10 GPU appeared first on Snowflake.