Caroline Bishop Nov 18, 2024 17:02
Mistral AI introduces Pixtral Large, a 124B multimodal model with advanced capabilities in image and text understanding, outperforming competitors in various benchmarks.
Mistral AI has announced the launch of Pixtral Large, a groundbreaking 124 billion parameter open-weights multimodal model, building upon the capabilities of Mistral Large 2. This latest model showcases advanced image understanding, particularly in processing documents, charts, and natural images, while maintaining superior text comprehension.
Pixtral Large has been evaluated against leading models on a series of standard multimodal benchmarks. In MathVista, which tests complex mathematical reasoning over visual data, Pixtral Large achieved a remarkable score of 69.4%, surpassing all other models in the category. Additionally, in ChartQA and DocVQA, which assess reasoning over complex charts and documents, Pixtral Large outperformed prominent models like GPT-4o and Gemini-1.5 Pro.
The model also demonstrated competitive abilities on the MM-MT-Bench, outperforming Claude-3.5 Sonnet (new), Gemini-1.5 Pro, and GPT-4o (latest). MM-MT-Bench serves as an open-source, judge-based evaluation reflecting real-world applications of multimodal language models.
Pixtral Large pairs a 123 billion parameter multimodal decoder with a 1 billion parameter vision encoder, for 124 billion parameters in total. Its 128K context window can fit at least 30 high-resolution images.
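To make these figures concrete, here is a minimal back-of-the-envelope sketch. The parameter counts and the 30-image minimum come from the announcement; reading "128K" as 128 × 1024 tokens is an assumption, and the helper names are hypothetical. It only gives an upper bound on the token budget per image if 30 images share the window, not a description of how the model actually tokenizes images.

```python
# Figures reported in the article.
DECODER_PARAMS = 123_000_000_000
VISION_ENCODER_PARAMS = 1_000_000_000
MIN_IMAGES = 30

# Assumption: "128K" is read as 128 * 1024 tokens; the article does not specify.
CONTEXT_TOKENS = 128 * 1024


def total_params() -> int:
    """Sum of the decoder and vision encoder parameter counts."""
    return DECODER_PARAMS + VISION_ENCODER_PARAMS


def max_tokens_per_image(images: int = MIN_IMAGES, reserved_text_tokens: int = 0) -> int:
    """Upper bound on tokens per image when `images` images share the window.

    `reserved_text_tokens` is a hypothetical allowance for the text prompt.
    """
    if images <= 0:
        raise ValueError("images must be positive")
    available = CONTEXT_TOKENS - reserved_text_tokens
    if available < 0:
        raise ValueError("reserved_text_tokens exceeds the context window")
    return available // images


if __name__ == "__main__":
    print(total_params())          # 124000000000
    print(max_tokens_per_image())  # 4369
```

Under these assumptions, each of 30 images could take up to roughly 4,300 tokens before any text is added to the prompt.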
Available under the Mistral Research License for academic and research purposes, and a commercial license for business applications, Pixtral Large is set to revolutionize how enterprises utilize AI for document analysis, chart interpretation, and more.
In practical applications, Pixtral Large excels in multilingual optical character recognition (OCR) and reasoning tasks. For instance, when analyzing a German receipt, the model accurately calculates totals and incorporates an 18% tip, showcasing its proficiency in handling real-world scenarios.
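The arithmetic behind the receipt example can be sketched as follows. The article does not give the receipt's line items, so the prices below are hypothetical, and applying the 18% tip to the full subtotal is an assumption. This shows only the calculation the model is described as performing after reading the receipt, not the OCR step itself.

```python
from decimal import Decimal, ROUND_HALF_UP

# Tip rate mentioned in the article.
TIP_RATE = Decimal("0.18")


def total_with_tip(item_prices, tip_rate=TIP_RATE):
    """Return (subtotal, tip, total) for a list of prices given as strings.

    Decimal is used so that currency amounts are not subject to
    binary floating-point rounding.
    """
    subtotal = sum((Decimal(p) for p in item_prices), Decimal("0"))
    # Round the tip to whole cents, half up.
    tip = (subtotal * tip_rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return subtotal, tip, subtotal + tip


if __name__ == "__main__":
    # Hypothetical line items in euros; not taken from the article.
    items = ["12.50", "4.90", "3.60"]
    subtotal, tip, total = total_with_tip(items)
    print(subtotal, tip, total)  # 21.00 3.78 24.78
```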
Beyond document processing, the model can also analyze charts. For example, it identifies the points at which a training loss curve becomes unstable, which illustrates its usefulness in technical and business settings.
Alongside Pixtral Large, Mistral AI has updated its flagship text model, Mistral Large, now available as Mistral Large 24.11. This version offers improvements in long context understanding, a new system prompt, and enhanced function calling, tailored for enterprise use cases such as knowledge exploration, semantic document understanding, and task automation.
Mistral Large 24.11 is set to be accessible via cloud providers like Google Cloud and Microsoft Azure, enhancing its availability for businesses seeking advanced AI solutions.
For more details, visit the Mistral AI website.
Disclaimer: Blockchain.news provides content for informational purposes only. In no event shall blockchain.news be responsible for any direct, indirect, incidental, or consequential damages arising from the use of, or inability to use, the information provided. This includes, but is not limited to, any loss or damage resulting from decisions made based on the content. Readers should conduct their own research and consult professionals before making financial decisions.