Inflection AI Introduces Inflection-2.5 Model

Today, Inflection AI, a startup based in Palo Alto and co-founded by Mustafa Suleyman from DeepMind and Reid Hoffman from LinkedIn, unveiled its latest foundation model named Inflection-2.5.

Inflection-2.5 represents a significant upgrade over the company’s original Inflection-1, showcasing performance that nearly matches OpenAI’s GPT-4 model, particularly in STEM fields. It now fuels the Pi assistant developed by the company to compete with ChatGPT and Gemini, offering accessibility via mobile devices and web platforms.

The launch is the latest attempt to challenge the dominance of OpenAI, which continues to refine its strategy for AI development aimed at benefiting humanity. Recently, Anthropic introduced Claude 3 Opus, billed as the first model to outperform GPT-4.

Enhancing AI Performance

Since its establishment, Inflection AI has been dedicated to crafting an AI that is empathetic, functional, and secure, showcasing a more personalized and conversational demeanor compared to other models such as the GPT series. The company’s unique empathetic fine-tuning has endowed the model powering Pi with a distinctive personality and remarkable emotional intelligence.

With the rollout of Inflection-2.5, the startup, which secured a $1.3 billion funding round in June 2023, is concentrating on boosting the IQ component, focusing on domains like physics and mathematics. In a blog post released today, Inflection AI said that users of Pi, now backed by Inflection-2.5, can discuss a wide array of topics, from hobbies and coding to academic queries and business planning.

Performance Benchmarking

The enhanced model demonstrates significant advancements over Inflection-1 across various metrics, nearly on par with GPT-4, although still slightly trailing. In performance evaluations such as the MMLU benchmark covering tasks from high school to professional levels, Inflection-2.5 scored 85.5, closely trailing GPT-4’s 87.3. Similarly, in STEM assessments, the model performed comparably to the OpenAI model, achieving a score of 63 in the Hungarian Math exam (compared to GPT-4’s 68) and reaching the 85th percentile in Physics GRE, in contrast to GPT-4’s 97th percentile.

In the GSM8K benchmark, which includes 8.5K high-quality grade school math problems, the Inflection model achieved a score of 86.3, below GPT-4’s 92. In the 0-shot HumanEval test designed to assess code generation capabilities, it garnered a score of 73.8 compared to GPT-4’s 79.3.

Efficient Training and Web Search Integration

Although Inflection-2.5’s performance does not surpass that of GPT-4, the company emphasizes that it has achieved a “94% GPT-4 level performance” with significantly more efficient training compared to the OpenAI large language model (LLM). According to Inflection AI, the training of Inflection-2.5 required only 40% of the training FLOPs (compute) of GPT-4 to yield these results.
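To put these figures side by side, here is a minimal Python sketch that expresses each Inflection-2.5 score quoted in this article as a fraction of the corresponding GPT-4 score, then divides the claimed 94% performance level by the claimed 40% training compute. The selection of benchmarks, the unweighted mean, and the "performance per unit of compute" ratio are illustrative assumptions made here, not Inflection AI's methodology; the company's 94% figure is presumably based on its own set of benchmarks.

```python
# Scores as quoted in this article: (Inflection-2.5, GPT-4).
# Assumption: scores and percentiles are treated alike, purely for illustration.
REPORTED = {
    "MMLU": (85.5, 87.3),
    "Hungarian Math exam": (63.0, 68.0),
    "Physics GRE (percentile)": (85.0, 97.0),
    "GSM8K": (86.3, 92.0),
    "HumanEval (0-shot)": (73.8, 79.3),
}


def relative_scores(reported):
    """Return each Inflection-2.5 score as a fraction of the GPT-4 score."""
    return {name: ours / theirs for name, (ours, theirs) in reported.items()}


def mean_relative_score(reported):
    """Unweighted mean of the per-benchmark fractions (an assumed aggregate)."""
    ratios = relative_scores(reported)
    return sum(ratios.values()) / len(ratios)


def performance_per_compute(performance_fraction, flops_fraction):
    """Hypothetical metric: relative performance divided by relative training FLOPs."""
    return performance_fraction / flops_fraction


if __name__ == "__main__":
    for name, ratio in relative_scores(REPORTED).items():
        print(f"{name}: {ratio:.1%} of GPT-4")
    print(f"Unweighted mean: {mean_relative_score(REPORTED):.1%}")
    # Company-stated figures: 94% of GPT-4 performance at 40% of the training FLOPs.
    print(f"Performance per unit of compute: {performance_per_compute(0.94, 0.40):.2f}x")
```

On the five results quoted here, the unweighted mean comes out at roughly 93%, close to the company's headline number even though it rests on a different set of benchmarks. Under the same assumptions, the stated figures imply about 2.35 times GPT-4's performance per unit of training compute.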

Similar to GPT-4, this model incorporates real-time web search functionality, giving users access to the most current information on various topics. This enhancement is crucial as the company positions the Pi assistant as an AI accessible to all. However, the quality of results that rely on web retrieval is hard to judge, since no benchmark currently assesses this aspect.

Accessing Inflection-2.5

Inflection AI has already deployed the new model for its Pi chatbot, enabling users to begin exploring its capabilities. Although specific benefits to users from the enhanced model remain undisclosed, the company indicated that the update has significantly impacted user sentiment, engagement, and retention, fostering organic growth of the chatbot’s user base.

Presently, the Pi chatbot, available on Android, iOS, web, and as a desktop application, boasts one million daily active users and six million monthly active users, with over four billion messages exchanged with the AI, and an average conversation duration of 33 minutes.


About Post Author

Chris Jones

Hey there! 👋 I'm Chris, 34 yo from Toronto (CA), I'm a journalist with a PhD in journalism and mass communication. For 5 years, I worked for some local publications as an envoy and reporter. Today, I work as 'content publisher' for InformOverload. 📰🌐 Passionate about global news, I cover a wide range of topics including technology, business, healthcare, sports, finance, and more. If you want to know more or interact with me, visit my social channels, or send me a message.