Summarize by Aili

Announcing Higgs Llama V2

https://boson.ai/higgs-v2/?utm_source=tldrai

🌈 Abstract

The article discusses Boson AI's work on developing intelligent agents that can serve as human companions and helpers. It introduces a new model called Higgs-Llama-3-70B-v2, which significantly improves upon its predecessor. The article highlights the model's performance on benchmarks relevant for dialog, interaction, and understanding, and mentions the partnership with the roleplay community to evaluate the model. It also discusses the improvements made to the model's judging system, which guides the model's alignment through synthetic feedback signals.

🙋 Q&A

[01] Introduction

1. What is Boson AI working on?

Boson AI is working on developing intelligent agents that can serve as human companions and helpers.

2. What is the new model introduced in the article?

The new model introduced is called Higgs-Llama-3-70B-v2, which is an improved version of its predecessor.

3. How does the new model perform on relevant benchmarks?

The new model narrows the gap to the very best proprietary models on benchmarks relevant for dialog, interaction, and understanding, such as Arena-Hard, AlpacaEval 2.0, and MMLU-Pro.

4. How did Boson AI evaluate the new model?

Boson AI partnered with the roleplay community and collected 6.2M dialogues in a 2-week A/B test to evaluate Higgs v2 directly against other models.

[02] Improvements in the new model

1. What are the key improvements in the new model?

The new model has an improved judging system that guides the model's alignment through synthetic feedback signals.
Boson AI built an in-house LLM reward model, Higgs Judger, to evaluate model outputs. It ties with the best generative judger, Google's Gemini 1.5 Pro, on the Reward Bench leaderboard.
Higgs Judger also learns players' preferences during roleplay from the feedback users provide.

2. What are the performance improvements of the new model?

Compared to Claude 3.5 Sonnet, Higgs v2 reduces the response regeneration rate by 21.6% and increases the day 1 retention rate by 5.3%.
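
For readers unfamiliar with these A/B metrics, here is a minimal sketch of how a regeneration rate, a day 1 retention rate, and the change between two arms might be computed. The summary does not say whether the reported percentages are relative or absolute changes; this sketch assumes relative changes. All function names and counts are hypothetical illustrations, not figures from the article.

```python
def rate(events: int, total: int) -> float:
    """Fraction of `total` in which the event occurred (e.g. regenerated responses)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return events / total


def relative_change(baseline: float, treatment: float) -> float:
    """Relative change of treatment versus baseline; negative means a reduction."""
    if baseline == 0:
        raise ValueError("baseline rate must be non-zero")
    return (treatment - baseline) / baseline


# Hypothetical counts, chosen only to illustrate the arithmetic.
baseline_regen = rate(events=200, total=1000)    # arm A: regenerated responses / all responses
treatment_regen = rate(events=150, total=1000)   # arm B
regen_change = relative_change(baseline_regen, treatment_regen)

baseline_d1 = rate(events=400, total=1000)       # arm A: users returning on day 1 / new users
treatment_d1 = rate(events=440, total=1000)      # arm B
d1_change = relative_change(baseline_d1, treatment_d1)

print(f"regeneration rate change: {regen_change:+.1%}")
print(f"day 1 retention change:   {d1_change:+.1%}")
```

Under this reading, a lower regeneration rate suggests users accept the first response more often, and a higher day 1 retention rate suggests more new users return the following day.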

[03] Access to the new model

1. What is the current status of the new model?

Boson AI is conducting more evaluations before the final release of the Higgs-Llama-3-70B-v2 model.

2. How can users access the new model?

Users who would like early access to Higgs v2, or who want to customize it, can contact Boson AI at api@boson.ai.

Shared by Daniel Chen
