Reported 9 months ago
On June 27, 2024, Hugging Face co-founder Clem Delangue announced that the instruction-tuned version of Alibaba's latest open-source model, Qwen2-72B, had topped the open-source model leaderboard. The evaluation ran on 300 Nvidia H100 GPUs and covered more than 100 mainstream open-source large models, including Qwen2, Llama-3, Mixtral, and Phi-3, on demanding benchmarks such as BBH, MuSR, MMLU-Pro, and GPQA. The assessment aims to provide a fair and accurate ranking. Because developers had come to rely too heavily on leaderboard rankings when training models, the evaluation was made harder so that it tests real-world performance under more challenging conditions. The instruction-tuned Qwen2-72B took first place, followed by Meta's Llama-3-70B, with the Qwen2-72B base model in third.
Source: YAHOO