LLM Evaluation

Evaluation of LLMs: LLMs are becoming increasingly powerful and can tackle many tasks (e.g., math problems, content generation). However, evaluating LLMs remains a major challenge. A statistical approach to model evaluations (a view from Anthropic): Suppose one AI model outperforms another on a benchmark of interest, such as one testing general knowledge or the ability to solve coding questions. Is this difference in capabilities real? Or could one model simply have gotten lucky in the choice of questions on the benchmark?...

April 11, 2024 · 1 min · Loong

LLM Inference

Why LLM inference runs slowly. Excellent solutions: Cerebras. Figure 1. LLaMA3.1-70B inference speed with different solutions. (Image source: Artificial Analysis)

February 22, 2024 · 1 min · Loong