The "pass-at-k" (pass@k) metric is used to evaluate the performance of predictive models, such as large language models (LLMs). It measures the likelihood that the correct answer or an acceptable ...
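The excerpt above is cut off before it gives a formula, so the following is only a minimal sketch under one common reading: pass@k as the probability that at least one of k samples, drawn without replacement from n generated samples of which c are correct, is correct. Under that assumption the estimate is 1 - C(n-c, k) / C(n, k). The function name `pass_at_k` and its parameters `n`, `c`, and `k` are illustrative and do not come from the source.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem (illustrative helper, not from the source).

    n: total number of samples generated for the problem
    c: number of those samples judged correct
    k: number of samples the metric is allowed to draw
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Probability that all k drawn samples are incorrect: C(n - c, k) / C(n, k).
    # math.comb returns 0 when k > n - c, so the result is 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Example: 10 samples, 3 correct, evaluated at k = 1 and k = 5.
    print(pass_at_k(10, 3, 1))
    print(pass_at_k(10, 3, 5))
```

For a benchmark with many problems, one would typically average this per-problem value across all problems. With k = 1 the estimate reduces to the plain fraction of correct samples, c / n.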