• 17 Posts
  • 350 Comments
Joined 22 days ago
Cake day: February 10th, 2026

  • pkjqpg1h@lemmy.zip to Fuck AI@lemmy.world • bro
    English · 9 upvotes · 2 days ago

    According to the AA-Omniscience benchmark, among the most expensive models:

    Opus 4.6 has a 60% hallucination rate and a 46% accuracy rate. Gemini 3.1 Pro Preview has a 50% hallucination rate and a 55% accuracy rate.

    And the questions aren’t even open-ended.

    I don’t even need to tell you about the other models.
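
For anyone wondering how a 60% hallucination rate and a 46% accuracy rate can both hold for the same model, here is a rough sketch of the arithmetic. It assumes the hallucination rate is measured only over the questions the model did not get right (wrong answers divided by everything that was not a correct answer). That definition is an assumption on my part and is not stated in the comment above, and `split_outcomes` is just a hypothetical helper name.

```python
def split_outcomes(accuracy, hallucination_rate):
    """Split all questions into correct / wrong / other shares.

    ASSUMPTION (not from the comment): hallucination_rate is the share of
    wrong answers among the questions that were NOT answered correctly.
    "other" then covers abstentions and anything else that is not correct.
    """
    not_correct = 1.0 - accuracy
    wrong = hallucination_rate * not_correct
    other = not_correct - wrong
    return {"correct": accuracy, "wrong": wrong, "other": other}


# Figures quoted in the comment, expressed as fractions.
quoted = [
    ("Opus 4.6", 0.46, 0.60),
    ("Gemini 3.1 Pro Preview", 0.55, 0.50),
]

for name, accuracy, hallucination_rate in quoted:
    shares = split_outcomes(accuracy, hallucination_rate)
    print(name, {key: round(value, 3) for key, value in shares.items()})
```

If that definition is the right one, the two percentages describe different denominators, so they are not supposed to add up to 100%.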