this post was submitted on 10 Apr 2024
U.S. News
Ugh... I'm deep in the AI sphere, and this seems like a bad idea to me. GPT (let's face it, they are probably using OpenAI) can be deeply biased and arbitrary in its evaluations.
For example, "Two apples and four oranges" might score better than "4 oranges and 2 apples" for inscrutable reasons. Say the question spelled out the numbers and the LLM has a weighted bias toward overall textual consistency. It might then produce a reason to dock points that looks unrelated to that bias, such as "incomplete sentence" for the second answer but not the first.
Students may also receive lower scores due to cultural biases against certain phrases, or due to factors as straightforward as their name.
Finally, AI will hallucinate errors constantly if you ask it to evaluate text without any errors. Constantly. Consistently.
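If you want to check a grader for the apples-and-oranges problem above, here's a rough sketch of how I'd do it: feed it answers that should all earn the same score and look at the spread. Everything named here is made up for illustration. `toy_grader` is a hypothetical stand-in that fakes the "textual consistency" bias (it is not a real LLM call), and `invariance_gap` is just a helper name I picked.

```python
from typing import Callable


def invariance_gap(
    grade: Callable[[str, str], float],
    question: str,
    variants: list[str],
) -> float:
    """Return max minus min score across answers that should score the same."""
    scores = [grade(question, answer) for answer in variants]
    return max(scores) - min(scores)


def toy_grader(question: str, answer: str) -> float:
    # Hypothetical stand-in for an LLM grader (assumption, not a real API).
    # It docks a point when the answer's number style (digits vs. words)
    # differs from the question's, mimicking a consistency bias.
    score = 10.0
    question_has_digits = any(ch.isdigit() for ch in question)
    answer_has_digits = any(ch.isdigit() for ch in answer)
    if question_has_digits != answer_has_digits:
        score -= 1.0
    return score


question = "The basket holds two apples and four oranges. What is in the basket?"
variants = [
    "Two apples and four oranges.",
    "4 oranges and 2 apples.",
]

# Both answers are equally correct, so a fair grader should give a gap of 0.
gap = invariance_gap(toy_grader, question, variants)
print(f"score gap across equivalent answers: {gap}")
```

With a real grader you'd swap `toy_grader` for whatever function calls the model, and run each variant several times, since the scores can also drift between runs on identical input.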