GPT-4 didn't ace the bar exam after all, MIT research suggests — it barely passed

  • 📰 LiveScience


Ben Turner is a U.K.-based staff writer at Live Science. He covers physics and astronomy, as well as tech and climate change. He graduated from University College London with a degree in particle physics before training as a journalist.

OpenAI, the company behind the large language model that powers its chatbot ChatGPT, made the claim — that GPT-4 had scored in the top 10% of bar exam takers — in March last year, and the announcement sent shock waves around the web and the legal profession.

But it turns out the artificial intelligence model scored in the top 10% only when compared with repeat test takers. When Martínez compared the model's performance against a broader pool, the LLM scored in the 69th percentile of all test takers and in the 48th percentile of those taking the test for the first time.

To investigate the results further, Martínez had GPT-4 retake the test according to the parameters set by the authors of the original study. The UBE typically consists of three components: the multiple-choice Multistate Bar Examination; the Multistate Performance Test, which requires examinees to perform various lawyering tasks; and the written Multistate Essay Examination.

"Although the leap from GPT-3.5 was undoubtedly impressive and very much worthy of attention, the fact that GPT-4 particularly struggled on essay writing compared to practicing lawyers indicates that large language models, at least on their own, struggle on tasks that more closely resemble what a lawyer does on a daily basis," Martínez said.
