Those tests using fairly straightforward maths problems are revealing, because the mistakes and fabrications can be identified and shown for what they are. But it's worrying if the AI tool is being used to produce outputs relating to more loosely defined real-world subjects. There it may be much less obvious when the output is simply nonsense, because it may appear plausible.
AI megathread
OpenAI’s dirty December o3 demo doesn’t readily replicate
Gary Marcus wrote:

Quote: Later, after I wrote that piece, I discovered that one of their demos, on FrontierMath, was fishy in a different way: OpenAI had privileged access to data their competitors didn't have, but didn't acknowledge this. They also (if I recall) failed to disclose their financial contributions in developing the test. And then a couple of weeks ago we all saw that current models struggled mightily on the USA Math Olympiad problems that were fresh out of the oven, hence hard to prepare for in advance.
'Historically, we may regard materialism as a system of dogma set up to combat orthodox dogma...Accordingly we find that, as ancient orthodoxies disintegrate, materialism more and more gives way to scepticism.'
- Bertrand Russell