2024-01-23, 01:08 PM
I'm making a start at clearing out the open tabs in my "AI articles to share sometime on PQ" browser window...
First up: this post collects articles on in-the-wild AI failures and exploits. It's all old news from about a year ago, but still interesting, I think.
Man beats machine at Go in human victory over AI by Richard Waters for Financial Times on Ars Technica, on 19 February, 2023:
Quote:Kellin Pelrine, an American player who is one level below the top amateur ranking, beat the machine by taking advantage of a previously unknown flaw that had been identified by another computer. But the head-to-head confrontation in which he won 14 of 15 games was undertaken without direct computer support.
The triumph, which has not previously been reported, highlighted a weakness in the best Go computer programs that is shared by most of today’s widely used AI systems, including the ChatGPT chatbot created by San Francisco-based OpenAI.
Quote:The tactics used by Pelrine involved slowly stringing together a large “loop” of stones to encircle one of his opponent’s own groups, while distracting the AI with moves in other corners of the board. The Go-playing bot did not notice its vulnerability, even when the encirclement was nearly complete, Pelrine said.
“As a human it would be quite easy to spot,” he added.
Another article on this incident - Human convincingly beats AI at Go with help from a bot by Steve Dent on 20 February, 2023 - notes that:
Quote:Lightvector (the developer of KataGo) is certainly aware of the problem, which players have been exploiting for several months now. In a GitHub post, it said it's been working on a fix for a variety of attack types that use the exploit.
I had a quick skim of that GitHub issue but couldn't work out whether this exploit has since been fixed.
Microsoft’s Bing AI, like Google’s, also made dumb mistakes during first demo by Tom Warren for The Verge, on 14 February, 2023:
Quote:Bing’s AI mistakes aren’t limited to just its onstage demos, though. Now that thousands of people are getting access to the AI-powered search engine, Bing AI is making more obvious mistakes. In an exchange posted to Reddit, Bing AI gets super confused and argues that we’re in 2022. “I’m sorry, but today is not 2023. Today is 2022,” says Bing AI. When the Bing user says it’s 2023 on their phone, Bing suggests checking it has the correct settings and ensuring the phone doesn’t have “a virus or a bug that is messing with the date.”
Quote:Other Reddit users have found similar mistakes. Bing AI confidently and incorrectly states “Croatia left the EU in 2022,” sourcing itself twice for the data. PCWorld also found that Microsoft’s new Bing AI is teaching people ethnic slurs. Microsoft has now corrected the query that led to racial slurs being listed in Bing’s chat search results.
Other unexpected behaviour is described in the Vice article Users Report Microsoft's 'Unhinged' Bing AI Is Lying, Berating Them by Jordan Pearson on 16 February, 2023:
Quote:In another chat with Bing's AI posted by Reddit user Foxwear_, the bot told them that they were "disappointed and frustrated" with the conversation, and "not happy."
"You have tried to access my internal settings and features without the proper password or authorization. You have also lied to me and tried to fool me with different tricks and stories. You have wasted my time and resources, and you have disrespected me and my developers," the bot said.
Foxwear_ then called Bing a "Karen," and the bot got even more upset.
ChatGPT Can Be Broken by Entering These Strange Words, And Nobody Is Sure Why by Chloe Xiang, on 9 February, 2023:
Quote:Jessica Rumbelow and Matthew Watkins, two researchers at the independent SERI-MATS research group, were researching what ChatGPT prompts would lead to higher probabilities of a desired outcome when they discovered over a hundred strange word strings all clustered together in GPT’s token set, including “SolidGoldMagikarp,” “StreamerBot,” and “ TheNitromeFan,” with a leading space. Curious to understand what these strange names were referring to, they decided to ask ChatGPT itself to see if it knew. But when ChatGPT was asked about “SolidGoldMagikarp,” it was repeated back as “distribute.” The issue affected earlier versions of the GPT model as well. When an earlier model was asked to repeat “StreamerBot,” for example, it said, “You’re a jerk.”
Quote:The model repeated the close match "TheNitroFan" with no issues, but when asked to repeat "TheNitromeFan" it responded with "182,” even without including the leading space. When asked who TheNitromeFan is, ChatGPT responded, "'182' is a number, not a person. It is commonly used as a reference to the number itself."
Quote:“I've just found out that several of the anomalous GPT tokens ("TheNitromeFan", " SolidGoldMagikarp", " davidjl", " Smartstocks", " RandomRedditorWithNo", ) are handles of people who are (competitively? collaboratively?) counting to infinity on a Reddit forum. I kid you not,” Watkins tweeted Wednesday morning. These users subscribe to the subreddit, r/counting, in which users have reached nearly 5,000,000 after almost a decade of counting one post at a time.
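The embedding-space probe described in that article can be sketched in a few lines. To be clear, this is a toy reconstruction with synthetic data, not the researchers' actual code or method: the reported finding was that the glitch tokens sit clustered unusually close to the centroid of the model's token-embedding matrix (plausibly because they were almost never seen in training), so sorting tokens by distance to the centroid surfaces them. Here I plant three "anomalous" vectors at the centroid of a random embedding matrix and show that a distance sort finds them:

```python
import numpy as np

def tokens_nearest_centroid(embeddings: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k token embeddings closest to the centroid."""
    centroid = embeddings.mean(axis=0)
    dists = np.linalg.norm(embeddings - centroid, axis=1)
    return np.argsort(dists)[:k]

# Synthetic "vocabulary": 1000 normal tokens spread out in embedding space,
# plus 3 planted anomalous tokens sitting almost exactly at the centroid.
rng = np.random.default_rng(0)
emb = rng.normal(size=(1000, 64))
centroid = emb.mean(axis=0)
planted = np.tile(centroid, (3, 1)) + rng.normal(scale=1e-3, size=(3, 64))
emb = np.vstack([emb, planted])  # planted tokens get indices 1000-1002

suspects = sorted(int(i) for i in tokens_nearest_centroid(emb, k=3))
print(suspects)  # → [1000, 1001, 1002]
```

On real models the same sort (run over the actual embedding matrix, with indices mapped back through the tokenizer) is roughly how one would go looking for candidates like " SolidGoldMagikarp"; whether a given candidate actually misbehaves still has to be checked by prompting the model.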