If a text-sending algorithm can pass itself off as a human instead of a chatbot, its message would be more credible. Therefore, human-seeming chatbots with well-crafted online identities could start scattering fake news that seem plausible, for instance making false claims during a presidential election. With enough chatbots, it might be even possible to achieve artificial social proof.
[In] artificial intelligence ... machines are made to behave in wondrous ways, often sufficient to dazzle even the most experienced observer. But once a particular program is unmasked, once its inner workings are explained ... its magic crumbles away; it stands revealed as a mere collection of procedures ... The observer says to himself "I could have written that". With that thought he moves the program in question from the shelf marked "intelligent", to that reserved for curios ... The object of this paper is to cause just such a re-evaluation of the program about to be "explained". Few programs ever needed it more.
ALICE – which stands for Artificial Linguistic Internet Computer Entity, an acronym that could have been lifted straight out of an episode of The X-Files – was developed and launched by creator Dr. Richard Wallace way back in the dark days of the early Internet in 1995. (As you can see in the image above, the website’s aesthetic remains virtually unchanged since that time, a powerful reminder of how far web design has come.)
The first formal instantiation of a Turing Test for machine intelligence is a Loebner Prize and has been organized since 1991. In a typical setup, there are three areas: the computer area with typically 3-5 computers, each running a stand-alone version (i.e. not connected with the internet) of the participating chatbot, an area for the human judges, typically four persons, and another area for the ‘confederates’, typically 3-5 voluntary humans, dependent on the number of chatbot participants. The human judges, working on their own terminal separated from one another, engage in a conversation with a human or a computer through the terminal, not knowing whether they are connected to a computer or a human. Then, they simply start to interact. The organizing committee requires that conversations are restricted to a single topic. The task for the human judges is to recognize chatbot responses and distinguish them from conversations with humans. If the judges cannot reliably distinguish the chatbot from the human, the chatbot is said to have passed the test.
A.L.I.C.E. was written within the frame of Artificial Intelligence Markup Language (AIML), an open standard for creating any kind of chatbot, also developed by Wallace. Most AIML interpreters are offered under a free or open source license. Therefore, many “Alicebot clones” populate the internet, having been created based upon the original implementation of A.L.I.C.E. and its AIML knowledge base. This video shows a speech as given by dr. Wallace about A.L.I.C.E., AIML and the chatbot history in general.
In a particularly alarming example of unexpected consequences, the bots soon began to devise their own language – in a sense. After being online for a short time, researchers discovered that their bots had begun to deviate significantly from pre-programmed conversational pathways and were responding to users (and each other) in an increasingly strange way, ultimately creating their own language without any human input.
Enter Roof Ai, a chatbot that helps real-estate marketers to automate interacting with potential leads and lead assignment via social media. The bot identifies potential leads via Facebook, then responds almost instantaneously in a friendly, helpful, and conversational tone that closely resembles that of a real person. Based on user input, Roof Ai prompts potential leads to provide a little more information, before automatically assigning the lead to a sales agent.
The idea was to permit Tay to “learn” about the nuances of human conversation by monitoring and interacting with real people online. Unfortunately, it didn’t take long for Tay to figure out that Twitter is a towering garbage-fire of awfulness, which resulted in the Twitter bot claiming that “Hitler did nothing wrong,” using a wide range of colorful expletives, and encouraging casual drug use. While some of Tay’s tweets were “original,” in that Tay composed them itself, many were actually the result of the bot’s “repeat back to me” function, meaning users could literally make the poor bot say whatever disgusting remarks they wanted.
Sometimes it is hard to discover if a conversational partner on the other end is a real person or a chatbot. In fact, it is getting harder as technology progresses. A well-known way to measure the chatbot intelligence in a more or less objective manner is the so-called Turing Test. This test determines how well a chatbot is capable of appearing like a real person by giving responses indistinguishable from a human’s response.
The classic historic early chatbots are ELIZA (1966) and PARRY (1972). More recent notable programs include A.L.I.C.E., Jabberwacky and D.U.D.E (Agence Nationale de la Recherche and CNRS 2006). While ELIZA and PARRY were used exclusively to simulate typed conversation, many chatbots now include functional features such as games and web searching abilities. In 1984, a book called The Policeman's Beard is Half Constructed was published, allegedly written by the chatbot Racter (though the program as released would not have been capable of doing so).