Earlier, I made a rather lazy joke with a reference to the Terminator movie franchise, in which an artificial intelligence system known as Skynet becomes self-aware and identifies the human race as the greatest threat to its own survival, triggering a global nuclear war by preemptively launching the missiles under its command at cities around the world. (If by some miracle you haven’t seen any of the Terminator movies, the first two are excellent but I’d strongly advise steering clear of later entries in the franchise.)
The idea was to permit Tay to “learn” about the nuances of human conversation by monitoring and interacting with real people online. Unfortunately, it didn’t take long for Tay to figure out that Twitter is a towering garbage-fire of awfulness, which resulted in the Twitter bot claiming that “Hitler did nothing wrong,” using a wide range of colorful expletives, and encouraging casual drug use. While some of Tay’s tweets were “original,” in that Tay composed them itself, many were actually the result of the bot’s “repeat back to me” function, meaning users could literally make the poor bot say whatever disgusting remarks they wanted.
If a text-sending algorithm can pass itself off as a human instead of a chatbot, its message would be more credible. Therefore, human-seeming chatbots with well-crafted online identities could start scattering fake news that seem plausible, for instance making false claims during a presidential election. With enough chatbots, it might be even possible to achieve artificial social proof.
The most widely used anti-bot technique is the use of CAPTCHA, which is a form of Turing test used to distinguish between a human user and a less-sophisticated AI-powered bot, by the use of graphically-encoded human-readable text. Examples of providers include Recaptcha, and commercial companies such as Minteye, Solve Media, and NuCaptcha. Captchas, however, are not foolproof in preventing bots as they can often be circumvented by computer character recognition, security holes, and even by outsourcing captcha solving to cheap laborers.
Efforts by servers hosting websites to counteract bots vary. Servers may choose to outline rules on the behaviour of internet bots by implementing a robots.txt file: this file is simply text stating the rules governing a bot's behaviour on that server. Any bot that does not follow these rules when interacting with (or 'spidering') any server should, in theory, be denied access to, or removed from, the affected website. If the only rule implementation by a server is a posted text file with no associated program/software/app, then adhering to those rules is entirely voluntary – in reality there is no way to enforce those rules, or even to ensure that a bot's creator or implementer acknowledges, or even reads, the robots.txt file contents. Some bots are "good" – e.g. search engine spiders – while others can be used to launch malicious and harsh attacks, most notably, in political campaigns.
“Beware though, bots have the illusion of simplicity on the front end but there are many hurdles to overcome to create a great experience. So much work to be done. Analytics, flow optimization, keeping up with ever changing platforms that have no standard. For deeper integrations and real commerce like Assist powers, you have error checking, integrations to APIs, routing and escalation to live human support, understanding NLP, no back buttons, no home button, etc etc. We have to unlearn everything we learned the past 20 years to create an amazing experience in this new browser.” — Shane Mac, CEO of Assist