Let’s imagine two near-identical scenarios. In both scenarios you go online to a service provider – for example, a utility company – open up a live chat service, ask a question, and receive an answer. Both cases take exactly the same amount of time and give exactly the same information. The only difference is that in the first case (call it Case A) you were talking to a person, while in Case B you were interacting with an artificially intelligent chatbot.
The question is, does it make any difference?
The Turing Test
The Turing test is one of the most famous thought experiments in philosophy, and the paper that introduced it is one of the most referenced in academia. The pioneering computer scientist Alan Turing posed a simple test called the Imitation Game. (His life was dramatised in the multi-award-winning biopic The Imitation Game.) In this test, there are three subjects. We will call them Alice, Bernard, and Charlie. Alice cannot see either of the other two, so she must put questions to both Bernard and Charlie via written notes, which are passed through the doors of two separate rooms. Bernard and Charlie must each write out their answers and pass them back to Alice. However, there is a twist: either Bernard or Charlie may in fact be an artificially intelligent computer that is communicating with Alice through a human. It is up to Alice to try to figure out which, if any, of the other subjects is actually a non-human.
The purpose of this is to answer a fundamental question: ‘Can machines think?’ However, Turing recognised that this was a near-impossible question to answer, because we can never agree on what it means to ‘think’ in any meaningful way. Instead, the Imitation Game seeks to answer the simpler question ‘Are there imaginable digital computers which would do well in the imitation game?’ Turing thought that there were, and he has been proved right, if perhaps a little optimistic about the timescale. In 1950 he wrote, “I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted.”
The key to Turing’s test is that it does not penalise the computer for its inability to “shine in beauty competitions”, or the human for their inability to perform quick mathematics. Instead, it is a judgement on the ability of a computer to appear indistinguishable from a human. This is mostly discussed in terms of language, but it can be applied to AI of almost any function.
We are now nearing the age that Turing imagined, where machines could be made indistinguishable from humans, and we can run our own Turing tests on AI to find out how well it can mimic human behaviour. Many chatbots have a whole range of responses programmed into them that seem irrelevant to their function, but these are often a way of preventing a human from being fully aware that they are talking to a bot. Other bots are programmed to make light of the fact that they are bots. Siri, for example, will make jokes or explicitly mention that it is a bot, whereas some AI chatbots will try to hide it.
Computers are very good at fulfilling highly specific tasks, and so there are some tasks where it can be naturally difficult to know whether you are interacting with a bot or a human. Chess master Garry Kasparov claimed that IBM’s Deep Blue, a computer designed to beat him at chess, showed ‘elements of deep intelligence and creativity’. To him, this suggested that humans were actually intervening in the computer’s play, rather than the machine relying on its existing code. While an expert like Kasparov might be able to identify subtleties in chess-playing, most of us are not sufficiently knowledgeable to subject a chess computer to a Turing test.
On the other hand, almost all humans are experts in language. We are extremely skilled at interpreting language and understanding phrases even when there are errors in the construction of a sentence (syntactic) or in word choice (semantic). Computers are currently far worse at this, and it takes a lot of time and money to get them up to scratch.
The Loebner Prize, awarded to the most convincing chatbot, uses a series of questions to assess the ability of a bot to replicate human function. What is interesting is that many of the questions it asks are very easy for humans to understand and respond to, but extremely difficult for bots.
These can be as simple as “How many letters are in the word ‘abracadabra’?” or “If a chicken roosts with a fox they may be eaten. What may be eaten?”. As bots continue to use machine learning to fine-tune their responses, their performance on questions like these will undoubtedly improve, but it is a good demonstration of the current limitations of the technology.
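To see why a question like the ‘abracadabra’ one is awkward for a bot, it helps to notice that the counting itself is trivial for a computer; the hard part is recognising what is being asked. The short Python sketch below is purely illustrative and is not how any Loebner Prize entrant works. The pattern `LETTER_COUNT` and the function `answer_letter_count` are hypothetical names invented for this example, and the pattern only recognises one exact phrasing of the question.

```python
import re
from typing import Optional

# Hypothetical pattern for illustration: it only matches this one phrasing.
LETTER_COUNT = re.compile(
    r"how many letters are in the word\s+\W?(\w+)", re.IGNORECASE
)


def answer_letter_count(question: str) -> Optional[str]:
    """Answer a letter-counting question, or return None if it is not recognised."""
    match = LETTER_COUNT.search(question)
    if match is None:
        # The bot has no idea what was asked; counting cannot help here.
        return None
    word = match.group(1)
    # Once the word has been extracted, the counting is a single call.
    return f"There are {len(word)} letters in '{word}'."


print(answer_letter_count("How many letters are in the word 'abracadabra'?"))
print(answer_letter_count(
    "If a chicken roosts with a fox they may be eaten. What may be eaten?"
))
```

The first question is answered because it happens to fit the pattern. The second returns `None`: working out who ‘they’ refers to needs an understanding of chickens and foxes that no pattern of this kind supplies.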
AI Customer Service Chatbots and User Experience
When it comes to customer service there is a question about what service we want to receive. From a user’s perspective we want to get the most salient information, as quickly as possible, so that we can solve whatever problem we had and get on with the rest of our day. We all know how annoying it can be to be stuck on hold listening to a tinny rendition of Bach!
However, when it comes to AI chatbots there seems to be a certain squeamishness about using them that goes deeper than any faults they might currently have. Going back to our original problem of Case A and Case B, some people might still say that it does matter if one of them is a bot and one a human, even if you cannot tell the difference. Part of this may be that they want an authentic ‘human’ experience, which they feel a bot cannot deliver. Even when AI technology is perfectly capable of replicating human interaction with things like jokes and dialect matching, some people still prefer to talk to a human.
Another reason that some people are anxious about the roll-out of AI is that they might feel they cannot be sure who they are talking to. This is especially true in cases where we might be asked to hand over sensitive information and data, where as customers it is important that we know who is on the other end. Customers tend to be cautious, particularly with new and unfamiliar technology, so many people will need reassurance that a bot they are talking to is secure.
Should customer service chatbots appear human?
This year Google launched an automated personal voice assistant, Google Duplex, which is able to place calls to do things like book a table at a restaurant or a haircut. However, it sparked controversy, with many people questioning whether Google should explicitly tell the person receiving the call that they are talking to a bot. Duplex’s ability to replicate natural language, even including fillers like ‘um’ and ‘er’ to make it sound more human, is undoubtedly impressive, and because of this it is extremely difficult to tell whether it is in fact a human or a bot. The only indication that you might be speaking to a bot is the introductory ‘I’m calling on behalf of Google, this call might be recorded’.
Of course, AI customer service bots do have the advantage that users are mostly not trying to trick them into revealing that they are bots, but are using them to get answers to queries. Deep Blue was excellent at chess but incapable of passing a verbal Turing Test. Machine intelligence is highly specific, which is why, even though very few bots can pass the Turing Test, we are able to use AI in much of our lives without being put off by its lack of humanity.
The alternative to imitating a human is to flag immediately that the bot is a bot. This is especially helpful with something like Siri or Alexa, where the user knows that they are not talking to a human, and conventions like a name and the bare bones of a personality can be very minimal and self-aware. It will also stop people from feeling that they have been misled into believing they are talking to a human rather than a bot.
As people become more familiar with AI technology, perhaps the need to pretend that bots are human will recede, in much the same way that we have become accustomed to other mass-produced items like clothes. Customers may quickly realise the benefits that AI chatbots can bring in solving queries that are routine or fairly simple, meaning that resources can be freed up to allow humans to deal with the more challenging and complex problems. Meanwhile, the technology will also improve so that it will become increasingly hard to tell the difference, ironically meaning that imitating a human will become both easier to do and less necessary.
Ultimately, if you are interested in using an AI chatbot, many of these problems are little more than academic, as the vast majority of users are interested in the end result. As a provider of customer service, you can use your chatbot to give them relevant, reliable, and quick answers to most of their problems. They can then be transferred to a specialist for anything that requires a greater level of expertise. Most people will not mind whether they are talking to a bot or a human, but most of us would certainly like to be able to tell. Perhaps the best way of dealing with the minority of people who will try to test the ability of your bot with trick questions is to disarm them by putting in jokes, as Siri does.
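As a rough illustration of that division of labour, here is a minimal Python sketch of a bot that states up front that it is a bot, answers routine queries itself, and hands anything else to a human. Everything in it is an assumption made for the example: the canned answers in `ROUTINE_ANSWERS`, the `DISCLOSURE` wording, and the simple keyword matching all stand in for whatever a real chatbot platform would provide.

```python
from dataclasses import dataclass

# Hypothetical canned answers; a real bot would draw on a knowledge base.
ROUTINE_ANSWERS = {
    "opening hours": "Our lines are open 9am to 5pm, Monday to Friday.",
    "meter reading": "You can submit a meter reading from your online account.",
}

# Hypothetical wording that tells the user they are talking to a bot.
DISCLOSURE = "You are chatting with an automated assistant."


@dataclass
class Reply:
    text: str
    handed_to_human: bool


def respond(message: str) -> Reply:
    """Answer routine queries directly; pass everything else to a specialist."""
    lowered = message.lower()
    for topic, answer in ROUTINE_ANSWERS.items():
        if topic in lowered:
            return Reply(f"{DISCLOSURE} {answer}", handed_to_human=False)
    # Nothing matched, so the query is treated as too complex for the bot.
    return Reply(
        f"{DISCLOSURE} I'm passing you to a specialist who can help.",
        handed_to_human=True,
    )


print(respond("What are your opening hours?").text)
print(respond("I think I have been charged twice this month.").text)
```

Keyword matching is the crudest possible way to decide what counts as routine, but the shape is the point: the user always knows they are talking to a bot, and a human is only brought in when the bot runs out of answers.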
Do you relish the greater use of AI chatbots, or do you have concerns about how they might work? Is it important to you that you know if you are talking to a robot or a human, or is it more important to quickly solve your problems? Why not tweet us and join in the conversation @webuild_bots.