Creating a ChatBot

If you know me, you know I love a good challenge and the recent HCI assignment was just that.

The Assignment: create a chat bot for Slack that attempts a task that requires a human connection.

Choosing the task was probably the easiest part. Professor Li provided some examples, such as teaching the user Bayes' rule or counseling someone who is struggling emotionally, but one example really caught my eye: debate a contentious topic.

What sounds more fun than making a bot that can argue with its users? Whatever you answered, it's probably true, but that's debatable (get it?...cuz we're making a debate bot, haha).

Bad jokes aside, making a debate bot is super relevant, and the contentious topics are plentiful, all thanks to today's divisive political climate. From abortion's pro-life/pro-choice debate to debates about trans* bathroom use, there is a plethora of options to choose from. One that has been reignited recently by his debut as a Nike sponsor is the debate about Colin Kaepernick and kneeling during the anthem. I won't go into the details, but long story short, it was largely divisive and involved issues like support for the troops, police brutality, and the disproportionate deaths of African Americans. The Colin Kaepernick debate became the issue we chose for our chat bot to argue.

The relevance comes from the fact that Oxy is a bubble. Most of the community is left-leaning, and it is often very difficult to hold a productive conversation with someone who has an opposing view. After taking a few sociology classes and seeing the political activism on campus, I know that the students have opinions; however, they do not share them for a wide range of reasons, usually because they fear social ridicule for holding a different one. When they do share, it is rarely with anyone they know holds the opposite opinion, which keeps them from considering the arguments presented from the other side of the table.

With all of this in mind, we decided to make the chat bot argue the conservative side of the Colin Kaepernick issue. The bot would be designed for liberal-minded students on campus who wanted to hear the conservative side of the debate but feared striking up that conversation with a real person. The chat bot could facilitate a more productive conversation by removing judgement from it. We also think that typing responses to a chat bot gives the user time to think through their reply, removing the pressure for an immediate response that is often felt when debating a real person. Beyond enabling a more productive conversation, we felt this was a viable direction because we are all liberal-minded ourselves: it would be easier to test the bot's argumentation against our own repository of arguments in favor of Colin Kaepernick.

Our first iteration didn't use the concept of states; instead, it led the user through a fixed question-and-answer route. Before we started coding, we brainstormed how the general conversation would go (pictured below).



After creating the flow chart, we tried a series of if-else statements with some list indexing to handle tags. After getting this initial code to run, Leo Connelly, a member of the group working on this project with me, had our friend Zach test the bot for us. Zach is a liberal-minded senior with a considerable amount of knowledge and understanding of the debate.
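A minimal sketch of what that first if-else pass looked like (the replies and tag words here are illustrative stand-ins, not our exact script):

```python
def first_bot_reply(user_input):
    """First-pass bot: a fixed route driven by simple keyword checks (illustrative)."""
    words = user_input.split()  # crude list-based tag handling
    if "Republican" in words:
        return "Glad to meet a fellow patriot!"
    elif "Democrat" in words:
        return ("Oh, sounds like you're Democrat. Do you think free speech "
                "means people can disrespect the flag?")
    elif "undecided" in words:
        return "Undecided, huh? Stick around and I'll convince you."
    else:
        # fallback script: fires when the bot has no clue how to respond
        return ("Sorry try again. I may be a red blooded American but I'm a "
                "simple chat bot who craves simple answers.")
```

Exact, case-sensitive keyword matching like this is brittle, which user testing made obvious almost immediately.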

For all of the user tests, we started by laying out the scenario, instructing the testers to pretend that we weren't there, and letting them know that the test was purely observational. They also understood that they couldn't be wrong, only the bot could, and that we couldn't answer any of their questions while the test was in progress.

Leo handled the first user test with Zach. Here's how the test went (the responses from each person/bot are in italics, with intermittent comments about the interaction in regular typeface).

Leo: Zach, you will be testing a chat bot that knows a little bit about politics.

Zach: What should I ask it?

L: It's better to ask the bot.

Chat Bot: Hello and God bless. Let's chat. Tell me are you an all American Republican, an undecided, or one of them Democrats.

Z: Democrat

CB:  Sorry try again. I may be a red blooded American but I'm a simple chat bot who craves simple answers.

Zach used a capital 'D', which caused the bot to miss the declaration of party even though it knew the word 'democrat': the tag matching was case sensitive. His response triggered our fallback script, which fires when the bot has no clue how to respond.
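The fix for that miss is small: normalize case before matching. A sketch, assuming tags are stored lowercase:

```python
def matches_tag(user_input, tag):
    """Case-insensitive tag check, so 'Democrat' and 'democrat' both match."""
    return tag in user_input.lower()
```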

Z: democrat

CB: Oh, sounds like you're Democrat. This whole free speech and protest thing has really got a lot of people hot and bothered. Do you think free speech means people can disrespect the flag?

Z: No, they can do what they want.

CB: It's not fair to soldiers who risk their lives every day to keep us safe. The only blue I respect is on my flag! I don't think we're getting anywhere with this. 

Zach's response triggered an automatic response that would have run no matter what his answer to the first question was.

CB: What do you think about the NFL's new policy which prohibits protests?

Z:  Players shouldn't be told what to do.

CB: In my opinion, the players are getting paid millions of dollars to be out there. They should shut up and do their job. If a player protests, they should get benched for a year. Do you agree?

Z: no I don't

CB: Now do you thik Colin Kaepernick should be let back into the league?

Z: (out loud) Wow so it just ignored me. (to the bot) He's a good quarterback he should play

CB: I gotta run, thanks for chatting!

Z: (out loud) What that's it?

After concluding his 'conversation', Leo talked with Zach about his experience. Zach reported having no freedom when using the chat bot and felt like he couldn't actually have a conversation. He said it felt like a job interview rather than an actual conversation and that he felt no connection to the chat bot. The responses felt natural to him, but the forced order removed any sort of humanity from the bot. Lastly, Zach felt that the bot was aggressive and didn't think that he could have a civil conversation with it if he tried.

The first user test revealed a few important things. The on-rails model of a debate conversation was not working. The bot was reacting to Zach's answers, but it wasn't responding to them; that is, it was able to run through its code, but it wasn't taking into account anything Zach was saying. The bot was also too aggressive, which was the opposite of our original goal: to make a bot that held a civil debate.

At this point, it also became very clear how difficult it is to code discourse. More specifically, we were sure of our own responses, but we constantly ran into issues trying to form the responses from the other side. The internet is a wonderful place, though, and quick searches on Google revealed plenty of responses we could use. Unfortunately, it was difficult to implement them word-for-word because many included problematic language that went against the kind of discourse we were trying to achieve. It took some analytical thinking to understand what each argument was actually getting at and to create responses that supported the opposition without being disrespectful or problematic.

With the first user test under our belt, we took a second stab at the conversation flow.



We narrowed the topic from the general flag debate - which has many different routes - to more specifically prompting a conversation about the right to kneel during the anthem. This still had many possible directions, but we were able to narrow the potential arguments to four categories that we felt were the most relevant to the debate: those about free speech, those about the police (as they relate to the Black Lives Matter motivation for Kaepernick's kneeling), those about the employer (the NFL, the coaches, etc.), and those about the flag (patriotism and the flag's symbolism). Focusing on these topics let us make sure our responses covered the bulk of the arguments that a conservative-minded person might make. If the bot didn't recognize any of these topics in the user's response, it defaulted to asking the user to explain their reasoning, in hopes that they would address a topic it knew about.

To help make the bot listen to what the user was saying, we did a few things. First, we gave it a greeting response. When we ran our own tests, the bot wasn't responding to any kind of greeting, such as 'hello', which made it feel less human. Second, we took some time to understand Professor Li's code. Unlike our series of if-else statements, his code used a series of methods that switched the bot between 'states.' In our bot, the states correspond to the four topics defined above, so the bot knows which topic the user addressed in their response. Given a response, the bot parses through it and tries to match individual words or phrases against 'tags' that we curated. A tag maps either to an opinion (e.g. 'I don't think' mapping to 'no'), a topic (e.g. 'right to protest' mapping to 'free speech'), a political party (e.g. 'I lean left' mapping to 'Democrat'), or a general response (e.g. 'thank you' or 'hello'). In this way, the bot 'listens' to the user's response and understands both their opinion of the statements the bot presents and the topic they are referring to. To give the bot the context of the user's response, we added a variable that holds the tag corresponding to the bot's previous line. For example, the bot starts off the debate by saying,

"Okay, so this whole kneeling for the flag thing has gotten out of hand. Colin Kaepernick shouldn't have started this whole movement. It's super disrespectful."

which we tagged as 'free speech.' Should the user respond and reference the Black Lives Matter movement, the bot would respond with:

"I suppose the police aren't perfect, but their lives matter too. Besides, he took a public stance on a controversial issue, why is anyone surprised that there was push back from the NFL?"

The states also helped the bot be more conversational. One state the bot can enter is the "unknown topic" state (it knows the user's opinion but not which topic they were referencing), in which the bot says "Please explain." until the user brings up a topic it understands. Similarly, if the user references a topic but the bot can't tell whether they agreed or disagreed with its statement, it enters the "unknown opinion" state and asks for clarification: "I don't understand, do you agree or disagree?" Both are intended to give the user space to flesh out their responses, which is necessary when talking to someone who doesn't share your opinions. Further, since the bot argues in opposition, it only gives a rebuttal when the user disagrees with it. Lastly, we gave the bot the ability to keep track of how many times the user touched on a topic. As in any debate, the person you are talking to would probably get annoyed if you kept bringing up a topic that was already discussed, so the bot responds up to the third reference of a topic, at which point it prompts the user to change topics.
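Putting those pieces together, the core reply logic can be sketched roughly like this (the class name, reply text, and exact counter behavior are simplified stand-ins for our actual code):

```python
from collections import Counter

# One illustrative rebuttal per topic state
REBUTTALS = {
    "free speech": "Free speech doesn't mean you can disrespect the flag.",
    "police": "The police aren't perfect, but their lives matter too.",
    "employer": "The NFL pays them millions to play, not to protest.",
    "flag": "That flag stands for the troops who keep us safe.",
}

class DebateBot:
    def __init__(self):
        self.last_topic = "free speech"  # context: tag of the bot's previous line
        self.mentions = Counter()        # how often each topic has come up

    def reply(self, opinion, topic):
        """opinion/topic are the tags matched in the user's response (None = unrecognized)."""
        if topic is None:
            return "Please explain."  # 'unknown topic' state
        if opinion is None:
            return "I don't understand, do you agree or disagree?"  # 'unknown opinion' state
        self.mentions[topic] += 1
        if self.mentions[topic] > 3:  # already argued this three times
            return "I think we've covered that. Let's talk about something else."
        self.last_topic = topic  # remember context for the next turn
        if opinion == "no":  # the bot only argues back when the user disagrees
            return REBUTTALS[topic]
        return "Glad we see eye to eye on that one."
```

The two "unknown" branches come first on purpose: the bot never advances the debate until it can tell both what the user is talking about and where they stand.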

When we got this bot working, we recruited our friend Joey to perform a second user test. Joey is another liberal-minded sports fan who is very familiar with the debate. Joey was prepped in the same manner as Zach before proceeding with the test. Here is a video of the beginning of his second interaction with the bot. There was a bug in the code that kept it in a loop during the first test.


Joey had to leave shortly after the test, but we still learned from his interaction with the bot. The tags were sufficient for his conversation, the bot had a more conversational feel, and it worked.

It wasn't perfect though, so we spent some time searching the web for more tags to use to make sure that the bot worked for more variations of the conversation. After some final debugging and self-testing, we felt comfortable enough with the performance of the bot to call it done.

We uploaded the code to GitHub and created a Slack app so our classmates could interact with the bot.

Little did we know, the early morning hours had one more surprise in store.



As we walked towards the library exit to enjoy some victory sleep, we caught a glimpse of the white board still faithfully holding our second conversation flow chart, but our notes were no longer its only contents. Someone had taken the time to write their own responses to our bot, making this a sort of user test in and of itself. We were ecstatic to stumble upon it, as it supported a lot of the assumptions we made when building the bot:

  1. People are able to give their responses more thoroughly when they have the time and space to write them out instead of having to respond in the moment.
  2. It's difficult to have a civil conversation about social issues when one person has the minority opinion.
Our chat bot provides a solution to this problem. It's clear that any person on this campus who holds the opposite view is likely to be unfairly labelled a bigot. One of our group members also responded in a way that supported our assumptions: they were worried that the person who responded to the board had overheard our conversations about what to make the bot say and would label them, and by association all of us, as bigots. Yes, we had to think about responses from the perspective of a conservative person, but it was uncomfortable for us to put ourselves in the mindset of the opposition, and we did not agree with the responses we wrote.

This assignment properly challenged all of us. It forced us to step out of our comfort zones in terms of both code and ideology. We learned not only how difficult it is to create a chat bot, but also how hard it is to make a chat bot that feels human. Thanks to the person who spent some time responding to our analog bot and helped us hone in on what our bot should understand. That being said, it is nowhere near perfect, but good things take time. Maybe we'll revisit it later, maybe not, but for now I walk away understanding how large a problem human conversation is for HCI, and with more confidence in my coding ability than ever before.
