Chatbot Tutorial

14 Apr 2014
Tutorial on making an artificial intelligence chatbot

Overview

A step by step guide to implement your own Artificial Intelligence chatbot.

Table of contents

  1. Introduction - Chatbot description (first example)
  2. Introducing keywords and stimulus response
  3. Preprocessing the user's input and repetition control
  4. A more flexible way for matching the inputs
  5. Using classes for a better implementation
  6. Controlling repetition made by the user
  7. Using "states" to represent different events
  8. Keyword boundaries concept
  9. Using Signon messages
  10. "Keyword Ranking" concept
  11. Keyword equivalence concept
  12. Transposition and template response
  13. Keyword location concept
  14. Handling context
  15. Using Text To Speech
  16. Using a flat file to store the database
  17. A better repetition handling algorithm
  18. Updating the database with new keywords
  19. Saving the conversation Logs
  20. Learning capability

Introduction

A chatterbot is a computer program that, when given input in a natural language (English, French, ...), responds with something meaningful in that same language. The strength of a chatterbot can therefore be measured directly by the quality of the output it selects in response to the user. From this description, we can deduce that a very basic chatterbot can be written in a few lines of code in a given programming language. Let's make our first chatterbot. (Note that all the code in this tutorial is written in C++, and it is assumed that the reader is familiar with the STL.) This tutorial is also available in the following languages: Java, Visual Basic, C#, Pascal, Prolog and Lisp.

//
// Program Name: chatterbot1
// Description: this is a very basic example of a chatterbot program
//
// Author: Gonzales Cenelia
//

#include <iostream>
#include <string>
#include <cstdlib> // srand, rand
#include <ctime>   // time

int main()
{
    std::string Response[] = {
        "I HEARD YOU!",
        "SO, YOU ARE TALKING TO ME.",
        "CONTINUE, I'M LISTENING.",
        "VERY INTERESTING CONVERSATION.",
        "TELL ME MORE..."
    };

    srand((unsigned) time(NULL));

    std::string sInput = "";
    std::string sResponse = "";

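    while(1) {
        std::cout << ">";
        if(!std::getline(std::cin, sInput)) {
            break; // stop at end of input instead of looping forever
        }
        // pick one of the 5 canned responses at random
        int nSelection = rand() % 5;
        sResponse = Response[nSelection];
        std::cout << sResponse << std::endl;
    }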

    return 0;
}
As you can see, it doesn't take a lot of code to write a very basic program that can interact with a user. It would, however, be very difficult to write a program that truly interprets what the user is saying and then generates an appropriate response. This has been a long-term goal since the earliest days of computing, and even before the very first computers were built. In 1950, the British mathematician Alan Turing posed the question "Can machines think?" and proposed a test that is now known as the Turing Test. In this test, a computer program and a real person each converse with a third person (the judge), who has to decide which of them is the real person. Nowadays, there is a competition named the Loebner Prize, in which a bot that successfully fools most of the judges for at least 5 minutes would win a prize of $100,000. So far, no computer program has been able to pass this test. One of the major reasons is that programs written to compete in such contests tend to make a lot of mistakes (their replies are often out of the context of the conversation), so it generally isn't difficult for a judge to decide whether he is speaking to a computer program or a real person. The direct ancestor of all the programs that try to mimic a conversation between human beings is Eliza; the first version of this program was written in 1966 by Joseph Weizenbaum, a professor at MIT.

Chatbots in general are considered to belong to the field of weak AI (weak artificial intelligence), as opposed to strong AI, whose goal is to create programs that are as intelligent as humans or more intelligent. But this doesn't mean that chatbots have no real potential. Being able to create a program that could communicate the same way humans do would be a great advance for the AI field. Chatbots are the part of artificial intelligence that is most accessible to hobbyists (it only takes average programming skill to be a chatbot programmer). So, for programmers out there who want to create true AI or some kind of artificial intelligence, writing intelligent chatbots is a great place to start!

Now, let's get back to our previous program. What are the problems with it?

Well, there are a lot of them. First of all, we can clearly see that the program isn't really trying to understand what the user is saying; instead, it just selects a random response from its database each time the user types a sentence on the keyboard. We can also notice that the program repeats itself very often. One reason for this is the size of the database, which is very small (5 sentences). The second is that we haven't implemented any mechanism to control this unwanted behavior.

How do we move from a program that just selects responses randomly, whatever the user enters on the keyboard, to a program that shows some understanding of the input?

The answer to that question is quite simple: we need to use keywords.

A keyword is just a sentence (not necessarily a complete one) or even a single word that the program can recognize in the user's input, which then makes it possible for the program to react to it (e.g., by printing a sentence on the screen). For the next program, we will write a knowledge base or database composed of keywords and some responses associated with each keyword.

So, now we know what to do to improve our first chatterbot and make it more intelligent. Let's proceed to write our second bot, which we will call chatterbot2.

//
// Program Name: chatterbot2
// Description: this is an improved version
// of the previous chatterbot program "chatterbot1"
// this one will try a little bit more to understand what the user is trying to say
//
// Author: Gonzales Cenelia
//

#pragma warning(disable: 4786)

#include <iostream>
#include <string>
#include <vector>
#include <cstdlib> // srand, rand
#include <ctime>   // time

const int MAX_RESP = 3;

typedef std::vector<std::string> vstring;

vstring find_match(const std::string &input);
void copy(const char *const array[], vstring &v);


// one knowledge base entry: a keyword and the responses associated with it
// (string literals are constant, so the pointers are const char *)
typedef struct {
    const char *input;
    const char *responses[MAX_RESP];
} record;

record KnowledgeBase[] = {
    {"WHAT IS YOUR NAME",
    {"MY NAME IS CHATTERBOT2.",
     "YOU CAN CALL ME CHATTERBOT2.",
     "WHY DO YOU WANT TO KNOW MY NAME?"}
    },

    {"HI",
    {"HI THERE!",
     "HOW ARE YOU?",
     "HI!"}
    },

    {"HOW ARE YOU",
    {"I'M DOING FINE!",
     "I'M DOING WELL AND YOU?",
     "WHY DO YOU WANT TO KNOW HOW I AM DOING?"}
    },

    {"WHO ARE YOU",
    {"I'M AN A.I. PROGRAM.",
     "I THINK THAT YOU KNOW WHO I AM.",
     "WHY ARE YOU ASKING?"}
    },

    {"ARE YOU INTELLIGENT",
    {"YES, OF COURSE.",
     "WHAT DO YOU THINK?",
     "ACTUALLY, I'M VERY INTELLIGENT!"}
    },

    {"ARE YOU REAL",
    {"DOES THAT QUESTION REALLY MATTER TO YOU?",
     "WHAT DO YOU MEAN BY THAT?",
     "I'M AS REAL AS I CAN BE."}
    }
};

size_t nKnowledgeBaseSize = sizeof(KnowledgeBase)/sizeof(KnowledgeBase[0]);


int main() {
    srand((unsigned) time(NULL));

    std::string sInput = "";
    std::string sResponse = "";

    while(1) {
        std::cout << ">";
        if(!std::getline(std::cin, sInput)) {
            break; // stop at end of input instead of looping forever
        }
        vstring responses = find_match(sInput);
        if(sInput == "BYE") {
            std::cout << "IT WAS NICE TALKING TO YOU USER, SEE YOU NEXT TIME!" << std::endl;
            break;
        }
        else if(responses.size() == 0) {
            std::cout << "I'M NOT SURE IF I UNDERSTAND WHAT YOU ARE TALKING ABOUT." << std::endl;
        }
        else {
            // every record holds exactly MAX_RESP responses
            int nSelection = rand() % MAX_RESP;
            sResponse = responses[nSelection];
            std::cout << sResponse << std::endl;
        }
    }

    return 0;
}

// search for the user's input
// inside the database of the program
// (exact match only; returns an empty list when nothing matches)
vstring find_match(const std::string &input) {
    vstring result;
    for(size_t i = 0; i < nKnowledgeBaseSize; ++i) {
        if(std::string(KnowledgeBase[i].input) == input) {
            copy(KnowledgeBase[i].responses, result);
            return result;
        }
    }
    return result;
}

// append the MAX_RESP responses of a record to the vector v
void copy(const char *const array[], vstring &v) {
    for(int i = 0; i < MAX_RESP; ++i) {
        v.push_back(array[i]);
    }
}

Now the program can understand some sentences such as "what is your name", "are you intelligent", etc. It can also choose an appropriate response from its list of responses for the given sentence and display it on the screen. Unlike the previous version of the program (chatterbot1), chatterbot2 chooses a response suited to the user's input instead of picking random responses that ignore what the user is actually trying to say.

We've also added a new technique to this program: when it is unable to find a keyword matching the current user input, it simply answers that it doesn't understand, which is quite human-like.

What can we improve in this chatbot to make it even better?

There are quite a few things that we can improve. The first is that, since the chatterbot tends to be very repetitive, we can create a mechanism to control these repetitions. We could simply store the chatbot's previous response in a string, sPrevResponse, and check when selecting the next response that it is not equal to the previous one. If it is, we select a new response from the available responses.
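As a minimal sketch of this idea (not the code shipped with the tutorial), the function below picks a random response and steps to the next candidate whenever the pick equals the previous response. The name pick_response is hypothetical; sPrevResponse is the string mentioned above.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: returns a random response that differs from
// sPrevResponse whenever at least one such response exists.
std::string pick_response(const std::vector<std::string> &responses,
                          const std::string &sPrevResponse) {
    if (responses.empty()) return "";
    const std::size_t n = responses.size();
    const std::size_t start = std::rand() % n;
    // walk the list from a random starting point until a different response is found
    for (std::size_t k = 0; k < n; ++k) {
        const std::string &candidate = responses[(start + k) % n];
        if (candidate != sPrevResponse) return candidate;
    }
    return responses[start]; // every response equals the previous one
}
```

The caller would store the returned string in sPrevResponse before asking for the next response.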

The other thing that we could improve is the way the chatbot handles the user's input. Currently, if you enter an input in lower case, the chatbot will not understand it even if there is a match for that input inside the bot's database. Likewise, if the input contains extra spaces or punctuation characters (!;,.), the chatbot will not understand it either. That's why we will introduce a mechanism to preprocess the user's input before it is searched for in the chatbot's database. We could have one function that converts the user's input to upper case, since the keywords inside the database are in upper case, and another procedure that removes all punctuation and extra spaces found in the input. That said, we now have enough material to write our next chatterbot: "Chatterbot3". View the code for Chatterbot3
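The preprocessing step described above could look roughly like the sketch below. The function name clean_input and the exact punctuation set are assumptions, and punctuation is treated here as a word separator rather than simply deleted, so that "hi,there" still yields two words.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: upper-cases the input, drops the punctuation
// characters ?!;,. and collapses runs of spaces into a single space.
std::string clean_input(const std::string &input) {
    const std::string punctuation = "?!;,."; // assumed set, adjust as needed
    std::string result;
    bool pendingSpace = false;
    for (std::string::size_type i = 0; i < input.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(input[i]);
        if (std::isspace(c) || punctuation.find(input[i]) != std::string::npos) {
            // remember a separator, but never emit one at the very start
            pendingSpace = !result.empty();
        } else {
            if (pendingSpace) result += ' ';
            pendingSpace = false;
            result += static_cast<char>(std::toupper(c));
        }
    }
    return result;
}
```

With this sketch, an input such as "  what is   your name?! " is reduced to the form used by the keywords in the database.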

What are the weaknesses of the current version of the program?

Clearly there are still many limitations in this version of the program. The most obvious one is that the program uses "exact sentence matching" to find a response to the user's input. This means that if you ask it "what is your name again", the program will simply not understand what you are trying to say, because it is unable to find a match for this input. This is a little surprising considering that the program can understand the sentence "what is your name".

How do we overcome this problem?

There are at least two ways to solve this problem. The most obvious one is to use a slightly more flexible way of matching the keywords in the database against the user's input. All we have to do is allow keywords to be found anywhere within the input, so that we no longer have the previous limitation.

The other possibility is much more complex: it uses the concept of fuzzy string search. To apply this method, it is useful first to break the input and the current keyword into separate words. We then create two vectors, the first storing the words of the input and the second storing the words of the current keyword. Once we have done this, we can use the Levenshtein distance to measure the distance between the two word vectors. (Notice that in order for this method to be effective, we would also need an extra keyword that represents the subject of the current keyword.)
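To make the second approach more concrete, here is a hedged sketch of a Levenshtein distance computed over words instead of characters. The helper names split_words and word_distance are hypothetical, and how small a distance must be to count as a match is left open.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<std::string> vstring;

// Splits a sentence into words separated by whitespace.
vstring split_words(const std::string &text) {
    std::istringstream stream(text);
    vstring words;
    std::string word;
    while (stream >> word) words.push_back(word);
    return words;
}

// Levenshtein distance where one edit inserts, deletes or replaces a whole word.
std::size_t word_distance(const vstring &a, const vstring &b) {
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            curr[j] = std::min(std::min(prev[j] + 1, curr[j - 1] + 1),
                               prev[j - 1] + cost);
        }
        prev.swap(curr); // the row just computed becomes the previous row
    }
    return prev[b.size()];
}
```

For instance, the input "WHAT IS YOUR NAME AGAIN" is one word away from the keyword "WHAT IS YOUR NAME", so a small threshold would accept it.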

So, there you have it: two different methods for improving the chatterbot. We could actually combine both methods and select which one to use in each situation.

Finally, there is still another problem that you may have noticed with the previous chatterbot: you could repeat the same sentence over and over and the program would not react to it. We also need to correct this problem.

So, we are now ready to write our fourth chatterbot, which we will simply call chatterbot4. View the code for Chatterbot4

As you have probably seen, the code for "chatterbot4" is very similar to the code for "chatterbot3", but with some key changes. In particular, the function that searches for keywords inside the database is now a little more flexible. So, what next? Don't worry; there are still a lot of things to cover.

What can we improve in chatterbot4 to make it better?

Here are some ideas:

  • Since the code for the chatterbots has started to grow, it would be a good idea to encapsulate the implementation of the next chatterbot in a class.
  • The database is still much too small to handle a real conversation with users, so we will need to add more entries to it.
  • The user may sometimes press the enter key without typing anything; we need to handle this situation as well.
  • The user might also try to trick the chatterbot by repeating his previous sentence with a slight modification; we need to count this as a repetition from the user.
  • Finally, you will soon notice that we need a way of ranking keywords: when several keywords match a given input, we need a way of choosing the best one among them.

That said, we will now start to write the implementation of chatterbot5. Download Chatterbot5

Before proceeding to the next part of this tutorial, you are encouraged to compile and run the code for "chatterbot5" so that you can understand how it works and verify the changes that have been made to it. As you may have seen, the implementation of the current chatterbot is now encapsulated in a class, and some new functions have been added to this version of the program.

We will now discuss the implementation of "chatterbot5":

  • select_response(): selects a response from a list of responses. A new helper function, shuffle, was added to the program; it shuffles a list of strings randomly after seed_random_generator() has been called.
  • save_prev_input(): saves the current user input into a variable (m_sPrevInput) before getting new input from the user.
  • void save_prev_response(): saves the current response of the chatterbot into the variable m_sPrevResponse before the bot starts to search for responses to the current input.
  • void save_prev_event(): saves the current event (m_sEvent) into the variable m_sPrevEvent. An event occurs, for example, when the program detects a null input from the user, when the user repeats himself, or when the chatterbot itself makes repetitions.
  • void set_event(std::string str): sets the current event (m_sEvent).
  • void save_input(): makes a backup of the current input (m_sInput) into the variable m_sInputBackup.
  • void set_input(std::string str): sets the current input (m_sInput).
  • void restore_input(): restores the value of the current input (m_sInput) that was previously saved into the variable m_sInputBackup.
  • void print_response(): prints the response selected by the chat robot on the screen.
  • void preprocess_input(): does some preprocessing on the input, such as removing punctuation and redundant space characters and converting the input to upper case.
  • bool bot_repeat(): verifies whether the chatterbot has started to repeat itself.
  • bool user_repeat(): verifies whether the user has repeated himself.
  • bool bot_understand(): verifies that the bot understands the current user input (m_sInput).
  • bool null_input(): verifies whether the current user input (m_sInput) is null.
  • bool null_input_repetition(): verifies whether the user has repeated some null inputs.
  • bool user_want_to_quit(): checks whether the user wants to quit the current session with the chatterbot.
  • bool same_event(): verifies whether the current event (m_sEvent) is the same as the previous one (m_sPrevEvent).
  • bool no_response(): checks whether the program has no response for the current input.
  • bool same_input(): verifies whether the current input (m_sInput) is the same as the previous one (m_sPrevInput).
  • bool similar_input(): checks whether the current and previous inputs are similar. Two inputs are considered similar if one of them is a substring of the other (e.g., how are you and how are you doing are considered similar because how are you is a substring of how are you doing).
  • void get_input(): gets input from the user.
  • void respond(): handles all responses of the chat robot, whether they are for events or simply for the current user input. Basically, this function controls the behaviour of the program.
  • find_match(): finds responses for the current input.
  • void handle_repetition(): handles repetitions made by the program.
  • handle_user_repetition(): handles repetitions made by the user.
  • void handle_event(std::string str): handles events in general.
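As an illustration of the substring rule given for similar_input(), here is a small free-function sketch. The real method is a member of CBot that works on m_sInput and m_sPrevInput; the two-argument signature and the treatment of empty inputs are assumptions made for this example.

```cpp
#include <string>

// Sketch only: two inputs are similar if one is a substring of the other.
// Empty inputs are treated as "not similar" here (an assumption), since
// null inputs are handled by a separate check in the tutorial.
bool similar_input(const std::string &current, const std::string &previous) {
    if (current.empty() || previous.empty()) return false;
    return current.find(previous) != std::string::npos ||
           previous.find(current) != std::string::npos;
}
```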

You can clearly see that "chatterbot5" has much more functionality than "chatterbot4", and that each piece of functionality is encapsulated in a method (function) of the class CBot. Still, there are many more improvements to be made.

Chatterbot5 introduces the concept of "state". In this new version of the chatterbot, we associate a different "state" with some of the events that can occur during a conversation. For example, when the user enters a null input, the chatterbot sets itself to the "NULL INPUT**" state; when the user repeats the same sentence, it goes into the "REPETITION T1**" state, etc.

This new chatterbot also uses a bigger database than the previous chatbots we have seen so far (chatterbot1, chatterbot2, chatterbot3 ...). But it is still quite insignificant, given that most chatterbots in use today (the very popular ones) have a database of at least 10000 lines. So this is definitely one of the major goals to aim for in the next versions of the chatterbot.

For now, however, we will concentrate on a small problem concerning the current chatterbot.

What exactly is this problem?

Well, it's all about keyword boundaries. Suppose that the user enters the sentence "I think not" during a conversation with the chatbot. Naturally, the program would look in its database for a keyword that matches the sentence, and it might find the keyword "Hi", which is also a substring of the word "think". Clearly this is unwanted behaviour.

How do we avoid it?

Simply by putting a space character before and after each keyword in the database, or by applying the same change during the matching process inside the find_match() function. The input must be padded with spaces in the same way, so that a keyword at the very start or end of the input still matches.
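A minimal sketch of the second option, applying the padding at match time, might look like this. The function name contains_keyword is hypothetical, and the input is assumed to be already preprocessed (upper case, single spaces, no punctuation).

```cpp
#include <string>

// Returns true when keyword occurs in input as a run of whole words.
// Both strings are padded with one space on each side before searching,
// so "HI" no longer matches inside "THINK".
bool contains_keyword(const std::string &input, const std::string &keyword) {
    const std::string paddedInput = " " + input + " ";
    const std::string paddedKeyword = " " + keyword + " ";
    return paddedInput.find(paddedKeyword) != std::string::npos;
}
```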

Are there other things that we can improve in "Chatterbot5"?

Certainly there are. So far, the chatbot starts a chatting session with the user without saying anything at the beginning of the conversation. It would be good if the chatterbot could say something to start the conversation. This can easily be achieved by introducing "sign on messages" into the program. We can do this by creating a new state inside the chatbot's knowledge base and adding some appropriate messages linked to it. That new state could be called "SIGNON**".

Introducing the concept of "Keyword Ranking"

As you can see, with each new version of the chatterbot we are progressively adding new features in order to make the chatbot more realistic. In this section, we are going to introduce the concept of "keyword ranking" into the chatterbot. Keyword ranking is a way for the program to select the best keyword in its database when more than one keyword matches the user's input. For example, given the user input What is your name again, the chatbot would find two keywords in its database that match this input: 'WHAT' and 'WHAT IS YOUR NAME'. Which one is the best? The answer is quite simple: it is obviously 'WHAT IS YOUR NAME', simply because it is the longest keyword. This new feature has been implemented in the new version of the program: Chatterbot7.
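The "longest keyword wins" rule can be sketched as follows. This is an illustration under assumptions, not the code of Chatterbot7: the function name best_keyword is hypothetical, keywords are matched on word boundaries, and the input is assumed to be preprocessed.

```cpp
#include <string>
#include <vector>

// Returns the longest keyword found in input as whole words,
// or an empty string if no keyword matches.
std::string best_keyword(const std::string &input,
                         const std::vector<std::string> &keywords) {
    const std::string paddedInput = " " + input + " ";
    std::string best;
    for (std::size_t i = 0; i < keywords.size(); ++i) {
        const std::string padded = " " + keywords[i] + " ";
        if (paddedInput.find(padded) != std::string::npos &&
            keywords[i].size() > best.size()) {
            best = keywords[i]; // a longer match outranks a shorter one
        }
    }
    return best;
}
```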

Equivalent keywords

In all the previous chatterbots, the database record allowed us to use only one keyword for each set of responses. But sometimes it is useful to have more than one keyword associated with a set of responses, especially when these keywords have the same meaning. For example, What is your name and Can you please tell me your name both have the same meaning, so there is no need to use different records for these keywords. Instead, we can modify the record structure so that it allows more than one keyword per record. Download Chatterbot8
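One possible shape for such a record is sketched below; the constant MAX_INP, the sample entries and the helper find_record are assumptions for illustration and may differ from the actual Chatterbot8 source.

```cpp
#include <cstddef>
#include <string>

const int MAX_INP = 2;  // assumed maximum number of keywords per record
const int MAX_RESP = 3;

struct record {
    const char *keywords[MAX_INP];   // unused slots are left null
    const char *responses[MAX_RESP];
};

const record KnowledgeBase[] = {
    {{"WHAT IS YOUR NAME", "CAN YOU PLEASE TELL ME YOUR NAME"},
     {"MY NAME IS CHATTERBOT8.", "YOU CAN CALL ME CHATTERBOT8.",
      "WHY DO YOU WANT TO KNOW MY NAME?"}},
    {{"HI"},
     {"HI THERE!", "HOW ARE YOU?", "HI!"}}
};
const std::size_t nKnowledgeBaseSize = sizeof(KnowledgeBase) / sizeof(KnowledgeBase[0]);

// Returns the index of the record owning a keyword equal to input, or -1.
int find_record(const std::string &input) {
    for (std::size_t i = 0; i < nKnowledgeBaseSize; ++i) {
        for (int k = 0; k < MAX_INP; ++k) {
            const char *keyword = KnowledgeBase[i].keywords[k];
            if (keyword != NULL && input == keyword) return static_cast<int>(i);
        }
    }
    return -1;
}
```

Both equivalent keywords now lead to the same record, so their responses only have to be written once.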

Keyword transposition and template response

One of the well-known mechanisms of chatterbots is the capacity to reformulate the user's input by doing some basic verb conjugation. For example, if the user enters YOU ARE A MACHINE, the chatterbot might respond: So, you think that I'm a machine.

How did we arrive at this transformation? We did it in two steps:

  • We make sure that the chatterbot has a list of response templates linked to the corresponding keywords. Response templates are a sort of skeleton for building new responses. We usually use a wildcard in a response to indicate that it is a template. In the previous example, we used the template (so, you think that*) to construct our response. During the reassembly process, we simply replace the wildcard with some part of the original input. In that same example, we used You are a machine, which is actually the complete original input from the user. After replacing the wildcard with the user's input, we have the sentence So, you think that you are a machine. We cannot use this sentence as it is; first we need to reverse some of the pronouns in it.

  • The transpositions that we use most often replace first-person pronouns with second-person pronouns and vice versa, e.g.: you -> me, I'm -> you are, etc. In the previous example, replacing "YOU ARE" with "I'M" in the user's input turns the original sentence into I'm a machine. We can now replace the wildcard in the template with this new sentence, which gives us the chatbot's final response: So, you think that I'm a machine.

Notice that it's not a good idea to use transposition too much during a conversation: the mechanism would become too obvious and it could create some repetition.
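The two steps can be sketched as below. The transposition table, the names transpose and fill_template, and the single-pass scan (used so that a word that was just swapped is not swapped back) are all assumptions made for this example; the input is assumed to be preprocessed.

```cpp
#include <string>

struct transposition { const char *from; const char *to; };

// Hypothetical table; longer patterns come first so " YOU ARE " wins over " YOU ".
const transposition TransposList[] = {
    {" YOU ARE ", " I'M "}, {" I'M ", " YOU ARE "}, {" I AM ", " YOU ARE "},
    {" YOUR ", " MY "},     {" MY ", " YOUR "},
    {" YOU ", " ME "},      {" ME ", " YOU "}
};
const std::size_t nTransposCount = sizeof(TransposList) / sizeof(TransposList[0]);

// Swaps first- and second-person forms in a single left-to-right pass.
std::string transpose(const std::string &text) {
    const std::string padded = " " + text + " ";
    std::string result;
    std::size_t i = 0;
    while (i < padded.size()) {
        bool replaced = false;
        for (std::size_t r = 0; r < nTransposCount && !replaced; ++r) {
            const std::string from = TransposList[r].from;
            if (padded.compare(i, from.size(), from) == 0) {
                const std::string to = TransposList[r].to;
                result += to.substr(0, to.size() - 1); // trailing space is re-read next
                i += from.size() - 1;
                replaced = true;
            }
        }
        if (!replaced) result += padded[i++];
    }
    return result.substr(1, result.size() - 2); // strip the padding spaces
}

// Replaces the '*' wildcard of a template with the transposed input.
std::string fill_template(const std::string &tmpl, const std::string &input) {
    std::string response = tmpl;
    const std::string::size_type star = response.find('*');
    if (star != std::string::npos) response.replace(star, 1, " " + transpose(input));
    return response;
}
```

Calling fill_template with the template "SO, YOU THINK THAT*" and the input "YOU ARE A MACHINE" reproduces the example discussed above.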

Keyword location concept

Some keywords can be located anywhere in a given input; others make sense only in specific places in the user's input. A keyword like "Who are you" can be found anywhere in the user's input without creating any problem with its meaning.

Some examples of sentences using "WHO ARE YOU" would be:

  1. Who are you?
  2. By the way, who are you?
  3. So tell me, who are you exactly?

But a keyword such as "who is" can only be found at the beginning or in the middle of a given sentence; it cannot be found at the end of the sentence or alone.

Examples of sentences using the keyword: "who is":

  1. Who is your favorite singer?
  2. Do you know who is the greatest mathematician of all time?
  3. Tell me, do you know who is? (this clearly doesn't make any sense)

How do we make sure that the chatterbot is able to distinguish such keywords and the specific places where they are allowed to be found in a sentence? We will simply introduce some new notations for keywords:

  1. Keywords that can only be found at the beginning or in the middle of a sentence will be represented by: _KEYWORD (Ex: _WHO IS)
  2. Keywords that can only be found at the end or in the middle of a sentence will be denoted by: KEYWORD_ (Ex: WHAT ARE YOU_)
  3. Keywords that should only be found alone in a sentence will be represented by: _KEYWORD_ (Ex: _WHAT_)
  4. And finally, keywords that can be found anywhere in a sentence or even alone will simply be represented by: KEYWORD (Ex: I UNDERSTAND)

A keyword can have different meanings depending on its position in a given sentence.
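The sketch below shows one possible reading of this notation, taken literally from the four rules above: a leading underscore forbids the keyword at the end of the input, a trailing underscore forbids it at the beginning, and both together require the keyword to stand alone. The function name match_located is hypothetical and the real chatterbot may interpret the markers differently.

```cpp
#include <string>

// Checks a preprocessed input against a keyword written in the
// _KEYWORD / KEYWORD_ / _KEYWORD_ / KEYWORD notation.
bool match_located(const std::string &input, const std::string &pattern) {
    std::string keyword = pattern;
    const bool lead = !keyword.empty() && keyword[0] == '_';
    if (lead) keyword.erase(0, 1);
    const bool trail = !keyword.empty() && keyword[keyword.size() - 1] == '_';
    if (trail) keyword.erase(keyword.size() - 1);
    if (keyword.empty()) return false;
    if (lead && trail) return input == keyword; // must stand alone

    const std::string paddedInput = " " + input + " ";
    const std::string paddedKeyword = " " + keyword + " ";
    std::string::size_type pos = paddedInput.find(paddedKeyword);
    while (pos != std::string::npos) {
        const bool atStart = (pos == 0);
        const bool atEnd = (pos + paddedKeyword.size() == paddedInput.size());
        // accept the first occurrence whose position the markers allow
        if (!(lead && atEnd) && !(trail && atStart)) return true;
        pos = paddedInput.find(paddedKeyword, pos + 1);
    }
    return false;
}
```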

Handling Context

Context is a way for the chatterbot to keep track of what it has said previously and to take this into account when selecting its next response. So far, every response selected by the chatbot during a conversation has been chosen based only on the current user input. But sometimes we need more data in order to respond properly to a given input; that's when we need to use context.

To illustrate this concept, we are going to look at the following conversation log:

USER: What is your favorite movie?
CHATTERBOT: IT IS TERMINATOR II.

USER: Why do you like this movie? (Now how are we supposed to answer that question if we know nothing about the previous response of the chatbot?)

So clearly, some inputs require the use of "context" in order to formulate a correct answer. In the previous example, the context would simply be: IT IS TERMINATOR II. Now that the bot knows what it was talking about previously, it can more easily formulate a good answer to the user's input.

We can now continue the previous conversation log:

(Context: IT IS TERMINATOR II)
CHATTERBOT: BECAUSE IT IS A SCIENCE-FICTION MOVIE AND I LOVE SCIENCE-FICTION.

Context also allows us to control improper reactions from the chatbot. For example, if the user enters the sentence "Why do you like this movie?" during a conversation in which the chatterbot has not even talked about this subject, it could simply respond by saying: WHAT ARE YOU TALKING ABOUT?
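A very small sketch of this behaviour follows. The structure context_record, the table ContextBase and the function respond_with_context are hypothetical names, and the context is compared with the bot's previous response by exact equality, which is an assumption about how contexts are matched.

```cpp
#include <cstddef>
#include <string>

struct context_record {
    const char *keyword;
    const char *context;  // required previous bot response, "" when none is needed
    const char *response;
};

const context_record ContextBase[] = {
    {"WHAT IS YOUR FAVORITE MOVIE", "", "IT IS TERMINATOR II."},
    {"WHY DO YOU LIKE THIS MOVIE", "IT IS TERMINATOR II.",
     "BECAUSE IT IS A SCIENCE-FICTION MOVIE AND I LOVE SCIENCE-FICTION."}
};
const std::size_t nContextBaseSize = sizeof(ContextBase) / sizeof(ContextBase[0]);

// Answers input, using sPrevResponse (the bot's previous answer) as context.
std::string respond_with_context(const std::string &input,
                                 const std::string &sPrevResponse) {
    for (std::size_t i = 0; i < nContextBaseSize; ++i) {
        const context_record &rec = ContextBase[i];
        if (input != rec.keyword) continue;
        if (rec.context[0] == '\0' || sPrevResponse == rec.context) return rec.response;
        return "WHAT ARE YOU TALKING ABOUT?"; // keyword known, but context is missing
    }
    return "I'M NOT SURE IF I UNDERSTAND WHAT YOU ARE TALKING ABOUT.";
}
```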

The context feature has been implemented in Chatterbot11.

Another great feature that would be very interesting to implement in a chatterbot is the capacity to anticipate the user's next response; this would make the chatbot look even smarter during a conversation.

Using Text To Speech

Wouldn't it be great if your computer could speak back to you whenever you ask it to do something? We have accomplished just that in "Chatterbot12", the latest version of the program. The program can now speak out every answer that it has selected after examining the user's input. The SAPI library from Microsoft was used to add the "Text To Speech" feature to the program. On the implementation side, three new functions were added: Initialize_TTS_Engine(), speak(const std::string text) and Release_TTS_Engine().

  • Initialize_TTS_Engine(): as the name suggests, this function initializes the "Text To Speech Engine". We first initialize COM, since SAPI is a COM-based API. If the initialization is successful, we then use the CoCreateInstance function to create an instance of the ISpVoice object, which controls the "Text To Speech" mechanism within the SAPI library. If that is also successful, our "Text To Speech Engine" has been initialized properly and we are ready for the next stage: speaking out the "response string".
  • speak(const std::string text): this is the main function used for implementing "Text To Speech" within the program. It takes the "response string", converts it to wide characters (WCHAR) and passes it to the Speak method of the ISpVoice object, which then speaks out the bot's response.
  • Release_TTS_Engine(): once we are done using the SAPI "Text To Speech Engine", we release all the resources that were allocated during the procedure.

Using a flat file to store the database

So far, the database has always been built into the program, which means that whenever you modified the database, you also had to recompile the program. This is not really convenient, because sometimes we only want to edit the database and keep the rest of the program as it is. For this reason and many others, it is a good idea to store the database in a separate file, which lets us edit the database without having to recompile all the files in the program. To store the database, we could use a simple text file with some specific notation to distinguish the different elements of the database (keywords, responses, transpositions, context ...). In the current program, we will use the following notation, which has been used before in some implementations of the Eliza chatbot in Pascal.

  1. Lines that start with "K" represent keywords.
  2. Lines that start with "R" represent responses.
  3. Lines that start with "S" represent sign on messages.
  4. Lines that start with "T" represent transpositions.
  5. Lines that start with "E" represent possible corrections that can be made after transposing the user's input.
  6. Lines that start with "N" represent responses to empty input from the user.
  7. Lines that start with "X" represent responses for when the chatbot did not find any keyword matching the current user input.
  8. Lines that start with "W" represent responses for when the user repeats himself.
  9. Lines that start with "C" represent the context of the chatbot's current response.
  10. Lines that start with "#" represent comments.

We now have a complete architecture for the database; we just need to implement these features in the next version of the chatbot (Chatterbot13).
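To give an idea of how such a file might be read, here is a hedged sketch that handles only the "K", "R" and "#" lines. The exact file layout is an assumption (the tag letter is taken to be immediately followed by the text), as are the names record and load_records; consecutive "K" lines are grouped as equivalent keywords of one record.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct record {
    std::vector<std::string> keywords;
    std::vector<std::string> responses;
};

// Reads K/R lines from a stream; blank lines, '#' comments and the
// other tags are skipped in this sketch.
std::vector<record> load_records(std::istream &in) {
    std::vector<record> records;
    std::string line;
    bool lastWasKeyword = false;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#') continue;
        const std::string text = line.substr(1);
        if (line[0] == 'K') {
            // a keyword after a non-keyword line starts a new record
            if (!lastWasKeyword) records.push_back(record());
            records.back().keywords.push_back(text);
            lastWasKeyword = true;
        } else {
            if (line[0] == 'R' && !records.empty()) {
                records.back().responses.push_back(text);
            }
            lastWasKeyword = false;
        }
    }
    return records;
}
```

The function accepts any std::istream, so it can be fed from a std::ifstream opened on the database file or from a std::istringstream while experimenting.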

A better repetition handling algorithm

In an effort to prevent the chatbot from repeating itself too much, we previously used a very basic algorithm that consists of comparing the chatbot's current response to the previous one. If the currently selected response is equal to the previous one, we simply discard it and move on to the next candidate in the list of available responses. This algorithm is very effective at controlling immediate repetitions by the chatbot. However, it is not as good at avoiding longer-term repetition: during a chatting session, the same response can occur many times.

With the new algorithm, we control how long it takes for the chatbot to reselect the same response. In fact, we make sure that it has used all the available responses for the corresponding keyword before it can repeat one of them. This in turn can improve the quality of the conversation.

Here is a description of how the algorithm works. During the conversation between the chatbot and the user, we keep a list of all the responses previously selected by the chat robot. When selecting a new response, we search for the currently selected response inside that list, starting from the end. If the candidate is found, we compare its position, counted from the end of the list, with the total number of available responses. If the position plus one is less than the total number of available responses, we consider it a repetition, so we discard the current response and select another one.
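The check described above can be sketched as a small function. The name is_repetition and its parameters are hypothetical, and the position is counted from the most recent entry of the history (0 for the last response given), which is our reading of "starting from the end".

```cpp
#include <string>
#include <vector>

// history:    every response the bot has given so far, oldest first
// candidate:  the response currently being considered
// nAvailable: number of responses available for the matched keyword
bool is_repetition(const std::vector<std::string> &history,
                   const std::string &candidate,
                   std::size_t nAvailable) {
    for (std::size_t back = 0; back < history.size(); ++back) {
        if (history[history.size() - 1 - back] == candidate) {
            // too recent if fewer than nAvailable - 1 responses were given since
            return back + 1 < nAvailable;
        }
    }
    return false; // never used before
}
```

With three available responses, a candidate used two turns ago is rejected, while one used three turns ago is allowed again.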

Updating the database with new keywords

When adding new keywords to the database, it can be difficult to choose those that are really relevant. However, there is a very simple solution to that problem. While chatting with the chat robot, we store the user's input in a file (e.g. unknown.txt) each time the chatbot is unable to find any matching keyword for the current input. Later on, when we need to update the keywords in the database, we just have to look at the file in which we saved the unknown sentences found during previous conversations. By continuously adding new keywords using this procedure, we can build a very good database.
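A minimal sketch of that idea is shown below. The function name `saveUnknownInput` is hypothetical, and the file name defaults to the unknown.txt mentioned above; the chatbot's own implementation may differ.

```cpp
#include <fstream>
#include <string>

// Appends one unmatched user input to the given file, one sentence per line.
// Returns false if the file could not be opened or written.
bool saveUnknownInput(const std::string& input,
                      const std::string& path = "unknown.txt") {
    std::ofstream out(path, std::ios::app);  // append, never overwrite
    if (!out) return false;
    out << input << '\n';
    return static_cast<bool>(out);
}
```

The function would be called from the branch of the chatbot that handles inputs with no matching keyword, so the file only grows with sentences the database does not cover yet.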
Download Chatterbot15

Saving the Conversation Logs

Why save the conversations between the users and the chatbot? Because it helps us find the weaknesses of the chatbot during a given conversation. We can then decide which modifications to make to the database in order to make future conversations more natural. We can also save the time and date of each exchange to help us track the progress of the chatbot after new updates have been applied to it. Saving the logs helps us judge how human-like the chatbot's conversation skills are.
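One possible way to write such a log with the date and time is sketched here. The function names, the `[timestamp] SPEAKER: text` line layout and the default file name log.txt are assumptions chosen for this example only.

```cpp
#include <ctime>
#include <fstream>
#include <string>

// Returns the local date and time as "YYYY-MM-DD HH:MM:SS",
// or an empty string if the time could not be formatted.
std::string currentTimestamp() {
    const std::time_t now = std::time(nullptr);
    const std::tm* local = std::localtime(&now);
    if (!local) return "";
    char buffer[20];  // 19 characters plus the terminating '\0'
    if (std::strftime(buffer, sizeof(buffer), "%Y-%m-%d %H:%M:%S", local) == 0) {
        return "";
    }
    return buffer;
}

// Builds one log line, e.g. "[2014-04-14 10:00:00] USER: HELLO".
std::string formatLogEntry(const std::string& timestamp,
                           const std::string& speaker,
                           const std::string& text) {
    return "[" + timestamp + "] " + speaker + ": " + text;
}

// Appends one timestamped line to the log file.
bool appendToLog(const std::string& speaker, const std::string& text,
                 const std::string& path = "log.txt") {
    std::ofstream out(path, std::ios::app);
    if (!out) return false;
    out << formatLogEntry(currentTimestamp(), speaker, text) << '\n';
    return static_cast<bool>(out);
}
```

Calling `appendToLog` once for the user's input and once for the chatbot's response keeps both sides of the conversation in the same file, in order.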

Learning Capability

So far, the chatbot has not been able to learn new data from users while chatting, and it would be very useful to add this feature. It basically means that whenever the chatbot encounters an input that has no corresponding keyword, it prompts the user about it. In return, the user can add a new keyword and its corresponding response to the chat robot's database. Doing so can improve the chatbot's database very significantly. Here is how the algorithm should go:

  1. NO KEYWORD WAS FOUND FOR THIS INPUT, PLEASE ENTER A KEYWORD
  2. SO THE KEYWORD IS: (key)
  3. (if response is no) PLEASE REENTER THE KEYWORD (go back to step #2)
  4. NO RESPONSE WAS FOUND FOR THIS KEYWORD: (key) , PLEASE ENTER A RESPONSE
  5. SO, THE RESPONSE IS: (resp)
  6. (if response is no) PLEASE REENTER THE RESPONSE (go back to step #5)
  7. KEYWORD AND RESPONSE LEARNED SUCCESSFULLY
  8. IS THERE ANY OTHER KEYWORD THAT I SHOULD LEARN
  9. (if response is yes, otherwise continue chatting): PLEASE ENTER THE KEYWORD (go back to step #2)


License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Gonzales Cenelia
Help desk / Support Gexel Telecom
Canada
I have been programming in C and C++ for more than four years. I first learned programming in 1999 in college, but it was only in 2000, when I bought my first computer, that I truly started to do more interesting things in programming. As a programmer, my main interest is A.I. programming, so I'm really captivated by everything related to N.L.U (Natural Language Understanding), N.L.P (Natural Language Processing), Artificial Neural Networks, etc. Currently I'm learning to program in Prolog and Lisp. I'm also really fascinated by the original chatterbot program, Eliza, which was written by Joseph Weizenbaum. Every time I run this program, it makes me think that A.I. could be solved one day. A lot of interesting work has been accomplished in the domain of Artificial Intelligence in the past years. A very good example of those accomplishments is Logic Programming, which makes it possible to manipulate logic statements and also to make inferences about those statements. A classical example would be: given the facts that "Every man is mortal" and that Socrates is a man, we can logically deduce that Socrates is mortal. Such simple logical statements can be written in Prolog using just a few lines of code:
 
prolog code sample:
 
mortal(X):- man(X). % rule
man(socrates). % declaring a fact
 
The preceding Prolog rule can be read as: for every variable X, if X is a man then X is mortal. This last Prolog code sample can easily be extended by adding more facts or rules, for example:
mortal(X):- man(X). % rule
mortal(X):- woman(X). % rule
man(socrates). % fact 1
man(adam). % fact 2
woman(eve). % fact 3
 
for more, check: http://www.ai-search.4t.com
ai-programming.blogspot.com

Article Copyright 2009 by Gonzales Cenelia