This program is an Eliza-like chatterbot. Bots like Eliza are the result of research in Artificial Intelligence, more specifically in Natural Language Processing (NLP) and Natural Language Understanding (NLU). The first chatterbot, Eliza, was published in 1966 by Joseph Weizenbaum, a professor at MIT. Most of the chatterbots written these days are still largely based on Weizenbaum's original Eliza, which means that they look for keywords in the user's input and use them to select the response to generate. The technique used in a "chatterbot database" or "script file" to represent the chatterbot's knowledge is known as "Case-Based Reasoning" or CBR.
A very good example of an Eliza-like chatterbot is "Alice". This program has won the Loebner Prize for the most human chatterbot three times. The goal of NLP and NLU is to create programs that are capable of understanding natural languages, and also of processing them, whether to get input from the user by "voice recognition" or to produce output by "text to speech". During the last few decades, there has been a lot of progress in the domains of "Voice Recognition" and "Text to Speech". However, the goal of NLU, which is to make software that shows a good level of understanding of natural languages in general, still seems quite far off to many AI experts. The general view is that it will take many decades before any computer can begin to really understand natural language as humans do. This code is copyrighted and has a limited warranty.
This new version of the program is smarter than the previous ones. New features have been added since the last submission, and the conversation log between the user and the chatbot is now automatically saved to the file log.txt. Finally, the "script file" (script.txt), which acts as a knowledge base for the chatbot, has been totally rewritten, and it is better than in the previous versions of the program.
Most of the ideas that are in this code are directly inspired by the original chatterbot "Eliza" that was written by Joseph Weizenbaum.
Using the code
The code is pretty simple to understand. The most important part of the code can be found in the class that is named "
- script.txt is the file that stores the knowledge of the program.
- time records.txt is the file that holds the records of time delays in seconds that the program uses for simulating a human typist.
- unknown.txt is the file for unmatched input: if, during a conversation with a user, the program doesn't find any keyword in a given sentence, the sentence is saved into this file. The user can later review these sentences to decide which new keywords to add to the database.
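The role of unknown.txt described above can be illustrated with a short Python sketch. This is not the program's actual code: the function names `find_keyword` and `record_if_unknown`, the plain substring match, and the one-sentence-per-line file layout are all assumptions made for illustration.

```python
def find_keyword(sentence, keywords):
    """Return the first keyword contained in the sentence, or None."""
    text = sentence.lower()
    for keyword in keywords:
        if keyword.lower() in text:
            return keyword
    return None


def record_if_unknown(sentence, keywords, unknown_path):
    """Append the sentence to the unknown file when no keyword matches.

    Returns True if the sentence was recorded, False otherwise.
    Assumption: the file holds one unmatched sentence per line.
    """
    if find_keyword(sentence, keywords) is not None:
        return False
    with open(unknown_path, "a", encoding="utf-8") as handle:
        handle.write(sentence.strip() + "\n")
    return True
```

Calling `record_if_unknown(user_input, keywords, "unknown.txt")` after each turn would collect the sentences that a maintainer can later mine for new keywords.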
Points of Interest
While I was writing the code for the chatterbot, I came up with a method for controlling the repetition made by the program. This new functionality was implemented using the functions
There are many interesting features in the current program. This chatterbot is capable of avoiding repetitions when selecting new responses. It can also follow the context of a conversation with a user. When the program receives a sentence that contains no keyword from the "script file", it saves that sentence in the file "unknown.txt". Finally, it simulates a "human typist" when displaying responses on the screen. There are many more features in the program, and you can find them by reading the source code.
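One common way to avoid repeated responses is to remember the last few replies and prefer candidates that are not among them. The Python sketch below shows that idea only; it is not taken from the program's source, and the class name `ResponseSelector` and the `memory` parameter are hypothetical.

```python
import random
from collections import deque


class ResponseSelector:
    """Pick a response while avoiding the most recently used ones."""

    def __init__(self, memory=3, rng=None):
        # 'memory' (an assumed parameter) is how many past replies to avoid.
        self.recent = deque(maxlen=memory)
        self.rng = rng or random.Random()

    def choose(self, candidates):
        # Prefer candidates that were not used recently.
        fresh = [c for c in candidates if c not in self.recent]
        # If every candidate was used recently, fall back to the full list.
        pool = fresh or list(candidates)
        choice = self.rng.choice(pool)
        self.recent.append(choice)
        return choice
```

With three candidate responses and `memory=2`, the selector never gives the same reply twice in a row. An empty candidate list is not handled here and raises `IndexError`.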
Tips on editing the database of the chatbot
When editing the database of the program (script.txt), you should keep in mind the following conventions. A keyword that is enclosed by two underscores, one in front and one at the end (_KeyWord_), can only match when it is found alone in a given input: there cannot be any other word in the corresponding sentence. A keyword that is followed by an underscore (KeyWord_) can only match at the end of the corresponding input. Finally, a keyword with no underscore (KeyWord) can match anywhere in the user's input except at the end.
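The three keyword forms can be expressed as a small matching function. The Python sketch below follows the rules exactly as stated above, but it is only an illustration: the names `normalize` and `matches`, the case-insensitive comparison, and the stripping of punctuation are assumptions, not the program's actual behavior.

```python
import string


def normalize(text):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.split())


def matches(pattern, user_input):
    """Check a script keyword against the input using the underscore rules."""
    sentence = normalize(user_input)
    alone = len(pattern) > 2 and pattern.startswith("_") and pattern.endswith("_")
    at_end = not alone and pattern.endswith("_")
    keyword = normalize(pattern.strip("_"))
    if not keyword:
        return False
    # Pad with spaces so that only whole words can match.
    padded = f" {sentence} "
    if alone:
        # _KeyWord_: the input must consist of the keyword and nothing else.
        return sentence == keyword
    if at_end:
        # KeyWord_: the keyword must be the last thing in the input.
        return padded.endswith(f" {keyword} ")
    # KeyWord: dropping the trailing space means an occurrence at the very
    # end can no longer match, while any earlier occurrence still does.
    return f" {keyword} " in padded[:-1]
```

For example, `matches("bye_", "Ok, bye")` is true while `matches("bye_", "bye for now")` is false, and `matches("because", "just because")` is false because the keyword sits at the end of the input.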
As for the responses, when at least one of the characters '*', '@', or '%' is found in a chatbot response, the response is a template. When the final response is processed, the character '*' is replaced by the part of the user's input that follows the keyword that was found in the input. The character '@' is replaced by the user's last input, and the character '%' is replaced by the chatbot's previous response. Whenever a response is a template, it is always transposed before being printed on the screen: the chatbot performs pronoun reversal, 'I' to 'You', and so on.
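A minimal Python sketch of this template expansion is given below. It is a guess at the mechanics rather than the program's real code: the function names, the tiny `PRONOUN_SWAPS` table, and the decision to transpose only the user text that is substituted in (so that the template's own pronouns are left alone) are all assumptions.

```python
import re

# Deliberately small, hypothetical swap table. Ambiguous words such as
# "you" and "are" are left out, so the result is crude.
PRONOUN_SWAPS = {
    "i": "you", "me": "you", "my": "your", "mine": "yours",
    "am": "are", "myself": "yourself",
    "your": "my", "yours": "mine", "yourself": "myself",
}


def transpose(text):
    """Swap pronouns word by word in a single pass (no double swaps)."""
    return " ".join(PRONOUN_SWAPS.get(w.lower(), w) for w in text.split())


def text_after_keyword(user_input, keyword):
    """Return the part of the input that follows the keyword, if any."""
    found = re.search(re.escape(keyword), user_input, flags=re.IGNORECASE)
    if found is None:
        return ""
    return user_input[found.end():].strip(" .!?")


def expand(template, keyword, user_input, previous_response):
    """Fill in the '*', '@' and '%' markers of a template response."""
    if not any(marker in template for marker in "*@%"):
        return template  # a plain response, not a template
    replacements = {
        "*": transpose(text_after_keyword(user_input, keyword)),
        "@": transpose(user_input.strip(" .!?")),
        "%": previous_response,
    }
    # Replace character by character so inserted text is never re-expanded.
    return "".join(replacements.get(ch, ch) for ch in template)
```

With the keyword "i am" and the input "I am tired of my job.", the template "Why do you say that you are *?" expands to "Why do you say that you are tired of your job?".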
I have further encapsulated the functionality of the program by splitting it into more functions. The database is also bigger than last time. In addition, a new feature has been added for controlling the use of short inputs by the user. With this new feature, the chatterbot encourages the user to write longer sentences, which in turn improves the chances of finding keywords within the user's input. The database (script.txt) has been updated: more entries have been added, and the formatting of some keywords has been corrected.
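The short-input control could be as simple as counting words before looking for keywords. The Python sketch below is an illustration under stated assumptions: the three-word threshold, the function names, and the wording of the prompt are hypothetical and do not come from the program.

```python
def is_too_short(user_input, min_words=3):
    """Return True when the input has fewer than min_words words."""
    return len(user_input.split()) < min_words


def short_input_prompt(user_input, min_words=3):
    """Return a nudge for short inputs, or None when the input is long enough."""
    if is_too_short(user_input, min_words):
        return "Could you say a little more about that?"
    return None
```

A real script would probably exempt inputs that match a standalone keyword such as a greeting, since those are short by design.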