Introduction
I first learned about Hugo Liu's research around three years ago while looking into NLP, commonsense data, and artificial intelligence code and theory. His work, along with that of others at MIT, is arguably among the most current research available and is useful, or at least informative, for any program designer. Most of the projects available from MIT are not written in Microsoft Visual Studio .Net, so I am attempting to make use of ConceptNet for educational and research purposes, and just to have fun with the technology involved.
What is ConceptNet?
ConceptNet¹ is a commonsense knowledgebase, composed mainly from the Open Mind Project, written and put together by Hugo Liu and Push Singh (Media Laboratory, Massachusetts Institute of Technology). ConceptNet 2.1 (the current version at the time of this writing) also includes MontyLingua, a natural-language-processing package. ConceptNet is written in Python, but its commonsense knowledgebase is stored in text files. For a complete overview of ConceptNet, read Liu and Singh's outstanding paper (pdf).
What is so Unique about ConceptNet?
Unlike other projects such as WordNet or Cyc, ConceptNet is based more on context. Within the limits of its knowledgebase, ConceptNet allows a computer to understand new or even unknown concepts by using conceptual correlations called Knowledge-Lines (K-Lines: a term introduced by Minsky, cf. The Society of Mind (1987)). A K-Line may be thought of as a list of previous knowledge about a subject or task. ConceptNet puts these K-Lines together using its twenty relationship types, which fall into eight categories (including K-Lines), to form a network of data that simulates conceptual reasoning. What really makes all this possible is ConceptNet's Relational Ontology (the eight categories and twenty relationship types).
ConceptNet is structured around MIT's Open Mind Common Sense Project knowledge base. ConceptNet puts its data through two processes: the Normalization Process and the Relaxation Process. In the Normalization Process, every predicate is filtered and undergoes lexical distillation (verbs and nouns are reduced to their base forms). ConceptNet also removes determiners ("a", "the", etc.) and modals ("may", "could", "will", etc.) in this stage, and it uses part-of-speech tagging to validate well-structured word orders. (The Normalization Process is not demonstrated in this demo. Please feel free to use your own POS taggers with this Class Library.) The Relaxation Process raises, or "lifts", heavily weighted commonsense predicate nodes (a node is one line from the predicate files), and duplicate nodes are merged. This is reflected in each predicate's "f" and "i" metadata tags, where f is the number of utterances and i is the number of times the predicate was inferred.
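As a rough illustration of the two ideas above, here is a minimal sketch in Python (the language ConceptNet 2.1 itself is written in, not the C# of this Class Library). It is not ConceptNet's actual code. The word lists, the sample assertions, and the helper names normalize and merge_duplicates are all assumptions made for this example, and base-form reduction and POS tagging are left out entirely.

```python
from collections import Counter

# Illustrative word lists only (assumption): the article gives "a", "the"
# and "may", "could", "will" as examples; "an" is added here for the sketch.
DETERMINERS = {"a", "an", "the"}
MODALS = {"may", "could", "will"}


def normalize(phrase):
    """Lowercase a phrase and drop determiners and modals.

    Reducing verbs and nouns to base forms and POS-based validation
    are deliberately not part of this sketch.
    """
    words = phrase.lower().split()
    kept = [w for w in words if w not in DETERMINERS and w not in MODALS]
    return " ".join(kept)


def merge_duplicates(assertions):
    """Merge identical (relation, concept1, concept2) triples.

    The count stored for each merged triple plays the role of the "f"
    tag: how many times that assertion was uttered. The "i" tag
    (times inferred) is not modeled here.
    """
    counts = Counter()
    for relation, left, right in assertions:
        counts[(relation, normalize(left), normalize(right))] += 1
    return counts


# Hypothetical raw assertions, invented for this example.
raw = [
    ("UsedFor", "a ball", "throw"),
    ("UsedFor", "the ball", "throw"),
    ("UsedFor", "ball", "could throw"),
    ("IsA", "a guitar", "a musical instrument"),
]

merged = merge_duplicates(raw)
for (relation, left, right), f in merged.most_common():
    print(f"{relation}: {left} -> {right} (f={f})")
```

The three differently worded "ball" assertions collapse into one node with a higher f, which is the kind of weighting the demo's sort buttons later rely on.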
ConceptNet's Relational Ontology
ConceptNet's power to link subjects together comes from the twenty relationship types defined by its Relational Ontology. Here are ConceptNet 2.1's twenty relationship types and eight categories:
(Courtesy Hugo Liu and Push Singh, ConceptNet: A Practical Commonsense Reasoning Toolkit)
• K-Lines: ConceptuallyRelatedTo, ThematicKLine, SuperThematicKLine
• Things: IsA, PartOf, PropertyOf, DefinedAs, MadeOf
• Spatial: LocationOf
• Events: SubeventOf, PrerequisiteEventOf, FirstSubeventOf, LastSubeventOf
• Causal: EffectOf, DesirousEffectOf
• Affective: MotivationOf, DesireOf
• Functional: CapableOfReceivingAction, UsedFor
• Agents: CapableOf
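For readers who want to work with the ontology in code, the list above could be captured as a simple lookup table. This is only a sketch in Python; the names RELATIONAL_ONTOLOGY, CATEGORY_OF and category_of are made up for this example and are not part of ConceptNet or of this Class Library. The relation names DefinedAs and FirstSubeventOf are spelled as in Liu and Singh's paper.

```python
# The eight categories and twenty relationship types listed above.
RELATIONAL_ONTOLOGY = {
    "K-Lines": ["ConceptuallyRelatedTo", "ThematicKLine", "SuperThematicKLine"],
    "Things": ["IsA", "PartOf", "PropertyOf", "DefinedAs", "MadeOf"],
    "Spatial": ["LocationOf"],
    "Events": ["SubeventOf", "PrerequisiteEventOf",
               "FirstSubeventOf", "LastSubeventOf"],
    "Causal": ["EffectOf", "DesirousEffectOf"],
    "Affective": ["MotivationOf", "DesireOf"],
    "Functional": ["CapableOfReceivingAction", "UsedFor"],
    "Agents": ["CapableOf"],
}

# Reverse index: relationship type -> category.
CATEGORY_OF = {
    relation: category
    for category, relations in RELATIONAL_ONTOLOGY.items()
    for relation in relations
}


def category_of(relation):
    """Return the category of a relationship type, or None if unknown."""
    return CATEGORY_OF.get(relation)


print(len(RELATIONAL_ONTOLOGY), "categories,", len(CATEGORY_OF), "relationship types")
print("UsedFor belongs to:", category_of("UsedFor"))
```

A reverse index like this makes it cheap to validate the relation name at the start of each predicate line before searching on it.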
Example lines from ConceptNet's 2.1 predicate files:
(UsedFor "ball" "throw" "f=4;i=0;")
(LocationOf "popcorn" "at movie" "f=7;i=0;")
(CapableOfReceivingAction "film" "place on reel" "f=2;i=0;")
(IsA "guitar" "musical instrument with string" "f=2;i=0;")
(SubeventOf "talk" "debate" "f=2;i=0;")
(CapableOf "person" "write book" "f=11;i=1;")
(MotivationOf "audition for part" "act in play" "f=2;i=0;")
(PropertyOf "bacteria" "small" "f=2;i=0;")
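Each line has the same shape: a relationship type, two quoted concepts, and a quoted metadata string holding f and i. A minimal sketch of parsing one such line, written in Python rather than the C# of this Class Library, might look like the following. The Predicate type, the regular expression, and the parse_predicate function are assumptions for illustration; they only cover the line format shown in the examples above.

```python
import re
from typing import NamedTuple, Optional

# Matches lines shaped like: (IsA "guitar" "musical instrument" "f=2;i=0;")
PREDICATE_RE = re.compile(
    r'^\(\s*(\w+)\s+"([^"]*)"\s+"([^"]*)"\s+"f=(\d+);i=(\d+);"\s*\)\s*$'
)


class Predicate(NamedTuple):
    relation: str   # relationship type, e.g. "CapableOf"
    concept1: str   # first concept
    concept2: str   # second concept
    f: int          # number of utterances
    i: int          # number of times inferred


def parse_predicate(line: str) -> Optional[Predicate]:
    """Parse one predicate line; return None if it does not match."""
    match = PREDICATE_RE.match(line.strip())
    if match is None:
        return None
    relation, concept1, concept2, f, i = match.groups()
    return Predicate(relation, concept1, concept2, int(f), int(i))


example = parse_predicate('(CapableOf "person" "write book" "f=11;i=1;")')
print(example)
```

Returning None for lines that do not match keeps a reader loop simple when a predicate file contains blank or malformed lines.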
Let's get started and get the fun rolling!
What is Needed and Where to download it...
Again, please remember that ConceptNet 2.1 is written in the Python programming language, not C#, but its commonsense knowledgebase data is in text file format, totalling around 96 MB when uncompressed. You must agree to its user agreement before downloading the ConceptNet text files (this of course goes for all of the projects listed below for download), and then download the entire ConceptNet Python Project.
My VS.Net C# ConceptNet Class Library
This is a very simple no-frills Class Library written in MS VS.Net. I threw it together quickly, mostly because I downloaded ConceptNet for the first time yesterday and noticed a shortage of VS.Net-friendly code. I don't remember there being a public download of ConceptNet before, though I may be mistaken; I have known about this project for some time, and its papers have been available via MIT.
There are two projects in the solution: ConceptNet Demo App and ConceptNetUtils. ConceptNetUtils is the ConceptNet Class Library and consists of three classes: FoundList, Misc, and Search.
ConceptNetUtils.FoundList
Holds search result data in an index format.
Access: Public
Base Classes: Object
Members Description
protected string[] LineFound
static public int size = 999;
public string this[]
Count()
Reset()
public int get_f(int index)
public int get_i(int index)
ConceptNetUtils.Misc
Created for Misc Methods
Access: Public
Base Classes: Object
Members Description
public string RemoveCategoryString(string R_TYPE)
public string XMLGetNode(string path_xmlfilename,
string elementname)
public string XMLGetAttribute(string path_xmlfilename,
string elementname, string attributename)
ConceptNetUtils.Search
Takes care of Searching ConceptNet text files.
Access: Public
Base Classes: Object
Members Description
public bool CreateTextFile(string fullfilename)
public string GetFoundListLine(int index)
public int GetTotalLineCount()
public void SearchFor(string fullpathfilename,
string SubjectWord,
string R_Type,
int MAX_RESULTS,
bool CreateOutputFile,
string fullpathTextFilename)
public static string Predicatefile1;
public static string Predicatefile2;
public static string Predicatefile3;
public static string Predicatefile4;
public static string Predicatefile5;
public void SearchFor(int index,
string SubjectWord,
string R_Type,
int MAX_RESULTS,
bool CreateOutputFile,
string fullpathTextFilename)
public void XMLSearchForChecked(string path_xmlfilename,
string SubjectWord,
string R_Type,
int MAX_RESULTS,
bool CreateOutputFile,
string fullpathTextFilename)
public static string GetPredicatePathtoFilename(int index)
public void XMLLoadFilePaths(string settingsxmlfile)
public static int getnode_f(string node)
public static int getnode_i(string node)
public void Sort_f(ArrayList inList, out ArrayList rankedList)
public void Sort_i(ArrayList inList, out ArrayList rankedList)
public class Compare_f : IComparer
public class Compare_i : IComparer
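The member list above gives only signatures, so the following is a guess at the idea behind getnode_f(), getnode_i(), Sort_f() and Sort_i(), not the library's actual code. It is a small Python sketch (Python being the language of ConceptNet 2.1 itself) that reads the f or i tag out of a node string and orders nodes with the highest value first. The function names get_tag and sort_by_tag and the descending order are assumptions.

```python
import re


def get_tag(node, tag):
    """Return the integer value of a metadata tag ("f" or "i") in a node.

    Returns 0 if the tag is not present.
    """
    match = re.search(r'[";]' + re.escape(tag) + r'=(\d+);', node)
    return int(match.group(1)) if match else 0


def sort_by_tag(nodes, tag):
    """Return a new list ordered by the tag value, highest first.

    sorted() is stable, so nodes with equal values keep their
    original relative order.
    """
    return sorted(nodes, key=lambda node: get_tag(node, tag), reverse=True)


# Nodes taken from the example predicate lines earlier in this article.
nodes = [
    '(UsedFor "ball" "throw" "f=4;i=0;")',
    '(LocationOf "popcorn" "at movie" "f=7;i=0;")',
    '(CapableOf "person" "write book" "f=11;i=1;")',
]

by_f = sort_by_tag(nodes, "f")
by_i = sort_by_tag(nodes, "i")
for node in by_f:
    print(node)
```

In the C# library the same ordering is presumably what the Compare_f and Compare_i IComparer classes provide to the sort methods.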
How to Run the Demo
1.) Make sure you have downloaded and installed ConceptNet 2.1. (I installed it into path ...\My Documents\Python Projects\conceptnet2.1\)
2.) Download and unzip this article's .Net Solution and project files.
3.) Navigate to the location "...\ConceptNet Demo App\bin\Release" and run ConceptNet Demo App.exe. It will automatically open the "Set Location of Knowledgebase Files" dialog box. On its first run, you must click the Browse button, select a predicate file (ConceptNet or other), and then click OK. Subsequent runs will remember the locations of the checked predicate files you wish to search.
4.) You are now ready to a) enter a word, b) choose a relationship (ConceptNet looks at IsA, then PropertyOf), and c) click the Search button to display found nodes. You may then sort them by clicking the "Sort by f" or "Sort by i" button.
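The search in step 4 can be pictured with a short sketch. This is Python for illustration only, not the C# code behind the demo. It assumes the search lowercases the subject word, filters by relationship type, matches whole words only, and looks in both concepts of each line; the last point in particular is an assumption, as are the function name search_for and the two sample lines marked as hypothetical.

```python
import re

LINE_RE = re.compile(r'^\((\w+) "([^"]*)" "([^"]*)"')


def search_for(lines, subject_word, r_type, max_results):
    """Return up to max_results lines of type r_type that mention
    subject_word as a whole word in either concept."""
    word_re = re.compile(r'\b' + re.escape(subject_word.lower()) + r'\b')
    found = []
    for line in lines:
        if len(found) >= max_results:
            break
        match = LINE_RE.match(line)
        if match is None or match.group(1) != r_type:
            continue
        concept1, concept2 = match.group(2), match.group(3)
        if word_re.search(concept1) or word_re.search(concept2):
            found.append(line)
    return found


lines = [
    '(UsedFor "ball" "throw" "f=4;i=0;")',
    '(UsedFor "theater" "watch movie" "f=3;i=0;")',   # hypothetical line
    '(UsedFor "fork" "eat" "f=5;i=0;")',              # hypothetical line
    '(LocationOf "popcorn" "at movie" "f=7;i=0;")',
]

results = search_for(lines, "Eat", "UsedFor", 10)
for line in results:
    print(line)
```

The word-boundary match is what keeps a search for "eat" from also returning "theater".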
To Do
- Automate the demo's process of locating the ConceptNet 2.1 files.
- Add combobox to choose which predicate files to search.
- Add MSBNx COM (or some other BN) for creating, assessing, and evaluating Bayesian Networks, and to easily output to XML format.
- Add more methods to the class library.
- Create some more Lifting methods.
Conclusion
ConceptNet 2.1 can be a tool to create personalized commonsense knowledgebase networks. Hopefully this MS VS.Net Class Library project can be informational, useful, and fun.
New Version
The 0.x version posted with this article is no longer under development. I am working on an updated version using Microsoft Visual C# Express 2005 with the .Net 2.0 framework, which will serve as the latest version of the ConceptNet Class Library in C#. It will probably make use of the IronPython library. If you are interested, here is a small peek into getting the ConceptNet Mini-Browser (written in native Python code) to execute using IronPython:
My wdevs blog post with some code.
I am just working on it whenever I have free time.
References
¹ Liu, H. & Singh, P. (2004) ConceptNet: A Practical Commonsense Reasoning Toolkit. BT Technology Journal, To Appear. Volume 22, forthcoming issue. Kluwer Academic Publishers.
ConceptNet: A Practical Commonsense Reasoning Toolkit, Hugo Liu and Push Singh Media Laboratory Massachusetts Institute of Technology
Investigating ConceptNet, Dustin Smith; Advisor: Stan Thomas, Ph.D. December 2004
Open Mind Common Sense Project
Hugo Liu website
WordNet
Cyc
Updates
1/3/06
- Added Method in Form1.cs to change the word to lowercase on leaving the textbox. Searches must be performed in lowercase.
- ConceptNetUtils.Search.SearchFor automatically changes incoming SubjectWord to lowercase.
- Fixed minor drawing problem with Combobox. (Please email me if you experience this bug.)
- Added To Do section.
- Uploaded ConceptNet Demo App version 0.01032006.2rc1
- Uploaded ConceptNetUtils binary version 0.01032006.2b1
1/9/06
- For the Demo, the locations of the predicate files are now stored in XML and loaded automatically; they are no longer hardcoded. I added FileOptionsForm.cs to take care of this. I also added an ImageList.
- For the ConceptNetUtils Class Library, I added XML capability. The demo creates a Settings.xml file to hold the locations of the predicate files and now the class library can read xml files.
- Search.cs: Added SearchFor() overload, XMLSearchForChecked(), GetPredicatePathtoFilename(), XMLLoadFilePaths().
- Misc.cs: Added XMLGetNode(), XMLGetAttribute()
- Uploaded ConceptNet Demo App version 0.01092006.0rc2
- Uploaded ConceptNetUtils binary version 0.01092006.0b2
1/14/06
- Added some details about ConceptNet's Normalization and Relaxation Processes in the "What is so Unique about ConceptNet?" section.
- For the Demo, I added two new buttons, "Sort by f" and "Sort by i", which demonstrate the Relaxation Process of the ConceptNet project.
- For the ConceptNetUtils Class Library, I added some methods needed to accomplish the Relaxation Process of ConceptNet. I also fixed some problems that occurred when it had to search more than one predicate file.
- Search.cs: Added getnode_f(), getnode_i(), Sort_f(), Sort_i(). Two IComparer Classes were also added to assist with the sorting/Lifting of the knowledge, Compare_f : IComparer, Compare_i : IComparer.
- FoundList.cs: Added get_f(), get_i().
- Notified MIT's ConceptNet team of this article via email.
- Emptied \doc folder with VS generated documentation from project.
- Uploaded ConceptNet Demo App version 0.01142006.0rc4
- Uploaded ConceptNetUtils binary version 0.01142006.0b3
1/15/06
- Edited Introduction section.
- Edited Search.cs XMLSearchForChecked method documentation.
- Modified ConceptNetUtils.Search.SearchFor() to return whole-word matches only. For example, a search for "eat" was also returning "theater", etc.
- Uploaded ConceptNet Demo App version 0.01152006.0rc5 (build 2206.41738)
- Uploaded ConceptNetUtils binary version 0.01152006.0b4 (build 2206.41736)
1/16/06
- Edited the "How to Run the Demo" section to reflect the new version of the demo and class library.
2/22/06
- Edited the link to IronPython-1.0-Beta3 in the "What is Needed and Where to download it..." section (was Beta 1).
- Added "New Version" section. 0.x is no longer under development and a new ConceptNetUtils class library (that is .Net 2.0 / IronPython based) is being developed.
Born in Pennsylvania (USA), just north of Philadelphia, Joe has been programming since he was ten (he is now much older). He is an entirely self-taught programmer and is currently working as an IT Manager in Seattle, WA. He was previously a U.S. Navy Active Reservist for SPAWAR. In '98 he was honorably discharged from the USN. He served onboard the USS Carl Vinson (94-98). He was lucky enough to drink President Clinton's leftover wine, be promoted by his Captain, and fly in a plane off the flight deck, but not all at the same time. His interests, when time allows, are developing misc apps and Artificial Intelligence proof-of-concept demos that specifically exhibit human behavior. He is a true sports-a-holic, needs plenty of caffeine, and is a coding junkie. He also enjoys alternative music and is a big fan of Pearl Jam, Nirvana, new alternative music, and Alison Wonderland.
He is currently working on earthboticsai.net, which he says is fun and cool.
Joe is an INTP personality type. Joe "sees everything in terms of how it could be improved, or what it could be turned into. INTP's live primarily inside their own minds." INTPs also can have the "greatest precision in thought and language. Can readily discern contradictions and inconsistencies. The world exists primarily to be understood. 1% of the total population"