 |
|
 |
I would like to say thank you very much.
|
|
|
|
 |
|
 |
The program stops running when it tries to check the IsStopword function and gives a NullReferenceException error message. I have tried to modify the code, but I still can't fix it. Please help me. Thanks for your kindness.
|
|
|
|
 |
|
 |
The program works very well, but it is very slow for large data.
|
|
|
|
 |
|
 |
Your program is excellent, but I couldn't find a way to run it. Without knowing your code well, I can't modify it to read Word documents. Can you please send me some more details about the program, its classes, and how I can integrate it with a Word document or any other text?
I want to compare Word documents to text. Please send me your advice at 'dyeingbreed@naver.com'.
Have a good day!
|
|
|
|
 |
|
 |
Nice to find your page.
I need some help.
I am now trying to finish my project
on "implementation of the TF-IDF algorithm for text mining to classify abstracts",
and this project uses Visual Basic.
Can you help me, sir, with the coding in Visual Basic?
Thank you very much. Waiting for your answer, sir!
|
|
|
|
 |
|
 |
No description of how to run the code.
|
|
|
|
 |
|
 |
No description of the source code.
|
|
|
|
 |
|
 |
Please guide me on how to run this code.
Thanks in anticipation.
|
|
|
|
 |
|
 |
Hi,
The cosine similarity article is worth reading and is useful. Still, you may want to read some considerations about IDF in my article:
http://www.codeproject.com/KB/IP/AnatomyOfASearchEngine1.aspx
Cheers
|
|
|
|
 |
|
 |
Dear sir! How can I improve the performance of this code? It needs a lot of resources when used with large corpora. How can I store the vectors?
skicc
|
|
|
|
 |
|
 |
Dear Sir,
I don't know how to use this code. I can't add an application to this code. Can you tell me how?
thanks
Poor Student
|
|
|
|
 |
|
 |
I am doing a task to implement retrieval based on a query. The code is fine, but when I run it, it does nothing after execution. Where should I modify it, and how, for the following task?
I have a set of text files. I have to read them and, after tokenization, stemming, etc., create an index upon which retrieval is based; ranking is also required.
I hope to get help at your convenience.
Chan
I'm Chan Naseeb from Pakistan.
|
|
|
|
 |
|
 |
This is quite a useful package, and I was wondering if it's still maintained.
I've been using it for my Master's thesis, with some changes to the code for optimisation and ease of use - but would like to make sure they make their way back to the author and I didn't want to plagiarise anything.
Thanks
Mike
|
|
|
|
 |
|
 |
Please can you tell me how it works? When I run it, it does nothing.
|
|
|
|
 |
|
 |
Did you study the source code to see what it actually does?
Only writing because you responded to my inquiry. I saw that the author has no time to maintain the code.
I ended up writing my own TfiDf classes for my thesis, which wasn't that hard once you have a good document that describes how TF-IDF works. I think I kept 2 methods from this project because they were very nice and compact, though I made some modifications.
I've been thinking of posting my own code, but because I originally started with this project, I wanted to talk to the author to discuss integration and/or proper attribution.
Mike
|
|
|
|
 |
|
 |
Mike, my apologies for this late response. Let me know if you're interested in extending the code (just drop me a line via: thanh.ngoc.dao@gmail.com).
thanks!
Thanh Dao
|
|
|
|
 |
|
 |
Hello!
This has fallen off my radar - once I handed in my master's thesis I was exhausted and wanted to never even see code again.
If I ever find the time I'll see if I can dig out the C# classes that I wrote and make them available.
|
|
|
|
 |
|
 |
Hello!
I have been working on my mini project in this area.
I was searching for TF-IDF code.
I tried this code and ended up with nothing.
Can you help me with that?
|
|
|
|
 |
|
 |
Unfortunately, it's been quite a while since I finished and handed in my thesis. Almost a year ago.
I would have to dig around on an old backup to find my code, if I have the time.
|
|
|
|
 |
|
 |
Thanks for your reply.
Please try to find it.
At least, can you summarize the concept?
|
|
|
|
 |
|
 |
Hello Thanh Dao
I am doing a project on campus, and one of the tasks involves finding the cosine similarity of centroid vectors.
I was very happy when I found your code. It really helps me to understand the concept.
But now I am confused about what I should do if the vector lengths are different.
centroid vector of Cluster 1:
video : 0.00116015
site : 2.57935e-005
google : 4.49502e-006
year : 7.70845e-005
america : 0.00012493
show : 0.000316262
network : 1.12263e-005
internet : 0.000171536
online : 1.21093e-005
include : 2.76765e-005
centroid vector of Cluster 2:
Cluster 3 :
year : 0.000166218
show : 0.000249849
google : 9.23458e-005
company : 3.66881e-005
internet : 6.59765e-005
america : 3.83144e-005
life : 4.1567e-005
call : 3.2381e-006
world : 3.98896e-005
include : 0.000144504
announce : 0.000129032
play : 0.000301592
music : 5.05836e-005
president : 0.000180596
lead : 0.00013031
online : 0.000367673
work : 5.8714e-005
What should I do to find the cosine similarity based on the above centroid vectors?
Lastly, thank you so much =)
Regards,
Devit
modified on Wednesday, December 19, 2007 5:55:30 AM
|
|
|
|
 |
|
 |
Hi,
Vectors must be the same length. What that means is that a term that occurs in D1, but not in D2, will have a zero weight in D2's vector.
So, if you have 2 documents, D1 and D2 (or your two clusters if you prefer), and they look like this (really simple and exaggerated):
D1:
cat
mouse
D2:
dog
ball
Then you create the complete vector space as, say, (ball, cat, dog, mouse) and your vectors will look something like:
D1(0, 0.5, 0, 0.5)
D2(0.5, 0, 0.5, 0)
Thus they are the same length!
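A minimal sketch of this alignment, written in Python for brevity rather than in the article's own language. The helper names `align` and `cosine_similarity` are hypothetical and not taken from the article's source; each document (or cluster centroid) is assumed to be a dictionary of term weights.

```python
import math

def align(d1, d2):
    """Build the shared vocabulary and return both documents as
    equal-length vectors; a term missing from a document gets weight 0."""
    vocab = sorted(set(d1) | set(d2))
    v1 = [d1.get(term, 0.0) for term in vocab]
    v2 = [d2.get(term, 0.0) for term in vocab]
    return vocab, v1, v2

def cosine_similarity(v1, v2):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0 or norm2 == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm1 * norm2)

# The example documents from this post, as term -> weight dictionaries.
d1 = {"cat": 0.5, "mouse": 0.5}
d2 = {"dog": 0.5, "ball": 0.5}

vocab, v1, v2 = align(d1, d2)
similarity = cosine_similarity(v1, v2)
```

Here `vocab` is the vector space (ball, cat, dog, mouse), and because D1 and D2 share no terms their similarity is 0. The same two functions should work on centroid vectors of different sizes, since terms that appear in only one centroid contribute nothing to the dot product.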
Hope this helps!
Mike
|
|
|
|
 |
|
 |
The program builds just fine, but how do I run it? It does not seem to do anything after you build it. I even checked with some people who are strong in C++, but they could not figure it out.
Jones
|
|
|
|
 |
|
 |
I have tried this code.
It works well with small documents but
not with long documents.
I have tried to compare these two documents:
"A furious China on Tuesday accused the US Congress of trying to give impetus to the secessionist movement in Tibet by its plans to award the congressional gold medal, its highest honour, to the Dalai Lama on Wednesday. Chinese officials also warned that the move would seriously harm Sino-US relations.",
"Chinese officials warned the United States on Tuesday not to honor the Dalai Lama, saying a planned award ceremony in Washington for the Tibetan spiritual leader would have “an extremely serious impact” on relations between the countries."
but the result is around 0.14.
I have tried using Normalised Weight, but the result is the same.
Could you please help me with this case?
|
|
|
|
 |