I want to run canopy clustering over a set of strings to cut down on the number of distance measurements, but I have no idea how to apply canopy clustering to strings.
When I searched, I found the Apache Hadoop implementation of text clustering. Its documentation says the input must be a sequence file of vectors, so the data already has to be in a vector-readable format.


I have a single column of strings. How do I convert it into a sequence file of vectors in Java, and how do I use Hadoop canopy clustering efficiently?


Example words from the column:

quickly
need
close
this?
father-in-law
relatives
come
location?
little
specific
'where
exactly
chennai-bangalore
road?',
far
road?
state
right
locality
in?
post
message
brahmma
weeks
max


Please help. Thanks.

Next time, ask Google first, here[^]
 
 
I know, friend. But how do I do it in Java for my input?
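
For a word list like the one in the question, one option is to skip Hadoop and run canopy clustering in memory. Below is a minimal sketch in plain Java, not the Hadoop implementation. It assumes normalized Levenshtein distance as the cheap distance measure; the class name `StringCanopy`, the method names, and the thresholds 0.6 and 0.3 are illustrative choices, not values taken from any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class StringCanopy {

    // Levenshtein edit distance divided by the longer length, so the result is in [0, 1].
    static double distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        int longer = Math.max(a.length(), b.length());
        return longer == 0 ? 0.0 : (double) prev[b.length()] / longer;
    }

    // Canopy clustering with a loose threshold t1 and a tight threshold t2 (t1 > t2 > 0).
    // Returns a map from each canopy center to the words inside that canopy.
    static Map<String, List<String>> canopies(List<String> words, double t1, double t2) {
        if (t2 <= 0 || t1 <= t2) {
            throw new IllegalArgumentException("need t1 > t2 > 0");
        }
        List<String> remaining = new ArrayList<>(words);
        Map<String, List<String>> result = new LinkedHashMap<>();
        while (!remaining.isEmpty()) {
            String center = remaining.remove(0);      // next unassigned word becomes a center
            List<String> members = new ArrayList<>();
            members.add(center);
            List<String> stillRemaining = new ArrayList<>();
            for (String w : remaining) {
                double d = distance(center, w);
                if (d < t1) {
                    members.add(w);                   // inside the loose radius: joins the canopy
                }
                if (d >= t2) {
                    stillRemaining.add(w);            // outside the tight radius: may join or start others
                }
            }
            remaining = stillRemaining;
            result.put(center, members);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList(
                "quickly", "need", "close", "road?", "road?',", "location?", "locality");
        canopies(words, 0.6, 0.3).forEach((c, m) -> System.out.println(c + " -> " + m));
    }
}
```

Words closer than T1 to a center join its canopy, and words closer than T2 are also dropped from the candidate list so they cannot start a canopy of their own. Both thresholds depend on the data, so try a few values on your own column before settling on them.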
 
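
If the Hadoop route is still needed, each string first has to become a numeric vector. One common approach is to count hashed character bigrams, and the sketch below assumes that approach. `StringVectorizer`, `toVector`, `cosineDistance` and the dimension 64 are made-up names and values for illustration. Wrapping the resulting array in the library's own vector type and writing it to a sequence file is not shown here.

```java
import java.util.Locale;

class StringVectorizer {

    // Maps a word to a fixed-length vector of character-bigram counts.
    // Each bigram is hashed to a slot, so different bigrams can collide.
    static double[] toVector(String word, int dims) {
        double[] v = new double[dims];
        // "^" and "$" mark the start and end of the word so that prefixes and suffixes count too.
        String padded = "^" + word.toLowerCase(Locale.ROOT) + "$";
        for (int i = 0; i + 2 <= padded.length(); i++) {
            int slot = Math.floorMod(padded.substring(i, i + 2).hashCode(), dims);
            v[slot] += 1.0;
        }
        return v;
    }

    // Cosine distance between two vectors of equal length: 0 means same direction.
    static double cosineDistance(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 1.0;
        }
        return 1.0 - dot / Math.sqrt(normA * normB);
    }

    public static void main(String[] args) {
        double[] location = toVector("location?", 64);
        double[] locality = toVector("locality", 64);
        double[] quickly = toVector("quickly", 64);
        System.out.println("location? vs locality: " + cosineDistance(location, locality));
        System.out.println("location? vs quickly:  " + cosineDistance(location, quickly));
    }
}
```

A larger dimension means fewer hash collisions but sparser vectors, so the right size depends on how many distinct bigrams your column contains.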
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


