See more: Java
I want to do canopy clustering over strings to reduce the number of distance measurements. But I have no idea how to do canopy clustering over a set of strings.
When I searched, I found the Apache Hadoop implementation of text clustering. But they say the input must be a sequence file of vectors, i.e. the input has to be in a vector-readable format.


I have a column of strings. How do I convert it into a sequence file and a vector file in Java, and how do I use Hadoop canopy clustering efficiently?


Example words from one column:

quickly
need
close
this?
father-in-law
relatives
come
location?
little
specific
'where
exactly
chennai-bangalore
road?',
far
road?
state
right
locality
in?
post
message
brahmma
weeks
max
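
One way to get from a column of words like this to something vector-shaped, before involving Hadoop at all, is to count character bigrams per word. The following is only a sketch in plain Java: the class name `BigramVectorSketch`, the normalisation rule (lower-case, keep only letters and digits) and the choice of bigrams are assumptions made for illustration, not what the Hadoop implementation requires. Writing the result out as a sequence file is not shown.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: turns raw tokens into sparse bigram-count vectors.
class BigramVectorSketch {

    // Assumed normalisation: lower-case and drop everything except a-z and 0-9,
    // so "road?'," and "road?" both become "road".
    static String normalize(String raw) {
        return raw.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]", "");
    }

    // Count overlapping character bigrams. The sorted map plays the role of a
    // sparse vector: key = dimension (bigram), value = count.
    static Map<String, Integer> bigramCounts(String word) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i + 2 <= word.length(); i++) {
            counts.merge(word.substring(i, i + 2), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> column = Arrays.asList("quickly", "road?',", "road?", "father-in-law");
        for (String raw : column) {
            String word = normalize(raw);
            System.out.println(word + " -> " + bigramCounts(word));
        }
    }
}
```

Words shorter than two characters produce an empty map, so they would need separate handling before clustering.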


Please help me. Thanks.
Posted 13-Feb-13 1:50am

Solution 1

Next time, ask Google first: here[^]

Solution 2

I know, friend. But how do I do it in Java for my input?
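
For a word list this small, canopy clustering can also be tried in memory, without Hadoop. Below is a minimal sketch, assuming Levenshtein edit distance as the cheap distance measure and two integer thresholds with T1 >= T2. Both of those choices, and the class name `CanopySketch`, are assumptions made for illustration; this is not the Hadoop implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical in-memory canopy clustering over strings.
class CanopySketch {

    // Classic dynamic-programming edit distance, keeping only two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // t1 is the loose threshold (canopy membership), t2 the tight threshold
    // (words this close to a centre can no longer start their own canopy).
    static List<List<String>> canopies(List<String> words, int t1, int t2) {
        if (t1 < t2) {
            throw new IllegalArgumentException("t1 must be >= t2");
        }
        List<String> remaining = new ArrayList<>(words);
        List<List<String>> result = new ArrayList<>();
        while (!remaining.isEmpty()) {
            String center = remaining.remove(0); // take the next word as a centre
            List<String> canopy = new ArrayList<>();
            canopy.add(center);
            Iterator<String> it = remaining.iterator();
            while (it.hasNext()) {
                String word = it.next();
                int d = levenshtein(center, word);
                if (d <= t1) {
                    canopy.add(word); // loosely close: joins this canopy
                }
                if (d <= t2) {
                    it.remove();      // tightly close: removed from further consideration
                }
            }
            result.add(canopy);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("road", "roads", "load", "exactly");
        System.out.println(canopies(words, 2, 1));
    }
}
```

Because a word within T1 but outside T2 stays in the list, it can appear in more than one canopy. Stripping punctuation from tokens such as `road?` before clustering is left to the caller.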

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



Last Updated 18 Feb 2013
Copyright © CodeProject, 1999-2015