Can we use kNN and k-means at the same time?

Question

I get a dataset of neighbours using kNN, and then I want to apply k-means to that dataset. Can this give a more accurate result? Is it logically sound to use kNN and then k-means, or vice versa?

Posted 18-Nov-14 7:33am

rushiraj_11

Comments

Sergey Alexandrovich Kryukov 18-Nov-14 15:49pm

Not clear... More accurate? It depends on what result do you want to get... :-)
—SA

1 solution

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Peter Leow · Accepted Answer · 2014-11-18T15:12:00

They are different machine learning techniques, and explaining them fully is a long story. I will try to be concise.

KNN is used to:
1. Classify a new data point into a known group (category); or
2. Predict a target value for a new data point.
It works by comparing the features of the new data point with those of a set of historical data points whose categories or target values are known. The "K" refers to the number of historical data points that most closely match the new one. The final outcome is determined by the majority category among those K neighbours in the case of classification, or by a simple average of their target values in the case of prediction.
So KNN needs historical data with known targets, which makes it a supervised machine learning technique.
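The KNN description above can be sketched in a few lines of plain Python. This is only an illustration under assumptions that are not in the answer: Euclidean distance is used as the similarity measure, the function names (knn_neighbors, knn_classify, knn_regress) are made up for this example, and the toy data is invented.

```python
import math
from collections import Counter

def knn_neighbors(train_X, query, k):
    """Return the indices of the k training points closest to query.

    Assumption: similarity is measured with Euclidean distance.
    """
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    return order[:k]

def knn_classify(train_X, train_y, query, k=3):
    """Classification: majority category among the k nearest neighbours."""
    idx = knn_neighbors(train_X, query, k)
    votes = Counter(train_y[i] for i in idx)
    return votes.most_common(1)[0][0]

def knn_regress(train_X, train_y, query, k=3):
    """Prediction: simple average of the k nearest neighbours' target values."""
    idx = knn_neighbors(train_X, query, k)
    return sum(train_y[i] for i in idx) / len(idx)

# Invented toy data: two well-separated groups of historical points.
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]           # known categories
targets = [10.0, 12.0, 11.0, 50.0, 52.0, 48.0]    # known target values

predicted_label = knn_classify(X, labels, (1.1, 1.0), k=3)
predicted_value = knn_regress(X, targets, (5.0, 5.0), k=3)
```

Note that nothing is "trained" here: the historical data is simply kept and searched at prediction time, which is why the known targets are required.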

K-means, on the other hand, is a clustering algorithm. It groups data points into K partitions (or clusters). It starts by selecting K data points at random as the initial centers of these K clusters, then assigns each of the remaining data points to the cluster whose center it is most similar to. Once all the data points have been assigned, each cluster recomputes its center as the mean of its members (in standard K-means the new center need not be one of the data points). Then the whole assignment process is repeated. This goes on until there are no significant changes in the clusters. Once the K clusters have settled, a new data point finds its place in one of them by being matched to the most similar cluster center.
So K-means needs historical data with no known outcomes, which makes it an unsupervised machine learning technique.
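The K-means loop described above can be sketched as follows. This is a minimal illustration, not a production implementation. Assumptions not taken from the answer: Euclidean distance, centers recomputed as the mean of each cluster's members, a hypothetical optional init argument so the run can be made repeatable, and invented toy data.

```python
import math
import random

def kmeans(points, k, init=None, max_iter=100, seed=0):
    """Cluster points into k groups; return (centers, assignment).

    init is a hypothetical convenience parameter: if given, it supplies the
    starting centers instead of k randomly chosen data points.
    """
    rng = random.Random(seed)
    centers = list(init) if init is not None else rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest center.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the clusters have settled
        assignment = new_assignment
        # Update step: each center becomes the mean of its cluster's members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment

def predict(centers, point):
    """Place a new data point in the cluster with the nearest center."""
    return min(range(len(centers)), key=lambda c: math.dist(point, centers[c]))

# Invented toy data with no labels at all.
pts = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centers, assignment = kmeans(pts, k=2)
new_point_cluster = predict(centers, (4.0, 4.0))
```

The cluster numbers that come out are arbitrary IDs, not categories: K-means never sees a label, so it cannot tell you what a cluster means.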

Combining Them:
You may run K-means first to assign the new data point to a cluster and then apply KNN using only the data points in that cluster. But you would need a very large dataset, so that each cluster still holds enough labelled points for the vote. Whether it produces a better result depends on many factors: the nature of the problem, the quality and quantity of the dataset, and the value of K. You just have to explore and experiment. Good luck.
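A rough sketch of that combination, under stated assumptions: Euclidean distance, a compact K-means with a fixed number of iterations, majority voting, and hypothetical names (fit_kmeans, cluster_then_knn) and toy data invented for this example. The fallback to the full dataset when a cluster holds no training points is also my own choice, not part of the answer.

```python
import math
import random
from collections import Counter

def nearest(centers, p):
    """Index of the center closest to point p (Euclidean distance)."""
    return min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))

def fit_kmeans(points, k, iters=50, seed=0):
    """Compact K-means: random data points as initial centers, then a fixed
    number of assign/recompute rounds. Returns the list of centers."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centers, p)].append(p)
        # New center = mean of the members; an empty cluster keeps its center.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def cluster_then_knn(train_X, train_y, centers, query, k=3):
    """Find the query's cluster, then run a KNN vote inside that cluster only."""
    cid = nearest(centers, query)
    pool = [(x, y) for x, y in zip(train_X, train_y) if nearest(centers, x) == cid]
    if not pool:  # assumption: fall back to all data if the cluster is empty
        pool = list(zip(train_X, train_y))
    pool.sort(key=lambda xy: math.dist(xy[0], query))
    votes = Counter(y for _, y in pool[:k])
    return votes.most_common(1)[0][0]

# Invented labelled toy data.
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
y = ["a", "a", "a", "b", "b", "b"]

centers = fit_kmeans(X, k=2)                       # unsupervised: labels unused
label = cluster_then_knn(X, y, centers, (4.9, 5.0), k=3)
```

One trade-off to keep in mind: restricting the search to a single cluster makes KNN faster, but a query near a cluster boundary may have true nearest neighbours in the adjacent cluster that are never considered. That is one reason the result can be worse as well as better than plain KNN.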

Some reading that may help: Combination of K-Nearest Neighbor and K-Means based on Term Re-weighting for Classify Indonesian News