I am a final-year BCA student, and my project is plagiarism detection between two files. I have read research papers on plagiarism detection and I am going mad doing so, because there are many different steps that can be taken and many algorithms that can be used. No single paper gives a detailed description of plagiarism detection. Please help me. (I am planning to use Java, with MySQL for the database.)

What I have tried:

Following what I found in one paper, I tried splitting each paragraph into sentences and counting the frequency of keywords, and now I am lost on what exactly to do next. I did all of it in Java.
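
For readers following along, the step described above (sentence splitting, then keyword counting) might look roughly like the sketch below. This is not the asker's code: the class and method names are hypothetical, and it assumes English text where a "keyword" is simply any lowercase run of letters or digits.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, for illustration only.
class KeywordFrequency {

    // Split text into sentences with the JDK's locale-aware BreakIterator.
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    // Count how often each word occurs (assumption: words are runs of a-z or 0-9).
    static Map<String, Integer> wordCounts(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ENGLISH).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "The cat sat on the mat. The dog sat too.";
        for (String sentence : splitSentences(text)) {
            System.out.println(sentence + " -> " + wordCounts(sentence));
        }
    }
}
```

Raw counts like these only become useful for plagiarism detection once the two files are compared on them, which is the part this sketch leaves open.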
Updated 28-Nov-16 21:13pm

1 solution

You may consider using Text Mining (TM) to detect the similarity of two documents. It is impossible to go into the details here. Briefly, TM involves, among other things, removing unnecessary or meaningless tokens such as punctuation, stop words, and trivial words, and looking out for synonyms, as a way to transform free-form unstructured text into structured data from which the computer can learn patterns using appropriate AI techniques. It is a precursor to data mining. I have not even started talking about coding here.
To begin with, you should sign up for some AI modules, especially Text Mining, at your college to prepare yourself for such a project.
For your reference: Text Mining: The state of the art and the challenges
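
As a rough illustration of the preprocessing described in this answer, the sketch below strips punctuation and stop words and then compares two texts. The comparison measure, cosine similarity over term counts, is one common choice and is an assumption here, not something the answer prescribes. The class name, the method names, and the very short stop-word list are all hypothetical, and synonyms are not handled.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class, for illustration only.
class DocumentSimilarity {

    // Tiny illustrative stop-word list (an assumption); a real one is much longer.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "a", "an", "and", "the", "is", "are", "of", "on", "in", "to", "it"));

    // Lowercase, split on anything that is not a-z or 0-9 (this drops punctuation),
    // skip stop words, and count the remaining terms.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ENGLISH).split("[^a-z0-9]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            tf.merge(token, 1, Integer::sum);
        }
        return tf;
    }

    // Cosine similarity of two term-count vectors: 0 for no shared terms,
    // close to 1 for identical term distributions.
    static double cosineSimilarity(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
            normA += e.getValue() * e.getValue();
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        if (normA == 0 || normB == 0) {
            return 0.0; // an empty document is treated as similar to nothing
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        Map<String, Integer> first = termFrequencies("The cat sat on the mat.");
        Map<String, Integer> second = termFrequencies("A cat sat on a mat, and it purred.");
        System.out.println("similarity = " + cosineSimilarity(first, second));
    }
}
```

A score near 1 only says the two texts use similar vocabulary; deciding what score counts as plagiarism is a separate judgment that the sketch does not address.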
 
Share this answer
 
v7
Comments
Maciej Los 29-Nov-16 3:50am    
5ed!
Peter Leow 29-Nov-16 6:14am    
Thank you, Maciej.
Member 12199293 30-Nov-16 1:11am    
Thank you for the answer, I really appreciate it. Would you give me some references so that I can get started? :)
Peter Leow 30-Nov-16 2:11am    
Check this out: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.428.8805&rep=rep1&type=pdf

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)