Performance Efficiency in Plagiarism Indication Detection System Using Indexing Method with Data Structure 2-3 Tree



Authors: ANNISA FITRIANA SURYANA; AGUNG TOTO WIBOWO
Published in: ICoICT 2014 (Telkom University, Bandung, Indonesia)

 

Abstract

Plagiarism is a widespread form of cheating. One way to prevent it is to build an anti-plagiarism system. A system that compares a query document against every document in the database takes a very long time, and the more irrelevant documents the database contains, the more time is wasted on comparisons. This paper discusses a plagiarism detection system that uses an indexing method to eliminate irrelevant documents, reducing the set of database documents that must be matched against the query document. Matching between the query document and the database documents is done with the Longest Common Subsequence (LCS) algorithm. The system uses an inverted index, implemented with a 2-3 tree data structure, to eliminate irrelevant documents. Indexing is done by inserting each document's fingerprint, which is computed with the winnowing algorithm. The results show that executing 1 query against a corpus of 10000 documents, most of which are not relevant, takes 59 seconds with indexing and 134 seconds without. The F-measure, an average of precision and recall, is 0.7387 with indexing, using 0.15 as the indexing elimination threshold, and 0.000428 without indexing.
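
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal Python illustration, not the authors' implementation. The function names, the k-gram size `k`, the window size `w`, and the MD5-based hash are all assumptions. A plain dictionary stands in for the 2-3 tree. The 0.15 threshold is interpreted here as the minimum fraction of query fingerprints a document must share, which the abstract does not specify.

```python
import hashlib
from collections import defaultdict


def kgram_hashes(text, k):
    """Hash every k-gram of the text after dropping case and non-alphanumerics."""
    s = "".join(ch.lower() for ch in text if ch.isalnum())
    return [
        int(hashlib.md5(s[i:i + k].encode()).hexdigest()[:8], 16)
        for i in range(len(s) - k + 1)
    ]


def winnow(text, k=5, w=4):
    """Winnowing: keep the minimum hash from each window of w consecutive
    k-gram hashes. Positions are discarded, so the result is a set of values."""
    hashes = kgram_hashes(text, k)
    if not hashes:
        return set()
    if len(hashes) < w:
        return {min(hashes)}
    return {min(hashes[i:i + w]) for i in range(len(hashes) - w + 1)}


def build_index(docs):
    """Inverted index: fingerprint hash -> set of document ids.
    (Assumption: a dict replaces the 2-3 tree used in the paper.)"""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for h in winnow(text):
            index[h].add(doc_id)
    return index


def candidates(index, query, threshold=0.15):
    """Return ids of documents sharing at least `threshold` of the query's
    fingerprints; every other document is eliminated before matching."""
    query_fp = winnow(query)
    if not query_fp:
        return []
    shared = defaultdict(int)
    for h in query_fp:
        for doc_id in index.get(h, ()):
            shared[doc_id] += 1
    return sorted(d for d, c in shared.items() if c / len(query_fp) >= threshold)


def lcs_length(a, b):
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]
```

In this sketch, only the documents returned by `candidates` would be passed to `lcs_length` for detailed matching, which is where the time saving reported in the abstract would come from.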
