• Media type: Other publication; Electronic thesis; Dissertation; E-book
  • Title: Analyzing Non-Textual Content Elements to Detect Academic Plagiarism
  • Contributors: Meuschke, Norman [Author]
  • Published: KOPS - The Institutional Repository of the University of Konstanz, 2021
  • Language: English
  • DOI: https://doi.org/10.5281/zenodo.4913345
  • Keywords: Mathematics retrieval ; Math Retrieval ; Information Visualization ; Information extraction ; Open Source Software ; Citation Analysis ; Web-based interaction ; Surveys and overviews ; Retrieval models and ranking ; Data mining ; Natural Language Processing ; Image search ; User Interaction ; Plagiarism Detection ; Document representation ; Evaluation of retrieval results ; Content-based Image Retrieval ; Web searching and information discovery ; Digital libraries and archives ; Information integration ; Users and interactive retrieval ; Near-duplicate and plagiarism detection ; Link and co-citation analysis ; Multilingual and cross-lingual retrieval
  • Origin:
  • Notes: This data source also contains holdings records that do not lead to a full text.
  • Description: Identifying academic plagiarism is a pressing problem for research institutions, publishers, and funding organizations, among others. Detection approaches proposed so far analyze lexical, syntactical, and semantic text similarity. These approaches find copied, moderately reworded, and literally translated text. However, reliably detecting disguised plagiarism, such as strong paraphrases, sense-for-sense translations, and the reuse of non-textual content and ideas, is an open research problem. The thesis addresses this problem by proposing plagiarism detection approaches that implement a different concept: analyzing non-textual content in academic documents, such as citations, images, and mathematical content. The thesis makes the following research contributions. It provides the most extensive literature review on plagiarism detection technology to date. The study presents the weaknesses of current detection approaches for identifying strongly disguised plagiarism. Moreover, the survey identifies a significant research gap regarding methods that analyze features other than text. Subsequently, the thesis summarizes work that initiated the research on analyzing non-textual content elements to detect academic plagiarism by studying citation patterns in academic documents. To enable plagiarism checks of figures in academic documents, the thesis introduces an image-based detection process that adapts itself to the forms of image similarity typically found in academic work. The process includes established image similarity assessments and newly proposed use-case-specific methods. To improve the identification of plagiarism in disciplines like mathematics, physics, and engineering, the thesis presents the first plagiarism detection approach that analyzes the similarity of mathematical expressions.
To demonstrate the benefit of combining non-textual and text-based detection methods, the thesis describes the first plagiarism detection system that integrates the analysis of citation-based, image-based, math-based, and ...
  • Access status: Open access
  • Rights/usage notes: Attribution - NonCommercial (CC BY-NC)