Hypatia Digital Library: A novel text classification approach for small text fragments

Document record

Date

1 December 2019

Source

eJournals

Relations

This document is related to:
https://ejournals.epublishing.ekt.gr/index.php/jii [...]

Organisation

EKT ePublishing

Licence

https://creativecommons.org/licenses/by-nc/4.0




Cite this document

Ioannis Triantafyllou et al., "Hypatia Digital Library: A novel text classification approach for small text fragments", eJournals, ID: 10670/1.wj50cl



Abstract

Purpose - This paper extends the authors' prior work on text classification in Hypatia, the digital library of the University of Western Attica. The main objective is to provide an accurate automated classification tool as an alternative to manual assignment.

Design/methodology/approach - The crucial step in text classification is selecting the most important terms for document representation. The document collection consists of 718 abstracts in Medicine, Tourism and Food Technology. Two term-weighting methods were investigated: classic TF.IDF and DEVMAX.DF. The latter was proposed by the authors as a more accurate term selection tool for smaller text fragments. Classification was conducted by applying 14 classifiers available in WEKA.

Findings - The classification process yielded an excellent precision score of approximately 97%, and DEVMAX.DF performed better than classic TF.IDF.
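To make the term-weighting step more concrete, here is a minimal Python sketch of classic TF.IDF weighting followed by a simple top-k term selection. It is an illustration only, not the authors' implementation: the whitespace tokenizer, the function names `tfidf_weights` and `top_terms`, the `tf * log(N / df)` variant, and the toy documents are all assumptions. DEVMAX.DF is not defined in this record, so it is not shown.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    Assumption: lowercase whitespace tokenization; real pipelines would
    normally add stop-word removal, stemming, and so on.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term frequency within this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


def top_terms(weights, k):
    """Pick the k terms with the highest weight summed over all documents."""
    totals = Counter()
    for w in weights:
        totals.update(w)  # Counter.update adds the values of a mapping
    return [term for term, _ in totals.most_common(k)]


# Hypothetical toy fragments, one per subject area named in the abstract.
docs = [
    "patient therapy clinical study",
    "hotel tourism travel study",
    "food fermentation quality study",
]
weights = tfidf_weights(docs)
selected = top_terms(weights, 9)
```

In this toy collection the word "study" occurs in every fragment, so its inverse document frequency is log(3/3) = 0 and it is never selected, whereas terms confined to a single fragment receive the weight log(3). The selected terms would then serve as features for a classifier.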
