buchspektrum online bookshop

New releases 2018

As of: 2020-02-01

Abdulwahed Almarimi

Dissimilarities Detections in Arabic and English Texts


Using n-grams, Histograms and Self Organizing Maps
2018. 128 pp. 220 mm
Publisher/Year: SCHOLAR'S PRESS 2018
ISBN: 6-202-30271-2 (6202302712)
New ISBN: 978-6-202-30271-5 (9786202302715)



The main goal of our research is to apply mathematical methods to detect anomalies and discrepancies in texts. English and Arabic texts were analyzed with respect to many statistical characteristics. We examined some basic statistical differences between the lengths of the words used in the two languages, and the results were applied in heuristics for measuring the dissimilarity of text parts. In the research we prepared three methods for the analysis of texts:
(1) Element n-gram profiles method: the method is based on the similarity or dissimilarity of n-gram occurrences in text parts compared with the full text.
(2) Histogram method: histograms of text sequences are analyzed from a clustering point of view. If the cluster dispersion is not large, the text was probably written by a single author. If the cluster dispersion is large, the text is considered critical; it is split into two or more parts, and the same analysis is repeated for each part.
(3) Neural networks, systems of Self-Organizing Maps: the systems are trained on input sequences, and after training they identify text parts with anomalies using a cumulative error and some complex analysis.
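The n-gram profile idea in method (1) can be illustrated with a minimal Python sketch. This is not the book's implementation: the choice of character trigrams, the fixed part size, the function names, and the normalized squared-difference measure are all assumptions made here purely for illustration.

```python
from collections import Counter


def ngram_profile(text, n=3):
    """Relative frequencies of character n-grams in `text` (assumed profile form)."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    return {g: c / total for g, c in Counter(grams).items()}


def dissimilarity(part, full, n=3):
    """Score in [0, 1]: 0 means the part's profile matches the full text's.

    Assumed measure: mean of ((2 * (fp - ff)) / (fp + ff))**2 over the
    n-grams of the part, divided by 4 so the result stays within [0, 1].
    """
    p = ngram_profile(part, n)
    f = ngram_profile(full, n)
    if not p:
        return 0.0
    total = 0.0
    for gram, fp in p.items():
        ff = f.get(gram, 0.0)  # frequency of the same n-gram in the full text
        total += (2 * (fp - ff) / (fp + ff)) ** 2
    return total / (4 * len(p))


def score_parts(text, size=200, n=3):
    """Split `text` into fixed-size parts and score each against the full text."""
    parts = [text[i:i + size] for i in range(0, len(text), size)]
    return [dissimilarity(part, text, n) for part in parts]
```

Parts whose score is noticeably higher than the others would be candidates for closer inspection; what counts as "noticeably higher" is a threshold this sketch deliberately leaves open.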
Dr. Abdulwahed Almarimi, born 12.12.1985 in Bani Waleed, Libya. PhD from Pavol Jozef Šafárik University in Košice, Slovakia.