XeLDA models and standardizes unstructured documents so that their content can be exploited automatically. Based on a technology developed through 20 years of research and development, XeLDA provides advanced text mining features for processing textual information. TEMIS further explains that XeLDA offers a scalable range of services based on natural language processing components that can be integrated into business applications, enabling:
• automatic identification of the language of each document,
• segmentation of text into sentences,
• splitting of text into basic lexical units (tokenization),
• morphological analysis that returns the normalized form (the lemma) and the potential grammatical categories of each word identified during the tokenization stage,
• morpho-syntactic disambiguation to determine the exact grammatical category of a word according to its context,
• extraction of sequences of words that form noun phrases,
• identification of the context of a word to find the corresponding dictionary entry (dictionary lookup), and
• recognition of idiomatic expressions.
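To make three of the stages listed above more concrete (sentence segmentation, tokenization, and morphological analysis), here is a minimal sketch in Python. It is not XeLDA's API, which the text does not describe; the function names, the regular expressions, and the tiny `LEXICON` are all hypothetical stand-ins chosen for illustration, and a real system would rely on far richer linguistic resources.

```python
import re

# Hypothetical toy lexicon: surface form -> (lemma, possible grammatical categories).
# It stands in for a real morphological dictionary and is not XeLDA data.
LEXICON = {
    "documents": ("document", ["NOUN", "VERB"]),
    "are": ("be", ["VERB"]),
    "processed": ("process", ["VERB"]),
    "the": ("the", ["DET"]),
}


def split_sentences(text):
    """Split text at whitespace that follows '.', '!' or '?' (a naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Split a sentence into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def analyze(token):
    """Return (lemma, candidate categories) for a token.

    Tokens missing from the toy lexicon keep their lowercased form
    and receive the placeholder category "UNKNOWN".
    """
    return LEXICON.get(token.lower(), (token.lower(), ["UNKNOWN"]))


def run_pipeline(text):
    """Chain the stages: sentences -> tokens -> (token, lemma, categories)."""
    return [
        [(tok, *analyze(tok)) for tok in tokenize(sentence)]
        for sentence in split_sentences(text)
    ]
```

Note that "documents" comes back with two candidate categories (noun and verb). Choosing between them from context is the job of the morpho-syntactic disambiguation stage, which this sketch deliberately leaves out.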