Multimodal Semantics Integration Using Ontologies Enhanced By Ontology Extraction And Cross Modality Disambiguation
Date
2012-03
Authors
Abu Shareha, Ahmad Adel Ahmad
Publisher
Universiti Sains Malaysia
Abstract
The increasing amount of multimodal data, such as text documents, annotated images and web pages, has necessitated the development of effective techniques for their manipulation. One of the main issues is that low-level image and textual features are commonly insufficient for effective data manipulation. Therefore, obtaining sufficient and significant information from the multimodal data, and then using this information properly, is paramount in data manipulation tasks. This thesis proposes a multimodal semantics integration (MSI) process to extract and integrate the semantics from the image and text modalities, and to use these semantics for manipulation tasks. The proposed process first extracts a textual representation from the textual and image modalities, then maps this representation to concepts in a condensed knowledge source using a semantic-based alignment sub-process. Cross modality disambiguation is then performed using semantic closeness to obtain a set of enhanced semantics. Finally, the extracted and enhanced semantics are combined to deliver rich and sufficient information based on the integrated sources. MSI was evaluated on two tasks, namely disambiguation and retrieval-with-diversity (RwD), using 20,000 multimodal instances from the ImageCLEF dataset. In the disambiguation task, MSI improved the precision of ambiguous inputs by 32% over the conventional approach while preserving recall. In the RwD task, the diversity of the obtained solution was improved by 12% over the non-diversity-based approach while maintaining accuracy. The proposed non-diversity-based approach also improved the precision of the retrieval task over the state-of-the-art approaches. Experimental results further justified the choice of each component built and used within the overall MSI process.
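The abstract does not give the thesis's actual closeness measure or knowledge source, so the following is only a minimal sketch of how cross modality disambiguation by semantic closeness might look. It assumes a toy is-a hierarchy and a simple path-based closeness score; the concept names and the functions `ancestors`, `closeness` and `disambiguate` are hypothetical and are not taken from the thesis. An ambiguous term from one modality (for example "bank" in the text) has several candidate concepts, and the candidate closest to the concepts obtained from the other modality (for example "river" from the image annotation) is selected.

```python
# Toy is-a hierarchy (child -> parent). Purely illustrative: the thesis uses
# its own condensed knowledge source, which the abstract does not detail.
PARENT = {
    "entity": None,
    "organization": "entity",
    "financial_institution": "organization",
    "bank_financial": "financial_institution",
    "natural_object": "entity",
    "landform": "natural_object",
    "bank_river": "landform",
    "body_of_water": "natural_object",
    "river": "body_of_water",
}


def ancestors(concept):
    """Return the chain from the concept up to the root, concept first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def closeness(a, b):
    """Assumed path-based closeness: 1 / (1 + edges between a and b)."""
    chain_a = ancestors(a)
    steps_from_b = {c: i for i, c in enumerate(ancestors(b))}
    # The first shared ancestor found walking up from a is the lowest one.
    for steps_from_a, c in enumerate(chain_a):
        if c in steps_from_b:
            return 1.0 / (1 + steps_from_a + steps_from_b[c])
    return 0.0


def disambiguate(candidates, context):
    """Pick the candidate with the highest total closeness to the context
    concepts supplied by the other modality."""
    return max(candidates, key=lambda c: sum(closeness(c, x) for x in context))


if __name__ == "__main__":
    text_candidates = ["bank_financial", "bank_river"]  # senses of "bank"
    image_concepts = ["river"]                          # from the image side
    print(disambiguate(text_candidates, image_concepts))
```

A path-based score is used here only because it is easy to follow; any other closeness measure defined over the knowledge source could be substituted in `closeness` without changing `disambiguate`.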
Keywords
Significant information from the multimodal data is paramount in data manipulation tasks