Fuzzy Web Data Tables Integration Guided by An Ontological and Terminological Resource

By | 07/01/2014

ABSTRACT:

We propose, in this paper, a model for an Ontological and Terminological Resource (OTR) dedicated to the task of n-ary relations annotation in Web data tables. This task relies on the identification of the symbolic concepts and the quantities, defined in the OTR, which are represented in the table’s columns. We propose to guide the annotation by an OTR because it allows a separation between the terminological and conceptual components and allows dealing with abbreviations and synonyms which could denote the same concept in a multilingual context. The OTR is composed of a generic part to represent the structure of the ontology dedicated to the task of n-ary relations annotation in data tables for any application and of a specific part to represent a particular domain of interest. We present the model of our OTR and its use in an existing method for semantic annotation and querying of Web tables.

SYSTEM ANALYSIS:

EXISTING SYSTEM:

Large domain ontologies are emerging from collaborative efforts in the Life Sciences. Their main aim is to achieve interoperability among different research resources by assuming a common conceptualization. These resources mainly consist of both domain ontologies and terminological resources (e.g. thesauri), which allow researchers to process, store and share the ever-increasing knowledge derived from their experiments. So far, these two kinds of resources have usually been kept apart, which makes their later integration a very hard task. However, some exceptions exist where the thesaurus is integrated within the ontology, e.g. the Open Biomedical Ontologies (OBO) and the Foundational Model of Anatomy (FMA) with the Terminologia Anatomica (TA).

DRAWBACKS IN EXISTING SYSTEM:

Existing approaches do not provide:

1) Recognition and representation of imprecise numerical data appearing in the cells of a data table.
2) Computation and explicit representation of the semantic distance between terms in the cells of a data table and terms of the OTR.
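
To make the first point concrete, here is a minimal sketch of how an imprecise numerical value found in a table cell could be represented as a fuzzy set. The trapezoidal shape, the class name `TrapezoidalFuzzySet` and the example values are assumptions chosen for illustration; the source does not state which representation is actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrapezoidalFuzzySet:
    """Hypothetical fuzzy set with support [a, d] and kernel [b, c]."""
    a: float
    b: float
    c: float
    d: float

    def __post_init__(self) -> None:
        if not (self.a <= self.b <= self.c <= self.d):
            raise ValueError("expected a <= b <= c <= d")

    def membership(self, x: float) -> float:
        """Degree in [0, 1] to which x belongs to the fuzzy set."""
        if x < self.a or x > self.d:
            return 0.0
        if self.b <= x <= self.c:
            return 1.0
        if x < self.b:
            # Rising edge between a and b.
            return (x - self.a) / (self.b - self.a)
        # Falling edge between c and d.
        return (self.d - x) / (self.d - self.c)


# A cell reading "about 5" (illustrative value, not from the source).
about_five = TrapezoidalFuzzySet(4.0, 5.0, 5.0, 6.0)
# A crisp interval such as "6 to 7" is the special case a == b and c == d.
interval = TrapezoidalFuzzySet(6.0, 6.0, 7.0, 7.0)

print(about_five.membership(4.5))  # 0.5
print(interval.membership(6.5))    # 1.0
```

A crisp value or interval is then just a degenerate trapezoid, so precise and imprecise cells can be stored and compared in the same way.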

EXISTING ALGORITHM:

SPARQL (SPARQL Protocol and RDF Query Language), the query language for RDF (Resource Description Framework) data.

PROPOSED SYSTEM:

The ONDINE system relies on an Ontological and Terminological Resource (OTR) which is composed of two parts: on the one hand, a generic set of concepts dedicated to the data integration task and, on the other hand, a specific set of concepts and a terminology dedicated to a given domain of application. The ONDINE system is composed of two subsystems:

1) The Web subsystem, designed to load an XML/RDF data warehouse with data tables which have been extracted from Web documents and semantically annotated using concepts from the OTR.

2) The MIEL++ subsystem, designed to query simultaneously and uniformly the local data sources and the XML/RDF data warehouse using the OTR in order to retrieve approximate answers in a homogeneous way.
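
The two-part structure of the OTR described above can be sketched as a small data model. This is only an illustration under stated assumptions: the names `Concept`, `OTR`, `GENERIC_KINDS` and `concepts_for`, as well as the example concepts and terms, are hypothetical and are not taken from ONDINE. The generic part is reduced here to the two kinds of concepts named in the abstract (symbolic concepts and quantities), and the specific part holds domain concepts together with their multilingual terms, synonyms and abbreviations.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Generic part (assumed): the kinds of concepts a table column can represent.
GENERIC_KINDS = ("symbolic", "quantity")


@dataclass
class Concept:
    """Conceptual component: one domain concept of the specific part."""
    name: str
    kind: str
    # Terminological component: language code -> terms, synonyms, abbreviations.
    terms: Dict[str, List[str]] = field(default_factory=dict)


class OTR:
    def __init__(self) -> None:
        self.concepts: Dict[str, Concept] = {}
        self._term_index: Dict[str, Set[str]] = {}

    def add_concept(self, concept: Concept) -> None:
        if concept.kind not in GENERIC_KINDS:
            raise ValueError(f"unknown concept kind: {concept.kind}")
        self.concepts[concept.name] = concept
        for terms in concept.terms.values():
            for term in terms:
                key = term.strip().casefold()
                self._term_index.setdefault(key, set()).add(concept.name)

    def concepts_for(self, term: str) -> List[str]:
        """Names of the concepts that a term, in any language, may denote."""
        return sorted(self._term_index.get(term.strip().casefold(), ()))


# Illustrative domain content (not from the source).
otr = OTR()
otr.add_concept(Concept("Temperature", "quantity",
                        {"en": ["temperature", "temp."], "fr": ["température"]}))
otr.add_concept(Concept("Food product", "symbolic",
                        {"en": ["food product", "foodstuff"],
                         "fr": ["produit alimentaire"]}))

print(otr.concepts_for("Temp."))        # ['Temperature']
print(otr.concepts_for("température"))  # ['Temperature']
```

Keeping terms separate from concepts is what lets an abbreviation in one language and a full term in another resolve to the same concept.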

ADVANTAGES IN PROPOSED SYSTEM:
1) Retrieves not only answers that exactly match the selection criteria but also semantically close answers.
2) Compares selection criteria, expressed as fuzzy sets representing preferences, with the fuzzy annotations of data tables.
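
As an illustration of the second advantage, the sketch below compares a preference expressed as a discrete fuzzy set with a fuzzy annotation, using the possibility and necessity degrees that are standard in fuzzy set theory. The source does not say which comparison measure MIEL++ uses, so treat this choice, the function names and the example values as assumptions.

```python
from typing import Dict

FuzzySet = Dict[str, float]  # element -> membership degree in [0, 1]


def possibility(preference: FuzzySet, annotation: FuzzySet) -> float:
    """Degree to which the annotation possibly satisfies the preference:
    max over x of min(preference(x), annotation(x))."""
    domain = preference.keys() | annotation.keys()
    return max(
        (min(preference.get(x, 0.0), annotation.get(x, 0.0)) for x in domain),
        default=0.0,
    )


def necessity(preference: FuzzySet, annotation: FuzzySet) -> float:
    """Degree to which the annotation certainly satisfies the preference:
    min over x of max(preference(x), 1 - annotation(x))."""
    domain = preference.keys() | annotation.keys()
    return min(
        (max(preference.get(x, 0.0), 1.0 - annotation.get(x, 0.0)) for x in domain),
        default=1.0,
    )


# Illustrative values (not from the source).
preference = {"whole milk": 1.0, "half skim milk": 0.5}
annotation = {"whole milk": 0.7, "skim milk": 1.0}

print(possibility(preference, annotation))  # 0.7
print(necessity(preference, annotation))    # 0.0
```

Here the annotation overlaps the preference, so the possibility degree is positive, but it also fully allows "skim milk", which the preference excludes, so the necessity degree is 0. Answers could then be ranked by these two degrees.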

PROPOSED ALGORITHM:

ONtology-based Data INtEgration (ONDINE), Semantic Web framework.
MIEL++ Query software.

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS:
PROCESSOR : Pentium IV 2.6 GHz or Intel Core 2 Duo
RAM : 512 MB DDR RAM
MONITOR : 15” COLOR
HARD DISK : 40 GB

SOFTWARE REQUIREMENTS:
Front End : J2EE (JSP, Servlet), Struts
Back End : MS SQL Server 2005
Operating System : Windows 7
IDE : NetBeans, Eclipse

FUTURE ENHANCEMENT:

Future work concerns improving the ONDINE system by:

1) complementing the cosine similarity measure used to compare terms with other syntactic and semantic techniques,

2) extending the semantic annotation of data tables in Web documents with the annotation of the surrounding text using the OTR, and

3) managing OTR evolution by taking into account annotation results and other ontologies.
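
For reference, the cosine similarity measure mentioned in item 1 can be sketched as follows. This is a minimal bag-of-words version that assumes whitespace tokenization and unit word weights; the weighting and any lemmatization that ONDINE actually applies are not described in this summary, and the function names are hypothetical.

```python
import math
from collections import Counter


def term_vector(term: str) -> Counter:
    """Bag-of-words vector of a term (assumed: lowercase, whitespace split)."""
    return Counter(term.lower().split())


def cosine_similarity(term_a: str, term_b: str) -> float:
    """Cosine of the angle between the word vectors of two terms."""
    va, vb = term_vector(term_a), term_vector(term_b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty term is similar to nothing
    return dot / (norm_a * norm_b)


# Illustrative terms (not from the source).
print(cosine_similarity("fresh apple", "apple"))
print(cosine_similarity("apple", "pear"))  # 0.0
```

Because this measure only sees shared words, it gives 0 for synonyms with no word in common, which is one reason to combine it with other syntactic and semantic techniques.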
