Title : Continuous extraction, edition, and annotation

Code : 2

Responsible : Orpailleur

Activities : extracting knowledge units from texts using data mining

Start Date : 2011-02-01

End Date : 2014-07-31

Objectives : This task aims to propose a process for extracting knowledge from different kinds of sources. The challenge is that, at any time, automatic (and formal) knowledge extraction methods can be invoked and run in accordance with the knowledge already stored. Conversely, humans should be able to correct or enrich the ontology at any time. Usually, humans operate after automatic knowledge extraction systems, so the operations they perform are subject to no formal constraints. Knowledge extraction is an iterative and interactive process involving several steps. Interaction is usually described as evaluation, where experts are asked to interpret and validate the patterns extracted by data mining algorithms. In fact, extracting information from texts is closer to a trial-and-error process than to a one-shot process, and expert interaction also involves looking back at previous steps, such as data selection and data preprocessing, to “tune” the data. This look-back will be performed through the annotation process. The annotation process guides the construction of the ontology, and the ontology in turn guides the annotation process.

Success criteria :

Risks :

Deliverables :

Deliverable | Description | Due (Feb 2011 + months)
D21 | Building a corpus for experimenting with continuous knowledge extraction. | 6
D22 | Integrating a system for knowledge extraction from texts into a semantic wiki: methodology, modules, and experiments. | 12
D23 | Specification of a continuous knowledge extraction system. | 18
D24 | Dynamic semantic annotation in a semantic wiki: definitions and specifications. | 36
D25 | Continuous extraction, edition and annotation in a semantic wiki; report on the advances made in this task at the end of the project. | 36

Sub-tasks :

Task21 Collecting data
Task22 Formal methods for knowledge extraction
Task23 Continuous extraction of knowledge
Task24 Semantic annotation

