Because of the wide abstraction gap between the real world and today's programming languages, these APIs carry, in addition to knowledge about their domain, a considerable amount of noise in the form of implementation detail. Furthermore, an API offers a particular view of its domain, and different APIs regard their domains from different perspectives.
In this paper we propose an approach for building domain ontologies by identifying commonalities between domain-specific APIs that target the same domain.
Besides our ontology extraction algorithm, we present a methodology for eliminating the noise and we sketch possible usage scenarios of the ontologies for program analysis and understanding. We evaluate our approach through a set of case studies on extracting domain ontologies from well-known domain-specific APIs.
- Accuracy in the description of the stages: We were interested in knowing whether the stages were described well enough to be followed easily.
- Terminology extraction: We wanted to study how terminology extraction could assist knowledge engineers and domain experts when building ontologies. We were interested in those methodologies that could offer some level of support for identifying terms.
- Generality: We needed to know how dependent the investigated methodologies are on a particular intended use. This point was of particular interest to us since our ontology was intended to serve a particular task. This parameter may be likened to the ability of the method to be applied to a different scenario, or to a different use of the ontology itself.
- Ontology evaluation: We needed to know how we could evaluate the completeness of our ontology. This point was interesting for us since we were working with agreements within the community, and domain experts could therefore agree upon errors in the models.
- Distributed and decentralized: We were interested in those methodologies that could offer support for communities such as ours, in which domain experts were not only geographically distributed but also organized in an atypical manner i.
- Usability: We had a particular interest in those methodologies for which real examples had been reported. Had the methodology been applied to building a real ontology?
- Supporting software: We were interested in knowing whether the methodology was independent of particular software.
We found that only Diligent offered community support for building ontologies, and that none of the methodologies gave detailed descriptions of knowledge elicitation or of the different steps that had to be undertaken. The methodologies mentioned above have been applied mostly in controlled environments where the ontology is deployed on a one-off basis. Tools, languages and methodologies for building ontologies have been the main research goal for many computer scientists, whereas for the bioinformatics community building an ontology is just one step in the process of developing software to support tasks such as annotation and text mining. Unfortunately, none of the methodologies investigated was designed for the requirements of bioinformatics, nor has any of them been standardised and stabilised long enough to have a significant user community i.
Theoretically, the methodologies are independent of the domain and intended use. However, none of the methodologies has been used long enough to provide evidence of its generality. They had been developed either to address a specific problem or as an end in themselves. The evaluation of the ontology remains a difficult issue to address; there is a lack of criteria for evaluating ontologies.
Within our particular scenario, the models were being built upon agreements between domain experts. Evaluation was therefore based upon their knowledge and thus could contain "settled" errors. We studied the knowledge elicitation methods described by [19], such as observation, interviews, process tracing, conceptual methods, and card sorting. Unfortunately, none of them was described within the context of ontology development in a decentralised setting. We drew parallels between the biological domain and the Semantic Web (SW). The SW is a vision in which the current, largely human-accessible Web is annotated using ontologies so that the vast content of the Web becomes available to machine processing [20].
Pinto and coworkers [21] define these scenarios as distributed, loosely controlled and evolving. Domain experts in biological sciences are rarely in one place; they tend to form virtual organizations where experts with different but complementary skills collaborate in building an ontology for a specific purpose. The structure of the collaboration does not necessarily have a central control; different domain experts join and leave the network at any time and decide on the scope of their contribution to the joint effort.
The rapid evolution of biological ontologies is due in part to the fact that ontology builders are also those who will ultimately use the ontology [22]. Some of the differences between classic proposals from Knowledge Engineering (KE) and the requirements of the SW have been presented by Pinto and coworkers [21], who summarise these differences in four key points:
- Distributed information processing with ontologies: within the SW scenario, ontologies are developed by geographically distributed domain experts willing to collaborate, whereas KE deals with centrally developed ontologies.
- Domain expert-centric design: within the SW scenario, domain experts guide the effort while the knowledge engineer assists them. There is a clear and dynamic separation between the domain of knowledge and the operational domain. In contrast, traditional KE approaches relegate the expert to the role of an informant to the knowledge engineer.
- Ontologies are in constant evolution in the SW, whereas in KE scenarios ontologies are simply developed and deployed.
- Additionally, within the SW scenario, fine-grained guidance should be provided by the knowledge engineer to the domain experts.
We consider these four points to be applicable within biological domains, where domain experts have crafted ontologies, taken care of their evolution, and defined their ultimate use.
Our proposed methodology takes into account all the considerations reported by Pinto and coworkers [21], as well as those previously studied by the knowledge representation community. A key feature of our methodology is the use of concept maps (CMs) throughout our knowledge elicitation process. CMs are graphs consisting of nodes representing concepts, connected by arcs representing the relationships between those nodes [23]. Nodes are labelled with text describing the concept that they represent, and the arcs are labelled (sometimes only implicitly) with a relationship type.
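The graph structure described above can be sketched in a few lines of Python. This is only a minimal illustration, not the representation used by any particular CM tool: the class names `Arc` and `ConceptMap`, the choice to treat arcs as directed, the use of an empty string for an implicit relationship type, and the example terms are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Arc:
    """A labelled relationship between two concepts (assumed directed here)."""
    source: str
    target: str
    label: str = ""  # assumption: empty string stands for an implicit relationship type


@dataclass
class ConceptMap:
    """A concept map: concept nodes connected by labelled arcs."""
    concepts: set[str] = field(default_factory=set)
    arcs: list[Arc] = field(default_factory=list)

    def relate(self, source: str, label: str, target: str) -> None:
        """Add an arc and register both of its end points as concepts."""
        self.concepts.update((source, target))
        self.arcs.append(Arc(source, target, label))

    def propositions(self) -> list[str]:
        """Read each arc back as a 'concept relation concept' phrase."""
        return [
            f"{arc.source} {arc.label or '(related to)'} {arc.target}"
            for arc in self.arcs
        ]


# Hypothetical terms, chosen only to exercise the structure.
cm = ConceptMap()
cm.relate("investigation", "has part", "assay")
cm.relate("assay", "", "sample")  # implicit relationship type
print(cm.propositions())
```

Reading the arcs back as short phrases mirrors how a map can be narrated to domain experts, which is one reason a plain labelled graph is enough for informal models.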
Within our development, CMs proved useful both for capturing and sharing knowledge and for formalising use cases. Figure 1 illustrates a CM. Our methodology strongly emphasises: (i) capturing knowledge, (ii) sharing knowledge, (iii) supporting needs with well-structured use cases, and (iv) supporting collaboration in distributed, decentralised environments. Figure 2 presents the steps and milestones that we envisage occurring during our ontology development process.
Step 1: The first step involves addressing straightforward questions such as: What is the ontology going to be used for? How is the ontology ultimately going to be used by the software implementation? What do we want the ontology to be aware of, and what is the scope of the knowledge we want to have in the ontology?
Step 2: When identifying reusable ontologies, it is important to focus on what any particular concept is used for, how it impacts on and relates to other concepts, how it is embedded within the process to which it is relevant, and how domain experts understand it.
It is not important to identify exact linguistic matches. By recyclability of different ontologies, we do not imply that we can indicate which other ontology should be used in a particular area or problem; instead, we mean conceptually how and when one can extrapolate from one context to another. Extrapolating from one context to another largely depends on the agreement of the community, and specific conditions of the contexts involved.
Indicating where another ontology should be used to harmonise the representation at hand, for example between geographical ontologies and the National Center for Biotechnology Information (NCBI) taxonomy, is a different issue that we refer to as reusability.
Step 3: Domain analysis and knowledge acquisition are processes by which the information used in a particular domain is identified, captured and organised for the purpose of making it available in an ontology.
This step may be seen as the 'art of questioning', since ultimately all relevant knowledge is either directly or indirectly in the heads of domain experts. This step involves the definition of the terminology, i. This starts with the identification of those reusable ontologies and terminates with the baseline ontology, i. We found it important to maintain the following criteria during knowledge acquisition:
- Accuracy in the definition of terms. Table 2 presents the structure of our linguistic definitions. The availability of context as part of the definition proved to be useful when sharing knowledge.
- Coherence: as CMs were being enriched, it was important to ensure the coherence of the story we were capturing. Domain experts were asked to use the CMs as a means to tell a story; consistency within the narration was therefore crucial.
- Extensibility: Our approach may be seen as an aggregation problem; CMs were constantly gaining information, which was always part of a bigger narration. Extending the conceptual model was not only about adding more details to the existing CMs, nor was it just about generating new CMs; it was also about grouping concepts into higher-level abstractions and validating these with domain experts.
Scaling the models involved the participation of both domain experts and the knowledge engineer. It was mostly done by direct interview and by confronting the models from different perspectives. The participation of new, "fresh" domain experts, as well as the intervention of experts from allied domains, allowed us to analyse the models from different angles. This participatory process allowed us to refactor the models by increasing the level of abstraction.
The goal determines the complexity of the process. Creating an ontology intended only to provide a basic understanding of a domain may require less effort than creating one intended to support formal logical arguments and proofs in a domain. We must answer questions such as: Why are we building this ontology? What do we want to use it for? How is it going to be used by the software layer? The subsections from "Identification of purpose, scope, competency questions and scenarios" to "Iterative building of informal ontology models" explain these steps in detail.
Step 4: Iterative building of informal ontology models helped to expand our glossary of terms and relations, their definitions or meanings, and additional information such as examples to clarify the meaning where appropriate.
Different models were built and validated with the domain experts.
Step 5: Formalisation of the ontology was the step during which the classes were constrained and instances were attached to their corresponding classes. For example: "a male is constrained to be an animal with a y-chromosome". This step involves the use of an ontology editor.
Step 6: There is no unified framework for evaluating ontologies, and this remains an active field of research. We consider that ontologies should be evaluated according to their fitness for purpose, i.
By the same token, the recall and precision of the data, and the usability of the conceptual query builder, should form the basis of the evaluation of an ontology designed to enable data retrieval.
The methodology we report herein has been applied during the knowledge elicitation phase with the European nutrigenomics community (NuGO) [24].
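For the retrieval-oriented evaluation mentioned under Step 6, precision and recall follow their standard definitions. The sketch below only illustrates the arithmetic for a single query; the function name `precision_recall`, the record identifiers, and the convention of returning 0.0 for an empty denominator are assumptions of this example rather than part of the methodology.

```python
def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    """Return (precision, recall) for one query.

    precision = |retrieved & relevant| / |retrieved|
    recall    = |retrieved & relevant| / |relevant|
    An empty denominator yields 0.0 (a convention assumed for this sketch).
    """
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical record identifiers for a single query.
retrieved = {"rec1", "rec2", "rec3", "rec4"}
relevant = {"rec2", "rec3", "rec5"}
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

In practice the set of relevant records would have to be agreed with the domain experts, so these figures inherit the same "settled" errors discussed earlier.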
Nutrigenomics is the study of the response of a genome to nutrients, using "omics" technologies such as genomic-scale mRNA expression (transcriptomics), cell and tissue-wide protein expression (proteomics), and metabolite profiling (metabolomics) in combination with conventional methods. NuGO includes twenty-two partner organisations from ten European countries, and aims to develop and integrate all facets of resources, thereby making future nutrigenomics research easier.
An ontology for nutrigenomics investigations would be one of these resources, designed to provide semantics for those descriptors relevant to the interpretation and analysis of the data. When developing an ontology involving geographically distributed domain experts, as in our case, the domain analysis and knowledge acquisition phases may become a bottleneck due to difficulties in establishing a formal means of communication i. The RSBI groups will validate the high-level abstraction against complex use cases from their domain communities, ultimately contributing to the Functional Genomics Ontology (FuGO), a large international collaborative development project [26].
Application of our methodology in this context, with geographically distributed groups, has allowed us to examine its applicability and understand the suitability of some of the tools currently available for collaborative ontology development.
Whilst the high-level framework of the nutrigenomics ontology will be built as a collaborative effort with the other MGED RSBI groups, the lower-level framework aims to provide semantics for those descriptors specific to the nutritional domain. Having defined the scope of the ontology, we discussed the competency questions with our nutrigenomics researchers (henceforth our domain experts); these were used at a later stage to help evaluate our model.
Examples of those competency questions are presented in Table 3. Competency questions are understood here as those questions for which we want the ontology to be able to provide support for reasoning and inferring processes. We consider that ontologies do not answer questions, although they may provide support for reasoning processes. Domain experts should express the competency questions in natural language without any constraint. For our particular purposes, we followed a 'top-down' approach in which experts in the biological domain work together to identify key concepts, then postulate and capture an initial high-level ontology.
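One way to picture how a model can "provide support for reasoning" behind a competency question is to check whether an expected conclusion can be inferred from it. The sketch below is a toy illustration under stated assumptions: the ontology is reduced to a set of triples, the only inference is transitive closure over an `is_a` relation, and every term, relation name and function name (`ancestors`, `supports_question`) is hypothetical rather than taken from Table 3 or from the nutrigenomics ontology.

```python
# Toy ontology as (subject, relation, object) triples; every term is hypothetical.
TRIPLES = {
    ("dietary intervention", "is_a", "intervention"),
    ("high-fat diet", "is_a", "dietary intervention"),
    ("intervention", "part_of", "investigation"),
}


def ancestors(term: str, triples: set[tuple[str, str, str]]) -> set[str]:
    """Follow is_a links transitively and return every broader term."""
    found: set[str] = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, relation, obj in triples:
            if subject == current and relation == "is_a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def supports_question(term: str, broader: str) -> bool:
    """Competency check: can the model infer that `term` is a kind of `broader`?"""
    return broader in ancestors(term, TRIPLES)


print(supports_question("high-fat diet", "intervention"))
print(supports_question("high-fat diet", "investigation"))
```

The second check fails because `part_of` is deliberately not followed, which shows how a competency question can expose a relation the model does not yet reason over.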
The Open Biomedical Ontologies project (OBO) [28, 29] was an invaluable source of information for the identification of possible orthogonal ontologies. Domain experts and the knowledge engineer worked together on this task; in our scenario, it was a process in which we focused on those high-level concepts that were part of the MGED Ontology (MO) and relevant for the description of a complete investigation. We also studied the structure that MO proposes, and by doing so came to appreciate that some concepts could be linguistically different but in essence mean very similar things.
This is an iterative process currently done as part of the FuGO project. FuGO will expand the scope of MO, drawing in large numbers of experimentalists and developers, and will draw upon the domain-specific knowledge of a wide range of biological and technical experts. We hosted a series of meetings during which the domain experts discussed the terminology and structure used to describe nutrigenomics investigations.
For us, domain analysis is an iterative process that must take place at every stage of the development process. We focused our discussions on specific descriptions about what the ontology should support, and sketched the planned area in which the ontology would be applied.
Our goal was also to guide the knowledge engineer and involve that person in a more direct manner. An important outcome from this phase was an initial consensus reached on those terms that could potentially have a meaning for our intended users. The main aim of these informal linguistic models was to build an explanatory dictionary; some basic relations were also established between concepts. Additionally, we studied different elicitation experiences with CMs, such as those reported in [31, 32].
CMs were used in two stages of our process: capturing knowledge, and testing the representation. Initially we started to work with informal CMs; although they are not computationally enabled, for a human they appear to have greater utility than other forms of knowledge representation such as spreadsheets or word processor tables. Using CMs, our domain experts were able to identify and represent concepts, and declare relations among them.
We used CMAP-tools version 3.