The Information Management Group
The Information Management Group (IMG) conducts research into the design, development and use of data and knowledge management systems. Such research activities are broad in nature as well as scope, including basic research on models and languages that underpins activities on algorithms, technologies and architectures. Challenging applications motivate and validate our research, in particular the Semantic Web and e-Science.
IMG Seminar 18th January 2012
Non-Visual Interaction with Graphs
Presenter: Andy Brown
Time: 13.00 18th January 2012
Venue: 2.19, Kilburn Building
Abstract
Node-arc diagrams are widely used, in situations ranging from technical documents to information for the general public; the London Underground map is a classic example. This is because they are thought to offer several benefits over alternative representations: "a picture is worth a thousand words". For the blind or visually impaired, however, they are inaccessible; although screen readers offer access to text documents, enabling non-visual access to more complex, non-linear information is much more difficult. In this talk, I shall describe how we tackled this problem, exploring the benefits that diagrams offer sighted users, and how annotation can be used to recreate those benefits in an audio interface. Example graphs include molecular structure diagrams, logic circuits and family trees. A demo is currently looking unlikely, but I'll try!
IMG Seminar 19th October 2011
Title: ISWC/CIKM 2011 Papers
Venue: 13.00 Wednesday 19th October 2011 in Atlas 1, Kilburn Building
This seminar is a series of presentations of papers that have been accepted for ISWC or CIKM. People are free to come and go and attend the presentations that they find interesting.
- 1:00 pm - Samantha Bail
- "The Justificatory Structure of the NCBO BioPortal Ontologies"
- Accepted for ISWC2011
- Current ontology development tools offer debugging support by presenting justifications for entailments of OWL ontologies. While these minimal subsets have been shown to support debugging and understanding tasks, the occurrence of multiple justifications presents a significant cognitive challenge to users. In many cases even a single entailment may have many distinct justifications, and justifications for distinct entailments may be critically related. However, it is currently unknown how prevalent significant numbers of multiple justifications per entailment are in the field. To address this lack, we examine the justifications from an independently motivated corpus of actively used biomedical ontologies from the NCBO BioPortal. We find that the majority of ontologies contain multiple justifications, while also exhibiting structural features (such as patterns) which can be exploited in order to reduce user effort in the ontology engineering process.
- 1:30 pm - Eleni Mikroyannidi
- "Inspecting regularities in ontology design using clustering"
- Accepted for ISWC2011
- We propose a novel application of clustering analysis to identify regularities in the usage of entities in axioms within an ontology. We argue that such regularities will be able to help to identify parts of the schemas and guidelines upon which ontologies are often built, especially in the absence of explicit documentation. Such analysis can also isolate irregular entities, thus highlighting possible deviations from the initial design. The clusters we obtain can be fully described in terms of generalised axioms that offer a synthetic representation of the detected regularity. In this paper we discuss the results of the application of our analysis to different ontologies and we discuss the potential advantages of incorporating it into future authoring tools.
- 2:00 pm - Chiara Del Vescovo
- "Decomposition and Modular Structure of BioPortal Ontologies"
- Accepted for ISWC2011
- We present the first large scale investigation into the modular structure of a substantial collection of state-of-the-art biomedical ontologies, namely those maintained in the NCBO BioPortal repository. Using the notion of Atomic Decomposition, we partition BioPortal ontologies into logically coherent subsets (atoms), which are related to each other by a notion of dependency. We analyze various aspects of the resulting structures, and discuss their implications on applications of ontologies. In particular, we describe and investigate the usage of these ontology decompositions to extract modules, for instance, to facilitate matchmaking of semantic Web services in SSWAP (Simple Semantic Web Architecture and Protocol). Descriptions of those services use terms from BioPortal so service discovery requires reasoning with respect to relevant fragments of ontologies (i.e., modules). We present a novel algorithm for extracting modules from decomposed BioPortal ontologies which is able to quickly identify atoms that need to be included in a module to ensure logically complete reasoning. Comparing to existing module extraction algorithms, it has a number of benefits, including improved performance and the possibility to avoid loading the entire ontology into memory. The algorithm is also evaluated on BioPortal ontologies and the results are presented and discussed.
- 2:30 pm - Rafael Goncalves
- "Categorising Logical Differences Between OWL Ontologies"
- Accepted for CIKM2011
- The analysis of changes between OWL ontologies (in the form of a diff) is an important service for ontology engineering. A purely syntactic analysis of changes is insufficient to distinguish between changes that have logical impact and those that do not. The current state of the art in semantic diffing ignores logically ineffectual changes and lacks any further characterisation of even significant changes. We present a diff method based on an exhaustive categorisation of effectual and ineffectual changes between ontologies. In order to verify the applicability of our approach we apply it to 88 versions of the National Cancer Institute (NCI) Thesaurus (NCIt), and demonstrate that all categories are realized throughout the corpus. Based on the outcome of the NCIt study we argue that the devised categorisation of changes is helpful for ontology engineers and their understanding of changes carried out between ontologies.
IMG Seminar 26th September 2011
Title: Nailing jellyfish to the wall
Presenter: Richard White, Cardiff University
Venue: 10.30 Monday 26th September 2011 in Atlas 2, Kilburn Building
Abstract:
In this seminar I shall investigate some issues which arise with the use of unique identifiers in practical situations. Their uses to provide labels which can be used to refer to real-world objects and to database records are well understood, and they form the basis by which humans handle information and communicate with each other. In recent decades various software systems on the Internet have joined in this conversation, and at present much attention is given to the use of identifiers in Semantic Web applications such as Linked Data. I will mainly address the connection between the identifiers and the items they denote, rather than the ways in which information about the items is to be interpreted using RDF, ontologies, etc.
The assumption is that an identifier acts as a way of nailing together different pieces of information about the same item. This is useful to people who want to find relevant information which may be distributed in various sources. But what happens if the identity of the object or concept denoted by a unique identifier is doubtful, ambiguous or subject to gradual or sudden change? It is important that the information being connected by the identifier is actually consistent. Humans can cope with this to some extent, but, as usual, computers require careful instruction to avoid misunderstandings. Examples will be drawn from the area of biodiversity informatics, but similar problems exist with geographical and medical data and in bioinformatics. I will discuss some ways in which these issues are being addressed in research projects involving the cataloguing of biodiversity, an area of interest of colleagues at Cardiff University and elsewhere. I expect I shall ask more questions than I answer, so you are welcome to come along and contribute more questions or even answers.
Note unusual time, day and venue.
IMG Seminar 18th August 2011
Title: Cool things around ontology engineering
Presenter: Tommie Meyer
Venue: 13.00 Thursday 18th August 2011 in Atlas 1, Kilburn Building
Abstract:
TBA
Note unusual day.
IMG Seminar 29th June 2011
Title: Analysing the Evolution of the NCI Thesaurus
Presenter: Rafael Goncalves
Venue: 13.00 29th June 2011 in Atlas 1, Kilburn Building
Abstract:
The National Cancer Institute (NCI) Thesaurus (NCIt) is a biomedical ontology which has been developed for over a decade. We collected all versions of the NCIt available in OWL format since 2003 (88 versions), and conducted a cross-sectional study on this corpus to investigate and characterize the evolution of the NCIt. In particular, we gathered and analyzed various axiom and entity statistics, as well as reasoner performance over the corpus. Additionally, we extracted two complete sets of pairwise, consecutive diffs: the first set was generated by a purely syntactic difference analysis (based on OWL's notion of "structural equivalence"); for the second set, we also checked whether the additions or removals changed the set of entailments between versions. We discovered what seems to be a fairly high level of "merely syntactic" removals and additions. We develop a categorization of such changes based on a heuristic inference of the intent of the change. As a result, not only do we get a rich, purely analytic characterization of the change history of the NCIt, but also we generate a realistic test corpus for incremental classification.
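As a rough illustration of the "pairwise, consecutive diffs" mentioned in the abstract, the sketch below computes additions and removals between successive versions. It is not the method used in the study: it assumes each version is simply a set of axiom strings, whereas OWL structural equivalence is defined over parsed axioms, and it does not attempt the entailment check used for the second set of diffs. The function names and the toy axioms are hypothetical.

```python
from typing import FrozenSet, List, Sequence, Tuple

# Assumption: an ontology version is modelled as a set of axiom strings.
Axioms = FrozenSet[str]
Diff = Tuple[FrozenSet[str], FrozenSet[str]]  # (additions, removals)


def syntactic_diff(old: Axioms, new: Axioms) -> Diff:
    """Return the axioms added to and removed from `old` to obtain `new`."""
    return new - old, old - new


def consecutive_diffs(versions: Sequence[Axioms]) -> List[Diff]:
    """Diff each version against its immediate successor."""
    return [syntactic_diff(a, b) for a, b in zip(versions, versions[1:])]


# Toy corpus of three versions (hypothetical axioms).
v1 = frozenset({"A SubClassOf B"})
v2 = frozenset({"A SubClassOf B", "B SubClassOf C"})
v3 = frozenset({"B SubClassOf C"})

for additions, removals in consecutive_diffs([v1, v2, v3]):
    print("added:", sorted(additions), "removed:", sorted(removals))
```

A corpus of n versions yields n - 1 such diffs; deciding whether an addition or removal is "merely syntactic" would additionally require a reasoner, which this sketch leaves out.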