Text Mining For Drug Discovery

Drug development is an information-intensive effort that relies heavily upon the ability to expeditiously sift through large quantities of external and internal information sources. As such, it provides one of the most compelling opportunities for the use of text mining tools and Semantic Web technology for large-scale knowledge acquisition from the literature. In fact, several examples of text mining for database development and application to knowledge mining already exist. Two use cases of...

Business Object Model Design

The business object model for the above clinical decision support rule could be specified as follows.

    method has_genetic_test_result() StructuredTestResult
    method has_liver_panel_result() LiverPanelResult
    method has_contraindication() set of string
    method has_therapy() set of string
    method has_allergy() set of string
    method indicates_disease() Disease
    method identifies_mutation() set of string
    method evidence_of_mutation(string) real

The model describes patient state information by...
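As an illustration only, the method signatures listed above can be sketched as Python classes. The excerpt does not say which class owns which method, so the grouping here is an assumption: the five `has_*` methods are placed on a hypothetical `Patient` class and the three mutation and disease methods on `StructuredTestResult`. The fields, the default evidence value of 0.0, and the sample data are likewise invented for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Disease:
    name: str


@dataclass
class LiverPanelResult:
    # Placeholder type: the excerpt does not list its fields.
    values: Dict[str, float] = field(default_factory=dict)


@dataclass
class StructuredTestResult:
    disease: Disease
    # Maps each identified mutation to a real-valued evidence score.
    mutation_evidence: Dict[str, float] = field(default_factory=dict)

    def indicates_disease(self) -> Disease:
        return self.disease

    def identifies_mutation(self) -> Set[str]:
        return set(self.mutation_evidence)

    def evidence_of_mutation(self, mutation: str) -> float:
        # Assumption: a mutation the test did not report has evidence 0.0.
        return self.mutation_evidence.get(mutation, 0.0)


@dataclass
class Patient:
    # Assumption: the has_* methods describe patient state.
    genetic_test_result: Optional[StructuredTestResult] = None
    liver_panel_result: Optional[LiverPanelResult] = None
    contraindications: Set[str] = field(default_factory=set)
    therapies: Set[str] = field(default_factory=set)
    allergies: Set[str] = field(default_factory=set)

    def has_genetic_test_result(self) -> Optional[StructuredTestResult]:
        return self.genetic_test_result

    def has_liver_panel_result(self) -> Optional[LiverPanelResult]:
        return self.liver_panel_result

    def has_contraindication(self) -> Set[str]:
        return self.contraindications

    def has_therapy(self) -> Set[str]:
        return self.therapies

    def has_allergy(self) -> Set[str]:
        return self.allergies


# Hypothetical patient record used only to exercise the model.
patient = Patient(
    genetic_test_result=StructuredTestResult(
        disease=Disease("example disease"),
        mutation_evidence={"MUT-1": 0.9},
    ),
    allergies={"example allergen"},
)
result = patient.has_genetic_test_result()
print(result.indicates_disease().name, result.evidence_of_mutation("MUT-1"))
```

A decision support rule would then be written against these accessors rather than against the underlying record format.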

Architecture For Translational Medicine

In the previous section, we presented an analysis of some information requirements for translational medicine. The information items have multiple stakeholders and are required or generated by different application and software components. In this section, we build upon this analysis and present architectural components required to support a cycle of learning from and translation of innovation into the clinical care environment. Table 12-1. Information Requirements...

Conclusion And Future Directions

Semantic Web (RDF) database technologies have been maturing over the past several years. The two use cases (LinkHub and YeastHub) presented in this chapter show that RDF data warehouses can be built to serve some practical data integration needs in the life science domain. While the relational database is the predominant form of database in use in life sciences today, it has the following limitations that can be addressed by the RDF database technology. While a relational schema can be exposed...

LinkHub

LinkHub can be seen as a hybrid approach between a data warehouse and a federated database. Individual LinkHub instantiations are a kind of mini, local data warehouse of commonly grouped data, which can be connected to larger major hubs in a federated fashion. Such a connection is established through the semantic relationship among biological identifiers provided by different databases. A key abstraction in representing biological data is the notion of unique identifiers for biological entities...
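The idea of connecting databases through relationships among their identifiers can be pictured as a graph walk. The following is a minimal sketch, not LinkHub's actual data model: identifiers are nodes, each relationship is a labeled edge, and relationships are treated as symmetric (an assumption). The class name `IdentifierGraph`, the relation labels, and the identifiers such as `dbA:P1` are all hypothetical.

```python
from collections import defaultdict, deque


class IdentifierGraph:
    """Toy graph of biological identifiers linked by named relationships."""

    def __init__(self):
        # identifier -> set of (relation, neighbouring identifier)
        self._edges = defaultdict(set)

    def relate(self, left, right, relation):
        # Assumption: every relationship can be followed in both directions.
        self._edges[left].add((relation, right))
        self._edges[right].add((relation, left))

    def reachable(self, start):
        """Return every identifier connected to `start` by any chain of links."""
        seen = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for _relation, neighbour in self._edges.get(current, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        seen.discard(start)
        return seen


graph = IdentifierGraph()
graph.relate("dbA:P1", "dbB:G7", "synonym")
graph.relate("dbB:G7", "dbC:S3", "related")
graph.relate("dbD:X9", "dbD:X10", "synonym")
print(sorted(graph.reachable("dbA:P1")))
```

Starting from one identifier, the walk collects the identifiers that other databases use for the same or related entities, which is the kind of cross-database connection the paragraph describes.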

Data Integration In Bioinformatics

The amount of available data in the life sciences increases rapidly and so does the variety of data formats used. Bioinformatics has a tradition of legacy text-based data formats and databases such as UniProt [2] for protein sequences, PDB [3] for 3D structures of proteins, or PubMed [4] for scientific literature. Today, many databases, including the above, are available in Extensible Markup Language (XML, www.w3.org/XML). Due to its hierarchical structure, XML is a flexible data format. It is a...
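To make the point about hierarchical structure concrete, here is a small sketch that parses an XML record with Python's standard `xml.etree.ElementTree` module and lists each leaf value with its path from the root. The record, its element names, and the identifier `EX0001` are invented for illustration and do not follow the schema of any real database.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; not the format of UniProt, PDB, or PubMed.
RECORD = """
<entry id="EX0001">
  <protein>
    <name>Example protein</name>
    <organism>Example organism</organism>
  </protein>
  <sequence length="5">MKTAY</sequence>
</entry>
"""


def leaf_paths(element, prefix=""):
    """Yield (path, text) for every element that has no child elements."""
    path = f"{prefix}/{element.tag}"
    children = list(element)
    if not children:
        yield path, (element.text or "").strip()
    for child in children:
        yield from leaf_paths(child, path)


root = ET.fromstring(RECORD.strip())
fields = dict(leaf_paths(root))
for path, text in fields.items():
    print(path, "=", text)
```

Nesting lets a record grow new sub-elements without disturbing existing paths, which is one reason the format is considered flexible.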

Querying Individual Descriptions

The open-world assumption also affects how queries about individual descriptions are answered. Besides the basic inference services for individual descriptions, some OWL DL reasoners also support query answering with functionality similar to DBs. Again, query answering about OWL DL individual descriptions might involve reasoning, in contrast to standard DBs, where query answering mostly involves table look-ups. One of the currently most advanced query languages [26], called nRQL (New RacerPro...
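The difference between the two assumptions can be shown with a toy membership query. This sketch is neither nRQL nor a DL reasoner: it only contrasts a closed-world look-up, where a missing fact counts as false, with an open-world answer, where a missing fact is unknown unless it has been explicitly ruled out. The function names, individuals, and class names are hypothetical.

```python
from typing import Optional, Set, Tuple

Assertion = Tuple[str, str]  # (individual, class)


def closed_world_is_member(positive: Set[Assertion],
                           individual: str, cls: str) -> bool:
    # Database-style look-up: anything not stored is taken to be false.
    return (individual, cls) in positive


def open_world_is_member(positive: Set[Assertion], negative: Set[Assertion],
                         individual: str, cls: str) -> Optional[bool]:
    # Open world: True or False only when asserted; otherwise unknown (None).
    if (individual, cls) in positive:
        return True
    if (individual, cls) in negative:
        return False
    return None


positive = {("protein1", "Enzyme")}
negative = {("protein2", "Enzyme")}

print(closed_world_is_member(positive, "protein3", "Enzyme"))
print(open_world_is_member(positive, negative, "protein3", "Enzyme"))
```

A real reasoner would also derive memberships that follow from class definitions, which is why query answering there can involve inference rather than look-up alone.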

Biological Ontologies

Patrick Lambrix, He Tan, Vaida Jakoniene and Lena Strömbäck
Department of Computer and Information Science, Linköpings Universitet, Sweden

Abstract: Biological ontologies define the basic terms and relations in biological domains and are being used, among other things, as a community reference, as the basis for interoperability between systems, and for search, integration and exchange of biological data. In this chapter we present examples of biological ontologies and ontology-based knowledge, show how...

The Taverna Provenance Architecture

The Taverna workbench and the Experiment environment collect experiment provenance from the scientist via plug-in tools. The basic architecture for gathering and processing workflow provenance is shown in Figure 16-12. The Taverna workflow enactor produces workflow and knowledge provenance via a Provenance Capture plug-in that sits on the Experiment Interoperation Bus (Figure 16-7) and listens to events generated by the enactor. Data products, gathered from databases or newly computed, arising...

Ontology Accreditation Certification Maturity Model

Once validation, verification, and evaluation of ontologies become standard practice, a further evolution toward more rigor is to issue accreditation or certification (to a given ontology, to a team of ontology developers, or to an organization) based on a set of recognized evaluation criteria, either by an accrediting body (top-down) or through an accrediting process (bottom-up) similar to the trustworthiness, reputation, and feedback mechanisms of online services and communities such as eBay and Amazon [21]....

Conclusion

In this article we have discussed visualization models for OWL, focusing on the GrOWL visualization model. We have used the following tentative criteria for the performance of OWL visualization frameworks:

1. Sufficient completeness and simplicity to provide a readable rendering of all or almost all elements of Roger L. Costello's camera ontology [13] and other similar-sized ontologies on a 640 by 800 canvas.
2. Support for separate views of the class definitions, the named class hierarchy, and the...

DLG2: A Visual Language For

In contrast to an OVT, which approaches the human-machine barrier through careful arrangement of nodes and edges within a given space, a VOL achieves this through formal pictorial representation of the language constructs. However, just as it is difficult to clearly define what a natural language is, it is also difficult to define precisely what makes a VOL. For instance, although a layout algorithm does not care what kind of shape should be used to represent a given node or edge, an OVT...

Tool support

Prova, because of its relative youth, has almost no editing or debugging tool support. XPath is simple enough to be written with a plain text editor. However, it is strongly recommended to use specialized editors for XQuery. Mature tools exist for several software platforms, with editing support, validation, and debugging functionality. Xcerpt is accompanied by a visual query authoring and execution tool called visXcerpt. It features a web-based graphical interface, running...

Applying OWL Reasoning To Genomic Data

Katy Wolstencroft (1), Robert Stevens (1) and Volker Haarslev (2)
(1) School of Computer Science, University of Manchester, UK. (2) Department of Computer Science and Software Engineering, Concordia University, Canada

Abstract: The core part of the Web Ontology Language (OWL) is based on Description Logic (DL) theory, which has been investigated for more than 25 years. OWL reasoning systems offer various DL-based inference services such as (i) checking class descriptions for consistency and automatically...

Query Languages and Inference Mechanisms

The primary advantage in adopting semantic web data and knowledge representation schemes is that they enable query processing and reasoning capability that can address various requirements such as data integration, decision support and knowledge maintenance discussed earlier in this chapter. The expressivity and performance of query processing and inference mechanisms will play a critical role in enabling novel healthcare and life science applications. Some interesting issues that need to be...

Normalization and Grounding

Normalization needs to decide on a canonical name for each entity, like a protein or an organism. Since the ontology encodes information about e.g. scientific names for organisms, a corresponding normalized entry can often be uniquely determined with a simple lookup. In case of abbreviations, however, finding the canonical name usually involves an additional disambiguation step. For example, if we encounter E. coli in a text, it is first recognised as an organism from the pattern species...
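The look-up and disambiguation steps can be sketched as follows. This is an illustrative approximation, not the chapter's method: the name list stands in for the ontology, the abbreviation pattern (a capital initial, a period, and a species epithet) is an assumption, and the disambiguation rule (prefer the candidate whose genus appears in the surrounding text) is a simple heuristic chosen for the example.

```python
import re
from typing import List, Optional

# Stand-in for scientific names encoded in an ontology.
SCIENTIFIC_NAMES = ["Escherichia coli", "Entamoeba coli", "Homo sapiens"]

# Assumed abbreviation pattern: genus initial, period, species epithet.
ABBREVIATION = re.compile(r"^([A-Z])\.\s+([a-z]+)$")


def candidates(mention: str) -> List[str]:
    """Return every canonical name the mention could stand for."""
    if mention in SCIENTIFIC_NAMES:
        return [mention]  # simple look-up succeeds
    match = ABBREVIATION.match(mention)
    if not match:
        return []
    initial, species = match.groups()
    return [name for name in SCIENTIFIC_NAMES
            if name.startswith(initial) and name.split()[1] == species]


def normalize(mention: str, context: str = "") -> Optional[str]:
    """Pick one canonical name, or None when the mention stays ambiguous."""
    found = candidates(mention)
    if len(found) == 1:
        return found[0]
    # Heuristic disambiguation: look for a candidate's genus in the context.
    for name in found:
        if name.split()[0] in context:
            return name
    return None


print(normalize("H. sapiens"))
print(normalize("E. coli"))
print(normalize("E. coli", context="Escherichia strains were cultured."))
```

With this toy name list, "E. coli" alone matches two organisms and stays unresolved until the context supplies the genus, which is the extra step the paragraph refers to.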

Ontological Knowledge

In addition to the ontologies there is also other publicly available ontological knowledge that can be used for data search, integration and analysis [13,14]. This knowledge includes ontology alignments (i.e. inter-ontology relationships), ontological annotations of data sources, and mappings between data values and ontological terms. Ontology alignments. As mentioned before, knowing inter-ontology relationships is a major issue and some organizations have started to address it. As a result of...

Knowledge Representation

Of importance in evaluating an ontology is the expressivity of the knowledge representation (KR) language the ontology is represented in, in light of the trade-off between the value of high expressivity and the cost of computation. Emphasis on high expressivity is manifested by First-Order Logic (FOL)-based languages such as Common Logic (CL) [18], the Interoperable Knowledge Representation for Intelligence Support (IKRIS) language [38], and the Web Ontology Language's (OWL) most expressive...

References

[1] Altman R.B., Klein T.E., Murray T., and Dunker A.K., editors. Pacific Symposium on Biocomputing, 2006, Singapore, 2006. World Scientific Publishing Co.
[2] Ashburner M., Ball C.A., Blake J.A., Botstein D., Butler H., Cherry J.M., Davis A.P., Dolinski K., Dwight S.S., Eppig J.T., Harris M.A., Hill D.P., Issel-Tarver L., Kasarskis A., Lewis S., Matese J.C., Richardson J.E., Ringwald M., Rubin G.M., and Sherlock G. Gene Ontology: tool for the unification of biology. Nature Genet., 25:25-29, 2000.
[3]...

Beyond XML

It is not yet clear whether XML will eventually become the universal format for data exchange. Relational databases, flat files, and other idiosyncratic formats might persist and limit, in practice, the applicability of pure XML query languages. We have shown how practical Prova is for assembling workflows involving heterogeneous sources of data. Prova is also able to delegate XML processing tasks to XQuery, which itself has a Java implementation based on the Saxon library (http://saxon.sourceforge.net...

Viewing Structure

Graph drawing techniques study the general constraints of geometrical representation of nodes and edges. Given a set of nodes and edges, a graph drawing program must compute the position of nodes and edges, satisfying a set of physical (e.g., display resolution) and psychological (e.g., the aesthetic rules) conditions [15-17]. Because the RDF model is, itself, based on a graph model, graph drawing techniques are the logical choice for ontology visualization. Of all visual structures, the...
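As a simple illustration of computing node positions under a physical constraint, the sketch below places nodes evenly on a circle that fits inside a canvas of a given size. A circular layout is chosen here only because it is short and easy to check; it is not claimed to be the technique used by any tool discussed in the text. The function name, the margin parameter, and the node labels are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def circular_layout(nodes: List[str], width: float, height: float,
                    margin: float = 20.0) -> Dict[str, Tuple[float, float]]:
    """Place nodes evenly on a circle that stays inside the canvas."""
    center_x, center_y = width / 2, height / 2
    # The radius is limited by the smaller canvas dimension (physical constraint).
    radius = min(width, height) / 2 - margin
    positions = {}
    for index, node in enumerate(nodes):
        # Equal angular spacing is a basic aesthetic rule: no two nodes overlap.
        angle = 2 * math.pi * index / len(nodes)
        positions[node] = (center_x + radius * math.cos(angle),
                           center_y + radius * math.sin(angle))
    return positions


nodes = ["Camera", "Lens", "Body", "Viewfinder"]
edges = [("Camera", "Lens"), ("Camera", "Body"), ("Body", "Viewfinder")]
positions = circular_layout(nodes, width=640, height=800)

# Each edge is drawn as a straight segment between its two endpoints.
for source, target in edges:
    print(source, positions[source], "->", target, positions[target])
```

More elaborate algorithms add further aesthetic criteria, such as minimizing edge crossings, on top of the same basic task of assigning coordinates.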

Syntax

OWL DL provides an abstract syntax and an RDF/XML syntax, as well as a mapping from the abstract syntax to the RDF/XML syntax [46]. In this sub-section, we will introduce the abstract syntax of OWL DL, which is heavily influenced by frames in general and by the design of OIL in particular. The abstract syntax is important because the model-theoretic semantics of OWL DL to be introduced in the next sub-section is based on it. It is important to note that not all valid RDF/XML statements are valid...

Case Study: ProteinBrowser

Biological databases are growing rapidly. Currently, much effort is spent on annotating these databases with terms from controlled, hierarchical vocabularies such as the Gene Ontology. It is often useful to be able to retrieve all entries from a database that are annotated with a given term from the ontology. The ProteinBrowser use-case shows how one typically needs to join data from different sources. The starting point is the Gene Ontology (GO), from which a hierarchy of terms is...
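The retrieval task can be sketched as a join between a term hierarchy and an annotation table. The example below assumes that a query for a term should also return entries annotated with any of its more specific terms, which is a common convention for hierarchical vocabularies but is not stated in the excerpt. The term identifiers (`T1` to `T5`) and entry identifiers (`P1` to `P4`) are made up and are not real GO or protein identifiers.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical hierarchy: child term -> parent terms.
IS_A: Dict[str, List[str]] = {"T2": ["T1"], "T3": ["T1"], "T4": ["T2"]}

# Hypothetical annotation table: database entry -> annotated terms.
ANNOTATIONS: Dict[str, Set[str]] = {
    "P1": {"T4"},
    "P2": {"T3"},
    "P3": {"T1"},
    "P4": {"T5"},
}


def term_and_descendants(term: str, is_a: Dict[str, List[str]]) -> Set[str]:
    """Return the term together with every term below it in the hierarchy."""
    children = defaultdict(set)
    for child, parents in is_a.items():
        for parent in parents:
            children[parent].add(child)
    found, stack = {term}, [term]
    while stack:
        current = stack.pop()
        for child in children.get(current, ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def entries_annotated_with(term: str, is_a: Dict[str, List[str]],
                           annotations: Dict[str, Set[str]]) -> List[str]:
    """Join the hierarchy with the annotation table."""
    terms = term_and_descendants(term, is_a)
    return sorted(entry for entry, annotated in annotations.items()
                  if annotated & terms)


print(entries_annotated_with("T1", IS_A, ANNOTATIONS))
print(entries_annotated_with("T2", IS_A, ANNOTATIONS))
```

In practice the hierarchy and the annotations usually live in different sources, so the join shown here in memory is where the data integration effort goes.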

YeastHub

YeastHub features the construction of an RDF-based data warehouse, implemented using Sesame, for integrating a variety of yeast genome data. This allows yeast researchers to seamlessly access and query multiple related data sources to perform integrative data analysis in a much broader context. The system consists of the following components: registration, data conversion, and data integration. The registration component allows the user to register a Web-accessible dataset so that it can be used by YeastHub....

UMLS and Discovery Systems

We and others have pioneered the integration of genomic databases with ontology-anchored clinical databases. Since clinical decision support systems like Quick Medical Reference (QMR) [76] contain densely coded descriptions of diseases, we hypothesized that they can be used as a proxy for clinical databases in genetic studies. To unveil systems biology properties of phenotypes through genome-scale clustering analysis of phenotypes associated with diseases, we conducted two studies with QMR....