WP VI: Cross-disciplinary applications
The general approach for each of the four
cross-discipline research programs in this Work Package is (1) to evaluate
known techniques for modelling information and services in each domain, (2) to
analyse the strengths and weaknesses of these techniques relative to the
characteristics of the domains, (3) to feed these results into the four basic
research work packages, (4) to search for common improvements in methods and
techniques, (5) to develop a common platform for development of information and
services, and (6) to relate the new model and approach to contemporary industrial
practice.
Goals: The aim is to choose, for each of the cross-discipline research projects
in health informatics, bio-informatics, and learning with ICT, different issues
from the basic research components (language technology, conceptual modelling,
information service engineering, information resource management, and Web
application engineering), and to concentrate on and elaborate these issues as
described above. This will provide a focus for knowledge exchange, as well as an
objective test bed for the approach and
techniques supporting ontology and Web service engineering.
This work package contains the following tasks.
Task VI.1: In health informatics, the concentration will be on
information modelling / ontologies related to
electronic patient records, and on workflow modelling and user interface design,
in order to provide better work support for health workers.
Task VI.2: In bio-informatics, the focus is on an ontology of the
genome, to support retrieval of information from bio-informatics texts.
This will involve investigating the use of ontologies for integrating the
immense amounts of data in biological databases.
Task VI.3: In learning with ICT, the concentration is on articulation of knowledge.
Learning is seen as a process that leads to a shared model of knowledge about
the domain being studied.
Task VI.4: In information security, the concentration is on characterization of knowledge
to be protected, and on characterization of illegal information processes. Role
modelling, modelling of an actor’s knowledge of both the
information and the other actors, as well as information access rules, are of
special interest in this domain.
Task VI.5: In digital libraries the focus is on modelling the overall structure of
the libraries, as well as particular items, to enable effective retrieval and
possibly also more advanced information services on top of the libraries.
Task VI.6: The relevant language technology
issues in this context are analysis of natural language with the purpose of
meaning extraction from large text corpora, and cognitive and linguistic
methods for semantic enrichment of models.
Basic research issues: Because of the diversity of this work package,
these are discussed field by field below.
Bio-informatics:
Molecular biology is an
information-rich and web-oriented research area. Today most of the relevant
data are available through the web. This includes genome sequences, protein
sequences, protein structures, interaction data, microarray
data, publications and experimental data. Data are continuously updated, and a
large variety of small, specialised databases is the rule rather than the
exception. The research focus in bio-informatics is therefore shifting
increasingly towards the semantic web. A relevant example is the Gene
Ontology project. Organised as a consortium (http://www.geneontology.org), the
project is working on a structured, precisely defined, common, controlled vocabulary
for describing the roles of genes and gene products in any organism. The
vocabulary is organised into three categories: biological process, molecular
function and cellular component. Currently more than 10,000 terms are defined.
However, the Gene Ontology (GO) project covers only a very small part of all
vocabularies used in molecular biology, and it has been argued that GO is more of
a nomenclature or controlled vocabulary
for molecular biology than a full-fledged gene ontology. Improved solutions
and standards are strongly needed in most areas; in particular, standards
for data retrieval and integration are almost non-existent. Natural
language processing is also important, in particular for automatic data retrieval
from publications.
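To illustrate how a shared controlled vocabulary could support retrieval across several small databases, the following is a minimal sketch in Python. It is not the Gene Ontology's own data model or API: the Term and Vocabulary classes, the identifiers "T:1" to "T:3", the is_a parent links, and the two example databases are all assumptions made for illustration. Only the three category names come from the text above.

```python
from dataclasses import dataclass, field

# The three top-level categories named in the text.
CATEGORIES = {"biological_process", "molecular_function", "cellular_component"}


@dataclass
class Term:
    term_id: str   # hypothetical identifier, not a real GO accession
    name: str
    category: str
    parents: list = field(default_factory=list)  # ids of broader terms (is_a)


class Vocabulary:
    def __init__(self):
        self.terms = {}

    def add(self, term):
        if term.category not in CATEGORIES:
            raise ValueError(f"unknown category: {term.category}")
        self.terms[term.term_id] = term

    def ancestors(self, term_id):
        """Return the term itself plus all broader terms reachable via is_a."""
        seen, stack = set(), [term_id]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.terms[current].parents)
        return seen


def retrieve(vocab, annotations, query_id):
    """Find gene products annotated with query_id or with any narrower term."""
    return sorted(
        product
        for product, term_id in annotations
        if query_id in vocab.ancestors(term_id)
    )


vocab = Vocabulary()
vocab.add(Term("T:1", "catalytic activity", "molecular_function"))
vocab.add(Term("T:2", "kinase activity", "molecular_function", parents=["T:1"]))
vocab.add(Term("T:3", "nucleus", "cellular_component"))

# Annotations from two hypothetical databases that share the vocabulary.
db_a = [("geneA", "T:2"), ("geneB", "T:3")]
db_b = [("geneC", "T:1")]
hits = retrieve(vocab, db_a + db_b, "T:1")
```

Because both databases annotate their entries with the same terms, a query for the broader term "T:1" finds geneC directly and geneA through its narrower annotation, without either database knowing about the other.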
Health informatics: Health services depend on – and generate – huge
amounts of information in their operation, e.g., information about patients,
diseases, and treatments, as well as budgets, personnel information, etc. With
traditional IT support the effective generation, use, and communication of this
information has been a significant problem, yielding health services that are
sub-optimal in terms of cost vs. stakeholder satisfaction. For instance, most
doctors only spend a fraction of their time treating patients, more being spent
on administrative activities and information search. The semantic web has the
potential to create more effective IT support for the health services,
building on meta-models / ontologies for patients,
diseases, treatments, doctors’ fields of expertise etc. With systems that
understand the meaning of the information they store, it becomes easier to
retrieve the right information for a particular need. With systems that also
understand the work-context where the information is needed (e.g., having
active workflow models for doctors, nurses, and patients), it becomes possible
to support health workers in an even more timely manner, providing just the
information that is needed for every task. Such systems can also generate much
of the required information about actions taken, thus relieving health workers
of some of their administrative burdens and giving them more time for actual contact
with the patients.
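The idea of combining a workflow model with typed patient information can be sketched as follows. This is only an illustration under stated assumptions: the WORKFLOW table, the task names, the record fields, and the sample values are all hypothetical and do not describe any existing patient record system.

```python
# Hypothetical workflow model: each task lists the kinds of information it needs.
WORKFLOW = {
    "prescribe_medication": {"allergies", "current_medications", "diagnoses"},
    "schedule_surgery": {"diagnoses", "lab_results"},
}


def information_for_task(record, task):
    """Return only the record entries whose type the given task needs."""
    needed = WORKFLOW[task]
    return {kind: value for kind, value in record.items() if kind in needed}


def log_action(record, task, actor):
    """Generate the administrative entry for a completed task automatically."""
    record.setdefault("actions", []).append({"task": task, "actor": actor})


# Made-up example record; the values carry no clinical meaning.
patient_record = {
    "allergies": ["penicillin"],
    "current_medications": [],
    "diagnoses": ["pneumonia"],
    "lab_results": ["CRP elevated"],
    "billing": ["invoice 1"],
}

view = information_for_task(patient_record, "prescribe_medication")
log_action(patient_record, "prescribe_medication", actor="dr_example")
```

Here the view for the prescribing task leaves out lab results and billing data, and the action log entry is produced as a side effect of the work rather than typed in separately.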
E-learning: A key issue in pedagogy is individualization, i.e.,
adapting the teaching to the needs of various learners. In many cases, however,
IT-supported education has so far focused mostly on porting existing courses
with traditional teaching methods onto the web, just making non-individualized
teaching even more widely available. The semantic web has the potential to
enable more intelligent e-learning applications, providing
individualization without a prohibitive increase in manpower. Some preliminary
ideas:
• Make models of subjects, in terms of what knowledge the subject comprises.
• Make models of courses or teaching/learning resources, in terms of what subjects
they address (learning goals, topic matter, skills), as well as available
teaching methods.
• Make models of each student, i.e., a defined and gradually updated profile showing
her/his background knowledge, short-term and long-term learning needs,
preferences in terms of teaching methods, and constraints, e.g., in terms of
time and money. All of this may vary highly between, e.g., a full-time student
taking a full course and an industry consultant seeking an urgent update on a
specific topic.
If these representations are semantically
interoperable, it should be possible for an e-learning application to match
them to package an optimal learning process for each student, including
guidelines on how to evaluate whether the individual learning needs are being met.
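A minimal sketch of such matching, assuming the three kinds of models are available as simple Python structures, could look like this. The Resource and Student classes, the greedy selection rule, and the example topics are assumptions for illustration; a real application would need a richer model and a better optimisation strategy. The sketch also assumes that every prerequisite topic is either already known or listed among the student's needs, and that every resource takes at least one hour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    teaches: frozenset    # topics the resource addresses
    requires: frozenset   # topics assumed as background
    method: str           # e.g. "lecture" or "self-study"
    hours: int            # assumed to be positive


@dataclass
class Student:
    knows: set
    needs: set
    preferred_methods: set
    hours_available: int


def plan(student, resources):
    """Greedily pick resources that cover the student's needs within budget.

    Returns the chosen resource names in order and the needs left uncovered.
    """
    known = set(student.knows)
    remaining = set(student.needs) - known
    budget = student.hours_available
    chosen = []
    while remaining:
        candidates = [
            r for r in resources
            if r not in chosen
            and r.requires <= known      # background knowledge is in place
            and r.teaches & remaining    # covers at least one open need
            and r.hours <= budget        # fits the time constraint
        ]
        if not candidates:
            break
        # Prefer the student's teaching methods, then coverage per hour.
        best = max(
            candidates,
            key=lambda r: (
                r.method in student.preferred_methods,
                len(r.teaches & remaining) / r.hours,
            ),
        )
        chosen.append(best)
        known |= best.teaches
        remaining -= best.teaches
        budget -= best.hours
    return [r.name for r in chosen], remaining


resources = [
    Resource("RDF basics", frozenset({"rdf"}), frozenset(), "self-study", 4),
    Resource("OWL course", frozenset({"owl"}), frozenset({"rdf"}), "lecture", 10),
    Resource("OWL crash course", frozenset({"owl"}), frozenset({"rdf"}), "self-study", 3),
]
consultant = Student(set(), {"rdf", "owl"}, {"self-study"}, hours_available=8)
full_timer = Student(set(), {"rdf", "owl"}, {"lecture"}, hours_available=40)

consultant_plan, consultant_gaps = plan(consultant, resources)
full_timer_plan, full_timer_gaps = plan(full_timer, resources)
```

With these made-up resources, the consultant with eight hours is given the short self-study path, while the full-time student is routed to the lecture-based course. The set of uncovered needs returned by plan gives a simple starting point for evaluating whether the learning needs are being met.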
Information security: In current IS development, security issues are often
overlooked in the analysis phase, partly due to pressure from short lead times,
and partly because mainstream IS engineers lack the competence to use methods
for secure systems engineering, which tend to be heavyweight and require
advanced mathematical knowledge. The semantic web has the potential to address some
of these difficulties through reuse of models. For instance, one can
• Model the organization, its information / IT assets, and the security goals for these
assets (both existing assets and planned assets).
• Model threats / attacks, ranging from technically sophisticated hacker attacks through
script-kiddie attacks and misuse committed by
insiders, as well as physical sabotage and social engineering attacks.
• Model requirements that address various threats, and their
links to IS architectures, products, and design mechanisms to ensure various
levels of security.
With semantic interoperability, a development tool
should be able to match these various models to save work for the development
project, for instance by expressing security requirements more quickly and with
a higher degree of completeness than before. E.g., given an organization with
some existing and some planned information assets, what threats must be looked
into, what possible requirements can be expressed to deal with those threats,
and what possible designs / products exist to ensure such security
requirements?
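The matching described above can be sketched as a chain of lookups from assets to threats, from threats to requirements, and from requirements to design mechanisms. The following Python example is a minimal illustration only: the three lookup tables, the asset types, and all threat, requirement, and mechanism names are hypothetical stand-ins for the reusable models, not an established threat catalogue.

```python
# Hypothetical reusable models; every name below is illustrative.
THREATS_BY_ASSET_TYPE = {
    "web_server": ["script_kiddie_attack", "hacker_attack"],
    "customer_database": ["insider_misuse", "hacker_attack"],
    "server_room": ["physical_sabotage"],
}
REQUIREMENTS_BY_THREAT = {
    "script_kiddie_attack": ["apply security patches promptly"],
    "hacker_attack": ["restrict network access", "log and review access"],
    "insider_misuse": ["enforce role-based access", "log and review access"],
    "physical_sabotage": ["control physical entry"],
}
MECHANISMS_BY_REQUIREMENT = {
    "apply security patches promptly": ["patch management"],
    "restrict network access": ["firewall"],
    "log and review access": ["audit log"],
    "enforce role-based access": ["access control lists"],
    "control physical entry": ["door locks"],
}


def security_checklist(assets):
    """Map each asset (existing or planned) to threats, requirements, designs."""
    checklist = {}
    for name, asset_type in assets.items():
        threats = THREATS_BY_ASSET_TYPE.get(asset_type, [])
        requirements = []
        for threat in threats:
            for requirement in REQUIREMENTS_BY_THREAT.get(threat, []):
                if requirement not in requirements:  # avoid duplicates
                    requirements.append(requirement)
        mechanisms = sorted({
            mechanism
            for requirement in requirements
            for mechanism in MECHANISMS_BY_REQUIREMENT.get(requirement, [])
        })
        checklist[name] = {
            "threats": threats,
            "requirements": requirements,
            "mechanisms": mechanisms,
        }
    return checklist


# One existing and one planned asset of a made-up organization.
assets = {"public site": "web_server", "planned CRM": "customer_database"}
report = security_checklist(assets)
```

The value of such a tool would lie less in the lookup itself than in the shared, semantically interoperable models behind the tables, which individual projects could reuse instead of rediscovering threats and requirements from scratch.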
Digital libraries: In digital libraries it is of interest to make models
both of the overall library structure, and in more detail of the various items
in the library, to facilitate effective retrieval of information. Moreover, one
can envision that the digital libraries of the future will not only offer
information to their users, but also information-related services. The natural
medium for accessing such digital libraries will be the web, meaning that the
core technologies of the semantic web (information modelling, meta-modelling,
information service engineering) are highly relevant to this application area.
Natural language technology: Natural language is the most intuitive mode for
humans to access information. Hence there are strong couplings between the
field of natural language technology (NLT) and the semantic web. NLT can enrich the
semantic web by making web applications accessible via natural language
interfaces. On the other hand, natural language technology can also gain from
the semantic web and ontology engineering, since ontologies can act as semantic
models for natural language grammars, both for specific fields of expertise and in more
general cases. Since language changes over time, model management becomes a key
issue: the language models must be easy to update, yet older versions must
also be kept in order to understand documents written in the past. NLT
is also highly relevant for concept extraction, i.e., in the process
of establishing semantic models.
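The model management requirement, easy updates combined with retained older versions, can be illustrated with a small versioned lexicon. This is a sketch under assumptions: the VersionedLexicon class, the concept labels, and the version dates are invented for illustration and do not refer to an existing language resource.

```python
import bisect
from datetime import date


class VersionedLexicon:
    """Keeps every version of a word-to-concept mapping, keyed by start date."""

    def __init__(self):
        self._dates = []     # sorted start dates, one per version
        self._versions = []  # full lexicon snapshots, parallel to _dates

    def publish(self, valid_from, changes):
        """Add a new version: the latest snapshot plus the given changes."""
        if self._dates and valid_from <= self._dates[-1]:
            raise ValueError("versions must be published in date order")
        snapshot = dict(self._versions[-1]) if self._versions else {}
        snapshot.update(changes)
        self._dates.append(valid_from)
        self._versions.append(snapshot)

    def concept(self, word, written_on):
        """Interpret a word using the version valid when the text was written."""
        index = bisect.bisect_right(self._dates, written_on) - 1
        if index < 0:
            return None  # the document predates the first version
        return self._versions[index].get(word)


lexicon = VersionedLexicon()
# Illustrative dates and concept labels only.
lexicon.publish(date(1950, 1, 1), {"computer": "person_who_computes"})
lexicon.publish(date(1980, 1, 1), {"computer": "electronic_machine"})

old_meaning = lexicon.concept("computer", date(1960, 5, 1))
new_meaning = lexicon.concept("computer", date(2000, 5, 1))
```

Storing full snapshots keeps lookups simple at the cost of memory; for large language models, storing only the differences between versions would be the more likely design choice.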