About 30 zettabytes (30 · 10^21 bytes) of data are generated worldwide
every second — so much that over 90 % of the data in the world today
has been created in the last two years alone. Science, too, is flooded
by an ever-increasing amount of data. However, accessing the
information hidden in this massive amount of data is a challenging task
and, in science, often presents a hindrance to knowledge discovery.
One way to overcome this is good visualization, which can greatly
support both scientists and the public in exploring, understanding,
and enjoying data. In this thesis, I present three examples of
task-oriented visualization in some of the most data-rich disciplines
in science: biochemistry, healthcare, and biology.
The first example is situated in the field of biochemistry. Since the
1980s, the natural sciences have challenged educational institutions
and the media to keep society at an appropriate level of knowledge and
understanding. Drawing on the potential of infographics, graphic
design, and game motivation, I present a mnemonic card game based
on creative design to aid the learning of a special group of
biomolecules, the amino acids. Each amino acid is composed of a number
of features. These are intuitively encoded into shapes, colors, and
textures to assist our abilities in interpreting visual stimuli. This
encoding facilitates recognizing such features, grouping them, noting
relationships, and ultimately memorizing the structural formulas. The cards
translate complex molecular structures into visual formats that are
easier both to assess and to understand. The result is a unique
teaching tool that is not only subject-oriented, fun, and engaging, but also
helps students retain relevant information such as properties and for-
mulas through perceptual memory.
The second example tackles a problem from the field of healthcare.
Oral cancer has a major impact worldwide, accounting for 274 000
new cases and 145 000 deaths each year, making it the sixth most
common cancer. Developing methods for the detection of cancer in its
earliest stages can greatly increase the chances of successful
treatment. Many cancers (including oral cancer) are known to develop
through multiple steps, which are caused by certain mutations
to the genome. A recently published protocol by Hughesman et al.
(2016) describes a means of high-throughput detection of these
mutations using droplet digital PCR. However, methods for automated
analysis and visualization of this data are unavailable. In this the-
sis, I present ddPCRclust, an R package for automated analysis of
droplet digital PCR data. It can automatically analyze and visualize
data from droplet digital PCR experiments with up to four targets per
reaction in a non-orthogonal layout. Results are on par with manual
analysis but take only minutes to compute instead of hours. The
accompanying Shiny application, ddPCRvis, provides easy access to the
functionality of ddPCRclust through a browser-based graphical
user interface, enabling the user to interactively filter data and change
parameters, as well as view and modify results.
The third example involves some of the most data-rich disciplines
in biology: transcriptomics, proteomics, and metabolomics. Omics
Fusion is a web-based platform for the integrative analysis of omics
data. It provides a collection of new and established tools and visual-
ization methods to support researchers in exploring omics data, vali-
dating results, or understanding how to adjust experiments in order
to make new discoveries. It is easily extensible, and new visualization
methods are added continuously. I present an example of
task-oriented visualization of functionally annotated omics data based on
the established Clusters of Orthologous Groups (COG) database and
gene ontology (GO) terms.