This chapter provides an overview of research data management in the health sciences, focusing primarily on the sort of data curated by the European Bioinformatics Institute and similar organisations. In this field, data management is well advanced, with a sophisticated infrastructure created and maintained by the community for the benefit of all.
These advances have come about because the field has been data-intensive for many years and has been driven by the challenges biology faces.
Science in this area cannot be done on a small scale: it is effectively a collaborative effort where data must be shared for any advances to be made.
This has long been acknowledged. The Human Genome Project set the standards, because the demands of that project were so great that only a concerted effort across the whole genome science community could achieve its goal. It established new norms of scientific behaviour in this discipline and has influenced its cultural development ever since.
The human genome was decoded long ago, but today's scientific questions in the health sciences are no less challenging. The infrastructure, practices, standards and norms established in the life sciences can be viewed as good practice markers for those who wish to learn from what has gone before. Not everything practised in the life sciences will read across to other fields and disciplines, but many basic principles of research data management practice have been established that will transfer readily elsewhere. Perhaps most importantly, the life sciences have reached the stage where the issues of long-term planning, organisation and sustainability are being tackled. The answers to these questions are only partially worked out as yet, but some fundamental principles are being elucidated and these will be useful in a more general sense.