Avoiding Data Disasters: Best practices in Research Data Management for Biological Sciences
When and where?
- Friday 23rd February 2018; 12:30 - 16:30; eLearning 3 - School of Clinical Medicine
- Wednesday 12th July 2017; 13:30 - 16:30; Cancer Research UK - Cambridge Institute, University of Cambridge
Brought to you by the Cambridge University Data Champions
Description
- How much data would you lose if your laptop were stolen?
- Have you ever emailed your colleague a file named ‘final_final_versionEDITED’?
- Have you ever struggled to import your spreadsheets into R?
As a researcher, you will encounter research data in many forms, ranging from measurements, numbers and images to documents and publications. Whether you create, receive or collect data, you will certainly need to organise it at some stage of your project. This workshop will provide an overview of some basic principles on how we can work with data more effectively. We will discuss the best practices for research data management and organisation so that our research is auditable and reproducible by ourselves, and others, in the future.
Aims: During this course you will learn about:
- Options for backing up your computer
- Ideas for naming and organising your files
- Strategies for exchanging files with collaborators
- Tips and tricks to make sure that your spreadsheets are readable by programming languages such as R
- Using the OpenRefine software for data cleaning
- Preparing high-throughput biological data for submission to a public repository such as Gene Expression Omnibus (GEO) or ArrayExpress
Objectives: After this course you should be able to:
- Weigh the pros and cons of using spreadsheets and avoid the common pitfalls in spreadsheet manipulation
- Use an appropriate backup strategy for your data
- Organise your files in a more structured and consistent manner
- Know what resources are available at the University of Cambridge for Research Data Management
Materials
- Data Formatting
- Open Refine Demo
- File Management Best Practices
- Data Sharing
- Patient Data for practicals - Right-click on link and select Save Link As…
- Electronic whiteboard (Etherpad)
References
- Data Carpentry Open Refine for Ecology
- Data Carpentry Workshops
- The Data Organisation Tutorial by Karl Broman
- The Quartz guide to bad data
- Three common bad practices in sharing tables and spreadsheets and how to avoid them
- Five Selfish Reasons to work reproducibly - Florian Markowetz
- Keith Baggerly lecture on Duke reproducibility scandal
- Biologists: this is why bioinformaticians hate you…
- [Issues related to data preservation BBC Domesday Project](https://en.wikipedia.org/wiki/BBC_Domesday_Project)
- Research Data Management at Cambridge University
- Managing and sharing data, UK Data Archive
- Information Compliance - Data protection
- File management, Cornell University
- File management, Curtin University
- File organization, MIT Libraries
License
This work is licensed under the Creative Commons Attribution 4.0 International License.