Summary

In this workshop we have discussed the following topics:

strategies for naming files so that they are easy to process with computational tools and still human-readable
hierarchical data organisation, keeping raw data separate from processed data and results
documenting your data organisation and how raw data was obtained and processed
how to decide what data to back up, and the options available at King's for secure backups of data
using version control (specifically git) to manage code and track changes
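
As a reminder of how the first three points fit together, here is a minimal Python sketch. It is an illustration only: the naming pattern (ISO date, project, description, joined by underscores), the folder names and the helper functions `make_filename` and `create_project_skeleton` are assumptions made for this example, not a convention prescribed by the workshop.

```python
from datetime import date
from pathlib import Path
import re


def make_filename(day: date, project: str, description: str, ext: str) -> str:
    """Build a sortable, script-friendly file name (hypothetical pattern)."""

    def slug(text: str) -> str:
        # Lower-case the text and collapse anything that is not a letter
        # or digit into a single hyphen, so the name contains no spaces.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # ISO 8601 dates (YYYY-MM-DD) sort chronologically when sorted by name.
    return f"{day.isoformat()}_{slug(project)}_{slug(description)}.{ext.lstrip('.')}"


def create_project_skeleton(root: Path) -> list[Path]:
    """Create an assumed folder layout that keeps raw data separate."""
    folders = [
        root / "data" / "raw",        # original data, never edited by hand
        root / "data" / "processed",  # anything derived from the raw data
        root / "results",             # figures, tables, model outputs
        root / "code",                # scripts, tracked with git
    ]
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)

    # A README is one simple place to document the layout and data provenance.
    readme = root / "README.txt"
    if not readme.exists():
        readme.write_text(
            "Describe the folder layout and how the raw data was obtained and processed.\n"
        )
    return folders


if __name__ == "__main__":
    print(make_filename(date(2024, 3, 5), "My Project", "Plate reader run 1", ".csv"))
```

Separating the fields with underscores and the words within a field with hyphens means a script can recover the date, project and description with a single `split("_")`, while the name stays readable to a person.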

Exercise

Spend a few minutes thinking about a research project you're working on at the moment. Which of these things are you already doing? What could you do differently? Make notes on what project organisation or management changes would have the biggest benefits for your project.

Optionally, compare notes with your neighbour. Did you note down the same things?

Going further

Version control

There are many things you can do with git that we didn't have time to discuss today. These include using branches to experiment with different versions of your code, and using online repositories to collaborate with others. For a more detailed tutorial on how to use these git features and more, we recommend the Software Carpentry lesson on version control.

If you prefer in-person training, both e-Research and the Hub for Applied Bioinformatics have run version control courses in the past. See our training page for more information.

Publishing data

Sharing research data allows others to reproduce your research and reuse your data for other research projects. This can increase the visibility and impact of your research. The FAIR principles for research data provide guidance on making your research data Findable, Accessible, Interoperable and Reusable. More information on why and how to publish your data is available from the Library.

Publishing code

It's now common in many fields to publish the code used for a paper alongside the paper. As with publishing data, publishing code helps others reproduce and build on your work. The FAIR principles can be applied to research software as well as data.

One key consideration when publishing code is choosing a license. Without a license, it isn't clear to users how they can use and build on your code, or whether they can use it at all, so adding one makes it more likely that others will do so. Information about different code licenses and how to choose one is available at choosealicense.com.