Module 2: Essentials for Research Data#

In Module 2, we dig deeper into research data (and other research objects) and their management. You will learn to recognise different types of research data, which essential questions to ask yourself when you start implementing RDM in your project, and what your obligations at TU Delft are regarding research data and software management.

At the end of this module you will be able to:

  • Identify different types of research data

  • Recognise what is considered confidential data in research

  • Realise what RDM entails within a research project

  • Recognise the responsibilities for TU Delft PhD candidates regarding RDM

  • Store and back up the research data of your project in a secure manner

These are the activities you should complete in this module:

✅ Learn about the different types of data and what confidential data is
✅ Watch the videos in which TU Delft researchers discuss their perspectives on research data and confidential research data
✅ Look at the research cycle interactive image and the RDM questions you need to ask yourself at each step of your project
✅ Watch the video about data storage & infrastructure available at TU Delft
✅ Complete the quiz about data storage
✅ Read the TU Delft RDM framework

2.1. Data within the research workflow#

Research data definitions#

Research data is any information and/or (digital) object that has been collected, observed, generated or created to validate research findings.

Depending on the discipline you work in, research data can be collected or produced in different ways. You can capture them in real-time (e.g. sensors, images), you can collect them using laboratory instruments and they can derive from interviews or numerical simulations, among others.

Data can be classified in various ways, which is important for effective data management. The following categories provide a structured framework for understanding and working with research data:

Research Data categorised by recording medium#

Digital data: tabular data, images, videos, algorithms, scripts, transcripts, codebooks, etc.

Non-digital data: laboratory samples, sketchbooks, prototypes, etc.

Research software/code: computer programs or code developed and utilised in research activities to support data collection, analysis, modelling, or visualisation.

Research Data categorised by file format#

Research data also comes in diverse formats during data acquisition, reflecting the specific needs and characteristics of each scientific domain. The ability to access and reuse your data in the future depends on the chosen format: if the associated software or hardware is no longer in use, the data may become inaccessible. To ensure the longevity and accessibility of your research data, it is strongly recommended to use standard, exchangeable, or open file formats. 4TU.ResearchData, the partner of TU Delft, provides a list of preferred file formats for which they guarantee long-term support. The list is available on the 4TU.ResearchData website.
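As a rough illustration of the advice above, the following Python sketch lists the files in a project folder whose extension is not in a set of preferred formats. The function name `find_non_preferred` and the set `PREFERRED_EXTENSIONS` are hypothetical, and the extensions shown are placeholders only: they are not the actual 4TU.ResearchData list, which you should consult and substitute yourself.

```python
from pathlib import Path

# Illustrative placeholder only: this is NOT the 4TU.ResearchData list.
# Replace it with the preferred formats that apply to your project.
PREFERRED_EXTENSIONS = {".csv", ".txt", ".json", ".xml"}


def find_non_preferred(folder, preferred=PREFERRED_EXTENSIONS):
    """Return relative paths of files whose extension is not preferred."""
    folder = Path(folder)
    flagged = []
    for path in sorted(folder.rglob("*")):
        # Compare extensions case-insensitively (".TXT" counts as ".txt").
        if path.is_file() and path.suffix.lower() not in preferred:
            flagged.append(path.relative_to(folder).as_posix())
    return flagged
```

Calling `find_non_preferred("my_project")` on a hypothetical folder returns the files you may want to export to an open format before archiving. A file extension is only a hint about the format, so treat the result as a starting point for a manual check.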

Research Data categorised by data protection requirements#

Personal Data: This category includes data that can directly or indirectly identify individuals. Examples include names, addresses, social security numbers, etc. Personal data requires stringent privacy safeguards.

Non-Personal Data: This category comprises data that cannot identify individuals, making it less sensitive from a privacy standpoint. However, ethical and legal guidelines still apply to non-personal data.

Anonymized Data: Anonymization involves the removal of personal information so that individuals can no longer be identified. Therefore, anonymized data is often considered less sensitive and may have fewer data protection requirements.

Pseudonymized Data: Pseudonymization involves replacing identifiable information with pseudonyms or codes. While it reduces the risk of identifying individuals, anyone holding the key that links codes to identities can re-identify them. Pseudonymized data therefore generally remains subject to data protection regulations; under the GDPR it is still treated as personal data.

De-identified Data: De-identification goes a step further by removing or modifying both direct and indirect identifiers, making it extremely challenging to re-identify individuals. De-identified data is generally subject to fewer data protection requirements.
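To make the idea of pseudonymization described above more concrete, here is a minimal Python sketch that replaces a name field with a stable code. It uses a keyed hash (HMAC-SHA256) from the Python standard library; this is one possible technique, not a TU Delft-prescribed method. The function names, the `"P-"` prefix, the field name `name` and the example key are all assumptions made for illustration.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a stable pseudonym code for a direct identifier."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    # Keep a short prefix of the hex digest as a readable code.
    return "P-" + digest.hexdigest()[:12]


def pseudonymise_records(records, secret_key, id_field="name"):
    """Return copies of the records with the identifying field replaced."""
    result = []
    for record in records:
        cleaned = dict(record)  # copy so the original record is not modified
        cleaned[id_field] = pseudonymise(record[id_field], secret_key)
        result.append(cleaned)
    return result


if __name__ == "__main__":
    # Hypothetical key: in practice, store it separately from the data.
    key = b"replace-with-a-secret-stored-separately"
    participants = [
        {"name": "Participant One", "score": 7},
        {"name": "Participant Two", "score": 9},
    ]
    for row in pseudonymise_records(participants, key):
        print(row)
```

The same input always maps to the same code, so records can still be linked across files. Whoever holds the key can recompute the codes, which is why the result counts as pseudonymized rather than anonymized. Indirect identifiers (for example, a rare job title combined with a location) are not handled by this sketch.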

Test your knowledge#

We talked to some TU Delft researchers working in different disciplines and we asked them: ‘What is the research data you work with?’ Let’s see what they shared with us:

What do you consider as data in your research field?#

📽️ Sian Jones – Faculty of Civil Engineering and Geosciences (CEG)

📽️ Wirawan Agahari – TU Delft Faculty of Technology, Policy and Management (TPM)

📽️ Aerospace researcher - TU Delft

In this video, a researcher from the Faculty of Aerospace Engineering at TU Delft tells us what data she collects or uses within her research project and what she needs to take care of when working with it.

Warning

⚠️ This video is only available with a TU Delft login ⚠️. Please click on this link

Confidential Data#

(from Research Data Management TU Delft)

There are multiple types of confidential data that you might be working with during your research project. Some examples include:

  • personal data (information about an identified or identifiable natural person)

  • national security data (e.g. nuclear research)

  • data falling under export control regulations

  • confidential data received from commercial, or other external partners

  • data related to competitive advantage (e.g. patent, IP)

  • data which could lead to reputation/brand damage (e.g. climate change, personal information, animal research)

  • politically-sensitive data (e.g. research commissioned by public authorities, research on societal issues)

When working with confidential data, you need additional security measures to make sure that the data is not accidentally released.

If you work with personal data, please take the time to check these websites with relevant information to follow TU Delft policies:

In the next two videos, researchers from TU Delft tell us about the confidential data they work with and which RDM best practices they follow:

📽️ Sian Jones talking about collaboration with industry

📽️ Wirawan Agahari talking about personal data in his research

When you work with personal data at TU Delft, these are the materials to read:

2.2 Data Management in a Research project#

Relevant RDM questions within the research cycle#

In the following interactive image, you can go through a simplified cycle that represents the workflow of your project. Have a look at the RDM questions you might need to ask yourself at each step of your research.

Test your knowledge!#

2.3 Research Data Infrastructure at TU Delft#

Before you start collecting or creating data in your project, it is important to reflect on where you will store the data and how you will back it up. Selecting a storage and backup strategy will ensure that your data is safe during your research project, including in the case of unforeseen problems. Following best practices on data storage and backup can help protect against data loss and facilitate effective collaboration.
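One simple way to check that a backup is complete is to compare checksums of the original files with those of their copies. The Python sketch below illustrates this with the standard library only. The function names `sha256_of` and `verify_backup` and the folder layout are assumptions for illustration; the sketch does not describe how TU Delft storage services work.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return relative paths of files missing or different in the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        copy = backup_dir / rel
        # Flag the file if the copy is absent or its content differs.
        if not copy.is_file() or sha256_of(src) != sha256_of(copy):
            problems.append(rel.as_posix())
    return problems
```

An empty list from `verify_backup("data", "backup_of_data")` (hypothetical folder names) means every source file has an identical copy. The check is one-directional: extra files that exist only in the backup are not reported.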

In this video, we will go through the infrastructure provided centrally at TU Delft for storing, backing up and sharing research data. You should ask your supervisor if there is a preferred approach for data storage and backup within your research group/department/project, or if there are customised solutions already in place.

Test your knowledge! - Use cases on storage#

Having completed the topic on research data infrastructure, you should now be able to answer the questions in this quiz. Good luck!

2.4 The Responsibilities of PhD candidates Regarding RDM and Research Software at TU Delft#

In this section we would like to make you aware of the responsibilities of TU Delft PhD candidates regarding Research Data Management. These responsibilities are detailed in the University and Faculty Policies.

It is very important for TU Delft that researchers follow best practices on Research Data and Software Management. That is why TU Delft has been publishing policies since 2018. The policies provide a clear vision of the rules and responsibilities around Research Data and Software Management.

Click to read the Research Data framework policy 📖

This Framework policy is accompanied by Faculty-specific research data management policies, which provide more detailed requirements and guidelines for the disciplines associated with each Faculty.

Click to read Faculty Research Data Management Policies 📖

Test your knowledge!#

Research Software Policy#

At TU Delft, software is recognised as a valuable research output. Software should therefore be well documented, preserved and, whenever possible, consistent with the FAIR principles, so that it is managed as carefully as research data. The TU Delft Research Software Policy provides a clear division of roles and responsibilities. It also sets out a simplified and streamlined process to help researchers share software.

Check these resources:

TU Delft Research Software Policy 📖
TU Delft Guidelines on Research Software 📖

Question to you:

Were you already aware of any of these policies? And, now that you have looked at the different policies, are your responsibilities regarding Research Data and Software management clear to you?