Data management for thesis

Thesis and data protection

The EU General Data Protection Regulation (GDPR) also applies to the processing of personal data in theses.

The research data itself may contain personal data, but personal data may also appear in the documents needed to collect the data, such as the participants' consent forms. Even a survey intended to be anonymous can produce personal data, for example if it is carried out with an online form that saves the respondent's IP address, or if it includes open-ended questions.
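As an illustration of the online-form case, the sketch below removes technical metadata columns from a survey export before analysis. The column names (`ip_address`, `user_agent`, `submitted_from`) are hypothetical and depend on the survey tool; the function name is also an assumption made for this example. Removing a column from an export does not undo its collection, so where the tool allows it, switching off the saving of such data is the better option.

```python
import csv
import io

# Hypothetical column names; check what your survey tool actually exports.
TECHNICAL_COLUMNS = {"ip_address", "user_agent", "submitted_from"}


def strip_technical_columns(csv_text: str) -> str:
    """Return the CSV text without the technical metadata columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep every column that is not listed as technical metadata.
    kept = [name for name in (reader.fieldnames or []) if name not in TECHNICAL_COLUMNS]

    out = io.StringIO()
    # extrasaction="ignore" silently drops the columns we did not keep.
    writer = csv.DictWriter(out, fieldnames=kept, extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()
```

The function works on text rather than files so that it can be tried without touching the original export.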

The student is responsible for data protection when working on the thesis. The thesis supervisor's duty is to advise the student on data protection issues.

What is personal data?

Personal data is any information that can be used to directly or indirectly identify an individual. Research data may also contain identifying information about the research subjects' close contacts or other individuals.

Direct identifiers include a person's full name, personal identification number, and various biometric identifiers such as fingerprints, facial images, voice, and handwritten signatures.

Strong indirect identifiers are individual pieces of information that can reasonably be used to identify a person. Examples include an address, a phone number, a rare occupational title, a rare disease, and unique identifiers such as a computer's IP address, a student ID, or an account number.

Indirect identifiers are pieces of information that can identify an individual when combined. These may include gender, age, place of residence, occupational title, household composition, income, marital status, language, nationality, ethnic background, workplace, or school. When the target group of the research is already limited and reasonably small, even a few combined background details can be enough to identify individuals.
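The effect of combining indirect identifiers can be checked roughly by counting how many participants share each combination of background values. The sketch below is an illustration only: the function name, the field names, and the threshold `k` are assumptions made for this example and do not come from the GDPR or from Metropolia's instructions.

```python
from collections import Counter


def risky_combinations(rows, keys, k=5):
    """Return the value combinations shared by fewer than k participants.

    rows: list of dicts, one per participant
    keys: names of the indirect identifiers to combine
    k:    assumed threshold; the right value depends on the study
    """
    # Count how many participants share each combination of values.
    counts = Counter(tuple(row[key] for key in keys) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}
```

A combination that appears only once or twice suggests that the person behind it may be identifiable, in which case the background details can be made coarser or left out.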

Source: Finnish Social Science Data Archive

Special categories of personal data

Sensitive personal data refers to the special categories of personal data defined in the General Data Protection Regulation. These categories cover the following information about a person:

  • Racial or ethnic origin
  • Political opinions
  • Religious or philosophical beliefs
  • Trade union membership
  • Data concerning health
  • Sex life or sexual orientation
  • Genetic data, and biometric data used to identify the person

Sensitive data must be treated with special care because processing it can pose risks to individuals' fundamental rights. Its processing is therefore prohibited as a rule. There are exceptions to this prohibition, however, and one of them is the individual's explicit consent to the processing of such sensitive personal data.

Storing data containing sensitive personal information in cloud services is prohibited at Metropolia.

Minimisation of personal data

The principle of data minimisation means that you do not collect personal data you do not need. Apply this principle from the planning phase of your thesis research onwards.

  • Collect only those personal data that are necessary to answer the research questions.
  • Do not collect personal data "just in case."
  • Avoid collecting sensitive information.
  • Avoid open-ended response options in surveys, as you cannot control what participants write.
  • During personal interviews, you can ask the interviewee to refrain from providing specific details such as names or workplaces.
  • Consider how detailed the information actually needs to be. Is a category or an approximation sufficient instead of precise data? For example, instead of an exact age, use an age group such as "20-29 years old," or instead of specifying "Metropolia University of Applied Sciences," use the term "university of applied sciences."
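The last point above can be sketched in code. The example below replaces an exact age with a ten-year age group and a named institution with a general category, and keeps no other fields. The field names, the mapping table, and the fallback value "other" are hypothetical choices made for this illustration.

```python
# Hypothetical mapping from a named institution to a general category.
INSTITUTION_CATEGORY = {
    "Metropolia University of Applied Sciences": "university of applied sciences",
}


def age_group(age: int, width: int = 10) -> str:
    """Map an exact age to a band such as '20-29 years old'."""
    if age < 0:
        raise ValueError("age must be non-negative")
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1} years old"


def generalise_record(record: dict) -> dict:
    """Keep only generalised values; every other field is left out."""
    return {
        "age_group": age_group(record["age"]),
        # Institutions missing from the mapping fall back to "other" (an assumption).
        "institution_type": INSTITUTION_CATEGORY.get(record["institution"], "other"),
    }
```

Building a new record from an explicit list of fields, rather than deleting fields from the original, means that anything not deliberately chosen is dropped by default.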

 

Checklist for data privacy in the thesis

  1. Create a data management plan. Identify whether you collect and process personal data. Only collect essential personal data relevant to the research.
  2. If the data involves significant privacy risks for the participants, conduct a Data Protection Impact Assessment (DPIA). This is necessary, for example, when the data contains sensitive personal information or involves research on children. An ethical review may also be necessary.
  3. Determine who the data controller is.
  4. Prepare a privacy notice. Plan in advance how personal data will be collected, stored, processed, and disposed of, and describe this comprehensively and understandably in the privacy notice. The privacy notice is included in Metropolia's template for informing research participants.
  5. You need a legal basis for collecting and processing personal data. In the case of theses, it is usually the consent given by the research participant.
  6. Before collecting data, inform the research participants about the study and the processing of their personal data in a clear and understandable manner. Metropolia has its own template for this purpose. After providing the information, the research participant can give consent to participate in the study and allow the processing of their personal data. There is also a separate form for this.
  7. Handle personal data with care and in accordance with the information provided to the research participants. Use only Metropolia-approved tools for data collection, transfer, and storage. Data privacy breaches and negligence may lead to sanctions or rejection of the thesis.

Metropolia Library and Information Services