Scope of GDPR

Does your project involve information to which the GDPR applies?

The GDPR applies only to the 'processing' of 'personal data'. It will usually be obvious whether your project falls within the scope of the GDPR, but not always, so the two constituent elements of that phrase are considered below.

1. Are you processing?

Processing means almost anything a research team might do with personal data, including:

  • collecting it
  • holding or storing it
  • retrieving, consulting or using it
  • organising or adapting it
  • publishing, disclosing or sharing it
  • destroying it

2. Does your project involve personal data?

Personal data is information which relates to a living individual who can be identified from that information, whether directly or indirectly, and in particular by reference to an identifier. It includes, for example, a name, an identification number, location data, or an online identifier such as an IP address, as long as that information can be linked by the University to a living individual. It could also include information about an individual’s characteristics, whether physical, physiological, genetic, cultural or social.

This definition is intentionally broad, and its application to particular types of research data is considered in more detail below. Where there is any doubt, the ICO advises erring on the side of caution with regard to the interpretation of personal data and looking to the flexibility in the application of the data protection principles (see below).



The ability to identify the individual to whom the information relates is crucial to the definition of personal data. Where that individual cannot be identified, and it is not possible to re-identify the individual, the information will not constitute personal data and the duties and obligations of the GDPR will not apply.

Researchers should, however, consider whether an individual remains identifiable even after the usual identifiers have been removed. A combination of details on a categorical level (eg age, regional origin, medical condition) may allow an individual to be recognised by narrowing down the group to which they belong.

In determining whether an individual is identifiable, account should be taken of all the means reasonably likely to be used to identify that individual, whether by the research team or by any other person. While this does not include a mere hypothetical possibility, it does require consideration of the means that are likely to be used by a determined person with a particular reason to want to identify an individual.
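The point about combinations of categorical details can be illustrated with a short sketch. It counts how many records share each combination of values and reports the size of the smallest group; a group of one means that combination singles out a single record. The field names and records are hypothetical, and a small group size is only one indicator among the factors described above, not a legal test of identifiability.

```python
from collections import Counter


def smallest_group_size(records, fields):
    """Return the size of the smallest group of records that share the
    same combination of values for the given fields (0 if no records)."""
    groups = Counter(
        tuple(record[field] for field in fields)
        for record in records
    )
    return min(groups.values(), default=0)


# Hypothetical records from which names have already been removed.
records = [
    {"age_band": "30-39", "region": "North", "condition": "asthma"},
    {"age_band": "30-39", "region": "North", "condition": "asthma"},
    {"age_band": "30-39", "region": "North", "condition": "diabetes"},
    {"age_band": "70-79", "region": "South", "condition": "asthma"},
    {"age_band": "70-79", "region": "South", "condition": "asthma"},
]

# Two fields: every combination is shared by at least two records.
print(smallest_group_size(records, ["age_band", "region"]))

# Three fields: one combination now matches a single record only.
print(smallest_group_size(records, ["age_band", "region", "condition"]))
```

Adding the third field reduces the smallest group from two records to one, which is the narrowing effect described above.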



Pseudonymisation is the practice of disguising the identities of individuals to whom information relates. This usually involves the removal of common identifiers and the use of a pseudonym (often a randomly allocated number), so that data can continue to be collected about the same individual without recording their identity. Pseudonymising data can be useful in research.

Pseudonymous data can be collected in such a way that no re-identification is possible (eg one-way cryptography), in which case it is essentially anonymous data and the considerations above apply. However, it is often retraceable (eg key-coding and two-way cryptography) and therefore may be personal data.
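The difference between the two approaches can be sketched as follows. This is a minimal illustration using Python's standard library, not a recommended implementation: the function and class names are hypothetical, and the choice of HMAC-SHA256 is an assumption made for the example. Note that a keyed hash is only irreversible in practice for someone who does not hold the key; anyone with the key can recompute pseudonyms for candidate identifiers and so retrace them.

```python
import hashlib
import hmac
import secrets


def keyed_pseudonym(identifier: str, secret_key: bytes) -> str:
    """One-way approach: derive a pseudonym with a keyed hash (HMAC-SHA256).
    The same identifier and key always give the same pseudonym, so records
    about one individual can still be linked together."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


class KeyCodeTable:
    """Key-coding approach: random codes plus a lookup table.
    Whoever holds the table can trace a code back to the individual."""

    def __init__(self):
        self._code_by_identifier = {}
        self._identifier_by_code = {}

    def code_for(self, identifier: str) -> str:
        # Allocate a random code the first time an identifier is seen.
        if identifier not in self._code_by_identifier:
            code = secrets.token_hex(8)
            self._code_by_identifier[identifier] = code
            self._identifier_by_code[code] = identifier
        return self._code_by_identifier[identifier]

    def reidentify(self, code: str) -> str:
        return self._identifier_by_code[code]


# Hypothetical usage.
key = secrets.token_bytes(32)
pseudonym = keyed_pseudonym("participant@example.org", key)

table = KeyCodeTable()
code = table.code_for("participant@example.org")
print(table.reidentify(code))  # the table makes the data retraceable
```

In the key-coding case the lookup table is itself the means of re-identification, so where it is kept, and who can access it, determines whether the coded data remains personal data in a given person's hands.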

Where the researcher (or any other person operating within the University) possesses the means to identify any of the individuals to whom the information relates, that information will still constitute personal data. Where, however, the pseudonymised data is received from or supplied to third parties without the means to identify the individuals, the effectiveness of the pseudonymisation will depend on a number of factors (eg how secure it is against reverse tracing, and the size of the population in which the individual is concealed).



Aggregation is the process of combining information about many individuals into broad classes, groups or categories, so that it is no longer possible to distinguish information relating to those individuals. It follows that aggregated data should not be personal data, but the effectiveness of aggregation will depend on such factors as the size of the population in which the individual is concealed.
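As a rough sketch of why group size matters, the example below aggregates records into counts per category and withholds any count that falls below a minimum. The function name, the field name and the threshold of 5 are assumptions chosen for illustration; the guidance above does not prescribe a particular threshold.

```python
from collections import Counter


def aggregate_counts(records, field, minimum_count=5):
    """Count records per category, replacing any count below
    minimum_count with None so that small groups are not published."""
    counts = Counter(record[field] for record in records)
    return {
        category: (count if count >= minimum_count else None)
        for category, count in counts.items()
    }


# Hypothetical data: six participants in one region, two in another.
records = [{"region": "North"}] * 6 + [{"region": "South"}] * 2

print(aggregate_counts(records, "region"))
```

Here the count for the smaller region is withheld because a group of two conceals each individual far less effectively than a group of six. If an overall total were published alongside the table, a withheld count could be recovered by subtraction, so suppression on its own is not a guarantee.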



The definition of personal data in the GDPR includes biometric data where it allows the unique identification of an individual, as well as genetic data. The term biometric data is used here to describe those intrinsic, biological, physical or behavioural traits that are both unique to an individual and measurable. Examples commonly include fingerprints, retinal patterns, facial structure, voice, hand geometry, and vein patterns; but biometric data also includes deeply ingrained skills and behaviours (eg a handwritten signature and a particular way of walking or speaking).

Biometric data has a dual character in that it is both information about a particular individual and information which is capable of identifying an individual. DNA shares this duality of character. Accordingly, in most cases biometric data and DNA will be personal data for the purposes of the GDPR, in which case they will also be 'special category personal data' (see below).

Human tissue samples may provide a source from which biometric data can be extracted, but they are not biometric data themselves; it is the extraction of information from samples that may result in the collection of personal data. The collection, storage and use of tissue samples are subject to different laws, although those samples may be accompanied by information (eg name, age) which itself constitutes personal data.



Where an individual participates in research which involves a recorded interview, that individual may disclose personal data about themselves or other people.

However, researchers should also be aware that photographs, videos and sound recordings of people (whether or not those people voluntarily disclose any information) may contain information about them and may allow them to be identified. Accordingly, these media are capable of being personal data.


What is special category personal data?

The GDPR recognises that some categories of personal data are particularly private and/or could be used in a discriminatory way. As a result, the GDPR requires researchers to treat this 'special category personal data' [1] with greater care.

Special category personal data includes any personal data consisting of the following information:

  • racial or ethnic origin
  • political opinions
  • religious or philosophical beliefs
  • trade union membership
  • genetic data
  • biometric data for the purpose of uniquely identifying a person
  • health
  • sex life or sexual orientation

Information about criminal convictions and offences is not included in the definition of special category personal data, but may be processed only under the control of official authority or when authorised by domestic law that provides appropriate safeguards. The Data Protection Act 2018 lays down specific conditions for the processing of criminal offence data.

Due consideration should be given to information which may indirectly disclose special category personal data about an individual. For example, photographs and names may give an indication of a person’s race or religious beliefs, but will not always be special category data merely because an assumption about a person’s race or religious belief might be drawn from appearance or name. The issue will arise if that information is processed on the basis of those assumptions (for example, grouping people based on skin colour or likely ethnic origin of surname). The additional legal requirements in relation to special category personal data are described in Responsibilities under GDPR.


[1] Special category personal data was previously known as 'sensitive' personal data, which under the Data Protection Act 1998 included criminal convictions and allegations.