Instructions on the anonymisation of test data

Date of issue
4/6/2023
Validity
4/6/2023 - Until further notice

1 Key terms

Anonymisation
Anonymisation means that personal data is processed such that the person cannot be identified on the basis of the data under any circumstances. Also, it must be impossible to convert the data back into an identifiable form.

Pseudonymisation
Pseudonymisation means that personal data is converted such that it cannot be attributed to a specific person without the use of additional information. A requirement is that the additional information is kept separately and that it is subject to technical and organisational measures ensuring that the personal data cannot be connected to an identified or identifiable natural person. Pseudonymised data is still personal data and its processing is subject to the data protection regulations. In other words, the person can be identified by combining data from different datasets even though the data has been pseudonymised.

GDPR
General Data Protection Regulation (EU) 2016/679.

2 Instructions on the anonymisation of test data

These instructions explain how to anonymise the data used in testing. They are intended for parties participating in stakeholder testing in the project to establish a positive credit register.

The data used in testing may not be connectable to natural persons or their data. In anonymisation, the data is made unidentifiable such that a person cannot be identified on the basis of the data in any situation, and the data cannot be converted back to the original state by any means. Personal identity codes or other data that can be connected to natural persons may not be sent as test data under any circumstances: the test data must always be anonymised.

If the data can be restored with a code key or by combining it with other data, the data in question is pseudonymised data. According to the GDPR, pseudonymised personal data that can be connected to a natural person with the use of additional information must be regarded as data relating to an identifiable natural person. Consequently, pseudonymisation is not enough: all test data must be fully anonymised.
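To illustrate why a code key makes data pseudonymised rather than anonymised, here is a minimal Python sketch. The function names `pseudonymise` and `restore` and the sample identifiers are hypothetical placeholders and are not part of any project tooling.

```python
import secrets


def pseudonymise(identifiers):
    """Replace each identifier with a random token and keep a code key."""
    code_key = {}  # token -> original identifier (the "additional information")
    tokens = []
    for identifier in identifiers:
        token = secrets.token_hex(8)
        code_key[token] = identifier
        tokens.append(token)
    return tokens, code_key


def restore(tokens, code_key):
    """Anyone holding the code key can recover the original identifiers."""
    return [code_key[token] for token in tokens]


# Hypothetical placeholder values, not real identifiers.
originals = ["SAMPLE-ID-1", "SAMPLE-ID-2"]
tokens, code_key = pseudonymise(originals)
restored = restore(tokens, code_key)
```

Because `restore` returns the original values, the tokens are pseudonymised data. Anonymised test data has no such key and no other route back to the original values.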

Primarily, artificial Business IDs and artificial personal IDs provided by the project are used as test IDs in stakeholder testing. The project delivers the IDs to the stakeholder's master user, who has filled in the testing start notification in the Enter portal.

The data generated by stakeholders during testing and submitted to APIs by them will not be anonymised to any extent by the project. The stakeholders are responsible for the anonymisation of their own test data in every respect.

3 IDs used in testing

The project provides the testing stakeholders with:

artificial personal IDs, i.e. test customer IDs

an artificial Business ID and an artificial organisation name

artificial business IDs to be used as the reassignee’s ID code.

Primarily the above-mentioned artificial personal IDs, Business IDs or foreign business IDs must be used in testing. Natural persons’ actual personal data may not be used in testing under any circumstances. In addition, no identifiers that could be connected to natural persons by any means may be used in testing. For this reason, other identifiers used in testing, such as Business IDs, must also be artificial.

If the stakeholder wants to use its own identifiers in addition to or instead of the artificial identifiers provided by the project, it should send the identifiers to the project in anonymised form unless they are artificial. The identifiers can be artificial or anonymised personal identity codes, Business IDs and foreign business IDs. The project does not anonymise any data submitted to it. The stakeholders are responsible for the anonymisation of their own test data in every respect. Anonymisation must be carried out in accordance with these instructions.

A stakeholder wanting to use its own identifiers in testing must report this on the Enter portal's contact form. The stakeholder will then receive more detailed instructions on how to deliver the identifiers.

The stakeholder’s own artificial or anonymised identifiers can be used in testing only if they have been separately delivered to the project and entered in the testing environment by the project.

For detailed instructions on the anonymisation of identifiers, see section 6.

4 Loan data to be reported to the register

During testing, stakeholders send test data connected to artificial or anonymised personal identity codes to the testing environment through APIs. All test data submitted through APIs must be artificial or anonymised in accordance with these instructions. In addition to the personal identity code, the loan number, the lender's marketing name and the reassignees’ identifying information, for example, must be anonymised.

For detailed instructions on the anonymisation of data content, see section 6.

5 Other situations

If the stakeholder wishes to report income data for its own test customer IDs so that the data is shown in the credit register extract, the income data must be anonymised. You can request instructions on submitting income data by filling in the contact form in the Enter portal.

6 Data to be anonymised and examples of anonymisation

Table 1 below specifies the data content to be anonymised and any special features that should be taken into account in the anonymisation, such as various requirements for the format. All data in the table must be anonymised. Table 2 contains some examples of anonymisation.

Table 1. Data to be anonymised, requirements and reasons for anonymisation.

Data | Special features relevant to anonymisation | Reason for anonymisation
Personal identity code (IdCode) | The personal identity code must be fully anonymised. The end part of the personal identity code must be a number between 900 and 999 (see example in Table 2). The requirement for the format of the personal identity code is fulfilled but the code is not an actual personal identity code. | The personal identity code is a means of identifying the individual. Applies both to the borrower and to the guarantor.
Loan number (Number) | | The loan number can be connected to a natural person.
Gross income (GrossIncomeOnFile) | | Certain individuals can be identified with their exceptional income data, such as an exceptionally high salary or benefit.
Net income (NetIncomeOnFile) | | Certain individuals can be identified with their exceptional income data, such as an exceptionally high salary or benefit.
Business ID (IdCode) | The Business ID must be in correct format, i.e. the control character must be correctly calculated (see example in Table 2 below). | Some companies have few customers. If the Business ID is known, it may be possible to connect loan data to a natural person.
Lender’s name for marketing purposes (LenderMarketingName) | Lender's name for marketing purposes must be fully anonymised. | Lender’s name for marketing purposes may refer to a small-scale lender, for example, so it may be possible to identify individuals.
Reassignee’s ID code (IdCode) | If the ID code is a Business ID, it must be in correct format, i.e. the control character must be correctly calculated. | Some companies have few customers. If the ID code is known, it may be possible to identify the natural person that the data relates to.
Reassignee's name (Name) | The reassignee’s name must be fully anonymised. | The reassignee’s name can refer to a small-scale lender, for example, so it may be possible to identify individuals.
Reassignee’s country code (CountryCode) | The country code must correspond to a 2-letter code according to ISO 3166. | Some countries may have few operators to which a loan can be transferred. In such cases, it may be possible to identify the person that the data refers to from the country code.
Foreign business ID (IdCode) | | Some companies have few customers. In such cases, it may be possible to identify the person that the data relates to from the foreign business ID.
Date of conclusion (ContractDate) | | The person that the data relates to can be identified on the basis of the data as such or in combination with other data.
Amortization paid (AmortizationPaid) | | The person that the data relates to can be identified on the basis of the data as such or in combination with other data.
Interest paid (InterestPaid) | | The person that the data relates to can be identified on the basis of the data as such or in combination with other data.
Other loan expenses paid (OtherExpenses) | | The person that the data relates to can be identified on the basis of the data as such or in combination with other data.
Loan balance (Balance) | | The person that the data relates to can be identified on the basis of the data as such or in combination with other data.
Amount of loan balance (Balance) | | The person that the data relates to can be identified on the basis of the data as such or in combination with other data.
Unpaid amount of an instalment (DelayedInstalment) | | The person that the data relates to can be identified on the basis of the data as such or in combination with other data.
Original due date of a delayed instalment (OriginalDueDate) | | The person that the data relates to can be identified on the basis of the data as such or in combination with other data.
Deferments of amortizations (DefermentPeriods) | | The person that the data relates to can be identified on the basis of the data as such or in combination with other data.
Leasing – Monthly instalment (MonthlyInstalment) | | The person that the data relates to can be identified on the basis of the data as such or in combination with other data.
Leasing – Date of conclusion (ContractPeriodStartDate) | | The person that the data relates to can be identified on the basis of the data as such or in combination with other data.
Income data | All income data reported for artificial personal identity codes used in testing must be fully anonymised. | Certain individuals can be identified with their exceptional income data, such as an exceptionally high salary or benefit.
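The format requirement for the personal identity code can be sketched in Python as follows. The sketch assumes the usual structure of Finnish personal identity codes: DDMMYY, a century sign, a three-digit end part, and a check character taken from the remainder of the nine digits divided by 31. This check-character rule is not stated in these instructions and should be verified against the official specification. The function names are hypothetical, and only dates of birth in the 1900s with the century sign "-" are generated.

```python
import random
from datetime import date, timedelta

# Assumed check-character alphabet, indexed by the remainder modulo 31.
CHECK_CHARS = "0123456789ABCDEFHJKLMNPRSTUVWXY"


def check_character(ddmmyy: str, end_part: str) -> str:
    """Check character for the nine digits DDMMYY + end part."""
    return CHECK_CHARS[int(ddmmyy + end_part) % 31]


def has_artificial_end_part(code: str) -> bool:
    """True if the three-digit end part is a number between 900 and 999."""
    end_part = code[7:10]
    return end_part.isdigit() and 900 <= int(end_part) <= 999


def artificial_identity_code(rng: random.Random) -> str:
    """Build a code with a random 1900s date and an end part in 900-999."""
    first, last = date(1900, 1, 1), date(1999, 12, 31)
    birth = first + timedelta(days=rng.randint(0, (last - first).days))
    ddmmyy = birth.strftime("%d%m%y")
    end_part = str(rng.randint(900, 999))
    return f"{ddmmyy}-{end_part}{check_character(ddmmyy, end_part)}"


code = artificial_identity_code(random.Random())
```

Under these assumptions, `check_character("251159", "945")` returns `P`, which matches the example result 251159-945P in Table 2.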

The table below contains examples of anonymisation. For each data element, the table shows the original value, the anonymisation instructions, and an example of successfully completed anonymisation.

Table 2. Anonymisation examples.


Data element | Original value | Anonymisation | Result
Personal identity code | | The personal identity code must be fully anonymised, i.e. both the date of birth and the end part must be changed. However, the personal identity code must have the same format as Finnish personal identity codes in order to pass the format check. In addition, the end part of the code must be a number between 900 and 999. | 251159-945P
Business ID | 0245458-3 | The Business ID must be anonymised in such a way that it is in correct format, i.e. the control character is correctly calculated. | 8351634-3
Loan number | FI5992954118737928 | The loan number must be fully anonymised, including the part identifying the lender (if such a part is used). There are no requirements for the format of the loan number. | 123456789
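The control character requirement for the Business ID can be sketched in Python as follows. The sketch assumes the commonly documented rule for Finnish Business IDs (the seven digits weighted by 7, 9, 10, 5, 8, 4, 2 and the sum reduced modulo 11), which these instructions do not spell out. The function names are hypothetical.

```python
import random

# Assumed weights for the seven digits of a Finnish Business ID.
WEIGHTS = (7, 9, 10, 5, 8, 4, 2)


def control_character(digits: str):
    """Return the control character, or None if the digits allow none."""
    remainder = sum(int(d) * w for d, w in zip(digits, WEIGHTS)) % 11
    if remainder == 0:
        return "0"
    if remainder == 1:
        return None  # under the assumed rule, no valid ID has this remainder
    return str(11 - remainder)


def is_well_formed(business_id: str) -> bool:
    """Check the NNNNNNN-C layout and the control character."""
    body, sep, check = business_id.partition("-")
    if sep != "-" or len(body) != 7 or not body.isdigit() or len(check) != 1:
        return False
    return control_character(body) == check


def random_business_id(rng: random.Random) -> str:
    """Draw random digits until a control character exists."""
    while True:
        body = "".join(rng.choice("0123456789") for _ in range(7))
        check = control_character(body)
        if check is not None:
            return f"{body}-{check}"


candidate = random_business_id(random.Random())
```

Under these assumptions, both Business IDs in Table 2 pass `is_well_formed`. A correct control character only satisfies the format check. It does not show that an ID is artificial, since a randomly generated ID may coincide with one in real use.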

Page last updated 4/6/2023