Data Processing

What is data processing?

Data processing refers to all operations performed on the data.

Data processing model

Data processing includes:

  • Data entry (manual and machine-processed) and verification tasks.
  • Data cleaning – identification and resolution of data discrepancies using listing review, source data verification, computational monitoring for trends, medical review, or specialised data cleaning or review reports.
  • Medical coding – the process by which a verbatim (originally recorded) term entered in the electronic case report form (eCRF) is translated into a standardised medical term using a medical dictionary.
  • Transformations performed on data – includes mapping data to different coding schemes or scales and reformatting data.
  • Integration of externally managed data, e.g. central laboratory data.
  • Data-assisted trial operations such as safety event detection and reporting. Alerts may be programmed to notify the Sponsor-Investigator when adverse events or serious adverse events are reported by sites and may facilitate follow-up by the site or Sponsor-Investigator/coordinating site.
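
As an illustration of the programmed alerts described in the last item, the following is a minimal sketch in Python. The record structure, field names and message format are assumptions made for this example and do not come from any particular trial system.

```python
from dataclasses import dataclass


@dataclass
class AdverseEventReport:
    """Hypothetical shape of an adverse event report submitted by a site."""
    site_id: str
    participant_id: str
    term: str
    serious: bool  # True for a serious adverse event (SAE)


def build_alerts(reports):
    """Return one alert message per reported event, with SAEs listed first."""
    # sorted() is stable, so events of equal seriousness keep their reporting order.
    ordered = sorted(reports, key=lambda r: not r.serious)
    alerts = []
    for r in ordered:
        label = "SAE" if r.serious else "AE"
        alerts.append(
            f"[{label}] site {r.site_id}, participant {r.participant_id}: {r.term}"
        )
    return alerts


reports = [
    AdverseEventReport("S01", "P-001", "headache", serious=False),
    AdverseEventReport("S02", "P-014", "anaphylaxis", serious=True),
]
alerts = build_alerts(reports)
for message in alerts:
    print(message)
```

In practice the messages would be routed to the Sponsor-Investigator by whatever notification mechanism the trial uses. The sketch only shows how reports might be ordered and labelled.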

Documenting the data processing performed

The Data Management Plan (DMP) must document, or reference documentation that specifies, all the operations that will be performed on the trial data.

Standard operating procedures (SOPs) must be developed that document the procedures for processing data. These procedures must be updated to reflect changes throughout the trial.  

The Sponsor-Investigator must delegate all data processing activities to one or more members of the research team who have relevant expertise and have received training in the relevant trial SOPs.

What types of checks are performed during data cleaning?

The quality and integrity of the data should be checked when the source data is first collected and entered into the case report form (CRF) and again when it is entered into the study database. Further quality checks should be set up if the trial uses electronic data capture or migrates data from external sources, e.g. central laboratories or devices.
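
One check that might be set up for externally sourced data is a reconciliation of participant identifiers against the study database. The following is a minimal sketch only. The function name, record layout and identifiers are hypothetical.

```python
def reconcile_external_ids(database_ids, external_records):
    """Split external records into those that match a known participant and those that do not."""
    known = set(database_ids)
    matched, unmatched = [], []
    for record in external_records:
        if record["participant_id"] in known:
            matched.append(record)
        else:
            # Unmatched records would be raised as data queries rather than loaded.
            unmatched.append(record)
    return matched, unmatched


# Illustrative data: participant IDs in the study database and a central laboratory transfer.
database_ids = ["P-001", "P-002", "P-003"]
lab_records = [
    {"participant_id": "P-001", "test": "ALT", "value": 32},
    {"participant_id": "P-009", "test": "ALT", "value": 41},
]
matched, unmatched = reconcile_external_ids(database_ids, lab_records)
print(f"{len(matched)} matched, {len(unmatched)} unmatched")
```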

The trial monitor is responsible for checking the quality and accuracy of data transcribed from source data into the paper CRF or eCRF, i.e. source data verification. It is useful to use a Participant-Level Monitoring Form that includes all source data verification activities identified for the trial in the trial-specific Clinical Monitoring Plan.

A Source Document Plan Guidance and Template or a Source Data Identification Log is also a valuable tool for the monitor to identify the source document for all source data.

The Sponsor-Investigator should delegate data management responsibilities to a member of the team who has received training to undertake the role and has the relevant expertise. This person is responsible for validating the data entered. The types of checks that may be performed include:

Types of data checks
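
As an illustration only, the sketch below shows three checks commonly used when validating entered data: a missing-value check, a range check and a date-consistency check. The field names and the blood pressure limits are assumptions chosen for the example and are not taken from a specific trial.

```python
from datetime import date


def check_record(record):
    """Return a list of discrepancy messages for one CRF record (empty if none found)."""
    issues = []

    # Missing-value check on fields assumed to be mandatory.
    for field_name in ("participant_id", "visit_date", "systolic_bp"):
        if record.get(field_name) is None:
            issues.append(f"missing value: {field_name}")

    # Range check; the limits here are illustrative only.
    sbp = record.get("systolic_bp")
    if sbp is not None and not (60 <= sbp <= 260):
        issues.append(f"out of range: systolic_bp={sbp}")

    # Consistency check: a visit should not precede consent.
    visit = record.get("visit_date")
    consent = record.get("consent_date")
    if visit is not None and consent is not None and visit < consent:
        issues.append("inconsistent dates: visit before consent")

    return issues


record = {
    "participant_id": "P-001",
    "consent_date": date(2024, 3, 1),
    "visit_date": date(2024, 2, 20),
    "systolic_bp": 300,
}
issues = check_record(record)
for issue in issues:
    print(issue)
```

Each message returned would normally become a data query for the site to resolve.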

All data queries need to be resolved with the Investigator

If either the trial monitor or the data manager becomes aware of a data issue, they must inform the site Principal Investigator so that the issue can be resolved. Documentation of the data issue and its resolution must be kept so that there is an audit trail of any changes made to the data.
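
The audit trail requirement can be sketched as follows. This is a minimal illustration, not a validated implementation, and the entry fields and function name are assumptions. The idea is that every correction records the old value, the new value, the reason, and who made the change and when, before the record is updated.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """Hypothetical audit trail entry; frozen so an entry cannot be altered once written."""
    participant_id: str
    field_name: str
    old_value: object
    new_value: object
    reason: str
    changed_by: str
    changed_at: datetime


def apply_correction(record, audit_log, field_name, new_value, reason, changed_by):
    """Log the change first, then update the record, so the original value is never lost."""
    entry = AuditEntry(
        participant_id=record["participant_id"],
        field_name=field_name,
        old_value=record.get(field_name),
        new_value=new_value,
        reason=reason,
        changed_by=changed_by,
        changed_at=datetime.now(timezone.utc),
    )
    audit_log.append(entry)
    record[field_name] = new_value
    return entry


audit_log = []
record = {"participant_id": "P-001", "systolic_bp": 300}
entry = apply_correction(
    record,
    audit_log,
    field_name="systolic_bp",
    new_value=130,
    reason="Transcription error confirmed against source by site PI",
    changed_by="data.manager",
)
print(entry.old_value, "->", entry.new_value)
```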