Four Ways to Identify Rater Drift in CNS Clinical Trials + Remediation Strategies

November 16, 2023

Successful drug development hinges on the collection of valid data. In central nervous system (CNS) clinical trials that involve clinical raters, it is vital that raters administer and score interviews consistently throughout the trial and maintain high inter-rater reliability.

Cogstate Clinician Network Senior Manager, Felice Ockun, recently presented on how to identify rater drift and remediate it when it happens. Ockun has 25 years of experience in the mental healthcare industry with expertise in global clinical trials, psychiatric diagnosis, rater training, and clinical data analytics.

What is Rater Drift and how do you identify it in your clinical trials?

The term rater drift can be thought of as diminishing inter-rater reliability that occurs over the life of a study. Raters can drift in different ways, from how they administer scales (such as how thoroughly they conduct interviews) to how they score scales (such as forgetting a particular scoring convention).

Ockun recommends four ways to identify raters that may require recalibration.

Round Robins: "The lowest-touch exercise, requiring the least amount of your organization's resources, is called a round robin," said Ockun. In this approach, raters split up the interview with one subject and collectively take turns administering the items, while all participants score independently. Scores and rationales are then discussed, either after each item or at the conclusion.
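To make that discussion efficient, the independent scores can be screened first for the items where raters actually diverge. A minimal sketch of that screening step is below; the rater names, scores, and one-point tolerance are hypothetical illustration, not a Cogstate standard.

```python
# Hypothetical round-robin output: every rater scores the same interview
# independently. Items where the scores spread apart are flagged so the
# group discussion can focus on them first.
scores = {
    "rater_a": [2, 1, 3, 0, 2],
    "rater_b": [2, 2, 3, 0, 2],
    "rater_c": [3, 1, 1, 0, 2],
}

n_items = len(next(iter(scores.values())))
for item in range(n_items):
    item_scores = [rater[item] for rater in scores.values()]
    spread = max(item_scores) - min(item_scores)
    if spread > 1:  # raters sit more than one anchor point apart
        print(f"Item {item + 1}: scores {item_scores} diverge; discuss rationales")
```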

Gold Standard Calibration Videos: In this approach, raters each view and score a recorded interview. Their scores are then compared to gold standard scores determined by an expert panel. This allows teams to identify raters who need re-training on using scoring conventions or scoring anchors.

“The big advantage to this method is that once the upfront work is done of creating the videos and convening an expert panel to reach consensus on the scores, your organization can then use this video for training and calibration purposes on an unlimited number of raters indefinitely,” said Ockun.
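The comparison itself can be as simple as an exact-match rate against the panel's consensus. A minimal sketch is below, assuming item-level gold standard scores are available; the 80% threshold and the function name are hypothetical, not a published calibration criterion.

```python
GOLD_SCORES = [2, 1, 3, 0, 2, 1]   # expert-panel consensus, per item
MIN_EXACT_AGREEMENT = 0.8          # e.g., 80% of items must match exactly

def needs_retraining(rater_scores, gold=GOLD_SCORES,
                     threshold=MIN_EXACT_AGREEMENT):
    """True if the rater's exact-match rate with the gold standard scores
    falls below the calibration threshold."""
    matches = sum(r == g for r, g in zip(rater_scores, gold))
    return matches / len(gold) < threshold

print(needs_retraining([2, 1, 3, 0, 2, 1]))  # False: fully calibrated
print(needs_retraining([3, 2, 1, 0, 2, 0]))  # True: drifted from consensus
```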

One-on-One Observed Interviews: In this approach, a rater is observed conducting an interview and is provided with feedback from an expert trainer. Raters discuss their scores with the trainer and provide rationales, so it is evident if they understand the conventions and anchors.

“This approach does take more of your organization’s resources,” said Ockun, “but it allows you to observe drift occurring in administration, as well as in scoring.”

Data Analytics: In this approach, algorithms are applied to the study data throughout the life of the study to identify patterns among raters or flag outliers.

“Examples of flags might be too large of a score increase between visits, too large of a score decrease between visits, or inconsistent scores on two items that should hang together,” said Ockun.
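A minimal sketch of the kinds of flags Ockun describes is below. The thresholds and the notion of a paired item are hypothetical and would be tuned per scale in a real study.

```python
def flag_visit_changes(totals, max_change=6):
    """Yield (prev_visit, visit, delta) wherever the total score swings
    more than max_change points between consecutive visits."""
    for i in range(1, len(totals)):
        delta = totals[i] - totals[i - 1]
        if abs(delta) > max_change:
            yield i - 1, i, delta

def inconsistent_pair(score_a, score_b, max_gap=2):
    """Flag two items that should 'hang together' but score far apart."""
    return abs(score_a - score_b) > max_gap

# Example: one subject's total scores across five visits with one rater
totals = [28, 27, 14, 15, 13]
for prev, cur, delta in flag_visit_changes(totals):
    print(f"Visits {prev}->{cur}: change of {delta:+d} exceeds threshold")
print(inconsistent_pair(score_a=4, score_b=0))  # True: items disagree
```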

CASE EXAMPLE – Rater Drift in MADRS and HAM-D Assessments

The MADRS and HAM-D are two of the most widely used instruments for assessing depression severity. Some common areas of drift on these scales include looser adherence to interview structure, less clarification, reduced reliance on scoring conventions, and less attention to scoring anchors.

To demonstrate one way these drift areas can impact quality data capture and clinical trial decision making, we look to an example of looser adherence to interview structure. Here we refer to Dr. Janet Williams's SIGH-D structured interview guide, question #3 on early insomnia.

Looser adherence to structure could manifest here as the rater failing to obtain the subject's usual hours of going to sleep and waking up at baseline. Failing to establish a pre-morbid baseline on the HAM-D or MADRS can have a large impact on scoring and could disqualify an otherwise eligible subject from participation:

  • In this case, when asked if she has had trouble falling asleep, the participant replied "yes," and that it had taken her half an hour almost every night this week. This seems like an obvious score of 2.
  • However, suppose the same subject has always had trouble sleeping, even when well. This week represents no change from baseline, and the score immediately goes from 2 to 0.
  • If this failure to compare to the pre-morbid baseline occurs across all three sleep items, that is a potential 6-point difference in total score, as the sketch below illustrates.
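A minimal worked example of that arithmetic, with deliberately simplified anchors (the actual SIGH-D conventions are richer than this): the same patient report scores 2 when the pre-morbid baseline is ignored and 0 when it is established.

```python
def insomnia_score(current_minutes_to_sleep, usual_minutes_to_sleep):
    """Score only the change from the subject's usual (pre-morbid) pattern.
    Simplified, hypothetical anchors for illustration."""
    delay_beyond_usual = current_minutes_to_sleep - usual_minutes_to_sleep
    if delay_beyond_usual <= 0:
        return 0  # no worse than the subject's usual sleep pattern
    return 2 if delay_beyond_usual >= 30 else 1

# Subject reports taking 30 minutes to fall asleep almost every night:
print(insomnia_score(30, usual_minutes_to_sleep=30))  # baseline established: 0
print(insomnia_score(30, usual_minutes_to_sleep=0))   # baseline ignored: 2
# Repeat the same error on the middle and late insomnia items,
# and the inflated total is 3 items * 2 points = 6 points.
```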


How do you remediate Rater Drift in your clinical trials?

Once you have identified rater drift, the next step is remediation. The method of remediation will be based on the type of drift you have identified.

  • If drift is based on how the rater is CONDUCTING the interview, remediation should include didactic (instructional) re-training, ideally followed by an observed interview or applied training. This gives the trainer an opportunity to assess whether the rater is implementing the instruction.
  • If drift is based on how the rater is SCORING the interview, the recommendation here would also be to provide didactic re-training on either scoring conventions or scoring anchors, and then ideally have the rater review a training video and provide scores to see if they are recalibrated. Since interview administration was not the issue, you do not need to conduct an observed interview.

In all cases, ongoing data analysis allows you to continue providing oversight following remediation.

If you are interested in learning more, please feel free to contact our team and/or access the free webinar with more details.


Felice Ockun | Senior Manager, Clinician Network, Clinical Trials

Felice Ockun is a seasoned mental healthcare professional with more than 25 years of experience in the mental healthcare industry, specializing in global clinical CNS trials, psychiatric diagnosis, rater training, quality oversight, and clinical data analytics. Felice’s diverse background spans management roles in clinical trials, inpatient/outpatient mental healthcare, and emergency mental healthcare settings.

Felice received an MS in Clinical Psychology from LIU’s Clinical Psychology Doctoral Program, and an MS in Clinical Social Work from Columbia University. She also holds a BA in Psychology and Sociology from Emory University.
