By Chase Doyle

VANCOUVER, B.C.—A novel artificial intelligence algorithm could help standardize and improve the evaluation of endoscope instrument channels with borescopes, according to data presented at the 2023 annual meeting of the American College of Gastroenterology.

Analysis of the algorithm showed effective detection and classification of defects in endoscope channels compared with manual review.

“Manual inspection of endoscope working channels with borescopes currently lacks standardization and is a time- and focus-intensive process,” said lead investigator Sagar Shah, MD. “Future advancements in this technology could improve quality control of inspections and augment decision making pertaining to endoscope reprocessing,” added Dr. Shah, an internal medicine resident at UCLA Health, in Los Angeles.

Although several studies have demonstrated that borescopes can effectively detect defects such as scratches, stains, crimps, fluid droplets and foreign objects, current practices for post-reprocessing inspection lack standardization, Dr. Shah noted. In addition, he said, inspection quality is highly dependent on the technician, with considerable interobserver variability.

Dr. Shah and his co-investigators hypothesized that AI could aid in borescope image interpretation and facilitate the standardization of this process. The researchers reviewed the test characteristics of a novel automated AI algorithm on borescope (FIS-007 Flexible Inspection Scope, Healthmark) videos of the working channels of endoscopes used at a single tertiary center between August 2022 and May 2023.

They then analyzed the AI algorithm (WatchDog Endo Alm, BH2 Innovations) against manual reviews of the video recordings, which served as the gold standard. The researchers evaluated the sensitivity, specificity and accuracy for each type of defect on a per-inspection basis and constructed receiver operating characteristic (ROC) curves.

AI Integration Yields Promising Results

For the pilot program of the study, the AI system was used to perform 51 borescope inspections of 48 endoscopes, producing promising results when compared with an expert reviewer (abstract P3406).

The AI system detected a median of 36 scratches (IQR, 74.6), one area of peeling (IQR, 5.5), 16 stains (IQR, 32.5), five droplets (IQR, 14.5) and seven crimps (IQR, 12) per inspection. Accuracy varied by defect type: it was highest for scratches (80%) and peeling (75%), followed by crimps and stains (65% each). Droplets were the hardest for the program to identify, with accuracy just under half (47%). The areas under the ROC curves were similarly clustered for most defect types, highest for peeling (0.79), followed by scratches (0.78) and crimps and stains (0.76 each), with droplets again lowest (0.60).

The researchers emphasized that AI systems have the potential to standardize borescope inspection and potentially reduce the risk for infection associated with working channel defects. “Improving standardization in endoscope inspections is crucial for infection control and functionality,” Dr. Shah concluded. “The application of artificial intelligence in borescope image interpretation could be a game changer in achieving that goal.”

According to Dr. Shah, future research should be directed toward understanding the extent to which borescope findings affect decisions made during endoscope reprocessing.

Addressing Challenges in AI Application

Lawrence F. Muscarella, PhD, an independent medical device safety expert and the president of LFM Healthcare Solutions, based in Lansdale, Pa., said that his primary question regarding the application of AI to endoscopy “involves how to design, teach and validate the AI to ensure its results are reliable, sensitive and accurate, no matter the AI’s manufacturer.

“The problem,” Dr. Muscarella added, “is that much today is unknown about which specific internal endoscope defects ... indicate a true safety risk and which can be reasonably ignored.”

According to Dr. Muscarella, AI algorithms will make judgment calls, at least at first, and such subjectivity could bring providers back full circle to a lack of standardization.

“Subjectivity is what we are trying to eliminate, so it could prove fruitless and costly to simply exchange one judgment call—a reprocessing technician’s assessment about a defect’s infection risk, for example—with that of an AI platform’s. [It’s] Peter for Paul,” he said. “We will need to watch and see how the designers of the AI address and mitigate these concerns, which would be necessary to prevent the AI from yielding false-negative and/or false-positive results.

“We do not want one AI platform sounding an ‘alarm’ for a defect inside an endoscope’s working channel that another AI platform/database concludes poses an entirely inconsequential risk,” he added.

As Dr. Muscarella explained, such differences between platforms are theoretically possible and would depend partly on the data used to train the AI and, again, those data can be error prone.

“If the AI is wrong and cries wolf too often about defects that pose no reasonable risk, hospitals will abandon the technology,” Dr. Muscarella observed. “The AI must prove to reliably discriminate between real infection risks and negligible ones. This is the challenge.”

Nevertheless, Dr. Muscarella remained confident that AI will ultimately “teach itself how to optimize its finding and assessments” and that “virtually all platforms, no matter the manufacturer, would ... yield the same meaningful results, but this may take some time.”

This article is from the May 2024 print issue.