Future Work

Improve Performance of YOLO Segmentation

We currently train a YOLO model with supervision to predict per-pixel artifact segmentation masks. YOLO segmentation, however, needs careful annotation and high-quality training data to generalize well. Segmentation is an important part of the endoscopic restoration pipeline, so better annotations and more training data remain a worthwhile investment.
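One way to track whether annotation or data changes actually improve segmentation is to compare predicted masks against held-out annotations. The sketch below computes intersection-over-union (IoU) for a pair of binary masks. The function name `mask_iou` and the convention that two empty masks score 1.0 are assumptions made for illustration, not part of the current pipeline.

```python
import numpy as np

def mask_iou(pred, target):
    """Intersection-over-union between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumption: no artifact predicted and none annotated counts as agreement.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection / union)
```

Averaging this score over a fixed validation set before and after a data change gives a rough signal of whether generalization improved.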

Re-training LaMa on JnJ Internal Data

Current inpainting models such as LaMa struggle under large domain shift, especially with JnJ-specific imaging conditions, sensor properties, and instrument artifacts. In addition, LaMa’s inpainting strategy is to “repeat” the textures surrounding the masked region, so it relies heavily on the training and test distributions being similar. The results shown on this page come from a model trained on public datasets; re-training on JnJ internal data is a natural next step.
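Inpainting models of this kind are typically trained by hiding random regions of clean images and asking the model to reconstruct them, so internal frames would not need manual labels to serve as training data. The sketch below builds one such (masked image, mask) pair. It is a simplification under stated assumptions: the helper name `make_training_pair`, the box-only masks, and the size ranges are hypothetical and are not LaMa's actual mask generator. It also assumes the input frame is artifact-free and at least 2 pixels on each side.

```python
import numpy as np

def make_training_pair(image, rng, max_boxes=3):
    """Return (masked_image, mask) for self-supervised inpainting training.

    image: array of shape (H, W) or (H, W, C), assumed artifact-free.
    rng:   a numpy.random.Generator, for reproducible masks.
    """
    image = np.asarray(image, dtype=np.float32)
    h, w = image.shape[:2]
    mask = np.zeros((h, w), dtype=bool)
    for _ in range(rng.integers(1, max_boxes + 1)):
        # Hypothetical size range: each box spans roughly 1/8 to 1/2 of a side.
        bh = rng.integers(h // 8 + 1, h // 2 + 1)
        bw = rng.integers(w // 8 + 1, w // 2 + 1)
        top = rng.integers(0, h - bh + 1)
        left = rng.integers(0, w - bw + 1)
        mask[top:top + bh, left:left + bw] = True
    masked = image.copy()
    masked[mask] = 0.0  # hide the region the model must reconstruct
    return masked, mask
```

The original `image` serves as the reconstruction target, so the only curation step is selecting frames without instrument artifacts.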

Temporal Consistency

A key limitation of frame-by-frame inpainting is temporal flickering, where the model produces different textures for the same region across frames. We achieved temporal consistency with the Temporal GAN methods, but they fell short in the areas we prioritized within the scope of this project. Future work could integrate temporal smoothing, recurrent architectures, or video-aware diffusion/inpainting models to ensure temporally coherent restorations in real endoscopic video streams.
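As a rough illustration of the simplest of these options, the sketch below applies an exponential moving average to the inpainted pixels only and measures flicker as the mean frame-to-frame change inside the inpainted region. It is a minimal sketch, not the method used in this project: the function names, the `alpha` default, and the absence of any motion compensation are assumptions, so it only makes sense when the camera and tissue move little between frames.

```python
import numpy as np

def smooth_inpainted_frames(frames, masks, alpha=0.6):
    """Exponentially smooth inpainted pixels across consecutive frames.

    frames: array of shape (T, H, W, C), per-frame inpainting outputs.
    masks:  bool array of shape (T, H, W), True where a pixel was inpainted.
    alpha:  weight of the current frame; lower values smooth more (assumed default).
    """
    frames = np.asarray(frames, dtype=np.float64)
    masks = np.asarray(masks, dtype=bool)
    out = frames.copy()
    for t in range(1, len(frames)):
        # Blend only where both frames were inpainted, so observed pixels stay untouched.
        both = masks[t] & masks[t - 1]
        blended = alpha * frames[t] + (1.0 - alpha) * out[t - 1]
        out[t][both] = blended[both]
    return out

def flicker(frames, masks):
    """Mean absolute frame-to-frame change inside the inpainted region."""
    frames = np.asarray(frames, dtype=np.float64)
    masks = np.asarray(masks, dtype=bool)
    both = masks[1:] & masks[:-1]
    diffs = np.abs(frames[1:] - frames[:-1])
    return float(diffs[both].mean()) if both.any() else 0.0
```

Comparing `flicker` before and after smoothing gives a quick check of whether a given `alpha` helps, though a score this simple cannot distinguish flicker from genuine scene motion.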