Improving Interactive Segmentation using a Novel Weighted Loss Function with an Adaptive Click Size and Two-Stream Fusion
Interactive segmentation has recently attracted attention for specialized segmentation tasks where expert input is required to further enhance segmentation performance. In this work, we propose a novel interactive segmentation framework in which user clicks are dynamically adapted in size based on the current segmentation mask. The clicked regions form a weight map that is fed to a deep neural network together with the image, and the network learns to discriminate important regions through a novel weighted loss function. To evaluate our loss function, we employ a state-of-the-art interactive V-Net (IV-Net) model, which uses both foreground and background user clicks as its main method of interaction. To further improve on the IV-Net, we propose a two-stream fusion interactive V-Net (TSFIV-Net), which applies multimodal fusion to propagate image feature information throughout the architecture. We train and validate both models on the BCV dataset and test on both seen and unseen structures from the MSD dataset to assess the models' generalization and segmentation abilities in comparison to the standard IV-Net. With only a single user interaction, applying adaptive user click sizes increases the overall Dice score on the IV-Net by 4.86% and 8.59% for seen and unseen structures, respectively, compared to the original version, and by 9.88% and 10.35% on the TSFIV-Net.
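To make the idea of a click-derived weight map concrete, the following is a minimal 2D sketch in NumPy and not the formulation used in this work. The abstract does not specify the exact rule for adapting click size or the form of the weighted loss, so everything here is an assumption for illustration: the function names (`adaptive_radius`, `click_weight_map`, `weighted_bce`), the rule that scales the click radius with the area of the current mask, the `scale` and `boost` parameters, and the choice of binary cross-entropy as the base loss.

```python
import numpy as np


def adaptive_radius(mask, scale=0.25, min_radius=1.0):
    """Hypothetical rule: click radius grows with the current mask.

    The radius is a fraction of the radius of a disc that has the
    same area as the current segmentation mask.
    """
    area = float(mask.sum())
    equivalent_radius = np.sqrt(area / np.pi)
    return max(min_radius, scale * equivalent_radius)


def click_weight_map(shape, clicks, radius, boost=4.0):
    """Weight 1 everywhere, 1 + boost inside a disc around each click."""
    rows, cols = np.indices(shape)
    weights = np.ones(shape, dtype=float)
    for r, c in clicks:
        inside = (rows - r) ** 2 + (cols - c) ** 2 <= radius ** 2
        weights[inside] = 1.0 + boost
    return weights


def weighted_bce(pred, target, weights, eps=1e-7):
    """Binary cross-entropy averaged with per-pixel weights (assumed form)."""
    pred = np.clip(pred, eps, 1.0 - eps)
    per_pixel = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return float((weights * per_pixel).sum() / weights.sum())


# Toy example: the current mask covers only the left half of the target,
# and the user places one foreground click in the missed right half.
target = np.zeros((32, 32))
target[8:24, 8:24] = 1.0
current_mask = np.zeros((32, 32))
current_mask[8:24, 8:16] = 1.0
pred = np.where(current_mask == 1.0, 0.9, 0.1)

radius = adaptive_radius(current_mask)
weights = click_weight_map(target.shape, [(16, 20)], radius)
loss_weighted = weighted_bce(pred, target, weights)
loss_uniform = weighted_bce(pred, target, np.ones_like(target))
print(radius, loss_weighted, loss_uniform)
```

In this toy setup the click lands on pixels the current prediction gets wrong, so up-weighting them raises the loss relative to the uniform case, which is the mechanism by which a weight map can steer training toward user-indicated regions. A 3D variant for volumetric data would follow the same pattern with a sphere instead of a disc.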