*= Equal Contributors
Recommendation systems in large-scale online marketplaces are essential for helping users discover new content. However, state-of-the-art systems for item-to-item recommendation often rely on a shallow notion of contextual relevance, which makes them inadequate for tasks where item relationships are more nuanced. Contextually relevant item pairs can have problematic relationships that are confusing or even controversial to end users, and recommending them can degrade user experience and brand perception. For example, recommending a book about one sports team to someone reading a book about that team’s biggest rival could be a bad experience, despite the presumed similarity of the two books. In this paper, we propose a classifier that identifies and prevents such problematic item-to-item recommendations in order to improve the overall user experience. The proposed approach uses active learning to sample hard examples effectively across sensitive item categories and employs human raters for data labeling. We also perform offline experiments to demonstrate the efficacy of this system in identifying and filtering problematic recommendations while maintaining recommendation quality.