In the fashion domain, predicting compatibility is difficult because it is inherently subjective. Previous
work in this domain focuses on comparing product images rather than real-world scenes. Such comparisons fail to capture key context in the scene, such as body type, season, and occasion.