Imagine you're looking for movies on a streaming service. You want comedies, but they shouldn't be romances, and they have to be British. This is a compositional query, a bit more complex than just finding comedies.
Currently, machine learning uses dot-products to connect items like movies to attributes like comedy. This works well for simple searches, like finding comedies. However, when you add more conditions, like not wanting romances and focusing on British films, this method starts to struggle.
That's where box embeddings come in. Think of them like Venn diagrams you can learn from. They're really good at handling these more complex, set-theoretic compositions. In fact, they perform much better than traditional methods when you're dealing with multiple criteria.
Scientists have introduced a new dataset to test these compositional queries. Their experiments show that while both methods work well for simple searches, box embeddings shine when you're looking for something specific. This is especially useful when you're trying to find something among a lot of options.