Singh, A. x 2
References: 2
FLAVA: A foundational language and vision alignment model
… The hateful memes challenge: Detecting hate speech in multimodal memes. Proceedings of ...
2026-01-20 00:00:00
Human-adversarial visual question answering
Performance on the most commonly used Visual Question Answering dataset (VQA v2) is starting to appr...