Howes, R. (2)

References: 2

CiT: Curation in training for effective vision-language data

Large vision-language models are generally applicable to many downstream tasks, but come at an exorb...

Howes, R.; Xu, H.; Xie, S.; Huang, P. Y.; Yu, L.

Adversarial evaluation of multimodal models under realistic gray box assumption

This work examines the vulnerability of multimodal (image + text) models to adversarial threats simi...

Dolhansky, B.; Evtimov, I.; Howes, R.; Firooz, H.

Related keywords