MSD: Saliency-aware Knowledge Distillation for Multimodal Understanding
To reduce a model's size while retaining performance, we often rely on knowledge distillation (KD), which tr...
2021-01-01 00:00:00
Jin W., Sanjabi M., Nie S., Tan L., Ren X.
Modality-specific distillation
… The task is a binary classification problem, which is to detect hate speech in multimod...