Dettmers, T.
References (1)
Branch-train-merge: Embarrassingly parallel training of expert language models
We present Branch-Train-Merge (BTM), a communication-efficient algorithm for embarrassingly parallel...