Created by: yurakuratov
This PR fixes an issue with DeepPavlov's pre-trained BERT checkpoints. The previous checkpoints were missing some weights that are only used during pre-training.
Checkpoints linked from the DeepPavlov docs http://docs.deeppavlov.ai/en/master/features/pretrained_vectors.html#bert did not include the bias parameter of the NSP head.
Checkpoints on HuggingFace https://huggingface.co/DeepPavlov did not include the MLM head and NSP head parameters at all.
Updated checkpoints:
- RuBERT (DeepPavlov/rubert-base-cased)
- Slavic BERT (DeepPavlov/bert-base-bg-cs-pl-ru-cased)
- Conversational BERT (DeepPavlov/bert-base-cased-conversational)
- Conversational RuBERT (DeepPavlov/rubert-base-cased-conversational)
Also, tokenizer configuration (tokenizer_config.json) was added to every DeepPavlov BERT model.
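As a quick illustration of which weights were affected: in `transformers`, the MLM and NSP heads of a BERT pre-training model live under the `cls.*` parameter prefix. The snippet below is a minimal sketch (using a tiny random config, so nothing is downloaded) that lists those head parameters, including the `cls.seq_relationship.bias` that the NSP head was missing.

```python
from transformers import BertConfig, BertForPreTraining

# Tiny random config so the example runs without fetching a checkpoint.
config = BertConfig(
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    vocab_size=100,
)
model = BertForPreTraining(config)

# MLM head parameters are under cls.predictions.*,
# NSP head parameters under cls.seq_relationship.*
head_params = [name for name in model.state_dict() if name.startswith("cls.")]
print(head_params)
```

To check a real checkpoint, one can instead pass `output_loading_info=True` to `BertForPreTraining.from_pretrained(...)` and inspect the reported missing keys.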
TODO:
- update urls in DeepPavlov docs
- upload fixed models to HF
Related issues and discussions: