DeepPavlov merge request !1502

fix: upload DeepPavlov BERT models with MLM & NSP heads parameters

Status: Merged. Andrei Glinskii requested to merge fix/issue-1275 into dev on Nov 08, 2021.

Created by: yurakuratov

This PR fixes a problem with the pre-trained BERT models released by DeepPavlov: previous checkpoints were missing some weights that are normally used only during pre-training.

Checkpoints linked from the DeepPavlov docs (http://docs.deeppavlov.ai/en/master/features/pretrained_vectors.html#bert) did not include the bias parameter of the NSP head.

Checkpoints on HuggingFace (https://huggingface.co/DeepPavlov) did not include the MLM head or NSP head parameters.
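
To check whether a checkpoint carries the pre-training heads, its parameter names can be compared against the names the heads are expected to use. The sketch below is illustrative only and not part of this PR. It assumes the Hugging Face `BertForPreTraining` naming convention (`cls.predictions.*` for MLM, `cls.seq_relationship.*` for NSP); TensorFlow checkpoints use different names. The helper `missing_head_params` and the example key list are hypothetical.

```python
# Expected head parameter names, assuming the Hugging Face
# `BertForPreTraining` naming convention (an assumption, not taken from this PR).
MLM_HEAD_KEYS = {
    "cls.predictions.bias",
    "cls.predictions.transform.dense.weight",
    "cls.predictions.transform.dense.bias",
    "cls.predictions.transform.LayerNorm.weight",
    "cls.predictions.transform.LayerNorm.bias",
}
NSP_HEAD_KEYS = {
    "cls.seq_relationship.weight",
    "cls.seq_relationship.bias",
}


def missing_head_params(param_names):
    """Return the expected MLM/NSP head parameter names absent from a checkpoint."""
    present = set(param_names)
    return {
        "mlm": sorted(MLM_HEAD_KEYS - present),
        "nsp": sorted(NSP_HEAD_KEYS - present),
    }


# Hypothetical key list resembling a checkpoint that lacks only the NSP bias.
example_keys = sorted(MLM_HEAD_KEYS) + ["cls.seq_relationship.weight"]
report = missing_head_params(example_keys)
print(report)
```

For a PyTorch checkpoint, the names could be taken from the keys of the loaded state dict, for example `torch.load(path, map_location="cpu").keys()`, where `path` is a placeholder for a local weights file.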

Updated checkpoints:

  • RuBERT (DeepPavlov/rubert-base-cased)
  • Slavic BERT (DeepPavlov/bert-base-bg-cs-pl-ru-cased)
  • Conversational BERT (DeepPavlov/bert-base-cased-conversational)
  • Conversational RuBERT (DeepPavlov/rubert-base-cased-conversational)

A tokenizer configuration (tokenizer_config.json) was also added to every DeepPavlov BERT model.

TODO:

  • update URLs in DeepPavlov docs
  • upload fixed models to HF

Related issues and discussions:

  • https://github.com/deepmipt/DeepPavlov/issues/1275
  • https://github.com/huggingface/transformers/issues/5806
  • https://forum.deeppavlov.ai/t/rubert/963
  • https://forum.deeppavlov.ai/t/rubert/305/2
  • https://opendatascience.slack.com/archives/C04N3UMSL/p1586900942057100
Source branch: fix/issue-1275