Created by: vikmary
Do not split stopwords into subtokens in StreamSpacyTokenizer.
Example of use:

from deeppavlov.models.tokenizers.spacy_tokenizer import StreamSpacyTokenizer

# Passing '__PERSON__' as a stopword keeps it from being split into subtokens.
# ("пошел гулять" is Russian for "went for a walk"; note the output is lowercased.)
tok = StreamSpacyTokenizer(alphas_only=False, stopwords=['__PERSON__'])
tok(['__PERSON__ пошел гулять'])
> [['__person__', 'пошел', 'гулять']]
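
For context, here is a minimal standalone sketch of the same idea, independent of the DeepPavlov class (the helper name keep_stopwords_whole and the whitespace pre-split are illustrative assumptions, not the library's actual implementation):

import spacy

def keep_stopwords_whole(text, stopwords, nlp):
    # Pre-split on whitespace; pass stopwords through untouched and
    # let spaCy tokenize everything else into subtokens.
    tokens = []
    for chunk in text.split():
        if chunk in stopwords:
            tokens.append(chunk.lower())
        else:
            tokens.extend(t.text.lower() for t in nlp(chunk))
    return tokens

nlp = spacy.blank('en')  # blank pipeline: tokenizer only
keep_stopwords_whole('__PERSON__ went for a walk', {'__PERSON__'}, nlp)
> ['__person__', 'went', 'for', 'a', 'walk']

Checking each whitespace chunk against the stopword set before handing it to spaCy is what guarantees placeholders like '__PERSON__' survive as single tokens.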