NEWS
wordvector 0.6.2 (2026-04-06)
- Add layer to perplexity() for textmodel_doc2vec models.
- Save document lengths as ntoken in trained textmodel_doc2vec models.
- Update as.textmodel_doc2vec() to save output layer weights.
- Update tests for quanteda v4.4.0.
wordvector 0.6.1 (2026-02-25)
- Mention doc2vec in package description.
- Add perplexity() to assess models' goodness-of-fit to data.
- Save quanteda's internal docvars in the textmodel_doc2vec objects.
- Add group to as.matrix() to average sentence or paragraph vectors from the same documents.
wordvector 0.6.0 (2025-12-09)
- Upgrade textmodel_doc2vec to train the distributed memory (DM) and distributed bag-of-words (DBOW) models.
- Add as.textmodel_doc2vec() to create document vectors as a weighted average of word vectors.
- Add layer to as.matrix() to choose between word or document vectors.
- normalize is now defunct in textmodel_word2vec().
wordvector 0.5.1 (2025-06-20)
- Add normalize to textmodel_doc2vec() and pass it to as.matrix().
- Add weights to textmodel_doc2vec() to adjust the salience of words in the document vectors.
- Add include_data to textmodel_word2vec() to save the original tokens object.
wordvector 0.5.0 (2025-05-15)
- Add the model argument to textmodel_word2vec() to update existing models.
- The normalize argument is moved from textmodel_word2vec() to as.matrix(). The original argument is deprecated and set to FALSE by default.
- Remove weights().
- Improve the structure of C++ code.
wordvector 0.4.0
- Add the tolower argument and set it to TRUE to lower-case tokens.
- Allow x to be quanteda's tokens_xptr object to enhance efficiency.
wordvector 0.3.0 (2025-03-12)
- Save docvars in the textmodel_doc2vec objects.
- Set zero for empty documents in the textmodel_doc2vec objects.
- Add probability() to compute probability of words.
wordvector 0.2.0 (2025-01-07)
- Rename word2vec(), doc2vec() and lsa() to textmodel_word2vec(), textmodel_doc2vec() and textmodel_lsa() respectively.
- Simplify the C++ code to make maintenance easier.
- Add normalize to word2vec to disable or enable word vector normalization.
- Add weights() to extract back-propagation weights.
- Make analogy() convert a formula to a named character vector.
- Improve the stability of word2vec() when verbose = TRUE.
wordvector 0.1.0 (2024-12-11)
- Fork https://github.com/bnosac/word2vec and change the package name to wordvector.
- Replace a list of character vectors with quanteda's tokens object as the input object.
- Recreate word2vec() with new argument names and object structures.
- Create lsa() to train word vectors using Latent Semantic Analysis.
- Add similarity() and analogy() functions using proxyC.
- Add data_corpus_news2014, which contains 20,000 news summaries, as package data.