Class MarianEmbedder

Class Documentation

class MarianEmbedder

MarianEmbedder takes a Marian sequence-to-sequence transformer model and produces sentence embeddings collected from the encoder.

Currently, the model file itself is expected to define how the embeddings are produced.

Public Functions

MarianEmbedder()
std::vector<std::vector<float>> embed(const std::string &input)

input is a single string containing multiple sentences separated by ‘\n’.

Returns one embedding vector per sentence, in the same order as the input sentences.

bool load(const std::string &modelPath, const std::string &vocabPath)

modelPath is the path to a Marian model; vocabPath is the path to a matching SentencePiece model with a *.spm suffix.
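
The following is a minimal sketch of how load() and embed() might be combined, based only on the signatures documented above. The helper names joinSentences and loadAndEmbed are hypothetical, as is the FakeEmbedder stand-in, which exists only so the sketch compiles without the Marian headers. It also assumes that load() returns true on success, which this page does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: build the newline-separated string that embed() expects.
std::string joinSentences(const std::vector<std::string>& sentences) {
  std::string input;
  for (std::size_t i = 0; i < sentences.size(); ++i) {
    // An embedded newline would be read as an extra sentence boundary.
    assert(sentences[i].find('\n') == std::string::npos);
    if (i > 0) input += '\n';
    input += sentences[i];
  }
  return input;
}

// Hypothetical helper: EmbedderT is any type exposing the load() and embed()
// signatures documented above (for example MarianEmbedder).
template <typename EmbedderT>
std::vector<std::vector<float>> loadAndEmbed(EmbedderT& embedder,
                                             const std::string& modelPath,
                                             const std::string& vocabPath,
                                             const std::vector<std::string>& sentences) {
  // Assumption: a false return value from load() signals failure.
  if (!embedder.load(modelPath, vocabPath))
    throw std::runtime_error("could not load model or vocabulary");
  if (sentences.empty()) return {};

  std::vector<std::vector<float>> embeddings = embedder.embed(joinSentences(sentences));
  // The documentation promises results in input sentence order, so the counts should match.
  if (embeddings.size() != sentences.size())
    throw std::runtime_error("embedding count does not match sentence count");
  return embeddings;
}

// Stand-in for illustration only, NOT the real class: "loads" any non-empty
// path and returns a 1-dimensional vector holding each line's length.
struct FakeEmbedder {
  bool load(const std::string& modelPath, const std::string& vocabPath) {
    return !modelPath.empty() && !vocabPath.empty();
  }
  std::vector<std::vector<float>> embed(const std::string& input) {
    std::vector<std::vector<float>> out;
    std::size_t start = 0;
    while (true) {
      std::size_t end = input.find('\n', start);
      std::size_t stop = (end == std::string::npos) ? input.size() : end;
      out.push_back({static_cast<float>(stop - start)});
      if (end == std::string::npos) break;
      start = end + 1;
    }
    return out;
  }
};
```

With the real class, the same helper would be called with a MarianEmbedder instance in place of FakeEmbedder. Row i of the result then corresponds to sentences[i].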