Command-line options for marian-server

Last updated: 22 March 2021


Marian: Fast Neural Machine Translation in C++

Version: v1.12.0 65bf82f 2023-02-21 09:56:29 -0800

Usage: ./marian-server [OPTIONS]

General options

-h,--help                             Print this help message and exit
--version                             Print the version number and exit
--authors                             Print list of authors and exit
--cite                                Print citation and exit
--build-info TEXT                     Print CMake build options and exit. Set to 'all' to print 
                                      advanced options
-c,--config VECTOR ...                Configuration file(s). If multiple, later overrides earlier
-w,--workspace INT=512                Preallocate arg MB of work space. Negative `--workspace -N` 
                                      value allocates workspace as total available GPU memory 
                                      minus N megabytes.
--log TEXT                            Log training process information to file given by arg
--log-level TEXT=info                 Set verbosity level of logging: trace, debug, info, warn, 
                                      err(or), critical, off
--log-time-zone TEXT                  Set time zone for the date shown on logging
--quiet                               Suppress all logging to stderr. Logging to files still works
--quiet-translation                   Suppress logging for translation
--seed UINT                           Seed for all random number generators. 0 means initialize 
                                      randomly
--check-nan                           Check for NaNs or Infs in forward and backward pass. Will 
                                      abort when found. This is a diagnostic option that will 
                                      slow down computation significantly
--interpolate-env-vars                Allow the use of environment variables in paths, of the form 
                                      ${VAR_NAME}
--relative-paths                      All paths are relative to the config file location
--dump-config TEXT                    Dump current (modified) configuration to stdout and exit. 
                                      Possible values: full, minimal, expand
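
The --config and --workspace descriptions above each imply a small rule. The Python sketch below illustrates both under stated assumptions: configurations are treated as flat key-value mappings (real config files may merge differently), total_gpu_mb stands in for whatever memory figure the server actually queries, and the function names merge_configs and workspace_mb are hypothetical.

```python
def merge_configs(configs):
    """Merge config mappings in order; keys in later mappings win."""
    merged = {}
    for cfg in configs:
        merged.update(cfg)
    return merged


def workspace_mb(workspace, total_gpu_mb):
    """Resolve a --workspace value to a size in megabytes.

    Non-negative values are used as given. A negative value -N is read
    as 'total available GPU memory minus N megabytes'.
    """
    if workspace >= 0:
        return workspace
    return total_gpu_mb + workspace  # workspace is negative here


# Hypothetical settings, as if loaded from two config files.
base = {"beam-size": 12, "mini-batch": 1}
override = {"beam-size": 6}
print(merge_configs([base, override]))
print(workspace_mb(-2048, 16384))
```

With these inputs the later file's beam-size replaces the earlier one, and a 16384 MB device with --workspace -2048 leaves 14336 MB for the workspace.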

Server options

-p,--port UINT=8080                   Port number for web socket server
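
A client for the web socket server might look like the Python sketch below. Several things here are assumptions that this page does not confirm: the /translate path, the plain-text request and response format, and the use of the third-party websocket-client package. The helper names server_url and translate are hypothetical.

```python
def server_url(host="localhost", port=8080, path="/translate"):
    """Build the web socket URL; the default port matches --port."""
    return f"ws://{host}:{port}{path}"


def translate(text, url):
    """Send text to a running server and return its reply.

    Assumes the server accepts raw text and answers with one message.
    """
    # Imported lazily so the URL helper works without the package installed.
    from websocket import create_connection  # third-party: websocket-client

    ws = create_connection(url)
    try:
        ws.send(text)
        return ws.recv()
    finally:
        ws.close()


print(server_url())
```

Calling translate("Hello world", server_url()) requires a server already listening on the chosen port.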

Model options

-m,--models VECTOR ...                Paths to model(s) to be loaded. Supported file extensions: 
                                      .npz, .bin
--model-mmap                          Use memory-mapping when loading model (CPU only)
--ignore-model-config                 Ignore the model configuration saved in npz file
--type TEXT=amun                      Model type: amun, nematus, s2s, multi-s2s, transformer
--dim-vocabs VECTOR=0,0 ...           Maximum items in vocabulary ordered by rank, 0 uses all 
                                      items in the provided/created vocabulary file
--dim-emb INT=512                     Size of embedding vector
--factors-dim-emb INT                 Embedding dimension of the factors. Only used if concat is 
                                      selected as factors combining form
--factors-combine TEXT=sum            How to combine the factors and lemma embeddings. Options 
                                      available: sum, concat
--lemma-dependency TEXT               Lemma dependency method to use when predicting target 
                                      factors. Options: soft-transformer-layer, 
                                      hard-transformer-layer, lemma-dependent-bias, re-embedding
--lemma-dim-emb INT=0                 Re-embedding dimension of lemma in factors
--dim-rnn INT=1024                    Size of rnn hidden state
--enc-type TEXT=bidirectional         Type of encoder RNN: bidirectional, bi-unidirectional, 
                                      alternating (s2s)
--enc-cell TEXT=gru                   Type of RNN cell: gru, lstm, tanh (s2s)
--enc-cell-depth INT=1                Number of transitional cells in encoder layers (s2s)
--enc-depth INT=1                     Number of encoder layers (s2s)
--dec-cell TEXT=gru                   Type of RNN cell: gru, lstm, tanh (s2s)
--dec-cell-base-depth INT=2           Number of transitional cells in first decoder layer (s2s)
--dec-cell-high-depth INT=1           Number of transitional cells in next decoder layers (s2s)
--dec-depth INT=1                     Number of decoder layers (s2s)
--skip                                Use skip connections (s2s)
--layer-normalization                 Enable layer normalization
--right-left                          Train right-to-left model
--input-types VECTOR ...              Provide type of input data if different than 'sequence'. 
                                      Possible values: sequence, class, alignment, weight. You 
                                      need to provide one type per input file (if --train-sets) 
                                      or per TSV field (if --tsv).
--best-deep                           Use Edinburgh deep RNN configuration (s2s)
--tied-embeddings                     Tie target embeddings and output embeddings in output layer
--tied-embeddings-src                 Tie source and target embeddings
--tied-embeddings-all                 Tie all embedding layers and output layer
--output-omit-bias                    Do not use a bias vector in decoder output layer
--transformer-heads INT=8             Number of heads in multi-head attention (transformer)
--transformer-no-projection           Omit linear projection after multi-head attention 
                                      (transformer)
--transformer-rnn-projection          Add linear projection after rnn layer (transformer)
--transformer-pool                    Pool encoder states instead of using cross attention 
                                      (selects first encoder state, best used with special token)
--transformer-dim-ffn INT=2048        Size of position-wise feed-forward network (transformer)
--transformer-decoder-dim-ffn INT=0   Size of position-wise feed-forward network in decoder 
                                      (transformer). Uses --transformer-dim-ffn if 0.
--transformer-ffn-depth INT=2         Depth of filters (transformer)
--transformer-decoder-ffn-depth INT=0 Depth of filters in decoder (transformer). Uses 
                                      --transformer-ffn-depth if 0
--transformer-ffn-activation TEXT=swish
                                      Activation between filters: swish or relu (transformer)
--transformer-dim-aan INT=2048        Size of position-wise feed-forward network in AAN 
                                      (transformer)
--transformer-aan-depth INT=2         Depth of filter for AAN (transformer)
--transformer-aan-activation TEXT=swish
                                      Activation between filters in AAN: swish or relu (transformer)
--transformer-aan-nogate              Omit gate in AAN (transformer)
--transformer-decoder-autoreg TEXT=self-attention
                                      Type of autoregressive layer in transformer decoder: 
                                      self-attention, average-attention (transformer)
--transformer-tied-layers VECTOR ...  List of tied decoder layers (transformer)
--transformer-guided-alignment-layer TEXT=last
                                      Last or number of layer to use for guided alignment training 
                                      in transformer
--transformer-preprocess TEXT         Operation before each transformer layer: d = dropout, a = 
                                      add, n = normalize
--transformer-postprocess-emb TEXT=d  Operation after transformer embedding layer: d = dropout, a 
                                      = add, n = normalize
--transformer-postprocess TEXT=dan    Operation after each transformer layer: d = dropout, a = 
                                      add, n = normalize
--transformer-postprocess-top TEXT    Final operation after a full transformer stack: d = dropout, 
                                      a = add, n = normalize. The optional skip connection with 
                                      'a' by-passes the entire stack.
--transformer-train-position-embeddings
                                      Train positional embeddings instead of using static 
                                      sinusoidal embeddings
--transformer-depth-scaling           Scale down weight initialization in transformer layers by 1 
                                      / sqrt(depth)
--bert-mask-symbol TEXT=[MASK]        Masking symbol for BERT masked-LM training
--bert-sep-symbol TEXT=[SEP]          Sentence separator symbol for BERT next sentence prediction 
                                      training
--bert-class-symbol TEXT=[CLS]        Class symbol for BERT classifier training
--bert-masking-fraction FLOAT=0.15    Fraction of masked out tokens during training
--bert-train-type-embeddings=true     Train BERT type embeddings, set to false to use static 
                                      sinusoidal embeddings
--bert-type-vocab-size INT=2          Size of BERT type vocab (sentence A and B)
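
The --transformer-preprocess and --transformer-postprocess* options above take a string of operation letters, for example dan. The Python sketch below only decodes such a string into the named operations in order. The function name decode_ops is hypothetical, and rejecting unknown letters is a choice made for this sketch, not documented behavior.

```python
# Letter codes as listed in the option descriptions.
OPS = {"d": "dropout", "a": "add", "n": "normalize"}


def decode_ops(spec):
    """Translate an operation string such as 'dan' into operation names."""
    unknown = sorted(set(spec) - set(OPS))
    if unknown:
        raise ValueError(f"unknown operation letter(s): {unknown}")
    return [OPS[letter] for letter in spec]


print(decode_ops("dan"))
print(decode_ops("d"))
```

The default --transformer-postprocess value dan therefore reads as dropout, then add, then normalize.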

Translator options

-i,--input VECTOR=stdin ...           Paths to input file(s), stdin by default
-o,--output TEXT=stdout               Path to output file, stdout by default
-v,--vocabs VECTOR ...                Paths to vocabulary files. They must correspond to --input
-b,--beam-size UINT=12                Beam size used during search with validating translator
-n,--normalize FLOAT=0                Divide translation score by pow(translation length, arg)
--max-length-factor FLOAT=3           Maximum target length as source length times factor
--word-penalty FLOAT                  Subtract (arg * translation length) from translation score
--allow-unk                           Allow unknown words to appear in output
--allow-special                       Allow special symbols to appear in output, e.g. for 
                                      SentencePiece with byte-fallback do not suppress the 
                                      newline symbol
--n-best                              Generate n-best list
--alignment TEXT                      Return word alignment. Possible values: 0.0-1.0, hard, soft
--force-decode                        Use force-decoding of given prefixes. Forces decoding to 
                                      follow vocab IDs from last stream in the batch (or the 
                                      first stream, if there is only one). Use either as 
                                      `./marian-decoder --force-decode --input source.txt 
                                      prefixes.txt [...]` where inputs and prefixes align on 
                                      line-level or as `paste source.txt prefixes.txt | 
                                      ./marian-decoder --force-decode --tsv --tsv-fields 2 [...]` 
                                      when reading from stdin.
--word-scores                         Print word-level scores. One score per subword unit, not 
                                      normalized even if --normalize
--stat-freq TEXT=0                    Display speed information every arg mini-batches. Disabled 
                                      by default with 0, set to value larger than 0 to activate
--no-spm-decode                       Keep the output segmented into SentencePiece subwords
--max-length UINT=1000                Maximum length of a sentence in a training sentence pair
--max-length-crop                     Crop a sentence to max-length instead of omitting it if 
                                      longer than max-length
--tsv                                 Tab-separated input
--tsv-fields UINT                     Number of fields in the TSV input. By default, it is guessed 
                                      based on the model type
-d,--devices VECTOR=0 ...             Specifies GPU ID(s) to use for training. Defaults to 
                                      0..num-devices-1
--num-devices UINT                    Number of GPUs to use for this process. Defaults to 
                                      length(devices) or 1
--cpu-threads UINT=0                  Use CPU-based computation with this many independent 
                                      threads, 0 means GPU-based computation
--mini-batch INT=1                    Size of mini-batch used during batched translation
--mini-batch-words INT                Set mini-batch size based on words instead of sentences
--maxi-batch INT=1                    Number of batches to preload for length-based sorting
--maxi-batch-sort TEXT=none           Sorting strategy for maxi-batch: none, src, trg (not 
                                      available for decoder)
--data-threads UINT=8                 Number of concurrent threads to use during data reading and 
                                      processing
--fp16                                Shortcut for mixed precision inference with float16, 
                                      corresponds to: --precision float16
--precision VECTOR=float32 ...        Mixed precision for inference, set parameter type in 
                                      expression graph
--skip-cost                           Ignore model cost during translation, not recommended for 
                                      beam-size > 1
--shortlist VECTOR ...                Use softmax shortlist. Arguments: path first best prune
--weights VECTOR ...                  Scorer weights
--output-sampling VECTOR ...          Noise output layer with gumbel noise. Implicit default is 
                                      'full 1.0' for sampling from full distribution with softmax 
                                      temperature 1.0. Also accepts 'topk num temp' (e.g. topk 
                                      100 0.1) for top-100 sampling with temperature 0.1
--output-approx-knn VECTOR ...        Use approximate knn search in output layer (currently only 
                                      in transformer)
--optimize=false                      Optimize the graph on-the-fly
-g,--gemm-type TEXT=float32           GEMM Type to be used for on-line quantization/packing: 
                                      float32, packed16, packed8
--quantize-range FLOAT=0              Range for the on-line quantization of weight matrix in 
                                      multiple of this range and standard deviation, 0.0 means 
                                      min/max quantization
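
The arithmetic behind --normalize, --word-penalty and --max-length-factor can be written out directly from their descriptions. The Python sketch below treats each option in isolation. How the server combines them, and how it rounds the length limit, is not stated on this page, so the function names and the unrounded result are assumptions.

```python
def normalized_score(score, length, alpha):
    """--normalize: divide the score by pow(translation length, alpha)."""
    return score / (length ** alpha)


def penalized_score(score, length, penalty):
    """--word-penalty: subtract (penalty * translation length)."""
    return score - penalty * length


def max_target_length(source_length, factor=3.0):
    """--max-length-factor: source length times factor (left unrounded)."""
    return source_length * factor


# Illustrative numbers only: a 4-token hypothesis with log-score -6.0.
print(normalized_score(-6.0, 4, 1.0))
print(normalized_score(-6.0, 4, 0.0))
print(penalized_score(-6.0, 4, 0.5))
print(max_target_length(10))
```

With the default --normalize 0 the divisor is pow(length, 0) = 1, so the score is unchanged.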