Generation

logits_process

Logits process

class mindnlp.generation.logits_process.EncoderNoRepeatNGramLogitsProcessor(encoder_ngram_size: int, encoder_input_ids: Tensor)[source]

Bases: LogitsProcessor

[LogitsProcessor] that enforces no repetition of encoder input ids n-grams for the decoder ids. See [ParlAI](https://github.com/facebookresearch/ParlAI/blob/master/parlai/core/torch_generator_agent.py#L1350).

Parameters:

encoder_ngram_size (int) – All ngrams of size ngram_size can only occur within the encoder input ids.
encoder_input_ids (Tensor) – The encoder_input_ids that should not be repeated within the decoder ids.

class mindnlp.generation.logits_process.EncoderRepetitionPenaltyLogitsProcessor(penalty: float, encoder_input_ids: Tensor)[source]

Bases: LogitsProcessor

[LogitsProcessor] enforcing an exponential penalty on tokens that are not in the original input.

Parameters:

penalty (float) – The parameter for hallucination penalty. 1.0 means no penalty.
encoder_input_ids (Tensor) – The encoder_input_ids that should be repeated within the decoder ids.

class mindnlp.generation.logits_process.ExponentialDecayLengthPenalty(exponential_decay_length_penalty: Tuple, eos_token_id: Union[int, List[int]], input_ids_seq_length: int)[source]

Bases: LogitsProcessor

[LogitsProcessor] that exponentially increases the score of the eos_token_id once the sequence length has passed the regulation start, i.e. start_index counted from the end of the input sequence.

Parameters:

exponential_decay_length_penalty (tuple(int, float), optional) – A tuple of the form (start_index, decay_factor), where start_index indicates where the penalty starts and decay_factor is the factor of exponential decay.
eos_token_id (Union[int, List[int]]) – The id of the end-of-sequence token. Optionally, use a list to set multiple end-of-sequence tokens.
input_ids_seq_length (int) – The length of the input sequence.

class mindnlp.generation.logits_process.ForceTokensLogitsProcessor(force_token_map: List[List[int]])[source]

Bases: LogitsProcessor

This processor takes a list of pairs of integers that map generation indices to the token indices that will be forced before sampling. The processor will set their log probs to inf so that they are sampled at their corresponding index.

class mindnlp.generation.logits_process.ForcedBOSTokenLogitsProcessor(bos_token_id: int)[source]

Bases: LogitsProcessor

[LogitsProcessor] that enforces the specified token as the first generated token.

Parameters:

bos_token_id (int) – The id of the token to force as the first generated token.

class mindnlp.generation.logits_process.ForcedEOSTokenLogitsProcessor(max_length: int, eos_token_id: Union[int, List[int]])[source]

Bases: LogitsProcessor

[LogitsProcessor] that enforces the specified token as the last generated token when max_length is reached.

Parameters:

max_length (int) – The maximum length of the sequence to be generated.
eos_token_id (Union[int, List[int]]) – The id of the token to force as the last generated token when max_length is reached. Optionally, use a list to set multiple end-of-sequence tokens.
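The forcing behaviour described above can be sketched on a plain list of scores. This is a standalone illustration, not the mindnlp implementation: the function name `force_eos_at_max_length`, the `cur_len == max_length - 1` check, and the choice of 0 as the score of the forced tokens are assumptions.

```python
from typing import List, Union


def force_eos_at_max_length(
    scores: List[float],
    cur_len: int,
    max_length: int,
    eos_token_id: Union[int, List[int]],
) -> List[float]:
    """Hypothetical helper: leave only EOS tokens selectable on the last step."""
    eos_ids = [eos_token_id] if isinstance(eos_token_id, int) else list(eos_token_id)
    # Assumption: the token chosen at this step is the last one that fits.
    if cur_len != max_length - 1:
        return list(scores)
    # Every non-EOS token becomes impossible; EOS tokens get a neutral score.
    return [0.0 if i in eos_ids else -float("inf") for i in range(len(scores))]
```

On any earlier step the scores pass through unchanged, so the processor only has an effect at the very end of generation.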

class mindnlp.generation.logits_process.HammingDiversityLogitsProcessor(diversity_penalty: float, num_beams: int, num_beam_groups: int)[source]

Bases: LogitsProcessor

[LogitsProcessor] that enforces diverse beam search. Note that this logits processor is only effective for [PreTrainedModel.group_beam_search]. See [Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models](https://arxiv.org/pdf/1610.02424.pdf) for more details.

Parameters:

diversity_penalty (float) – This value is subtracted from a beam’s score if it generates the same token as any beam from another group at a particular time. Note that diversity_penalty is only effective if group beam search is enabled.
num_beams (int) – Number of beams used for group beam search. See [this paper](https://arxiv.org/pdf/1610.02424.pdf) for more details.
num_beam_groups (int) – Number of groups to divide num_beams into in order to ensure diversity among different groups of beams. See [this paper](https://arxiv.org/pdf/1610.02424.pdf) for more details.
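A minimal sketch of the penalty for a single beam follows. It is a standalone illustration under stated assumptions: the helper name `apply_hamming_diversity` is hypothetical, and scaling the penalty by how many beams of earlier groups picked a token is an assumption about how the subtraction is applied.

```python
from collections import Counter
from typing import List


def apply_hamming_diversity(
    scores: List[float],
    previous_group_tokens: List[int],
    diversity_penalty: float,
) -> List[float]:
    """Hypothetical helper: penalise tokens already chosen by earlier beam groups."""
    # Count how often each token id was selected by beams of earlier groups.
    frequency = Counter(previous_group_tokens)
    # Counter returns 0 for unseen ids, so those scores stay unchanged.
    return [s - diversity_penalty * frequency[i] for i, s in enumerate(scores)]
```

The first group has no earlier groups to compare against, so its scores are left as they are.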

class mindnlp.generation.logits_process.InfNanRemoveLogitsProcessor[source]

Bases: LogitsProcessor

[LogitsProcessor] that removes all nan and inf values to prevent the generation method from failing. Note that this logits processor should only be used if necessary, since it can slow down the generation method.

class mindnlp.generation.logits_process.LogitNormalization[source]

Bases: LogitsProcessor, LogitsWarper

[LogitsWarper] and [LogitsProcessor] for normalizing the scores using log-softmax. It is important to normalize the scores during beam search, after applying the logits processors or warpers: the search algorithm used in this library normalizes the scores only before those steps, so they may need re-normalization, yet it still assumes that the scores are normalized when comparing hypotheses.
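The normalization itself is a log-softmax. The sketch below shows it on a plain list of floats; the function name `log_softmax` is chosen for illustration and is not a mindnlp call.

```python
import math
from typing import List


def log_softmax(scores: List[float]) -> List[float]:
    """Shift scores so that their exponentials sum to 1."""
    # Subtract the maximum first for numerical stability.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]
```

After this step the scores are log-probabilities, so hypotheses can be compared by summing them.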

class mindnlp.generation.logits_process.LogitsProcessor[source]

Bases: object

Abstract base class for all logit processors that can be applied during generation.

class mindnlp.generation.logits_process.LogitsProcessorList(iterable=(), /)[source]

Bases: list

This class can be used to create a list of [LogitsProcessor] or [LogitsWarper] to subsequently process a scores input tensor. This class inherits from list and adds a specific __call__ method to apply each [LogitsProcessor] or [LogitsWarper] to the inputs.
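The pattern of a list with an added `__call__` can be sketched as follows. `ProcessorChain` and the two lambda processors are hypothetical stand-ins; only the idea of applying each element in order comes from the description above.

```python
from typing import List


class ProcessorChain(list):
    """Hypothetical list subclass that applies each processor in order."""

    def __call__(self, input_ids: List[int], scores: List[float]) -> List[float]:
        for processor in self:
            # Each processor receives the scores produced by the previous one.
            scores = processor(input_ids, scores)
        return scores


# Two toy processors: double every score, then add one.
chain = ProcessorChain([
    lambda ids, s: [x * 2 for x in s],
    lambda ids, s: [x + 1 for x in s],
])
result = chain([0], [1.0, 2.0])
```

Because the chain is still a list, processors can be appended, inserted, or removed with the usual list methods, and their order determines the final scores.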

class mindnlp.generation.logits_process.LogitsWarper[source]

Bases: object

Abstract base class for all logit warpers that can be applied during generation with multinomial sampling.

class mindnlp.generation.logits_process.MinLengthLogitsProcessor(min_length: int, eos_token_id: Union[int, List[int]])[source]

Bases: LogitsProcessor

[LogitsProcessor] enforcing a min-length by setting EOS probability to 0.

Parameters:

min_length (int) – The minimum length below which the score of eos_token_id is set to -float("Inf").
eos_token_id (Union[int, List[int]]) – The id of the end-of-sequence token. Optionally, use a list to set multiple end-of-sequence tokens.
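As a standalone sketch of the rule described by these parameters (the helper name `mask_eos_before_min_length` is hypothetical and this is not the mindnlp code):

```python
from typing import List, Union


def mask_eos_before_min_length(
    scores: List[float],
    cur_len: int,
    min_length: int,
    eos_token_id: Union[int, List[int]],
) -> List[float]:
    """Hypothetical helper: make EOS impossible while the sequence is too short."""
    eos_ids = [eos_token_id] if isinstance(eos_token_id, int) else list(eos_token_id)
    out = list(scores)
    if cur_len < min_length:
        for token_id in eos_ids:
            # A score of -inf corresponds to a probability of 0 after softmax.
            out[token_id] = -float("inf")
    return out
```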

class mindnlp.generation.logits_process.MinNewTokensLengthLogitsProcessor(prompt_length_to_skip: int, min_new_tokens: int, eos_token_id: int)[source]

Bases: LogitsProcessor

[LogitsProcessor] enforcing a min-length of new tokens by setting EOS (End-Of-Sequence) token probability to 0.

Parameters:

prompt_length_to_skip (int) – The input tokens length.
min_new_tokens (int) – The minimum new tokens length below which the score of eos_token_id is set to -float("Inf").
eos_token_id (int) – The id of the end-of-sequence token.

class mindnlp.generation.logits_process.NoBadWordsLogitsProcessor(bad_words_ids: List[List[int]], eos_token_id: Union[int, List[int]])[source]

Bases: LogitsProcessor

[LogitsProcessor] that enforces that specified sequences will never be sampled.

Parameters:

bad_words_ids (List[List[int]]) – List of lists of token ids that are not allowed to be generated. To get the token ids of the words that should not appear in the generated text, use tokenizer(bad_words, add_prefix_space=True, add_special_tokens=False).input_ids.
eos_token_id (Union[int, List[int]]) – The id of the end-of-sequence token. Optionally, use a list to set multiple end-of-sequence tokens.

class mindnlp.generation.logits_process.NoRepeatNGramLogitsProcessor(ngram_size: int)[source]

Bases: LogitsProcessor

[LogitsProcessor] that enforces no repetition of n-grams. See [Fairseq](https://github.com/pytorch/fairseq/blob/a07cb6f40480928c9e0548b737aadd36ee66ac76/fairseq/sequence_generator.py#L345).

Parameters:

ngram_size (int) – All ngrams of size ngram_size can only occur once.
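One way to find the tokens that would repeat an n-gram is sketched below for a single sequence. The function name `banned_next_tokens` is hypothetical; a processor would then set the scores of the returned ids to -inf.

```python
from typing import Dict, List, Tuple


def banned_next_tokens(generated: List[int], ngram_size: int) -> List[int]:
    """Hypothetical helper: tokens that would complete an already seen n-gram."""
    if ngram_size < 1 or len(generated) + 1 < ngram_size:
        return []
    # Map every (n-1)-token prefix to the tokens that have followed it so far.
    followers: Dict[Tuple[int, ...], List[int]] = {}
    for i in range(len(generated) - ngram_size + 1):
        ngram = generated[i:i + ngram_size]
        followers.setdefault(tuple(ngram[:-1]), []).append(ngram[-1])
    # The last (n-1) generated tokens are the prefix of the next n-gram.
    prefix = tuple(generated[len(generated) - ngram_size + 1:])
    return followers.get(prefix, [])
```

For example, with the sequence 1, 2, 3, 1, 2 and ngram_size 3, the prefix (1, 2) was already followed by 3, so 3 is the only banned token at the next step.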

class mindnlp.generation.logits_process.PrefixConstrainedLogitsProcessor(prefix_allowed_tokens_fn: Callable[[int, Tensor], List[int]], num_beams: int)[source]

Bases: LogitsProcessor

[LogitsProcessor] that enforces constrained generation and is useful for prefix-conditioned constrained generation. See [Autoregressive Entity Retrieval](https://arxiv.org/abs/2010.00904) for more information.

Parameters:

prefix_allowed_tokens_fn (Callable[[int, Tensor], List[int]]) – This function constrains the beam search to allowed tokens only at each step. It takes 2 arguments: the batch ID batch_id and the previously generated tokens inputs_ids. It has to return a list with the allowed tokens for the next generation step, conditioned on inputs_ids and batch_id.
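A minimal sketch of how such a callback can be used to mask scores follows. Both `constrain_scores` and the toy rule in `allow_after_prefix` are hypothetical, and the sketch works on Python lists rather than tensors.

```python
from typing import Callable, List


def constrain_scores(
    scores: List[float],
    batch_id: int,
    input_ids: List[int],
    prefix_allowed_tokens_fn: Callable[[int, List[int]], List[int]],
) -> List[float]:
    """Hypothetical helper: keep only the tokens the callback allows."""
    allowed = set(prefix_allowed_tokens_fn(batch_id, input_ids))
    return [s if i in allowed else -float("inf") for i, s in enumerate(scores)]


def allow_after_prefix(batch_id: int, input_ids: List[int]) -> List[int]:
    """Toy rule over a 4-token vocabulary: after token 1, only 2 or 3 may follow."""
    if input_ids and input_ids[-1] == 1:
        return [2, 3]
    return [0, 1, 2, 3]
```

The callback sees the whole prefix, so the rule can depend on any earlier token, not only the last one.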

class mindnlp.generation.logits_process.RepetitionPenaltyLogitsProcessor(penalty: float)[source]

Bases: LogitsProcessor

[LogitsProcessor] enforcing an exponential penalty on repeated sequences.

Parameters:

penalty (float) – The parameter for repetition penalty. 1.0 means no penalty. See [this paper](https://arxiv.org/pdf/1909.05858.pdf) for more details.
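The sketch below illustrates one common way to apply such a penalty to tokens that already appear in the sequence. It assumes that positive scores are divided by the penalty and negative scores are multiplied by it, so that a penalty above 1.0 always lowers the score; whether mindnlp uses exactly this rule is an assumption, and `apply_repetition_penalty` is a hypothetical name.

```python
from typing import List


def apply_repetition_penalty(
    scores: List[float], input_ids: List[int], penalty: float
) -> List[float]:
    """Hypothetical helper: lower the scores of tokens that were already generated."""
    out = list(scores)
    for token_id in set(input_ids):
        score = out[token_id]
        # Assumption: multiply negative scores and divide positive ones,
        # so a penalty above 1.0 reduces the score in both cases.
        out[token_id] = score * penalty if score < 0 else score / penalty
    return out
```

With a penalty of 1.0 both branches leave the score unchanged, which matches the "no penalty" case in the parameter description.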

class mindnlp.generation.logits_process.SuppressTokensAtBeginLogitsProcessor(begin_suppress_tokens, begin_index)[source]

Bases: LogitsProcessor

[SuppressTokensAtBeginLogitsProcessor] suppresses a list of tokens as soon as the generate function starts generating using begin_index tokens. This should ensure that the tokens defined by begin_suppress_tokens are not sampled at the beginning of the generation.

class mindnlp.generation.logits_process.SuppressTokensLogitsProcessor(suppress_tokens)[source]

Bases: LogitsProcessor

This processor can be used to suppress a list of tokens. The processor will set their log probs to -inf so that they are not sampled.
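As a standalone illustration on a plain list of scores (the helper name `suppress_tokens` is hypothetical):

```python
from typing import List


def suppress_tokens(scores: List[float], suppress_ids: List[int]) -> List[float]:
    """Hypothetical helper: make the listed token ids impossible to sample."""
    banned = set(suppress_ids)
    return [-float("inf") if i in banned else s for i, s in enumerate(scores)]
```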

class mindnlp.generation.logits_process.TemperatureLogitsWarper(temperature: float)[source]

Bases: LogitsWarper

[TemperatureLogitsWarper] for temperature (exponential scaling of the output probability distribution).

Parameters:

temperature (float) – The value used to modulate the logits distribution.
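Temperature scaling is usually implemented by dividing the logits by the temperature before the softmax. The sketch below assumes that convention; `apply_temperature` and `softmax` are illustrative helpers, not mindnlp functions.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Convert logits to probabilities (max-shifted for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def apply_temperature(scores: List[float], temperature: float) -> List[float]:
    """Hypothetical helper: divide every logit by a strictly positive temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be strictly positive")
    return [s / temperature for s in scores]


# A temperature below 1.0 sharpens the distribution, above 1.0 flattens it.
sharp = softmax(apply_temperature([1.0, 2.0], 0.5))
flat = softmax(apply_temperature([1.0, 2.0], 2.0))
```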

class mindnlp.generation.logits_process.TopKLogitsWarper(top_k: int, filter_value: float = -inf, min_tokens_to_keep: int = 1)[source]

Bases: LogitsWarper

[LogitsWarper] that performs top-k, i.e. restricting to the k highest probability elements.

Parameters:

top_k (int) – The number of highest probability vocabulary tokens to keep for top-k-filtering.
filter_value (float, optional, defaults to -float("Inf")) – All filtered values will be set to this float value.
min_tokens_to_keep (int, optional, defaults to 1) – Minimum number of tokens that cannot be filtered.
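The three parameters interact as in the following sketch, written for a single row of scores. The function name `top_k_filter` is hypothetical, and keeping tokens that tie with the k-th score is an assumption.

```python
from typing import List


def top_k_filter(
    scores: List[float],
    top_k: int,
    filter_value: float = -float("inf"),
    min_tokens_to_keep: int = 1,
) -> List[float]:
    """Hypothetical helper: keep the k highest scores, replace the rest."""
    # Never keep fewer than min_tokens_to_keep or more than the vocabulary size.
    k = min(max(top_k, min_tokens_to_keep), len(scores))
    threshold = sorted(scores, reverse=True)[k - 1]
    # Assumption: scores equal to the threshold are kept.
    return [s if s >= threshold else filter_value for s in scores]
```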

class mindnlp.generation.logits_process.TopPLogitsWarper(top_p: float, filter_value: float = -inf, min_tokens_to_keep: int = 1)[source]

Bases: LogitsWarper

[LogitsWarper] that performs top-p, i.e. restricting to the smallest set of most probable tokens whose cumulative probability reaches top_p.

Parameters:

top_p (float) – If set to < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation.
filter_value (float, optional, defaults to -float("Inf")) – All filtered values will be set to this float value.
min_tokens_to_keep (int, optional, defaults to 1) – Minimum number of tokens that cannot be filtered.
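A minimal sketch of nucleus filtering on a single row of scores follows. The function name `top_p_filter` is hypothetical, and the handling of ties and of the boundary token is an assumption based on the parameter description above.

```python
import math
from typing import List


def top_p_filter(
    scores: List[float],
    top_p: float,
    filter_value: float = -float("inf"),
    min_tokens_to_keep: int = 1,
) -> List[float]:
    """Hypothetical helper: keep the smallest top set whose mass reaches top_p."""
    # Softmax over the scores (max-shifted for stability).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Visit tokens from most to least probable.
    order = sorted(range(len(scores)), key=lambda i: probs[i], reverse=True)
    keep = set()
    cumulative = 0.0
    for rank, i in enumerate(order):
        # Stop once the kept tokens already cover top_p, but never before
        # min_tokens_to_keep tokens have been kept.
        if rank >= min_tokens_to_keep and cumulative >= top_p:
            break
        keep.add(i)
        cumulative += probs[i]
    return [s if i in keep else filter_value for i, s in enumerate(scores)]
```

With probabilities 0.5, 0.3 and 0.2 and top_p set to 0.7, the first two tokens are kept because 0.5 alone does not reach 0.7, and the third is filtered.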

stopping_criteria

Stopping criteria

class mindnlp.generation.stopping_criteria.MaxLengthCriteria(max_length: int)[source]

Bases: StoppingCriteria

This class can be used to stop generation whenever the full generated number of tokens exceeds max_length. Keep in mind for decoder-only type of transformers, this will include the initial prompted tokens.

Parameters:

max_length (int) – The maximum length that the output sequence can have in number of tokens.

class mindnlp.generation.stopping_criteria.MaxNewTokensCriteria(start_length: int, max_new_tokens: int)[source]

Bases: StoppingCriteria

This class can be used to stop generation whenever the generated number of tokens exceeds max_new_tokens. Keep in mind for decoder-only type of transformers, this will not include the initial prompted tokens. This is very close to MaxLengthCriteria but ignores the number of initial tokens.

Parameters:

start_length (int) – The number of initial tokens.
max_new_tokens (int) – The maximum number of tokens to generate.
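The relationship between the two length criteria can be sketched with plain integers. The function names are hypothetical, and stopping as soon as the limit is reached (rather than strictly exceeded) is an assumption about the comparison used.

```python
def should_stop_max_length(cur_len: int, max_length: int) -> bool:
    """Hypothetical check: the total length, prompt included, has hit the limit."""
    return cur_len >= max_length


def should_stop_max_new_tokens(cur_len: int, start_length: int, max_new_tokens: int) -> bool:
    """Hypothetical check: only tokens generated after the prompt are counted."""
    return cur_len >= start_length + max_new_tokens
```

Under these assumptions, limiting new tokens is equivalent to a total length limit of start_length + max_new_tokens.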

class mindnlp.generation.stopping_criteria.MaxTimeCriteria(max_time: float, initial_timestamp: Optional[float] = None)[source]

Bases: StoppingCriteria

This class can be used to stop generation whenever the full generation exceeds some amount of time. By default, the time will start being counted when you initialize this class. You can override this by passing an initial_timestamp.

Parameters:

max_time (float) – The maximum allowed time in seconds for the generation.
initial_timestamp (float, optional, defaults to time.time()) – The start of the generation allowed time.
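A time-based criterion of this kind can be sketched as below. `TimeBudget` is a hypothetical stand-in, and the optional `now` argument exists only so the example can be checked without waiting.

```python
import time
from typing import Optional


class TimeBudget:
    """Hypothetical stand-in for a time-based stopping criterion."""

    def __init__(self, max_time: float, initial_timestamp: Optional[float] = None):
        self.max_time = max_time
        # By default the clock starts when the object is created.
        self.initial_timestamp = time.time() if initial_timestamp is None else initial_timestamp

    def __call__(self, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Stop once more than max_time seconds have elapsed since the start.
        return now - self.initial_timestamp > self.max_time
```

Passing an explicit start timestamp lets the budget include work done before the criterion was created, such as tokenization.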

class mindnlp.generation.stopping_criteria.StoppingCriteria[source]

Bases: object

Abstract base class for all stopping criteria that can be applied during generation.

class mindnlp.generation.stopping_criteria.StoppingCriteriaList(iterable=(), /)[source]

Bases: list

property max_length: Optional[int]: return max length

mindnlp.generation.stopping_criteria.validate_stopping_criteria(stopping_criteria: StoppingCriteriaList, max_length: int) → StoppingCriteriaList[source]: validate stopping criteria