Tokenization

Module: generative_models.extensions.nlp.tokenization

Source: generative_models/extensions/nlp/tokenization.py

Overview

Advanced tokenization utilities for text generation tasks.

This module provides JAX-compatible tokenization helpers for text processing and generation: batch encoding and decoding, special-token handling, attention-mask and position-id construction, masking, and sequence truncation.

Classes

AdvancedTokenization

class AdvancedTokenization

Functions

__call__

def __call__()

__init__

def __init__()

add_special_tokens

def add_special_tokens()
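
The page lists this function without a signature or description. As a minimal concept sketch of what special-token handling typically does, the snippet below wraps a sequence of token ids with beginning- and end-of-sequence markers; the ids and parameter names are placeholders, not the module's actual API:

```python
BOS_ID, EOS_ID = 1, 2  # placeholder ids; the real vocabulary defines its own

def add_special_tokens(token_ids):
    """Wrap a token-id sequence with BOS and EOS markers."""
    return [BOS_ID] + list(token_ids) + [EOS_ID]
```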

apply_masking

def apply_masking()
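
No signature is documented here. A common form of masking for generation and masked-LM training replaces a random fraction of tokens with a mask id and records labels only at masked positions; the sketch below illustrates that pattern in plain Python (the mask id, probability, and -100 ignore index are conventional assumptions, not taken from this module):

```python
import random

MASK_ID = 4  # placeholder mask-token id

def apply_masking(token_ids, mask_prob=0.15, seed=0):
    """Randomly replace tokens with MASK_ID; return (masked ids, labels).

    Labels keep the original id at masked positions and -100 elsewhere,
    the conventional ignore index for masked-LM losses.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in token_ids:
        if rng.random() < mask_prob:
            masked.append(MASK_ID)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(-100)
    return masked, labels
```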

compute_token_frequencies

def compute_token_frequencies()
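
As a concept sketch (the module's real signature is not documented), computing token frequencies usually means counting how often each id occurs across a batch of sequences:

```python
from collections import Counter

def compute_token_frequencies(batch_ids):
    """Count occurrences of each token id across a batch of id sequences."""
    counts = Counter()
    for seq in batch_ids:
        counts.update(seq)
    return dict(counts)
```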

create_attention_mask

def create_attention_mask()
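
The standard attention mask marks real tokens with 1 and padding with 0, so attention can be suppressed at padded positions. A minimal sketch, assuming a padding id of 0 (a placeholder, not confirmed by this module):

```python
PAD_ID = 0  # placeholder padding id

def create_attention_mask(token_ids):
    """Return 1 for real tokens and 0 for padding positions."""
    return [0 if tok == PAD_ID else 1 for tok in token_ids]
```

In a JAX pipeline the same logic is typically expressed as an array comparison, e.g. `(ids != pad_id).astype(jnp.int32)`.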

create_position_ids

def create_position_ids()
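
Position ids give the model each token's index in the sequence. In the simplest convention they are just 0..n-1; the sketch below shows that baseline (variants that skip padding exist, and which one this module uses is not documented):

```python
def create_position_ids(token_ids):
    """Sequential position index for each token in the sequence."""
    return list(range(len(token_ids)))
```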

decode_batch

def decode_batch()

detokenize

def detokenize()

encode_batch

def encode_batch()
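
Batch encoding typically right-pads variable-length id sequences into a rectangular batch so they can be stacked into a single array (a requirement for JAX's fixed-shape arrays). A concept sketch with a placeholder padding id; the real parameter names are not documented here:

```python
PAD_ID = 0  # placeholder padding id

def encode_batch(sequences, max_length=None):
    """Right-pad (and clip) id sequences into a rectangular batch."""
    if max_length is None:
        max_length = max(len(s) for s in sequences)
    return [list(s)[:max_length] + [PAD_ID] * (max_length - len(s))
            for s in sequences]
```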

get_vocabulary_info

def get_vocabulary_info()

tokenize

def tokenize()

truncate_sequences

def truncate_sequences()
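
Truncation clips each sequence to a maximum length so batches fit a fixed model context. A minimal sketch of the usual keep-the-start behavior (whether this module also supports other truncation strategies is not documented):

```python
def truncate_sequences(sequences, max_length):
    """Clip each id sequence to at most max_length tokens, keeping the start."""
    return [list(s)[:max_length] for s in sequences]
```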

Module Statistics

  • Classes: 1
  • Functions: 13
  • Imports: 5