Production

Module: generative_models.inference.optimization.production

Source: generative_models/inference/optimization/production.py

Overview

Note: This module is experimental and under active development as part of ongoing work towards production-ready inference capabilities.

This module provides inference optimization infrastructure for scaled models, including:

  • Automatic optimization pipeline selection
  • Inference optimization strategies
  • Model adapter classes for different architectures
  • Monitoring and debugging tools

All implementations follow JAX/Flax NNX best practices and prioritize performance through hardware-aware optimization.
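This page lists no signatures, so the monitoring pattern suggested by the names `record_request`, `get_metrics`, and `reset` can only be sketched under assumptions. In the example below, `SimpleMonitor`, `RequestStats`, `timed_predict`, and every parameter are hypothetical stand-ins written in plain Python. They are not the module's `ProductionMonitor` or `MonitoringMetrics`, whose real fields and arguments may differ.

```python
import time
from dataclasses import dataclass


@dataclass
class RequestStats:
    """Hypothetical metrics container (not the module's MonitoringMetrics)."""

    request_count: int = 0
    error_count: int = 0
    total_latency_s: float = 0.0

    @property
    def mean_latency_s(self) -> float:
        # Guard against division by zero before any request is recorded.
        if self.request_count == 0:
            return 0.0
        return self.total_latency_s / self.request_count


class SimpleMonitor:
    """Hypothetical stand-in showing a record / get / reset cycle."""

    def __init__(self) -> None:
        self._stats = RequestStats()

    def record_request(self, latency_s: float, ok: bool = True) -> None:
        # Accumulate one request's latency and, if it failed, count the error.
        self._stats.request_count += 1
        self._stats.total_latency_s += latency_s
        if not ok:
            self._stats.error_count += 1

    def get_metrics(self) -> RequestStats:
        return self._stats

    def reset(self) -> None:
        # Start a fresh measurement window.
        self._stats = RequestStats()


def timed_predict(monitor: SimpleMonitor, predict_fn, inputs):
    """Call predict_fn(inputs) and record its wall-clock latency."""
    start = time.perf_counter()
    try:
        result = predict_fn(inputs)
    except Exception:
        monitor.record_request(time.perf_counter() - start, ok=False)
        raise
    monitor.record_request(time.perf_counter() - start, ok=True)
    return result
```

Wrapping the prediction call rather than the model keeps timing separate from inference. When timing a JAX function this way, the result typically needs to be forced (for example with `block_until_ready()`) before the clock stops, because JAX dispatches work asynchronously.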

Classes

CompiledModel

class CompiledModel

MonitoringMetrics

class MonitoringMetrics

OptimizationResult

class OptimizationResult

OptimizationTarget

class OptimizationTarget

ProductionMonitor

class ProductionMonitor

ProductionOptimizer

class ProductionOptimizer

ProductionPipeline

class ProductionPipeline

Functions

__call__

def __call__()

__init__

def __init__()

__init__

def __init__()

__init__

def __init__()

__init__

def __init__()

compiled_forward

def compiled_forward()

create_production_optimizer

def create_production_optimizer()

create_production_pipeline

def create_production_pipeline()

create_production_pipeline

def create_production_pipeline()

get_metrics

def get_metrics()

get_monitoring_metrics

def get_monitoring_metrics()

optimize_for_production

def optimize_for_production()

predict

def predict()

predict_batch

def predict_batch()

record_request

def record_request()

reset

def reset()

reset_monitoring

def reset_monitoring()

Module Statistics

  • Classes: 7
  • Functions: 17
  • Imports: 7