Production
Module: generative_models.inference.optimization.production
Source: generative_models/inference/optimization/production.py
Overview
Note: This module is experimental and under active development as part of ongoing work towards production-ready inference capabilities.
Inference optimization infrastructure for scaled models.
The module provides:
- Automatic optimization pipeline selection
- Inference optimization strategies
- Model adapter classes for different architectures
- Monitoring and debugging tools
All implementations follow JAX/Flax NNX best practices and prioritize performance through hardware-aware optimization.
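The monitoring tools mentioned above can be illustrated with a minimal, standard-library sketch. The page does not document the signatures of record_request, get_metrics, or reset, so everything below is an assumption: the classes SketchMetrics and SketchMonitor, the helper timed_predict, and all field and parameter names are hypothetical and only echo the function names listed on this page.

```python
import time
from dataclasses import dataclass, replace
from typing import Any, Callable


@dataclass
class SketchMetrics:
    """Aggregate request counters (hypothetical field names)."""

    request_count: int = 0
    error_count: int = 0
    total_latency_s: float = 0.0

    @property
    def mean_latency_s(self) -> float:
        # Avoid division by zero before any request has been recorded.
        if self.request_count == 0:
            return 0.0
        return self.total_latency_s / self.request_count


class SketchMonitor:
    """Hypothetical monitor; not the module's ProductionMonitor."""

    def __init__(self) -> None:
        self._metrics = SketchMetrics()

    def record_request(self, latency_s: float, ok: bool = True) -> None:
        # Accumulate one request's latency and outcome.
        self._metrics.request_count += 1
        self._metrics.total_latency_s += latency_s
        if not ok:
            self._metrics.error_count += 1

    def get_metrics(self) -> SketchMetrics:
        # Return a copy so callers cannot mutate internal state.
        return replace(self._metrics)

    def reset(self) -> None:
        # Discard all accumulated counters.
        self._metrics = SketchMetrics()


def timed_predict(
    monitor: SketchMonitor, predict_fn: Callable[[Any], Any], inputs: Any
) -> Any:
    """Call predict_fn and record its wall-clock latency in the monitor."""
    start = time.perf_counter()
    try:
        result = predict_fn(inputs)
    except Exception:
        # Failed calls are still counted, then the error propagates.
        monitor.record_request(time.perf_counter() - start, ok=False)
        raise
    monitor.record_request(time.perf_counter() - start, ok=True)
    return result
```

If predict_fn wraps a JAX computation, note that JAX dispatches work asynchronously, so a timing wrapper like this would need to wait for the result (for example with jax.block_until_ready) before reading the clock. Otherwise it measures dispatch time rather than compute time.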
Classes
CompiledModel
MonitoringMetrics
OptimizationResult
OptimizationTarget
ProductionMonitor
ProductionOptimizer
ProductionPipeline
Functions
__call__
__init__
__init__
__init__
__init__
compiled_forward
create_production_optimizer
create_production_pipeline
create_production_pipeline
get_metrics
get_monitoring_metrics
optimize_for_production
predict
predict_batch
record_request
reset
reset_monitoring
Module Statistics
- Classes: 7
- Functions: 17
- Imports: 7