
Performance

Module: generative_models.core.performance

Source: generative_models/core/performance.py

Overview

Performance infrastructure for roofline analysis and optimization.

This module provides core performance analysis capabilities including:

  • Hardware detection and specification
  • Roofline model analysis for performance estimation
  • FLOP counting and arithmetic intensity calculation
  • JAX function profiling and benchmarking

All implementations follow JAX/Flax NNX best practices and avoid NumPy in performance-critical code paths.

Classes

HardwareDetector

class HardwareDetector

HardwareSpecs

class HardwareSpecs

PerformanceEstimator

class PerformanceEstimator

RooflineMetrics

class RooflineMetrics

Functions

__init__

def __init__()

__init__

def __init__()

analyze_roofline

def analyze_roofline()
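The signature of `analyze_roofline` is not shown on this page, so the following is only a minimal sketch of the general roofline calculation, not this module's implementation. The names `RooflinePoint` and `roofline` are hypothetical. The model caps attainable throughput at the smaller of peak compute and bandwidth times arithmetic intensity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RooflinePoint:
    """Hypothetical result container; not the module's RooflineMetrics."""

    arithmetic_intensity: float  # FLOPs performed per byte moved
    attainable_flops: float      # FLOP/s ceiling predicted by the roofline
    ridge_point: float           # intensity at which compute becomes the limit
    memory_bound: bool           # True when bandwidth is the limiting resource


def roofline(flops: float, bytes_moved: float,
             peak_flops: float, peak_bandwidth: float) -> RooflinePoint:
    """Place one operation on the roofline of a device.

    peak_flops is in FLOP/s and peak_bandwidth in bytes/s.
    """
    intensity = flops / bytes_moved
    ridge = peak_flops / peak_bandwidth
    attainable = min(peak_flops, peak_bandwidth * intensity)
    return RooflinePoint(intensity, attainable, ridge, intensity < ridge)
```

An operation whose intensity falls below the ridge point is memory bound: raising its FLOP count per byte (for example with a larger batch) improves throughput, while faster arithmetic alone does not.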

benchmark_operation

def benchmark_operation()
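How `benchmark_operation` measures time is not documented here. As a rough sketch of one common way to time a JAX function, the hypothetical helper `time_jitted` below runs warm-up calls so that compilation is excluded, and blocks on the result because JAX dispatches work asynchronously.

```python
import time

import jax
import jax.numpy as jnp


def time_jitted(fn, *args, warmup: int = 2, iters: int = 10) -> float:
    """Return the mean wall-clock seconds per call of a jitted function.

    Hypothetical helper for illustration; not part of this module.
    """
    jitted = jax.jit(fn)
    # Warm-up calls trigger tracing and compilation outside the timed region.
    for _ in range(warmup):
        jax.block_until_ready(jitted(*args))
    start = time.perf_counter()
    for _ in range(iters):
        # Block each iteration so the timer covers the actual computation.
        jax.block_until_ready(jitted(*args))
    return (time.perf_counter() - start) / iters


x = jnp.ones((256, 256))
seconds_per_call = time_jitted(lambda a: a @ a, x)
```

Without the blocking call, the timer would mostly measure dispatch overhead rather than execution time.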

calculate_arithmetic_intensity

def calculate_arithmetic_intensity()
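Arithmetic intensity is the ratio of FLOPs to bytes moved. The sketch below works it out for a dense matrix multiply, assuming each operand and the output cross the memory bus exactly once. The function name and parameters are hypothetical and may not match `calculate_arithmetic_intensity`.

```python
def matmul_arithmetic_intensity(m: int, k: int, n: int,
                                bytes_per_element: int = 2) -> float:
    """FLOPs per byte for an (m, k) @ (k, n) matmul.

    Assumes ideal reuse: A, B and the output are each transferred once.
    """
    flops = 2 * m * k * n  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return flops / bytes_moved
```

For square matrices of side `n` with 2-byte elements this reduces to `n / 3`, which is why large matmuls tend to be compute bound and small ones memory bound.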

detect_hardware

def detect_hardware()
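What `detect_hardware` returns is not shown on this page. As an illustrative sketch only, the hypothetical helper `describe_devices` lists the attributes JAX exposes for each visible device. Peak FLOP/s and memory bandwidth are assumed to come from a separate lookup keyed on `device_kind`, since the device objects used here do not carry those figures.

```python
import jax


def describe_devices() -> list[dict]:
    """Summarise the devices JAX can see (hypothetical helper)."""
    return [
        {
            "id": d.id,
            "platform": d.platform,        # e.g. "cpu", "gpu", "tpu"
            "device_kind": d.device_kind,  # vendor/model string
        }
        for d in jax.devices()
    ]


devices = describe_devices()
```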

estimate_flops_attention

def estimate_flops_attention()
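The exact terms counted by `estimate_flops_attention` are not documented here. A common forward-pass estimate for multi-head self-attention, sketched below under that assumption, counts the four projections plus the two sequence-length-squared matmuls and ignores softmax, masking, and normalisation. The function name and arguments are hypothetical.

```python
def attention_flops(batch: int, seq_len: int,
                    num_heads: int, head_dim: int) -> int:
    """Approximate forward FLOPs of one multi-head self-attention block."""
    d_model = num_heads * head_dim
    # Q, K, V and output projections: four (seq_len, d_model) @ (d_model, d_model).
    projections = 4 * 2 * batch * seq_len * d_model * d_model
    # Attention scores Q @ K^T for every head.
    scores = 2 * batch * num_heads * seq_len * seq_len * head_dim
    # Weighted sum of values, softmax(scores) @ V, for every head.
    weighted_values = 2 * batch * num_heads * seq_len * seq_len * head_dim
    return projections + scores + weighted_values
```

In closed form this is `8 * batch * seq_len * d_model**2 + 4 * batch * seq_len**2 * d_model`, so the quadratic term dominates once `seq_len` exceeds roughly `2 * d_model`.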

estimate_flops_linear

def estimate_flops_linear()
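For a dense layer the usual forward-pass count is two FLOPs per weight per input row, plus one add per output element when a bias is present. The sketch below uses hypothetical names and may count differently from `estimate_flops_linear`.

```python
def linear_flops(batch: int, in_features: int, out_features: int,
                 use_bias: bool = True) -> int:
    """Approximate forward FLOPs of y = x @ W (+ b)."""
    flops = 2 * batch * in_features * out_features  # multiply-accumulate
    if use_bias:
        flops += batch * out_features  # one add per output element
    return flops
```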

estimate_memory_usage

def estimate_memory_usage()
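Which components `estimate_memory_usage` accounts for is not stated on this page. A simple lower bound, sketched here with hypothetical names, counts only parameter-sized buffers: weights for inference, and weights plus gradients plus two Adam-style moment buffers for training. Activations and workspace memory are deliberately left out.

```python
def parameter_memory_bytes(num_params: int, bytes_per_param: int = 4,
                           training: bool = False) -> int:
    """Lower bound on memory for parameter-sized buffers only.

    Assumes an Adam-like optimizer (two extra buffers per parameter)
    when training is True; activations are not included.
    """
    # Inference: weights only. Training: weights + grads + 2 moment buffers.
    copies = 4 if training else 1
    return num_params * bytes_per_param * copies
```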

estimate_transformer_layer_performance

def estimate_transformer_layer_performance()

get_batch_size_recommendation

def get_batch_size_recommendation()

get_critical_batch_size

def get_critical_batch_size()
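This page does not define "critical batch size". One roofline-based reading, assumed here, is the batch size at which a weight-dominated linear layer moves from memory bound to compute bound. With weight traffic of `bytes_per_param` per parameter and two FLOPs per parameter per batch row, intensity is `2 * batch / bytes_per_param`, and setting it equal to the ridge point gives the sketch below. The function name is hypothetical.

```python
import math


def critical_batch_size(peak_flops: float, peak_bandwidth: float,
                        bytes_per_param: int = 2) -> int:
    """Smallest batch at which a weight-bound linear layer becomes compute bound.

    Assumes memory traffic is dominated by reading the weights once per step.
    """
    ridge_point = peak_flops / peak_bandwidth  # FLOPs per byte
    return math.ceil(ridge_point * bytes_per_param / 2)
```

For a device with 100 TFLOP/s of peak compute and 1 TB/s of bandwidth, 2-byte weights give a critical batch of 100 rows under these assumptions.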

get_optimal_batch_size

def get_optimal_batch_size()

is_batch_size_optimal

def is_batch_size_optimal()

profile_jax_function

def profile_jax_function()

Module Statistics

  • Classes: 4
  • Functions: 15
  • Imports: 4