Calculation Methodology
Energy Calculation: The calculator uses the active parameter count $P$ when estimating compute. Each token triggers a number of floating-point operations roughly proportional to $P$, so for $T$ tokens the total energy scales as $E \propto P \cdot T / \eta$, where $\eta$ is hardware efficiency in FLOPs/Joule.
• Overall parameters represent the full model size (all experts for MoE).
• Active parameters are the subset actually multiplied for a single token. Dense models have active = overall; MoE models often have active ≪ overall.
This distinction means MoE models show lower compute-energy and cost here than equally sized dense models. We do not currently account for memory bandwidth or expert-routing overhead, so real-world MoE energy can be somewhat higher.
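The dense versus MoE comparison above can be sketched in a few lines. This is a minimal illustration, not the calculator's actual code: the factor of 2 FLOPs per active parameter per token is a common forward-pass rule of thumb assumed here, the efficiency value is a placeholder, and the function name and the 70B/13B model sizes are hypothetical.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules

# ASSUMPTION: about 2 FLOPs per active parameter per token (a common
# forward-pass rule of thumb). The source does not confirm this factor.
ASSUMED_FLOPS_PER_PARAM_PER_TOKEN = 2.0

# ASSUMPTION: placeholder efficiency in FLOPs/Joule, for illustration only.
ILLUSTRATIVE_FLOPS_PER_JOULE = 1e12


def estimate_energy_kwh(active_params_billions, tokens, flops_per_joule,
                        flops_per_param=ASSUMED_FLOPS_PER_PARAM_PER_TOKEN):
    """Estimate compute energy in kWh from the active parameter count."""
    if flops_per_joule <= 0:
        raise ValueError("flops_per_joule must be positive")
    # Parameters are given in billions, so scale to an absolute count.
    total_flops = flops_per_param * active_params_billions * 1e9 * tokens
    joules = total_flops / flops_per_joule
    return joules / JOULES_PER_KWH


# Hypothetical comparison: a dense model with 70B parameters versus an MoE
# model of the same overall size that activates only 13B per token.
dense_kwh = estimate_energy_kwh(70, 1_000_000, ILLUSTRATIVE_FLOPS_PER_JOULE)
moe_kwh = estimate_energy_kwh(13, 1_000_000, ILLUSTRATIVE_FLOPS_PER_JOULE)

print(f"dense: {dense_kwh:.4f} kWh, MoE: {moe_kwh:.4f} kWh")
print(f"MoE / dense energy ratio: {moe_kwh / dense_kwh:.3f}")
```

Because the estimate is linear in the active parameter count, the MoE-to-dense ratio equals the ratio of active parameters. Memory bandwidth and routing overhead are not modelled in this sketch either.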
Carbon Emissions: $C = E \times I$, where $E$ is the total energy in kWh and $I$ is the region-specific carbon intensity (kg CO₂/kWh). Selecting cleaner grids (lower $I$) therefore reduces emissions even when energy use is unchanged.
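A short sketch of the emissions step, under the assumption that emissions are simply energy multiplied by carbon intensity. The grid names and intensity values below are hypothetical placeholders, not figures from the calculator.

```python
def emissions_kg_co2(energy_kwh, carbon_intensity_kg_per_kwh):
    """Emissions in kg CO2 = energy (kWh) x carbon intensity (kg CO2/kWh)."""
    if energy_kwh < 0 or carbon_intensity_kg_per_kwh < 0:
        raise ValueError("energy and carbon intensity must be non-negative")
    return energy_kwh * carbon_intensity_kg_per_kwh


# ASSUMPTION: illustrative intensities only; real values vary by region.
hypothetical_grids = {
    "cleaner_grid": 0.05,  # kg CO2 per kWh
    "dirtier_grid": 0.70,  # kg CO2 per kWh
}

energy_kwh = 10.0  # same energy use in both cases
results = {name: emissions_kg_co2(energy_kwh, ci)
           for name, ci in hypothetical_grids.items()}

for name, kg in results.items():
    print(f"{name}: {kg:.2f} kg CO2")
```

With energy held fixed, emissions scale linearly with the grid's carbon intensity, which is why region selection alone changes the result.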
Hardware Assumptions: Based on NVIDIA H100 specifications, with the hardware efficiency $\eta$ (in FLOPs/Joule) taken as a conservative estimate. Reduced-precision arithmetic (FP16/FP8) increases efficiency by 2× and 4× respectively.
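The precision scaling can be sketched as a multiplier on the base efficiency. This is an illustration under assumptions: the source does not name the baseline precision, so the `"baseline"` key, the function names, and the placeholder base efficiency are all hypothetical.

```python
# Multipliers from the text: FP16 doubles and FP8 quadruples efficiency.
# ASSUMPTION: "baseline" stands for whatever precision the base figure uses.
PRECISION_MULTIPLIER = {"baseline": 1.0, "fp16": 2.0, "fp8": 4.0}


def effective_flops_per_joule(base_flops_per_joule, precision):
    """Scale the base hardware efficiency by the precision multiplier."""
    try:
        multiplier = PRECISION_MULTIPLIER[precision]
    except KeyError:
        raise ValueError(f"unknown precision: {precision!r}") from None
    return base_flops_per_joule * multiplier


def relative_energy(precision):
    """Energy relative to baseline for the same workload (inverse of gain)."""
    return 1.0 / PRECISION_MULTIPLIER[precision]


# ASSUMPTION: placeholder base efficiency, not the calculator's H100 figure.
base = 1e12
for name in PRECISION_MULTIPLIER:
    eff = effective_flops_per_joule(base, name)
    print(f"{name}: {eff:.1e} FLOPs/J, "
          f"relative energy {relative_energy(name):.2f}")
```

Since energy is inversely proportional to efficiency, a 4× efficiency gain corresponds to one quarter of the baseline energy for the same number of operations.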
Variable definitions: $P$ – active parameters (billions); $T$ – token count; $E$ – total energy in kilowatt-hours; $I$ – regional carbon intensity (kg CO₂/kWh); $\eta$ – hardware efficiency (FLOPs/J).