One of the most-starred, comprehensive, and up-to-date collections of Diffusion Language Model papers, code, and resources! If you find this repository helpful, please consider giving it a ⭐ to support it.
Timeline of Diffusion Language Models
This figure highlights key milestones in the development of DLMs, categorized into three groups: continuous DLMs, discrete DLMs, and recent multimodal DLMs. Early research focused predominantly on continuous DLMs, while discrete DLMs have become increasingly popular in recent years.
Table of Contents
- 🎮 Playground
- 🔥 Must-Read
- 📜 Surveys
- 🧱 Diffusion Foundation
- 🎲 Discrete DLMs
- 🌊 Continuous DLMs
- 🖼️ Multimodal DLMs
- 🎯 Training Strategies
- 🚀 Inference Optimization
- 🔨 Training Frameworks
- 📊 Benchmarks
- 💡 Applications
- 🔗 Resources
Playground
Must-Read
D3PM: Structured Denoising Diffusion Models in Discrete State-Spaces
LLaDA: Large Language Diffusion Models
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models (ICLR 2025)
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding
Super Data Learners: Diffusion Language Models are Super Data Learners
LLaDA2.0: Scaling Up Diffusion Language Models to 100B
Surveys
[12 Aug 2025] A Survey on Parallel Text Generation: From Parallel Decoding to Diffusion Language Models
[16 Jun 2025] Discrete Diffusion in Large Language and Multimodal Models: A Survey
[23 Feb 2024] Diffusion models in text generation: a survey (PeerJ Computer Science)
[29 Jun 2023] An Overview of Diffusion Models for Text Generation (MIPRO)
[24 May 2023] A Survey of Diffusion Models in Natural Language Processing
[14 Mar 2023] Diffusion Models in NLP: A Survey
[12 Mar 2023] Diffusion Models for Non-autoregressive Text Generation: A Survey (IJCAI 2023)
Diffusion Foundation
[7 Sep 2022] Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow (ICLR 2023)
[26 Nov 2020] Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021)
[6 Oct 2020] Denoising Diffusion Implicit Models (ICLR 2021)
[19 Jun 2020] Denoising Diffusion Probabilistic Models (NeurIPS 2020)
[12 Jul 2019] Generative Modeling by Estimating Gradients of the Data Distribution (NeurIPS 2019)
[12 Mar 2015] Deep Unsupervised Learning using Nonequilibrium Thermodynamics (ICML 2015)
Discrete DLMs
[2 Mar 2026] Characterizing Memorization in Diffusion Language Models: Generalized Extraction and Sampling Effects
[2 Mar 2026] MetaState: Persistent Working Memory for Discrete Diffusion Language Models
[26 Feb 2026] Why Diffusion Language Models Struggle with Truly Parallel (Non-Autoregressive) Decoding?
[26 Feb 2026] dLLM: Simple Diffusion Language Modeling
[19 Feb 2026] Sink-Aware Pruning for Diffusion Language Models
[16 Feb 2026] Scaling Beyond Masked Diffusion Language Models
[15 Feb 2026] MAGE: All-[MASK] Block Already Knows Where to Look in Diffusion LLM
[12 Feb 2026] T3D: Few-Step Diffusion Language Models via Trajectory Self-Distillation with Direct Discriminative Optimization
[10 Feb 2026] Advancing Block Diffusion Language Models for Test-Time Scaling
[9 Feb 2026] TEAM: Temporal-Spatial Consistency Guided Expert Activation for MoE Diffusion Language Model Acceleration
[8 Feb 2026] TDGNet: Hallucination Detection in Diffusion Language Models via Temporal Dynamic Graphs
[5 Feb 2026] DLM-Scope: Mechanistic Interpretability of Diffusion Language Models via Sparse Autoencoders
[2 Feb 2026] Understanding the Reversal Curse Mitigation in Masked Diffusion Models through Attention and Training Dynamics
[2 Feb 2026] Unifying Masked Diffusion Models with Various Generation Orders and Beyond
[1 Feb 2026] Balancing Understanding and Generation in Discrete Diffusion Models
[30 Jan 2026] Residual Context Diffusion Language Models
[30 Jan 2026] Relaxing Positional Alignment in Masked Diffusion Language Models
[29 Jan 2026] Thinking Out of Order: When Output Order Stops Reflecting Reasoning Order in Diffusion Language Models
[29 Jan 2026] Causal Autoregressive Diffusion Language Model
[27 Jan 2026] One Token Is Enough: Improving Diffusion Language Models with a Sink Token
[27 Jan 2026] Membership Inference Attacks Against Fine-tuned Diffusion Language Models
[22 Jan 2026] Parallelism and Generation Order in Masked Diffusion Language Models: Limits Today, Potential Tomorrow
[22 Jan 2026] Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model
[21 Jan 2026] Mechanism Shift During Post-training from Autoregressive to Masked Diffusion Language Models
[19 Jan 2026] Autoregressive Models Rival Diffusion Models at ANY-ORDER Generation
[18 Jan 2026] LR-DWM: Efficient Watermarking for Diffusion Language Models
[16 Jan 2026] Unlocking the Potentials of Retrieval-Augmented Generation for Diffusion Language Models
[12 Jan 2026] Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models
[12 Jan 2026] DiffER: Diffusion Entity-Relation Modeling for Reversal Curse in Diffusion Large Language Models
[5 Jan 2026] CD4LM: Consistency Distillation and aDaptive Decoding for Diffusion Language Models
[27 Dec 2025] On the Role of Discreteness in Diffusion LLMs
[23 Dec 2025] MoE-DiffuSeq: Enhancing Long-Document Diffusion Models with Sparse Attention and Mixture of Experts
[15 Dec 2025] ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
[7 Dec 2025] From Next-Token to Next-Block: A Principled Adaptation Path for Diffusion LLMs
[27 Nov 2025] C$^2$DLM: Causal Concept-Guided Diffusion Large Language Models
[12 Nov 2025] TiDAR: Think in Diffusion, Talk in Autoregression
[5 Nov 2025] Training Optimal Large Diffusion Language Models
[2 Nov 2025] OpenMoE 2: Sparse Diffusion Language Models
[1 Nov 2025] SpecDiff-2: Scaling Diffusion Drafter Alignment For Faster Speculative Decoding
[31 Oct 2025] Diffuse Thinking: Exploring Diffusion Language Models as Efficient Thought Proposers for Reasoning
[30 Oct 2025] Don't Let It Fade: Preserving Edits in Diffusion Language Models via Token Timestep Allocation
[27 Oct 2025] Variational Masked Diffusion Models
[21 Oct 2025] How Efficient Are Diffusion Language Models? A Critical Examination of Efficiency Evaluation Practices
[20 Oct 2025] Soft-Masked Diffusion Language Models
[17 Oct 2025] Planner and Executor: Collaboration between Discrete Diffusion And Autoregressive Models in Reasoning
[17 Oct 2025] Attention Sinks in Diffusion Language Models
[15 Oct 2025] On the Reasoning Abilities of Masked Diffusion Language Models
[12 Oct 2025] UltraLLaDA: Scaling the Context Length to 128K for Diffusion Large Language Models
[10 Oct 2025] Closing the Data-Efficiency Gap Between Autoregressive and Masked Diffusion LLMs
[10 Oct 2025] Beyond Surface Reasoning: Unveiling the True Long Chain-of-Thought Capacity of Diffusion Large Language Models
[8 Oct 2025] Next Semantic Scale Prediction via Hierarchical Diffusion Language Models
[7 Oct 2025] SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation
[5 Oct 2025] What Makes Diffusion Language Models Super Data Learners?
[5 Oct 2025] Beyond Next-Token Prediction: A Performance Characterization of Diffusion versus Autoregressive Language Models
[4 Oct 2025] Rainbow Padding: Mitigating Early Termination in Instruction-Tuned Diffusion LLMs
[3 Oct 2025] DMark: Order-Agnostic Watermarking for Diffusion Large Language Models
[1 Oct 2025] Continuously Augmented Discrete Diffusion model for Categorical Generative Modeling
[30 Sep 2025] dParallel: Learnable Parallel Decoding for dLLMs
[29 Sep 2025] Why mask diffusion does not work
[29 Sep 2025] DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models
[29 Sep 2025] LLaDA-MoE: A Sparse MoE Diffusion Language Model
[29 Sep 2025] Ultra-Fast Language Generation via Discrete Diffusion Divergence Instruct
[28 Sep 2025] SparseD: Sparse Attention for Diffusion Language Models
[28 Sep 2025] Sequential Diffusion Language Models
[27 Sep 2025] Tree Reward-Aligned Search for TReASURe in Masked Diffusion Language Models
[24 Sep 2025] FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models
[17 Sep 2025] Masked Diffusion Models as Energy Minimization
[5 Sep 2025] Masked Diffusion Language Models with Frequency-Informed Training
[1 Sep 2025] Dream-Coder 7B: An Open Diffusion Language Model for Code
[31 Aug 2025] Any-Order Flexible Length Masked Diffusion
[17 Aug 2025] Where to Start Alignment? Diffusion Large Language Model May Demand a Distinct Position
[14 Aug 2025] Thinking Inside the Mask: In-Place Prompting in Diffusion LLMs
[12 Aug 2025] Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models
[4 Aug 2025] Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference
[25 Jul 2025] Jailbreaking Large Language Diffusion Models: Revealing Hidden Safety Flaws in Diffusion-Based Text Generation
[15 Jul 2025] DreamOn: Diffusion Language Models For Code Infilling Beyond Fixed-Size Canvas
[15 Jul 2025] The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs
[10 Jul 2025] Your Absorbing Discrete Diffusion Secretly Models the Bayesian Posterior
[7 Jul 2025] Review, Remask, Refine (R3): Process-Guided Block Diffusion for Text Generation (ICML 2025)
[6 Jul 2025] Efficient perplexity bound and ratio matching in discrete diffusion language models (ICLR 2025)
[2 Jul 2025] Discrete Diffusion Models for Language Generation
[17 Jun 2025] LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
[12 Jun 2025] The Diffusion Duality (ICML 2025)
[12 Jun 2025] Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles
[2 Jun 2025] Esoteric Language Models
[25 May 2025] LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models
[24 May 2025] Anchored Diffusion Language Model
[21 May 2025] Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
[20 May 2025] CtrlDiff: Boosting Large Diffusion Language Models with Dynamic Block Prediction and Controllable Generation
[9 May 2025] Insertion Language Models: Sequence Generation with Arbitrary-Position Insertions
[22 Apr 2025] Target Concrete Score Matching: A Holistic Framework for Discrete Diffusion (ICML 2025)
[2 Apr 2025] Dream 7B
[16 Mar 2025] State Fourier Diffusion Language Model (SFDLM): A Scalable, Novel Iterative Approach to Language Modeling
[12 Mar 2025] Constrained Discrete Diffusion
[12 Mar 2025] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models (ICLR 2025)
[11 Mar 2025] Understanding the Quality-Diversity Trade-off in Diffusion Language Models
[6 Mar 2025] Generalized Interpolating Discrete Diffusion (ICML 2025)
[14 Feb 2025] Large Language Diffusion Models
[13 Feb 2025] Theoretical Benefit and Limitation of Diffusion Language Model
[13 Feb 2025] Non-Markovian Discrete Diffusion with Causal Language Models
[10 Feb 2025] Train for the Worst, Plan for the Best: Understanding Token Ordering in Masked Diffusions (ICML 2025)
[10 Nov 2024] Conditional [MASK] Discrete Diffusion Language Model
[28 Oct 2024] Beyond Autoregression: Fast LLMs via Self-Distillation Through Time (ICLR 2025)
[28 Oct 2024] Energy-Based Diffusion Language Models for Text Generation (ICLR 2025)
[24 Oct 2024] Scaling up Masked Diffusion Models on Text (ICLR 2025)
[23 Oct 2024] Scaling Diffusion Language Models via Adaptation from Autoregressive Models (ICLR 2025)
[18 Oct 2024] Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning (ICLR 2025)
[8 Oct 2024] (DDPD) Think While You Generate: Discrete Diffusion with Planned Denoising (ICLR 2025)
[2 Oct 2024] Discrete Copula Diffusion (ICLR 2025)
[4 Sep 2024] Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling (ICLR 2025)
[22 Jul 2024] Discrete Flow Matching (NeurIPS 2024)
[10 Jul 2024] Promises, Outlooks and Challenges of Diffusion Language Modeling
[11 Jun 2024] (MDLM) Simple and Effective Masked Diffusion Language Models (NeurIPS 2024)
[6 Jun 2024] (RADD) Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data (ICLR 2025)
[6 Jun 2024] (MD4) Simplified and Generalized Masked Diffusion for Discrete Data (NeurIPS 2024)
[7 Feb 2024] Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design (ICML 2024)
[30 Jan 2024] Transfer Learning for Text Diffusion Models
[25 Oct 2023] (SEDD) Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (ICML 2024)
[15 Oct 2023] FiLM: Fill-in Language Models for Any-Order Generation
[23 Aug 2023] Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning
[30 May 2023] Likelihood-Based Diffusion Language Models (NeurIPS 2023)
[6 May 2023] Diffusion-NAT: Self-Prompting Discrete Diffusion for Non-Autoregressive Text Generation (EACL 2024)
[11 Feb 2023] A Reparameterized Discrete Diffusion Model for Text Generation (COLM 2024)
[28 Nov 2022] DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models (ACL 2023)
[30 Oct 2022] DiffusER: Discrete Diffusion via Edit-based Reconstruction (ICLR 2023)
[13 Dec 2021] (SUNDAE) Step-unrolled Denoising Autoencoders for Text Generation (ICLR 2022)
[7 Jul 2021] Structured Denoising Diffusion Models in Discrete State-Spaces (NeurIPS 2021)
[10 Feb 2021] Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions (NeurIPS 2021)
Continuous DLMs
[3 Mar 2026] CoDAR: Continuous Diffusion Language Models are More Powerful Than You Think
[26 Oct 2025] CANDI: Hybrid Discrete-Continuous Diffusion Models
[6 Oct 2025] LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning
[3 Oct 2025] Coevolutionary Continuous Discrete Diffusion: Make Your Diffusion Language Model a Latent Reasoner
[26 Jun 2025] Compressed and Smooth Latent Space for Text Diffusion Modeling
[28 May 2025] Unifying Continuous and Discrete Text Diffusion with Non-simultaneous Diffusion Processes (ACL 2025)
[24 May 2025] Smoothie: Smoothing Diffusion on Token Embeddings for Text Generation
[20 Apr 2025] Perfect diffusion is TC^0 -- Bad diffusion is Turing-complete
[19 Feb 2025] TESS 2: A Large-Scale Generalist Diffusion Language Model
[15 Dec 2024] Segment-Level Diffusion: A Framework for Controllable Long-Form Generation with Diffusion Language Models (ACL 2025)
[17 Oct 2024] Meta-DiffuB: A Contextualized Sequence-to-Sequence Text Diffusion Model with Meta-Exploration (NeurIPS 2024)
[8 Aug 2024] Diffusion Guided Language Modeling (ACL Findings 2024)
[May 2024] Effective Integration of Text Diffusion and Pre-Trained Language Models with Linguistic Easy-First Schedule (LREC-COLING 2024)
[17 Mar 2024] Language Rectified Flow: Advancing Diffusion Language Generation with Probabilistic Flows (NAACL 2024)
[14 Mar 2024] LDSeq: Latent Diffusion Models for Sequence to Sequence Text Generation (CSAI 2023)
[Mar 2024] Flow Matching for Conditional Text Generation in a Few Sampling Steps (EACL 2024)
[29 Feb 2024] TEncDM: Understanding the Properties of Diffusion Model in the Space of Language Model Encodings
[29 Feb 2024] Generating, Reconstructing, and Representing Discrete and Continuous Data: Generalized Diffusion with Learnable Encoding-Decoding (ICML 2024)
[31 Oct 2023] LADIDA: Latent Diffusion for Document Generation with Sequential Decoding (NeurIPS Workshop 2023)
[18 Oct 2023] InfoDiffusion: Information Entropy Aware Diffusion Process for Non-Autoregressive Text Generation (EMNLP 2023)
[9 Oct 2023] DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models (EMNLP 2023)
[26 Jul 2023] How Does Diffusion Influence Pretrained Language Models on Out-of-Distribution Data? (ECAI 2023)
[19 May 2023] DiffuSIA: A Spiral Interaction Architecture for Encoder-Decoder Text Diffusion
[16 May 2023] AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation (NeurIPS 2023)
[15 May 2023] TESS: Text-to-Text Self-Conditioned Simplex Diffusion (EACL 2024)
[25 Apr 2023] GlyphDiffusion: Text Generation as Image Generation
[10 Apr 2023] A Cheaper and Better Diffusion Language Model with Soft-Masked Noise (EMNLP 2023)
[20 Feb 2023] DINOISER: Diffused Conditional Sequence Learning by Manipulating Noises (TACL 2024)
[22 Dec 2022] (GENIE) Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise (ICML 2023)
[20 Dec 2022] SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers (NAACL 2024)
[19 Dec 2022] Latent Diffusion for Language Generation (NeurIPS 2023)
[19 Dec 2022] (Difformer) Empowering Diffusion Models on the Embedding Space for Text Generation (NAACL 2024)
[28 Nov 2022] Continuous diffusion for categorical data
[8 Nov 2022] Self-conditioned Embedding Diffusion for Text Generation
[31 Oct 2022] SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control (ACL 2023)
[17 Oct 2022] DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models (ICLR 2023)
[1 Aug 2022] Composable Text Controls in Latent Space with ODEs (EMNLP 2023)
[13 Jun 2022] Latent Diffusion Energy-Based Model for Interpretable Text Modeling (ICML 2022)
[27 May 2022] Diffusion-LM Improves Controllable Text Generation (NeurIPS 2022)
Multimodal DLMs
[25 Jan 2026] VidLaDA: Bidirectional Diffusion Large Language Models for Efficient Video Understanding
[17 Dec 2025] DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models
[16 Dec 2025] Sparse-LaViDa: Sparse Multimodal Discrete Diffusion Language Models
[12 Nov 2025] MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation
[3 Nov 2025] Unified Diffusion VLA: Vision-Language-Action Model via Joint Discrete Denoising Diffusion Process
[22 Oct 2025] From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model
[30 Sep 2025] dVLA: Diffusion Vision-Language-Action Model with Multimodal Chain-of-Thought
[23 Sep 2025] Lavida-O: Elastic Large Masked Diffusion Models for Unified Multimodal Understanding and Generation
[9 Sep 2025] Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding
[8 Sep 2025] LLaDA-VLA: Vision Language Diffusion Action Models
[29 May 2025] Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model
[26 May 2025] FUDOKI: Discrete Flow-based Unified Understanding and Generation via Kinetic-Optimal Velocities
[22 May 2025] LaViDa: A Large Diffusion Language Model for Multimodal Understanding
[22 May 2025] Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
[22 May 2025] LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning
[21 May 2025] MMaDA: Multimodal Large Diffusion Language Models
[26 Mar 2025] Unified Multimodal Discrete Diffusion
Training Strategies
[6 Feb 2026] Diffusion-State Policy Optimization for Masked Diffusion Language Models
[2 Feb 2026] AR-MAP: Are Autoregressive Large Language Models Implicit Teachers for Diffusion Large Language Models?
[21 Jan 2026] The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models
[12 Jan 2026] d3LLM: Ultra-Fast Diffusion LLM using Pseudo-Trajectory Distillation
[16 Dec 2025] Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed
[10 Dec 2025] d-TreeRPO: Towards More Reliable Policy Optimization for Diffusion Language Models
[3 Dec 2025] Principled RL for Diffusion LLMs Emerges from a Sequence-Level Perspective
[24 Nov 2025] CDLM: Consistency Diffusion Language Models For Faster Sampling
[26 Oct 2025] Aligning Diffusion Language Models via Unpaired Preference Optimization
[24 Oct 2025] MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization
[3 Oct 2025] Training Optimal Large Diffusion Language Models
[13 Oct 2025] Boundary-Guided Policy Optimization for Memory-efficient RL of Diffusion Large Language Models
[10 Oct 2025] SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models
[5 Oct 2025] Principled and Tractable RL for Reasoning with Diffusion Language Models
[2 Oct 2025] Step-Aware Policy Optimization for Reasoning in Diffusion Large Language Models
[27 Sep 2025] A2D: Any-Order, Any-Step Safety Alignment for Diffusion Language Models
[12 Sep 2025] Inpainting-Guided Policy Optimization for Diffusion Large Language Models
[8 Sep 2025] Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models
[7 Sep 2025] BranchGRPO: Stable and Efficient GRPO with Structured Branching in Diffusion Models
[27 Aug 2025] Blockwise SFT for Diffusion Language Models: Reconciling Bidirectional Attention and Autoregressive Decoding
[18 Aug 2025] MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models
[7 Jul 2025] wd1: Weighted Policy Optimization for Reasoning in Diffusion Language Models
[25 Jun 2025] DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
[25 May 2025] LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models
[21 May 2025] MMaDA: Multimodal Large Diffusion Language Models
[15 May 2025] Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models
[16 Apr 2025] d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning
[2 Apr 2025] Dream 7B
[3 Feb 2025] Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods
[Jan 2025] Addressing the Training-Inference Discrepancy in Discrete Diffusion for Text Generation (COLING 2025)
[23 Oct 2024] Scaling Diffusion Language Models via Adaptation from Autoregressive Models (ICLR 2025)
[17 Oct 2024] Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design (ICLR 2025)
[19 Feb 2024] Text Diffusion with Reinforced Conditioning
[12 Feb 2024] Diffusion of Thought: Chain-of-Thought Reasoning in Diffusion Language Models (NeurIPS 2024)
[8 May 2023] Can Diffusion Model Achieve Better Performance in Text Generation? Bridging the Gap between Training and Inference! (ACL 2023)
Inference Optimization
[5 Mar 2026] Free Lunch for Pass@$k$? Low Cost Diverse Sampling for Diffusion Language Models
[3 Mar 2026] Efficient Self-Evaluation for Diffusion Language Models via Sequence Regeneration
[27 Feb 2026] Divide and Conquer: Accelerating Diffusion-Based Large Language Models via Adaptive Parallel Decoding
[26 Feb 2026] Test-Time Scaling with Diffusion Language Models via Reward-Guided Stitching
[26 Feb 2026] Rejection Mixing: Fast Semantic Propagation of Mask Tokens for Efficient DLLM Inference
[20 Feb 2026] Improving Sampling for Masked Diffusion Models via Information Gain
[12 Feb 2026] dVoting: Fast Voting for dLLMs
[11 Feb 2026] Just on Time: Token-Level Early Stopping for Diffusion Language Models
[11 Feb 2026] Search or Accelerate: Confidence-Switched Position Beam Search for Diffusion Language Models
[10 Feb 2026] Where-to-Unmask: Ground-Truth-Guided Unmasking Order Learning for Masked Diffusion Language Models
[7 Feb 2026] Improving Variable-Length Generation in Diffusion Language Models via Length Regularization
[6 Feb 2026] DAWN: Dependency-Aware Fast Inference for Diffusion LLMs
[6 Feb 2026] Stopping Computation for Converged Tokens in Masked Diffusion-LM Decoding
[5 Feb 2026] Stop the Flip-Flop: Context-Preserving Verification for Fast Revocable Diffusion Decoding
[5 Feb 2026] DFlash: Block Diffusion for Flash Speculative Decoding
[5 Feb 2026] DSB: Dynamic Sliding Block Scheduling for Diffusion LLMs
[4 Feb 2026] EntRGi: Entropy Aware Reward Guidance for Diffusion Language Models
[4 Feb 2026] Swordsman: Entropy-Driven Adaptive Block Partition for Efficient Diffusion Language Models
[2 Feb 2026] Focus-dLLM: Accelerating Long-Context Diffusion LLM Inference via Confidence-Guided Context Focusing
[2 Feb 2026] Prism: Efficient Test-Time Scaling via Hierarchical Search and Self-Verification for Discrete Diffusion Language Models
[30 Jan 2026] Time-Annealed Perturbation Sampling: Diverse Generation for Diffusion Language Models
[30 Jan 2026] FourierSampler: Unlocking Non-Autoregressive Potential in Diffusion Language Models via Frequency-Guided Generation
[30 Jan 2026] $ρ$-$\texttt{EOS}$: Training-free Bidirectional Variable-Length Control for Masked Diffusion LLMs
[29 Jan 2026] ILRR: Inference-Time Steering Method for Masked Diffusion Language Models
[28 Jan 2026] Improving Diffusion Language Model Decoding through Joint Search in Generation Order and Token Space
[25 Jan 2026] Streaming-dLLM: Accelerating Diffusion LLMs via Suffix Pruning and Dynamic Decoding
[18 Jan 2026] Plan, Verify and Fill: A Structured Parallel Decoding Approach for Diffusion Language Models
[6 Jan 2026] DIP: Dynamic In-Context Planner For Diffusion Language Models
[5 Jan 2026] Deferred Commitment Decoding for Diffusion Language Models
[30 Dec 2025] Activation Steering for Masked Diffusion Language Models
[28 Dec 2025] WeDLM: Reconciling Diffusion Language Models with Standard Causal Attention for Fast Inference
[24 Dec 2025] Optimizing Decoding Paths in Masked Diffusion Models by Quantifying Uncertainty
[23 Dec 2025] Fail Fast, Win Big: Rethinking the Drafting Strategy in Speculative Decoding via Diffusion LLMs
[13 Dec 2025] Diffusion Language Model Inference with Monte Carlo Tree Search
[2 Dec 2025] Fast-Decoding Diffusion Language Models via Progress-Aware Confidence Schedules
[26 Nov 2025] Beyond Confidence: Adaptive and Coherent Decoding for Diffusion Language Models
[26 Nov 2025] From Bits to Rounds: Parallel Decoding with Exploration for Diffusion Language Models
[24 Nov 2025] Orchestrating Dual-Boundaries: An Arithmetic Intensity Inspired Acceleration Framework for Diffusion Language Models
[28 Oct 2025] Diffusion LLM with Native Variable Generation Lengths: Let [EOS] Lead the Way
[24 Oct 2025] Parallel Sampling from Masked Diffusion Models via Conditional Independence Testing
[20 Oct 2025] Saber: An Efficient Sampling with Adaptive Acceleration and Backtracking Enhanced Remasking for Diffusion Language Model
[16 Oct 2025] Attention Is All You Need for KV Cache in Diffusion LLMs
[16 Oct 2025] Efficient Parallel Samplers for Recurrent-Depth Models and Their Connection to Diffusion Language Models
[13 Oct 2025] Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States
[13 Oct 2025] Unlocking the Potential of Diffusion Language Models through Template Infilling
[10 Oct 2025] Mask Tokens as Prophet: Fine-Grained Cache Eviction for Efficient dLLM Inference
[9 Oct 2025] dInfer: An Efficient Inference Framework for Diffusion Language Models
[8 Oct 2025] Accelerating Diffusion LLM Inference via Local Determinism Propagation
[7 Oct 2025] CreditDecoding: Accelerating Parallel Decoding in Diffusion Large Language Models with Trace Credits
[6 Oct 2025] Finish First, Perfect Later: Test-Time Token-Level Cross-Validation for Diffusion Large Language Models
[6 Oct 2025] Test-Time Scaling in Diffusion LLMs via Hidden Semi-Autoregressive Experts
[5 Oct 2025] Self Speculative Decoding for Diffusion Large Language Models
[30 Sep 2025] Free Draft-and-Verification: Toward Lossless Parallel Decoding for Diffusion Large Language Models
[30 Sep 2025] Fast-dLLM v2: Efficient Block-Diffusion LLM
[29 Sep 2025] RFG: Test-Time Scaling for Diffusion Large Language Model Reasoning with Reward-Free Guidance
[29 Sep 2025] Learning to Parallel: Accelerating Diffusion Large Language Models via Adaptive Parallel Decoding
[28 Sep 2025] Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Step
[28 Sep 2025] Don't Settle Too Early: Self-Reflective Remasking for Diffusion Language Models
[28 Sep 2025] DiffuSpec: Unlocking Diffusion Language Models for Speculative Decoding
[27 Sep 2025] d2Cache: Accelerating Diffusion-based LLMs via Dual Adaptive Caching
[25 Sep 2025] Enabling Approximate Joint Sampling in Diffusion LMs
[22 Sep 2025] Spiffy: Multiplying Diffusion LLM Acceleration via Lossless Speculative Decoding
[18 Sep 2025] Fast and Fluent Diffusion Language Models via Convolutional Decoding and Rejective Fine-tuning
[31 Aug 2025] Reward-Weighted Sampling: Enhancing Non-Autoregressive Characteristics in Masked Diffusion LLMs
[27 Aug 2025] Diffusion Language Models Know the Answer Before Decoding
[20 Aug 2025] Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs
[19 Aug 2025] DPad: Efficient Diffusion Language Models with Suffix Dropout
[18 Aug 2025] PC-Sampler: Position-Aware Calibration of Decoding Bias in Masked Diffusion Models
[14 Aug 2025] DLLMQuant: Quantizing Diffusion-based Large Language Models
[13 Aug 2025] Constrained Decoding of Diffusion LLMs with Context-Free Grammars
[8 Aug 2025] Diffusion LLMs Can Do Faster-Than-AR Inference via Discrete Diffusion Forcing
[4 Aug 2025] Sparse-dLLM: Accelerating Diffusion LLMs with Dynamic Cache Eviction
[1 Aug 2025] Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models
[24 Jul 2025] Wide-In, Narrow-Out: Revokable Decoding for Efficient and Effective DLLMs
[11 Jul 2025] Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling
[6 Jul 2025] Unveiling the Potential of Diffusion Large Language Model in Controllable Generation
[23 Jun 2025] Plan for Speed -- Dilated Scheduling for Masked Diffusion Language Models
[12 Jun 2025] Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles
[12 Jun 2025] The Diffusion Duality (ICML 2025)
[2 Jun 2025] Esoteric Language Models
[31 May 2025] Accelerating Diffusion LLMs via Adaptive Parallel Decoding
[30 May 2025] DLM-One: Diffusion Language Models for One-Step Sequence Generation
[30 May 2025] Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking
[28 May 2025] Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding
[28 May 2025] DINGO: Constrained Inference for Diffusion LLMs
[27 May 2025] Accelerating Diffusion Language Model Inference via Efficient KV Caching and Guided Diffusion
[26 May 2025] Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking
[22 May 2025] Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
[21 May 2025] dKV-Cache: The Cache for Diffusion Language Models
[17 May 2025] dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching
[1 Mar 2025] Remasking Discrete Diffusion Models with Inference-Time Scaling (ICLR 2025)
[Nov 2024] Enable Fast Sampling for Seq2Seq Text Diffusion (EMNLP Findings 2024)
[11 Oct 2024] Distillation of Discrete Diffusion through Dimensional Correlations (ICML 2025)
[8 Oct 2024] (DDPD) Think While You Generate: Discrete Diffusion with Planned Denoising (ICLR 2025)
[10 Aug 2024] Speculative Diffusion Decoding: Accelerating Language Generation through Diffusion (NAACL 2025)
[3 Jun 2024] Unlocking Guidance for Discrete State-Space Diffusion and Flow Models (ICLR 2025)
[May 2024] Few-shot Temporal Pruning Accelerates Diffusion Models for Text Generation (LREC-COLING 2024)
[15 Mar 2024] Utilizing Latent Diffusion Model to Accelerate Sampling Speed and Enhance Text Generation Quality
[15 Feb 2024] Quantized Embedding Vectors for Controllable Diffusion Language Models
[9 Oct 2023] DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models (EMNLP 2023)
[24 May 2023] David helps Goliath: Inference-Time Collaboration Between Small Specialized and Large General Diffusion LMs (NAACL 2024)
[18 May 2023] Diffusion Language Models Generation Can Be Halted Early
Training Frameworks
[19 Nov 2025] DiRL: An Efficient Training Framework for Diffusion Language Models
[2 Nov 2025] MegaDLMs: Training Diffusion Language Models at Any Scale
Benchmarks
[6 Oct 2025] ParallelBench: Understanding the Trade-offs of Parallel Decoding in Diffusion LLMs
Applications
[13 Feb 2026] DiffuRank: Effective Document Reranking with Diffusion Language Models
[12 Feb 2026] DICE: Diffusion Large Language Models Excel at Generating CUDA Kernels
[19 Jan 2026] The Bitter Lesson of Diffusion Language Models for Agentic Workflows: A Comprehensive Reality Check
[9 Nov 2025] LLaDA-Rec: Discrete Diffusion for Parallel Semantic ID Generation in Generative Recommendation
[1 Oct 2025] Syntax-Guided Diffusion Language Models with User-Integrated Personalization
[30 Sep 2025] TraceDet: Hallucination Detection from the Decoding Trace of Diffusion Large Language Models
[29 Sep 2025] DiffTester: Accelerating Unit Test Generation for Diffusion LLMs via Repetitive Pattern
[24 Sep 2025] Discrete Diffusion for Reflective Vision-Language-Action Models in Autonomous Driving
[14 Aug 2025] Improving Text Style Transfer using Masked Diffusion Language Models with Inference-time Scaling
[2 Aug 2025] TreeDiff: AST-Guided Code Generation with Diffusion LLMs
[25 Jul 2025] Arg-LLaDA: Argument Summarization via Large Language Diffusion Models and Sufficiency-Aware Refinement
[26 Jun 2025] DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
[17 Jun 2025] Mercury: Ultra-Fast Language Models Based on Diffusion
[16 Jun 2025] Flexible-length Text Infilling for Discrete Diffusion Models
[11 Jun 2025] Debunk and Infer: Multimodal Fake News Detection via Diffusion-Generated Evidence and LLM Reasoning
[9 Jun 2025] Diffusion Sequence Models for Enhanced Protein Representation and Generation
[28 May 2025] CFP-Gen: Combinatorial Functional Protein Generation via Diffusion Language Models (ICML 2025)
[14 May 2025] Gemini Diffusion
[27 Feb 2025] EdiText: Controllable Coarse-to-Fine Text Editing with Diffusion Language Models (ACL 2025)
[31 Jan 2025] TermDiffuSum: A Term-guided Diffusion Model for Extractive Summarization of Legal Documents (COLING 2025)
[1 Jan 2025] DiffETM: Diffusion Process Enhanced Embedded Topic Model (ICASSP 2025)
[23 Dec 2024] DiffusionAttacker: Diffusion-Driven Prompt Manipulation for LLM Jailbreak
[5 Nov 2024] DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models (ACL 2025)
[30 Oct 2024] Private Synthetic Text Generation with Diffusion Models (NAACL 2025)
[22 Oct 2024] MeMDLM: De Novo Membrane Protein Design with Masked Discrete Diffusion Protein Language Models (ICLR 2025)
[17 Oct 2024] Text-Guided Multi-Property Molecular Optimization with a Diffusion Language Model
[17 Oct 2024] DPLM-2: A Multimodal Diffusion Protein Language Model
[17 Oct 2024] Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design (ICLR 2025)
[10 Oct 2024] Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction (ICLR 2025)
[14 Sep 2024] Towards Diverse and Efficient Audio Captioning via Diffusion Models (DAC, Interspeech 2025)
[10 Sep 2024] Table-to-Text Generation with Pretrained Diffusion Models (IEEE 2024)
[5 Sep 2024] An Effective Deployment of Diffusion LM for Data Augmentation in Low-Resource Sentiment Classification (EMNLP 2024)
[Aug 2024] DiffusPoll: Conditional Text Diffusion Model for Poll Generation (ACL 2024)
[25 Jun 2024] Discrete Diffusion Language Model for Efficient Text Summarization (NAACL 2025)
[16 Apr 2024] LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation? (NAACL 2024)
[13 Apr 2024] Improved Paraphrase Generation via Controllable Latent Diffusion
[10 Apr 2024] DiffusionDialog: A Diffusion Model for Diverse Dialog Generation with Latent Space (LREC-COLING 2024)
[10 Apr 2024] Diffuwords: A Contrastive Diffusion Model for Lexically Constrained Text Generation (SSRN 2024)
[28 Mar 2024] Benchmarking Diffusion Models for Machine Translation (EACL 2024)
[26 Mar 2024] Improving Iteration-based Non-Autoregressive Language Model With Time Step Awareness (ICPADS 2023)
[28 Feb 2024] Diffusion Language Models Are Versatile Protein Learners (ICML 2024)
[26 Feb 2024] DiffuCOMET: Contextual Commonsense Knowledge Diffusion (ACL 2024)
[24 Feb 2024] IPED: An Implicit Perspective for Relational Triple Extraction based on Diffusion Model (NAACL 2024)
[23 Feb 2024] Let's Rectify Step by Step: Improving Aspect-based Sentiment Analysis with Diffusion Models (LREC-COLING 2024)
[20 Feb 2024] Text-Guided Molecule Generation with Diffusion Language Model (AAAI 2024)
[16 Feb 2024] Rethinking Human-like Translation Strategy: Integrating Drift-Diffusion Model with Large Language Models for Machine Translation
[11 Jan 2024] MDM: Meta diffusion model for hard-constrained text generation (Knowledge-Based Systems)
[Dec 2023] DiffusionSL: Sequence Labeling via Tag Diffusion Process (EMNLP 2023)
[19 Dec 2023] IPAD: Iterative, Parallel, and Diffusion-based Network for Scene Text Recognition (IJCV 2025)
[12 Dec 2023] DiffuVST: Narrating Fictional Scenes with Global-History-Guided Denoising Models (EMNLP 2023)
[3 Dec 2023] DiffuCom: A novel diffusion model for comment generation (Knowledge-Based Systems)
[Dec 2023] DiffusionRet: Diffusion-Enhanced Generative Retriever using Constrained Decoding (EMNLP 2023)
[16 Nov 2023] P^3SUM: Preserving Author's Perspective in News Summarization with Diffusion Language Models (NAACL 2024)
[31 Oct 2023] LADIDA: Latent Diffusion for Document Generation with Sequential Decoding (NeurIPS Workshop 2023)
[26 Oct 2023] DiffS2UT: A Semantic Preserving Diffusion Model for Textless Direct Speech-to-Speech Translation (EMNLP 2023)
[24 Oct 2023] ScanDL: A Diffusion Model for Generating Synthetic Scanpaths on Texts (EMNLP 2023)
[23 Oct 2023] DeTiME: Diffusion-Enhanced Topic Modeling using Encoder-decoder based LLM (EMNLP 2023)
[21 Oct 2023] Context-Aware Prompt for Generation-based Event Argument Extraction with Diffusion Models (CIKM 2023)
[16 Oct 2023] ForceGen: End-to-end de novo protein generation based on nonlinear mechanical unfolding responses using a protein language diffusion model (Science Advances)
[29 Aug 2023] ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer (AAAI 2024)
[17 Aug 2023] Enhancing Phrase Representation by Information Bottleneck Guided Text Diffusion Process for Keyphrase Extraction (LREC-COLING 2024)
[25 Jul 2023] XDLM: Cross-lingual Diffusion Language Model for Machine Translation
[9 Jul 2023] Controllable Conversation Generation with Conversation Structures via Diffusion Models (ACL 2023)
[14 Jun 2023] PoetryDiffusion: Towards Joint Semantic and Metrical Manipulation in Poetry Generation (AAAI 2024)
[14 Jun 2023] DiffuDetox: A Mixed Diffusion Model for Text Detoxification (ACL 2023)
[5 Jun 2023] PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model (NeurIPS 2023)
[2 Jun 2023] DiffusEmp: A Diffusion Model-Based Framework with Multi-Grained Control for Empathetic Response Generation (ACL 2023)
[31 May 2023] Fine-grained Text Style Transfer with Diffusion-Based Language Models (RepL4NLP 2023)
[31 May 2023] Protein Design with Guided Discrete Diffusion (NeurIPS 2023)
[22 May 2023] Dior-CVAE: Pre-trained Language Models and Diffusion Priors for Variational Dialog Generation (EMNLP 2023)
[22 May 2023] DiffusionNER: Boundary Diffusion for Named Entity Recognition (ACL 2023)
[2 May 2023] DiffuSum: Generation Enhanced Extractive Summarization with Diffusion (ACL 2023)
[7 Jan 2023] ROIC-DM: Robust Text Inference and Classification via Diffusion Model
Resources
bansky-cl/diffusion-nlp-paper-arxiv
yczhou001/Awesome-Diffusion-LLM
StevenYuan666/Awesome-Diffusion-Models-for-NLP
AoiDragon/Awesome-Text-Diffusion-Models
kuleshov-group/awesome-discrete-diffusion-models
Citation
@article{li2025survey,
title={A Survey on Diffusion Language Models},
author={Li, Tianyi and Chen, Mingda and Guo, Bowei and Shen, Zhiqiang},
journal={arXiv preprint arXiv:2508.10875},
year={2025}
}
