The Innocent ToList() That’s Killing Your API


Last week I told you modern hardware hides performance problems until you hit scale. Here’s the code that proves it.

This pattern looks perfectly reasonable in code review. Each method makes sense in isolation. Then production hits 1,000 concurrent users and your API starts timing out, memory usage spikes to 8GB, and garbage collection pauses freeze your app for seconds at a time.

Let me show you what’s actually happening.

// Repository layer
public class OrderRepository
{
    public List<Order> GetOrders(int customerId)
    {
        return _db.Orders
            .Where(o => o.CustomerId == customerId)
            .ToList(); // ToList #1
    }
}

// Service layer
public class OrderService
{
    public List<Order> GetActiveOrders(int customerId)
    {
        var orders = _repository.GetOrders(customerId).ToList(); // ToList #2
        return orders.Where(o => o.Status == "Active").ToList(); // ToList #3
    }
}

// Business logic layer
public class OrderProcessor
{
    public List<OrderSummary> ProcessOrders(int customerId)
    {
        var orders = _orderService.GetActiveOrders(customerId).ToList(); // ToList #4
        var summaries = orders.Select(o => new OrderSummary(o)).ToList(); // ToList #5
        return summaries.Where(s => s.Total > 1000).ToList(); // ToList #6
    }
}

// API Controller
[HttpGet("orders/{customerId}")]
public IActionResult GetOrders(int customerId)
{
    var result = _processor.ProcessOrders(customerId);
    return Ok(result); // ASP.NET serializes it
}

You’ve written this. Maybe not exactly this, but close enough. Each layer looks clean. Each method has a single responsibility. Your team approved it in code review.

You shipped it.

When a request comes in for a customer with 10,000 orders, here’s what your code does:

  1. Repository ToList() (#1): Loads 10,000 Order objects from the database into a List<Order>. Allocation: ~2MB

  2. Service ToList() (#2): Takes that list (already materialized!) and calls ToList() on it again. Creates a new array, copies all 10,000 references. Allocation: ~2MB

  3. Service ToList() (#3): Filters to active orders (~8,000 items), calls ToList(). New array, copies 8,000 references. Allocation: ~1.6MB

  4. Processor ToList() (#4): Takes the active orders list (already materialized!) and calls ToList() again. New array, copies 8,000 references. Allocation: ~1.6MB

  5. Processor ToList() (#5): Maps to OrderSummary objects, calls ToList(). Creates 8,000 new objects AND a new array. Allocation: ~3MB

  6. Processor ToList() (#6): Filters summaries (down to ~2,000 items), calls ToList(). Final array, copies 2,000 references. Allocation: ~0.5MB

Total allocation for one request: ~11MB

At 1,000 concurrent requests: 11GB of active memory

And that’s before ASP.NET serializes the response, which allocates even more.

Here’s what’s invisible: you copied the same data 6 times when you only needed to iterate it once.

Every .ToList() creates a new array and copies references into it. When you call ToList() on something that’s already a List, you’re paying the allocation cost for no reason - you’re creating a defensive copy because you don’t trust what the previous method returned.

Your team did this because it feels safe. Lists are concrete. You can count them. You can index into them. IEnumerable feels abstract and lazy - what if it changes underneath you?

But that “safety” costs you 10x the memory you actually needed.
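You can verify the "defensive copy" behavior in a few lines. This is a minimal sketch (toy data, not the Order types from above) showing that ToList() on an existing List<T> returns a brand-new list while the elements themselves are shared references:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// ToList() on something that is already a List<T> still allocates
// a new list and copies every reference into it.
var orders = new List<string> { "A", "B", "C" };

var copy = orders.ToList(); // new backing array, references copied

// Different list instances...
Console.WriteLine(ReferenceEquals(orders, copy)); // False

// ...but the elements are the same references, not deep copies.
Console.WriteLine(ReferenceEquals(orders[0], copy[0])); // True
```

So the copy buys you nothing defensively: mutating an element through either list is visible through both. You paid for an allocation and got shared state anyway.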

Here’s what’s actually going on - and why it matters:

The naive version materializes at every layer boundary.

You asked for filtered, processed order summaries. But your code loaded all orders into a list, then copied that list, then filtered into a new list, then copied that list, then mapped into a new list, then filtered into a final list.

It’s like photocopying a document, then photocopying the photocopy, then photocopying that, six times - when you only needed the final version.

Each layer defends itself with ToList() because nobody trusts the previous layer.

The service doesn’t trust that the repository returned a stable collection, so it calls ToList(). The processor doesn’t trust the service, so it calls ToList(). Everyone’s being “safe” and the result is 6x memory usage.

Here’s the part that makes developers go “oh shit”:

When you call .ToList() on something that’s already a List<T>, you’re not “converting” it. You’re cloning it. You’re allocating a brand new array of the exact same size and copying every reference over.

List<Order> orders = GetOrders(); // Already a List
var copy = orders.ToList(); // Just allocated another array and copied 10,000 references

Why? Because ToList() never returns the source instance. If the source implements ICollection<T> it uses a fast bulk copy instead of iterating element by element, but it still allocates a fresh backing array and copies every reference. Every. Single. Time.
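You can put a rough number on one redundant copy yourself. This sketch assumes .NET Core 3.0 or later (for GC.GetAllocatedBytesForCurrentThread) and uses boxed ints as stand-ins for order objects:

```csharp
using System;
using System.Linq;

// Measure roughly how many bytes one redundant ToList() allocates.
// 10,000 boxed ints stand in for 10,000 Order objects.
var source = Enumerable.Range(0, 10_000).Select(i => (object)i).ToList();

long before = GC.GetAllocatedBytesForCurrentThread();
var copy = source.ToList(); // redundant copy of an existing List
long after = GC.GetAllocatedBytesForCurrentThread();

// On a 64-bit runtime, expect roughly 80KB just for the
// reference array, plus List<T> overhead.
Console.WriteLine($"Redundant ToList allocated ~{after - before:N0} bytes for {copy.Count:N0} items");
```

Multiply that by every redundant ToList() in the chain and by every concurrent request, and the numbers in this article stop looking theoretical.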

The 10x memory reduction isn’t magic - it’s just not copying what you already have.

Want to blow your team’s minds? Show them this: at 1,000 concurrent requests, the naive version needs 11GB of RAM. The optimized version needs 1GB.

Same data to the client. 10x less memory. Because you stopped photocopying photocopies.

Before we look at the optimized code, you need to understand what “GC pressure” actually means.

Garbage Collection (GC) is how .NET automatically cleans up memory you’re no longer using. Every object you create eventually becomes garbage that needs collecting.

.NET uses generational GC - it groups objects by age:

  • Gen 0: Brand new objects. Collected frequently (every few milliseconds), very fast.

  • Gen 1: Objects that survived one collection. Collected less often.

  • Gen 2: Long-lived objects. Collected rarely, but when it happens, it’s expensive - we’re talking 200-500ms pauses where your entire application freezes.

Here’s the problem: when you allocate tons of memory quickly (like creating 6 copies of 10,000 objects per request), you fill up Gen 0 fast. Objects get promoted to Gen 1, then Gen 2. Eventually the GC has to do a full Gen 2 collection to reclaim memory, and your API freezes for half a second.

Under load, this happens every few seconds. Your users see it as random latency spikes. You see it as “why is production so slow?”

The goal: Keep allocations small and short-lived so the GC only does fast Gen 0 collections, never expensive Gen 2 sweeps.
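The runtime exposes counters for exactly this. Here's a small sketch using GC.CollectionCount and GC.GetGeneration to watch generational behavior; the allocation sizes are arbitrary, chosen just to churn Gen 0:

```csharp
using System;

// Watch the generational GC at work: short-lived garbage dies in
// Gen 0, survivors get promoted to higher generations.
var longLived = new byte[1000];

// Churn through ~100MB of short-lived allocations.
for (int i = 0; i < 100_000; i++)
{
    _ = new byte[1024]; // becomes garbage immediately
}

Console.WriteLine($"Gen 0 collections: {GC.CollectionCount(0)}");
Console.WriteLine($"Gen 1 collections: {GC.CollectionCount(1)}");
Console.WriteLine($"Gen 2 collections: {GC.CollectionCount(2)}");

// The survivor has been promoted out of Gen 0 by now.
Console.WriteLine($"longLived is in Gen {GC.GetGeneration(longLived)}");
```

Run it and you'll see plenty of cheap Gen 0 collections, few or no Gen 2 ones - the healthy pattern. The naive ToList() chain inverts this: it allocates so much per request that survivors pile up and Gen 2 sweeps become routine.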

Now let’s see how the optimized code achieves that.

// Repository layer - return IEnumerable
public class OrderRepository
{
    public IEnumerable<Order> GetOrders(int customerId)
    {
        return _db.Orders
            .Where(o => o.CustomerId == customerId)
            .AsEnumerable(); // Deferred: nothing runs until enumeration (note: later filters execute in memory, not in SQL)
    }
}

// Service layer - stay lazy
public class OrderService
{
    public IEnumerable<Order> GetActiveOrders(int customerId)
    {
        return _repository.GetOrders(customerId)
            .Where(o => o.Status == "Active"); // Still lazy
    }
}

// Business logic layer - compose the pipeline
public class OrderProcessor
{
    public IEnumerable<OrderSummary> ProcessOrders(int customerId)
    {
        return _orderService.GetActiveOrders(customerId)
            .Select(o => new OrderSummary(o))
            .Where(s => s.Total > 1000); // Entire pipeline is lazy
    }
}

// API Controller - materialize ONCE
[HttpGet("orders/{customerId}")]
public IActionResult GetOrders(int customerId)
{
    var result = _processor.ProcessOrders(customerId);
    return Ok(result); // ASP.NET enumerates it once during serialization
}

What changed:

Every layer returns IEnumerable<T> instead of List<T>. The pipeline stays lazy through all layers. The data only materializes once - when ASP.NET serializes the response.
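You can see "compose lazily, materialize once" directly by counting enumerations. This sketch mirrors the three-layer pipeline with toy data; CountingSource is an illustrative name, not part of the article's code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int enumerations = 0;

// A source that records how many times it is actually enumerated.
IEnumerable<int> CountingSource()
{
    enumerations++; // runs on first MoveNext, once per enumeration
    for (int i = 1; i <= 5; i++) yield return i;
}

// Compose the whole pipeline lazily: nothing executes yet.
var pipeline = CountingSource()
    .Where(n => n % 2 == 1)  // the "active orders" filter
    .Select(n => n * 100);   // the "summary" projection

Console.WriteLine(enumerations); // 0 - still lazy

// Materialize exactly once, at the end (as serialization does).
var result = pipeline.ToList();

Console.WriteLine(enumerations);             // 1
Console.WriteLine(string.Join(",", result)); // 100,300,500
```

Three logical stages, one pass over the data, one materialization. That's the entire trick behind the memory numbers below.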

Memory allocation:

  • Before: 6 copies of the data, ~11MB per request

  • After: One enumeration, ~1MB per request

  • 11x reduction in memory usage

Iterations:

  • Before: Iterated the collection 6 times

  • After: Iterated once

  • 6x reduction in CPU work

Garbage Collection:

  • Before: Gen 2 collections every few seconds under load, 200-500ms pauses

  • After: Gen 0 collections only, <10ms pauses

  • GC pressure eliminated

Here’s where AI becomes useful. Not to write the code - to reveal what it costs.

I took the naive version and prompted Claude:

“Analyze this C# code path for memory efficiency. Trace how many times the data is materialized and copied across layers. Estimate memory allocation per request.”

The response showed me:

  • ToList() called 6 times on the same data

  • Each call allocating a new array

  • Total memory allocation: 10-15MB per request

  • At 1,000 concurrent users: 10-15GB active memory

  • GC Gen 2 collections every 3-5 seconds under load

Then I asked:

“Show me how to optimize this using lazy evaluation and IEnumerable. Keep it clean but eliminate redundant materializations.”

AI gave me the optimized version and explained the key insight: compose the pipeline, materialize once.

That’s not replacing understanding. That’s making the invisible costs visible again, the way a 200MHz CPU used to make them visible by choking when you allocated too much memory.

On your development machine with 100 test orders, both versions run in under 20ms. The naive version uses 500KB of memory. Your laptop doesn’t even notice.

This is what I meant last week: GHz processors hide the cost until scale forces you to see it.

The algorithmic complexity didn’t change. Creating N copies of M items always costs O(N × M) memory. These are CS fundamentals that were true in the MHz era and are still true now.

The difference is that modern hardware made the problem invisible during development. You never felt the pain of copying 10,000 objects 6 times into 64MB of RAM because you have 32GB. You never noticed GC pauses because the GC is incredibly fast on modern CPUs.

Until 1,000 users hit your API simultaneously and suddenly you’re out of memory.

Search your codebase for methods that return List<T>. Look at what calls them. Check if the caller immediately calls ToList() on the result.

You’re looking for this pattern:

var data = GetData().ToList(); // GetData already returns List<T>

It’s everywhere. I guarantee it.
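The fix is usually less work than the bug. A hedged sketch - GetData is a hypothetical stand-in for any method that already returns List<T>:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical method that already returns a materialized List<T>.
static List<int> GetData() => new() { 1, 2, 3 };

// The smell: a defensive copy of something that is already a List.
var copied = GetData().ToList(); // second allocation, same contents

// Fix 1: just use the list you were given.
var direct = GetData();

// Fix 2: one level up, widen the return type to IEnumerable<T> and
// keep composing lazily, so nothing re-materializes at all.
IEnumerable<int> GetBigValues() => GetData().Where(n => n > 1);

Console.WriteLine(string.Join(",", GetBigValues())); // 2,3
```

Most of the time the change is literally deleting ".ToList()" and, where it helps, loosening a List<T> signature to IEnumerable<T>.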

Then prompt AI:

“Analyze this code path for redundant ToList() calls. Show me where data is being copied unnecessarily and how to use lazy evaluation instead.”

You’ll be surprised what you find. Code that looks perfectly clean in isolation often creates 5-10x memory overhead when you trace the full call stack.

The ToList() looks harmless. It’s one line. It passes code review. But when you chain 6 of them together across layers, you’ve just turned a 1MB operation into an 11MB operation.

The GHz just made you think it didn’t matter.

Next week I’ll show you another pattern that’s even more invisible: string concatenation and interpolation in loops. Seems harmless. Costs you more than you think.

Subscribe so you don’t miss it. Every week we’ll take code that “works fine” and show you what it’s actually costing you.

Because the best time to fix performance problems is before they’re production incidents.