LINQ Internals and Pitfalls for C# Developers


Apr 7, 2025 - 09:11

Understanding IEnumerable vs IQueryable:

In LINQ, both IEnumerable and IQueryable let you work with collections of data, but they behave differently behind the scenes.

IEnumerable:

  • Works with in-memory data like lists, arrays, or collections
  • Processes data in your application's memory (client-side)
  • Simple to use but can be inefficient with large datasets
// IEnumerable example
List<Customer> customers = GetAllCustomers();
var goldCustomers = customers.Where(c => c.Type == "Gold");
// The entire customers list is already in memory; Where filters it there, item by item

IQueryable:

  • Designed for remote data sources like databases
  • Translates your C# expressions into the data source's language (like SQL)
  • More efficient with large datasets as filtering happens at the source
// IQueryable example
IQueryable<Customer> customers = dbContext.Customers;
var goldCustomers = customers.Where(c => c.Type == "Gold");
// When enumerated, translates to SQL roughly like: SELECT * FROM Customers WHERE Type = 'Gold'
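
To make the difference concrete, here is a minimal sketch that runs without a database. It assumes a made-up in-memory list (Prices) and a made-up class name, and it uses AsQueryable() only to expose the expression tree. A real provider such as Entity Framework would translate that tree rather than run it in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class QueryableVsEnumerable
{
    // Hypothetical sample data; any in-memory list works.
    public static readonly List<int> Prices = new List<int> { 50, 120, 300 };

    // On IEnumerable<T>, Where receives a compiled delegate (Func<int, bool>).
    public static IEnumerable<int> FilterInMemory() =>
        Prices.Where(p => p > 100);

    // On IQueryable<T>, Where receives an expression tree
    // (Expression<Func<int, bool>>) that a provider can inspect and translate.
    public static IQueryable<int> FilterAsQueryable() =>
        Prices.AsQueryable().Where(p => p > 100);

    public static void Main()
    {
        IQueryable<int> query = FilterAsQueryable();

        // The query carries a description of itself, not just opaque code.
        Console.WriteLine(query.Expression.NodeType == ExpressionType.Call);
        Console.WriteLine(string.Join(", ", query));
        Console.WriteLine(string.Join(", ", FilterInMemory()));
    }
}
```

Both methods return the same items here; what differs is that the IQueryable version keeps the filter as data, which is what makes translation to SQL possible.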

How Deferred Execution Works:

LINQ uses "lazy evaluation": a query isn't executed when you define it, only when you actually ask for its results.

// This just creates a query definition, nothing executes yet
var expensiveProducts = products.Where(p => p.Price > 100);

// The query only executes now, when we start using the results
foreach (var product in expensiveProducts)
{
    Console.WriteLine(product.Name);
}
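
One way to see deferred execution in action is to change the source after the query is defined. The sketch below assumes a small in-memory list of prices; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns what the query yields when it is enumerated AFTER the source changed.
    public static List<decimal> Run()
    {
        var prices = new List<decimal> { 50m, 150m };

        // Only a query definition; the predicate has not run yet.
        IEnumerable<decimal> expensive = prices.Where(p => p > 100m);

        // Change the source after defining the query.
        prices.Add(250m);

        // Enumeration happens here, so the query also sees the item added above.
        return expensive.ToList();
    }

    public static void Main()
    {
        // Two items come back: 150 and 250.
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```

If the query had run at the point of definition, the 250 entry would be missing from the result.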

This is powerful because:

  1. You build queries in steps without repeatedly processing data
  2. The data source is only accessed when needed
  3. You can create reusable query templates

Query execution is triggered when you:

  • Loop through results with foreach
  • Call methods like ToList(), ToArray(), First(), Count()
  • Use aggregate functions like Sum(), Average(), etc.
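
The triggers listed above can be observed with a source that counts how many items it hands out. This is only a sketch: Numbers() and Measure() are hypothetical helpers, and the counts apply to this in-memory iterator, not to a database provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExecutionTriggersDemo
{
    public static int ItemsProduced;

    // Hypothetical source that counts every item it yields.
    public static IEnumerable<int> Numbers()
    {
        for (int i = 1; i <= 5; i++)
        {
            ItemsProduced++;
            yield return i;
        }
    }

    // Hands a fresh query to the caller and reports how many items the source produced.
    public static int Measure(Action<IEnumerable<int>> use)
    {
        ItemsProduced = 0;
        use(Numbers().Where(n => n % 2 == 1));
        return ItemsProduced;
    }

    public static void Main()
    {
        Console.WriteLine(Measure(q => { }));        // 0: defining the query runs nothing
        Console.WriteLine(Measure(q => q.First()));  // 1: First() stops at the first match
        Console.WriteLine(Measure(q => q.Count()));  // 5: Count() walks the whole source
        Console.WriteLine(Measure(q => q.Sum()));    // 5: so does Sum()
    }
}
```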

Avoiding Performance Issues:

Problem 1: Multiple Enumerations

When you enumerate the same query multiple times, it re-executes each time:

// BAD PRACTICE
var products = dbContext.Products.Where(p => p.Category == "Electronics");

// This executes the query once
if (products.Any())
{
    // This executes the SAME query again!
    foreach (var product in products)
    {
        Console.WriteLine(product.Name);
    }

    // And here it executes a THIRD time!
    Console.WriteLine($"Total products: {products.Count()}");
}

Better approach:

// GOOD PRACTICE
// Execute once and store results
var products = dbContext.Products
    .Where(p => p.Category == "Electronics")
    .ToList();  // Query executes here and results are stored

if (products.Any())
{
    foreach (var product in products)
    {
        Console.WriteLine(product.Name);
    }

    Console.WriteLine($"Total products: {products.Count}");
}
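
The cost difference between the two versions can be counted without a database. In this sketch, ElectronicsQuery() is a hypothetical stand-in for the database query, and each enumeration of it counts as one "execution".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultipleEnumerationDemo
{
    public static int Executions;

    // Hypothetical stand-in for a database query:
    // the counter goes up every time the sequence is enumerated.
    public static IEnumerable<string> ElectronicsQuery()
    {
        Executions++;
        yield return "Laptop";
        yield return "Phone";
    }

    public static int WithoutMaterializing()
    {
        Executions = 0;
        IEnumerable<string> products = ElectronicsQuery();
        if (products.Any())                               // execution 1
        {
            foreach (var name in products)                // execution 2
                Console.WriteLine(name);
            Console.WriteLine(products.Count());          // execution 3
        }
        return Executions;
    }

    public static int WithToList()
    {
        Executions = 0;
        List<string> products = ElectronicsQuery().ToList();  // the only execution
        if (products.Count > 0)
        {
            foreach (var name in products)
                Console.WriteLine(name);
            Console.WriteLine(products.Count);
        }
        return Executions;
    }

    public static void Main()
    {
        Console.WriteLine($"Without ToList: {WithoutMaterializing()} executions");
        Console.WriteLine($"With ToList: {WithToList()} execution");
    }
}
```

The trade-off is memory: ToList() holds every result at once, so it suits result sets that are known to be reasonably small.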

Problem 2: Complex Query Chains

Complex LINQ chains can be hard to understand and maintain:

// Hard to read as one long chain
var result = customers
    .Where(c => c.Orders.Any())
    .SelectMany(c => c.Orders)
    .Where(o => o.Total > 100)
    .OrderBy(o => o.Date)
    .Select(o => new { o.Id, o.Total, CustomerName = o.Customer.Name });

Better approach:

// Break it down into meaningful steps; each step is still only a query definition
var customersWithOrders = customers.Where(c => c.Orders.Any());
var largeOrders = customersWithOrders
    .SelectMany(c => c.Orders)
    .Where(o => o.Total > 100);
var orderedResults = largeOrders
    .OrderBy(o => o.Date)
    .Select(o => new { o.Id, o.Total, CustomerName = o.Customer.Name })
    .ToList();  // Execute once and store
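
Splitting a chain into named steps does not add executions, because each intermediate variable is still just a query definition. The sketch below checks this with a hypothetical in-memory Orders() source that counts how often it is enumerated; the tuple shape and names are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StepwiseQueryDemo
{
    public static int SourcePasses;

    // Hypothetical order source; the counter goes up once per enumeration.
    public static IEnumerable<(int Id, decimal Total)> Orders()
    {
        SourcePasses++;
        yield return (1, 250m);
        yield return (2, 40m);
        yield return (3, 180m);
    }

    public static List<int> LargeOrderIds()
    {
        SourcePasses = 0;
        var largeOrders = Orders().Where(o => o.Total > 100m);  // step 1: definition only
        var sorted = largeOrders.OrderBy(o => o.Total);          // step 2: still nothing executed
        return sorted.Select(o => o.Id).ToList();                // the single execution
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", LargeOrderIds()));
        Console.WriteLine($"Source enumerated {SourcePasses} time(s)");
    }
}
```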

Problem 3: N+1 Query Problem

This happens when you load a collection and then, with lazy loading enabled, access a related entity on each item in a loop:

// BAD: Causes N+1 queries (1 for orders, N for customers)
var orders = dbContext.Orders.Take(100).ToList();
foreach (var order in orders)
{
    // This triggers a separate database query for EACH order!
    Console.WriteLine($"Order #{order.Id} by {order.Customer.Name}");
}

Better approach:

// GOOD: Use Include() to load related data in a single query
// (in EF Core, Include() requires: using Microsoft.EntityFrameworkCore;)
var orders = dbContext.Orders
    .Include(o => o.Customer)  // Load customers with orders
    .Take(100)
    .ToList();

foreach (var order in orders)
{
    // No additional queries needed; data already loaded
    Console.WriteLine($"Order #{order.Id} by {order.Customer.Name}");
}
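
The round-trip arithmetic behind N+1 can be simulated in memory. Everything in this sketch is hypothetical: the "tables" are plain collections, and each Load... method stands for one database round trip. It does not use Entity Framework and only illustrates the counting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NPlusOneDemo
{
    public static int QueryCount;

    // Hypothetical in-memory "tables".
    static readonly Dictionary<int, string> Customers = new Dictionary<int, string>
    {
        [1] = "Ada",
        [2] = "Linus"
    };

    static readonly List<(int OrderId, int CustomerId)> Orders =
        new List<(int OrderId, int CustomerId)> { (100, 1), (101, 2), (102, 1) };

    // Each method below counts as one round trip.
    static List<(int OrderId, int CustomerId)> LoadOrders()
    {
        QueryCount++;
        return Orders.ToList();
    }

    static string LoadCustomerName(int customerId)
    {
        QueryCount++;
        return Customers[customerId];
    }

    static List<(int OrderId, string Name)> LoadOrdersWithCustomers()
    {
        QueryCount++;  // one round trip that joins both tables
        return Orders
            .Select(o => (OrderId: o.OrderId, Name: Customers[o.CustomerId]))
            .ToList();
    }

    // 1 query for the orders plus 1 per order for its customer.
    public static int LazyStyle()
    {
        QueryCount = 0;
        foreach (var order in LoadOrders())
            Console.WriteLine($"Order #{order.OrderId} by {LoadCustomerName(order.CustomerId)}");
        return QueryCount;
    }

    // A single joined query, comparable to what Include() aims for.
    public static int EagerStyle()
    {
        QueryCount = 0;
        foreach (var row in LoadOrdersWithCustomers())
            Console.WriteLine($"Order #{row.OrderId} by {row.Name}");
        return QueryCount;
    }

    public static void Main()
    {
        Console.WriteLine($"Lazy style: {LazyStyle()} round trips");
        Console.WriteLine($"Eager style: {EagerStyle()} round trip");
    }
}
```

With three orders the lazy style makes four round trips and the eager style makes one; the gap grows linearly with the number of orders.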

Real-World Example: Product Catalog Filtering

Let's see how these concepts apply to a product catalog filter:

// Common real-world scenario
public List<ProductViewModel> GetFilteredProducts(
    string category = null,
    decimal? minPrice = null,
    decimal? maxPrice = null,
    string sortBy = "name")
{
    // Start with IQueryable
    IQueryable<Product> query = dbContext.Products;

    // Build query conditionally
    if (!string.IsNullOrEmpty(category))
        query = query.Where(p => p.Category == category);

    if (minPrice.HasValue)
        query = query.Where(p => p.Price >= minPrice.Value);

    if (maxPrice.HasValue)
        query = query.Where(p => p.Price <= maxPrice.Value);

    // Apply sorting (a null sortBy falls through to the default case instead of throwing)
    query = sortBy?.ToLowerInvariant() switch
    {
        "price" => query.OrderBy(p => p.Price),
        "date" => query.OrderByDescending(p => p.CreatedDate),
        _ => query.OrderBy(p => p.Name)
    };

    // Execute query and map to view model
    return query
        .Select(p => new ProductViewModel
        {
            Id = p.Id,
            Name = p.Name,
            Price = p.Price,
            Category = p.Category,
            InStock = p.StockQuantity > 0
        })
        .ToList();  // Query executes here
}

This example shows good practices:

  1. Using IQueryable for database queries
  2. Building the query in stages without executing until needed
  3. Converting to a List only once at the end
  4. Using projection (Select) to get only the data needed
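
The conditional composition pattern can be tried without a database by running it over an in-memory list through AsQueryable(). This is a reduced sketch, not the method above: the Product class, the sample data, and the Filter signature are assumptions for illustration, and an in-memory provider does not generate SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public string Name { get; set; }
    public string Category { get; set; }
    public decimal Price { get; set; }
}

public static class CatalogFilterDemo
{
    // Hypothetical sample data standing in for dbContext.Products.
    static readonly List<Product> Products = new List<Product>
    {
        new Product { Name = "Laptop", Category = "Electronics", Price = 900m },
        new Product { Name = "Cable",  Category = "Electronics", Price = 10m },
        new Product { Name = "Desk",   Category = "Furniture",   Price = 250m },
    };

    public static List<string> Filter(
        string category = null,
        decimal? minPrice = null,
        string sortBy = "name")
    {
        IQueryable<Product> query = Products.AsQueryable();

        // Each condition only extends the query definition.
        if (!string.IsNullOrEmpty(category))
            query = query.Where(p => p.Category == category);

        if (minPrice.HasValue)
            query = query.Where(p => p.Price >= minPrice.Value);

        // A null or unknown sortBy falls through to sorting by name.
        query = sortBy?.ToLowerInvariant() switch
        {
            "price" => query.OrderBy(p => p.Price),
            _ => query.OrderBy(p => p.Name)
        };

        // Project to the names only and execute once.
        return query.Select(p => p.Name).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Filter("Electronics", sortBy: "price")));
        Console.WriteLine(string.Join(", ", Filter(minPrice: 100m)));
        Console.WriteLine(string.Join(", ", Filter()));
    }
}
```

Omitted filters simply never become part of the query, which is why this pattern avoids writing one query per combination of parameters.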

By understanding these LINQ internals, you can write code that's not just functional but also performs well even with large datasets.