How to Save Memory in Python by Using Generators Instead of Lists
When working with large datasets or data streams, using Python lists can quickly consume significant amounts of memory. Generators offer a memory-efficient alternative by yielding values lazily, one at a time. This makes them ideal for scenarios where you don’t need to hold all items in memory simultaneously.
1. Lists vs Generators
Let’s look at a basic comparison between a list and a generator:
# Using a list
squares_list = [x * x for x in range(10**6)] # Memory intensive
# Using a generator
squares_gen = (x * x for x in range(10**6)) # Much more memory efficient
The list creates and stores all values at once. The generator yields one value at a time as needed.
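Two consequences of this laziness are worth seeing directly: values are computed only when requested, and a generator can be consumed only once. A minimal sketch:

```python
# Generators produce values on demand and are single-pass
squares_gen = (x * x for x in range(5))

print(next(squares_gen))  # 0 -- computed only at this point
print(next(squares_gen))  # 1

# The remaining values can still be collected...
print(list(squares_gen))  # [4, 9, 16]

# ...but the generator is now exhausted
print(list(squares_gen))  # []
```

If you need to iterate over the same values twice, either rebuild the generator or materialize it into a list once.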
2. Measuring Memory Usage
You can use the sys module to compare the memory footprint:
import sys
list_version = [x for x in range(100000)]
gen_version = (x for x in range(100000))
print("List size:", sys.getsizeof(list_version))
print("Generator size:", sys.getsizeof(gen_version))
The list will show a much higher memory usage than the generator, which only stores the iterator state.
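As a quick sanity check, that claim can be asserted directly. The exact byte counts vary across Python versions, so this sketch only checks the ordering, not specific numbers:

```python
import sys

list_version = [x for x in range(100000)]
gen_version = (x for x in range(100000))

# The list stores every element; the generator stores only its
# iterator state, so its reported size stays small and fixed
assert sys.getsizeof(list_version) > sys.getsizeof(gen_version)
```

Note that sys.getsizeof reports only the container's own size; for a true end-to-end comparison of a whole computation you would need a memory profiler.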
3. When to Use Generators
- Processing large files line by line
- Streaming data (e.g., from APIs or sockets)
- Lazy evaluation of computations
# Reading a file line by line using a generator
def read_file_lines(filename):
    with open(filename) as f:
        for line in f:
            yield line.strip()

for line in read_file_lines('bigfile.txt'):
    print(line)
4. Chaining Generators into Pipelines
You can chain multiple generators to build complex pipelines without using much memory:
def gen_numbers():
    for i in range(1, 1000000):
        yield i

def filter_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            yield n

def square(numbers):
    for n in numbers:
        yield n * n

for result in square(filter_even(gen_numbers())):
    print(result)
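Because every stage yields on demand, the whole pipeline stays lazy: asking for a few results computes only those few. A short self-contained sketch of the same pipeline, using itertools.islice to pull just the first three values:

```python
from itertools import islice

def gen_numbers():
    for i in range(1, 1000000):
        yield i

def filter_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            yield n

def square(numbers):
    for n in numbers:
        yield n * n

# Only the first three results are ever computed;
# the remaining ~499,997 values are never touched
pipeline = square(filter_even(gen_numbers()))
print(list(islice(pipeline, 3)))  # [4, 16, 36]
```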
5. Generator Functions vs Generator Expressions
Generator expressions look like list comprehensions, but with parentheses. Generator functions use yield and are more flexible when the logic is complex.
# Expression
gen_exp = (x * x for x in range(1000))

# Function
def gen_func():
    for x in range(1000):
        yield x * x
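One thing only a generator function can do is carry state between yields. A hypothetical running_total generator illustrates logic that a single expression cannot express:

```python
# A generator function can keep state (here, an accumulator)
# across yields -- something a generator expression cannot do
def running_total(numbers):
    total = 0
    for n in numbers:
        total += n
        yield total

print(list(running_total([1, 2, 3, 4])))  # [1, 3, 6, 10]
```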
Conclusion
Using generators is one of the best ways to reduce memory usage in Python programs that process large amounts of data. They allow you to keep your code efficient and scalable without sacrificing readability or structure.
If this post helped you, consider supporting me here: buymeacoffee.com/hexshift