Python SQLAlchemy memory leak on Linux
I wrote a script that iterates over a large database table (~150K rows). To keep memory usage down, I use the windowed_query recipe. My script looks something like this:
    query = db.query(Table)
    count = 0
    for row in windowed_query(query, Table.id, 1000):
        points = 0
        # +100 points for a logo
        if row.logo_id:
            points += 100
        # +10 points for each image
        points += 10 * len(row.images)  # images is a SQLAlchemy one-to-many relationship
        # ...the script continues in much the same way...
        row.points = points
        db.add(row)
        count += 1
        if count % 100 == 0:
            db.commit()
            print count
    db.commit()
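The windowed_query helper itself isn't shown above; the SQLAlchemy wiki recipe it refers to is essentially keyset pagination, fetching one window of rows at a time instead of loading the whole result set. As a rough, database-free sketch of the idea (the fetch_window function below is a hypothetical stand-in for the real SQL query, not the actual recipe):

```python
def windowed_query(fetch_window, windowsize=1000):
    """Keyset pagination: repeatedly fetch the next window of rows whose
    id is greater than the last id seen, until no rows remain.

    fetch_window(last_id, limit) stands in for a query like:
        SELECT * FROM table WHERE id > :last_id ORDER BY id LIMIT :limit
    """
    last_id = 0
    while True:
        window = fetch_window(last_id, windowsize)
        if not window:
            break
        for row in window:
            yield row
        last_id = window[-1]["id"]

# Simulated table of 2500 rows, purely for illustration.
rows = [{"id": i} for i in range(1, 2501)]

def fetch_window(last_id, limit):
    return [r for r in rows if r["id"] > last_id][:limit]

ids = [r["id"] for r in windowed_query(fetch_window, 1000)]
```

The point of the pattern is that only one window of rows is referenced at a time, so each window can be garbage-collected before the next is fetched, at least in principle.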
When I run it on a CentOS server, it gets through about 9,000 rows before being killed by the kernel, having grown to ~2 GB of memory.
In my Mac development environment it works like a charm, even though it runs exactly the same versions of Python (2.7.3), SQLAlchemy (0.7.8), and psycopg2 (2.4.5).
I did some simple debugging with memory_profiler. On Linux, every piece of code that touches the database increases memory usage by a small amount, and the growth never stops. On the Mac the same thing happens, but after growing by ~4 MB it flattens out. It's as if on Linux nothing ever gets collected. (I even tried running gc.collect() every 100 rows. It did nothing.)
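For debugging like this, it can also help to log the process's resident set size directly every few hundred rows, independently of memory_profiler. A small helper using the stdlib resource module (one caveat that matters when comparing the two platforms: ru_maxrss is reported in kilobytes on Linux but in bytes on macOS):

```python
import resource
import sys

def rss_mb():
    """Peak resident set size of this process, in megabytes.

    resource.getrusage reports ru_maxrss in kilobytes on Linux
    but in bytes on macOS, so normalize before converting.
    """
    rss = float(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    if sys.platform == "darwin":
        rss /= 1024.0  # bytes -> kilobytes
    return rss / 1024.0  # kilobytes -> megabytes
```

Calling rss_mb() inside the `if count % 100 == 0:` block would show exactly which commits the growth correlates with.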
Does anyone know what's going on?