A colleague called me over today for some help with a memory usage issue in his PHP script. The script was performing the basic (but critical) task of importing data, pulling it in from MySQL in large chunks, then exporting it elsewhere. He was receiving the wonderful “Fatal error: Allowed memory size of XXXX bytes exhausted (tried to allocate YY bytes)” error message.

The code was following a basic flow:
10 Get stuff from the database (large array of arrays)
20 Iterate over it, doing stuff
30 Goto 10

I fixed the memory exhaustion problem with an unset() at the end of the loop.

Take a look at this sample program:

loopOverStuff();

function loopOverStuff()
{
    $var = null;
    for ($i = 0; $i < 10; $i++) {
        $var = getData();
        // Do stuff
    }
}

function getData()
{
    $a = str_repeat("This string is exactly 40 characters long", 20000);
    return $a;
}

The important thing to realize is that PHP will end up needing around twice as much memory as getData() takes. The problem is the line $var = getData(). The first time through the loop $var is tiny (it holds null), so it is simply clobbered when the return value of getData() is assigned to it. The second time through the loop $var still holds the value from the previous iteration, so while getData() is executing you're maintaining the original data (in $var) and a whole new set (being built in getData()). The old data is only released once the assignment replaces it.

Fixing this is incredibly easy:

function loopOverStuff()
{
    $var = null;
    for ($i = 0; $i < 10; $i++) {
        $var = getData();
        // Do stuff
        unset($var); // release the data before the next getData() call builds a new set
    }
}

This way we avoid holding two copies of the data in memory on that line. To see this happen in more detail, take a look at this sample script with output: memory-usage-example.php.
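If you want to check the effect yourself, here is a minimal sketch (not the linked memory-usage-example.php). It assumes PHP 7 or later with the default Zend memory manager, and the extra parameters ($bytes, $peak, $useUnset) are my own additions for measurement. It samples memory_get_usage() inside getData(), right after the new data has been built, since that is the moment both copies can coexist.

```php
<?php
// Stand-in for the database fetch: builds roughly $bytes of data.
// $peak is a hypothetical measurement hook, not part of the original sample.
function getData(int $bytes, int &$peak): string
{
    $a = str_repeat('x', $bytes);
    // Sample usage while the new data exists alongside whatever the caller still holds.
    $peak = max($peak, memory_get_usage());
    return $a;
}

// Runs the loop with or without the unset() and returns the highest usage seen.
function loopOverStuff(bool $useUnset, int $bytes): int
{
    $peak = 0;
    $var = null;
    for ($i = 0; $i < 3; $i++) {
        $var = getData($bytes, $peak);
        // Do stuff
        if ($useUnset) {
            unset($var); // drop the old data before the next fetch
        }
    }
    return $peak;
}

$bytes = 820000; // about the size of the string in the sample program
$base = memory_get_usage();
$without = loopOverStuff(false, $bytes) - $base;
$with = loopOverStuff(true, $bytes) - $base;

printf("peak without unset: %d bytes above baseline\n", $without);
printf("peak with unset:    %d bytes above baseline\n", $with);
```

On a typical build the first figure should come out near twice the second, though the exact numbers depend on the PHP version and allocator.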

This isn't critical, except when it is. Once loopOverStuff() completes and the function ends, the memory is released back to the rest of PHP automatically. You'll only run into problems where Other Stuff + (2 * memory needed in loop) > Memory Limit. There are better architectures available to avoid the issue entirely (like not storing everything in the array, just to iterate over it later) but they're an issue for a different post.
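As one rough illustration of "not storing everything in the array", a generator can hand rows to the loop one at a time. This is only a sketch under assumptions: fetchRows() and exportRows() are hypothetical names, the rows are fabricated in place of a real database cursor, and it needs PHP 7.1 or later for the iterable type.

```php
<?php
// Hypothetical row source standing in for a database cursor.
// Each row is produced only when the consumer asks for it.
function fetchRows(int $count): Generator
{
    for ($i = 0; $i < $count; $i++) {
        yield ['id' => $i, 'payload' => str_repeat('x', 1000)];
    }
}

// Consumes rows one at a time, so only the current row is held in memory.
function exportRows(iterable $rows): int
{
    $exported = 0;
    foreach ($rows as $row) {
        // Export $row elsewhere
        $exported++;
    }
    return $exported;
}

$exported = exportRows(fetchRows(5000));
printf("exported %d rows\n", $exported);
```

The trade-off is that you can no longer index into the full result set, which is fine for a straight import/export pass.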

For a very simple base-case demonstration of the issue, take a look at the simple example.


Comments »

Social comments and analytics for this post
This post was mentioned on Twitter by planetphp: Memory usage in PHP - Paul Reinheimer http://blog.preinheimer.com/index.php?/archives/354-Memory-usage-in-PHP.html
Weblog: uberVU - social comments
Tracked: Mar 20, 01:16
This why learning a language like C/C++ is so useful. Those languages force you to manage memory and that skill translates nicely to scripting languages like PHP/Python.
#1 Herman Radtke (Homepage) on 2010-03-19 22:30 (Reply)

Also, using unbuffered queries with MySQL can save lots of memory. I talked about that at the 2008 MySQL Conference and on WebDevRadio. Slides and interview at http://brian.moonspot.net/2008/05/03/interview-with-webdevradio/
#2 Brian Moon (Homepage) on 2010-03-19 22:37 (Reply)

If you run the script with Xdebug, like:

php -dxdebug.auto_trace=1 -dxdebug.trace_format=1 paul.php

You can clearly see this behaviour as well.
#3 Derick (Homepage) on 2010-03-19 22:51 (Reply)

I'm just curious, as this would be a micro-optimization, would it be more efficient to set $var to an empty value instead of unsetting it?

Thanks
#4 Rob O. (Homepage) on 2010-03-20 01:26 (Reply)

I'd like to disagree with calling this a micro-optimization.

The actual data sets we're dealing with are greater than 20MB, reducing peak memory usage by that value has real (positive) effects on the system.

True, using unbuffered queries, and avoiding buffering within PHP itself would be a far more efficient architecture (the developer in question will be going that way in version 1.2). Reducing peak memory usage now is far from "micro".

That said, it seems like unset() vs "" or null would fit in the micro category :-).
#5 Paul Reinheimer (Homepage) on 2010-03-20 01:34 (Reply)

You are correct. I meant micro-optimization being the unset() vs "". Sorry for the confusion :-)
#6 Rob O. (Homepage) on 2010-03-20 01:43 (Reply)

What do you think about the xcache mod for PHP... it's a PHP accelerator. So it's saving memory because most scripts get cached!
#7 Jochen Liebe (Homepage) on 2010-03-20 23:24 (Reply)

you're talking about *opcode caching*. that doesn't solve the main problem being discussed here.
#8 Nathan G (Homepage) on 2010-03-21 16:18 (Reply)

Actually, that has the opposite effect. Opcode caches use more memory as it is caching all your scripts in memory. Opcode caches are made to save CPU cycles.
#9 Brian Moon (Homepage) on 2010-03-21 16:28 (Reply)

Let's not confuse the guy more.

The article was talking about exhausting the memory allowed to the PHP script, not all the memory on a server. The fact that opcode caching uses memory outside of the PHP script is not really relevant.
#10 Herman Radtke (Homepage) on 2010-03-21 16:34 (Reply)

Presumably there is something missing from this example - the "Do Stuff" comment would represent some code that makes use of the return value from `getData`. A better solution imho would be to move this code into a separate function and call it without using a temporary variable. Eg.:

function loopOverStuff() {
for ($i = 0; $i < 10; $i++) {
doStuff(getData());
}
}

Not only will this have the same effect as your proposed solution, but it has the added benefit of making the overall code more readable.

As a general rule of thumb, avoiding temporary variables by replacing them with function calls makes code simpler and less error-prone.
#11 Troels Knak-Nielsen on 2010-03-21 22:34 (Reply)

Not really, all you need to come up with an idea like "hey, using unset will free me some memory and $var set in the loop exists outside the loop and between each iteration!!!" is a bit of logical thinking and knowledge of basic PHP stuff, C/C++ has not much to do with that :-)
#12 Tomek on 2010-03-23 10:58 (Reply)

#6 is right, but generally, why doesn't ugly PHP have true block scope for variables ({} scope for every loop, condition, ...)?

C, C++, Perl... have it, PHP doesn't. Why can't PHP have "use strict" like Perl?

Uff, PHP has more and more basic flaws, e.g. why doesn't function()[2] work when function()->property does?
#13 optik (Homepage) on 2010-03-23 14:12 (Reply)

PHP is optimized for web requests. Part of that optimization is lazily releasing memory in order to speed up the request. Those of us that use PHP for non-web things have to know how it works and be responsible for how we use it.
#14 Brian Moon (Homepage) on 2010-03-23 15:06 (Reply)

You still hit the peak memory usage even when using unset though. You said this is a workaround for all the memory allocation issues PHP has. All this does is reduce memory usage every other iteration. The app will still crash....
#15 Adam on 2010-03-23 16:54 (Reply)

This is a great tip.

I've never encountered a scenario where something like this caused any noticeable memory leakage, but this is great to know.

I am probably going to blog about this on my site. Good job!
#16 Chris Roane (Homepage) on 2010-03-24 14:20 (Reply)

Little bit off-topic to the concrete discussed problem, but try to use pre-increment instead of post-increment in the for-loop.

i have tested this issue like this:

$loops = 10000000;

$startPost = microtime(true);
for ($i = 0; $i < $loops; $i++) {
$test = 1;
}
$endPost = microtime(true);
$startPre = microtime(true);
for ($i = 0; $i < $loops; ++$i) {
$test = 1;
}
$endPre = microtime(true);

echo $loops.' post increments needed: '.($endPost - $startPost).PHP_EOL;
echo $loops.' pre increments needed: '.($endPre - $startPre).PHP_EOL;

10000000 post increments needed: 17.445150852203
10000000 pre increments needed: 13.529909849167

This would not have much influence on your problem, but should be considered generally: use post-increment only if really needed
#17 chasm on 2010-03-25 09:25 (Reply)

@chasm Sure, if you care about a performance increase of 0.0000003915 seconds. That's not even a micro-optimisation - That's a nano-optimisation.
#18 Troels Knak-Nielsen on 2010-03-25 09:36 (Reply)

@troels

No, it's an increase of 4 seconds, because with the true parameter microtime() returns seconds.

But you are right, it's a kind of micro-optimisation.

But why use an operation (return the old value, then increment) whose result isn't used that way? The interpreter has to copy the value, increment the original and, when the increment has finished, return the copy.

@paul, i'm interested in the other architectures, you have mentioned for resolving the main problem.

greetings
chasm
#19 chasm on 2010-03-25 09:46 (Reply)

I meant, per cycle:

(17.445150852203 - 13.529909849167) / 10000000

The only reason you can see the difference is because you made a loop of 10 million iterations. If you have to do any kind of work inside the loop, then it will surely dwarf the difference between post and pre increment.

You're right of course, but my point is that this only matters for very few and rather abnormal edge cases.
#20 Troels Knak-Nielsen on 2010-03-25 09:59 (Reply)

A simple solution to a complex problem.
#21 foster on 2010-03-27 00:14 (Reply)

Using unset will not work if you have circular references in the data you try to unset (PHP 5.3 has a special INI option that should improve handling this though). So it could help for certain specific cases, but isn't always the solution for OOM errors...
#22 Wim Vandersmissen on 2010-04-06 14:47 (Reply)



Hi, I’m Paul Reinheimer, a developer working on the web.

I co-founded WonderProxy which provides access to over 200 proxies around the world to enable testing of geoip sensitive applications. We've since expanded to offer more granular tooling through Where's it Up

My hobbies are cycling, photography, travel, and engaging Allison Moore in intelligent discourse. I frequently write about PHP and other related technologies.
