I’ve been using memcached to store session data for the past while, but we ran into a few problems at work that led me to dive in a bit deeper and see how PHP, sessions, and memcached play together.

Short Version

When you start up a session, a new entry is created in memcached with the same expiry as session.gc_maxlifetime. Session data is retrieved each time a session is started (either with session_start(), or automatically if you have session.auto_start enabled).

Long Version

When session_start() is called for a new user several things happen very quickly:

  1. A session ID is generated for the end user; this is seen in the HTTP response headers:

       HTTP/1.x 200 OK
       Date: Fri, 10 Jul 2009 16:07:21 GMT
       Server: Apache/2.2.11
       X-Powered-By: PHP/5.2.6-3ubuntu4.1
       Set-Cookie: PHPSESSID=c446fc5309c3b8d3aa6e90343dddab29; path=/
       Expires: Thu, 19 Nov 1981 08:52:00 GMT
       Vary: Accept-Encoding
       Content-Encoding: gzip
       Content-Length: 23
       Keep-Alive: timeout=15, max=100
       Connection: Keep-Alive
       Content-Type: text/html

  2. PHP contacts memcached and attempts to retrieve any information stored under that session ID:

       get c446fc5309c3b8d3aa6e90343dddab29

  3. Memcached returns no data for that entry, as it doesn’t exist yet.
  4. Your page executes, setting data in the session.
  5. PHP pushes data into memcached:

       set c446fc5309c3b8d3aa6e90343dddab29 0 1440 32
       name|s:6:"asdasd";bar|s:3:"foo";
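
The five steps above can be sketched as a small simulation. This is an illustrative model in Python, not PHP's actual implementation: FakeMemcached, handle_request, and the logged command strings are hypothetical names made up for this example, and the session ID is simply random hex.

```python
import secrets

GC_MAXLIFETIME = 1440  # default session.gc_maxlifetime, in seconds


class FakeMemcached:
    """In-memory stand-in that records the commands a client would send."""

    def __init__(self):
        self.data = {}
        self.log = []

    def get(self, key):
        self.log.append(f"get {key}")
        return self.data.get(key)  # None models a cache miss

    def set(self, key, flags, exptime, value):
        self.log.append(f"set {key} {flags} {exptime} {len(value)}")
        self.data[key] = value


def handle_request(cache, session_id=None):
    """Model one page load: start the session, run the page, write it back."""
    if session_id is None:
        # Step 1: new visitor, so generate an ID (sent back via Set-Cookie).
        session_id = secrets.token_hex(16)
    # Steps 2-3: ask memcached for existing data; a brand new ID is a miss.
    stored = cache.get(session_id)
    # Step 4: the page runs and puts data in the session.
    payload = stored or b'name|s:6:"asdasd";bar|s:3:"foo";'
    # Step 5: the session is written back with the configured lifetime.
    cache.set(session_id, 0, GC_MAXLIFETIME, payload)
    return session_id


cache = FakeMemcached()
sid = handle_request(cache)   # first visit: get (miss), then set
handle_request(cache, sid)    # later visit: get (hit), then set again
print("\n".join(cache.log))
```

Both visits end with a set, which is the part that matters for expiry.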

On every subsequent page where a session is used, a very similar sequence of events is executed. When the session is started PHP contacts memcached and asks for the session data. When the page is complete it pushes the data now in the session to memcached. That last step is kind of critical, because it’s what’s controlling the expiry of the data within memcached. Take a quick look at the data being set once more.

set c446fc5309c3b8d3aa6e90343dddab29 0 1440 32
name|s:6:"asdasd";bar|s:3:"foo";

Here the data is being set. The key is your session ID, and 0 is a set of flags that the client can use to mark the data (often used to mark data that’s been compressed). The 1440 is the expiry time for the data: 1440 seconds from now (24 minutes). That value was taken directly from session.gc_maxlifetime. It’s important to keep in mind that all memcached guarantees is that the data will not be available after that expiry runs out; it promises nothing about how long the data will actually remain available. If your memcached instance crashes or runs out of memory, you’re going to start losing data a lot sooner. The 32 is the length of the data in bytes. Finally, we have the data itself, in a somewhat truncated version of PHP’s serialize format.
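
To make the field layout concrete, here is a minimal sketch that splits such a request into its parts. It assumes the memcached text protocol, where the command line and the data block each end in \r\n, and it writes the payload the way PHP's session serializer would normally emit it (straight quotes, no spaces), which is what makes the length come out to 32 bytes. parse_set_command is a hypothetical helper, not part of any client library.

```python
def parse_set_command(raw: bytes) -> dict:
    """Split a memcached text-protocol `set` request into its fields."""
    # Command line, data block, and whatever follows the final \r\n.
    # (A payload that itself contains \r\n would need length-based reading.)
    header, data, _ = raw.split(b"\r\n", 2)
    command, key, flags, exptime, nbytes = header.split(b" ")
    if command != b"set":
        raise ValueError("not a set command")
    if len(data) != int(nbytes):
        raise ValueError("byte count does not match the data block")
    return {
        "key": key.decode(),       # the session ID
        "flags": int(flags),       # client-defined marker bits
        "exptime": int(exptime),   # seconds until expiry
        "bytes": int(nbytes),      # length of the data block
        "data": data,
    }


raw = (
    b"set c446fc5309c3b8d3aa6e90343dddab29 0 1440 32\r\n"
    b'name|s:6:"asdasd";bar|s:3:"foo";\r\n'
)
fields = parse_set_command(raw)
print(fields["exptime"] // 60, "minutes")  # 1440 seconds is 24 minutes
```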

The fact that the data is sent to memcached after every request, even if you haven’t changed what’s stored in the session, is rather important. Each write resets the expiry, which is what makes your session last for session.gc_maxlifetime after the user’s last request, rather than for that amount of time after the session data last changed.
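
The effect of rewriting on every request can be shown with a toy model. This is a sketch under stated assumptions: ExpiringStore and survives are hypothetical names, the clock is advanced by hand, and the store only mimics the "gone after expiry" rule, not eviction or crashes.

```python
GC_MAXLIFETIME = 1440  # seconds


class ExpiringStore:
    """Toy key/value store with per-key expiry and a manually advanced clock."""

    def __init__(self):
        self.now = 0
        self.items = {}  # key -> (value, absolute expiry time)

    def advance(self, seconds):
        self.now += seconds

    def set(self, key, value, exptime):
        self.items[key] = (value, self.now + exptime)

    def get(self, key):
        entry = self.items.get(key)
        if entry is None or entry[1] <= self.now:
            return None  # missing or expired
        return entry[0]


def survives(write_every_request: bool) -> bool:
    """Three requests 20 minutes apart; is the session still there at the end?"""
    store = ExpiringStore()
    store.set("sid", "data", GC_MAXLIFETIME)  # first request creates the entry
    for _ in range(2):
        store.advance(20 * 60)                # user idles for 20 minutes
        value = store.get("sid")
        if value is None:
            return False
        if write_every_request:
            # Unchanged data is written anyway, pushing the expiry forward.
            store.set("sid", value, GC_MAXLIFETIME)
    return True


print(survives(True), survives(False))  # True False
```

With a write on every request the session outlives the 24 minute window as long as the user keeps clicking; without it, the entry is gone by the third request.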


Comments »

Paul,
I can't figure out if you're positive about using memcache for php sessions or not... What's the verdict? Any thoughts on when to use it and when not?
Thx,
Bob
#1 Bob on 2009-07-13 20:39

Hi Bob,

You're probably not clear because I made no attempt to come down on either side of the issue; I just wanted to explain how it ends up working.

Memcache is fast, much faster than reading and writing files off disk. It also has the advantage of being easy to share, put one memcached server behind several webservers and sessions automagically work even if requests jump from one server to the next. No need for sticky load balancers.

That's all super.

There are also several downsides, the primary one being that memcached actually makes no promises except that data will no longer be available after expiration. So even without a server crashing, memcached isn't making any promises about your data continuing to be available for the full amount of time before its expiry. If memcached is low on RAM and an entry hasn't been used lately, it's out. If a server does go down, all of the data that was stored within it is also lost.

If you can accept the less than perfect reliability memcached is offering, it's great, fast, and easy to get going with. If those aren't acceptable limitations for you (imagine a customer losing their entire shopping cart when a server goes down or runs out of RAM), you'll need to look elsewhere.

At work we accept that limitation: if a memcached server craps out, users need to log in again. It's a pain for users, but it works.
#2 Paul Reinheimer (Homepage) on 2009-07-13 20:48

Did you build a custom handler for the sessions or just set the ini_set('session.save_handler', 'memcache');? I have been looking into sessions lately but I had not thought about the data loss issue you brought up.


http://www.php.net/manual/en/memcache.examples-overview.php
#3 David (Homepage) on 2009-07-13 21:42

We're basically example #2

We use a loop to iterate over a series of IPs and ports and make one large string, but that's essentially it.
#4 Paul Reinheimer (Homepage) on 2009-07-13 21:44

Do you have benchmarks that prove that memcached is faster than file based sessions? Those files should get cached by the system (if there's enough RAM), and therefore be fast as well, probably faster as there's no networking overhead?

I might miss something, but that was my first thought on reading this.
#5 Olly (Homepage) on 2009-07-14 03:14

the files still have to be synced out to disk on a regular basis, which causes more thrashing than required, and php will go to resave them (contents changed or not) every page request. this all adds up fast, and that's not including stat() overheads from the os itself. try it sometime with either the mm storage engine or memcache and you will see a world of difference

note the complaints about memcache losing data if a server goes down, it has support in it for automatic duplicates across a cluster so you can set a minimum of 2 or 3 clones of the data across the cluster if you need more reliability, that still wont help if you are out of memory in the instances though but that can be monitored easily enough

personally im quite the memcache fanboi, it does the world of good to stop accessing the disk or database with simple things, there are plenty of examples on a website where even just a 60second cache on a high traffic site will cut the load generated from a page by a hell of a lot
#6 trophaeum (Homepage) on 2009-07-14 03:53

Don't get me wrong, I do like memcached, just never used it for sessions, so I was just wondering if its really worth it (even if you use only one webserver).
Thanks for the clarification.

You can set memcache to stop it from deleting (valid) data, but then it will fail storing new values, probably the worse choice, but easier to catch.
#7 Olly (Homepage) on 2009-07-14 05:20

As to file based sessions, I ran into an issue where a client was using frames and loading 4 pages. Since the session files are locked until each page finishes processing, we had to wait for each one to complete before the next was displayed. We changed to memcache and it allows them to run in parallel. You do have to watch out for race conditions (updating the session while the others are using it), but that's the developer's responsibility.
#8 Ian on 2009-07-14 12:55

Hi Paul. I've been using memcached sessions on a fairly high volume site. The site is built on the Kohana framework and we're using the Kohana session library for most session handling. For a while we had many problems with sessions getting lost, but then I disabled session regeneration in our Kohana session config and we haven't had a lost session since (about 2 weeks with no lost sessions so far). I don't know if that info is helpful to anyone outside the Kohana community, but I thought I should mention it since it took me several days to stumble upon that fix.
#9 Brett Brewer (Homepage) on 2009-07-15 20:35


 

Hi, I’m Paul Reinheimer, a developer working on the web.

