I'm really restricted on space, so I wanted to present a counterpoint to today's funcaday: performance.

The disadvantage of the "escape for now, not for later" approach is simple. If you save a user's post to the database and that post is then displayed 2,000 times, there will be some serious differences. Under the approach I recommend, the post will be escaped with mysql_real_escape_string() once and with htmlentities() 2,000 times. If you had escaped it both ways in the first place, each function would have been called once, saving you 1,999 calls to htmlentities().
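As a rough sketch of that arithmetic, the snippet below counts htmlentities() calls under both approaches. The names ($rawPost, $views, escapeForHtml()) are made up for illustration, the database step is left out entirely, and the loop only stands in for 2,000 page views.

```php
<?php
// Counter for how often the escaping function runs (illustration only).
$escapeCalls = 0;

// Hypothetical wrapper around htmlentities() so the calls can be counted.
function escapeForHtml(string $text): string
{
    global $escapeCalls;
    $escapeCalls++;
    return htmlentities($text, ENT_QUOTES, 'UTF-8');
}

// Stand-in for a user's post; in a real application this comes from the database.
$rawPost = 'Tom & Jerry <3 "PHP"';
$views = 2000;

// Approach 1: store the raw post and escape it on every display.
for ($i = 0; $i < $views; $i++) {
    $html = escapeForHtml($rawPost);
}
$callsEscapeOnDisplay = $escapeCalls;

// Approach 2: escape once when saving, then reuse the stored HTML on every display.
$escapeCalls = 0;
$storedHtml = escapeForHtml($rawPost);
for ($i = 0; $i < $views; $i++) {
    $html = $storedHtml;
}
$callsEscapeOnSave = $escapeCalls;

printf(
    "escape on display: %d calls, escape on save: %d call, saved: %d\n",
    $callsEscapeOnDisplay,
    $callsEscapeOnSave,
    $callsEscapeOnDisplay - $callsEscapeOnSave
);
```

The counts only show how many times the function runs; whether that cost matters for a given site is something you would have to measure.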

You will need to balance your security concerns with performance needs.

Note: This blog post was written well in advance. I'm on vacation, don't have my laptop or internet, and it's likely that my cell phone won't even turn on, so replies may be a bit tardy.

Note^2: But I'm not dumb, someone's looking after my server :-)

Comments »

And that's why we have caches!
#1 Edward Z. Yang (Homepage) on 2008-01-05 06:58 (Reply)

Those 2,000 htmlentities calls are nothing compared to the 2,000 queries that you are doing, so indeed, make sure you cache properly.

Running htmlentities on the data before it goes into the database makes your stored data very HTML-centric. If you ever switch to publishing to other mediums, that will be slightly more difficult.
#2 Ivo Jansch (Homepage) on 2008-01-05 09:41 (Reply)

I agree with the last comments. I prefer to have raw data in the database, available to do whatever I want with, and then cache as necessary.
#3 Dave Marshall (Homepage) on 2008-01-05 11:30 (Reply)

I think you made a good point. However, one of PHP's strengths is rapid prototyping. It is totally aligned with the open source mantra "release early, release often". PHP allows you to release an application as early as possible and grow it from there. The business is happy because there's no need to wait forever to launch a website or to release an application.

As you get traffic, there are a bunch of ways to optimize. One of them is caching. You can cache at the htmlspecialchars() level by storing the result of htmlspecialchars() in a separate caching field in the database (alongside the unescaped values, which you keep for searching and other purposes).

It usually turns out, however, that caching is most efficient not at the htmlspecialchars() level but at higher levels, such as caching whole blocks of a page. I think this is why the efficiency of htmlspecialchars() is not really an issue. From a technology point of view it is, but from the POV of a business, it isn't really worth spending your precious time and attention on it.
#4 Norbert (Homepage) on 2008-01-05 13:53 (Reply)

I believe in "best of both worlds" - save the raw and escaped versions ;-) Then if you need to edit the original or change mediums, you can do it, but still don't have to call htmlentities 2000 times...
#5 Elizabeth Smith on 2008-01-07 19:20 (Reply)

Best of both worlds definitely works :-) !
#6 Adam (Homepage) on 2008-01-09 13:27 (Reply)

In Today's Funcaday (http://funcaday.com/displayEntry.php?id=81), you spelled definition wrong, twice. That is, you used two completely different incorrect spellings. If you want to be taken seriously, you have to at least use spell check.
#7 Anonymous on 2008-01-14 15:30 (Reply)

I agree :-)
#8 sf (Homepage) on 2008-01-15 10:27 (Reply)

I use htmlspecialchars instead of htmlentities and do not experience any problems. The only mistake that can catch you is calling it twice: once before putting data into the database and once after. Just avoid it.
#9 Jack (Homepage) on 2008-03-18 14:38 (Reply)

Hi, I’m Paul Reinheimer, a developer working on the web.

I wrote a book titled Professional Web APIs with PHP back in 2006, and am currently working in Biomedical Informatics for a major public health company.