Common Mistakes in Securing Web Applications


I originally wrote this article for Smashing Magazine about a year ago, but it was never published there. The ideas in it are still valid, so I am publishing it at least on my blog.

Terms like XSS, SQL Injection or CSRF are well known to more experienced web-application developers. The description of these and other attacks is usually easy to understand and the defense against them is straightforward. Unfortunately, there are still lots of things that can be done wrong. This article points out the usual mistakes made while securing an application. It is written primarily for PHP developers, but most concepts are also valid in other programming languages.

Cross-Site Scripting

The defense against XSS is the easiest one, right? Just use htmlspecialchars somewhere and you are safe. Well, not really.

First of all, you also have to specify the page encoding in the charset parameter of the Content-Type HTTP header (you can use the default_charset configuration directive in PHP). Otherwise the attacker can trick the browser into rendering the page in UTF-7 (by displaying it inside a frame on the attacker's own page that uses this encoding), where harmless-looking strings like +ADw-script+AD4- become dangerous. It is also important to use an encoding that contains all characters (UTF-8 in particular; avoid Latin-1), otherwise browsers will send HTML entities for characters entered in a form that the encoding cannot represent.
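A minimal sketch of this point, assuming a plain PHP page without a templating system. The helper name h is my own shorthand for illustration, not something PHP provides.

```php
<?php
// Declare the encoding explicitly so the browser never has to guess it
// (guessing is what makes the UTF-7 trick possible).
if (!headers_sent()) {
	header('Content-Type: text/html; charset=utf-8');
}

// Hypothetical helper: escape a value for HTML text or a quoted attribute.
// The encoding passed here should match the charset sent in the header.
function h(string $value): string {
	return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

echo '<p>', h('<script>alert(1)</script>'), "</p>\n";
```

Passing the encoding to htmlspecialchars explicitly keeps the escaping and the declared charset in agreement even if the server configuration changes.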

Next, it is vital to escape data just before outputting it to HTML, and not on input, for three reasons.

  1. You may want to use the data in some other context. Then you do not need to unescape it.
  2. The data may be stored by some other application. For example, you fix a typo in a comment through Adminer and forget to escape some special character by hand. The page may then fail to display at all if it is sent as XHTML.
  3. The most important reason is simple: the escaped data are longer. If you store data in varchar(20) and a user enters 15 characters, some of them special, then the escaped value may not fit. Now you have two options: inform the user that the maximum allowed length is 20 characters (which she adhered to) or silently truncate the data (the default behavior of MySQL), which may result in invalid XHTML again.

Excuses for escaping on input, like “a new programmer comes in to the project, and forgets to sanitize data before printing it out”, are wrong because data must not be escaped twice. If you escape data on input, you can no longer escape it on output, and printing data without escaping is an anti-pattern.
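The length and double-escaping problems can be seen in a few lines. This is only an illustration; the sample comment text is made up.

```php
<?php
$comment = 'Tom & Jerry <3'; // 14 characters as typed by the user

// Escaping on input: the value grows and no longer fits varchar(20).
$escapedOnInput = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');
var_dump(strlen($comment));        // int(14)
var_dump(strlen($escapedOnInput)); // int(21)

// Escaping again on output corrupts what the user sees.
$doubleEscaped = htmlspecialchars($escapedOnInput, ENT_QUOTES, 'UTF-8');
var_dump($doubleEscaped); // string(29) "Tom &amp;amp; Jerry &amp;lt;3"
```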

The problem with manually applying htmlspecialchars (or its equivalent) is that it can be easily forgotten. Not only echo "?id=$_GET[id]" but also echo $_SERVER["PHP_SELF"] must be escaped, which may not be evident (the URL can contain special characters, for example when the Apache directive AcceptPathInfo is enabled, which is the default). Thus, the best defense against XSS is not to escape data by hand at all but to use a templating system that escapes data for us. Even the classic Smarty supports this feature. The best templating systems provide context-sensitive escaping, which can safely escape data also in JavaScript or styles embedded in HTML.

Another common mistake in securing against XSS is using strip_tags on user input. One problem with this function is that it does nothing with the & character, which may result in invalid XHTML. Another problem is that it makes it hard for a user to write the < character itself. Using the second parameter of this function is even worse, because it keeps not only the specified tags but also all their attributes, including JavaScript event handlers. In short, this function was not meant to escape user input; it can be used for retrieving the text from an XML document. If you want to sanitize code entered in a WYSIWYG editor, use HTML Purifier.
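A short demonstration of these strip_tags problems, using a made-up input string:

```php
<?php
$input = '<b onmouseover="alert(1)">bold</b> <i>italic</i> Tom & Jerry';

// Only <b> is allowed, yet its event-handler attribute survives untouched
// and the bare ampersand is not turned into an entity.
$stripped = strip_tags($input, '<b>');
var_dump($stripped);
// string(53) "<b onmouseover="alert(1)">bold</b> italic Tom & Jerry"

// Escaping keeps everything the user typed and makes it inert.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
```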

If you have a rock-solid defense against XSS, then disable the reflected XSS filter in IE8 by sending the X-XSS-Protection: 0 HTTP header (otherwise parts of your page can be removed by a malicious user).

To sum this up: specify the charset, do not escape data for output on input, use the auto-escaping feature of your templating system, avoid strip_tags.

SQL Injection

The defense against SQL Injection is even easier than XSS, right? Just use prepared statements with bound variables and you are safe. Well, mostly.

The problem is that variable binding can be used only for data, so ORDER BY ? does not work. Binding LIMIT ? or OFFSET ? is also tricky because PDOStatement::execute passes data as strings. You can bind a number with PDOStatement::bindValue and its third parameter, but it is usually easier to simply use intval (or an equivalent). If you want to specify a column, then use a whitelist:

$allowed_orders = array("id" => true, "created" => true);

$query = "SELECT ...";
// append ORDER BY only for whitelisted column names
if (isset($_GET["order"], $allowed_orders[$_GET["order"]])) {
	$query .= " ORDER BY $_GET[order]";
}
// intval guarantees that only a number is concatenated
if (isset($_GET["limit"])) {
	$query .= " LIMIT " . intval($_GET["limit"]);
}
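For comparison, here is a sketch of the bindValue alternative mentioned above. The table name article, the default limit of 20 and both function names are assumptions made for the example; only PDO::prepare, PDOStatement::bindValue and PDO::PARAM_INT are real PDO API.

```php
<?php
// Hypothetical helper: builds the SQL text. Only whitelisted column names
// are ever concatenated into it; the limit stays a placeholder.
function build_list_query(array $get, array $allowedOrders): string {
	$query = 'SELECT id, created FROM article'; // assumed table and columns
	if (isset($get['order']) && is_string($get['order'])
		&& isset($allowedOrders[$get['order']])
	) {
		$query .= ' ORDER BY ' . $get['order'];
	}
	return $query . ' LIMIT :limit';
}

// Hypothetical helper: runs the query with the limit bound as an integer.
function fetch_list(PDO $pdo, array $get): array {
	$allowed = array('id' => true, 'created' => true);
	$stmt = $pdo->prepare(build_list_query($get, $allowed));
	$limit = isset($get['limit']) ? (int) $get['limit'] : 20;
	// The third argument makes PDO pass the value as an integer, not a string.
	$stmt->bindValue(':limit', $limit, PDO::PARAM_INT);
	$stmt->execute();
	return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

The is_string check is there because request parameters can also arrive as arrays, which are not valid array keys.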

If you are stuck with mysql_query and cannot use prepared statements, then use mysql_real_escape_string and always put the result inside apostrophes (use intval for numbers). You also need to specify the encoding with the mysql_set_charset function because different encodings escape different characters. Avoid addslashes because it does not respect the server configuration.

Do not forget to disable magic_quotes_gpc or you will encounter backslashes in stored data.

To sum this up: disable magic_quotes_gpc, use variable binding wherever possible, use intval for numbers and whitelists for columns, specify the encoding.

Security by Obscurity

To avoid Security by Obscurity, it is enough to hash the password, right? Well, not only.

First of all, you also need to add a random salt to the password before hashing. Generating a good random string is not easy: the best is to use /dev/random, and md5(uniqid(mt_rand(), true)) is also pretty good. However, there is a bunch of other things you have to do, which are thoroughly analyzed in the Month of PHP Security article.
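PHP versions released after this article was written (5.5 and later) ship password_hash and password_verify, which generate the salt for you. A minimal sketch, assuming such a PHP version is available; the sample password is made up:

```php
<?php
// password_hash() picks a random salt itself and embeds it, together with
// the algorithm and cost, in the returned string.
$hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);

// Store $hash; at login, check the submitted password against it.
var_dump(password_verify('correct horse battery staple', $hash)); // bool(true)
var_dump(password_verify('wrong guess', $hash));                  // bool(false)
```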

The password is not the only secret that must be protected. All tokens are secret too, and even the great MOPS article recommends storing the password-regeneration token in plain text, which is Security by Obscurity. Always store only a hash of the token in the database or in the session variable to avoid Security by Obscurity.
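Storing only the hash of a token could look roughly like this. Both function names are hypothetical, and random_bytes requires PHP 7 or later.

```php
<?php
// Hypothetical helper: returns the token to send to the user and the hash
// that is the only thing written to the database.
function create_reset_token(): array {
	$token = bin2hex(random_bytes(16)); // 32 hex characters
	return array($token, hash('sha256', $token));
}

// Hypothetical helper: hash the submitted token and compare in constant time.
function is_valid_reset_token(string $submitted, string $storedHash): bool {
	return hash_equals($storedHash, hash('sha256', $submitted));
}

list($token, $storedHash) = create_reset_token();
var_dump(is_valid_reset_token($token, $storedHash)); // bool(true)
```

An attacker who reads the database sees only the hash, which cannot be submitted in place of the token.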

If you need to store some sensitive information that you will later need to retrieve in its original form (like a credit card number), then use asymmetric cryptography. The credit card number is encrypted with a public key and decrypted with a private key secured by a password. The point is that the password does not need to be stored anywhere (except in your head), so even if an attacker gained access to the database and the source code, he would not get the sensitive data. Delete the encrypted data as soon as it is no longer required.

To sum this up: use an external library for saving passwords, store hashes of tokens, use asymmetric cryptography for sensitive data that you need to restore.
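A sketch of this scheme with PHP's openssl extension, assuming the extension is installed and an RSA key pair was generated beforehand (for example with openssl_pkey_new and openssl_pkey_export). The function names are hypothetical, and RSA can only encrypt short values such as a card number directly.

```php
<?php
// Hypothetical helper: encrypt with the public key. No secret is needed,
// so this is all the web application itself has to be able to do.
function encrypt_card(string $cardNumber, string $publicKeyPem): string {
	$ok = openssl_public_encrypt(
		$cardNumber, $encrypted, $publicKeyPem, OPENSSL_PKCS1_OAEP_PADDING
	);
	if (!$ok) {
		throw new RuntimeException('Encryption failed');
	}
	return base64_encode($encrypted); // safe to store in a text column
}

// Hypothetical helper: decrypt with the passphrase-protected private key.
// The passphrase is typed in by a person and never stored on the server.
function decrypt_card(string $stored, string $privateKeyPem, string $passphrase): string {
	$key = openssl_pkey_get_private($privateKeyPem, $passphrase);
	if ($key === false) {
		throw new RuntimeException('Wrong passphrase or invalid key');
	}
	$ok = openssl_private_decrypt(
		base64_decode($stored), $decrypted, $key, OPENSSL_PKCS1_OAEP_PADDING
	);
	if (!$ok) {
		throw new RuntimeException('Decryption failed');
	}
	return $decrypted;
}
```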

Cross-Site Request Forgery

Defense against CSRF is simple, right? Just send some token in all forms and you are safe. Well, not really.

First of all, you do not want to send the token with GET forms because it can leak through the Referer HTTP header. Send the token only in forms that perform some operation, which should always be sent by POST. The token does not need to be sent in forms that can be submitted by anyone (for example the registration form). On the other hand, some forms require the token even for anonymous users, for example polls. Send the token whenever the form is specific to the current user.

CSRF is not only an attack on forms. If you pass private data in executable JavaScript files (var contacts = [ '...' ]), then the attacker can gain access to this data by including the file from his own page. Use AJAX for transferring the data instead, or include it in the HTML.

Never base the CSRF defense on the Referer HTTP header as it may be filtered by firewalls. The session.referer_check PHP configuration directive is useless.

A common mistake in CSRF defense is to keep only one valid token for each operation. This makes it impossible to use the application in several browser tabs or windows:

$operation = $_SERVER["REQUEST_URI"];

// avoid this or the application will not work in several browser tabs
if (!$_POST) {
	$token = generate_token();
	$_SESSION["token"][$operation] = md5($token);
} elseif (md5($_POST["token"]) == $_SESSION["token"][$operation]) {
	// perform the operation here
}

// use this instead
if (!$_POST) {
	$token = generate_token();
	$_SESSION["token"][$operation][md5($token)] = true;
} elseif (isset($_SESSION["token"][$operation][md5($_POST["token"])])) {
	// perform the operation here
}

The token should be random. Anything else would allow an attacker to reproduce it.
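The generate_token function used above is not defined in this article. One possible body, assuming PHP 7 or later for random_bytes, together with a small simulation of two tabs in which a plain array stands in for $_SESSION:

```php
<?php
// One possible implementation (an assumption, not the only option).
function generate_token(): string {
	return bin2hex(random_bytes(16)); // 128 bits from the system CSPRNG
}

// Two tabs open the same form; $session stands in for $_SESSION here.
$session = array();
$tabA = generate_token();
$session['token']['/edit'][md5($tabA)] = true;
$tabB = generate_token();
$session['token']['/edit'][md5($tabB)] = true;

// Both submissions are accepted because every issued token stays valid.
var_dump(isset($session['token']['/edit'][md5($tabA)])); // bool(true)
var_dump(isset($session['token']['/edit'][md5($tabB)])); // bool(true)
```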

To sum this up: generate and verify random token for all POST forms specific for a user, do not send data in JavaScript files, do not unset the token if it can still be in use in another browser tab.


ClickJacking

The defense against ClickJacking is simple, right? Use some frame-busting JavaScript and you are safe. Well, absolutely not.

The problem with JavaScript is that it can be programmatically disabled in IE by <iframe security="restricted">. If you use the opposite approach and allow an action by JavaScript only if the user is not inside a frame, then it will not work for the most paranoid users, who disable JavaScript. The correct defense is to send the X-Frame-Options HTTP header, which forbids using a page inside a frame in modern browsers. You can still use JavaScript for older browsers, but do not force users with the newest browsers to enable JavaScript just to secure your site.
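Sending the header from PHP is a one-liner. The sketch below wraps it in a hypothetical helper so the choice between the two common values is explicit:

```php
<?php
// Hypothetical helper: returns the header line to send for a page.
// DENY forbids all framing; SAMEORIGIN still allows frames on your own site.
function frame_options_header(bool $allowSameOrigin = false): string {
	return 'X-Frame-Options: ' . ($allowSameOrigin ? 'SAMEORIGIN' : 'DENY');
}

// Headers must be sent before any output.
if (!headers_sent()) {
	header(frame_options_header());
}
```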

To sum this up: send X-Frame-Options on pages specific for the current user.


Conclusion

Use the proper procedures to defend even against the simplest attacks. Prefer the simplest techniques so that you do not forget the defense anywhere; automatic escaping in HTML templates and variable binding in SQL are good examples.

There are many other things to take care of in securing web applications: proper initialization of variables, risks of remote execution, session attacks and so on. Defense against them can also be screwed up easily. That can be covered in a future article.


Jakub Vrána, Dobře míněné rady (Well-Meant Advice), 9 November 2011


hacafrakus:

I use my own database layer; to sanitize SQL I wrote this method:

Jakub Vrána:

Answer these questions for yourself:

1. How will the method behave if the database has some non-standard default encoding or configuration? Hint: mysql_real_escape_string().

2. How will the method behave when it receives an object implementing the __toString() method?

3. What happens if someone sends me 'SELECT "%s"'?

4. Will I learn from the error log that a query failed? What happens to the thrown exception?

hacafrakus:

1. I set the encoding for all tables and my provider has a standard configuration.
2. Nobody will send one, because it is my own website and I am the only one who writes code for it.
3. Do you mean as a parameter? Then probably nothing.
4. Uncaught exceptions are caught using set_exception_handler().
But otherwise yes, I admit that it is not a good solution as far as portability goes. On point 3, though, I stand by my claim that if it is a parameter, nothing happens (I tried it), and it will not get into format because of my answer 2.


© 2005-2023 Jakub Vrána. Published texts may be reprinted only with the author's permission. Code samples may be used, with attribution of the author and the URL of this website, without further restrictions (Creative Commons). The scripts assume the settings magic_quotes_gpc=Off, magic_quotes_runtime=Off, error_reporting=E_ALL & ~E_NOTICE and expect a previous call to mysql_set_charset. The scripts should work in PHP >= 4.3 and PHP >= 5.0.