Validate and Sanitize Data in PHP

Sanitizing and validating data is an important step in maintaining a high standard of data security. Since PHP 5.2.0, both tasks have been made significantly easier by the introduction of the Filter extension.

Why Sanitize and Validate?

Web applications face a range of data security threats from user input and third-party data.

Some data security threats:

Cross-Site Scripting (XSS)

A form of code injection where a script is injected onto a website from a completely different website. It is widely regarded as one of the most common security vulnerabilities online. Two very prominent examples of this technique are the Stalk Daily and Mikeyy Twitter worms of 2009, which used poorly sanitized inputs to launch JavaScript via an “infected” Twitter web interface.

SQL Injection

SQL injection is another form of code injection and is often cited as the second most common security vulnerability online. A crafted input is used to carry out one of numerous exploits, including (but not limited to) exposing or gaining unauthorized access to data, altering data inside a database, or injecting code to be rendered or executed within a website, thereby breaking or altering the website.

How an XSS Attack Works

A cross-site scripting vulnerability allows an attack in which a user submits data to your website that includes a client-side script (generally JavaScript). If your script outputs this data to a web page without filtering it, the injected script will be executed in the browser of anyone who views that page.

What Are Sanitization and Validation?

Let's see how the PHP manual describes the terms “Sanitization” and “Validation”.

Validation is used to validate or check if the data meets certain qualifications. For example, passing in FILTER_VALIDATE_EMAIL will determine if the data is a valid email address, but will not change the data itself.

Sanitization will sanitize the data, so it may alter it by removing undesired characters. For example, passing in FILTER_SANITIZE_EMAIL will remove characters that are inappropriate for an email address to contain. That said, it does not validate the data.
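
The difference between the two can be sketched with the e-mail filters mentioned above. This is a minimal illustration, not a complete input-handling routine; the sample address is made up.

```php
<?php
// Made-up input containing characters that do not belong in an address.
$input = 't(e)st@example.com';

// Sanitizing changes the value: disallowed characters are stripped.
$clean = filter_var($input, FILTER_SANITIZE_EMAIL);

// Validating only checks: it returns the value on success or false on failure.
$isValidBefore = filter_var($input, FILTER_VALIDATE_EMAIL) !== false;
$isValidAfter  = filter_var($clean, FILTER_VALIDATE_EMAIL) !== false;

var_dump($clean, $isValidBefore, $isValidAfter);
```

Sanitizing first and validating the result afterwards is a common pattern, since a sanitized value is not guaranteed to be valid.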

How to Use Filters in PHP?

PHP filters for validation and sanitization are applied by passing at least two values, the data and the filter to apply, to the Filter extension function filter_var.

Here is how to use the sanitize filter for an integer:

$value = '123abc456def';
echo filter_var($value, FILTER_SANITIZE_NUMBER_INT);

In the example above, the variable $value is passed through the Filter extension function filter_var using the FILTER_SANITIZE_NUMBER_INT filter. The result is as follows:

123456

Commonly Used Filters

The list below includes only filters that come standard with PHP 5.2.0+ installations. Custom filters and those added by custom extensions are not included here.

1. FILTER_VALIDATE_BOOLEAN

It checks whether the data passed to the filter represents a boolean value. It returns TRUE for "1", "true", "on" and "yes", and FALSE for everything else, including non-boolean values. The script below echoes “TRUE” for the example data $value01 but “FALSE” for the example data $value02:

$value01 = TRUE;
if(filter_var($value01,FILTER_VALIDATE_BOOLEAN)) {
 echo 'TRUE';
} else {
 echo 'FALSE';
}
echo '<br /><br />';
$value02 = 'maybe';
if(filter_var($value02,FILTER_VALIDATE_BOOLEAN)) {
 echo 'TRUE';
} else {
 echo 'FALSE';
}
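
By default the filter cannot distinguish "false" from "not a boolean at all". If that distinction matters, the FILTER_NULL_ON_FAILURE flag can be passed as the third argument. A minimal sketch with made-up inputs:

```php
<?php
// Candidate inputs, as they might arrive from a form (form data is always strings).
$inputs = ['yes', 'off', 'maybe'];

$results = [];
foreach ($inputs as $input) {
    // With FILTER_NULL_ON_FAILURE: true or false for recognized values, null otherwise.
    $results[$input] = filter_var($input, FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);
}

var_dump($results);
```

Here 'yes' yields true, 'off' yields false, and 'maybe' yields null.
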

2. FILTER_VALIDATE_INT

It checks whether the data passed to the filter is a valid integer value. The script below echoes “TRUE” for $value01 but “FALSE” for $value02, which contains a decimal point:

$value01 = '123456';
if(filter_var($value01,FILTER_VALIDATE_INT)) {
 echo 'TRUE';
} else {
 echo 'FALSE';
}
echo '<br /><br />';
$value02 = '123.456';
if(filter_var($value02,FILTER_VALIDATE_INT)) {
 echo 'TRUE';
} else {
 echo 'FALSE';
}
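
One caveat with the if() pattern above: on success the filter returns the integer itself, so a valid input of '0' comes back as 0 and is treated as false. A sketch of a stricter check, plus an optional range check (the bounds are illustrative values, not from the article):

```php
<?php
// '0' is a valid integer, but the filter returns int(0), which is falsy.
$zero = filter_var('0', FILTER_VALIDATE_INT);
$zeroIsValid = ($zero !== false); // strict comparison avoids the trap

// Optional range check via the "options" array (illustrative bounds).
$age = filter_var('150', FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 0, 'max_range' => 120],
]);

var_dump($zero, $zeroIsValid, $age);
```

Comparing against false with !== is the safer habit for every validation filter that returns the filtered value.
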

3. FILTER_VALIDATE_IP

It checks whether the data passed to the filter is a potentially valid IP address. It does not check whether the IP address would resolve, only that it fits the required data structure for IP addresses.

$value01 = '192.168.0.1';
if(filter_var($value01,FILTER_VALIDATE_IP)) {
 echo 'TRUE';
} else {
 echo 'FALSE';
}
echo '<br /><br />';
$value02 = '1.2.3.4.5.6.7.8.9';
if(filter_var($value02,FILTER_VALIDATE_IP)) {
 echo 'TRUE';
} else {
 echo 'FALSE';
}
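
The IP filter also accepts flags that narrow what counts as valid. The sketch below reuses the article's sample address; which flags are appropriate depends on the application.

```php
<?php
$address = '192.168.0.1';

// Restrict the check to one address family.
$isIpv4 = filter_var($address, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
$isIpv6 = filter_var($address, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) !== false;

// Reject private ranges such as 192.168.0.0/16.
$isPublicIpv4 = filter_var(
    $address,
    FILTER_VALIDATE_IP,
    FILTER_FLAG_IPV4 | FILTER_FLAG_NO_PRIV_RANGE
) !== false;

var_dump($isIpv4, $isIpv6, $isPublicIpv4);
```
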

4. FILTER_VALIDATE_URL

It checks whether the data passed to the filter is a potentially valid URL. It does not verify that the URL would resolve, only that it fits the required data structure for URLs.

$value01 = 'http://www.ansoncheung.tk';
if(filter_var($value01,FILTER_VALIDATE_URL)) {
 echo 'TRUE';
} else {
 echo 'FALSE';
}
echo '<br /><br />';
$value02 = 'ansoncheung';
if(filter_var($value02,FILTER_VALIDATE_URL)) {
 echo 'TRUE';
} else {
 echo 'FALSE';
}
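
Because the filter only checks structure, it accepts schemes other than http and https (ftp:// or mailto:, for example). If a link is going to be printed into a page, it may be worth restricting the scheme as well. A minimal sketch, assuming PHP 7+ for the type declarations; isSafeHttpUrl is a hypothetical helper name, not part of PHP:

```php
<?php
// Hypothetical helper: accept only well-formed http(s) URLs.
function isSafeHttpUrl(string $url): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // Compare the scheme against a small allowlist.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true);
}

var_dump(isSafeHttpUrl('http://www.ansoncheung.tk'));
var_dump(isSafeHttpUrl('ftp://example.com/file'));
var_dump(isSafeHttpUrl('ansoncheung'));
```
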

5. FILTER_SANITIZE_STRING

This filter strips HTML tags from a string and, by default, HTML-encodes single and double quotes. For example, it will remove tags like <script> or <strong> from an input string.

$value = "<script>alert('TROUBLE HERE');</script>";
echo filter_var($value, FILTER_SANITIZE_STRING);

The script removes the tags and encodes the quotes, so a browser displays alert('TROUBLE HERE'); while the page source contains the following:

alert(&#39;TROUBLE HERE&#39;);
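
Note that FILTER_SANITIZE_STRING is deprecated as of PHP 8.1. The sketch below shows two possible stand-ins, depending on whether the tags should be removed or merely made harmless; neither is an exact equivalent of the old filter.

```php
<?php
$value = "<script>alert('TROUBLE HERE');</script>";

// Option 1: remove the tags and keep the text between them.
$stripped = strip_tags($value);

// Option 2: keep everything, but encode the special characters so the
// browser shows the markup as text instead of running it.
$encoded = filter_var($value, FILTER_SANITIZE_FULL_SPECIAL_CHARS);

var_dump($stripped, $encoded);
```
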

6. FILTER_SANITIZE_ENCODED

Many developers use PHP’s urlencode() function to handle their URL encoding. However, this filter does essentially the same thing, except that it encodes a space as %20 (as rawurlencode() does) rather than as +.

$value = "<script>alert('TROUBLE HERE');</script>";
echo filter_var($value, FILTER_SANITIZE_ENCODED);

The script encodes the punctuation, spaces, and brackets, then returns the following.

%3Cscript%3Ealert%28%27TROUBLE%20HERE%27%29%3B%3C%2Fscript%3E

7. FILTER_SANITIZE_SPECIAL_CHARS

This filter will, by default, HTML-encode special characters like quotes, ampersands, and angle brackets (in addition to characters with an ASCII value less than 32). The effect is not obvious when the result is rendered in a browser, because the HTML-encoded characters are interpreted and displayed as the originals, but if you look at the page source you’ll see the encoding at work:

$value = "<script>alert('TROUBLE HERE');</script>";
echo filter_var($value, FILTER_SANITIZE_SPECIAL_CHARS);

It converts the special characters into their HTML-encoded selves:

&#60;script&#62;alert(&#39;TROUBLE HERE&#39;);&#60;/script&#62;

8. FILTER_SANITIZE_EMAIL

It removes any characters that are invalid in e-mail addresses (like parentheses, angle brackets, colons, etc.). For example, let’s say you accidentally added parentheses around a letter of your e-mail address (don’t ask how, use your imagination):

$value = 't(e)st@example.com';
echo filter_var($value, FILTER_SANITIZE_EMAIL);

The result would be as follows.

test@example.com

9. FILTER_SANITIZE_URL

Similar to the e-mail address sanitize filter, this filter removes any characters that are invalid in a URL (such as non-ASCII characters).

$value = 'http://www.webstudy®online.tk';
echo filter_var($value, FILTER_SANITIZE_URL);

It removes the unwanted “®” and you get your handsome URL back:

http://www.webstudyonline.tk

10. FILTER_SANITIZE_NUMBER_INT

This filter is similar to FILTER_VALIDATE_INT, but instead of simply checking whether the value is an integer, it removes every character except digits and the plus and minus signs. Handy, indeed, for pesky spambots and tricksters in some input forms:

$value01 = '123abc456def';
echo filter_var($value01, FILTER_SANITIZE_NUMBER_INT);
echo '<br />';
$value02 = '1.2.3.4.5.6.7.8.9';
echo filter_var($value02, FILTER_SANITIZE_NUMBER_INT);

Again, all those silly letters and decimals get thrown right out:

123456
123456789

11. FILTER_SANITIZE_NUMBER_FLOAT

However, if the value you have is a decimal, you should use this filter instead. One of the main reasons why “FILTER_SANITIZE_NUMBER_FLOAT” and “FILTER_SANITIZE_NUMBER_INT” are separate filters is to allow for this via the special flag “FILTER_FLAG_ALLOW_FRACTION”, which is passed to filter_var as a third value:

$value = '1.23';
echo filter_var($value, FILTER_SANITIZE_NUMBER_FLOAT, FILTER_FLAG_ALLOW_FRACTION);

The result would keep the decimal.

1.23
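
To make the effect of the flags concrete, the sketch below runs one made-up price string through the filter three ways. Without FILTER_FLAG_ALLOW_FRACTION the decimal point is stripped like any other non-digit.

```php
<?php
// Made-up input; single quotes keep PHP from interpolating the dollar sign.
$value = '$1,234.56';

// No flags: only digits and +/- signs survive.
$noFlags = filter_var($value, FILTER_SANITIZE_NUMBER_FLOAT);

// Keep the decimal point.
$withFraction = filter_var($value, FILTER_SANITIZE_NUMBER_FLOAT, FILTER_FLAG_ALLOW_FRACTION);

// Keep the decimal point and the thousands separator.
$withBoth = filter_var(
    $value,
    FILTER_SANITIZE_NUMBER_FLOAT,
    FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_THOUSAND
);

var_dump($noFlags, $withFraction, $withBoth);
```

The results are strings, so cast the value with (float) if a numeric type is needed.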

Other Methods of Sanitizing Data with PHP

12. htmlspecialchars

This PHP function converts five special characters into their corresponding HTML entities:

‘&’ (ampersand) becomes ‘&amp;’
‘"’ (double quote) becomes ‘&quot;’ when ENT_NOQUOTES is not set.
‘'’ (single quote) becomes ‘&#039;’ only when ENT_QUOTES is set (the default since PHP 8.1).
‘<’ (less than) becomes ‘&lt;’
‘>’ (greater than) becomes ‘&gt;’

It is used like any other PHP string function:

echo htmlspecialchars($string);
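
Since the default flags differ between PHP versions, it can be safer to pass them explicitly. A minimal sketch, assuming PHP 7+ for the type declarations; the helper name e() and the sample comment are made up for illustration:

```php
<?php
// Hypothetical helper: escape a value for output in HTML text or attributes.
function e(string $text): string
{
    // ENT_QUOTES encodes both quote styles; ENT_SUBSTITUTE replaces
    // invalid byte sequences instead of returning an empty string.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Made-up user comment containing markup and an event handler.
$comment = '<a href="#" onclick=\'steal()\'>Tom & Jerry</a>';
echo e($comment);
```

Escaping at the point of output, as here, keeps the stored data unchanged and neutralizes the XSS scenario described earlier.
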

13. htmlentities

htmlentities also converts characters into their corresponding HTML entities. The big difference is that ALL characters that have an HTML entity equivalent are converted, not just the five above. It is sometimes suggested as a way of obfuscating e-mail addresses from address-harvesting bots, as not all of them are programmed to decode entities; note, however, that plain ASCII letters, ‘@’ and ‘.’ have no named entities and are left unchanged.

echo htmlentities($string);

14. mysql_real_escape_string

This MySQL function helps protect against SQL injection attacks. It is considered a best practice (or even a mandatory practice) to pass all data that is being sent to a MySQL query through this function. It escapes any special characters that could be problematic and would otherwise let little Bobby Tables destroy yet another school’s student database. Note that the mysql extension was deprecated in PHP 5.5 and removed in PHP 7.0, so this function is only available on older installations.

$query = "SELECT * FROM `table` WHERE value='" . mysql_real_escape_string($string) . "' LIMIT 1,1";
$runQuery = mysql_query($query);
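
On current PHP versions, prepared statements are the usual way to achieve the same protection. The sketch below is an illustration rather than a drop-in replacement: it assumes the pdo_sqlite extension is available and uses an in-memory SQLite database in place of a MySQL server so that it is self-contained. The table and data are made up.

```php
<?php
// Assumption: pdo_sqlite is installed; the in-memory database stands in
// for a real MySQL connection such as new PDO('mysql:host=...;dbname=...').
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Made-up table and data.
$pdo->exec('CREATE TABLE students (name TEXT)');
$pdo->exec("INSERT INTO students (name) VALUES ('Alice')");

// Hostile input in the spirit of little Bobby Tables.
$string = "Robert'); DROP TABLE students;--";

// The placeholder keeps the value separate from the SQL text.
$stmt = $pdo->prepare('SELECT name FROM students WHERE name = :name LIMIT 1');
$stmt->execute([':name' => $string]);
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

// The table is intact because the input was never parsed as SQL.
$count = (int) $pdo->query('SELECT COUNT(*) FROM students')->fetchColumn();

var_dump($rows, $count);
```

Because the value is bound rather than concatenated, no manual escaping is needed.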
