CI_Input Class Reference

Public Member Functions

 CI_Input ()
 Constructor.
 _sanitize_globals ()
 Sanitize Globals.
 _clean_input_data ($str)
 Clean Input Data.
 _clean_input_keys ($str)
 Clean Keys.
 _fetch_from_array (&$array, $index= '', $xss_clean=FALSE)
 Fetch from array.
 get ($index= '', $xss_clean=FALSE)
 Fetch an item from the GET array.
 post ($index= '', $xss_clean=FALSE)
 Fetch an item from the POST array.
 get_post ($index= '', $xss_clean=FALSE)
 Fetch an item from either the GET array or the POST array.
 cookie ($index= '', $xss_clean=FALSE)
 Fetch an item from the COOKIE array.
 server ($index= '', $xss_clean=FALSE)
 Fetch an item from the SERVER array.
 ip_address ()
 Fetch the IP Address.
 valid_ip ($ip)
 Validate IP Address.
 user_agent ()
 User Agent.
 filename_security ($str)
 Filename Security.
 xss_clean ($str, $is_image=FALSE)
 XSS Clean.
 xss_hash ()
 Random Hash for protecting URLs.
 _remove_invisible_characters ($str)
 Remove Invisible Characters.
 _compact_exploded_words ($matches)
 Compact Exploded Words.
 _sanitize_naughty_html ($matches)
 Sanitize Naughty HTML.
 _js_link_removal ($match)
 JS Link Removal.
 _js_img_removal ($match)
 JS Image Removal.
 _convert_attribute ($match)
 Attribute Conversion.
 _html_entity_decode_callback ($match)
 HTML Entity Decode Callback.
 _html_entity_decode ($str, $charset='UTF-8')
 HTML Entities Decode.
 _filter_attributes ($str)
 Filter Attributes.

Public Attributes

 $use_xss_clean = FALSE
 $xss_hash = ''
 $ip_address = FALSE
 $user_agent = FALSE
 $allow_get_array = FALSE
 $never_allowed_str
 $never_allowed_regex

Detailed Description

Definition at line 29 of file Input.php.


Member Function Documentation

CI_Input::_clean_input_data ( str  ) 

Clean Input Data.

This is a helper function. It escapes data and standardizes newline characters to \n.

private

Parameters:
string 
Returns:
string

Definition at line 168 of file Input.php.

References _clean_input_keys(), and xss_clean().

Referenced by _sanitize_globals().

00169         {
00170                 if (is_array($str))
00171                 {
00172                         $new_array = array();
00173                         foreach ($str as $key => $val)
00174                         {
00175                                 $new_array[$this->_clean_input_keys($key)] = $this->_clean_input_data($val);
00176                         }
00177                         return $new_array;
00178                 }
00179 
00180                 // We strip slashes if magic quotes is on to keep things consistent
00181                 if (get_magic_quotes_gpc())
00182                 {
00183                         $str = stripslashes($str);
00184                 }
00185 
00186                 // Should we filter the input data?
00187                 if ($this->use_xss_clean === TRUE)
00188                 {
00189                         $str = $this->xss_clean($str);
00190                 }
00191 
00192                 // Standardize newlines
00193                 if (strpos($str, "\r") !== FALSE)
00194                 {
00195                         $str = str_replace(array("\r\n", "\r"), "\n", $str);
00196                 }
00197                 
00198                 return $str;
00199         }
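
The newline step above can be tried in isolation. The following is a minimal sketch, assuming a standalone function named standardize_newlines() (a hypothetical name, not part of CI_Input); it leaves out the magic-quotes and XSS steps.

```php
<?php
// Hypothetical standalone helper (not part of CI_Input): mirrors only the
// newline-standardization step of _clean_input_data().
function standardize_newlines($str)
{
    // Skip the replacement when there is no carriage return at all.
    if (strpos($str, "\r") === false) {
        return $str;
    }

    // "\r\n" is listed before "\r" so that a Windows line ending
    // collapses to a single "\n" rather than two.
    return str_replace(array("\r\n", "\r"), "\n", $str);
}

// json_encode() is used here only to make the line endings visible.
echo json_encode(standardize_newlines("a\r\nb\rc\nd")), "\n";
```

The order of the search strings matters: with "\r" first, every "\r\n" pair would turn into two newlines.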

CI_Input::_clean_input_keys ( str  ) 

Clean Keys.

This is a helper function. To prevent malicious users from trying to exploit keys we make sure that keys are only named with alpha-numeric text and a few other items.

private

Parameters:
string 
Returns:
string

Definition at line 214 of file Input.php.

Referenced by _clean_input_data().

00215         {
00216                  if ( ! preg_match("/^[a-z0-9:_\/-]+$/i", $str))
00217                  {
00218                         exit('Disallowed Key Characters.');
00219                  }
00220 
00221                 return $str;
00222         }
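
To see which keys pass the check, the same pattern can be wrapped in a function that returns a boolean instead of calling exit(). This is a sketch only; is_allowed_key() is a hypothetical name introduced for illustration.

```php
<?php
// Hypothetical helper: same pattern as _clean_input_keys(), but it returns
// a boolean instead of calling exit(), so it can be tried in isolation.
function is_allowed_key($key)
{
    return preg_match('/^[a-z0-9:_\/-]+$/i', $key) === 1;
}

var_dump(is_allowed_key('user_name'));    // letters and underscore
var_dump(is_allowed_key('item:1/sub-2')); // colon, slash and hyphen
var_dump(is_allowed_key('bad key'));      // contains a space
var_dump(is_allowed_key('$Version'));     // contains a dollar sign
```

The last case is why _sanitize_globals() unsets the $Version, $Path and $Domain cookies before cleaning $_COOKIE: their names would otherwise trip this check.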

CI_Input::_compact_exploded_words ( matches  ) 

Compact Exploded Words.

Callback function for xss_clean() to remove whitespace from things like j a v a s c r i p t

public

Parameters:
array 
Returns:
string

Definition at line 871 of file Input.php.

00872         {
00873                 return preg_replace('/\s+/s', '', $matches[1]).$matches[2];
00874         }
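
A minimal sketch of how such a callback behaves with preg_replace_callback(). The pattern below is an assumption made for illustration (the letters of one word separated by \s*, followed by a non-word character); it is not copied from xss_clean().

```php
<?php
// Standalone copy of the callback body for experimentation.
function compact_exploded_words($matches)
{
    // Group 1: the word, possibly with whitespace between its letters.
    // Group 2: the non-word character that follows it.
    return preg_replace('/\s+/s', '', $matches[1]) . $matches[2];
}

// Assumed pattern: the letters of "javascript" separated by \s*, then \W.
$pattern = '#(' . implode('\s*', str_split('javascript')) . ')(\W)#is';

$input  = 'j a v a s c r i p t:alert(1)';
$output = preg_replace_callback($pattern, 'compact_exploded_words', $input);

echo $output, "\n"; // prints "javascript:alert(1)"
```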

CI_Input::_convert_attribute ( match  ) 

Attribute Conversion.

Used as a callback for XSS Clean

public

Parameters:
array 
Returns:
string

Definition at line 947 of file Input.php.

00948         {
00949                 return str_replace(array('>', '<'), array('&gt;', '&lt;'), $match[0]);
00950         }

CI_Input::_fetch_from_array ( &$  array,
index = '',
xss_clean = FALSE 
)

Fetch from array.

This is a helper function to retrieve values from global arrays

private

Parameters:
array 
string 
bool 
Returns:
string

Definition at line 237 of file Input.php.

References xss_clean().

Referenced by cookie(), get(), post(), and server().

00238         {
00239                 if ( ! isset($array[$index]))
00240                 {
00241                         return FALSE;
00242                 }
00243 
00244                 if ($xss_clean === TRUE)
00245                 {
00246                         return $this->xss_clean($array[$index]);
00247                 }
00248 
00249                 return $array[$index];
00250         }
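
Because a missing index yields FALSE, callers usually need a strict comparison to tell "not submitted" apart from an empty value. A sketch under that assumption, using a hypothetical stand-in without the XSS step:

```php
<?php
// Hypothetical stand-in for _fetch_from_array() without the XSS step.
function fetch_from_array(array &$array, $index = '')
{
    if (!isset($array[$index])) {
        return false;
    }

    return $array[$index];
}

$post = array('name' => 'Ada', 'empty' => '', 'nothing' => null);

var_dump(fetch_from_array($post, 'name'));    // the stored string
var_dump(fetch_from_array($post, 'empty'));   // empty string, not FALSE
var_dump(fetch_from_array($post, 'missing')); // FALSE: key is absent
var_dump(fetch_from_array($post, 'nothing')); // FALSE: isset() treats NULL as unset
```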

CI_Input::_filter_attributes ( str  ) 

Filter Attributes.

Filters tag attributes for consistency and safety

public

Parameters:
string 
Returns:
string

Definition at line 1038 of file Input.php.

Referenced by _js_img_removal(), and _js_link_removal().

01039         {
01040                 $out = '';
01041 
01042                 if (preg_match_all('#\s*[a-z\-]+\s*=\s*(\042|\047)([^\\1]*?)\\1#is', $str, $matches))
01043                 {
01044                         foreach ($matches[0] as $match)
01045                         {
01046                                 $out .= "{$match}";
01047                         }                       
01048                 }
01049 
01050                 return $out;
01051         }
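
The effect of the pattern is that only quoted attributes survive; unquoted ones are dropped. A standalone sketch of the same logic (filter_attributes() here is a free function written for illustration):

```php
<?php
// Standalone copy of the _filter_attributes() logic for experimentation.
function filter_attributes($str)
{
    $out = '';

    // \042 is a double quote and \047 a single quote (octal escapes).
    // The trailing \1 requires the closing quote to match the opening one.
    if (preg_match_all('#\s*[a-z\-]+\s*=\s*(\042|\047)([^\\1]*?)\\1#is', $str, $matches)) {
        foreach ($matches[0] as $match) {
            $out .= $match;
        }
    }

    return $out;
}

// The unquoted onclick attribute is not matched and therefore not kept.
echo filter_attributes(' href="a.html" onclick=evil() title=\'x\''), "\n";
```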

CI_Input::_html_entity_decode ( str,
charset = 'UTF-8' 
)

HTML Entities Decode.

This function is a replacement for html_entity_decode()

In some versions of PHP the native function does not work when UTF-8 is the specified character set, so this gives us a work-around. More info here: http://bugs.php.net/bug.php?id=25670

private

Parameters:
string 
string 
Returns:
string

Definition at line 997 of file Input.php.

Referenced by _html_entity_decode_callback().

00998         {
00999                 if (stristr($str, '&') === FALSE) return $str;
01000 
01001                 // The reason we are not using html_entity_decode() by itself is because
01002                 // while it is not technically correct to leave out the semicolon
01003                 // at the end of an entity most browsers will still interpret the entity
01004                 // correctly.  html_entity_decode() does not convert entities without
01005                 // semicolons, so we are left with our own little solution here. Bummer.
01006 
01007                 if (function_exists('html_entity_decode') && (strtolower($charset) != 'utf-8' OR version_compare(phpversion(), '5.0.0', '>=')))
01008                 {
01009                         $str = html_entity_decode($str, ENT_COMPAT, $charset);
01010                         $str = preg_replace('~&#x(0*[0-9a-f]{2,5})~ei', 'chr(hexdec("\\1"))', $str);
01011                         return preg_replace('~&#([0-9]{2,4})~e', 'chr(\\1)', $str);
01012                 }
01013 
01014                 // Numeric Entities
01015                 $str = preg_replace('~&#x(0*[0-9a-f]{2,5});{0,1}~ei', 'chr(hexdec("\\1"))', $str);
01016                 $str = preg_replace('~&#([0-9]{2,4});{0,1}~e', 'chr(\\1)', $str);
01017 
01018                 // Literal Entities - Slightly slow so we do another check
01019                 if (stristr($str, '&') === FALSE)
01020                 {
01021                         $str = strtr($str, array_flip(get_html_translation_table(HTML_ENTITIES)));
01022                 }
01023 
01024                 return $str;
01025         }
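
The listing relies on the /e pattern modifier, which was removed in PHP 7. A minimal sketch of the numeric-entity step written with preg_replace_callback() instead; decode_numeric_entities() is a hypothetical name, and this is an illustration rather than the library's code.

```php
<?php
// Hypothetical equivalent of the numeric-entity step, using
// preg_replace_callback() because the /e modifier no longer exists in PHP 7+.
function decode_numeric_entities($str)
{
    // Hexadecimal entities such as &#x41 or &#x41; (semicolon optional).
    $str = preg_replace_callback(
        '~&#x(0*[0-9a-f]{2,5});?~i',
        function ($m) { return chr(hexdec($m[1])); },
        $str
    );

    // Decimal entities such as &#65 or &#65; (semicolon optional).
    return preg_replace_callback(
        '~&#([0-9]{2,4});?~',
        function ($m) { return chr((int) $m[1]); },
        $str
    );
}

echo decode_numeric_entities('&#x6A;&#97&#x76;&#97;'), "\n"; // prints "java"
```

As in the original, chr() produces a single byte, so the result is only meaningful for code points below 256.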

CI_Input::_html_entity_decode_callback ( match  ) 

HTML Entity Decode Callback.

Used as a callback for XSS Clean

public

Parameters:
array 
Returns:
string

Definition at line 963 of file Input.php.

References $CFG, _html_entity_decode(), and load_class().

00964         {
00965                 $CFG =& load_class('Config');
00966                 $charset = $CFG->item('charset');
00967 
00968                 return $this->_html_entity_decode($match[0], strtoupper($charset));
00969         }

CI_Input::_js_img_removal ( match  ) 

JS Image Removal.

Callback function for xss_clean() to sanitize image tags. This limits PCRE backtracking, which improves performance and prevents PREG_BACKTRACK_LIMIT_ERROR from being triggered in PHP 5.2+ on strings that contain many image tags.

private

Parameters:
array 
Returns:
string

Definition at line 930 of file Input.php.

References _filter_attributes().

00931         {
00932                 $attributes = $this->_filter_attributes(str_replace(array('<', '>'), '', $match[1]));
00933                 return str_replace($match[1], preg_replace("#src=.*?(alert\(|alert&\#40;|javascript\:|charset\=|window\.|document\.|\.cookie|<script|<xss|base64\s*,)#si", "", $attributes), $match[0]);
00934         }

CI_Input::_js_link_removal ( match  ) 

JS Link Removal.

Callback function for xss_clean() to sanitize links. This limits PCRE backtracking, which improves performance and prevents PREG_BACKTRACK_LIMIT_ERROR from being triggered in PHP 5.2+ on link-heavy strings.

private

Parameters:
array 
Returns:
string

Definition at line 912 of file Input.php.

References _filter_attributes().

00913         {
00914                 $attributes = $this->_filter_attributes(str_replace(array('<', '>'), '', $match[1]));
00915                 return str_replace($match[1], preg_replace("#href=.*?(alert\(|alert&\#40;|javascript\:|charset\=|window\.|document\.|\.cookie|<script|<xss|base64\s*,)#si", "", $attributes), $match[0]);
00916         }

CI_Input::_remove_invisible_characters ( str  ) 

Remove Invisible Characters.

This prevents sandwiching null characters between ASCII characters, like Java\0script.

public

Parameters:
string 
Returns:
string

Definition at line 833 of file Input.php.

Referenced by xss_clean().

00834         {
00835                 static $non_displayables;
00836                 
00837                 if ( ! isset($non_displayables))
00838                 {
00839                         // every control character except newline (dec 10), carriage return (dec 13), and horizontal tab (dec 09),
00840                         $non_displayables = array(
00841                                                                                 '/%0[0-8bcef]/',                        // url encoded 00-08, 11, 12, 14, 15
00842                                                                                 '/%1[0-9a-f]/',                         // url encoded 16-31
00843                                                                                 '/[\x00-\x08]/',                        // 00-08
00844                                                                                 '/\x0b/', '/\x0c/',                     // 11, 12
00845                                                                                 '/[\x0e-\x1f]/'                         // 14-31
00846                                                                         );
00847                 }
00848 
00849                 do
00850                 {
00851                         $cleaned = $str;
00852                         $str = preg_replace($non_displayables, '', $str);
00853                 }
00854                 while ($cleaned != $str);
00855 
00856                 return $str;
00857         }
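
The do/while loop matters because removing one sequence can expose another. A standalone sketch of the same loop (using a strict comparison, which is a small deviation from the listing):

```php
<?php
// Standalone copy of the loop for experimentation (strict comparison used).
function remove_invisible_characters($str)
{
    $non_displayables = array(
        '/%0[0-8bcef]/',    // URL-encoded 00-08, 11, 12, 14, 15
        '/%1[0-9a-f]/',     // URL-encoded 16-31
        '/[\x00-\x08]/',    // 00-08
        '/\x0b/', '/\x0c/', // 11, 12
        '/[\x0e-\x1f]/',    // 14-31
    );

    // Repeat until a pass changes nothing, so that sequences which only
    // appear after an earlier removal are stripped as well.
    do {
        $cleaned = $str;
        $str = preg_replace($non_displayables, '', $str);
    } while ($cleaned !== $str);

    return $str;
}

echo remove_invisible_characters("Java\0script"), "\n"; // prints "Javascript"
var_dump(remove_invisible_characters('%0%000'));        // string(0) ""
```

In the second call the first pass turns '%0%000' into '%00', which is itself removed on the next pass; a single preg_replace() call would have left it in place.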

CI_Input::_sanitize_globals (  ) 

Sanitize Globals.

This function does the following:

Unsets $_GET data (if query strings are not enabled)

Unsets all globals if register_globals is enabled

Standardizes newline characters to \n

private

Returns:
void

Definition at line 89 of file Input.php.

References _clean_input_data(), and log_message().

Referenced by CI_Input().

00090         {
00091                 // Would kind of be "wrong" to unset any of these GLOBALS
00092                 $protected = array('_SERVER', '_GET', '_POST', '_FILES', '_REQUEST', '_SESSION', '_ENV', 'GLOBALS', 'HTTP_RAW_POST_DATA',
00093                                                         'system_folder', 'application_folder', 'BM', 'EXT', 'CFG', 'URI', 'RTR', 'OUT', 'IN');
00094 
00095                 // Unset globals for security. 
00096                 // This is effectively the same as register_globals = off
00097                 foreach (array($_GET, $_POST, $_COOKIE, $_SERVER, $_FILES, $_ENV, (isset($_SESSION) && is_array($_SESSION)) ? $_SESSION : array()) as $global)
00098                 {
00099                         if ( ! is_array($global))
00100                         {
00101                                 if ( ! in_array($global, $protected))
00102                                 {
00103                                         unset($GLOBALS[$global]);
00104                                 }
00105                         }
00106                         else
00107                         {
00108                                 foreach ($global as $key => $val)
00109                                 {
00110                                         if ( ! in_array($key, $protected))
00111                                         {
00112                                                 unset($GLOBALS[$key]);
00113                                         }
00114                         
00115                                         if (is_array($val))
00116                                         {
00117                                                 foreach($val as $k => $v)
00118                                                 {
00119                                                         if ( ! in_array($k, $protected))
00120                                                         {
00121                                                                 unset($GLOBALS[$k]);
00122                                                         }
00123                                                 }
00124                                         }
00125                                 }
00126                         }
00127                 }
00128 
00129                 // Is $_GET data allowed? If not we'll set the $_GET to an empty array
00130                 if ($this->allow_get_array == FALSE)
00131                 {
00132                         $_GET = array();
00133                 }
00134                 else
00135                 {
00136                         $_GET = $this->_clean_input_data($_GET);
00137                 }
00138 
00139                 // Clean $_POST Data
00140                 $_POST = $this->_clean_input_data($_POST);
00141                 
00142                 // Clean $_COOKIE Data
00143                 // Also get rid of specially treated cookies that might be set by a server
00144                 // or silly application, that are of no use to a CI application anyway
00145                 // but that when present will trip our 'Disallowed Key Characters' alarm
00146                 // http://www.ietf.org/rfc/rfc2109.txt
00147                 // note that the key names below are single quoted strings, and are not PHP variables
00148                 unset($_COOKIE['$Version']);
00149                 unset($_COOKIE['$Path']);
00150                 unset($_COOKIE['$Domain']);
00151                 $_COOKIE = $this->_clean_input_data($_COOKIE);
00152 
00153                 log_message('debug', "Global POST and COOKIE data sanitized");
00154         }

CI_Input::_sanitize_naughty_html ( matches  ) 

Sanitize Naughty HTML.

Callback function for xss_clean() to remove naughty HTML elements

private

Parameters:
array 
Returns:
string

Definition at line 887 of file Input.php.

00888         {
00889                 // encode opening brace
00890                 $str = '&lt;'.$matches[1].$matches[2].$matches[3];
00891                 
00892                 // encode captured opening or closing brace to prevent recursive vectors
00893                 $str .= str_replace(array('>', '<'), array('&gt;', '&lt;'), $matches[4]);
00894                 
00895                 return $str;
00896         }
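
A sketch of the callback in use. The pattern and the two-element tag list are assumptions chosen for this illustration; the pattern that xss_clean() actually passes is not shown in this section.

```php
<?php
// Standalone copy of the callback body for experimentation.
function sanitize_naughty_html($matches)
{
    // Encode the opening angle bracket and keep the captured tag text.
    $str = '&lt;' . $matches[1] . $matches[2] . $matches[3];

    // Encode any captured trailing brackets as well.
    $str .= str_replace(array('>', '<'), array('&gt;', '&lt;'), $matches[4]);

    return $str;
}

// Assumed pattern: 1 = optional slash/whitespace, 2 = tag name,
// 3 = remaining tag text, 4 = trailing angle brackets.
$pattern = '#<(/*\s*)(script|iframe)([^><]*)([><]*)#is';

$output = preg_replace_callback($pattern, 'sanitize_naughty_html', '<script>alert(1)</script>');

echo $output, "\n"; // prints "&lt;script&gt;alert(1)&lt;/script&gt;"
```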

CI_Input::CI_Input (  ) 

Constructor.

Sets whether to globally enable the XSS processing and whether to allow the $_GET array

public

Definition at line 63 of file Input.php.

References $CFG, _sanitize_globals(), load_class(), and log_message().

00064         {
00065                 log_message('debug', "Input Class Initialized");
00066 
00067                 $CFG =& load_class('Config');
00068                 $this->use_xss_clean    = ($CFG->item('global_xss_filtering') === TRUE) ? TRUE : FALSE;
00069                 $this->allow_get_array  = ($CFG->item('enable_query_strings') === TRUE) ? TRUE : FALSE;
00070                 $this->_sanitize_globals();
00071         }

CI_Input::cookie ( index = '',
xss_clean = FALSE 
)

Fetch an item from the COOKIE array.

public

Parameters:
string 
bool 
Returns:
string

Definition at line 314 of file Input.php.

References _fetch_from_array().

00315         {
00316                 return $this->_fetch_from_array($_COOKIE, $index, $xss_clean);
00317         }

CI_Input::filename_security ( str  ) 

Filename Security.

public

Parameters:
string 
Returns:
string

Definition at line 454 of file Input.php.

00455         {
00456                 $bad = array(
00457                                                 "../",
00458                                                 "./",
00459                                                 "<!--",
00460                                                 "-->",
00461                                                 "<",
00462                                                 ">",
00463                                                 "'",
00464                                                 '"',
00465                                                 '&',
00466                                                 '$',
00467                                                 '#',
00468                                                 '{',
00469                                                 '}',
00470                                                 '[',
00471                                                 ']',
00472                                                 '=',
00473                                                 ';',
00474                                                 '?',
00475                                                 "%20",
00476                                                 "%22",
00477                                                 "%3c",          // <
00478                                                 "%253c",        // <
00479                                                 "%3e",          // >
00480                                                 "%0e",          // >
00481                                                 "%28",          // (  
00482                                                 "%29",          // ) 
00483                                                 "%2528",        // (
00484                                                 "%26",          // &
00485                                                 "%24",          // $
00486                                                 "%3f",          // ?
00487                                                 "%3b",          // ;
00488                                                 "%3d"           // =
00489                                         );
00490 
00491                 return stripslashes(str_replace($bad, '', $str));
00492         }
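
A short usage sketch with a reduced list. The function name and the shortened $bad array are hypothetical; the full list is the one in the listing above.

```php
<?php
// Hypothetical reduced version: only a subset of the $bad list above.
function filename_security_subset($str)
{
    $bad = array('../', './', '<', '>', "'", '"', '&', '$', '%20', '%3c', '%3e');

    return stripslashes(str_replace($bad, '', $str));
}

echo filename_security_subset('../../etc/passwd'), "\n";    // prints "etc/passwd"
echo filename_security_subset('my<file>%20name.txt'), "\n"; // prints "myfilename.txt"
```

Note that str_replace() is case-sensitive, so an upper-case encoding such as %3C is not matched by the lower-case entry %3c.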

CI_Input::get ( index = '',
xss_clean = FALSE 
)

Fetch an item from the GET array.

public

Parameters:
string 
bool 
Returns:
string

Definition at line 262 of file Input.php.

References _fetch_from_array().

00263         {
00264                 return $this->_fetch_from_array($_GET, $index, $xss_clean);
00265         }

CI_Input::get_post ( index = '',
xss_clean = FALSE 
)

Fetch an item from either the GET array or the POST array.

public

Parameters:
string The index key
bool XSS cleaning
Returns:
string

Definition at line 292 of file Input.php.

References get(), and post().

00293         {               
00294                 if ( ! isset($_POST[$index]) )
00295                 {
00296                         return $this->get($index, $xss_clean);
00297                 }
00298                 else
00299                 {
00300                         return $this->post($index, $xss_clean);
00301                 }               
00302         }
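
The lookup order is POST first, then GET. A minimal sketch of that precedence, using plain arrays in place of the superglobals; get_post_value() is a hypothetical name.

```php
<?php
// Hypothetical stand-in for get_post(): POST wins, GET is the fallback.
function get_post_value(array $post, array $get, $index)
{
    if (!isset($post[$index])) {
        // Not in POST: fall back to GET, or FALSE when absent there too.
        return isset($get[$index]) ? $get[$index] : false;
    }

    return $post[$index];
}

$post = array('id' => 'from-post');
$get  = array('id' => 'from-get', 'page' => '2');

var_dump(get_post_value($post, $get, 'id'));      // value from POST
var_dump(get_post_value($post, $get, 'page'));    // value from GET
var_dump(get_post_value($post, $get, 'missing')); // FALSE
```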

CI_Input::ip_address (  ) 

Fetch the IP Address.

public

Returns:
string

Definition at line 342 of file Input.php.

References server(), and valid_ip().

00343         {
00344                 if ($this->ip_address !== FALSE)
00345                 {
00346                         return $this->ip_address;
00347                 }
00348 
00349                 if ($this->server('REMOTE_ADDR') AND $this->server('HTTP_CLIENT_IP'))
00350                 {
00351                          $this->ip_address = $_SERVER['HTTP_CLIENT_IP'];
00352                 }
00353                 elseif ($this->server('REMOTE_ADDR'))
00354                 {
00355                          $this->ip_address = $_SERVER['REMOTE_ADDR'];
00356                 }
00357                 elseif ($this->server('HTTP_CLIENT_IP'))
00358                 {
00359                          $this->ip_address = $_SERVER['HTTP_CLIENT_IP'];
00360                 }
00361                 elseif ($this->server('HTTP_X_FORWARDED_FOR'))
00362                 {
00363                          $this->ip_address = $_SERVER['HTTP_X_FORWARDED_FOR'];
00364                 }
00365 
00366                 if ($this->ip_address === FALSE)
00367                 {
00368                         $this->ip_address = '0.0.0.0';
00369                         return $this->ip_address;
00370                 }
00371 
00372                 if (strstr($this->ip_address, ','))
00373                 {
00374                         $x = explode(',', $this->ip_address);
00375                         $this->ip_address = end($x);
00376                 }
00377 
00378                 if ( ! $this->valid_ip($this->ip_address))
00379                 {
00380                         $this->ip_address = '0.0.0.0';
00381                 }
00382                 
00383                 return $this->ip_address;
00384         }

CI_Input::post ( index = '',
xss_clean = FALSE 
)

Fetch an item from the POST array.

public

Parameters:
string 
bool 
Returns:
string

Definition at line 277 of file Input.php.

References _fetch_from_array().

Referenced by get_post().

00278         {
00279                 return $this->_fetch_from_array($_POST, $index, $xss_clean);
00280         }

CI_Input::server ( index = '',
xss_clean = FALSE 
)

Fetch an item from the SERVER array.

public

Parameters:
string 
bool 
Returns:
string

Definition at line 329 of file Input.php.

References _fetch_from_array().

Referenced by ip_address().

00330         {
00331                 return $this->_fetch_from_array($_SERVER, $index, $xss_clean);
00332         }

CI_Input::user_agent (  ) 

User Agent.

public

Returns:
string

Definition at line 433 of file Input.php.

00434         {
00435                 if ($this->user_agent !== FALSE)
00436                 {
00437                         return $this->user_agent;
00438                 }
00439 
00440                 $this->user_agent = ( ! isset($_SERVER['HTTP_USER_AGENT'])) ? FALSE : $_SERVER['HTTP_USER_AGENT'];
00441 
00442                 return $this->user_agent;
00443         }

CI_Input::valid_ip ( ip  ) 

Validate IP Address.

Updated version suggested by Geert De Deckere

public

Parameters:
string 
Returns:
bool

Definition at line 397 of file Input.php.

Referenced by ip_address().

00398         {
00399                 $ip_segments = explode('.', $ip);
00400 
00401                 // Always 4 segments needed
00402                 if (count($ip_segments) != 4)
00403                 {
00404                         return FALSE;
00405                 }
00406                 // IP can not start with 0
00407                 if ($ip_segments[0][0] == '0')
00408                 {
00409                         return FALSE;
00410                 }
00411                 // Check each segment
00412                 foreach ($ip_segments as $segment)
00413                 {
00414                         // IP segments must be digits and can not be 
00415                         // longer than 3 digits or greater than 255
00416                         if ($segment == '' OR preg_match("/[^0-9]/", $segment) OR $segment > 255 OR strlen($segment) > 3)
00417                         {
00418                                 return FALSE;
00419                         }
00420                 }
00421 
00422                 return TRUE;
00423         }
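
The rules above can be exercised with a standalone copy. This is a sketch only: valid_ipv4() is a hypothetical name, and it uses substr() for the leading-zero check so that an empty first segment does not raise a string-offset warning.

```php
<?php
// Standalone sketch of the same checks as valid_ip().
function valid_ipv4($ip)
{
    $segments = explode('.', $ip);

    // Always 4 segments needed.
    if (count($segments) != 4) {
        return false;
    }

    // The address may not start with 0.
    if (substr($segments[0], 0, 1) === '0') {
        return false;
    }

    foreach ($segments as $segment) {
        // Digits only, at most 3 characters, and no greater than 255.
        if ($segment === '' || preg_match('/[^0-9]/', $segment)
            || $segment > 255 || strlen($segment) > 3) {
            return false;
        }
    }

    return true;
}

var_dump(valid_ipv4('192.168.0.1')); // bool(true)
var_dump(valid_ipv4('256.1.1.1'));   // bool(false): segment above 255
var_dump(valid_ipv4('0.1.2.3'));     // bool(false): starts with 0
var_dump(valid_ipv4(' 10.0.0.1'));   // bool(false): leading space
```

The last case is relevant to ip_address(): splitting a comma-separated header value on ',' alone can leave a leading space on the chosen entry, which this check rejects.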

CI_Input::xss_clean ( str,
is_image = FALSE 
)

XSS Clean.

Sanitizes data so that Cross Site Scripting hacks can be prevented. This function does a fair amount of work, but it is extremely thorough and designed to prevent even the most obscure XSS attempts. Nothing is ever 100% foolproof, of course, but I haven't been able to get anything past the filter.

Note: This function should only be used to deal with data upon submission. It's not something that should be used for general runtime processing.

This function was based in part on some code and ideas I got from Bitflux: http://blog.bitflux.ch/wiki/XSS_Prevention

To help develop this script I used this great list of vulnerabilities along with a few other hacks I've harvested from examining vulnerabilities in other programs: http://ha.ckers.org/xss.html

public

Parameters:
string 
bool 
Returns:
string

Definition at line 522 of file Input.php.

References _remove_invisible_characters(), log_message(), and xss_hash().

Referenced by _clean_input_data(), and _fetch_from_array().

00523         {
00524                 /*
00525                  * Is the string an array?
00526                  *
00527                  */
00528                 if (is_array($str))
00529                 {
00530                         while (list($key) = each($str))
00531                         {
00532                                 $str[$key] = $this->xss_clean($str[$key]);
00533                         }
00534         
00535                         return $str;
00536                 }
00537 
00538                 /*
00539                  * Remove Invisible Characters
00540                  */
00541                 $str = $this->_remove_invisible_characters($str);
00542 
00543                 /*
00544                  * Protect GET variables in URLs
00545                  */
00546                  
00547                  // 901119URL5918AMP18930PROTECT8198
00548                  
00549                 $str = preg_replace('|\&([a-z\_0-9]+)\=([a-z\_0-9]+)|i', $this->xss_hash()."\\1=\\2", $str);
00550 
00551                 /*
00552                  * Validate standard character entities
00553                  *
00554                  * Add a semicolon if missing.  We do this to enable
00555                  * the conversion of entities to ASCII later.
00556                  *
00557                  */
00558                 $str = preg_replace('#(&\#?[0-9a-z]{2,})[\x00-\x20]*;?#i', "\\1;", $str);
00559 
00560                 /*
00561                  * Validate UTF16 two byte encoding (x00) 
00562                  *
00563                  * Just as above, adds a semicolon if missing.
00564                  *
00565                  */
00566                 $str = preg_replace('#(&\#x?)([0-9A-F]+);?#i',"\\1\\2;",$str);
00567 
00568                 /*
00569                  * Un-Protect GET variables in URLs
00570                  */
00571                 $str = str_replace($this->xss_hash(), '&', $str);
00572 
00573                 /*
00574                  * URL Decode
00575                  *
00576                  * Just in case stuff like this is submitted:
00577                  *
00578                  * <a href="http://%77%77%77%2E%67%6F%6F%67%6C%65%2E%63%6F%6D">Google</a>
00579                  *
00580                  * Note: Use rawurldecode() so it does not remove plus signs
00581                  *
00582                  */
00583                 $str = rawurldecode($str);
00584         
00585                 /*
00586                  * Convert character entities to ASCII 
00587                  *
00588                  * This permits our tests below to work reliably.
00589                  * We only convert entities that are within tags since
00590                  * these are the ones that will pose security problems.
00591                  *
00592                  */
00593 
00594                 $str = preg_replace_callback("/[a-z]+=([\'\"]).*?\\1/si", array($this, '_convert_attribute'), $str);
00595          
00596                 $str = preg_replace_callback("/<\w+.*?(?=>|<|$)/si", array($this, '_html_entity_decode_callback'), $str);
00597 
00598                 /*
00599                  * Remove Invisible Characters Again!
00600                  */
00601                 $str = $this->_remove_invisible_characters($str);
00602                 
00603                 /*
00604                  * Convert all tabs to spaces
00605                  *
00606                  * This prevents strings like this: ja  vascript
00607                  * NOTE: we deal with spaces between characters later.
00608                  * NOTE: preg_replace was found to be amazingly slow here on large blocks of data,
00609                  * so we use str_replace.
00610                  *
00611                  */
00612                 
00613                 if (strpos($str, "\t") !== FALSE)
00614                 {
00615                         $str = str_replace("\t", ' ', $str);
00616                 }
00617                 
00618                 /*
00619                  * Capture converted string for later comparison
00620                  */
00621                 $converted_string = $str;
00622                 
00623                 /*
00624                  * Not Allowed Under Any Conditions
00625                  */
00626                 
00627                 foreach ($this->never_allowed_str as $key => $val)
00628                 {
00629                         $str = str_replace($key, $val, $str);   
00630                 }
00631         
00632                 foreach ($this->never_allowed_regex as $key => $val)
00633                 {
00634                         $str = preg_replace("#".$key."#i", $val, $str);   
00635                 }
00636 
00637                 /*
00638                  * Makes PHP tags safe
00639                  *
00640                  *  Note: XML tags are inadvertently replaced too:
00641                  *
00642                  *      <?xml
00643                  *
00644                  * But it doesn't seem to pose a problem.
00645                  *
00646                  */
00647                 if ($is_image === TRUE)
00648                 {
00649                         // Images have a tendency to have the PHP short opening and closing tags every so often
00650                         // so we skip those and only do the long opening tags.
00651                         $str = str_replace(array('<?php', '<?PHP'),  array('&lt;?php', '&lt;?PHP'), $str);
00652                 }
00653                 else
00654                 {
00655                         $str = str_replace(array('<?php', '<?PHP', '<?', '?'.'>'),  array('&lt;?php', '&lt;?PHP', '&lt;?', '?&gt;'), $str);
00656                 }
00657                 
00658                 /*
00659                  * Compact any exploded words
00660                  *
00661                  * This corrects words like:  j a v a s c r i p t
00662                  * These words are compacted back to their correct state.
00663                  *
00664                  */
00665                 $words = array('javascript', 'expression', 'vbscript', 'script', 'applet', 'alert', 'document', 'write', 'cookie', 'window');
00666                 foreach ($words as $word)
00667                 {
00668                         $temp = '';
00669                         
00670                         for ($i = 0, $wordlen = strlen($word); $i < $wordlen; $i++)
00671                         {
00672                                 $temp .= substr($word, $i, 1)."\s*";
00673                         }
00674 
00675                         // We only want to do this when it is followed by a non-word character
00676                         // That way valid stuff like "dealer to" does not become "dealerto"
00677                         $str = preg_replace_callback('#('.substr($temp, 0, -3).')(\W)#is', array($this, '_compact_exploded_words'), $str);
00678                 }
00679                 
00680                 /*
00681                  * Remove disallowed Javascript in links or img tags
00682                  * We used to do some version comparisons and use of stripos for PHP5, but it is dog slow compared
00683                  * to these simplified non-capturing preg_match(), especially if the pattern exists in the string
00684                  */
00685                 do
00686                 {
00687                         $original = $str;
00688         
00689                         if (preg_match("/<a/i", $str))
00690                         {
00691                                 $str = preg_replace_callback("#<a\s+([^>]*?)(>|$)#si", array($this, '_js_link_removal'), $str);
00692                         }
00693         
00694                         if (preg_match("/<img/i", $str))
00695                         {
00696                                 $str = preg_replace_callback("#<img\s+([^>]*?)(\s?/?>|$)#si", array($this, '_js_img_removal'), $str);
00697                         }
00698         
00699                         if (preg_match("/script/i", $str) OR preg_match("/xss/i", $str))
00700                         {
00701                                 $str = preg_replace("#<(/*)(script|xss)(.*?)>#si", '[removed]', $str);
00702                         }
00703                 }
00704                 while($original != $str);
00705 
00706                 unset($original);
00707 
00708                 /*
00709                  * Remove JavaScript Event Handlers
00710                  *
00711                  * Note: This code is a little blunt.  It removes
00712                  * the event handler and anything up to the closing >,
00713                  * but it's unlikely to be a problem.
00714                  *
00715                  */
00716                 $event_handlers = array('[^a-z_\-]on\w*','xmlns');
00717 
00718                 if ($is_image === TRUE)
00719                 {
00720                         /*
00721                          * Adobe Photoshop puts XML metadata into JFIF images, including namespacing, 
00722                          * so we have to allow this for images. -Paul
00723                          */
00724                         unset($event_handlers[array_search('xmlns', $event_handlers)]);
00725                 }
00726 
00727                 $str = preg_replace("#<([^><]+?)(".implode('|', $event_handlers).")(\s*=\s*[^><]*)([><]*)#i", "<\\1\\4", $str);
00728 
00729                 /*
00730                  * Sanitize naughty HTML elements
00731                  *
00732                  * If a tag containing any of the words in the list
00733                  * below is found, the tag gets converted to entities.
00734                  *
00735                  * So this: <blink>
00736                  * Becomes: &lt;blink&gt;
00737                  *
00738                  */
00739                 $naughty = 'alert|applet|audio|basefont|base|behavior|bgsound|blink|body|embed|expression|form|frameset|frame|head|html|ilayer|iframe|input|isindex|layer|link|meta|object|plaintext|style|script|textarea|title|video|xml|xss';
00740                 $str = preg_replace_callback('#<(/*\s*)('.$naughty.')([^><]*)([><]*)#is', array($this, '_sanitize_naughty_html'), $str);
00741 
00742                 /*
00743                  * Sanitize naughty scripting elements
00744                  *
00745                  * Similar to above, only instead of looking for
00746                  * tags it looks for PHP and JavaScript commands
00747                  * that are disallowed.  Rather than removing the
00748                  * code, it simply converts the parentheses to entities
00749                  * rendering the code un-executable.
00750                  *
00751                  * For example: eval('some code')
00752                  * Becomes:             eval&#40;'some code'&#41;
00753                  *
00754                  */
00755                 $str = preg_replace('#(alert|cmd|passthru|eval|exec|expression|system|fopen|fsockopen|file|file_get_contents|readfile|unlink)(\s*)\((.*?)\)#si', "\\1\\2&#40;\\3&#41;", $str);
00756                                         
00757                 /*
00758                  * Final clean up
00759                  *
00760                  * This adds a bit of extra precaution in case
00761                  * something got through the above filters
00762                  *
00763                  */
00764                 foreach ($this->never_allowed_str as $key => $val)
00765                 {
00766                         $str = str_replace($key, $val, $str);   
00767                 }
00768         
00769                 foreach ($this->never_allowed_regex as $key => $val)
00770                 {
00771                         $str = preg_replace("#".$key."#i", $val, $str);
00772                 }
00773 
00774                 /*
00775                  *  Images are Handled in a Special Way
00776                  *  - Essentially, we want to know that after all of the character conversion is done whether
00777                  *  any unwanted, likely XSS, code was found.  If not, we return TRUE, as the image is clean.
00778                  *  However, if the string post-conversion does not match the string post-removal of XSS,
00779                  *  then it fails, as there was unwanted XSS code found and removed/changed during processing.
00780                  */
00781 
00782                 if ($is_image === TRUE)
00783                 {
00784                         if ($str == $converted_string)
00785                         {
00786                                 return TRUE;
00787                         }
00788                         else
00789                         {
00790                                 return FALSE;
00791                         }
00792                 }
00793                 
00794                 log_message('debug', "XSS Filtering completed");
00795                 return $str;
00796         }
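
The "Compact any exploded words" step in the listing above can be tried in isolation. The following is a minimal sketch, not the CI_Input implementation: the function name `compact_exploded_words_sketch` is hypothetical, and the callback body is an assumption about what `_compact_exploded_words()` does (strip whitespace inside the matched word and keep the trailing character), since that method's source is not shown in this section.

```php
<?php
// Hypothetical standalone helper for illustration; not part of CI_Input.
function compact_exploded_words_sketch($str, array $words)
{
    foreach ($words as $word)
    {
        // Build "j\s*a\s*v\s*a..." so whitespace between letters still matches.
        // This is equivalent to the loop plus substr($temp, 0, -3) in the listing.
        $pattern = implode('\s*', str_split($word));

        // Require a trailing non-word character so ordinary text such as
        // "dealer to" is not collapsed into "dealerto".
        $str = preg_replace_callback(
            '#(' . $pattern . ')(\W)#is',
            function ($m) {
                // Assumed callback behaviour: remove whitespace inside the
                // matched word only, and re-append the trailing character.
                return preg_replace('/\s+/s', '', $m[1]) . $m[2];
            },
            $str
        );
    }

    return $str;
}

echo compact_exploded_words_sketch('j a v a s c r i p t:alert(1)', array('javascript')), "\n";
// prints: javascript:alert(1)

echo compact_exploded_words_sketch('dealer to', array('alert')), "\n";
// prints: dealer to
```

Compacting the word first is what lets the later `never_allowed_regex` pass recognise `javascript:` even when the input spreads the letters apart.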

CI_Input::xss_hash (  ) 

Random Hash for protecting URLs.

public

Returns:
string

Definition at line 806 of file Input.php.

Referenced by xss_clean().

00807         {
00808                 if ($this->xss_hash == '')
00809                 {
00810                         if (phpversion() >= 4.2)
00811                                 mt_srand();
00812                         else
00813                                 mt_srand(hexdec(substr(md5(microtime()), -8)) & 0x7fffffff);
00814 
00815                         $this->xss_hash = md5(time() + mt_rand(0, 1999999999));
00816                 }
00817 
00818                 return $this->xss_hash;
00819         }
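
The method above generates the hash lazily and caches it in `$xss_hash`, so every call in the same request returns the same value. The sketch below illustrates only that caching pattern. The class name `HashHolderSketch` is hypothetical, and `random_bytes()` is swapped in for the `mt_srand()`/`mt_rand()` seeding on the assumption that PHP 7 or later is available; it is not what Input.php does.

```php
<?php
// Hypothetical class for illustration; not the CI_Input implementation.
class HashHolderSketch
{
    private $xss_hash = '';

    public function xss_hash()
    {
        if ($this->xss_hash === '')
        {
            // Assumption: PHP 7+ is available, so random_bytes() can stand in
            // for the md5(time() + mt_rand(...)) expression in the listing.
            $this->xss_hash = bin2hex(random_bytes(16));
        }

        return $this->xss_hash;
    }
}

$holder = new HashHolderSketch();
$first  = $holder->xss_hash();
$second = $holder->xss_hash();

var_dump($first === $second); // bool(true): generated once, then cached
var_dump(strlen($first));     // int(32): 16 random bytes as hex, same length as an md5 string
```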


Member Data Documentation

CI_Input::$allow_get_array = FALSE

Definition at line 34 of file Input.php.

CI_Input::$ip_address = FALSE

Definition at line 32 of file Input.php.

CI_Input::$never_allowed_regex

Initial value:

 array(
                                                                                "javascript\s*:"        => '[removed]',
                                                                                "expression\s*\("       => '[removed]', // CSS and IE
                                                                                "Redirect\s+302"        => '[removed]'
                                                                        )
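
As a rough illustration of how `xss_clean()` applies these patterns, the sketch below wraps each key in `#` delimiters with the `i` flag, as the listing does. The patterns are copied from the initial value above; the helper name `apply_never_allowed_regex_sketch` is hypothetical.

```php
<?php
// Patterns copied from the initial value above.
$never_allowed_regex = array(
    "javascript\s*:"  => '[removed]',
    "expression\s*\(" => '[removed]', // CSS and IE
    "Redirect\s+302"  => '[removed]',
);

// Hypothetical helper name; mirrors the foreach loop in xss_clean().
function apply_never_allowed_regex_sketch($str, array $patterns)
{
    foreach ($patterns as $key => $val)
    {
        // Keys must not contain "#", because it is used as the delimiter.
        $str = preg_replace("#" . $key . "#i", $val, $str);
    }

    return $str;
}

echo apply_never_allowed_regex_sketch('<a href="JavaScript :go()">x</a>', $never_allowed_regex), "\n";
// prints: <a href="[removed]go()">x</a>
```

Because of the `i` flag and the `\s*`, mixed case and whitespace before the colon are both caught.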

Definition at line 49 of file Input.php.

CI_Input::$never_allowed_str

Initial value:

 array(
                                                                        'document.cookie'       => '[removed]',
                                                                        'document.write'        => '[removed]',
                                                                        '.parentNode'           => '[removed]',
                                                                        '.innerHTML'            => '[removed]',
                                                                        'window.location'       => '[removed]',
                                                                        '-moz-binding'          => '[removed]',
                                                                        '<!--'                          => '&lt;!--',
                                                                        '-->'                           => '--&gt;',
                                                                        '<![CDATA['                     => '&lt;![CDATA['
                                                                        )

Definition at line 37 of file Input.php.
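
A minimal sketch of the literal replacement pass, using a subset of the entries above. The helper name `apply_never_allowed_str_sketch` is hypothetical. Note that `str_replace()` is case-sensitive, so this pass only catches the exact spellings listed.

```php
<?php
// Subset of the initial value above, kept short for the example.
$never_allowed_str = array(
    'document.cookie' => '[removed]',
    '<!--'            => '&lt;!--',
    '-->'             => '--&gt;',
);

// Hypothetical helper name; mirrors the foreach loop in xss_clean().
function apply_never_allowed_str_sketch($str, array $map)
{
    foreach ($map as $key => $val)
    {
        $str = str_replace($key, $val, $str);
    }

    return $str;
}

echo apply_never_allowed_str_sketch('<!-- x --> document.cookie', $never_allowed_str), "\n";
// prints: &lt;!-- x --&gt; [removed]

echo apply_never_allowed_str_sketch('Document.Cookie', $never_allowed_str), "\n";
// prints: Document.Cookie
```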

CI_Input::$use_xss_clean = FALSE

Definition at line 30 of file Input.php.

CI_Input::$user_agent = FALSE

Definition at line 33 of file Input.php.

CI_Input::$xss_hash = ''

Definition at line 31 of file Input.php.


The documentation for this class was generated from the following file: