CI_Input Class Reference



Public Member Functions

 CI_Input ()
 Constructor.
 _sanitize_globals ()
 Sanitize Globals.
 _clean_input_data ($str)
 Clean Input Data.
 _clean_input_keys ($str)
 Clean Keys.
 _fetch_from_array (&$array, $index= '', $xss_clean=FALSE)
 Fetch from array.
 get ($index= '', $xss_clean=FALSE)
 Fetch an item from the GET array.
 post ($index= '', $xss_clean=FALSE)
 Fetch an item from the POST array.
 get_post ($index= '', $xss_clean=FALSE)
 Fetch an item from either the GET array or the POST array.
 cookie ($index= '', $xss_clean=FALSE)
 Fetch an item from the COOKIE array.
 server ($index= '', $xss_clean=FALSE)
 Fetch an item from the SERVER array.
 ip_address ()
 Fetch the IP Address.
 valid_ip ($ip)
 Validate IP Address.
 user_agent ()
 User Agent.
 filename_security ($str)
 Filename Security.
 xss_clean ($str, $is_image=FALSE)
 XSS Clean.
 xss_hash ()
 Random Hash for protecting URLs.
 _remove_invisible_characters ($str)
 Remove Invisible Characters.
 _compact_exploded_words ($matches)
 Compact Exploded Words.
 _sanitize_naughty_html ($matches)
 Sanitize Naughty HTML.
 _js_link_removal ($match)
 JS Link Removal.
 _js_img_removal ($match)
 JS Image Removal.
 _convert_attribute ($match)
 Attribute Conversion.
 _html_entity_decode_callback ($match)
 HTML Entity Decode Callback.
 _html_entity_decode ($str, $charset='UTF-8')
 HTML Entities Decode.
 _filter_attributes ($str)
 Filter Attributes.

Public Attributes

 $use_xss_clean = FALSE
 $xss_hash = ''
 $ip_address = FALSE
 $user_agent = FALSE
 $allow_get_array = FALSE
 $never_allowed_str
 $never_allowed_regex

Detailed Description

Definition at line 29 of file Input.php.


Member Function Documentation

CI_Input::_clean_input_data ( str  ) 

Clean Input Data.

This is a helper function. It escapes data and standardizes newline characters to \n.

private

Parameters:
string 
Returns:
string

Definition at line 160 of file Input.php.

References _clean_input_keys(), and xss_clean().

Referenced by _sanitize_globals().

00161         {
00162                 if (is_array($str))
00163                 {
00164                         $new_array = array();
00165                         foreach ($str as $key => $val)
00166                         {
00167                                 $new_array[$this->_clean_input_keys($key)] = $this->_clean_input_data($val);
00168                         }
00169                         return $new_array;
00170                 }
00171 
00172                 // We strip slashes if magic quotes is on to keep things consistent
00173                 if (get_magic_quotes_gpc())
00174                 {
00175                         $str = stripslashes($str);
00176                 }
00177 
00178                 // Should we filter the input data?
00179                 if ($this->use_xss_clean === TRUE)
00180                 {
00181                         $str = $this->xss_clean($str);
00182                 }
00183 
00184                 // Standardize newlines
00185                 if (strpos($str, "\r") !== FALSE)
00186                 {
00187                         $str = str_replace(array("\r\n", "\r"), "\n", $str);
00188                 }
00189                 
00190                 return $str;
00191         }
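
The newline step at the end of the listing can be tried on its own. The following is a minimal sketch rather than the framework's code: normalize_newlines() is a hypothetical standalone function that mirrors only that last step and leaves out the magic-quotes and XSS handling.

```php
<?php
// Hypothetical standalone helper; mirrors only the newline step of _clean_input_data().
function normalize_newlines($str)
{
    // Skip the replacement entirely when there is no carriage return.
    if (strpos($str, "\r") !== false) {
        // "\r\n" is listed first so a Windows line ending becomes one "\n", not two.
        $str = str_replace(array("\r\n", "\r"), "\n", $str);
    }
    return $str;
}

// json_encode() is used only to make the line endings visible.
echo json_encode(normalize_newlines("a\r\nb\rc\nd")), "\n"; // prints "a\nb\nc\nd"
```

The order of the search array matters: with "\r" first, every "\r\n" pair would turn into two newlines.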


CI_Input::_clean_input_keys ( str  ) 

Clean Keys.

This is a helper function. To prevent malicious users from trying to exploit keys we make sure that keys are only named with alpha-numeric text and a few other items.

private

Parameters:
string 
Returns:
string

Definition at line 206 of file Input.php.

Referenced by _clean_input_data().

00207         {
00208                  if ( ! preg_match("/^[a-z0-9:_\/-]+$/i", $str))
00209                  {
00210                         exit('Disallowed Key Characters.');
00211                  }
00212 
00213                 return $str;
00214         }
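
To see which keys the pattern accepts, the check can be isolated as below. This is a sketch under one assumption: is_allowed_key() is a hypothetical name, and it returns a boolean instead of calling exit() as the listing does.

```php
<?php
// Hypothetical variant of _clean_input_keys() that reports instead of calling exit().
function is_allowed_key($key)
{
    // Same pattern as the listing: letters, digits, colon, underscore, slash, hyphen.
    return preg_match("/^[a-z0-9:_\/-]+$/i", $key) === 1;
}

var_dump(is_allowed_key('user_name')); // bool(true)
var_dump(is_allowed_key('user name')); // bool(false), a space is not allowed
var_dump(is_allowed_key('item[0]'));   // bool(false), brackets are not allowed
```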


CI_Input::_compact_exploded_words ( matches  ) 

Compact Exploded Words.

Callback function for xss_clean() to remove whitespace from things like j a v a s c r i p t

public

Parameters:
array 
Returns:
string

Definition at line 863 of file Input.php.

00864         {
00865                 return preg_replace('/\s+/s', '', $matches[1]).$matches[2];
00866         }
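
The callback only makes sense together with the pattern that xss_clean() builds around it. The sketch below combines the two in one hypothetical function, compact_word(); the name and the use of implode() to build the pattern are assumptions made for illustration, while the regular expressions follow the listings.

```php
<?php
// Hypothetical standalone version of the compaction step used by xss_clean().
function compact_word($word, $str)
{
    // Build "j\s*a\s*v\s*a..." so whitespace between the letters still matches.
    $pattern = implode('\s*', str_split($word));

    // Group 1 is the exploded word, group 2 the non-word character that follows it.
    return preg_replace_callback(
        '#(' . $pattern . ')(\W)#is',
        function ($matches) {
            // Same body as _compact_exploded_words().
            return preg_replace('/\s+/s', '', $matches[1]) . $matches[2];
        },
        $str
    );
}

echo compact_word('javascript', 'j a v a s c r i p t:alert(1)'), "\n"; // javascript:alert(1)
echo compact_word('alert', 'dealer to'), "\n";                         // dealer to
```

The second call is unchanged because "aler t" is followed by a word character, so the trailing (\W) group does not match.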

CI_Input::_convert_attribute ( match  ) 

Attribute Conversion.

Used as a callback for XSS Clean

public

Parameters:
array 
Returns:
string

Definition at line 939 of file Input.php.

00940         {
00941                 return str_replace(array('>', '<'), array('&gt;', '&lt;'), $match[0]);
00942         }

CI_Input::_fetch_from_array ( &$  array,
index = '',
xss_clean = FALSE 
)

Fetch from array.

This is a helper function to retrieve values from global arrays

private

Parameters:
array 
string 
bool 
Returns:
string, or FALSE if the index is not set

Definition at line 229 of file Input.php.

References xss_clean().

Referenced by cookie(), get(), post(), and server().

00230         {
00231                 if ( ! isset($array[$index]))
00232                 {
00233                         return FALSE;
00234                 }
00235 
00236                 if ($xss_clean === TRUE)
00237                 {
00238                         return $this->xss_clean($array[$index]);
00239                 }
00240 
00241                 return $array[$index];
00242         }
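
Because a missing index yields FALSE while a present but empty value yields an empty string, callers usually need a strict comparison. The sketch below illustrates this with a hypothetical standalone function, fetch_item(), that takes a plain array and omits the XSS filtering.

```php
<?php
// Hypothetical standalone lookup that mirrors _fetch_from_array() without XSS filtering.
function fetch_item(array $array, $index = '')
{
    // A missing key yields FALSE rather than a notice.
    if (!isset($array[$index])) {
        return false;
    }
    return $array[$index];
}

$input = array('name' => 'Ada', 'empty' => '');

var_dump(fetch_item($input, 'name'));    // string(3) "Ada"
var_dump(fetch_item($input, 'empty'));   // string(0) ""
var_dump(fetch_item($input, 'missing')); // bool(false)
```

A loose check such as `if ( ! $value)` would treat the empty string and the missing key alike; `=== FALSE` tells them apart.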


CI_Input::_filter_attributes ( str  ) 

Filter Attributes.

Filters tag attributes for consistency and safety

public

Parameters:
string 
Returns:
string

Definition at line 1030 of file Input.php.

Referenced by _js_img_removal(), and _js_link_removal().

01031         {
01032                 $out = '';
01033 
01034                 if (preg_match_all('#\s*[a-z\-]+\s*=\s*(\042|\047)([^\\1]*?)\\1#is', $str, $matches))
01035                 {
01036                         foreach ($matches[0] as $match)
01037                         {
01038                                 $out .= "{$match}";
01039                         }                       
01040                 }
01041 
01042                 return $out;
01043         }
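
The effect of the pattern is easiest to see on a sample attribute string. The following sketch reuses the regular expression from the listing in a hypothetical standalone function, keep_quoted_attributes(); the sample input is invented for illustration.

```php
<?php
// Hypothetical standalone copy of the matching step in _filter_attributes().
function keep_quoted_attributes($str)
{
    $out = '';

    // \042 is a double quote and \047 a single quote. Only name="value" or
    // name='value' pairs match, so unquoted attributes are dropped.
    if (preg_match_all('#\s*[a-z\-]+\s*=\s*(\042|\047)([^\\1]*?)\\1#is', $str, $matches)) {
        foreach ($matches[0] as $match) {
            $out .= $match;
        }
    }

    return $out;
}

echo keep_quoted_attributes('href="x" onclick=alert(1) title=\'t\''), "\n";
// prints: href="x" title='t'
```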


CI_Input::_html_entity_decode ( str,
charset = 'UTF-8' 
)

HTML Entities Decode.

This function is a replacement for html_entity_decode()

In some versions of PHP the native function does not work when UTF-8 is the specified character set, so this gives us a work-around. More info here: http://bugs.php.net/bug.php?id=25670

private

Parameters:
string 
string 
Returns:
string

Definition at line 989 of file Input.php.

Referenced by _html_entity_decode_callback().

00990         {
00991                 if (stristr($str, '&') === FALSE) return $str;
00992 
00993                 // The reason we are not using html_entity_decode() by itself is because
00994                 // while it is not technically correct to leave out the semicolon
00995                 // at the end of an entity most browsers will still interpret the entity
00996                 // correctly.  html_entity_decode() does not convert entities without
00997                 // semicolons, so we are left with our own little solution here. Bummer.
00998 
00999                 if (function_exists('html_entity_decode') && (strtolower($charset) != 'utf-8' OR version_compare(phpversion(), '5.0.0', '>=')))
01000                 {
01001                         $str = html_entity_decode($str, ENT_COMPAT, $charset);
01002                         $str = preg_replace('~&#x(0*[0-9a-f]{2,5})~ei', 'chr(hexdec("\\1"))', $str);
01003                         return preg_replace('~&#([0-9]{2,4})~e', 'chr(\\1)', $str);
01004                 }
01005 
01006                 // Numeric Entities
01007                 $str = preg_replace('~&#x(0*[0-9a-f]{2,5});{0,1}~ei', 'chr(hexdec("\\1"))', $str);
01008                 $str = preg_replace('~&#([0-9]{2,4});{0,1}~e', 'chr(\\1)', $str);
01009 
01010                 // Literal Entities - Slightly slow so we do another check
01011                 if (stristr($str, '&') === FALSE)
01012                 {
01013                         $str = strtr($str, array_flip(get_html_translation_table(HTML_ENTITIES)));
01014                 }
01015 
01016                 return $str;
01017         }


CI_Input::_html_entity_decode_callback ( match  ) 

HTML Entity Decode Callback.

Used as a callback for XSS Clean

public

Parameters:
array 
Returns:
string

Definition at line 955 of file Input.php.

References $CFG, and _html_entity_decode().

00956         {
00957                 global $CFG;
00958                 $charset = $CFG->item('charset');
00959 
00960                 return $this->_html_entity_decode($match[0], strtoupper($charset));
00961         }


CI_Input::_js_img_removal ( match  ) 

JS Image Removal.

Callback function for xss_clean() to sanitize image tags. This limits PCRE backtracking, which improves performance and prevents PREG_BACKTRACK_LIMIT_ERROR from being triggered in PHP 5.2+ on strings that contain many image tags.

private

Parameters:
array 
Returns:
string

Definition at line 922 of file Input.php.

References _filter_attributes().

00923         {
00924                 $attributes = $this->_filter_attributes(str_replace(array('<', '>'), '', $match[1]));
00925                 return str_replace($match[1], preg_replace("#src=.*?(alert\(|alert&\#40;|javascript\:|charset\=|window\.|document\.|\.cookie|<script|<xss|base64\s*,)#si", "", $attributes), $match[0]);
00926         }


CI_Input::_js_link_removal ( match  ) 

JS Link Removal.

Callback function for xss_clean() to sanitize links. This limits PCRE backtracking, which improves performance and prevents PREG_BACKTRACK_LIMIT_ERROR from being triggered in PHP 5.2+ on strings that contain many links.

private

Parameters:
array 
Returns:
string

Definition at line 904 of file Input.php.

References _filter_attributes().

00905         {
00906                 $attributes = $this->_filter_attributes(str_replace(array('<', '>'), '', $match[1]));
00907                 return str_replace($match[1], preg_replace("#href=.*?(alert\(|alert&\#40;|javascript\:|charset\=|window\.|document\.|\.cookie|<script|<xss|base64\s*,)#si", "", $attributes), $match[0]);
00908         }


CI_Input::_remove_invisible_characters ( str  ) 

Remove Invisible Characters.

This prevents sandwiching null characters between ASCII characters, like Java\0script.

public

Parameters:
string 
Returns:
string

Definition at line 825 of file Input.php.

Referenced by xss_clean().

00826         {
00827                 static $non_displayables;
00828                 
00829                 if ( ! isset($non_displayables))
00830                 {
00831                         // every control character except newline (10), carriage return (13), and horizontal tab (09),
00832                         // both as a URL encoded character (::shakes fist at IE and WebKit::), and the actual character
00833                         $non_displayables = array(
00834                                                                                 '/%0[0-8]/', '/[\x00-\x08]/',                   // 00-08
00835                                                                                 '/%11/', '/\x0b/', '/%12/', '/\x0c/',   // 11, 12
00836                                                                                 '/%1[4-9]/', '/%2[0-9]/', '/%3[0-1]/',  // url encoded 14-31
00837                                                                                 '/[\x0e-\x1f]/');                                               // 14-31
00838                         
00839                 }
00840 
00841                 do
00842                 {
00843                         $cleaned = $str;
00844                         $str = preg_replace($non_displayables, '', $str);
00845                 }
00846                 while ($cleaned != $str);
00847 
00848                 return $str;
00849         }
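
The sketch below shows the repeat-until-stable loop on literal control characters only. It is a reduced illustration, not the listing's behavior in full: strip_control_chars() is a hypothetical name, the character ranges are merged into one pattern, and the URL-encoded patterns are left out.

```php
<?php
// Hypothetical reduced version: strips literal control characters only,
// leaving tab (0x09), newline (0x0a) and carriage return (0x0d) in place.
function strip_control_chars($str)
{
    do {
        // Repeat until a pass changes nothing, as the listing does.
        $before = $str;
        $str = preg_replace('/[\x00-\x08\x0b\x0c\x0e-\x1f]/', '', $str);
    } while ($before !== $str);

    return $str;
}

echo strip_control_chars("Java\0script"), "\n";            // Javascript
echo json_encode(strip_control_chars("a\tb\x01c")), "\n"; // "a\tbc"
```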


CI_Input::_sanitize_globals (  ) 

Sanitize Globals.

This function does the following:

Unsets $_GET data (if query strings are not enabled)

Unsets all globals if register_globals is enabled

Standardizes newline characters to \n

private

Returns:
void

Definition at line 89 of file Input.php.

References _clean_input_data(), and log_message().

Referenced by CI_Input().

00090         {
00091                 // Would kind of be "wrong" to unset any of these GLOBALS
00092                 $protected = array('_SERVER', '_GET', '_POST', '_FILES', '_REQUEST', '_SESSION', '_ENV', 'GLOBALS', 'HTTP_RAW_POST_DATA',
00093                                                         'system_folder', 'application_folder', 'BM', 'EXT', 'CFG', 'URI', 'RTR', 'OUT', 'IN');
00094 
00095                 // Unset globals for security. 
00096                 // This is effectively the same as register_globals = off
00097                 foreach (array($_GET, $_POST, $_COOKIE, $_SERVER, $_FILES, $_ENV, (isset($_SESSION) && is_array($_SESSION)) ? $_SESSION : array()) as $global)
00098                 {
00099                         if ( ! is_array($global))
00100                         {
00101                                 if ( ! in_array($global, $protected))
00102                                 {
00103                                         unset($GLOBALS[$global]);
00104                                 }
00105                         }
00106                         else
00107                         {
00108                                 foreach ($global as $key => $val)
00109                                 {
00110                                         if ( ! in_array($key, $protected))
00111                                         {
00112                                                 unset($GLOBALS[$key]);
00113                                         }
00114                         
00115                                         if (is_array($val))
00116                                         {
00117                                                 foreach($val as $k => $v)
00118                                                 {
00119                                                         if ( ! in_array($k, $protected))
00120                                                         {
00121                                                                 unset($GLOBALS[$k]);
00122                                                         }
00123                                                 }
00124                                         }
00125                                 }
00126                         }
00127                 }
00128 
00129                 // Is $_GET data allowed? If not we'll set the $_GET to an empty array
00130                 if ($this->allow_get_array == FALSE)
00131                 {
00132                         $_GET = array();
00133                 }
00134                 else
00135                 {
00136                         $_GET = $this->_clean_input_data($_GET);
00137                 }
00138 
00139                 // Clean $_POST Data
00140                 $_POST = $this->_clean_input_data($_POST);
00141                 
00142                 // Clean $_COOKIE Data
00143                 $_COOKIE = $this->_clean_input_data($_COOKIE);
00144 
00145                 log_message('debug', "Global POST and COOKIE data sanitized");
00146         }


CI_Input::_sanitize_naughty_html ( matches  ) 

Sanitize Naughty HTML.

Callback function for xss_clean() to remove naughty HTML elements

private

Parameters:
array 
Returns:
string

Definition at line 879 of file Input.php.

00880         {
00881                 // encode opening brace
00882                 $str = '&lt;'.$matches[1].$matches[2].$matches[3];
00883                 
00884                 // encode captured opening or closing brace to prevent recursive vectors
00885                 $str .= str_replace(array('>', '<'), array('&gt;', '&lt;'), $matches[4]);
00886                 
00887                 return $str;
00888         }

CI_Input::CI_Input (  ) 

Constructor.

Sets whether to globally enable the XSS processing and whether to allow the $_GET array

public

Definition at line 63 of file Input.php.

References $CFG, _sanitize_globals(), load_class(), and log_message().

00064         {
00065                 log_message('debug', "Input Class Initialized");
00066 
00067                 $CFG =& load_class('Config');
00068                 $this->use_xss_clean    = ($CFG->item('global_xss_filtering') === TRUE) ? TRUE : FALSE;
00069                 $this->allow_get_array  = ($CFG->item('enable_query_strings') === TRUE) ? TRUE : FALSE;
00070                 $this->_sanitize_globals();
00071         }


CI_Input::cookie ( index = '',
xss_clean = FALSE 
)

Fetch an item from the COOKIE array.

public

Parameters:
string 
bool 
Returns:
string

Definition at line 306 of file Input.php.

References _fetch_from_array().

00307         {
00308                 return $this->_fetch_from_array($_COOKIE, $index, $xss_clean);
00309         }


CI_Input::filename_security ( str  ) 

Filename Security.

public

Parameters:
string 
Returns:
string

Definition at line 446 of file Input.php.

00447         {
00448                 $bad = array(
00449                                                 "../",
00450                                                 "./",
00451                                                 "<!--",
00452                                                 "-->",
00453                                                 "<",
00454                                                 ">",
00455                                                 "'",
00456                                                 '"',
00457                                                 '&',
00458                                                 '$',
00459                                                 '#',
00460                                                 '{',
00461                                                 '}',
00462                                                 '[',
00463                                                 ']',
00464                                                 '=',
00465                                                 ';',
00466                                                 '?',
00467                                                 "%20",
00468                                                 "%22",
00469                                                 "%3c",          // <
00470                                                 "%253c",        // <
00471                                                 "%3e",          // >
00472                                                 "%0e",          // shift-out control character
00473                                                 "%28",          // (  
00474                                                 "%29",          // ) 
00475                                                 "%2528",        // (
00476                                                 "%26",          // &
00477                                                 "%24",          // $
00478                                                 "%3f",          // ?
00479                                                 "%3b",          // ;
00480                                                 "%3d"           // =
00481                                         );
00482 
00483                 return stripslashes(str_replace($bad, '', $str));
00484         }
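
As a rough illustration of how the replacement list behaves, the sketch below applies a shortened list to two sample names. strip_unsafe_filename_parts() is a hypothetical name, the $bad array is deliberately abbreviated, and the sample inputs are invented.

```php
<?php
// Hypothetical shortened form of filename_security(); the real $bad list is longer.
function strip_unsafe_filename_parts($str)
{
    $bad = array('../', './', '<', '>', "'", '"', '&', '$', '#', ';', '?', '%20');

    // Each entry is removed in list order, then backslashes are stripped.
    return stripslashes(str_replace($bad, '', $str));
}

echo strip_unsafe_filename_parts('../../uploads/report;v2.txt'), "\n"; // uploads/reportv2.txt
echo strip_unsafe_filename_parts('my%20file<1>.png'), "\n";            // myfile1.png
```

Note that str_replace() makes a single pass per entry, so this is a filter for obviously unsafe fragments rather than a full path validator.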

CI_Input::get ( index = '',
xss_clean = FALSE 
)

Fetch an item from the GET array.

public

Parameters:
string 
bool 
Returns:
string

Definition at line 254 of file Input.php.

References _fetch_from_array().

00255         {
00256                 return $this->_fetch_from_array($_GET, $index, $xss_clean);
00257         }


CI_Input::get_post ( index = '',
xss_clean = FALSE 
)

Fetch an item from either the GET array or the POST array.

public

Parameters:
string The index key
bool XSS cleaning
Returns:
string

Definition at line 284 of file Input.php.

References get(), and post().

00285         {               
00286                 if ( ! isset($_POST[$index]) )
00287                 {
00288                         return $this->get($index, $xss_clean);
00289                 }
00290                 else
00291                 {
00292                         return $this->post($index, $xss_clean);
00293                 }               
00294         }
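
The lookup order (POST first, then GET) can be sketched without touching the superglobals. In the example below, get_post_item() is a hypothetical standalone function and both arrays are passed in explicitly; XSS filtering is omitted.

```php
<?php
// Hypothetical standalone version of the lookup order in get_post().
function get_post_item(array $post, array $get, $index)
{
    // POST wins whenever the key exists there, even if its value is empty.
    if (isset($post[$index])) {
        return $post[$index];
    }

    // Otherwise fall back to GET, or FALSE when neither array has the key.
    return isset($get[$index]) ? $get[$index] : false;
}

$post = array('id' => '7', 'q' => '');
$get  = array('id' => '9', 'page' => '2');

var_dump(get_post_item($post, $get, 'id'));   // string(1) "7"
var_dump(get_post_item($post, $get, 'q'));    // string(0) ""
var_dump(get_post_item($post, $get, 'page')); // string(1) "2"
var_dump(get_post_item($post, $get, 'none')); // bool(false)
```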


CI_Input::ip_address (  ) 

Fetch the IP Address.

public

Returns:
string

Definition at line 334 of file Input.php.

References server(), and valid_ip().

00335         {
00336                 if ($this->ip_address !== FALSE)
00337                 {
00338                         return $this->ip_address;
00339                 }
00340 
00341                 if ($this->server('REMOTE_ADDR') AND $this->server('HTTP_CLIENT_IP'))
00342                 {
00343                          $this->ip_address = $_SERVER['HTTP_CLIENT_IP'];
00344                 }
00345                 elseif ($this->server('REMOTE_ADDR'))
00346                 {
00347                          $this->ip_address = $_SERVER['REMOTE_ADDR'];
00348                 }
00349                 elseif ($this->server('HTTP_CLIENT_IP'))
00350                 {
00351                          $this->ip_address = $_SERVER['HTTP_CLIENT_IP'];
00352                 }
00353                 elseif ($this->server('HTTP_X_FORWARDED_FOR'))
00354                 {
00355                          $this->ip_address = $_SERVER['HTTP_X_FORWARDED_FOR'];
00356                 }
00357 
00358                 if ($this->ip_address === FALSE)
00359                 {
00360                         $this->ip_address = '0.0.0.0';
00361                         return $this->ip_address;
00362                 }
00363 
00364                 if (strstr($this->ip_address, ','))
00365                 {
00366                         $x = explode(',', $this->ip_address);
00367                         $this->ip_address = end($x);
00368                 }
00369 
00370                 if ( ! $this->valid_ip($this->ip_address))
00371                 {
00372                         $this->ip_address = '0.0.0.0';
00373                 }
00374                 
00375                 return $this->ip_address;
00376         }


CI_Input::post ( index = '',
xss_clean = FALSE 
)

Fetch an item from the POST array.

public

Parameters:
string 
bool 
Returns:
string

Definition at line 269 of file Input.php.

References _fetch_from_array().

Referenced by get_post().

00270         {
00271                 return $this->_fetch_from_array($_POST, $index, $xss_clean);
00272         }


CI_Input::server ( index = '',
xss_clean = FALSE 
)

Fetch an item from the SERVER array.

public

Parameters:
string 
bool 
Returns:
string

Definition at line 321 of file Input.php.

References _fetch_from_array().

Referenced by ip_address().

00322         {
00323                 return $this->_fetch_from_array($_SERVER, $index, $xss_clean);
00324         }


CI_Input::user_agent (  ) 

User Agent.

public

Returns:
string

Definition at line 425 of file Input.php.

00426         {
00427                 if ($this->user_agent !== FALSE)
00428                 {
00429                         return $this->user_agent;
00430                 }
00431 
00432                 $this->user_agent = ( ! isset($_SERVER['HTTP_USER_AGENT'])) ? FALSE : $_SERVER['HTTP_USER_AGENT'];
00433 
00434                 return $this->user_agent;
00435         }

CI_Input::valid_ip ( ip  ) 

Validate IP Address.

Updated version suggested by Geert De Deckere

public

Parameters:
string 
Returns:
bool

Definition at line 389 of file Input.php.

Referenced by ip_address().

00390         {
00391                 $ip_segments = explode('.', $ip);
00392 
00393                 // Always 4 segments needed
00394                 if (count($ip_segments) != 4)
00395                 {
00396                         return FALSE;
00397                 }
00398                 // IP can not start with 0
00399                 if (substr($ip_segments[0], 0, 1) == '0')
00400                 {
00401                         return FALSE;
00402                 }
00403                 // Check each segment
00404                 foreach ($ip_segments as $segment)
00405                 {
00406                         // IP segments must be digits and can not be 
00407                         // longer than 3 digits or greater than 255
00408                         if (preg_match("/[^0-9]/", $segment) OR $segment > 255 OR strlen($segment) > 3)
00409                         {
00410                                 return FALSE;
00411                         }
00412                 }
00413 
00414                 return TRUE;
00415         }
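
The checks in the listing cover dotted-quad IPv4 addresses only. The sketch below restates them in a hypothetical standalone function, is_valid_ipv4(), with an explicit integer cast for the range check; the sample addresses are invented.

```php
<?php
// Hypothetical standalone copy of the dotted-quad checks in valid_ip().
function is_valid_ipv4($ip)
{
    $segments = explode('.', $ip);

    // Exactly four segments, and the first must not start with "0".
    if (count($segments) !== 4 || substr($segments[0], 0, 1) === '0') {
        return false;
    }

    foreach ($segments as $segment) {
        // Digits only, at most three of them, value no greater than 255.
        if (preg_match('/[^0-9]/', $segment) || strlen($segment) > 3 || (int) $segment > 255) {
            return false;
        }
    }

    return true;
}

var_dump(is_valid_ipv4('192.168.1.10')); // bool(true)
var_dump(is_valid_ipv4('256.1.1.1'));    // bool(false), segment above 255
var_dump(is_valid_ipv4('0.1.2.3'));      // bool(false), starts with 0
var_dump(is_valid_ipv4('1.2.3'));        // bool(false), only three segments
```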


CI_Input::xss_clean ( str,
is_image = FALSE 
)

XSS Clean.

Sanitizes data so that cross-site scripting (XSS) attacks can be prevented. This function does a fair amount of work, but it is extremely thorough, designed to prevent even the most obscure XSS attempts. Nothing is ever 100% foolproof, of course, but I haven't been able to get anything past the filter.

Note: This function should only be used to deal with data upon submission.  It's not something that should be used for general runtime processing.

This function was based in part on some code and ideas I got from Bitflux: http://blog.bitflux.ch/wiki/XSS_Prevention

To help develop this script I used this great list of vulnerabilities along with a few other hacks I've harvested from examining vulnerabilities in other programs: http://ha.ckers.org/xss.html

public

Parameters:
string 
bool 
Returns:
string

Definition at line 514 of file Input.php.

References _remove_invisible_characters(), log_message(), and xss_hash().

Referenced by _clean_input_data(), and _fetch_from_array().

00515         {
00516                 /*
00517                  * Is the string an array?
00518                  *
00519                  */
00520                 if (is_array($str))
00521                 {
00522                         while (list($key) = each($str))
00523                         {
00524                                 $str[$key] = $this->xss_clean($str[$key]);
00525                         }
00526         
00527                         return $str;
00528                 }
00529 
00530                 /*
00531                  * Remove Invisible Characters
00532                  */
00533                 $str = $this->_remove_invisible_characters($str);
00534 
00535                 /*
00536                  * Protect GET variables in URLs
00537                  */
00538                  
00539                  // 901119URL5918AMP18930PROTECT8198
00540                  
00541                 $str = preg_replace('|\&([a-z\_0-9]+)\=([a-z\_0-9]+)|i', $this->xss_hash()."\\1=\\2", $str);
00542 
00543                 /*
00544                  * Validate standard character entities
00545                  *
00546                  * Add a semicolon if missing.  We do this to enable
00547                  * the conversion of entities to ASCII later.
00548                  *
00549                  */
00550                 $str = preg_replace('#(&\#?[0-9a-z]+)[\x00-\x20]*;?#i', "\\1;", $str);
00551 
00552                 /*
00553                  * Validate UTF16 two byte encoding (x00) 
00554                  *
00555                  * Just as above, adds a semicolon if missing.
00556                  *
00557                  */
00558                 $str = preg_replace('#(&\#x?)([0-9A-F]+);?#i',"\\1\\2;",$str);
00559 
00560                 /*
00561                  * Un-Protect GET variables in URLs
00562                  */
00563                 $str = str_replace($this->xss_hash(), '&', $str);
00564 
00565                 /*
00566                  * URL Decode
00567                  *
00568                  * Just in case stuff like this is submitted:
00569                  *
00570                  * <a href="http://%77%77%77%2E%67%6F%6F%67%6C%65%2E%63%6F%6D">Google</a>
00571                  *
00572                  * Note: Use rawurldecode() so it does not remove plus signs
00573                  *
00574                  */
00575                 $str = rawurldecode($str);
00576         
00577                 /*
00578                  * Convert character entities to ASCII 
00579                  *
00580                  * This permits our tests below to work reliably.
00581                  * We only convert entities that are within tags since
00582                  * these are the ones that will pose security problems.
00583                  *
00584                  */
00585 
00586                 $str = preg_replace_callback("/[a-z]+=([\'\"]).*?\\1/si", array($this, '_convert_attribute'), $str);
00587          
00588                 $str = preg_replace_callback("/<\w+.*?(?=>|<|$)/si", array($this, '_html_entity_decode_callback'), $str);
00589 
00590                 /*
00591                  * Remove Invisible Characters Again!
00592                  */
00593                 $str = $this->_remove_invisible_characters($str);
00594                 
00595                 /*
00596                  * Convert all tabs to spaces
00597                  *
00598                  * This prevents strings like this: ja  vascript
00599                  * NOTE: we deal with spaces between characters later.
00600                  * NOTE: preg_replace was found to be amazingly slow here on large blocks of data,
00601                  * so we use str_replace.
00602                  *
00603                  */
00604                 
00605                 if (strpos($str, "\t") !== FALSE)
00606                 {
00607                         $str = str_replace("\t", ' ', $str);
00608                 }
00609                 
00610                 /*
00611                  * Capture converted string for later comparison
00612                  */
00613                 $converted_string = $str;
00614                 
00615                 /*
00616                  * Not Allowed Under Any Conditions
00617                  */
00618                 
00619                 foreach ($this->never_allowed_str as $key => $val)
00620                 {
00621                         $str = str_replace($key, $val, $str);   
00622                 }
00623         
00624                 foreach ($this->never_allowed_regex as $key => $val)
00625                 {
00626                         $str = preg_replace("#".$key."#i", $val, $str);   
00627                 }
00628 
00629                 /*
00630                  * Makes PHP tags safe
00631                  *
00632                  *  Note: XML tags are inadvertently replaced too:
00633                  *
00634                  *      <?xml
00635                  *
00636                  * But it doesn't seem to pose a problem.
00637                  *
00638                  */
00639                 if ($is_image === TRUE)
00640                 {
00641                         // Images have a tendency to have the PHP short opening and closing tags every so often
00642                         // so we skip those and only do the long opening tags.
00643                         $str = str_replace(array('<?php', '<?PHP'),  array('&lt;?php', '&lt;?PHP'), $str);
00644                 }
00645                 else
00646                 {
00647                         $str = str_replace(array('<?php', '<?PHP', '<?', '?'.'>'),  array('&lt;?php', '&lt;?PHP', '&lt;?', '?&gt;'), $str);
00648                 }
00649                 
00650                 /*
00651                  * Compact any exploded words
00652                  *
00653                  * This corrects words like:  j a v a s c r i p t
00654                  * These words are compacted back to their correct state.
00655                  *
00656                  */
00657                 $words = array('javascript', 'expression', 'vbscript', 'script', 'applet', 'alert', 'document', 'write', 'cookie', 'window');
00658                 foreach ($words as $word)
00659                 {
00660                         $temp = '';
00661                         
00662                         for ($i = 0, $wordlen = strlen($word); $i < $wordlen; $i++)
00663                         {
00664                                 $temp .= substr($word, $i, 1)."\s*";
00665                         }
00666 
00667                         // We only want to do this when it is followed by a non-word character
00668                         // That way valid stuff like "dealer to" does not become "dealerto"
00669                         $str = preg_replace_callback('#('.substr($temp, 0, -3).')(\W)#is', array($this, '_compact_exploded_words'), $str);
00670                 }
00671                 
00672                 /*
00673                  * Remove disallowed Javascript in links or img tags
00674                  * We used to do some version comparisons and use of stripos for PHP5, but it is dog slow compared
00675                  * to these simplified non-capturing preg_match(), especially if the pattern exists in the string
00676                  */
00677                 do
00678                 {
00679                         $original = $str;
00680         
00681                         if (preg_match("/<a/i", $str))
00682                         {
00683                                 $str = preg_replace_callback("#<a\s*([^>]*?)(>|$)#si", array($this, '_js_link_removal'), $str);
00684                         }
00685         
00686                         if (preg_match("/<img/i", $str))
00687                         {
00688                                 $str = preg_replace_callback("#<img\s*([^>]*?)(>|$)#si", array($this, '_js_img_removal'), $str);
00689                         }
00690         
00691                         if (preg_match("/script/i", $str) OR preg_match("/xss/i", $str))
00692                         {
00693                                 $str = preg_replace("#<(/*)(script|xss)(.*?)>#si", '[removed]', $str);
00694                         }
00695                 }
00696                 while($original != $str);
00697 
00698                 unset($original);
00699 
00700                 /*
00701                  * Remove JavaScript Event Handlers
00702                  *
00703                  * Note: This code is a little blunt.  It removes
00704                  * the event handler and anything up to the closing >,
00705                  * but it's unlikely to be a problem.
00706                  *
00707                  */
00708                 $event_handlers = array('on\w*','xmlns');
00709 
00710                 if ($is_image === TRUE)
00711                 {
00712                         /*
00713                          * Adobe Photoshop puts XML metadata into JFIF images, including namespacing, 
00714                          * so we have to allow this for images. -Paul
00715                          */
00716                         unset($event_handlers[array_search('xmlns', $event_handlers)]);
00717                 }
00718                 
00719                 $str = preg_replace("#<([^><]+)(".implode('|', $event_handlers).")(\s*=\s*[^><]*)([><]*)#i", "<\\1\\4", $str);
00720                 
00721                 /*
00722                  * Sanitize naughty HTML elements
00723                  *
00724                  * If a tag containing any of the words in the list
00725                  * below is found, the tag gets converted to entities.
00726                  *
00727                  * So this: <blink>
00728                  * Becomes: &lt;blink&gt;
00729                  *
00730                  */
00731                 $naughty = 'alert|applet|audio|basefont|base|behavior|bgsound|blink|body|embed|expression|form|frameset|frame|head|html|ilayer|iframe|input|layer|link|meta|object|plaintext|style|script|textarea|title|video|xml|xss';
00732                 $str = preg_replace_callback('#<(/*\s*)('.$naughty.')([^><]*)([><]*)#is', array($this, '_sanitize_naughty_html'), $str);
00733 
00734                 /*
00735                  * Sanitize naughty scripting elements
00736                  *
00737                  * Similar to above, only instead of looking for
00738                  * tags it looks for PHP and JavaScript commands
00739                  * that are disallowed.  Rather than removing the
00740                  * code, it simply converts the parentheses to entities
00741                  * rendering the code un-executable.
00742                  *
00743                  * For example: eval('some code')
00744                  * Becomes:             eval&#40;'some code'&#41;
00745                  *
00746                  */
00747                 $str = preg_replace('#(alert|cmd|passthru|eval|exec|expression|system|fopen|fsockopen|file|file_get_contents|readfile|unlink)(\s*)\((.*?)\)#si', "\\1\\2&#40;\\3&#41;", $str);
00748                                         
00749                 /*
00750                  * Final clean up
00751                  *
00752                  * This adds a bit of extra precaution in case
00753                  * something got through the above filters
00754                  *
00755                  */
00756                 foreach ($this->never_allowed_str as $key => $val)
00757                 {
00758                         $str = str_replace($key, $val, $str);   
00759                 }
00760         
00761                 foreach ($this->never_allowed_regex as $key => $val)
00762                 {
00763                         $str = preg_replace("#".$key."#i", $val, $str);
00764                 }
00765 
00766                 /*
00767                  *  Images are Handled in a Special Way
00768                  *  - Essentially, after all of the character conversion is done, we want to know whether
00769                  *  any unwanted, likely XSS, code was found.  If not, we return TRUE, as the image is clean.
00770                  *  However, if the string post-conversion does not match the string post-removal of XSS,
00771                  *  then it fails, as there was unwanted XSS code found and removed/changed during processing.
00772                  */
00773 
00774                 if ($is_image === TRUE)
00775                 {
00776                         if ($str == $converted_string)
00777                         {
00778                                 return TRUE;
00779                         }
00780                         else
00781                         {
00782                                 return FALSE;
00783                         }
00784                 }
00785                 
00786                 log_message('debug', "XSS Filtering completed");
00787                 return $str;
00788         }
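
The "compact any exploded words" step in the listing above can be tried in isolation. The following is a minimal sketch, not the CI_Input implementation: compact_exploded_word() is a hypothetical standalone helper, the closure stands in for the _compact_exploded_words() callback (whose body is not shown in this section), and closures assume PHP 5.3 or later.

```php
<?php
// Hypothetical standalone helper for illustration; not part of CI_Input.
function compact_exploded_word($str, $word)
{
    // Build "j\s*a\s*v\s*a..." so that whitespace between letters is tolerated.
    $letters = array_map(function ($c) {
        return preg_quote($c, '#');
    }, str_split($word));
    $pattern = implode('\s*', $letters);

    // Require a trailing non-word character so that text such as
    // "dealer to" is not collapsed into "dealerto".
    return preg_replace_callback(
        '#(' . $pattern . ')(\W)#is',
        function ($m) {
            // Assumed callback behaviour: strip whitespace inside the
            // matched word and keep the trailing character unchanged.
            return preg_replace('/\s+/s', '', $m[1]) . $m[2];
        },
        $str
    );
}

echo compact_exploded_word('j a v a s c r i p t:alert(1)', 'javascript'), "\n";
// prints: javascript:alert(1)
```

Building the pattern with implode() avoids trimming a trailing "\s*" with substr(), which is what the listing does; the resulting regular expression is the same.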

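The "sanitize naughty scripting elements" step converts the parentheses of a disallowed call to HTML entities instead of removing the call. A minimal sketch of that single replacement follows; neutralize_calls() is a hypothetical name, and the function list is shortened from the one in the listing.

```php
<?php
// Hypothetical standalone helper for illustration; not part of CI_Input.
function neutralize_calls($str)
{
    // Shortened list of function names; the listing above uses a longer one.
    $names = 'alert|eval|exec|system';

    // \1 = function name, \2 = optional whitespace, \3 = argument text.
    // The parentheses are rewritten as &#40; and &#41;.
    return preg_replace(
        '#(' . $names . ')(\s*)\((.*?)\)#si',
        '\\1\\2&#40;\\3&#41;',
        $str
    );
}

echo neutralize_calls("eval('some code')"), "\n";
// prints: eval&#40;'some code'&#41;
```

Because (.*?) is non-greedy, the match stops at the first closing parenthesis, so nested calls are only partly rewritten.
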
CI_Input::xss_hash (  ) 

Random Hash for protecting URLs.

public

Returns:
string

Definition at line 798 of file Input.php.

Referenced by xss_clean().

00799         {
00800                 if ($this->xss_hash == '')
00801                 {
00802                         if (version_compare(PHP_VERSION, '4.2', '>='))
00803                                 mt_srand();
00804                         else
00805                                 mt_srand(hexdec(substr(md5(microtime()), -8)) & 0x7fffffff);
00806 
00807                         $this->xss_hash = md5(time() + mt_rand(0, 1999999999));
00808                 }
00809 
00810                 return $this->xss_hash;
00811         }
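
xss_hash() generates its value once and returns the cached copy on later calls. A minimal sketch of the same lazy-caching pattern follows. HashHolder is a hypothetical class, and random_bytes() (PHP 7 or later) is swapped in for the md5(time() + mt_rand()) expression in the listing; it is not what Input.php does.

```php
<?php
// Hypothetical class for illustration; not CI_Input.
class HashHolder
{
    private $hash = '';

    public function hash()
    {
        if ($this->hash === '') {
            // Swapped-in technique: 16 random bytes, hex-encoded to
            // 32 characters (the same length as an md5 digest).
            $this->hash = bin2hex(random_bytes(16));
        }

        // Later calls reuse the cached value.
        return $this->hash;
    }
}

$h = new HashHolder();
var_dump($h->hash() === $h->hash()); // bool(true)
```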


Member Data Documentation

CI_Input::$allow_get_array = FALSE

Definition at line 34 of file Input.php.

CI_Input::$ip_address = FALSE

Definition at line 32 of file Input.php.

CI_Input::$never_allowed_regex

Initial value:

 array(
                                                                                "javascript\s*:"        => '[removed]',
                                                                                "expression\s*\("       => '[removed]', // CSS and IE
                                                                                "Redirect\s+302"        => '[removed]'
                                                                        )

Definition at line 49 of file Input.php.
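
In xss_clean(), each key of this array is wrapped in "#" delimiters with the "i" flag and passed to preg_replace(). A minimal sketch of that loop outside the class follows; apply_never_allowed_regex() is a hypothetical helper name.

```php
<?php
// Same patterns as the initial value above, single-quoted so the
// backslashes reach the regex engine unchanged.
$never_allowed_regex = array(
    'javascript\s*:'  => '[removed]',
    'expression\s*\(' => '[removed]', // CSS and IE
    'Redirect\s+302'  => '[removed]',
);

// Hypothetical helper for illustration; not part of CI_Input.
function apply_never_allowed_regex($str, array $patterns)
{
    foreach ($patterns as $pattern => $replacement) {
        // Case-insensitive match, as in xss_clean().
        $str = preg_replace('#' . $pattern . '#i', $replacement, $str);
    }

    return $str;
}

echo apply_never_allowed_regex('<a href="JavaScript :alert(1)">x</a>', $never_allowed_regex), "\n";
// prints: <a href="[removed]alert(1)">x</a>
```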

CI_Input::$never_allowed_str

Initial value:

 array(
                                                                        'document.cookie'       => '[removed]',
                                                                        'document.write'        => '[removed]',
                                                                        '.parentNode'           => '[removed]',
                                                                        '.innerHTML'            => '[removed]',
                                                                        'window.location'       => '[removed]',
                                                                        '-moz-binding'          => '[removed]',
                                                                        '<!--'                          => '&lt;!--',
                                                                        '-->'                           => '--&gt;',
                                                                        '<![CDATA['                     => '&lt;![CDATA['
                                                                        )

Definition at line 37 of file Input.php.
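
These entries are applied with str_replace(), which is case-sensitive, so only the exact spellings listed are replaced. A minimal sketch with a subset of the array follows; apply_never_allowed_str() is a hypothetical helper name.

```php
<?php
// Subset of the initial value above, for brevity.
$never_allowed_str = array(
    'document.cookie' => '[removed]',
    '<!--'            => '&lt;!--',
    '-->'             => '--&gt;',
);

// Hypothetical helper for illustration; not part of CI_Input.
function apply_never_allowed_str($str, array $map)
{
    foreach ($map as $search => $replacement) {
        // Plain, case-sensitive substring replacement.
        $str = str_replace($search, $replacement, $str);
    }

    return $str;
}

echo apply_never_allowed_str('<!-- x -->document.cookie Document.Cookie', $never_allowed_str), "\n";
// prints: &lt;!-- x --&gt;[removed] Document.Cookie
```

The mixed-case "Document.Cookie" passes through unchanged in this sketch, which illustrates the case sensitivity of str_replace().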

CI_Input::$use_xss_clean = FALSE

Definition at line 30 of file Input.php.

CI_Input::$user_agent = FALSE

Definition at line 33 of file Input.php.

CI_Input::$xss_hash = ''

Definition at line 31 of file Input.php.


The documentation for this class was generated from the following file: