On Web Development: August 2008

Tuesday, August 26, 2008

Quoting Strings in SQLite with PHP

Unlike MySQL, SQLite follows the quoting standards in SQL strictly and does not understand the backslash \ as an escape character. SQLite only understands escaping a single quote with another single quote.

For example, if you receive the input data 'cheeky ' string' and use the PHP function addslashes() to escape literal characters in the string, you will get 'cheeky \' string', which according to SQLite is not escaped properly. You need to escape the string so that it looks like 'cheeky '' string'.

If you have magic_quotes turned on then you are in even more trouble. This PHP setting escapes all HTTP variables received by PHP with an equivalent of addslashes(). So the correct way to escape strings in SQLite would be:

function sqlite_quote_string($str) {
 if (get_magic_quotes_gpc()) {
  $str = stripslashes($str);
 }
 return sqlite_escape_string($str);
}

This will remove the escape characters added by the magic_quotes setting, and then escape the string with SQLite's sqlite_escape_string() function, which correctly escapes each single quote by doubling it ('').
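To see what the doubling amounts to, here is a minimal sketch that does it by hand. The function name sqlite_double_quotes is made up for this illustration and is not part of any extension; in real code the library's own escaping function is the safer choice, since this sketch only handles the single quote.

```php
<?php
// Hypothetical helper, for illustration only: double every single quote,
// which is the only string escaping that SQLite understands.
function sqlite_double_quotes($str) {
    return str_replace("'", "''", $str);
}

// Raw input, as it would arrive with magic_quotes turned off.
$input = "cheeky ' string";

// Wrap the escaped value in single quotes to form an SQL string literal.
echo "'" . sqlite_double_quotes($input) . "'\n"; // prints 'cheeky '' string'

// Input that already went through addslashes() has to be un-slashed first.
$slashed = addslashes($input);
echo "'" . sqlite_double_quotes(stripslashes($slashed)) . "'\n"; // prints 'cheeky '' string'
```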

Creating a custom SQLite Function in PHP

SQLite is available in PHP5 either by compiling PHP5 with SQLite support or by enabling the SQLite extension dynamically from the PHP configuration (php.ini). A distinct feature of SQLite is that it is an embedded database, and thus it offers some features that a client-server database such as the popular MySQL doesn't.

Creating Custom Functions in SQLite

One of the really cool features of SQLite in PHP is that you can create custom PHP functions that will be called by SQLite in your queries. Thus you can extend the SQLite functions using PHP.

A custom Regexp function for SQLite in PHP

// create a regex match function for sqlite
sqlite_create_function($db, 'REGEX_MATCH', 'sqlite_regex_match', 2);
function sqlite_regex_match($str, $regex) {
 if (preg_match($regex, $str, $matches)) {
  return $matches[0];
 }
 return false;
}

The above PHP code will create a custom function called REGEX_MATCH for the SQLite connection referenced by $db. The REGEX_MATCH SQLite function is implemented by the sqlite_regex_match user function we define in PHP.

Here is an example query that makes use of the custom function we created. Notice that in the SQLite query, we call our custom function REGEX_MATCH:

$query = 'SELECT REGEX_MATCH(link, \'|http://[^/]+/|i\') AS domain, link, COUNT(link) AS total'
 .' FROM links WHERE domain != 0'
 .' GROUP BY domain'
 .' LIMIT 10';
$result = sqlite_query($db, $query);

This will make SQLite call the PHP function sqlite_regex_match for each database table row that it goes over when performing the select query, sending it the link field value as the first parameter and the regular expression string as the second parameter. PHP will then process the function and return its result to SQLite, which continues to the next table row.
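The per-row behaviour can be tried without a database. The sketch below calls the same callback on a few made-up link values, one call per "row", the way SQLite would. The sample links are assumptions for illustration only.

```php
<?php
// Same callback as registered with sqlite_create_function() in the post.
function sqlite_regex_match($str, $regex) {
    if (preg_match($regex, $str, $matches)) {
        return $matches[0];
    }
    return false;
}

// Hypothetical values of the link column, one per table row.
$links = array(
    'http://example.com/page.html',
    'HTTP://Example.org/a/b',
    'not a link',
);

// Simulate SQLite invoking the callback once for each row.
$domains = array();
foreach ($links as $link) {
    $domains[] = sqlite_regex_match($link, '|http://[^/]+/|i');
}

var_dump($domains);
```

Rows whose link does not match the pattern come back as false, which is why the query filters on the domain value.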

Custom Functions in SQLite compared to MySQL

In comparison, you cannot create a custom function in PHP that MySQL will use. MySQL allows creation of custom functions, but they have to be written in SQL as stored functions (or in C/C++ as compiled user-defined functions). Thus you cannot extend MySQL's query functionality with PHP.

I believe the reason for this is simply that having the database call back to a function on the client, over a client-server connection, for each row that has to be processed would be inefficient. Imagine processing 100,000 rows in a MySQL database and having MySQL make a callback to PHP over a TCP connection for each one: the overhead of sending the data back and forth would be way too much.
With an embedded database like SQLite this isn't the case, since communication between the language and the embedded database does not carry such a high overhead.

Monday, August 25, 2008

PHP Email Address validation through SMTP

Here is a PHP class written for PHP4 and PHP5 that will validate email addresses by querying the SMTP (Simple Mail Transfer Protocol) server. This is meant to complement validation of the syntax of the email address, which should be done first because validating the email via SMTP consumes more time and resources.

Update: Sept 8, 2008
The class has been updated to work with Windows MTAs such as Hotmail, and many other fixes have been made. See changes. The class will no longer get you blacklisted by Hotmail due to an improper HELO procedure.
Update: Sept 10, 2008
Windows support added through Net_DNS (the PEAR DNS class). Added support for validating multiple emails on the same domain through a single socket. Improved the email parsing to support literal @ signs.
Update: Sept 29, 2008
The code for this project has been moved to Google Code. The latest source can be grabbed from SVN.
Update: Nov 22, 2008
SMTP Email Validation Class has been added to the Yii PHP Framework. http://www.yiiframework.com/. Yii is a high-performance component-based PHP framework for developing large-scale Web applications.

<?php
 
 /**
 * Validate Email Addresses Via SMTP
 * This queries the SMTP server to see if the email address is accepted.
 * @copyright http://creativecommons.org/licenses/by/2.0/ - Please keep this comment intact
 * @author gabe@fijiwebdesign.com
 * @contributors adnan@barakatdesigns.net
 * @version 0.1a
 */
class SMTP_validateEmail {

 /**
  * PHP Socket resource to remote MTA
  * @var resource $sock 
  */
 var $sock;

 /**
  * Current User being validated
  */
 var $user;
 /**
  * Current domain where user is being validated
  */
 var $domain;
 /**
  * List of domains to validate users on
  */
 var $domains;
 /**
  * SMTP Port
  */
 var $port = 25;
 /**
  * Maximum Connection Time to an MTA 
  */
 var $max_conn_time = 30;
 /**
  * Maximum time to read from socket
  */
 var $max_read_time = 5;
 
 /**
  * username of sender
  */
 var $from_user = 'user';
 /**
  * Host Name of sender
  */
 var $from_domain = 'localhost';
 
 /**
  * Nameservers to use when make DNS query for MX entries
  * @var Array $nameservers 
  */
 var $nameservers = array(
 '192.168.0.1'
);
 
 var $debug = false;

 /**
  * Initializes the Class
  * @return SMTP_validateEmail Instance
  * @param $email Array[optional] List of Emails to Validate
  * @param $sender String[optional] Email of validator
  */
 function SMTP_validateEmail($emails = false, $sender = false) {
  if ($emails) {
   $this->setEmails($emails);
  }
  if ($sender) {
   $this->setSenderEmail($sender);
  }
 }
 
 function _parseEmail($email) {
  $parts = explode('@', $email);
  $domain = array_pop($parts);
  $user = implode('@', $parts);
  return array($user, $domain);
 }
 
 /**
  * Set the Emails to validate
  * @param $emails Array List of Emails
  */
 function setEmails($emails) {
  foreach($emails as $email) {
   list($user, $domain) = $this->_parseEmail($email);
   if (!isset($this->domains[$domain])) {
    $this->domains[$domain] = array();
   }
   $this->domains[$domain][] = $user;
  }
 }
 
 /**
  * Set the Email of the sender/validator
  * @param $email String
  */
 function setSenderEmail($email) {
  $parts = $this->_parseEmail($email);
  $this->from_user = $parts[0];
  $this->from_domain = $parts[1];
 }
 
 /**
  * Validate Email Addresses
  * @param Array $emails Emails to validate (recipient emails)
  * @param String $sender Sender's Email
  * @return Array Associative List of Emails and their validation results
  */
 function validate($emails = false, $sender = false) {
  
  $results = array();

  if ($emails) {
   $this->setEmails($emails);
  }
  if ($sender) {
   $this->setSenderEmail($sender);
  }

  // query the MTAs on each Domain
  foreach($this->domains as $domain=>$users) {
   
  $mxs = array();
  
   // retrieve SMTP Server via MX query on domain
   list($hosts, $mxweights) = $this->queryMX($domain);

   // retrieve MX priorities
   for($n=0; $n < count($hosts); $n++){
    $mxs[$hosts[$n]] = $mxweights[$n];
   }
   asort($mxs);
 
   // last fallback is the original domain, keyed by host name like the MX entries
   $mxs[$domain] = 0;
   
   $this->debug(print_r($mxs, 1));
   
   // $mxs always holds at least the fallback domain, so this never divides by zero
   $timeout = $this->max_conn_time/count($mxs);
    
   // try each host, lowest MX preference first
   foreach($mxs as $host => $weight) {
    // connect to SMTP server
    $this->debug("try $host:$this->port\n");
    if ($this->sock = fsockopen($host, $this->port, $errno, $errstr, (float) $timeout)) {
     stream_set_timeout($this->sock, $this->max_read_time);
     break;
    }
   }
  
   // did we get a TCP socket
   if ($this->sock) {
    $reply = fread($this->sock, 2082);
    $this->debug("<<<\n$reply");
    
    preg_match('/^([0-9]{3}) /ims', $reply, $matches);
    $code = isset($matches[1]) ? $matches[1] : '';
 
    if($code != '220') {
     // MTA gave an error...
     foreach($users as $user) {
      $results[$user.'@'.$domain] = false;
     }
     // release the socket before moving on to the next domain
     fclose($this->sock);
     continue;
    }

    // say helo
    $this->send("HELO ".$this->from_domain);
    // tell of sender (no space after the colon, per the SMTP command syntax)
    $this->send("MAIL FROM:<".$this->from_user.'@'.$this->from_domain.">");
    
    // ask for each recipient on this domain
    foreach($users as $user) {
    
     // ask of recipient
     $reply = $this->send("RCPT TO:<".$user.'@'.$domain.">");
     
      // get code and msg from response
     preg_match('/^([0-9]{3}) /ims', $reply, $matches);
     $code = isset($matches[1]) ? $matches[1] : '';
  
     if ($code == '250') {
      // you received 250 so the email address was accepted
      $results[$user.'@'.$domain] = true;
     } elseif ($code == '451' || $code == '452') {
      // you received 451 or 452, so the email address was greylisted (or some temporary error occurred on the MTA) - so assume it is ok
      $results[$user.'@'.$domain] = true;
     } else {
      $results[$user.'@'.$domain] = false;
     }
    
    }
    
    // quit
    $this->send("quit");
    // close socket
    fclose($this->sock);
   
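   } else {
    // no MTA could be reached, so none of these users can be verified
    foreach($users as $user) {
     $results[$user.'@'.$domain] = false;
    }
   }
  }
  return $results;
 }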


 function send($msg) {
  fwrite($this->sock, $msg."\r\n");

  $reply = fread($this->sock, 2082);

  $this->debug(">>>\n$msg\n");
  $this->debug("<<<\n$reply");
  
  return $reply;
 }
 
 /**
  * Query DNS server for MX entries
  * @param String $domain Domain to look up
  * @return Array List of MX hosts and the matching list of MX weights
  */
 function queryMX($domain) {
  $hosts = array();
  $mxweights = array();
  if (function_exists('getmxrr')) {
   getmxrr($domain, $hosts, $mxweights);
  } else {
   // windows, we need Net_DNS
   require_once 'Net/DNS.php';

   $resolver = new Net_DNS_Resolver();
   $resolver->debug = $this->debug;
   // nameservers to query
   $resolver->nameservers = $this->nameservers;
   $resp = $resolver->query($domain, 'MX');
   if ($resp) {
    foreach($resp->answer as $answer) {
     $hosts[] = $answer->exchange;
     $mxweights[] = $answer->preference;
    }
   }
  }
  return array($hosts, $mxweights);
 }
 
 /**
  * Simple function to replicate PHP 5 behaviour. http://php.net/microtime
  */
 function microtime_float() {
  list($usec, $sec) = explode(" ", microtime());
  return ((float)$usec + (float)$sec);
 }

 function debug($str) {
  if ($this->debug) {
   echo htmlentities($str);
  }
 }

}

 
?>

Using the PHP SMTP Email Address Validation Class

Example Usage:

// the email to validate
$email = 'joe@gmail.com';
// an optional sender
$sender = 'user@example.com';
// instantiate the class
$SMTP_Valid = new SMTP_validateEmail();
// do the validation; validate() expects an array of emails
$results = $SMTP_Valid->validate(array($email), $sender);
// view results (an array keyed by email address)
var_dump($results);
$valid = !empty($results[$email]);
echo $email.' is '.($valid ? 'valid' : 'invalid')."\n";

// send email? 
if ($valid) {
  //mail(...);
}

Code Status

This is a very basic, alpha version of this PHP class. I just wrote it to demonstrate an example. The first release had a few limitations. First, it was not optimized: each email you verified created a new MX DNS query and a new TCP connection to the SMTP server, and neither was cached for the next query, even if it went to the same host or the same SMTP server.
Second, it only worked on Linux, because Windows did not have the DNS function needed. The Sept 10 update noted above addresses both points: emails on the same domain now share a single socket, and the PEAR Net_DNS library supplies the DNS queries on Windows.

Limitations of verifying via SMTP

Not all SMTP servers are configured to tell you that an email address does not exist on the server. If the SMTP server responds with an "OK", it does not mean that the email address exists. It only means that the SMTP server will accept mail for that address and not bounce it. What it does with the actual email is another matter: it may deliver it to the recipient, or it may just send it to a black hole.
If you get an invalid response from the SMTP server, however, you can be pretty sure your email will bounce if you actually send it.
You should also NOT use this class to try to guess email addresses for spamming purposes. You will quickly get blacklisted on Spamhaus or a similar list.

Good uses of verifying via SMTP

If you have forms, such as registration forms, where users enter their email addresses, it may be a good idea to first check the syntax of the email address to see if it is valid as per the SMTP protocol specifications. Then, if it is valid, you may want to verify that the email will be accepted (will not bounce). This allows you to notify the user of a problem with their email address in case they made a typo or knowingly entered an invalid email. This could increase the number of successful registrations.
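A cheap syntax check can act as the gate in front of the slower SMTP check. The sketch below uses PHP's built-in filter_var() (available since PHP 5.2) for that. The function name has_valid_email_syntax and the sample addresses are assumptions for illustration, and filter_var() is a practical approximation rather than a full implementation of the mail specifications.

```php
<?php
// Hypothetical helper: true when the address is syntactically plausible.
function has_valid_email_syntax($email) {
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

// Made-up form input: one good address and two typos.
$candidates = array('joe@gmail.com', 'joe@@gmail.com', 'no-at-sign');

foreach ($candidates as $candidate) {
    if (has_valid_email_syntax($candidate)) {
        // Only addresses that get this far are worth a DNS query and an SMTP connection.
        echo "$candidate passes the syntax check, worth an SMTP check\n";
    } else {
        echo "$candidate fails the syntax check, skip the SMTP check\n";
    }
}
```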

How it works

If you're interested in how it works, it is quite simple. The class first takes an email and separates it into the user and host portions. The host portion tells us which domain to send the email to. However, a domain may have its SMTP server on a different domain, so we retrieve the list of SMTP servers available for the domain by doing a DNS query of type MX on that domain. We then iterate through the list, trying to connect to each server in turn. Once connected, we send SMTP commands to the server: first "HELO", then our sender, then our recipient. If the recipient is rejected, we know that actually sending an email will fail. Finally, we send "QUIT" and close the TCP connection to the SMTP server.
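The steps can be sketched without touching the network. The helper names below (split_email, smtp_commands, smtp_reply_code) are made up for this illustration and are not methods of the class, and the sample reply strings are assumptions about what a server might send.

```php
<?php
// Split an address into user and host at the LAST @, so a literal @ in the user part survives.
function split_email($email) {
    $pos = strrpos($email, '@');
    if ($pos === false) {
        return array($email, '');
    }
    return array(substr($email, 0, $pos), substr($email, $pos + 1));
}

// The SMTP commands sent, in order, once the connection is open.
function smtp_commands($helo_host, $sender, $recipient) {
    return array(
        'HELO ' . $helo_host,
        'MAIL FROM:<' . $sender . '>',
        'RCPT TO:<' . $recipient . '>',
        'QUIT',
    );
}

// Pull the three-digit status code off the front of a server reply.
function smtp_reply_code($reply) {
    return preg_match('/^([0-9]{3})[ -]/', $reply, $m) ? $m[1] : '';
}

list($user, $host) = split_email('joe@gmail.com');
echo "user=$user host=$host\n";

foreach (smtp_commands('localhost', 'user@localhost', 'joe@gmail.com') as $command) {
    echo $command . "\n";
}

// A 250 reply to RCPT TO means the recipient was accepted; a 550 means it was rejected.
echo smtp_reply_code("250 2.1.5 OK\r\n") . "\n";
echo smtp_reply_code("550 5.1.1 No such user\r\n") . "\n";
```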

Thursday, August 21, 2008

XSS (Cross Site Scripting) and stealing passwords

Most web developers think of XSS (Cross Site Scripting) as the stealing of users' session cookies by injecting JavaScript into a web page through the URL. You do not usually associate it with stealing passwords, but it can do worse than steal session cookies: it can take a user's username and password directly from the browser.

Many users choose to have the browser remember their login credentials. So whenever they visit a login form, the username and password fields are pre-populated by the browser. Now if there is an XSS vulnerability on that login page, a remote attacker can retrieve the user's username and password.

Hello World in XSS

You have a page that has an XSS vulnerability. Let's say a website has a PHP page, mypage.php, with the code:

<?php

// the variable is returned raw to the browser
echo $_GET['name'];

?>

Because the variable $_GET['name'] is not encoded into HTML entities, or stripped of HTML, it has an XSS vulnerability. Now all an attacker has to do is create a URL that a victim will click, that exploits the vulnerability.

mypage.php?name=%3Cscript%3Ealert(document.cookie);%3C/script%3E

This basically will make PHP write <script>alert(document.cookie);</script> onto the page, which displays a modal dialog with the value of the saved cookies for that domain.
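For contrast, here is a minimal sketch of the same value encoded into HTML entities before it is written out, using PHP's htmlspecialchars(). The helper name html_escape is made up for this illustration; the payload is the one from the URL above.

```php
<?php
// Hypothetical helper: encode a request value before echoing it into HTML.
function html_escape($value) {
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// The payload from the exploit URL, decoded the way PHP decodes $_GET values.
$name = urldecode('%3Cscript%3Ealert(document.cookie);%3C/script%3E');

// The browser now shows the text of the tag instead of running it.
echo html_escape($name), "\n";
// prints &lt;script&gt;alert(document.cookie);&lt;/script&gt;
```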

How Does stealing passwords with XSS work?

The example above displays the cookies for the domain the webpage is on. Now imagine the same page has a login form, and the user chose to have their password remembered by the browser. Let's say the PHP page looks like this:

<?php

// the variable is returned raw to the browser
echo $_GET['name'];

?>

<form action="login.php">
<input type="text" name="username" />
<input type="password" name="password" />
<input type="submit" value="Login" />
</form>

Now an attacker just needs to craft a URL that retrieves the username and password. Here is an example that retrieves the password:

mypage.php?name=%3Cscript%3Ewindow.onload=function(){alert(document.forms[0].password.value);}%3C/script%3E

As you can see, it is just a normal XSS exploit, except it is applied to the username and password populated by the browser after the window.onload event.

Password stealing XSS vs Session Cookie stealing XSS

Well, they both suck from a developer's perspective. According to Wikipedia, 70% or so of websites are vulnerable to XSS attacks.

As a developer, I've always thought of XSS as an exploit on a user's session, just like CSRF/XSRF (Cross Site Request Forgery), which requires an active session. Now, as you can see, XSS of the type described does NOT require an active session. The user does not have to be logged into the site. They could have logged out 10 years ago, but as long as the browser remembers their login credentials, the XSS exploit can steal those credentials.

Because it can be executed without the user being logged into a website, this exploit should be regarded as worse than session-based XSS.

Proof of Concept

Fill in the form below with dummy values and click the "Login" button.

Now return to the same page, to simulate logging out. Now click the Exploit. This will simulate an XSS exploit on this page, and alert the saved password.

I've set up a proof of concept based on an actual XSS exploit here: http://xss-password.appjet.net/.

Preventing Stealing Passwords via XSS

The only way I can think of right now is to give your username and password fields unique names so that the browser does not remember their values. In PHP you can do this by hashing the time() value together with a random number and a secret, for example:

<input type="password" name="pass[<?php echo sha1(time().rand().'secret'); ?>]" />

The unique names prevent the browser from remembering the password field. This should work in all browsers.
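Because the field name changes on every request, the login script cannot look the password up by a fixed key. A minimal sketch of one way to handle that is below: since the field is named pass[...], PHP delivers it as an array with a single entry, so the script can take the first value. The function names and the sample password are assumptions for illustration.

```php
<?php
// Hypothetical helper: build the unpredictable part of the field name.
function random_field_suffix($secret) {
    return sha1(time() . rand() . $secret);
}

// Hypothetical helper: read the password from posted data without knowing the suffix.
function read_password($post) {
    if (!isset($post['pass']) || !is_array($post['pass'])) {
        return null;
    }
    // pass[<suffix>] arrives as an array with one entry; take its first value.
    $value = reset($post['pass']);
    return is_string($value) ? $value : null;
}

// Simulate what $_POST would contain after the form is submitted.
$suffix = random_field_suffix('secret');
$post = array('pass' => array($suffix => 'hunter2'));

echo read_password($post), "\n"; // prints hunter2
```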