XSS exploits are similar to SQL injection, an earlier form of code injection. Like any code injection, an XSS attack exploits an application’s output function that references poorly sanitized user input. However, SQL injection targets the query function that interacts with the database, whereas XSS targets the HTML output function that sends data to the browser. The basic idea of XSS injection is to use special characters to cause the Web browser’s interpreters to switch from a data context to a code context [1]. For example, when an HTML page references a user input as data, an attacker might include the tag <script>, which invokes the JavaScript interpreter. If the application does not filter such special characters, the injection succeeds, and the attacker can perform exploits such as account hijacking, cookie poisoning, denial of service (DoS), and Web content manipulation. Typical input sources that attackers manipulate include HTML forms, cookies, URLs, and external files. Attackers often favor JavaScript, but other kinds of client-side scripts that browsers can interpret, such as VBScript and Flash, could also cause XSS.
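The switch from data context to code context can be illustrated with a minimal Python sketch. The template, the function names, and the payload below are hypothetical and chosen only for illustration; `html.escape` is the Python standard-library function that encodes HTML special characters.

```python
import html

# Hypothetical page fragment that references a user input as data.
PAGE_TEMPLATE = "<p>Search results for: {query}</p>"


def render_unsafe(query: str) -> str:
    # The input is copied into the HTML output verbatim, so any tag it
    # contains is interpreted by the browser as markup or script.
    return PAGE_TEMPLATE.format(query=query)


def render_escaped(query: str) -> str:
    # Special characters are encoded first, so the browser keeps treating
    # the input as text (data context).
    return PAGE_TEMPLATE.format(query=html.escape(query, quote=True))


payload = "<script>alert(document.cookie)</script>"
print(render_unsafe(payload))   # the <script> tag survives intact
print(render_escaped(payload))  # the tag is rendered harmless as &lt;script&gt;...
```

In the unsafe version the attacker-chosen `<script>` tag reaches the browser unchanged; in the escaped version the same characters arrive as entities and are displayed rather than executed.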

Injection methods

Depending on how HTML pages reference user inputs, XSS exploits can be broadly classified as reflected, stored, or DOM-based.

Reflected, or nonpersistent, XSS holes are present in a Web application server program that references accessed user input in the outgoing webpage. This type of XSS exploit is common in error messages and search results. The XSSed project (http://xssed.com) recently reported multiple reflected XSS holes in McAfee that attackers could exploit to trick users into downloading viruses.

Stored, or persistent, XSS holes exist when a server program stores user input containing injected code in a persistent data store such as a database and then references it in a webpage. Attacks against social networking sites commonly exploit this type of XSS flaw. An example is the Samy worm (www.securityfocus.com/brief/18), which, within less than 24 hours after its release on 4 October 2005, caused an exponential growth of friend lists for 1 million Myspace users, effectively creating a DoS attack.

Both reflected and stored XSS holes result from improper handling of user inputs in server-side scripts. In contrast, DOM-based XSS holes appear in the Web application when client-side scripts reference user inputs, dynamically obtained from the Document Object Model structure, without proper validation. Bugzilla’s bug 272620 (https://bugzilla.mozilla.org/show_bug.cgi?id=272620) is an example of a DOM-based XSS exploit.
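The server-side difference between reflected and stored holes can be sketched as follows. This is a toy model under stated assumptions: the handler names and the in-memory list standing in for a database are hypothetical. DOM-based holes are left out because they live in client-side scripts rather than in server code.

```python
from typing import List

# Stands in for a persistent data store such as a database table.
comment_store: List[str] = []


def search_page(query: str) -> str:
    # Reflected hole: the input comes straight back in the response
    # to the same request, for example in a "no results" message.
    return f"<p>No results for {query}</p>"


def post_comment(text: str) -> None:
    # Stored hole, step 1: the input is saved without sanitization.
    comment_store.append(text)


def comments_page() -> str:
    # Stored hole, step 2: every later visitor receives the saved input.
    items = "".join(f"<li>{comment}</li>" for comment in comment_store)
    return f"<ul>{items}</ul>"


payload = "<script>alert(1)</script>"
print(search_page(payload))   # payload reflected to the requester only
post_comment(payload)
print(comments_page())        # payload served to anyone viewing the page
```

The reflected payload has to be delivered to each victim, typically through a crafted link, whereas the stored payload is served to every visitor once it has been saved.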

XSS DEFENSES

XSS defenses can be broadly classified into four types: defensive coding practices, XSS testing, vulnerability detection, and runtime attack prevention. Table 1 compares various current techniques, each of which has strengths and weaknesses.

Defensive coding

Because XSS arises from the improper handling of inputs, using defensive coding practices that validate and sanitize inputs is the best way to eliminate XSS vulnerabilities [1,2]. Input validation ensures that user inputs conform to a required input format. There are four basic input sanitization options. Replacement and removal methods search for known bad characters (blacklist comparison); the former replaces them with nonmalicious characters, whereas the latter simply removes them. Escaping methods search for characters that have special meanings for client-side interpreters and remove those meanings. Restriction techniques limit inputs to known good inputs (whitelist comparison).

Checking blacklisted characters in the inputs is more scalable, but blacklist comparisons often fail because it is difficult to anticipate every attack signature variant. Whitelist comparisons are considered more secure, but they can result in the rejection of many unlisted valid inputs. OWASP has issued rules that define proper escaping schemes for inputs referenced in different HTML output locations [1]. For example, all three vulnerable statements in Figure 1 can be secured by applying proper escaping methods to the input variables at the proper places (escape() is a JavaScript library function that could encode HTML entities).
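The four sanitization options, and the trade-off between blacklists and whitelists, can be sketched in Python. This is a minimal illustration and not the code from Figure 1: the blacklist entries, the `[removed]` marker, the username pattern, and all function names are assumptions made for the example.

```python
import html
import re

# Hypothetical blacklist of known bad strings.
BLACKLIST = ["<script>", "</script>"]

# Hypothetical whitelist: short alphanumeric user names only.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,20}")


def sanitize_by_replacement(value: str) -> str:
    # Replacement: swap each blacklisted string for a harmless marker.
    for bad in BLACKLIST:
        value = value.replace(bad, "[removed]")
    return value


def sanitize_by_removal(value: str) -> str:
    # Removal: delete each blacklisted string (single pass per entry).
    for bad in BLACKLIST:
        value = value.replace(bad, "")
    return value


def sanitize_by_escaping(value: str) -> str:
    # Escaping: encode characters that are special to the HTML parser.
    return html.escape(value, quote=True)


def restrict_to_whitelist(value: str) -> str:
    # Restriction: accept only inputs that match a known good format.
    if USERNAME_PATTERN.fullmatch(value) is None:
        raise ValueError("input rejected by whitelist")
    return value


# Blacklists miss variants: a case change defeats the replacement filter,
# and a nested tag is reassembled by the removal filter.
print(sanitize_by_replacement("<SCRIPT>alert(1)</SCRIPT>"))
print(sanitize_by_removal("<scr<script>ipt>alert(1)</scr</script>ipt>"))

# Escaping neutralizes the same input regardless of the variant used.
print(sanitize_by_escaping("<scr<script>ipt>alert(1)</scr</script>ipt>"))
```

The whitelist check shows the opposite weakness: a legitimate name such as `O'Brien` is rejected by `restrict_to_whitelist` because the apostrophe is not in the allowed set.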

Vulnerability detection

Other XSS defenses focus on identifying vulnerabilities in server-side scripts. Static-analysis-based approaches can prove the absence of vulnerabilities, but they tend to generate many false positives. Recent approaches combine static analysis with dynamic analysis techniques to improve accuracy.

Static analysis. These techniques identify tainted inputs accessed from external data sources, track the flow of tainted data, and check whether any of it reaches sinks such as SQL statements and HTML output statements. Benjamin Livshits and Monica Lam used binary decision diagrams to apply points-to analysis to server-side scripts; their approach requires users to specify vulnerability patterns in Program Query Language [5]. Yichen Xie and Alex Aiken proposed a static analysis technique that obtains block and function summary information from symbolic execution. Static-analysis-based techniques quickly detect potential XSS vulnerabilities in source code and are relatively easy for security personnel to implement and adopt. However, they cannot check the correctness of input sanitization functions and, instead, generally assume that unhandled or unknown functions return unsafe data. These approaches also miss DOM-based XSS vulnerabilities because they do not target client-side scripts.

Static string analysis. Gary Wassermann and Zhendong Su enhanced the original taint-based approaches with string analysis [8]. Their technique uses context-free grammars (CFGs) to represent the values a string variable can hold at a certain program point, which facilitates the checking of blacklisted string values in sensitive program statements. The enhancement provides more accuracy because it can analyze the effects of string operations on inputs. However, when conducting static string analysis, it is difficult to model complex operations such as string-numeric interaction; thus, this approach can result in false positives if analysts make conservative approximations when handling such operations. Static string analysis also suffers from the limitations of blacklist comparisons.
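The taint-tracking idea behind the static analysis techniques above can be sketched on a toy straight-line program. This is a simplified illustration under stated assumptions, not the algorithm of any cited tool: the `Stmt` representation, the operation names, and the `TRUSTED_SANITIZERS` set are hypothetical, and branches, loops, and points-to information are ignored.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Stmt:
    op: str                      # "source", "assign", "call", or "sink"
    target: str = ""             # variable defined (unused for sinks)
    args: Tuple[str, ...] = ()   # variables read by the statement
    func: str = ""               # callee name for "call" statements


# Functions the analysis trusts to return safe data (an assumption:
# the analysis cannot verify that they actually sanitize correctly).
TRUSTED_SANITIZERS = {"html_escape"}


def find_tainted_sinks(program: List[Stmt]) -> List[int]:
    """Return the indices of sink statements reached by tainted data."""
    tainted: Set[str] = set()
    findings: List[int] = []
    for index, stmt in enumerate(program):
        reads_taint = any(arg in tainted for arg in stmt.args)
        if stmt.op == "source":
            tainted.add(stmt.target)          # external input is tainted
        elif stmt.op == "assign":
            if reads_taint:
                tainted.add(stmt.target)      # taint propagates by copy
            else:
                tainted.discard(stmt.target)
        elif stmt.op == "call":
            if stmt.func in TRUSTED_SANITIZERS:
                tainted.discard(stmt.target)
            else:
                # Conservative rule: unknown functions return unsafe data.
                tainted.add(stmt.target)
        elif stmt.op == "sink" and reads_taint:
            findings.append(index)
    return findings


program = [
    Stmt("source", "name"),
    Stmt("call", "safe", ("name",), "html_escape"),
    Stmt("sink", args=("safe",)),
    Stmt("call", "custom", ("name",), "my_filter"),
    Stmt("sink", args=("custom",)),
    Stmt("assign", "msg", ("name",)),
    Stmt("sink", args=("msg",)),
]
print(find_tainted_sinks(program))  # indices of the flagged sinks
```

The sink fed by the hypothetical `my_filter` is flagged even if that function sanitizes correctly, which is the kind of false positive the conservative assumption produces. Conversely, a broken function listed in `TRUSTED_SANITIZERS` would go unnoticed.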

March 2012

Note that the program’s control flow structure (line 9) dictates that the variable new tip must contain at least 100 characters, but the CFG for new tip results in the expression .* because Wassermann and Su’s approach cannot handle string-numeric interactions.

Combined static and dynamic analysis. Motivated by static-analysis-based approaches’ inability to identify faulty sanitization functions, Davide Balzarotti and colleagues developed the Saner tool, which checks the adequacy of sanitization functions for defending against XSS attacks [7]. This successor to Pixy uses a static string analysis method similar to that proposed by Wassermann and Su to first identify the potentially faulty sanitization methods, then simulates the identified methods with a set of test inputs that contain attack strings and checks whether any attack could still reach the sinks.

Lam and colleagues carried out points-to analysis to track the flow of tainted data in a program and then used this information to instrument the program for model-checking purposes [5]. Applying the QED model checker based on Java Pathfinder (http://babelfish.arc.nasa.gov/trac/jpf), they simulated the instrumented program with inputs likely to lead to a match with user-specified vulnerability patterns. This approach’s effectiveness depends on the completeness of the vulnerability specifications and QED’s ability to explore as many different paths as possible.

Building on the work by Wassermann and colleagues, a team led by Adam Kiezun used concolic (concrete + symbolic) execution to capture program path constraints and a constraint solver to generate test inputs that explored various program paths [9]. Upon reaching the sinks, they exercised two sets of inputs, one of ordinary valid strings and the other of attack strings from a library (http://ha.ckers.org/xss.html), and checked the differences between the resulting program behaviors.
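The dynamic step of such combined approaches, running a suspect sanitization function against attack strings and checking what would still reach the sink, can be sketched as follows. This is an illustration of the general idea only, not Saner’s implementation: the attack list is a small hypothetical sample (not the library cited above), and the regular-expression oracle is a deliberately crude assumption.

```python
import html
import re
from typing import Callable, Iterable, List

# Hypothetical sample of attack strings for illustration.
ATTACK_STRINGS = [
    "<script>alert(1)</script>",
    "<SCRIPT>alert(1)</SCRIPT>",
    "<scr<script>ipt>alert(1)</scr</script>ipt>",
    "<img src=x onerror=alert(1)>",
]

# Crude oracle (an assumption): output is considered dangerous if it
# still contains something that looks like the start of an HTML tag.
TAG_OPENER = re.compile(r"<\s*[A-Za-z/]")


def surviving_attacks(
    sanitizer: Callable[[str], str], attacks: Iterable[str]
) -> List[str]:
    """Return the attack strings whose sanitized form still looks dangerous."""
    return [a for a in attacks if TAG_OPENER.search(sanitizer(a))]


def weak_sanitizer(value: str) -> str:
    # Faulty sanitization: removes only the exact lowercase script tags.
    return value.replace("<script>", "").replace("</script>", "")


def escaping_sanitizer(value: str) -> str:
    # Adequate for HTML body context: encodes all special characters.
    return html.escape(value, quote=True)


print(surviving_attacks(weak_sanitizer, ATTACK_STRINGS))
print(surviving_attacks(escaping_sanitizer, ATTACK_STRINGS))
```

With this sample, the weak sanitizer stops only the plain lowercase `<script>` string, while the uppercase, nested, and `<img>` variants still reach the sink; the escaping sanitizer lets none through. A static pass would first narrow down which functions deserve this kind of testing.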
