Projects:Pascal-Louis Perez, Josh Wiseman
OverviewThe web 2.0 trend has shifted users' expectations: web applications have to be as interactive as typical desktop applications. Old refresh-style webmails have been wiped out by GMail, Windows Live Mail or Yahoo! Mail. In this setting, JavaScript is becoming a crucial web development language and is getting attention of the research community. Recently, Siek and Taha used JavaScript as a mean to introduce the concept of Gradual Typing. Drossopoulou et al. are proposing techniques for doing type inferencing in JavaScript. JavaScript is a widely used language which is used for complex AJAX-driven applications, and starting to be used as a server-side scripting language (Zimki). Our goal is to create a JavaScript compiler and use it as an analysis tool. Static JavaScript analysis is an interesting topic, given the quirks of the language and the lack of open source analysis tools. As a proof of concept, the first analysis we will implement is the identification of browser incompatibilities, using as much static information as possible. This analysis would help cut back the annoyances of cross-browser development, especially as JavaScript code bases continue to increase in size. The compiler's design is meant to be simple, clean and modular allowing other people to extend it easily. Browser IncompatibilitiesThe JavaScript language has been implemented by various organizations in different browser versions. Early in the language's existence, no standard was set forth and implementors pionneered APIs. Typically, the same semantic interaction (such as 'get the width of the displayed page') are accessible in different ways. Programmers use control-flow structures to dynamically select which value to access depending on the environment in which the code is ran. These techniques use a common feature of JavaScript that returns a special 'undefined' value when accessing non-existing properties. In comparison, using unsafe Java reflection one can call a non-existing property (or method) but the runtime throws a NoSuchFieldException (respectively NoSuchMethodException). For instance, the inner width viewport property is supported differently on Internet Explorer (IE) version 5, IE6, Mozilla (Firefox), and Safari. Here is typical browser independent function to get the inner width:
Here, the first Other browser detection techniques are used such as arithmetic combination of partial results, use of exceptions to alter the control-flow or context sensitive dynamic script loading. Our tool is not designed to detect or handle these. Browser Compatibility VerificationCompiler StructureOur compiler is structured as a standard optimizing compiler. Parsing is delegated to Rhino which produces a very raw abstract-syntax tree, difficult to analyse and to reason about. This raw AST is thus transformed into a clean AST resembling the concrete syntax of a typical functional programming language. For efficiency, this clean AST is never created in memory but we instead use pattern matching to give an abstract view of this AST. With the help of this representation, a control-flow graph per function is created. The control-flow graph has one instruction per basic block to ease the implementation of optimizations. A dataflow analyis framework has been written and constant folding and dead-code elimination have been implemented. These two optimizations greatly reduce the size of the CFG due to the simple generation phase we use and expose additional expressions for analysis. In addition to this interprocedural analysis framework, we have implemented a call graph factory and a tailored intraprocedural analysis. AnalysesIn this section, we detail all the analyses used for the browser compatibility verification. SelectionsA "selection" is a field or method request in JavaScript, for example:
Selections are the basic unit in our browser compatibility matrices. Selection Folding (intraprocedural, forward data-flow analysis)This analysis folds selections to make them apparent at any point in the program. It is comparable to constant folding.
Program-Point Context (intraprocedural, forward data-flow analysis)We start by computing the set of browsers supported by every program point.
CFG (or Function) Context (intraprocedural, backward data-flow analysis)Using program-point context information (context(n)) we backward-propagate the information and make a summary for the control-flow graph. The result of this analysis is thus only interesting at the input of entry node of the CFG.
Global Context (interprocedural)The final analysis considers the whole program and finalises the results gathered by the previous CFG context analysis. If a function Example: the program consists of four functions m1 through m4. After the intraprocedural CFG context analysis, the function m1 is reported compatible with {A,B,C}, m2 compatible with {A,B,C}, m3 compatible with {B} and finally m4 compatible with {A,B}. ![]() Since m4 calls m3, it is as compatible as m3 thus only compatible with {B}. This result propagates to m1 since it is the caller of m4. The implementation of this analysis is rather elegant. We model the call graph as a CFG and reuse the previous analysis on this model. By backpopulating the results, we have an interprocedural analysis. Of course, building the call graph is the tricky part and we discuss this point later. ExamplesTo test our browser incompatibility analysis, we used Google codesearch to find real-world JavaScript code. We discovered two interesting examples by searching for browser viewport fields that were included in our tool's compatibility matrices. Our tool is run from the main project directory as follows:
The output is a report detailing the browser compatibility of each control-flow graph in MyCode.js for the 3 browsers our tools supports: Firefox (Mozilla), Internet Explorer 6, and Safari. Note that the tool does not distinguish between various browser modes (for example, IE6 can run in 'strict' mode or 'quirks' mode, as explained on quirksmode). In addition, whenever a browser incompatibility is detected during the forward analysis, the relevant line number and problematic browser field is reported. open-xchange nmail_spell_main.htmWe ran our tool on nmail_spell_main.htm, from the open-xchange project. The output was the following:
The problematic code is the function
The use of ipplan layersmenu-library.jsWe then ran our tool on layersmenu-libary.js from the IPplan project. The output was the following:
Some of the problematic code is the function
The use of ResultsWe successfully built a JavaScript compiler and static analysis tool to perform the following tasks:
We applied the tool to several examples. The tool worked as we expected, but due to a shortcoming in our analysis reported false positives. Given the time constraints of the project, we were very satisfied with the result. Even though the browser incompatibility detection wasn't perfect, we did produce a clean, general purpose static analysis framework for JavaScript. This framework could easily be extended to other forms of analysis, and was a great learning experience. BugsThere were some bugs discovered as the project deadline approached, and we didn't have time to fix them: On certain code examples, the tool gets stuck in an infinite loop while generating the control-flow graphs. The union operation performed during our backwards analysis turned out to be insufficient. When the expression of the
This was the case we considered while developing the analysis, but it became clear that the union operation wasn't always correct. Consider the following:
Assume all browsers are compatible going into the Future Work
References
SourceThe Java project can be downloaded from http://www.stanford.edu/~joshwise/jsc.zip. Project LogMilestone 1 - Parsing
Milestone 2 - CFG
Milestone 3 - Interpreter
Milestone 4 - Interprocedural Analysis
Milestone 5 - Optimizations
Milestone 6 - Browser Compatibility Analysis
Milestone 7 - Run in the wild
Last modified June 12, 2007 5:18 pm / Skin by Kevin Hughes
![]() |