For web applications, test-input generation poses unique problems. With traditional software, testers typically have well-defined interfaces that specify named parameters and their domains. In contrast, the interfaces of web applications are implicitly defined by the set of named parameters they access at run-time and the operations performed on the parameters' values. Because of these implicit definitions, testers must first identify interface information before they can generate test inputs and test a web application effectively. Typical techniques that web developers use to identify an application's interface information, such as web crawling, are often inaccurate or incomplete, which means that parts of the application can go untested.
To address the interface identification problem, I developed a novel static analysis technique called WAM that analyzes the code of a web application to identify interface information. WAM generates a list of the parameters that comprise each interface of the application and identifies domain information about each of those parameters. To identify interfaces, WAM uses iterative data-flow analysis to group parameters that are accessed along the same path of execution. To identify domain information, WAM uses definition-use analysis to find domain-defining operations performed on those parameters. To evaluate the usefulness of WAM, I generated test suites using interface information from WAM and from a web-crawling-based technique. The WAM-based test suites achieved 30% higher block coverage and 48% higher branch coverage than those based on web crawling information. In my current work, I am improving the precision of WAM by leveraging symbolic execution techniques. Preliminary results show that the improved precision leads to an even larger increase in structural coverage while requiring an order of magnitude fewer test cases.
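To make the grouping step concrete, the following is a minimal sketch of an iterative data-flow analysis that collects the sets of parameters accessed along each path of a control-flow graph. It is an illustration under stated assumptions, not the actual implementation: the function `identify_interfaces`, the toy graph, and the parameter names (`action`, `user`, `password`, `query`, `page`) are all hypothetical, and the definition-use analysis for domain information is omitted.

```python
from collections import defaultdict


def identify_interfaces(cfg, accesses, entry, exits):
    """Group parameters that are accessed along the same path of execution.

    cfg      -- dict mapping each node to a list of its successor nodes
    accesses -- dict mapping a node to the set of parameter names it reads
    entry    -- the entry node of the graph
    exits    -- iterable of exit nodes
    Returns a set of frozensets; each frozenset is one candidate interface.
    (All of these names are hypothetical, chosen for this sketch.)
    """
    # Build the predecessor relation from the successor lists.
    preds = defaultdict(set)
    for node, succs in cfg.items():
        for succ in succs:
            preds[succ].add(node)

    nodes = set(cfg) | set(preds) | {entry}
    # out[n]: parameter sets accumulated on paths from entry through n.
    out = {n: set() for n in nodes}

    worklist = [entry]
    while worklist:
        node = worklist.pop()
        # Merge the facts flowing in from all predecessors.
        incoming = {frozenset()} if node == entry else set()
        for pred in preds[node]:
            incoming |= out[pred]
        # Transfer function: add this node's parameter accesses to each set.
        read_here = frozenset(accesses.get(node, ()))
        new_out = {params | read_here for params in incoming}
        # Re-process successors only when the facts changed (fixed point).
        if new_out != out[node]:
            out[node] = new_out
            worklist.extend(cfg.get(node, ()))

    interfaces = set()
    for exit_node in exits:
        interfaces |= out[exit_node]
    return interfaces


# Toy application: every request reads "action", then branches.
cfg = {
    "entry": ["login", "search"],
    "login": ["exit"],
    "search": ["paged", "exit"],
    "paged": ["exit"],
    "exit": [],
}
accesses = {
    "entry": {"action"},
    "login": {"user", "password"},
    "search": {"query"},
    "paged": {"page"},
}

interfaces = identify_interfaces(cfg, accesses, "entry", ["exit"])
for interface in sorted(interfaces, key=sorted):
    print(sorted(interface))
```

In this toy graph the analysis yields three interfaces, one per path from entry to exit. A crawler that only ever follows the login form would observe just one of them, which is the kind of incompleteness described above.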