Alex Orso

MINTS: A General Framework and Tool for Supporting Test-suite Minimization

H.-Y. Hsu and A. Orso

Test-suite minimization techniques aim to eliminate redundant test cases from a test-suite based on some criteria, such as coverage or fault-detection capability. Most existing test-suite minimization techniques have two main limitations: they perform minimization based on a single criterion and produce suboptimal solutions. In this paper, we propose a test-suite minimization framework that overcomes these limitations by allowing testers to (1) easily encode a wide spectrum of test-suite minimization problems, (2) handle problems that involve any number of criteria, and (3) compute optimal solutions by leveraging modern integer linear programming solvers. We implemented our framework in a tool, called MINTS, that is freely-available and can be interfaced with a number of different state-of-the-art solvers. Our empirical evaluation shows that MINTS can be used to instantiate a number of different test-suite minimization problems and efficiently find an optimal solution for such problems using different solvers.

[ To Appear ]

Automated Identification of Interface Mismatches in Web Applications

W. Halfond and A. Orso

Quality assurance techniques for web applications have become increasingly important as web applications have gained in popularity and become an essential part of our daily lives. To integrate content and data from multiple sources, the components of a web application communicate extensively among themselves. Unlike traditional program modules, the components communicate through interfaces and invocations that are not explicitly declared. Because of this, the communication between two components can fail due to a parameter mismatch between the interface invoked by a calling component and the interface provided by the called component. Parameter mismatches can cause serious errors in the web application and are difficult to identify using traditional testing and verification techniques. To address this problem, we propose a static-analysis based approach for identifying parameter mismatches. We also present an empirical evaluation of the approach, which we performed on a set of real web applications. The results of the evaluation are promising; our approach discovered 133 parameter mismatches in the subject applications.

[ To Appear ]

BERT: BEhavioral Regression Testing

A. Orso and Tao Xie

During maintenance, it is common to run the new version of a program against its existing test suite to check whether the modifications in the program introduced unforeseen side effects. Although this kind of regression testing can be effective in identifying some change-related faults, it is limited by the quality of the existing test suite. Because generating tests for real programs is expensive, developers build test suites by finding acceptable tradeoffs between cost and thoroughness of the tests. Such test suites necessarily target only a small subset of the program's functionality and may miss many regression faults. To address this issue, we introduce the concept of behavioral regression testing, whose goal is to identify behavioral differences between two versions of a program through dynamic analysis. Intuitively, given a set of changes in the code, behavioral regression testing works by (1) generating a large number of test cases that focus on the changed parts of the code, (2) running the generated test cases on the old and new versions of the code and identifying differences in the tests' outcome, and (3) analyzing the identified differences and presenting them to the developers. By focusing on a subset of the code and leveraging differential behavior, our approach can provide developers with more (and more focused) information than traditional regression testing techniques. This paper presents our approach and performs a preliminary assessment of its feasibility.

[ To Appear ]

Pooled ANOVA

M. Last, G. Luta, A. Orso, A. Porter and S. Young

We introduce Pooled ANOVA, a greedy algorithm to sequentially select the rare important factors from a large set of factors. Problems such as computer simulations and software performance tuning involve a large number of factors, few of which have an important effect on the outcome or performance measure. We pool multiple factors together, and test the pool for significance. If the pool has a significant effect we retain the factors for deconfounding. If not, we either declare that none of the factors are important, or retain them for follow-up decoding, depending on our assumptions and stage of testing. The sparser important factors are, the bigger the savings. Pooled ANOVA requires fewer assumptions than other, similar methods (e.g. sequential bifurcation), such as not requiring all important effects to have the same sign. We demonstrate savings of 25%-35% when compared to a conventional ANOVA, and also the ability to work in a setting where Sequential Bifurcation fails.

[ Abstract, BibTeX ref., PDF ]

WASP: Protecting Web Applications Using Positive Tainting and Syntax-Aware Evaluation

W. Halfond, A. Orso, and P. Manolios

Many software systems have evolved to include a web-based component that makes them available to the public via the Internet and can expose them to a variety of web-based attacks. One of these attacks is SQL injection, which can give attackers unrestricted access to the databases underlying web applications and has become increasingly frequent and serious. This paper presents a new, highly automated approach for protecting web applications against SQL injection that has both conceptual and practical advantages over most existing techniques. From a conceptual standpoint, the approach is based on the novel idea of positive tainting and on the concept of syntax-aware evaluation. From a practical standpoint, our technique is precise and efficient and has minimal deployment requirements. We also present an extensive empirical evaluation of our approach performed using WASP, a tool that implements our technique. In the evaluation, we used WASP to protect a wide range of web applications while subjecting them to a large and varied set of attacks and legitimate accesses. WASP was able to stop all attacks and did not generate any false positives. Our studies also show that the overhead imposed by WASP was negligible in most cases.

[ BibTeX ref., PDF ]

Effective Memory Protection Using Dynamic Tainting

J. Clause, I. Doudalis, A. Orso, and M. Prvulovic

Programs written in languages that provide direct access to memory through pointers often contain memory-related faults, which may cause non-deterministic failures and even security vulnerabilities. In this paper, we present a new technique based on dynamic tainting for protecting programs from illegal memory accesses. When memory is allocated, at runtime, our technique taints both the memory and the corresponding pointer using the same taint mark. Taint marks are then suitably propagated while the program executes and are checked every time a memory address m is accessed through a pointer p; if the taint marks associated with m and p differ, the execution is stopped and the illegal access is reported. To allow for a low-overhead, hardware-assisted implementation of the approach, we make several key technical and engineering decisions in the definition of our technique. In particular, we use a configurable, low number of reusable taint marks instead of a unique mark for each area of memory allocated, which reduces the overhead of the approach without limiting its flexibility and ability to target most memory-related faults and attacks known to date. We also define the technique at the binary level, which lets us handle the (very) common case of applications that use third-party libraries whose source code is unavailable. To investigate the effectiveness and practicality of our approach, we implemented it for heap-allocated memory and performed a preliminary empirical study on a set of programs. Our results show that (1) our technique can identify a large class of memory-related faults, even when using only two unique taint marks, and (2) a hardware-assisted implementation of the technique could achieve overhead in the single digits.

[ BibTeX ref., PDF 220 Kb, ]

SCARPE: A Technique and Tool for Selective Record and Replay of Program Executions

S. Joshi and A. Orso

Because of software's increasing dynamism and the heterogeneity of execution environments, the results of in-house testing and maintenance are often not representative of the way the software behaves in the field. To alleviate this problem, we present a technique for capturing and replaying partial executions of deployed software. Our technique can be used for various applications, including generation of test cases from user executions and post-mortem dynamic analysis. We present our technique and tool, some possible applications, and a preliminary empirical evaluation that provides initial evidence of the feasibility of our approach.

[ BibTeX ref., PDF 1208 Kb, Slides 2288 Kb ]

Improving Test Case Generation for Web Applications Using Automated Interface Discovery

W. Halfond and A. Orso

With the growing complexity of web applications, identifying web interfaces that can be used for testing such applications has become increasingly challenging. Many techniques that worked effectively when applied to simple web applications fall short when used on modern dynamic web applications. To address this issue, we present a novel technique for analyzing web applications and discovering their interfaces based on a novel static analysis algorithm. We also present the results of an evaluation that shows the usefulness of our approach. We compare our approach against conventional techniques and show that our approach (1) discovers a higher number of interfaces and (2) can help generate test inputs that lead to higher structural coverage results.

[ BibTeX ref., PDF 472 Kb, Slides 612 Kb ]

Dytan: A Generic Dynamic Taint Analysis Framework

J. Clause, W. Li, and A. Orso

Dynamic taint analysis is gaining momentum. Techniques based on dynamic tainting have been successfully used in the context of application security, and now their use is also being explored in different areas, such as program understanding, software testing, and debugging. Unfortunately, most existing approaches for dynamic tainting are defined in an ad-hoc manner, which makes it difficult to extend them, experiment with them, and adapt them to new contexts. Moreover, most existing approaches are focused on data-flow based tainting only and do not consider tainting due to the control flow within a program, which limits their applicability outside the security domain. To address these limitations and foster experimentation with dynamic tainting techniques, we defined and developed a general framework for dynamic tainting that (1) is highly flexible and customizable, (2) allows for performing both data-flow and control-flow based tainting in a conservative way, and (3) does not rely on any special run-time system. We also present DYTAN, an implementation of our framework that works on x86 executables, and a set of preliminary studies that show how DYTAN can be used to implement several tainting-based approaches with limited effort. In the studies, we also show that DYTAN can be used on real software, by using Firefox as one of our subjects, and illustrate how the specific characteristics of the tainting approach used can affect efficiency and accuracy of the taint analysis, which further justifies the use of our framework to experiment with different variants of an approach.

[ BibTeX ref. ]

Using Component Metadata to Regression Test Component-based Software

A. Orso, H. Do, G. Rothermel, M.J. Harrold and D. Rosenblum

Increasingly, modern-day software systems are being built by combining externally-developed software components with application-specific code. For such systems, existing program-analysis-based software-engineering techniques may not directly apply, due to lack of information about components. To address this problem, the use of component metadata has been proposed. Component metadata are metadata and metamethods provided with components, that retrieve or calculate information about those components. In particular, two component-metadata-based approaches for regression test selection are described: one using code-based component metadata and the other using specification-based component metadata. The results of empirical studies that illustrate the potential of these techniques to provide savings in re-testing effort are provided.

[ BibTeX ref. ]

A Technique for Enabling and Supporting Debugging of Field Failures

J. Clause and A. Orso

It is difficult to fully assess the quality of software in-house, outside the actual time and context in which it will execute after deployment. As a result, it is common for software to manifest field failures, failures that occur on user machines due to untested behavior. Field failures are typically difficult to recreate and investigate on developer platforms, and existing techniques based on crash reporting provide only limited support for this task. In this paper, we present a technique for recording, reproducing, and minimizing failing executions that enables and support in-house debugging of field failures. We also present a tool that implements our technique and an empirical study that evaluates the technique on a widely used e-mail client.

[ BibTeX ref. ]

Techniques for Classifying Executions of Deployed Software to Support Software Engineering Tasks

Murali Haran, Alan Karr, Michael Last, Alessandro Orso, Adam A. Porter, Ashish Sanil, and Sandro Fouché

There is an increasing interest in techniques that support analysis and measurement of fielded software systems. These techniques typically deploy numerous instrumented instances of a software system, collect execution data when the instances run in the field, and analyze the remotely collected data to better understand the system's in-the-field behavior. One common need for these techniques is the ability to distinguish execution outcomes (e.g., to collect only data corresponding to some behavior or to determine how often and under which condition a specific behavior occurs). Most current approaches, however, do not perform any kind of classification of remote executions and either focus on easily observable behaviors (e.g., crashes) or assume that outcomes' classifications are externally provided (e.g., by the users). To address the limitations of existing approaches, we have developed three techniques for automatically classifying execution data as belonging to one of several classes. In this paper, we introduce our techniques and apply them to the binary classification of passing and failing behaviors. Our three techniques impose different overheads on program instances and, thus, each is appropriate for different application scenarios. We performed several empirical studies to evaluate and refine our techniques and to investigate the trade-offs among them. Our results show that 1) the first technique can build very accurate models, but requires a complete set of execution data; 2) the second technique produces slightly less accurate models, but needs only a small fraction of the total execution data; and 3) the third technique allows for even further cost reductions by building the models incrementally, but requires some sequential ordering of the software instances' instrumentation.

[ BibTeX ref., PDF 1352 Kb, ]

Type-dependence Analysis and Program Transformation for Symbolic Execution

S. Anand, A. Orso, and M.J. Harrold

Symbolic execution can be problematic when applied to real applications. This paper addresses two of these problems: (1) the constraints generated during symbolic execution may be of a type not handled by the underlying decision procedure, and (2) some parts of the application may be unsuitable for symbolic execution (e.g., third-party libraries). The paper presents type-dependence analysis, which performs a context- and field-sensitive interprocedural static analysis to identify program entities that may store symbolic values at run-time. This information is used to identify the above two problematic cases and assist the user in addressing them. The paper also presents a technique to transform real applications for efficient symbolic execution. Instead of transforming the entire application, which can be inefficient and infeasible (mostly for pragmatic reasons), our technique leverages the results of type-dependence analysis to transform only parts of the program that may interact with symbolic values. Finally, the paper discusses the implementation of our analysis and transformation technique in a tool, stinger, and an empirical evaluation performed on two real applications. The results of the evaluation show the effectiveness of our approach

[ BibTeX ref., PDF 248 Kb ]

JDiff: A Differencing Technique and Tool for Object--Oriented Programs

T. Apiwattanapong, A. Orso, and M.J. Harrold

During software evolution, information about changes between different versions of a program is useful for a number of software engineering tasks. For example, configuration-management systems can use change information to assess possible conflicts among updates from different users. For another example, in regression testing, knowledge about which parts of a program are unchanged can help in identifying test cases that need not be rerun. For many of these tasks, a purely syntactic differencing may not provide enough information for the task to be performed effectively. This problem is especially relevant in the case of object-oriented software, for which a syntactic change can have subtle and unforeseen effects. In this paper, we present a technique for comparing object-oriented programs that identifies both differences and correspondences between two versions of a program. The technique is based on a representation that handles object-oriented features and, thus, can capture the behavior of object-oriented programs. We also present JDiff, a tool that implements the technique for Java programs. Finally, we present the results of four empirical studies, performed on many versions of two medium-sized subjects, that show the efficiency and effectiveness of the technique when used on real programs.

[ BibTeX ref. ]

Using Positive Tainting and Syntax-Aware Evaluation to Counter SQL Injection Attacks

W. Halfond, A. Orso, and P. Manolios

SQL injection attacks pose a serious threat to the security of Web applications because they can give attackers unrestricted access to databases that contain sensitive information. In this paper, we propose a new, highly automated approach for protecting existing Web applications against SQL injection. Our approach has both conceptual and practical advantages over most existing techniques. From the conceptual standpoint, the approach is based on the novel idea of positive tainting and the concept of syntax-aware evaluation. From the practical standpoint, our technique is at the same time precise and efficient and has minimal deployment requirements. The paper also describes WASP, a tool that implements our technique, and a set of studies performed to evaluate our approach. In the studies, we used our tool to protect several Web applications and then subjected them to a large and varied set of attacks and legitimate accesses. The evaluation was a complete success: WAS P successfully and efficiently stopped all of the attacks without generating any false positives.

[ BibTeX ref., PDF 600 Kb, Slides 1652 Kb ]

Command-Form Coverage for Testing Database Applications

W. Halfond and A. Orso

The testing of database applications poses new challenges for software engineers. In particular, it is difficult to thoroughly test the interactions between an application and its underlying database, which typically occur through dynamically-generated database commands. Because traditional code-based coverage criteria focus only on the application code, they are often inadequate in exercising these commands. To address this problem, we introduce a new test adequacy criterion that is based on coverage of the database commands generated by an application and specifically focuses on the application-database interactions. We describe the criterion, an analysis that computes the corresponding testing requirements, and an efficient technique for measuring coverage of these requirements. We also present a tool that implements our approach and a preliminary study that shows the approach's potential usefulness and feasibility.

[ Abstract, BibTeX ref., PDF 496 Kb Slides 3172 Kb ]

MaTRIX: Maintenance-Oriented Testing Requirements Identifier and Examiner

T. Apiwattanapong, R. Santelices, P. Kumar Chittimalli, A. Orso, and M.J. Harrold

This paper presents a new test-suite augmentation technique for use in regression testing of software. Our technique combines dependence analysis and symbolic evaluation and uses information about the changes between two versions of a program to (1) identify parts of the program affected by the changes, (2) compute the conditions under which the effects of the changes are propagated to such parts, and (3) create a set of testing requirements based on the computed information. Testers can use these requirements to assess the effectiveness of the regression testing performed so far and to guide the selection of new test cases. The paper also presents MaTRIX, a tool that partially implements our technique, and its integration into a regression-testing environment. Finally, the paper presents a preliminary empirical study performed on two small programs. The study provides initial evidence of both the effectiveness of our technique and the shortcomings of previous techniques in assessing the adequacy of a test suite with respect to exercising the effect of program changes.

[ Abstract, BibTeX ref., PDF 256 Kb Slides 2240 Kb ]

Capture and Replay of User Executions to Improve Software Quality

S. Joshi and A. Orso

Because today's software is increasingly dynamic and runs in heterogeneous environments, it is difficult to assess software systems outside the actual context in which they execute. Therefore, the results of in-house quality-assurance activities, such as testing, are often not representative of the way the software will behave in the field. To help addressing this problem, we present a technique for capturing and replaying partial executions of deployed software. Given an application, the technique allows for selecting a subsystem, capturing the dynamic interactions between such subsystem and the rest of the application, and replaying the recorded interactions on the subsystem in isolation. Our technique is designed to be efficient, in that we only capture relevant dynamic information and disregard all data that, although flowing through the boundary of the subsystem of interest, do not affect the execution. After describing our technique, we discuss several possible applications of the approach, including generation of test cases from users' executions and post-mortem dynamic analysis. In the paper, we also present SCARPE, a prototype tool that implements our technique for Java programs. Finally, we present an empirical evaluation of the technique performed using SCARPE. The results of the studies show the feasibility of the approach.

[ Technical Report 1748 Kb ]

Preventing SQL Injection Attacks Using AMNESIA

W. Halfond and A. Orso

AMNESIA is a tool that detects and prevents SQL injection attacks by combining static analysis and runtime monitoring. Empirical evaluation has shown that AMNESIA is both effective and efficient against SQL injection.

[ BibTeX ref., PDF 208 Kb, Slides 676 Kb ]

Isolating relevant Component Interactions with JINSI

A. Orso, S. Joshi, M. Burger, and A. Zeller

When a component in a large system fails, developers encounter two problems: (1) reproducing the failure, and (2) investigating the causes of such a failure. Our JINSI tool lets developers capture and replay the interactions between a component and its environment, thus allowing for reproducing the failure at will. In addition, JINSI uses delta debugging to automatically isolate the subset of the interactions that is relevant for the failure. In a first study, JINSI has successfully isolated the relevant interaction of a JAVA component: "Out of the 32 interactions with the VendingMachine component, seven interactions suffice to produce the failure."

[ BibTeX ref., PDF 312 Kb, ]

Recognizing Behavioral Patterns at Runtime using Finite Automata

L. Wendehals and A. Orso

During reverse engineering, developers often need to understand the undocumented design of a software. In particular, recognizing design patterns in the software can provide reverse engineers with considerable insight on the software structure and its internal characteristics. Researchers have therefore proposed techniques based on static analysis to automatically recover design patterns in a program. Unfortunately, most design patterns comprise not only structural, but also significant behavioral aspects. Although static analysis is well suited for the recognition of structural aspects, it is typically limited and imprecise in analyzing behavior. To address this limitation, we present a new technique that complements our existing static analysis with a dynamic analysis, so as to perform a more accurate design-pattern recognition. The dynamic analysis is based on (1) transforming behavioral aspects of design patterns into finite automata, (2) identifying and instrumenting relevant method calls, and (3) monitoring relevant calls at runtime and matching them against the automata. The results of the dynamic analysis are then used to compute the likelihood of a pattern to be in the code. This paper describes our technique and presents a preliminary empirical study performed to assess the technique.

[ BibTeX ref., PDF 308 Kb, ]

A Classification of SQL Injection Attacks and Prevention Techniques

W. Halfond, J. Viegas and A. Orso

SQL injection attacks pose a serious security threat to Web applications: they allow attackers to obtain unrestricted access to the databases underlying the applications and to the potentially sensitive information these databases contain. Although researchers and practitioners have proposed various methods to address the SQL injection problem, current approaches either fail to address the full scope of the problem or have limitations that prevent their use and adoption. Many researchers and practitioners are familiar with only a subset of the wide range of techniques available to attackers who are trying to take advantage of SQL injection vulnerabilities. As a consequence, many solutions proposed in the literature address only some of the issues related to SQL injection. To address this problem, we present an extensive review of the different types of SQL injection attacks known to date. For each type of attack, we provide descriptions and examples of how attacks of that type could be performed. We also present and analyze existing detection and prevention techniques against SQL injection attacks. For each technique, we discuss its strengths and weaknesses in addressing the entire range of SQL injection attacks.

[ BibTeX ref., PDF 136 Kb, Slides 1364 Kb ]

InsECTJ: A Generic Instrumentation Framework for Collecting Dynamic Information within Eclipse

A. Seesing and A. Orso

The heterogeneity and dynamism of today's software systems make it difficult to assess the performance, correctness, or security of a system outside the actual time and context in which it executes. As a result, there is an increasing interest in techniques for monitoring and dynamically analyzing the runtime behavior of an application. These techniques are usually implemented using ad-hoc solutions that result in tools that are hard to develop, maintain, and reuse. To address this problem, we present a generic framework that enables the collection of different kinds of runtime information, such as data about the execution of various code entities and constructs. The framework lets users easily define how to process the collected information and is extensible to collect new types of information. We implemented the framework for the Java language as a set of Eclipse plug-ins. The plug-ins provide an intuitive GUI through which the functionality of the framework can be exploited. In the paper, we also present an example of use of our framework and plug-ins that helps illustrate their functionality and shows their ease of use.

[ BibTeX ref., PDF 160 Kb, Slides 2060 Kb ]

AMNESIA: Analysis and Monitoring for NEutralizing SQL-Injection Attacks

W. Halfond and A. Orso

The use of web applications has become increasingly popular in our routine activities, such as reading the news, paying bills, and shopping on-line. As the availability of these services grows, we are witnessing an increase in the number and sophistication of attacks that target them. In particular, SQL injection, a class of code-injection attacks in which specially crafted input strings result in illegal queries to a database, has become one of the most serious threats to web applications. In this paper we present and evaluate a new technique for detecting and preventing SQL injection attacks. Our technique uses a model-based approach to detect illegal queries before they are executed on the database. In its static part, the technique uses program analysis to automatically build a model of the legitimate queries that could be generated by the application. In its dynamic part, the technique uses runtime monitoring to inspect the dynamically-generated queries and check them against the statically-built model. We developed a tool, AMNESIA, that implements our technique and used the tool to evaluate the technique on seven web applications. In the evaluation we targeted the subject applications with a large number of both legitimate and malicious inputs and measured how many attacks our technique detected and prevented. The results of the study show that our technique was able to stop all of the attempted attacks without generating any false positives.

[ BibTeX ref., PDF 296 Kb, Slides 744 Kb ]

MonDe: Safe Updating through Monitored Deployment of New Component Versions

J. Cook and A. Orso

Safely updating software at remote sites is a cautious balance of enabling new functionality and avoiding adverse effects on existing functionality. A useful first step in this process would be to evaluate the performance of a new version of a component on the current workload before enabling its functionality. This step would let the engineers assess the component's performance over more (and more realistic) data points than by simply performing regression testing in-house. In this paper we propose to evaluate the performance of a new version of a component by (1) deploying it to remote sites, (2) running it in a controlled environment with the actual workloads being generated at that site, and (3) reporting the results back to the development engineers. Running the new version can either be done on-line, alongside the current system, or offline, using capture-replay techniques. By running at the remote site and reporting concise results, issues of data security, protection, and confidentiality are diminished, yet the new version can be evaluated on real workloads.

[ BibTeX ref., PDF 104 Kb, Slides 1820 Kb ]

Applying Classification Techniques to Remotely-Collected Program Execution Data

M. Haran, A. Karr, A. Orso, A. Porter, and A. Sanil

There is an increasing interest in techniques that support measurement and analysis of fielded software systems. One of the main goals of these techniques is to better understand how software actually behaves in the field. In particular, many of these techniques require a way to distinguish, in the field, failing from passing executions. So far, researchers and practitioners have only partially addressed this problem: they have simply assumed that program failure status is either obvious (i.e., the program crashes) or provided by an external source (e.g., the users). In this paper, we propose a technique for automatically classifying execution data, collected in the field, as coming from either passing or failing program runs. (Failing program runs are executions that terminate with a failure, such as a wrong outcome.) We use statistical learning algorithms to build the classification models. Our approach builds the models by analyzing executions performed in a controlled environment (e.g., test cases run in-house) and then uses the models to predict whether execution data produced by a fielded instance were generated by a passing or failing program execution. We also present results from an initial feasibility study, based on multiple versions of a software sub ject, in which we investigate several issues vital to the applicability of the technique. Finally, we present some lessons learned regarding the interplay between the reliability of classification models and the amount and type of data collected.

[ BibTeX ref., PDF 152 Kb, Slides 704 Kb ]

Selective Capture and Replay of Program Executions

A. Orso and B. Kennedy

In this paper, we present a technique for selective capture and replay of program executions. Given an application, the technique allows for (1) selecting a subsystem of interest, (2) capturing at runtime all the interactions between such subsystem and the rest of the application, and (3) replaying the recorded interactions on the subsystem in isolation. The technique can be used in several scenarios. For example, it can be used to generate test cases from users' executions, by capturing and collecting partial executions in the field. For another example, it can be used to perform expensive dynamic analyses offline. For yet another example, it can be used to extract subsystem or unit tests from system tests. Our technique is designed to be efficient, in that we only capture information that is relevant to the considered execution. To this end, we disregard all data that, although flowing through the boundary of the subsystem of interest, do not affect the execution. In the paper, we also present a preliminary evaluation of the technique performed using SCARPE, a prototype tool that implements our approach.

[ BibTeX ref., PDF 132 Kb, Slides 1780 Kb ]

Combining Static Analysis and Runtime Monitoring to Counter SQL-Injection Attacks

W. Halfond and A. Orso

Our dependence on web applications has increased considerably in the last years, and we continue to integrate them into our everyday routine activities. When we are making reservations, paying bills, and shopping on-line, we expect these web applications to be secure and reliable. However, as the availability of these services has increased, there has been a corresponding increase in the number and sophistication of attacks that target them. One of the most serious types of attack against web applications is SQL injection. SQL injection refers to a class of code-injection attacks in which user input is included in a SQL query in such a way that part of the input is treated as code. Using SQL injection, attackers can leak confidential information, such as credit card numbers, from web applications' databases and even corrupt the database. In this paper, we propose a novel technique against SQL-injection. The technique combines conservative static analysis and runtime monitoring to detect and stop illegal queries before they are executed on the database. In its static part, the technique builds a conservative model of the legitimate queries that could be generated by the application. In its dynamic part, the technique inspects the dynamically generated queries for compliance with the statically-built model. We also present a preliminary evaluation of the technique performed on two small web applications. The results of the evaluation are promising in that our technique was able to prevent the attacks that we performed on the two applications.

[ BibTeX ref., PDF 464 Kb, Slides 596 Kb ]

Efficient and Precise Dynamic Impact Analysis Using Execute-After Sequences

T. Apiwattanapong, A. Orso, and M.J. Harrold

As software evolves, impact analysis estimates the potential effects of changes, before or after they are made, by identifying which parts of the software may be affected by such changes. Traditional impact-analysis techniques are based on static analysis and tend to identify most of the software as affected by the changes, due to their conservative assumptions. More recently, researchers have begun to investigate dynamic impact-analysis techniques, which rely on dynamic, rather than static, information about software behavior. Existing dynamic impact-analysis techniques are either very expensive---in terms of execution overhead or amount of dynamic information collected---or imprecise. In this paper, we present a new technique for dynamic impact analysis that is almost as efficient as the most efficient existing technique and is as precise as the most precise existing technique. The technique is based on a novel algorithm that collects (and analyzes) only the essential dynamic information required for the analysis. We discuss our technique, prove its correctness, and present a set of empirical studies in which we compare our new technique with two existing techniques, in terms of performance and precision.

[ BibTeX ref., PDF 168 Kb, Slides 356 Kb ]

Scaling Regression Testing to Large Software Systems

A. Orso, N. Shi, and M.J. Harrold

When software is modified, during development and maintenance, it is regression tested to provide confidence that the changes did not introduce unexpected errors and that new features behave as expected. One important problem in regression testing is how to select a subset of test cases, from the test suite used for the original version of the software, when testing a modified version of the software. Regression-test-selection techniques address this problem. Safe regression-test-selection techniques select every test case in the test suite that may behave differently in the original and modified versions of the software. Among existing safe regression testing techniques, efficient techniques are often too imprecise and achieve little savings in testing effort, whereas precise techniques are too expensive when used on large systems. This paper presents a new regression-test-selection technique for Java programs that is safe, precise, and yet scales to large systems. It also presents a tool that implements the technique and studies performed on a set of subjects ranging from 70 to over 500 KLOC. The studies show that our technique can efficiently reduce the regression testing effort and, thus, achieve considerable savings.

[ BibTeX ref., PDF 256 Kb, Slides (PDF) 812 Kb ]

A Generic Instrumentation Framework for Collecting Dynamic Information

A. Chawla and A. Orso

Performing empirical research in software testing involves executing a set of subjects against one or more test suites and measuring some characteristics of these executions. Such measures are often collected using ad-hoc instrumentation, by inserting probes in the code that collect and report dynamic information at run-time. Another possible approach is to collect the needed information by leveraging capabilities of the runtime system. Both these approaches usually result in measurement tools that are not flexible and are hard to reuse and modify. To address this problem, we present a generic framework for collecting information on the runtime behavior of a Java program. The framework allows for easily collecting different kinds of dynamic information for a set of executions of the program, such as coverage and profiling of various code entities and program traces at different levels. The framework also lets users easily define how to process the collected information. In the paper, we also present a case study that we performed to evaluate the framework, and that shows its effectiveness and efficiency.

[ BibTeX ref., PDF 92 Kb ]

A Differencing Algorithm for Object-oriented Programs

T. Apiwattanapong, A. Orso, and M.J. Harrold

During software evolution, information about changes between different versions of a program is useful for a number of software engineering tasks. For example, in regression testing, knowing which parts of a program are unchanged can help identifying test cases that need not be rerun. For many of these tasks, a purely syntactic differencing may not provide enough information for the task to be performed effectively. This problem is especially relevant in the case of object-oriented software, for which a syntactic change can have subtle and unforeseen effects. In this paper, we present a technique for comparing object-oriented programs that identifies both differences and correspondences between two versions of a program. The technique is based on a representation that handles object-oriented features and, thus, can capture the behavior of object-oriented programs. We also present JDiff, a tool that implements the technique for Java programs, and empirical results that show the efficiency and effectiveness of the technique on a real program.

[ BibTeX ref., PDF 428 Kb Slides (PDF) 1868 Kb ]

Classifying Data Dependences in the Presence of Pointers for Program Comprehension, Testing, and Debugging

A. Orso, S. Sinha, and M.J. Harrold

Understanding data dependences in programs is important for many software-engineering activities, such as program understanding, impact analysis, reverse engineering, and debugging. The presence of pointers can cause subtle and complex data dependences that can be difficult to understand. For example, in languages such as C, an assignment made through a pointer dereference can assign a value to one of several variables, none of which may appear syntactically in that statement. In the first part of this paper, we describe two techniques for classifying data dependences in the presence of pointer dereferences. The first technique classifies data dependences based on definition type, use type, and path type. The second technique classifies data dependences based on span. We present empirical results to illustrate the distribution of data-dependence types and spans for a set of real C programs. In the second part of the paper, we discuss two applications of the classification techniques. First, we investigate different ways in which the classification can be used to facilitate data-flow testing and verification. We outline an approach that uses types and spans of data dependences to determine the appropriate verification technique for different data dependences; we present empirical results to illustrate the approach. Second, we present a new slicing approach that computes slices based on types of data dependences. Based on the new approach, we define an incremental slicing technique that computes a slice in multiple steps. We present empirical results to illustrate the sizes of incremental slices and the potential usefulness of incremental slicing for debugging.

[ BibTeX ref. ]

Gammatella: Visualizing Program-Execution Data for Deployed Software

J. Jones, A. Orso, and M.J. Harrold

Software systems are often released with missing functionality, errors, or incompatibilities that may result in failures in the field, inferior performances, or, more generally, user dissatisfaction. In previous work, some of the authors presented the Gamma approach, whose goal is to improve software quality by augmenting software-engineering tasks with dynamic information collected from deployed software. The Gamma approach enables analyses that (1) rely on actual field data instead of synthetic in-house data and (2) leverage the vast and heterogeneous resources of an entire user community instead of limited, and often homogeneous, in-house resources.

When monitoring a large number of deployed instances of a software product, however, a significant amount of data is collected. Such raw data are useless in the absence of suitable data-mining and visualization techniques that support exploration and understanding of the data. In this paper, we present a new technique for collecting, storing, and visualizing program-execution data gathered from deployed instances of a software product. We also present a prototype toolset, Gammatella, that implements the technique. Finally, we show how the visualization capabilities of Gammatella facilitate effective investigation of several kinds of execution-related information in an interactive fashion, and discuss our initial experience with a semi-public display of Gammatella.

[ BibTeX ref. ]

An Empirical Comparison of Dynamic Impact Analysis Algorithms

A. Orso, T. Apiwattanapong, J. Law, G. Rothermel, and M.J. Harrold

Impact analysis - determining the potential effects of changes on a software system - plays an important role in software engineering tasks such as maintenance, regression testing, and debugging. In previous work, two new dynamic impact analysis techniques, CoverageImpact and PathImpact, were presented. These techniques perform impact analysis based on data gathered about program behavior relative to specific inputs, such as inputs gathered from field data, operational profile data, or test-suite executions. Due to various characteristics of the algorithms they employ, CoverageImpact and PathImpact are expected to differ in terms of cost and precision; however, there have been no studies to date examining the extent to which such differences may emerge in practice. Since cost-precision tradeoffs may play an important role in technique selection and further research, we wished to examine these tradeoffs. We therefore designed and performed an empirical study, comparing the execution and space costs of the techniques, as well as the precisions of the impact analysis results that they report. This paper presents the results of this study.

[ BibTeX ref., PDF 280 Kb ]

Automated Support for Development, Maintenance, and Testing in the Presence of Implicit Control Flow

S. Sinha, A. Orso, and M.J. Harrold

Although object-oriented languages can improve programming practices, their characteristics may introduce new problems for software engineers. One important problem is the presence of implicit control flow caused by exception handling and polymorphism. Implicit control flow causes complex interactions in programs, and can thus complicate software-engineering tasks. To address this problem, we present a systematic and structured approach for supporting these tasks, based on the static and dynamic analyses of constructs that cause implicit control flow. Our approach provides software engineers with information for supporting and guiding development and maintenance tasks. We also present empirical results to illustrate the potential usefulness of our approach. Our studies show that, for the subjects considered, complex implicit control flow is always present and is generally not adequately exercised.

[ BibTeX ref., PDF 244 Kb ]

Improving Dynamic Analysis through Partial Replay of Users' Executions (presentation)

A. Orso and B. Kennedy

Software products are often released with missing functionality, errors, or incompatibilities that may result in failures, inferior performances, or, more generally, user dissatisfaction. In previous work, we presented the Gamma approach, which facilitates remote analysis and measurement of deployed software and allows for gathering program-execution data from the field.
In this work, we present a technique to capture "partial" users' executions in the field and to replay them in-house. The capturing functionality works by (1) identifying a subset of the system whose dynamic behavior we want to capture, (2) identifying the boundary of the selected subsystem, and (3) capturing information that flows through the boundary using dynamic bytecode rewriting. The replaying functionality works by (1) automatically generating a driver for the subsystem, (2) modifying the subsystem to be replayed (again, through rewriting), and (3) replaying the subsystem using the captured information.
The technique can be used in several scenarios. For example, it can be used to replay partial executions in-house and perform dynamic analyses that are too expensive, in terms of overhead, to be performed in the field. For another example, it can be used to extract subsystem or unit tests from system tests. We have implemented a prototype tool that implements the technique and are currently performing experiments to evaluate the approach for different scenarios.

[ BibTeX ref., Slides (PDF) 1248 Kb ]

Leveraging Field Data for Impact Analysis and Regression Testing

A. Orso, T. Apiwattanapong, and M.J. Harrold

Software products are often released with missing functionality, errors, or incompatibilities that may result in failures, inferior performances, or, more generally, user dissatisfaction. In previous work, we presented the Gamma approach, which facilitates remote analysis and measurement of deployed software and allows for gathering program-execution data from the field. In this paper, we investigate the use of the Gamma approach to support and improve two fundamental tasks performed by software engineers during maintenance: impact analysis and regression testing. We present a new approach that leverages field data to perform these two tasks. We also present a set of empirical studies that we performed to assess the usefulness of the approach. The studies were performed on a real subject and on a real user population. The results of the studies show that the use of field data is effective and, for the cases considered, can considerably affect the results of dynamic analyses. Moreover, the empirical studies show that the approach is also efficient: the kind of field data that we consider requires very limited space and little instrumentation to be collected.

[ BibTeX ref., PDF 284 Kb, Slides (PDF) 1984 Kb ]

Visualization of Program-Execution Data for Deployed Software

A. Orso, J. Jones, and M.J. Harrold

Software products are often released with missing functionality, errors, or incompatibilities that may result in failures in the field, inferior performances, or, more generally, user dissatisfaction. In previous work, we presented the Gamma technology, which facilitates remote analysis and measurement of deployed software and allows for gathering program-execution data from the field. When monitoring a high number of deployed instances of a software product, however, a large amount of data is collected. Such raw data are useless in the absence of a suitable data-mining and visualization technique that supports exploration and understanding of the data. In this paper, we present a new technique for collecting, storing, and visualizing program-execution data gathered from deployed instances of a software product. We also present a prototype toolset, Gammatella, that implements the technique. We show how the visualization capabilities of Gammatella allows for effectively investigating several kinds of execution-related information in an interactive fashion.

[ BibTeX ref., PDF 684 Kb Slides (PDF) 2792 Kb ]

Interclass Testing of Object Oriented Software

V. Martena, A. Orso, and M. Pezzè

The characteristics of object-oriented software affect type and relevance of faults. In particular, the state of the objects may cause faults that cannot be easily revealed with traditional testing techniques. This paper proposes a new technique for interclass testing, that is, the problem of deriving test cases for suitably exercising interactions among clusters of classes. The proposed technique uses data-flow analysis for deriving a suitable set of test case specifications for interclass testing. The paper then shows how to automatically generate feasible test cases that satisfy the derived specifications using symbolic execution and automated deduction. Finally, the paper demonstrates the effectiveness of the proposed technique by deriving test cases for a microscope controller developed for the European Space Laboratory of the Columbus Orbital Facility.

[ BibTeX ref., PDF 104 Kb ]

Monitoring Deployed Software Using Software Tomography

J. Bowring, A. Orso, and M.J. Harrold

Software products are often released with missing functionality or errors that result in failures in the field. In previous work, we presented the GAMMA technology, which facilitates remote monitoring of deployed software and allows for a prompt reaction to failures. In this paper, we investigate one of the principal technologies on which GAMMA is based: software tomography. Software tomography splits monitoring tasks across many instances of the software, so that partial information can be (1) collected from users by means of light-weight instrumentation and (2) merged to gather the overall monitoring information. After describing the technology, we illustrate an instance of software tomography for a specific monitoring task. We also present two case studies that we performed to evaluate the presented technique on a real program. The results of the studies show that software tomography can be successfully applied to collect accurate monitoring information using only minimal instrumentation on each deployed program instance.

[ BibTeX ref., PDF 124 Kb ]

A Technique for Dynamic Updating of Java Software

A. Orso, A. Rao, and M.J. Harrold

During maintenance, systems are updated to correct faults, improve functionality, and adapt the software to changes in its execution environment. The typical software-update process consists of stopping the system to be updated, performing the update of the code, and restarting the system. For systems such as banking and telecommunication software, however, the cost of downtime can be prohibitive. The situation is even worse for systems such as air-traffic controllers and life-support software, for which a shut-down is in general not an option. In those cases, the use of some form of on-the-fly program modification is required. In this paper, we present a new technique for dynamic updating of Java software. Our technique is based on the use of proxy classes and requires no support from the runtime system. The technique allows for updating a running Java program by substituting, adding, and deleting classes. We also present DUSC (Dynamic Updating through Swapping of Classes), a tool that we developed and that implements our technique. Finally, we describe an empirical study that we performed to validate the technique on a real Java subject. The results of the study show that our technique can be effectively applied to Java software with only little overhead in both execution time and program size.

[ BibTeX ref., PDF 128 Kb, PostScript.gz 60 Kb ]

Gamma System: Continuous Evolution of Software after Deployment

A. Orso, D. Liang, M.J. Harrold, and R. Lipton

In this paper, we present the GAMMA system, which facilitates remote monitoring of deployed software using a new approach that exploits the opportunities presented by a software product being used by many users connected through a network. GAMMA splits monitoring tasks across different instances of the software, so that partial information can be collected from different users by means of light-weight instrumentation, and integrated to gather the overall monitoring information. This system enables software producers (1) to perform continuous, minimally intrusive analyses of their software's behavior, and (2) to use the information thus gathered to improve and evolve their software.

[ BibTeX ref., PDF 124 Kb, PostScript.gz 76 Kb ]

Using Component Metacontents to Support the Regression Testing of Component-Based Software

A. Orso, M.J. Harrold, D. Rosenblum, G. Rothermel, M.L. Soffa, and H. Do

Interest in component-based software continues to grow with the recognition of its potential in managing the increasing complexity of software systems. However, the use of externally-provided components has serious drawbacks, in most cases due to the lack of information about the components, for a wide range of activities in the engineering of component-based applications. Consider the activity of regression testing, whose high cost has been, and continues to be, a problem. In the case of component-based applications, regression testing can be even more expensive. When a new version of one or more components is integrated into an application, the lack of information about such externally-developed components makes it difficult to effectively determine the test cases that should be rerun on the resulting application. In previous work, we proposed the use of metadata, which are additional data provided with a component, to support software engineering tasks. In this paper, we present two new metadata-based techniques that address the problem of regression test selection for component-based applications: a code-based approach and a specification-based approach. First, using an example, we illustrate the two techniques. Then, we present a case study that applies the code-based technique to a real component-based system. The results of the study indicate that, on average, 26% of the overall testing effort can be saved over seven releases of the component-based system studied, with a maximum savings of 99% of the testing effort for one version. This reduction demonstrates that metadata can produce benefits in regression testing by reducing the costs related to this activity.

[ BibTeX ref., PDF 116 Kb, PostScript.gz 52 Kb ]

Incremental Slicing Based on Data-Dependences Types

A. Orso, S. Sinha, and M.J. Harrold

Program slicing is useful for assisting with software-maintenance tasks, such as program understanding, debugging, impact analysis, and regression testing. The presence and frequent usage of pointers, in languages such as C, causes complex data dependences. To function effectively on such programs, slicing techniques must account for pointer-induced data dependences. Although many existing slicing techniques function in the presence of pointers, none of those techniques distinguishes data dependences based on their types. This paper presents a new slicing technique, in which slices are computed based on types of data dependences. This new slicing technique offers several benefits and can be exploited in different ways, such as identifying subtle data dependences for debugging purposes, computing reduced-size slices quickly for complex programs, and performing incremental slicing. In particular, this paper describes an algorithm for incremental slicing that increases the scope of a slice in steps, by incorporating different types of data dependences at each step. The paper also presents empirical results to illustrate the performance of the technique in practice. The experimental results show how the sizes of the slices grow for different small- and medium-sized subjects. Finally, the paper presents a case study that explores a possible application of the slicing technique for debugging.

[ BibTeX ref., PDF 172 Kb, PostScript.gz 116 Kb ]

Regression Test Selection for Java Software

M.J. Harrold, J. Jones, T. Li, D. Liang, A. Orso, M. Pennings, S. Sinha, S. Spoon, and A. Gujarathi

As software evolves during development and maintenance, regression testing is applied to modified versions of the software to provide confidence that the changed parts behave as intended and that the unchanged parts have not been adversely affected by the modifications. To reduce the cost of regression testing, test cases are selected from the test suite that was used to test the original version of the software---this process is called regression test selection. A safe regression-test-selection algorithm selects every test case in the test suite that may reveal a fault in the modified software. Safe regression-test-selection techniques can help to reduce the time required to perform regression testing because they select only a portion of the test suite for use in the testing but guarantee that the faults revealed by this subset will be the same as those revealed by running the entire test suite. This paper presents the first safe regression-test-selection technique that, based on the use of a suitable representation, handles the features of the Java language. Furthermore, unlike other safe regression test selection techniques, the presented technique also handles incomplete programs. The technique can thus be safely applied in the (very common) case of Java software that uses external libraries or components; the analysis of the external code is not required for the technique to select test cases for this kind of software. The paper also describes RETEST, a regression-test-selection system that implements our technique, and a set of empirical studies that demonstrate the potential savings that can be achieved using our technique. These studies show that, for the subjects we considered, the regression-test-selection algorithm can be effective in reducing the size of the test suite.

[ BibTeX ref., PDF 268 Kb ]

MASSA: Mobile Agents Security through Static/Dynamic Analysis

A. Orso, G. Vigna, and M.J. Harrold

Existing Mobile Agent Systems (MASs) suffer from security problems that must be solved, if mobile code is to be used in the development of mission-critical, real-world applications. In this paper we propose a framework that provides automated support for verification and analysis of MASs and allows for identifying security issues before the MASs are placed into action. The proposed approach is centered around an abstract reference model and combines static and dynamic security analysis techniques explicitly tailored to mobile agent systems. Our preliminary work shows that, in the cases we studied, appropriate analysis techniques facilitate the identification of security vulnerabilities in MASs.

[ BibTeX ref., PDF 72 Kb, PostScript.gz 32 Kb ]

Effects of Pointers on Data Dependences

A. Orso, S. Sinha, and M.J. Harrold

This paper presents a technique for computing and classifying data dependences that takes into account the complexities introduced by specific language constructs, such as pointers, arrays, and structures. The classification is finer-grained than previously proposed classifications. Moreover, unlike previous work, the paper presents empirical results that illustrate the distribution of data dependences for a set of C subjects. The paper also presents a potential application for the proposed classification---program slicing---and a technique that computes slices based on data-dependence types. This technique facilitates the use of slicing for program understanding because a user can either augment a slice incrementally, by incorporating data dependences based on their relevance, or focus on specific kinds of dependences. Finally, the paper presents a case study that shows how the incremental addition of data dependences allows for growing the size of the slices in steps.

[ BibTeX ref., PDF 104 Kb, PostScript.gz 340 Kb ]

Component Metadata for Software Engineering Tasks

This paper presents a framework that lets a component developer provide a component user with different kinds of information, depending on the specific context and needs. The framework is based on presenting this information in the form of metadata. Metadata describe static and dynamic aspects of the component, can be accessed by the user, and can be used for different tasks throughout the software engineering lifecycle. The framework is defined in a general way, so that the metadata can be easily extended if new types of data have to be provided. In our approach, we define a unique format and a unique tag for each kind of metadata provided. The tag lets the user of the component both treat the information provided as metadata in the correct way and query for a specific piece of information. We motivate the untapped potential of component metadata by showing the need for metadata in the context of testing and analysis of distributed component-based systems, and introduce our framework with the help of an example. We sketch a possible scenario consisting of an application developer who wants to perform two different software engineering tasks on her application: generating self-checking code and program slicing. A. Orso, M.J. Harrold, and D. Rosenblum

[ BibTeX ref., PDF 220 Kb, PostScript.gz 76 Kb ]

Automated Testing of Classes

U. buy, A. Orso and M. Pezzè

Programs developed with object technologies have unique features that often make traditional testing methods inadequate. Consider, for instance, the dependence between the state of an object and the behavior of that object: The outcome of a method executed by an object often depends on the state of the object when the method is invoked. It is therefore crucial that techniques for testing of classes exercise class methods when the method's receiver is in different states. The state of an object at any given time depends on the sequence of messages received by the object up to that time. Thus, methods for testing object-oriented software should identify sequences of method invocations that are likely to uncover potential defects in the code under test. However, testing methods for traditional software do not provide this kind of information. In this paper, we use data flow analysis, symbolic execution, and automated deduction to produce sequences of method invocations exercising a class under test. Since the static analysis techniques that we use are applied to different subproblems, the method proposed in this paper can automatically generate information relevant to testing even when symbolic execution and automated deduction cannot be completed successfully.

[ BibTeX ref., PDF 388 Kb ]

Integration Testing of Procedural Object-Oriented Languages with Polymorphism

A. Orso and M.Pezzè

Object oriented features like information hiding, inheritance, polymorphism, and dynamic binding, present new problems, that cannot be adequately solved with traditional testing approaches. In this paper we address the problem of integration testing of procedural object oriented systems in the presence of polymorphism. The kind of polymorphism addressed is inclusion polymorphism, as provided by languages like C++, Eiffel, CLOS, Ada95, and Java. This paper proposes a technique for testing combinations of polymorphic calls. The technique applies a data-flow approach to newly defined sets of polymorphic definitions and uses, i.e., definitions and uses that depend on polymorphic calls. The proposed testing criteria allow for selecting execution paths that can reveal failures due to incorrect combinations of polymorphic calls. The paper describes the technique and illustrates its efficacy through an example.

[ BibTeX ref., PDF 292 Kb, PostScript.gz 112 Kb ]

A Framework for Testing Object-Oriented Components

U. Buy, C. Ghezzi, A. Orso, M.Pezzè, and M. Valsasna

The terms "component" and "component-based software engineering" are relatively recent. Although there is broad agreement on the meaning of these terms, different authors have sometimes used slightly different interpretations. Following Brown and Wellnau, here we view a component as "a replaceable software unit with a set of contractually-specified interfaces and explicit context dependencies only." To date, many component systems are designed and developed using object technologies. The same is true of middleware architectures that support the deployment of components, such as Java Beans, CORBA, and DCOM. Although there is consensus that the issues in component-based software engineering cannot be solved merely by object technologies, these technologies will clearly play a fundamental role in the production of software components.
Programs developed with object technologies have unique features that often render traditional testing techniques inadequate. An important feature of these programs is that the behavior of a method may depend on the state of the method's receiver. Of course, the state of an object at a given time reflects the sequence of messages received by the object up to that time. As a result, erroneous behaviors may only be revealed by exercising specific sequences of messages. Component testing should identify sequences of messages whose execution is likely to show the existence of defects.
Our method uses various kinds of structural analyses in order to produce sequences of method invocations for the component under test. First, we rely on data-flow analysis in order to identify paths of interest in the component under test. Next, we apply a combination of symbolic execution and automated deduction techniques in effort to generate sequences of method invocations that exercise the paths identified earlier. A byproduct of our testing method is the definition of formal specifications in the form of preconditions and postconditions for the methods contained in the component under test. When performing integration testing, the method sequences resulting from the analysis of single components are then used jointly.
Evidently, developers and users of a component have different needs and expectations from testing activities. Our framework for component testing can benefit both parties involved. On the one hand, the method sequences we identify can be used as test cases to be applied during the development of the component under test. On the other hand, during integration testing component users can combine sequences generated for single components in order to validate groups of components and their interactions. In addition, the set of formal specifications produced by symbolic execution can be extremely valuable to prospective component users as they evaluate alternative component implementations for a given task.

[ BibTeX ref., PDF 152 Kb, PostScript.gz 44 Kb ]

Integration Testing of Object-Oriented Software

A. Orso

Object-oriented technology is becoming more and more popular in several different contexts. The Object-oriented paradigm has been applied in the areas of programming languages, databases, user interfaces, specification and design methodologies. Object-oriented languages are widely applied in industry, and several commercial applications are designed and developed with object-oriented technology. As a consequence, the attitude towards object-oriented software quality has undergone a rapid change during the last years. Initially, the object-oriented paradigm has been considered powerful enough to assure software quality without any additional effort. Today, both practitioners and researchers are aware that object-oriented software contains errors just like traditional code. Moreover, object-oriented systems present new and different problems with respect to traditional programs, due to their peculiarities.

While sharing some commonalities with traditional procedural languages, the object-oriented paradigm introduces novel aspects that have to be specifically addressed. Encapsulation and information hiding raise visibility problems, inheritance implies incremental testing concerns, polymorphism and dynamic binding introduce undecidability related issues. Moreover, the structure of object-oriented software is quite different from that of traditional programs. In object-oriented programs, procedures (methods) tend to be small and well understood. The complexity tends to move from within code modules to the interfaces between them. As a consequence, testing at the unit level tends to be less complex in the object-oriented case than for traditional procedural systems, and integration testing becomes necessarily more expensive.

Several techniques have been proposed in the literature for addressing object-oriented testing issues. Most of these techniques present interesting solutions for problems related to unit testing of object-oriented systems. Only a few papers, however, specifically address problems related to integration of object-oriented systems. This thesis proposes a new strategy for integration testing of object-oriented systems, and a new technique for testing interactions among classes in the presence of polymorphism. The architectural design of a tool supporting the application of the proposed approach is also presented.

After performing a preliminary analysis of traditional integration testing issues from an object-oriented perspective, we identify two main problems to be addressed as far as integration is concerned:

Choice of an integration strategy: Traditional integration testing strategies can still fit object-oriented systems, as long as we adapt them according to the new kind of relationships which can occur between classes and are not present in traditional systems. Association, aggregation, composition, and specialization introduce dependencies which must be considered, when choosing an integration order for object-oriented software systems. The combined effects of different relationships can result in complex and cyclic dependencies between the classes composing a system, which must be suitably addressed by the integration strategy.
Presence of polymorphic entities: Object-oriented programs can be seen as sets of objects which dynamically exchange messages. Polymorphism can introduce specific errors related to the impossibility of statically identifying the actual receiver of a message. Critical combinations of bindings can occur, along specific execution paths, that lead to failures. Exercising this kind of interactions exhaustively is infeasible. A technique must be defined, which allows for selecting meaningful test data to adequately exercise polymorphic interactions among classes.

For the solution of these problems, we propose an integration strategy that aims at exercising polymorphic interactions among the different classes composing a system. The technique is composed of two steps: the identification of an integration order, and the incremental testing of the polymorphic interactions while adding classes to the system. Starting from relationships between classes, we propose a method for defining a total order on the set of classes. This allows for identifying an integration order satisfying the following two properties: parent classes are always tested before their heirs; a given class is always integrated with all classes it depends on. The approach is based on the analysis of a graph representation of the system under test. Classes together with their relations define a directed graph. The analysis of the graph results in the definition of an integration order for either classes or groupings of classes (\emph{cluster}s). Although specially defined for polymorphism, the identified strategy of integration can be used in general, to incrementally build and test software systems starting from their components.

After the definition of the integration order for the system, we address the problem of selecting adequate test cases for testing combinations of polymorphic calls during the integration phase. The approach is based on a new data-flow test selection technique. We extend the traditional def and use sets by defining two new sets, def-p and use-p, which contain also information about possible dynamic bindings responsible for the definiton or the use of a given variable. Starting from these new sets, traditional data-flow test selection criteria can be suitably extended, and a set of new criteria can be obtained. The new criteria allow for selecting execution paths and bindings that might reveal failures due to incorrect combinations of polymorphic calls. Besides allowing for defining test selection criteria, the technique can be used to define test adequacy criteria for testing in the presence of polymorphism.

The proposed technique requires to be automated in order to be applicable on non-trivial examples. A tool for applying the approach has been designed and partially implemented. The tool analyzes Java-like code, provides the user with information about critical paths and bindings within the code, and incrementally evaluates the coverage achieved by test runs.

[ BibTeX ref., PDF 584 Kb, PostScript.gz 220 Kb ]

Open Issues and Research Directions in Object-Oriented Testing

A. Orso and S. Silva

To date, research in the field of object-orientation is mostly focused on the first phases of the traditional software life-cycle. A big effort has been spent on methodologies and techniques for specification, design and implementation of object-oriented systems. On the contrary, the research in the testing of this kind of system is still in an early phase. Often, the approach followed when testing object-oriented systems is just to apply well-known traditional testing techniques, neglecting the big differences related with this new paradigm of programming. Aspects like information hiding, encapsulation, polymorphism, dynamic binding, inheritance and genericity, typical when not peculiar of object-oriented languages, lead to the need of defining new approaches as far as testing is concerned. For example, information hiding makes it impossible to easily check the state of an object, polymorphism and dynamic binding introduce new problems to traditional coverage techniques, and inheritance causes problems when deciding what needs to be re-tested in a subclass. These ones are only a subset of the brand new problems introduced in the testing field by object-orientation. Moreover, even the traditional levels of testing lose their meaning in the object-oriented realm. It is difficult to precisely identify, say, what a unit is or what a module is when testing an object-oriented system. The main goal of this article is to classify the problems related with the testing of object-oriented system and to illustrate the approaches proposed so far in literature for solving them.

[ BibTeX ref., PDF 68 Kb, PostScript.gz 32 Kb ]

Introducing Formal Specification Methods in Industrial Practice

L. Baresi, A. Orso, and M. Pezzè

Formal specification methods are not often applied in industrial projects, despite their advantages and the maturity of theories and tools. The scarce familiarity of practitioners with formal notations and the difficulties of their use are main causes of the limited success of formal specification methods. Approaches based on the use of popular front-end notations formally defined with mappings on formal models can solve practical problems. However the absence of flexibility of the mappings proposed so far limits the applicability of such approaches to the few environments that match exactly these solutions. This paper presents an original solution based on formalisms to define mappings from front-end notations to formal models. The proposed framework works with different front-end notations and formal models and supports mappings of analysis results obtained on the formal model to the front-end notation chosen by the practitioners. The approach described in this paper has been validated in industrial environments using pilot applications. The paper presents some of the industrial results obtained so far and ongoing experimentations.

[ BibTeX ref., PDF 264 Kb, PostScript.gz 116 Kb ]

Customizable Notations for Kernel Formalisms

L. Baresi, A. Orso, and M. Pezzè

Rigorous formal methods and intuitive graphical notations can greatly enhance the development of complex computer systems. Formal methods guarantee non-ambiguity and support powerful analysis techniques. Intuitive graphical notations facilitate the communications between engineers preventing errors due to misunderstandings. Unfortunately, tools and techniques based on formal methods do not usually support adequate graphical notations; while tools and methods based on powerful graphical notations often lack formal foundations. This paper proposes a technique that allows kernel formalisms to be accessed through powerful graphical notations. The proposed technique allows graphical notations to be tailored to the needs of the specific application domain. This paper focuses on the tool support and describes the experiences gained so far in accessing different formal kernels through commercial and ad-hoc CASE tools.

[ BibTeX ref., PDF 300 Kb, PostScript.gz 80 Kb ]

HTML style by Antonio Carzaniga

Updated by Alex Orso on October 07, 2017 at 06:38:42.0000000000 EDT