CS7390 Report on Program Visualization
X. Hua Du
01/26/98
Program visualization is the visualization of actual program code or data structures in either static or dynamic form. With its help, software becomes much easier to develop, maintain and understand. In this report, we discuss several techniques in this field.
The program publication technique tries to establish a formal routine for programming, especially for documentation purposes. Rather than talking about mere lines of code, we talk about literate programming: programs are not just files but technical publications. To turn programs into publications, some principles need to be adopted so that the code can be mapped properly and attractively onto visible language constructs. Baecker's group has developed a visual compiler named SEE, which performs this kind of mapping for C programs and produces high-quality output on printers. The newly proposed program publication prototype is styled like a real book, with an introduction, an overview, chapters and indices at the end. Writing or reading a program thus becomes much like writing or reading a book. This is clearly valuable in documentation systems, since it greatly reduces the effort people need to understand and maintain a software system. Of course, programmers will be required to follow the new programming standard. This may be time-consuming at the beginning, but it should be worthwhile in the long run.
To visualize a software system, we first need to generate an appropriate model quickly and correctly, and then choose a visualization tool to view it. North et al. have identified three kinds of layout that are effective for system modeling: hierarchical layout (for trees and directed acyclic graphs), orthogonal layout (for planar systems) and the spring model (for sparse graphs). Since the corresponding heuristics are reasonably efficient, they can generate satisfactory models without too much effort. A system called dot has been developed to visualize these graph models, and it is said to be fairly popular for system modeling and visualization these days. Some successful examples are the CIAO system, the LDBX system, the Improvise system and the VPM system. With CIAO, people can navigate different pieces of software to browse or query the structural connections among them. LDBX adds a graphical display window to a conventional text-style debugger. Improvise further allows people to build graph models with embedded multimedia information. All these applications are built on the dot system and give users more flexibility to model and visualize software systems graphically.
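To illustrate the hierarchical layout mentioned above, the following minimal sketch assigns each node of a directed acyclic graph to a layer, so that every edge points from a lower layer to a higher one. It uses longest-path layering, which is one common heuristic and is not claimed to be what the dot system does. The function name assign_layers and the sample call graph are hypothetical.

```python
from collections import defaultdict, deque

def assign_layers(edges):
    """Longest-path layering for a DAG given as (source, target) pairs.

    Returns a dict mapping each node to its layer index (0 = top).
    """
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
        nodes.update((src, dst))

    layer = {n: 0 for n in nodes}
    # Start from the nodes that have no incoming edges.
    ready = deque(sorted(n for n in nodes if indegree[n] == 0))
    visited = 0
    while ready:
        node = ready.popleft()
        visited += 1
        for nxt in successors[node]:
            # A node sits one layer below its deepest predecessor.
            layer[nxt] = max(layer[nxt], layer[node] + 1)
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if visited != len(nodes):
        raise ValueError("graph contains a cycle; layering needs a DAG")
    return layer

# Hypothetical call graph of a small program.
call_graph = [("main", "parse"), ("main", "render"),
              ("parse", "lex"), ("render", "lex")]
layers = assign_layers(call_graph)
```

A complete hierarchical layout would go on to order the nodes within each layer and assign coordinates; this sketch covers only the layering step.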
There are also some other interesting techniques for viewing dynamic execution behavior. To handle the dynamic nature of a running system, some basic principles need to be considered: animation, metaphors, interconnection, interaction and dynamic scale. Animation is undoubtedly a very useful tool for dynamic viewing. Experience shows that metaphors, especially simple and nature-based ones, enhance people's comprehension dramatically. Interconnection illustrates clearly the relationships between the distinct aspects of the system. By interaction, we mean that users should be able to take part in the visualization, which makes the system more user-friendly. Dynamic tracking and scaling are clearly crucial as well. The visualization techniques themselves are itemized in Jeffery's paper as the tree-map, the circular tree, the "Algae" metaphor, the tree-ring and the radar-sweep. The first three are really just different representations and views of trees. The tree-map and the circular tree use a space of fixed size to contain all the related information and keep updating the view within that region. Compared to traditional methods for drawing trees, they scale better, but each update can still be expensive. With this observation, another option for tree animation was proposed, called the "Algae" metaphor. In this model, some detailed information about the tree is omitted, so the animation is significantly faster. By itself, however, this representation is not sufficient. The tree-ring technique is, in my opinion, another very natural but creative metaphor. It can be employed as a simple but vivid model, especially for chronological sequences, and it keeps almost all the history information. The radar-sweep technique differs in this respect: because it uses polar coordinates, it keeps track of the latest events but naturally erases old information after a period of time. Each of these techniques has its own advantages and applicable situations, and we would expect each to achieve good results under the right circumstances.
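To make the fixed-size-region idea behind the tree-map concrete, here is a minimal sketch of a slice-and-dice tree-map: each node receives a rectangle, and its children divide that rectangle in proportion to their sizes, alternating the split direction at each depth. The slice-and-dice scheme is an assumption chosen for simplicity; the details of the variant described in Jeffery's paper are not given here. The dictionary layout of the tree and the function names are hypothetical.

```python
def tree_size(node):
    """Weight of a node: a leaf's own size, or the sum over its children."""
    children = node.get("children", [])
    if not children:
        return node.get("size", 0)
    return sum(tree_size(c) for c in children)

def treemap(node, x, y, w, h, depth=0, out=None):
    """Return a list of (name, x, y, width, height) rectangles."""
    if out is None:
        out = []
    out.append((node["name"], x, y, w, h))
    children = node.get("children", [])
    total = sum(tree_size(c) for c in children)
    if total == 0:
        return out
    offset = 0.0  # fraction of the parent rectangle already used
    for child in children:
        share = tree_size(child) / total
        if depth % 2 == 0:
            # Even depth: place children side by side along the x axis.
            treemap(child, x + offset * w, y, w * share, h, depth + 1, out)
        else:
            # Odd depth: stack children along the y axis.
            treemap(child, x, y + offset * h, w, h * share, depth + 1, out)
        offset += share
    return out

# Hypothetical tree: leaf sizes could be, for example, memory used per object.
tree = {"name": "root", "children": [
    {"name": "a", "size": 3},
    {"name": "b", "children": [
        {"name": "b1", "size": 1},
        {"name": "b2", "size": 1},
    ]},
]}
rects = treemap(tree, 0.0, 0.0, 100.0, 50.0)
```

However large the tree grows, the whole drawing stays inside the same 100 by 50 region. In this simple sketch every rectangle is recomputed whenever a size changes, which reflects the update cost noted above.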
For the automated drawing of data structure diagrams, a new framework has been put forward by Ding's group. The central idea is to collect from practice the aesthetic rules and factors for diagram drawing and then express them as computable formulas. The drawing system can then apply these criteria to evaluate all the alternative diagrams and determine the best one. The group did substantial work in collecting and formulating rules, and they also proposed a framework for implementing an automated drawing system. In this framework, a data structure is defined to represent rules and factors, all of which are stored in a database. The structure contains all the attributes of a rule, such as its class, applicable condition, weight and badness function. The class and applicable condition identify the situations in which the rule should be used. The weight reflects the importance of the rule. The badness function is the mathematical formula used to evaluate a figure element. At run time, the system pulls the applicable rules into a work area, according to an abstraction of the data structure that needs to be represented. This abstraction is expected to be obtained with the assistance of other analysis. Based on the same abstraction, a set of feasible diagrams is generated in the work area. Then a component called the driver evaluates all these alternative diagrams against the rules available in the work area and finally selects the solution. This is the entire procedure by which the system works. Still, there are some concerns. First, we might have thousands of rules in the database, which turns the task into an optimization problem with a huge number of constraints, one that can hardly be solved in a real-time system. Another concern is the acquisition of the data abstraction. Under the proposed implementation, this is very difficult and may even be impossible in a real-time system. The whole strategy therefore becomes limited to off-line usage, such as documentation systems.
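The rule-driven procedure described above can be sketched roughly as follows. Each rule carries a class, an applicable condition, a weight and a badness function, and a driver picks the candidate diagram with the lowest total badness. Combining the rules as a weighted sum is an assumption made for illustration, not something stated about Ding's framework; all names, rules and numbers below are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    rule_class: str                    # hypothetical class label
    applies: Callable[[dict], bool]    # applicable condition on the abstraction
    weight: float                      # importance of the rule
    badness: Callable[[dict], float]   # lower values mean a better diagram

def select_rules(database, abstraction):
    """Pull the applicable rules into the work area."""
    return [rule for rule in database if rule.applies(abstraction)]

def total_badness(diagram, rules):
    """Assumed combination: weighted sum of the per-rule badness values."""
    return sum(rule.weight * rule.badness(diagram) for rule in rules)

def best_diagram(candidates, rules):
    """Driver: evaluate every alternative and keep the least bad one."""
    return min(candidates, key=lambda d: total_badness(d, rules))

# Hypothetical rule database.
database = [
    Rule("few_crossings", "edges", lambda a: a["kind"] in ("list", "tree"),
         3.0, lambda d: d["crossings"]),
    Rule("compact", "area", lambda a: True,
         1.0, lambda d: d["width"] * d["height"] / 100.0),
    Rule("balanced", "symmetry", lambda a: a["kind"] == "tree",
         2.0, lambda d: abs(d["left"] - d["right"])),
]

# Hypothetical abstraction and two candidate drawings of a linked list.
abstraction = {"kind": "list"}
candidates = [
    {"id": "horizontal", "crossings": 0, "width": 40, "height": 5},
    {"id": "folded", "crossings": 2, "width": 20, "height": 10},
]

rules = select_rules(database, abstraction)
chosen = best_diagram(candidates, rules)
```

Because the driver evaluates every candidate against every applicable rule, the cost grows with the product of the two counts, which is the scaling concern raised above for large rule databases.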
All the program visualization techniques discussed so far help a great deal in system development, maintenance and comprehension. However, they are meaningful only if we can model the software system properly and comprehensibly: a good visualization tool applied to an improper software model conveys little. Besides, visualization techniques are still far from complete, especially for dynamic viewing. More effort is needed to find better modeling strategies and more efficient visualization algorithms. Currently, many principles in the field of program visualization rely heavily on people's experience and preference; they are therefore subjective and very hard to express mathematically. This is perhaps another problem worth working on, so that software visualization technology can be systematized.