Graphics, Visualization, and Usability Center

System Monitoring for Solaris™ NEO


Introduction

The Sun Microsystems, Inc. NEO product family (formerly Project DOE) is an ...
... object-oriented development and runtime environment for implementing network-based shared services for programming your network.
Software systems such as Solaris NEO, the networking object extension to the Solaris operating system and the runtime environment in the NEO product family, are becoming larger and more complex. The entire environment seems opaque to the user: it is often difficult to know and understand the behavior of different services and objects. Errors, anomalies, and other information are not conveyed to the user in an appropriate or timely manner.

Textual statistical presentations are often used to help understand these systems. However, they do not allow the user to gain a high-level, overall understanding of the system quickly. Concurrency issues and the size of the tracing information complicate access to accurate details. When the user wants to interact with the system, he or she must also use a textual tool that often has complex syntax and is error-prone (for example, neoadmin in Solstice NEO).

To better understand, monitor, and evaluate Solaris NEO and NEO applications, we are developing a new approach that makes the system clearer to the user. We are using on-line visualizations and audio cues to help portray the execution of the object services. Because humans can perceive and comprehend visual and auditory information quickly, multimedia visualizations can relieve users, programmers, and administrators of some of the cognitive burden of understanding the software system. Graphical tools can also help users interact with the system.


Current Status

In the first several months of design, we did not have access to the NEO software. Consequently, we built a simple monitoring application that uses randomly generated statistics to simulate NEO. The following is a snapshot of the program:

[Figure: program snapshot]

The top half of the application window is a generic overview of a NEO workgroup. Each row represents a host and each column represents a service (the service name under the mouse pointer is displayed at the bottom of the top area).

If a service is not registered on a host, the corresponding table cell is blank. If the service is registered but no server process has been created to handle it, the cell shows a circle. If a server process exists for the service, the number of NEO objects in the server is shown as horizontal lines in the cell, and a grey core represents the percentage of CPU time the server uses.
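The three cell states described above can be sketched as a small function. This is only an illustration under stated assumptions: the `ServerInfo` record, the `render_cell` name, and the text glyphs are hypothetical stand-ins, not part of the actual monitoring tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerInfo:
    """Hypothetical record for one server process (field names are illustrative)."""
    object_count: int   # number of NEO objects held by the server
    cpu_percent: float  # percent of CPU time used by the server


def render_cell(registered: bool, server: Optional[ServerInfo]) -> str:
    """Return a text stand-in for one overview table cell."""
    if not registered:
        return ""    # service not registered on this host: blank cell
    if server is None:
        return "o"   # registered, but no server process: circle
    # Server process exists: one short line per NEO object, plus the CPU "core".
    lines = "-" * server.object_count
    return f"{lines} [cpu {server.cpu_percent:.0f}%]"
```

A real implementation would draw graphics rather than return strings; the string form only makes the decision logic easy to read.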

When the user clicks on one of the overview table cells, the magnified view in the lower half of the window displays more detailed information about the corresponding server process.

In the center of the magnified window a grey area represents the total process image size and a black core area represents the resident process image size. The arrows on the left and right sides represent recent message receiving and sending rates (messages per second), respectively. The top of the magnifier window shows the percent of system memory used by this process.
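As a rough sketch of how the magnified view's geometry might be derived, the functions below scale image sizes to bar heights and turn a message rate into a number of arrows. The function names, the pixel height, the scaling against the largest process image, and the messages-per-arrow ratio are all assumptions made for illustration.

```python
import math


def bar_heights(total_kb, resident_kb, max_total_kb, full_height_px=200):
    """Scale total and resident image sizes to pixel heights.

    Assumes the full bar height corresponds to the largest total image
    size among the processes being shown.
    """
    if max_total_kb <= 0:
        return 0, 0
    scale = full_height_px / max_total_kb
    grey = round(total_kb * scale)                     # total image size
    core = round(min(resident_kb, total_kb) * scale)   # resident part
    return grey, core


def arrow_count(rate_per_sec, msgs_per_arrow=10, max_arrows=8):
    """Map a message rate (messages per second) to a capped arrow count."""
    if rate_per_sec <= 0:
        return 0
    return min(max_arrows, math.ceil(rate_per_sec / msgs_per_arrow))
```

Capping the arrow count keeps a very busy server from overflowing the view; a different cap or a logarithmic scale would work equally well.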

Certainly, more work on the visualization is required. We have installed Solaris NEO and WorkShop NEO locally and hope to progress further soon.
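For readers who want to experiment with the same idea, the simulated data source described at the start of this section might look something like the sketch below. It is not the prototype's code: the function name, the state labels, and the value ranges are assumptions chosen for illustration.

```python
import random


def simulate_workgroup(hosts, services, seed=None):
    """Generate random per-cell statistics for a simulated workgroup.

    Returns a dict keyed by (host, service). Each value is a
    (state, stats) pair, where stats is None unless a server is running.
    """
    rng = random.Random(seed)  # private generator so runs are repeatable
    table = {}
    for host in hosts:
        for service in services:
            state = rng.choice(["unregistered", "idle", "running"])
            if state == "running":
                stats = {
                    "objects": rng.randint(1, 8),
                    "cpu_percent": round(rng.uniform(0, 100), 1),
                }
            else:
                stats = None
            table[(host, service)] = (state, stats)
    return table
```

Passing a fixed `seed` makes the generated workgroup reproducible, which is convenient when tuning the display.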


System and/or Process Statistics To Be Visualized

The table below lists a number of important statistics and values related to NEO and how they are represented in our visualization.

Design Specification

| | Statistic | Presentation | Implementation |
|---|---|---|---|
| Workgroup | Hosts | Left-most column in the overview window. | Through neoadmin |
| | Services available | Color-coded index bar on top of each column (except the hosts column). The user will be allowed to change the color bindings. When the user moves the mouse pointer over an overview table cell, the name of the service corresponding to that column is shown at the bottom of the overview. | Through neoadmin |
| Host | Services available | In the overview table, a blank cell means that the service is not available on the host; a cell with a circle means that the service is available on the host but no server process exists to serve it; other cells represent existing server processes. | Through neoadmin |
| | System load average | Colored background of the host name in the hosts list: the greener it is, the lower the load average; the redder it is, the higher the load average. Red means that the load average is above 1.0. | Read kernel memory |
| Server | Number of NEO objects in the implementation | Number of short horizontal lines in the overview table cell. | Through neoadmin |
| | Percent of CPU time used recently | Background color of the overview table cell: the greener it is, the lower the CPU time; the redder it is, the higher the CPU time. Red means that the CPU time is almost 100%. | Through the proc interface |
| | Percent of system memory used recently | Height of the grey bar in the center of the overview table cell. | Through the proc interface |
| | Total and resident set size of process image | Process image bar in the magnifier view. The full height is the maximum process image size among the existing processes. The height of the grey area corresponds to the total image size; the height of the dark core area corresponds to the resident set size. | Through the proc interface |
| | Recent message send and receive rates | Shown to the left and right of the process image bar in the magnifier view. The number of arrows represents the value. The left side represents the message receive rate; the right side represents the message send rate. | Through the proc interface |
| | Recent minor and major page fault rates; recent swap rate; recent input and output rates; recent voluntary and involuntary context switch rates; recent system call rate | Respective bars in the magnifier view. | Through the proc interface |
| | Object and memory leakages; process crash count | Respective bars in the magnifier view. | ? |
| | Process call graph | Click on the "Call Graph" button in the magnifier view. | ? |
| | Other available statistics | Textual listing in the magnifier view. | Varies |
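
The green-to-red encodings listed above for load average and CPU time could be computed along these lines. The table only says that lower values are greener and higher values are redder; the linear interpolation, the `green_red` name, and the RGB output format are assumptions.

```python
def green_red(value, red_at):
    """Map a statistic to an (r, g, b) color between green and red.

    `red_at` is the value at which the color becomes fully red, for
    example 1.0 for load average or 100.0 for percent of CPU time.
    Values outside the range 0..red_at are clamped.
    """
    t = max(0.0, min(1.0, value / red_at))
    return (round(255 * t), round(255 * (1 - t)), 0)


# Example: background for a host whose load average is 0.4.
host_color = green_red(0.4, red_at=1.0)
```

Using one function for both statistics keeps the two color scales visually consistent; only the `red_at` argument differs.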

Note: the user can define thresholds for various statistics and bind sounds to them -- if one statistic goes above (or below) its threshold, the tool plays the corresponding sound.
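
A minimal sketch of the threshold-to-sound binding described in the note, assuming a simple per-statistic rule. The `SoundBinding` record, the statistic names, and the sound file names are hypothetical; actual audio playback is left out.

```python
from dataclasses import dataclass


@dataclass
class SoundBinding:
    """Hypothetical user-defined rule tying a statistic to a sound."""
    statistic: str    # key into the sampled statistics
    threshold: float
    direction: str    # "above" or "below"
    sound: str        # name of the sound to play (illustrative)


def triggered_sounds(bindings, sample):
    """Return the sounds whose thresholds are crossed by this sample."""
    sounds = []
    for b in bindings:
        value = sample.get(b.statistic)
        if value is None:
            continue  # statistic not present in this sample
        above = b.direction == "above" and value > b.threshold
        below = b.direction == "below" and value < b.threshold
        if above or below:
            sounds.append(b.sound)
    return sounds


bindings = [
    SoundBinding("cpu_percent", 90.0, "above", "alarm.au"),
    SoundBinding("msg_receive_rate", 1.0, "below", "quiet.au"),
]
sample = {"cpu_percent": 97.5, "msg_receive_rate": 12.0}
to_play = triggered_sounds(bindings, sample)
```

Returning the list of sounds, rather than playing them directly, keeps the rule check separate from whatever audio facility the tool ends up using.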


Open Problems


Appendix: System and Process Statistics Available

Directly Available Through Normal Solaris System and/or Process Information

Directly Available Through Solaris Microstate Accounting Information

NEO Server and/or Client Specific Information

Information We Would Like to Get

Statistics Which We can Calculate


Software Visualization Contact Information