As part of a broader organisational restructure, data networking research at Swinburne University of Technology has moved from the Centre for Advanced Internet Architecture (CAIA) to the Internet For Things (I4T) Research Lab.

Although CAIA no longer exists, this website reflects CAIA's activities and outputs between March 2002 and February 2017, and is being maintained as a service to the broader data networking research community.

L3DGEWorld Cluster-node Monitoring (LCMON) 1.0

August 9th, 2007
By Carl Javier (CAIA winter intern, 2007)
(website co-authored with Grenville Armitage)

Overview

L3DGEWorld Cluster-node Monitoring (LCMON) 1.0 demonstrates the use of L3DGEWorld 2.1 to provide real-time visualisation of the Swinburne Supercomputer cluster. Cluster nodes are represented within an interactive 3D environment by unique entities floating in space. Each entity is animated to represent its associated cluster node's real-time state (such as current CPU load or memory usage). In a typical scenario an LCMON 1.0 server provides a virtual world into which LCMON clients connect to view the supercomputer cluster's state. Multiple LCMON clients may connect at the same time, independently 'move around' the virtual world as they choose, and receive simultaneous real-time views of the cluster state from their own chosen perspectives.

LCMON 1.0's core, L3DGEWorld 2.1, is being developed as a network monitoring and control application based on Open Arena (a GPL'd game based on the Quake III Arena (Q3A) game engine). LCMON 1.0 shows how L3DGEWorld 2.1 may be more broadly utilised to create interactive 3D virtual worlds within which arbitrary real-time state information is represented.

Building on Open Arena means that LCMON 1.0 (and L3DGEWorld 2.1) are easy to install (and rebuild if desired) under Windows, Mac OS X, FreeBSD and Linux.

Swinburne Supercomputer

Run by the Centre for Astrophysics and Supercomputing, the Swinburne Supercomputer (as of July 2007) consists of over 1160 processors across 145 cluster nodes and has a theoretical peak processing capacity of 10 Teraflops. Each cluster node contains 2 quad-core Clovertown processors running at 2.33 GHz. The nodes are controlled by a head node which distributes jobs to the cluster via a queue system (itself controlled by Moab cluster management software). The head node is named 'green', and cluster statistics are currently provided through a web interface known as ganglia. Cluster nodes are named 'shrek001.ssi.swin.edu.au', 'shrek002.ssi.swin.edu.au', and so on. (Internal users may access more details about the cluster here.)
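The figures quoted above can be cross-checked with a little arithmetic. The sketch below assumes that each core completes 4 floating-point operations per clock cycle; that figure is not stated on this page, and the variable names are purely illustrative.

```python
# Back-of-the-envelope check of the cluster figures quoted above.
NODES = 145                 # cluster nodes (from the text)
SOCKETS_PER_NODE = 2        # quad-core processors per node (from the text)
CORES_PER_SOCKET = 4        # "quad-core" (from the text)
CLOCK_HZ = 2.33e9           # 2.33 GHz (from the text)
FLOPS_PER_CYCLE = 4         # ASSUMPTION: not stated in the text

cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
peak_teraflops = cores * CLOCK_HZ * FLOPS_PER_CYCLE / 1e12

print(f"{cores} cores, roughly {peak_teraflops:.1f} teraflops peak")
```

Under that assumption the core count matches the 1160 quoted above, and the estimated peak lands in the same neighbourhood as the quoted 10 Teraflops.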

LCMON provides a different real-time perspective into the cluster's state. In LCMON 1.0's virtual world each cluster node is represented as a 3D star, with the entire cluster laid out as an amphitheatre of stars. Each star rotates, bounces on the spot, and changes colour and/or size in proportion to the CPU load, memory in use, and network traffic in/out of its associated cluster node.
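As a rough illustration of this idea, the sketch below maps one node's metrics onto normalised star attributes. The pairing of metric to attribute follows the legend accompanying the screenshots on this page (colour is omitted for brevity), but the class, the function names, the 0.0 to 1.0 ranges and the packet-rate ceiling are all hypothetical; the mapping actually used by LCMON 1.0 is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class StarState:
    """Visual attributes of one star (hypothetical names and units)."""
    rotation_rate: float   # 0.0 (stationary) .. 1.0 (fastest spin)
    scale: float           # 0.0 (smallest) .. 1.0 (largest)
    bounce_height: float   # 0.0 (at rest) .. 1.0 (highest bounce)


def normalise(value: float, maximum: float) -> float:
    """Clamp value/maximum into the range 0.0 .. 1.0."""
    if maximum <= 0:
        return 0.0
    return max(0.0, min(1.0, value / maximum))


def star_for_node(cpu_load_pct: float, mem_used_pct: float,
                  pkts_in_per_sec: float,
                  max_pkts_per_sec: float = 10_000.0) -> StarState:
    """Map one node's metrics onto star attributes.

    max_pkts_per_sec is an ASSUMED ceiling chosen for illustration.
    """
    return StarState(
        rotation_rate=normalise(cpu_load_pct, 100.0),
        scale=normalise(mem_used_pct, 100.0),
        bounce_height=normalise(pkts_in_per_sec, max_pkts_per_sec),
    )


idle = star_for_node(0.0, 0.0, 0.0)
busy = star_for_node(75.0, 50.0, 2_500.0)
print(idle)
print(busy)
```

Clamping keeps an out-of-range reading (for example a traffic burst above the assumed ceiling) from producing an oversized or runaway star.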

Visual appearance, at a glance

The following four static images capture the view seen by a single user after logging into an LCMON 1.0 server and flying around the virtual environment. When a user first logs into the LCMON server they are placed 'floating' above and facing the amphitheatre of stars representing the cluster itself. Figure 1 shows an idle supercomputer - all stars are small and stationary - from the perspective of a user who has just logged in to the LCMON server. Figure 2 shows a snapshot of the supercomputer with a distribution of activity - some cluster nodes have quite different memory usage at the time (stars of various sizes), and some cluster nodes have noticeable network traffic in and out (their stars are caught in mid-bounce). Figures 3 and 4 provide a snapshot of an active cluster from different positions in the virtual environment.

Legend (star attribute and the cluster node metric it represents):
  • Rotation rate & colour: CPU Load (%)
  • Scale size: Memory Usage (%)
  • Bounce height: Traffic in (PPS)

Figure 1: Supercomputer cluster node overview with no state information
Figure 2: Snapshot of cluster nodes bouncing in the virtual world
Figure 3: Snapshot from the back of the virtual world
Figure 4: Snapshot of the side view in the virtual world

Detailed images and examples

Additional images and examples of the LCMON 1.0 user interface can be found here.

Example video

Of course, LCMON's main advantage isn't the static view. The following video (hosted on YouTube) illustrates LCMON's dynamic representation of cluster state.

We begin with a single user's view of the cluster changing from idle to quite busy as more and more nodes become active. At about 1min20sec we show how the user can learn detailed information about particular cluster nodes either by flying up close to the node's star, or by 'shooting' a node's star from a distance. At 1min54sec we show how a second user (who is also inspecting cluster nodes) would appear within the virtual environment. (Note that the cluster behaviour shown here has been synthesised for demonstration purposes.)



Functional overview and live demonstration server

As illustrated in Figure 5, the major components of an LCMON 1.0 system are:
  • LCMON Server
    •  An instance of L3DGEWorld 2.1 server running a specific 'map' to represent the LCMON 1.0 virtual world.
    • Gpoll ('ganglia poll') daemon - a separate process regularly polling and parsing the state of Swinburne cluster nodes for the L3DGEWorld 2.1 server.
  • LCMON Client
    • An instance of L3DGEWorld 2.1 client, used to enter and interact with the LCMON virtual world.
As of August 2007 we have a live LCMON Server running on l3dgeworld.caia.swin.edu.au:27960 (i.e. available on port 27960 of l3dgeworld.caia.swin.edu.au). This server is actively polling the Swinburne cluster every 60 seconds, and LCMON Clients may connect from anywhere inside the Swinburne University network. (Note: Currently l3dgeworld.caia.swin.edu.au is not accessible from outside the Swinburne network.)

Gpoll uses telnet to establish a TCP connection to a monitoring node within the supercomputer cluster every 60 seconds (step 1). The state of all cluster nodes is handed back in XML-encoded form (step 2), which Gpoll then parses (step 3) to extract selected data (such as CPU load, memory in use, and network traffic in/out). Gpoll uses a generalised interface to the L3DGEWorld 2.1 engine (described further in [1]) to update the virtual environment entities (in this case stars) representing each cluster node (step 4).
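Steps 2 and 3 might be sketched as follows. This is not Gpoll's code: it assumes the XML consists of HOST elements containing METRIC elements with NAME and VAL attributes, and that metrics named cpu_idle, mem_total, mem_free and pkts_in are present. The sample document, its values and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the general shape of a ganglia XML dump (ASSUMED layout
# and metric names; the values are invented for illustration).
SAMPLE_XML = """\
<GANGLIA_XML>
  <CLUSTER NAME="example">
    <HOST NAME="shrek001.ssi.swin.edu.au">
      <METRIC NAME="cpu_idle" VAL="20.0"/>
      <METRIC NAME="mem_total" VAL="1000"/>
      <METRIC NAME="mem_free" VAL="250"/>
      <METRIC NAME="pkts_in" VAL="120.5"/>
    </HOST>
    <HOST NAME="shrek002.ssi.swin.edu.au">
      <METRIC NAME="cpu_idle" VAL="100.0"/>
      <METRIC NAME="mem_total" VAL="1000"/>
      <METRIC NAME="mem_free" VAL="1000"/>
      <METRIC NAME="pkts_in" VAL="0.0"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>
"""

# Only these metrics are converted to numbers; anything else is ignored.
WANTED = {"cpu_idle", "mem_total", "mem_free", "pkts_in"}


def parse_cluster_state(xml_text):
    """Return {host name: {"cpu_load_pct", "mem_used_pct", "pkts_in"}}."""
    state = {}
    root = ET.fromstring(xml_text)
    for host in root.iter("HOST"):
        # Collect this host's wanted metrics into a name -> float lookup.
        metrics = {m.get("NAME"): float(m.get("VAL"))
                   for m in host.iter("METRIC")
                   if m.get("NAME") in WANTED}
        mem_total = metrics.get("mem_total", 0.0)
        mem_free = metrics.get("mem_free", 0.0)
        state[host.get("NAME")] = {
            "cpu_load_pct": 100.0 - metrics.get("cpu_idle", 100.0),
            "mem_used_pct": (100.0 * (mem_total - mem_free) / mem_total
                             if mem_total > 0 else 0.0),
            "pkts_in": metrics.get("pkts_in", 0.0),
        }
    return state


for name, values in parse_cluster_state(SAMPLE_XML).items():
    print(name, values)
```

In a real poller the XML text would come from the TCP connection opened in step 1 rather than from a string constant, and the resulting per-node values would be handed to the L3DGEWorld 2.1 engine in step 4.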

Multiple LCMON Clients, from different OS platforms, may connect at any time to the LCMON server to view the states of the cluster nodes (step 5).

Figure 5: Information flow and client-server relationship in LCMON 1.0

(For people familiar with Quake III Arena, LCMON is essentially a custom map running on a modified Quake III Arena game engine. The client and dedicated server executables have both been modified to enable L3DGEWorld 2.1 to communicate with external daemons. As with a normal Quake III Arena game, one or more clients may be connected to a single server, each one rendering a separately controlled view of the virtual environment. Clients may connect and disconnect from the server at any time without disrupting the server's virtual environment, and may do so from wherever there is UDP/IP connectivity to the LCMON Server.)
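One way to check for UDP/IP connectivity before launching a client is to send the connectionless 'getstatus' query that Quake III Arena-derived servers normally answer. The sketch below assumes the modified LCMON server still responds to that query, which this page does not confirm; the function names are hypothetical and the sample reply is made up.

```python
import socket

OOB_PREFIX = b"\xff\xff\xff\xff"   # marks a connectionless (out-of-band) packet


def parse_status_response(packet):
    """Split a statusResponse packet into a dict of server variables."""
    if not packet.startswith(OOB_PREFIX + b"statusResponse"):
        raise ValueError("not a statusResponse packet")
    lines = packet[len(OOB_PREFIX):].decode("latin-1").split("\n")
    # lines[0] is "statusResponse"; lines[1] is "\key\value\key\value..."
    fields = lines[1].split("\\")[1:] if len(lines) > 1 else []
    return dict(zip(fields[0::2], fields[1::2]))


def probe_server(host, port=27960, timeout=2.0):
    """Send one getstatus query over UDP; return server variables or None."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(OOB_PREFIX + b"getstatus\n", (host, port))
            packet, _ = sock.recvfrom(65535)
        except OSError:          # timeout, unreachable host, DNS failure
            return None
    try:
        return parse_status_response(packet)
    except ValueError:
        return None


# Offline demonstration on a made-up reply (values invented for illustration).
sample = OOB_PREFIX + b"statusResponse\n\\mapname\\greenmachine\\sv_maxclients\\8\n"
print(parse_status_response(sample))
```

Calling probe_server with the server's host name returns None when no reply arrives within the timeout, which is the expected outcome from anywhere without UDP/IP connectivity to the server.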

LCMON 1.0 may be used to monitor other supercomputer clusters (or even entirely different systems) by modifying Gpoll (and optionally redesigning the Quake III Arena 'map' [2] used to represent the virtual environment shown in Figures 1 to 4).

System Requirements

LCMON 1.0's underlying L3DGEWorld 2.1 core (both client and server sides) has been verified to run on FreeBSD 6.2, Mac OS X 10.4.9, Linux (Ubuntu 7.04) and Windows XP (with the addition of Cygwin). Gpoll (Figure 5) has only been verified to run on FreeBSD 6.2 (although we believe it should be portable to other platforms).

LCMON 1.0's system requirements are the same as L3DGEWorld 2.1's requirements.

Licensing

L3DGEWorld 2.1 and LCMON 1.0 are copyright (C) 2007, the Centre for Advanced Internet Architectures, Swinburne University of Technology, and distributed under version 2 of the GNU General Public Licence.

Download

LCMON plus L3DGEWorld package:

Add on module to L3DGEWorld (for those who already have L3DGEWorld):

The .map file for LCMON 1.0
  • greenmachine.map (19KB) (optional, only if you wish to create a modified virtual environment [2])
    • Note: If you're having trouble with the sky and/or floor of the map 'smearing' around the edges, set "\r_fastsky 1" at the client console to turn off the fancy star field being used as a background sky.

Authors and Acknowledgments

  • LCMON 1.0 was developed from L3DGEWorld 2.1 by Carl Javier during his winter internship at CAIA, July - August 2007, under the supervision of Grenville Armitage.
  • We appreciate the co-operation of Dr Jarrod Hurley and Professor Matthew Bailes from the Centre for Astrophysics and Supercomputing.
  • L3DGEWorld 2.1 was developed by Lucas Parry.
  • The Gpoll input daemon was developed by Carl Javier and Lucas Parry.
  • We received a lot of valuable feedback, website editing and system testing from Grenville Armitage.
  • Thanks to the OpenArena team - their free textures and artwork on the Quake III Arena codebase made it possible for us to distribute LCMON as a complete package.

References

  1. L. Parry, "L3DGEWorld 2.1 Input & Output Specifications", CAIA Tech Report 070808A, August 2007
  2. C. Javier, "Map & Entity Modeling for L3DGEWorld", CAIA Tech Report 070809A, August 2007




Last Updated: Wednesday 15-Aug-2007 13:47:44 AEST | Maintained by: Grenville Armitage (garmitage@swin.edu.au) | Authorised by: Grenville Armitage (garmitage@swin.edu.au)