Abstracts
Presentations
(in alphabetical order)
Bartosz Balis, Marian Bubak, Wlodzimierz Funika, Marcin Radecki,
Tomasz Szepieniec, Roland Wismueller, Tomasz Arodz, and Marcin Kurdziel
A Concept of a Monitoring Infrastructure for
Workflow-Based Grid Applications
The main goal of this work is to design a Grid service for on-line performance monitoring of workflow-based Grid applications. The service is meant to provide information about running applications: the current status of the application and of the Grid infrastructure, the availability of resources, and the correct operation of the services the application comprises. This information could be used by systems that ensure fault-tolerant execution of the application, by scheduling systems, and by the user who observes the execution of the application and takes relevant decisions on its operation.
For workflow-based Grid applications, two aspects of performance monitoring are important: first, monitoring the status of the Grid services composing the workflow and the interactions between them; second, monitoring the internal performance of individual services. The latter aspect is already well addressed by approaches such as CrossGrid's OCM-G/G-PM.
Our focus in this paper is on monitoring applications built as a workflow of interacting Grid services. Support for such applications requires a completely new approach. New, grid-service-related metrics need to be introduced, e.g. the overhead due to communication between services or the computing-power utilization of a service.
We propose the following architecture for the monitoring system for workflow-based applications. First, the individual Grid services to be monitored should be extended with a monitoring interface via which monitoring information about a component can be retrieved. Additionally, this interface may also enable instrumentation or other manipulations of the target component. This implies that parts of the monitoring infrastructure must be integrated into the Grid services themselves. In addition to the local monitoring interfaces in each component, a global monitoring service should be available, itself a Grid service, via which it will be possible to extract global monitoring properties combining monitoring information from the set of components which are parts of a single application. An example of a global monitoring property is the "total communication volume between all components of an application".
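A global monitoring property of this kind can be illustrated with a small sketch; the interface and metric names here are hypothetical, not part of the proposed service:

```python
# Sketch: combining per-service metrics into a global monitoring property.
# MonitoredService and the "comm_bytes" metric name are illustrative.

class MonitoredService:
    """A Grid service extended with a local monitoring interface."""
    def __init__(self, name, metrics):
        self.name = name
        self._metrics = metrics  # e.g. {"comm_bytes": 1200}

    def get_metric(self, key):
        return self._metrics.get(key, 0)

def total_communication_volume(services):
    """Global property: total communication volume between all
    components of a single application."""
    return sum(s.get_metric("comm_bytes") for s in services)

services = [
    MonitoredService("svc-a", {"comm_bytes": 1200}),
    MonitoredService("svc-b", {"comm_bytes": 800}),
]
print(total_communication_volume(services))  # 2000
```

The global monitoring service would obtain the per-component values through the local monitoring interfaces rather than from in-process objects as shown here.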
Currently, we anticipate several clients of the monitoring system. First, a performance analysis and prediction tool may use the data to visualize the application behavior. Second, the workflow composition system might make use of the monitoring services for decision support. Finally, the information about resources could be used by systems that ensure fault-tolerant execution of the application and by scheduling systems.
- The Taverna Project http://taverna.sourceforge.net
- B. Balis, M. Bubak, W. Funika, T. Szepieniec, and R. Wismueller: Monitoring and Performance
Analysis of Grid Applications. In: P.M.A. Sloot et al., editors, Computational Science - ICCS 2003,
vol. 2657 of Lecture Notes in Computer Science, pages 214-224, St. Petersburg, Russia, June 2003.
Antonio Fuentes, Eduardo Huedo, Rubén S. Montero,
Ignacio M. Llorente
A Grid Scheduling Algorithm Considering Dynamic Interconnecting Network Quality
For certain application domains the traditional concept of computing based on a homogeneous, and
centrally managed environment is being displaced by Grid computing, based on the exchange of
information and the sharing of distributed resources by applications. The Globus toolkit has become a
de facto standard in Grid computing. Globus is a core Grid middleware that supports the submission of
applications to remote hosts by providing resource discovery, monitoring, resource allocation and job management.
However, application execution on Grids continues to require a high level of expertise due to its
complex nature. The user is responsible for manually performing all the job submission stages:
system selection and preparation, submission, monitoring, migration and termination. Moreover,
computational Grids are dynamic environments, characterized by highly variable conditions: high
failure rates and changing resource availability, performance and cost.
Therefore, adaptation to changing conditions is needed to achieve a reasonable degree of both
application performance and fault tolerance. In the present work, job adaptation is achieved by
implementing automatic application migration following performance degradation, the discovery of a
better resource, requirement changes, owner decisions or remote resource failure.
The most important step in job scheduling is resource selection, which in turn relies completely on
the information gathered from the Grid. Resource selection usually takes into account the
performance offered by the available resources, but it should also consider the quality of the
interconnecting network in terms of latency and bandwidth between the Grid resources. For example,
bandwidth is very important because the files involved in some application domains, such as particle
physics or bioinformatics, are very large. This is especially relevant in the case of adaptive job
execution, since job migration requires the transfer of large restart files between the compute
hosts. Therefore the quality of the interconnection network has a decisive impact on the overhead
induced by job migration.
In this work, we present a new job scheduling algorithm that takes the interconnecting network
quality into account by dynamically evaluating decisive communication parameters. Our scheduler
gathers dynamic information about remote resources and the network (bandwidth, latency, etc.) in
order to choose the best available resource for job submission. The final contribution will include
experimental results of the scheduling algorithm in a research testbed.
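The selection criterion described above can be sketched roughly as follows; the weighting of compute time against transfer time is an illustrative assumption, not the published algorithm:

```python
# Sketch: ranking Grid resources by estimated compute time plus the time
# to transfer input/restart files over the interconnecting network.
# The cost model below is illustrative, not the paper's algorithm.

def rank(resources, file_size_mb):
    """Return resources sorted best-first by estimated total time."""
    def est_total_time(r):
        # MB -> megabits (x8), divided by bandwidth in Mbit/s, plus latency.
        transfer = file_size_mb / r["bandwidth_mbps"] * 8 + r["latency_s"]
        return r["est_compute_s"] + transfer
    return sorted(resources, key=est_total_time)

resources = [
    {"name": "fast-cpu-slow-net", "est_compute_s": 100,
     "bandwidth_mbps": 10, "latency_s": 0.2},
    {"name": "slow-cpu-fast-net", "est_compute_s": 140,
     "bandwidth_mbps": 1000, "latency_s": 0.01},
]
# With a 1 GB restart file, the better-connected host wins despite
# its slower CPU.
print(rank(resources, file_size_mb=1000)[0]["name"])  # slow-cpu-fast-net
```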
Marian Bubak, Marcin Radecki, Tomasz Szepieniec
A Proposal of Application Failure Detection
and Recovery in the Grid
Large-scale, long-running applications are common in the Grid. The cost when such a job collapses is
high. What is more, the probability that a software or hardware component involved in the
application will fail during the computation is relatively high. Therefore, providing mechanisms for
fault detection, and procedures to preserve the job from total collapse, seems to be a must.
The failures that can occur have diverse characteristics. The source of a problem may lie in the
network, a node, the system configuration or the application itself. A problem may be transient or
permanent. The severity of a fault also differs; e.g. in a master-slave application the collapse of
a slave's node is not as dangerous as the collapse of the master process. Failures range from a
whole site being cut off to a decrease in performance due to an overloaded network link. Detailed
recognition of the fault's characteristics is crucial for choosing the most suitable recovery scenario.
For each kind of failure an appropriate recovery scenario should be worked out. The simplest
scenario is to ignore the failure if the danger is low. As a last resort we could kill and restart
the whole job. A more sophisticated scenario is, for example, the migration of some processes, or
restarting the application from a previous checkpoint. The right decision cannot be taken without
considering the application's programming paradigm since, for example, MPI-based applications allow
different recovery techniques than Grid-service-based ones.
A service that is to handle such broadly defined fault recovery should use all available methods to
monitor the Grid and to perform recovery actions on the application. The list of Grid services that
should be integrated in this activity includes application and infrastructure monitors,
checkpointing and migration services, schedulers and others. These services should be coordinated by
a fault recovery manager so as to profit from their combined abilities to support application fault
tolerance.
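The decision logic of such a fault recovery manager might be sketched as follows; the fault classes and recovery actions are illustrative assumptions, not a defined interface:

```python
# Sketch: mapping a recognized fault's characteristics to a recovery
# scenario, in the spirit of the scenarios described above.
# Fault attributes and action names are illustrative.

def choose_recovery(fault):
    """Pick a recovery scenario based on fault severity, kind and the
    role of the affected process."""
    severity, kind = fault["severity"], fault["kind"]
    if severity == "low":
        return "ignore"                    # simplest scenario
    if kind == "node" and fault.get("role") == "slave":
        return "restart-process"           # slave loss is tolerable
    if kind == "node" and fault.get("role") == "master":
        return "restart-from-checkpoint"   # master loss endangers the job
    if kind == "network":
        return "migrate-processes"         # move away from the bad link
    return "kill-and-restart-job"          # last resort

print(choose_recovery({"severity": "high", "kind": "node",
                       "role": "master"}))  # restart-from-checkpoint
```

In the proposed architecture the inputs would come from the application and infrastructure monitors, and the actions would be delegated to checkpointing, migration and scheduling services.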
- Bubak, M., Zbik, D., van Albada, D., et al.: Portable Library of Migratable Sockets. Scientific
Programming, Volume 9, Number 4, 2001, pp. 211-222
- Balis, B., Bubak, M., Funika, W., Szepieniec, T., and Wismueller, R.: An Infrastructure for Grid
Application Monitoring. In: Kranzlmueller, D., Kacsuk, P., Dongarra, J., Volkert, J. (Eds.): Recent
Advances in Parallel Virtual Machine and Message Passing Interface, Proc. 9th European PVM/MPI
Users' Group Meeting, Linz, Austria, September/October 2002, LNCS 2474, pp. 41-49, 2002
Aleksander Nowinski, Krzysztof Nowinski, Jaroslaw Pytlinski,
Piotr Bala, Krzysztof Benedyczak
Advanced Visualization Capabilities in the UNICORE
UNICORE has become one of the important Grid middleware systems.
The main advantage of the UNICORE environment is its powerful Graphical User Interface, which allows
for job preparation, job submission and job management.
The presented work builds into UNICORE capabilities for graphical postprocessing of results.
Dedicated extensions to the UNICORE client (plugins) have been developed. The graphical capabilities
were introduced in the form of a generic visualization plugin, as well as a plugin for postprocessing
the output of the quantum chemistry code Gaussian.
Vaidy Sunderam, Dawid Kurzyniec
Alternative Frameworks for Cooperative Distributed Computing
Computational resource sharing across multiple administrative
domains is gaining widespread use, driven by the benefits of
aggregation and service-based computing. This project, comprising
the Harness II and H2O systems, introduces a novel model for
cooperative fault-tolerant distributed computing that emphasizes
statelessness, (re)configurability, and interoperability. Sharing
resources or services across administrative boundaries
involves numerous technical challenges, notably those concerning
security and resource allocation. The H2O architecture is designed
to specify sharing relationships on a pairwise basis, thereby
localizing all constraints and agreements between any provider-client
pair and minimizing or even eliminating global distributed state.
Upon this fabric, the Harness II infrastructure serves as an
integrated platform for efficient high performance computing
by aggregating distributed resources.
Current distributed systems and grids are rigid, complex, and brittle.
The H2O and Harness II frameworks, through their minimal-state design
and component architecture that permits reconfiguration of resources as
needed, aim to be flexible and robust. The architectural model that
realizes this mode of distributed computing is described below.
The lowest layer in the Harness II framework is the component hosting
and interaction substrate termed H2O. This layer is characterized by a
lightweight but secure kernel, into which "pluglets" realizing different
functions are loaded - either by the hosting provider, authorized
clients, or third-party resellers. Examples of such pluglets are
high-speed transport modules, parallel programming libraries such as
FT-MPI, or specialized numerical solvers and application components.
Pluglets interact across kernels using a specially developed
communications layer called RMIX, which offers a well-understood and rich
method invocation interface while permitting the use of interoperable
transports. Its programming interface is based on the Remote Method
Invocation paradigm but is language neutral; features such as one-way
and asynchronous method invocations have been added to better support
the needs of high performance distributed computing. The RMIX layer also
ensures secure communications and has inbuilt support for resilience to failures.
Within H2O kernels, pluglets conform to a well-defined interface to
interact with the kernel, and are controlled by the kernel's security
policies and resource consumption restrictions. This design makes it
possible for clients to load domain- or platform-specific pluglets into
provider kernels to cater to application needs, while protecting
resource providers from damage or excess consumption of their resources.
Resource providers specify these constraints when launching kernels via
a portable schema that is XML-based, thereby permitting the publication
and discovery of resources using standardized mechanisms and tools - or
by using simpler and more appropriate means as the situation dictates.
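The pluglet/kernel relationship described above might be sketched as follows; H2O itself is Java-based, so this Python illustration and its names are hypothetical:

```python
# Sketch: a lightweight kernel hosting pluglets under a security policy,
# as described above. The Kernel/Pluglet names are illustrative, not the
# actual H2O API.

class Pluglet:
    """Pluglets conform to a well-defined interface to interact
    with the hosting kernel."""
    def start(self): ...
    def stop(self): ...

class Kernel:
    """A kernel into which pluglets are loaded, controlled by the
    kernel's security policy (here: who may load)."""
    def __init__(self, allowed_loaders):
        self.allowed_loaders = set(allowed_loaders)
        self.pluglets = {}

    def load(self, loader, name, pluglet):
        if loader not in self.allowed_loaders:
            raise PermissionError(f"{loader} may not load pluglets")
        self.pluglets[name] = pluglet

# The provider allows itself and authorized clients to load pluglets;
# anyone else is rejected, protecting the provider's resources.
k = Kernel(allowed_loaders={"provider", "authorized-client"})
k.load("authorized-client", "ft-mpi", Pluglet())
print(sorted(k.pluglets))  # ['ft-mpi']
```

Real H2O kernels additionally enforce resource-consumption restrictions, which this sketch omits.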
The companion poster to this abstract depicts the architectural and
philosophical foundations of the H2O and Harness II frameworks,
highlights the salient features of the resource sharing model, and
outlines prototype implementation and use-case scenarios.
Renata Slota, Darin Nikolow, Jacek Kitowski, Jerzy M. Zaczek
Architecture of the Virtual Storage System for Grid-based Accessing
One of the important problems in grid computing is the development of a
middleware layer which consolidates different kinds of national or
international resources. This is especially important when dealing with
data distributed among different locations; data management is therefore
essential for grid data access.
This paper describes the architecture of the virtual storage system (VSS)
for grid-based access, being developed as one of the tasks of the
SGIgrid project. In contrast to software developed to date, the
architecture is kept as simple as possible, easy to operate and maintain.
The virtual storage system is aimed at integrating the mass storage
facilities used in the computing centers taking part in the project.
HSM-type software is used in these computing centers to provide access to
mass storage hardware such as tape libraries and optical jukeboxes.
The architecture consists of the following main modules:
- VFM - Virtual File Manager, responsible for VSS session authorisation, managing the files stored
in the VSS, resolving virtual file names to physical replica instances, and negotiating data
transfer between the LFM and the client application,
- LFM - Local File Manager, which manages the physical files residing on HSM systems, transfers
data between the HSM system and the client application, and estimates the data access time for a
specified physical file,
- HSM - Hierarchical Storage Manager, the HSM software allowing access to data residing on tertiary
storage,
- MDB - Meta Database,
- OR - Optimization of Replicas, responsible for replica selection based on the criterion of
minimizing data access time,
- API - Application Programming Interface, allowing the client application to communicate with the VSS.
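The replica selection performed by the OR module, using access-time estimates such as those provided by the LFM, can be sketched as follows; the data structures are illustrative:

```python
# Sketch: OR-module replica selection minimizing estimated data access
# time. Replica records are illustrative; in the VSS the estimates would
# come from each site's Local File Manager.

def select_replica(replicas):
    """Pick the physical replica with the lowest estimated access time
    (e.g. a file cached on disk beats one that must be staged from tape)."""
    return min(replicas, key=lambda r: r["est_access_s"])

replicas = [
    {"site": "site-a", "medium": "tape", "est_access_s": 95.0},
    {"site": "site-b", "medium": "disk", "est_access_s": 4.2},
]
print(select_replica(replicas)["site"])  # site-b
```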
Several of our previous achievements are incorporated, such as access time estimation for tertiary
storage systems and index-based retrieval of video sequences. An overview of existing approaches to
the problem will be given in the paper.
- SGIgrid: Large-scale computing and visualization for virtual
laboratory using SGI cluster (in Polish), KBN Project, http://www.wcss.wroc.pl/pb/sgigrid/
- Nikolow, D., Slota, R., Dziewierz, M., Kitowski, J., Access Time
Estimation for Tertiary Storage Systems, in: Monien, B., Feldman, R.
(Eds.), Euro-Par 2002 Parallel Processing, 8th International Euro-Par
Conference Paderborn, Germany, August 27-30, 2002 Proceedings , no. 2400,
Lecture Notes in Computer Science, Springer, 2002, pp. 873-880.
- Nikolow, D., Slota, R., Kitowski, J., Nyczyk, P., Otfinowski, J., Tertiary Storage System for
Index-Based Retrieving of Video Sequences, in: Hertzberger, B., Hoekstra, A., Williams, R. (Eds.),
Proc. Int. Conf. High Performance Computing and Networking, Amsterdam, June 25-27, 2001, Lecture
Notes in Computer Science 2110, pp. 62-71, Springer, 2001.
ASKALON: A Tool Set for Cluster and Grid Computing
Lukasz Dutka, Jacek Kitowski
Automatic Application Builder for Grid Workflow Orchestration
Grid web services are direct corollaries of component architectures; in reality they are good
examples of the application of component ideas to large-scale systems. Thus, the selection of grid
web services can be treated as the selection of components deployed in a grid environment. However,
at present there are not many existing solutions for selecting components on the fly without human
assistance.
One of the rare examples of an expert system applied to semi-automatic workflow building is ,
although that work addresses decision making for commercial purposes. Other examples are presented
in [3-4].
In this paper the Automatic Application Builder is proposed for supporting the user in the selection
of workflow elements (services or program components). The selection of a component or a service
from the repository is based on a rule-based expert system that incorporates the requirements and
the state of the application being built. The rules are developed by a human expert. The advantages
of the system are its flexibility in development (since the components or services can be prepared
independently) and its reasoning about the choice of components or services based on knowledge of
the application requirements.
The proposed approach has already been successfully implemented for large-scale WWW-based
information systems as well as for optimization of data access in grid environments.
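The rule-based selection underlying the Automatic Application Builder might be sketched as follows; the rule form and component attributes are illustrative assumptions, not the system's actual rule language:

```python
# Sketch: rule-based selection of a workflow component from a repository,
# matching expert-authored component descriptions against the current
# requirements. Attribute names are illustrative.

def select_component(repository, requirements):
    """Return the name of the first component whose description
    satisfies every stated requirement, or None if nothing matches."""
    for component in repository:
        if all(component["provides"].get(k) == v
               for k, v in requirements.items()):
            return component["name"]
    return None

repository = [
    {"name": "solver-dense",  "provides": {"task": "solve", "matrix": "dense"}},
    {"name": "solver-sparse", "provides": {"task": "solve", "matrix": "sparse"}},
]
print(select_component(repository, {"task": "solve", "matrix": "sparse"}))
# solver-sparse
```

A real expert system would also weigh the application's state and rank competing candidates rather than take the first match.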
- C. Szyperski, Component Software: Beyond Object-Oriented Programming. ACM
Press and Addison-Wesley, New York, NY, 1998.
- C. Duvel, Establishing rule-based models to implement workflow within
construction organizations, PhD Thesis, UMI 9976532, University of Florida,
- C. Duvel, and R. R. A. Issa, The application of expert systems to controlling
workflow within the construction management environment, Artificial
Intelligence Applications in Civil and Structural Engineering, Eds. (B. Kumar
and B.H.V. Topping), Civil-Comp Press, 1999, pp. 61-72
- L. Dutka and J. Kitowski, Flexible Component Architecture for Information WEB Portals, in:
P. Sloot, D. Abramson, A. Bogdanov, J. Dongarra, A. Zomaya, Y. Gorbachev (Eds.), Proc. Computational
Science - ICCS 2003, Int. Conf. St. Petersburg Russian Federation, Melbourne Australia, June 2-4,
2003, LNCS, vol. 2657, 2003, pp. 629-638
- L. Dutka and J. Kitowski, Application of Component-Expert Technology for Selection of
Data-Handlers in CrossGrid, in: D. Kranzlmueller, P. Kacsuk, J. Dongarra, J. Volkert (Eds.), Proc.
9th European PVM/MPI Users' Group Meeting, September 29 - October 2, 2002, Linz, Austria, LNCS,
vol. 2474, Springer,
- L. Dutka, R. Slota, D. Nikolow and J. Kitowski, Optimization of Data
Access for Grid Environment, 1st European Across Grids Conference February,
13-14, 2003 Universidad de Santiago de Compostela, Spain, LNCS, Springer,
- K. Stockinger, H. Stockinger, L. Dutka, R. Slota, D. Nikolow, J. Kitowski,
Access Cost Estimation for Unified Grid Storage Systems, Supercomputing
2003 IEEE Conf., Nov. 2003, accepted
Ariel Garcia, Marcus Hardt, Yannick Patois, Ulrich Schwickerath
Collaborative Development Tools
Groups of software developers, especially when spread over different physical locations, need a whole range of tools: mailing lists, bug trackers, code versioning, website hosting, nightly builds, and more. The open-source world offers tools for every such requirement, but setting them up consumes manpower - usually a rare resource.
The aim of the savannah project is to provide a software framework running on one central server, managed by a few experts, providing the necessary tools to its users. All services are available with the same sign-on. Currently the following services are provided:
- General information page showing latest news, contact to project members and status about the project activity
- News service: Allows for posting news and for discussion about news
- Mailing forums: similar to Mailinglists but with a web based forum archive
- Download service: for file distribution. File collections can be dynamically created on a daily basis
- Bugtracker: a customizable bug tracking system similar to Bugzilla
- Support Trackers: providing a simple feedback function to the user community
- Patch Manager: for allowing anybody to suggest patches to the sourcecode
- Task Manager: to structure the project into subtasks that can depend upon each other
- CVS: management of the project's assigned CVS repository
- Autobuild: A tool for nightly builds of the code in the CVS repository
On http://gridportal.fzk.de one instance of savannah has been installed within the CrossGrid project. This server is dedicated to the support of Grid- and HEP-related software development.
V.N. Alexandrov and S. Mehmood Hasan
Collaborative Tools for the Grid
The efforts described in this paper are aimed at combining the capabilities of Collaborative Computing with those of Grid Computing. The work aims to complement efforts in Grid Computing by providing human-centred techniques and technologies for facilitating collaborative, computer-based cooperative work.
The notion is to provide interactive and real time visualisation, joint analysis of results and seamless access to data repositories. This cooperative work must be conducted within a coherent and inclusive collaborative environment. This can be achieved by the construction of a virtual work environment on multiple computer systems connected over the grid. The virtual work environment will use the Grid as its underlying infrastructure, using grid security mechanisms to authenticate participants in the collaboratory.
In this setting, participants interact with each other, simultaneously access and operate computer applications, refer to global data repositories or archives, collectively create and manipulate documents, perform computational transformations, and collaboratively visualise the results. The collaborative experience will be enhanced by providing integral support for human audio/video communication.
We will discuss recent work on grid enabled collaborative tools, and outline our plans to investigate and explore innovative enabling technologies to support collaborative, distributed, grid-based problem solving.
Commodity Computing Clusters - Next Generation Supercomputers?
Florian Schintke and Jan Wendler
Computational Fluid Dynamics in the Grid Using FlowGrid
We present the architecture of FlowGrid, a software package to enable Computational Fluid Dynamics
(CFD) applications in the Grid. FlowGrid revolutionizes the way CFD simulations are set up, executed
and monitored. In a network of Grid-enabled CFD centers across Europe, the development and validation
of software and knowledge for Grid-based CFD computations takes place. This CFD Virtual Organization
provides easy and flexible access to CFD resources for the industrial end users.
Computational Grids are ideal for CFD simulations since in general the computational resources planned
for such simulations become either insufficient or underutilized most of the licensed time. The primary
advantage of bundling such resources into a CFD Virtual Organization is the flexibility in providing
on-demand computational power. A special property of CFD simulations is the need for synchronous
communications between the subjobs, making it challenging to execute jobs on the Grid.
The FlowGrid architecture consists of:
- a user client called GenIUS, which performs the task partitioning and operates the Grid for the user
- the middleware FlowServe, which aggregates available resources, allocates and distributes job
requests to computing resources
- the backend, which executes the parallel CFD simulation code on clusters and high-performance computers
- a database that stores meta information about resource availability and costs
- the portal, which provides software to subscribers, allows resource providers to set policies and
prices, and lets the administrator manage the system manually
With GenIUS running on Microsoft Windows and FlowServe running on Linux, these two operating
systems are combined within a single Grid environment. We describe the protocol between GenIUS and
FlowServe, which also covers interfaces of FlowServe to other user frontends to allow an easy integration
of other CFD programs into the Grid.
With FlowServe, preliminary results are provided to the user during runtime, so that they can see
how the simulation converges and discover problems in the simulation as it runs.
Current non-Grid-aware CFD applications allow adaptations to the calculations during runtime; such
adaptations will also be supported by FlowServe.
Marian Bubak, Michal Turala
CrossGrid Project in Its Halfway: Achievements and Challenges
Piotr Nyczyk, Andrzej Ozieblo, Marcin Radecki
CrossGrid Testbed Cluster at ACK CYFRONET AGH
The testbed cluster at ACK Cyfronet AGH in Kraków is a part of the CrossGrid Project testbed network, which involves 16 partners from 9 countries across Europe. Our testbed hardware is based on a rack cluster of 1U Intel dual-processor units produced by RackSaver Inc. The initial configuration of four Intel dual P III nodes has been augmented by 23 dual-Xeon units with two 2.4 GHz Xeon processors, 1 GB memory, a 40 GB disk and 1000 Mb Ethernet ports on each node. Communication between nodes is assured by an HP switch with forty 100 Mb ports and three 1000 Mb uplink ports. Currently the bandwidth of the connection to the national research network is 622 Mbit/s, but it will be increased to 10 Gbit/s in a few months. A dedicated KVM (keyboard, mouse, monitor) 1U unit, incorporated into the cluster rack, is used for monitoring all elements. Disk space has been increased by additional 640MB 1U units (Quardian 4400) and a 4 GB 4U disk array. 10 additional dual-Xeon nodes will be added in a few weeks.
Currently, we are running EDG 1.4 alongside the separately-installed LCG-1 testbed software, with Globus 2 for Grid services and OpenPBS for scheduling. Installation is automated through the LCFGng Installation and Configuration System, with configuration profiles synchronized with a central repository. Several software packages have been installed: ATLAS software for HEP applications, Gaussian, Mathematica and a development environment (C/C++, Java, Fortran). For monitoring the entire cluster, the GANGLIA distributed monitoring system with additional temperature sensors is used.
Marian Bubak, Tomasz Gubala, Maciej Malawski and Katarzyna Rycerz
Design of Distributed Grid Workflow Composition System
The Grid is a complex environment with many geographically distributed resources which may be
connected together in order to execute Grid applications. These applications consist of many
independent and possibly heterogeneous modules, connected together to achieve the required
functionality. Discovering and joining such elements, distributed throughout a vast and frequently
changing Grid environment, can be difficult.
We propose a novel design for a Grid application workflow composition system, intended to support
the user and other systems.
First, we briefly describe our previous prototype solution to the workflow composition problem, the
Application Flow Composer (AFC) system. A brief description of the system architecture and internal
mechanisms is followed by a discussion of the advantages and disadvantages of this first approach.
Taking these into consideration, we propose a new composition system based on the Grid services
concept. It consists of a fully distributed registry storing descriptions of available services,
including information about service semantics. The second part is a semi-automatic composition
system (temporarily called AFC2) which uses the registry to generate the Grid workflows requested by
the user. The AFC2 system is based on peer-to-peer technology with a mechanism of weak peer
migration. The concept of peer specialization enables more efficient service discovery and
introduces the basis for a system learning capability.
The main part of the paper presents the proposed design of the new system and describes it at
various levels of abstraction. Starting from conceptual diagrams, the discussion moves to a more
detailed description, showing both static and dynamic aspects of system behavior. Afterwards, we
discuss possible technologies which can be applied to implement the system, accompanied by a list of
each technology's advantages. We conclude with a discussion of the expected improvements the new
system should demonstrate.
- Bubak, M., T. Gubala, M. Malawski, K. Zajac: Automatic Flow Building for Component Grid
Applications. Presented at PPAM 2003 Conference, Czestochowa, Poland, September 7-10, 2003, to be printed
Andreas Hoheisel and Uwe Der
Dynamic Workflows for Grid Applications
There are several approaches in the Grid computing community to execute not only single tasks on
single Grid resources but also to support workflow schemes that enable the composition and execution
of complex Grid applications. The most commonly used workflow model for this purpose is the
Directed Acyclic Graph (DAG). DAGs have a very simple structure and are easy to use; they possess,
however, two relevant disadvantages: they do not support bidirectional coupling and it is not possible
to explicitly define loops.
Within the establishment of the Fraunhofer Resource Grid, we developed a Grid Job Definition
Language (GJobDL) that is based on the concept of Petri nets instead of DAGs. Petri nets are graphical
representations of the workflow of discrete systems. In contrast to DAGs, which only describe the
dynamical behaviour of the system, Petri nets also describe the system's state. The type of Petri
net we introduce here corresponds to the concept of Petri nets with individual tokens (coloured
Petri nets) and constant arc expressions.
The Grid Job Definition Language is used to describe the workflow of a Grid application on an
abstract level. This description is independent of the Grid infrastructure and defines the
relationships between the software components (transitions) and the data (places). Transitions can
be annotated with conditions that depend on the tokens moving along the arcs of
the Petri net. During the workflow execution, the abstract workflow must be concretized in order
to be mapped onto the real Grid environment. This requires dynamic completion of the workflow based
on actual information. It may be necessary to introduce new tasks - such as data transfers,
deployment of software, authorization requests, and data retrievals. These tasks can be represented
by sub-Petri nets that replace parts of the existing Petri net during runtime of the Grid application.
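The firing of transitions in such a workflow Petri net can be illustrated with a minimal interpreter; this is a simplification for illustration, not the GJobDL implementation, and it ignores coloured tokens and arc expressions:

```python
# Sketch: workflow execution as Petri-net transition firing. A transition
# (software component) fires when all of its input places (data) hold
# tokens; firing consumes input tokens and produces output tokens.

def fire_ready(marking, transitions):
    """Sweep the transitions once in order, firing each one that is
    enabled; return the names of the fired transitions."""
    fired = []
    for t in transitions:
        if all(marking.get(p, 0) > 0 for p in t["in"]):
            for p in t["in"]:
                marking[p] -= 1
            for p in t["out"]:
                marking[p] = marking.get(p, 0) + 1
            fired.append(t["name"])
    return fired

marking = {"input-data": 1}
transitions = [
    {"name": "preprocess", "in": ["input-data"],  "out": ["staged-data"]},
    {"name": "simulate",   "in": ["staged-data"], "out": ["results"]},
]
fired = fire_ready(marking, transitions)
print(fired)  # ['preprocess', 'simulate']
```

Dynamic workflow completion then corresponds to splicing additional transitions (e.g. a data-transfer step) into this net before the dependent transition becomes enabled.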
Only a few Grid initiatives include advanced fault management. Mostly, fault management is
predefined implicitly by the Grid architecture and results in re-scheduling, recovery or migration
of single tasks in case of a fault. We propose a concept for fault management of entire job
workflows, explicitly modelling the fault management within the workflow model. This can be done in
a user-defined way, or automatically by introducing new tasks enabling fault management, based on
predefined fault management strategies.
Andreas Gellrich, Jacek Nowak, Maxim Vorobiev
EDG 1.4 RPM Package Installation on DESY Linux 4
(SuSe 7.2) Machines
The workstations at DESY (Deutsches Elektronen-Synchrotron) run DESY Linux v.4, a customized version
of SuSE Linux v.7.2. It uses AFS user accounts, and most DESY libraries and applications are also
shared through AFS. This customized Linux distribution has been used as the base for an EDG 1.4
testbed installation. The installation was performed manually, following the "EDG installation
guide". Binary RPM packages from the official EDG 1.4 distribution were used. As the only officially
supported platform for EDG 1.4 is RedHat Linux 6.2, many small modifications to the installation
procedure were necessary and many problems had to be fixed. The testbed at DESY proves that it is
possible to install all the main Grid nodes (WN, CE, SE, RB, BDII, RC, UI) on Linux flavors other
than the officially supported one.
In this paper we would like to present the main issues which arise when trying to install EDG 1.4
on DESY Linux v.4, and how to overcome them.
Katarzyna Rycerz, Marian Bubak, Maciej Malawski, Peter Sloot
Execution Support for HLA-based Distributed Interactive Applications
This paper presents the design of a system that supports execution of HLA [1] distributed interactive simulations in an unreliable Grid environment.
The design of the architecture is based on the OGSA [2] concept that
allows for modularity and compatibility with Grid Services already being developed. First of all,
we focus on the part of the system that is responsible for migration of an HLA-connected component or
components of the distributed application in the Grid environment.
The preliminary results can be found in [3]. We present a runtime support Migrator Library (ML) for
easily plugging HLA simulations into the Grid Services Framework. We also present the impact of
execution management (namely migration) on overall system performance.
As HLA [1] is explicitly designed to support interactive distributed simulations, it provides
various services needed for that specific purpose, such as time management useful for time-driven or
event-driven interactive simulations. It also takes care of data distribution management and allows
all application components to see the entire application data space in an efficient way. On the other
hand, the HLA standard does not provide automatic setup of HLA distributed applications. In HLA there
is no mechanism for migrating federates according to the dynamic changes of host loads or failures,
which is essential for Grid applications. In our opinion, the OGSA [2] concept provides a good
starting point for building and connecting independent blocks of different functionality of the
HLA execution management system.
Our solution introduces HLA functionality to the Grid Services framework extended by specialized
high-level Grid Services. This allows for execution control through Grid Service interfaces, while the internal control and data of distributed interactive simulations flow through HLA. The design also
supports migration of federates (components) of HLA applications according to environmental conditions.
In the full version of the paper we also present performance results of migration.
This research is partly funded by the European Commission under the IST-2001-32243 Project "CrossGrid".
[1] HLA specification, http://www.sisostds.org/stdsdev/hla/
[2] Foster I., Kesselman C., Nick J., Tuecke S.: The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. Open Grid Service Infrastructure WG, Global Grid Forum, June 22, 2002
[3] Zajac, K., Bubak, M., Malawski, M. and Sloot, P.: Towards a Grid Management System for HLA-based Interactive Simulations. To appear in: Proceedings of the 7th IEEE International Symposium on Distributed Simulation and Real-Time Applications, 2003
Jaroslaw Wypychowski, Krzysztof Nowinski, Piotr Bala
Generic Plugin - New Concept for Developing Users Interfaces in Grid
UNICORE has become one of the important grid middleware systems. The main advantage of the UNICORE environment is its powerful Graphical User Interface, which allows for job preparation, job submission and job management. These generic features can be extended using the plugin concept, which allows for the creation of application-specific interfaces adjusted to the users' needs. However, plugin development is still time-consuming and requires significant knowledge of Java. We have proposed and developed a generic plugin which allows for the creation of a typical graphical user interface for an application based on an XML definition. This allows even inexperienced users to build a powerful application interface in the grid environment.
Peter Praxmarer, Paul Heinzlreiter, Dieter Kranzlmueller
GMF: A Framework for Module Management on the Grid
Utilization of grid environments often requires parallel and distributed programming to solve a single
but large-scale problem. In the traditional approach the workload of different modules is distributed
over various heterogeneous grid resources, which are then interconnected in some kind of pipeline
or graph structure. The basic functionality to achieve this distribution and interconnection is provided
by the Globus Toolkit [2], which includes software for security, information infrastructure,
resource management, data management, communication, fault detection, and portability. While the
functionality of Globus is fundamental for the application of grid environments, its low-level interface
requires substantial knowledge and effort to utilize it in applications.
The Grid Management Framework GMF addresses this problem by providing a higher level of abstraction on top of Globus services. The aim of GMF is to provide a framework that encapsulates common tasks necessary for building modules within an object-oriented framework, and to instantiate these objects as modules within a configurable pipeline or graph structure over the grid. The main benefit of GMF is thus that the application programmer is freed from the tedious and error-prone tasks necessary to use Globus, and can instead focus on the actual grid-enabled application.
The functionality of GMF is provided by two kinds of modules, an arbitrary set of worker modules and
a master module. The worker modules implement user-defined functionality for solving the application
tasks. In addition, they receive control events from the master module such as start, stop, accept a
connection, connect to another module, perform a user-defined checkpoint (if supported) and migrate
to another resource. Each of these worker modules is instantiated through GRAM by the master module, which is thus capable of generating the worker pipeline (or graph) and controlling its operation. The commands from the master module are forwarded to the worker modules via a separate control connection, which is established at module startup.
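The control protocol between master and workers might be sketched as follows. This is a hedged illustration of the event handling described above, not the GMF API: the class names, command strings and checkpoint format are all hypothetical (the real system dispatches over GRAM-spawned processes and sockets, not in-process calls).

```python
# Illustrative sketch of a master/worker control-event protocol.

class WorkerModule:
    def __init__(self, name):
        self.name = name
        self.running = False
        self.peers = []          # modules this worker is connected to
        self.checkpoint = None

    def handle(self, command, arg=None):
        """Dispatch a control event received from the master."""
        if command == "start":
            self.running = True
        elif command == "stop":
            self.running = False
        elif command == "connect":       # connect to another module
            self.peers.append(arg)
        elif command == "checkpoint":    # user-defined checkpoint
            self.checkpoint = {"peers": list(self.peers)}
        elif command == "migrate":       # hand back state for restart
            self.running = False
            return self.checkpoint
        return None

class Master:
    """Instantiates workers (via GRAM in the real system) and drives
    them through their control connections."""
    def __init__(self):
        self.workers = {}

    def spawn(self, name):
        self.workers[name] = WorkerModule(name)

    def send(self, name, command, arg=None):
        return self.workers[name].handle(command, arg)

m = Master()
m.spawn("filter"); m.spawn("render")
m.send("filter", "start")
m.send("filter", "connect", "render")   # build the pipeline
m.send("filter", "checkpoint")
state = m.send("filter", "migrate")     # state restorable on a new host
```

The point of the sketch is the separation of concerns: workers only implement `handle`, while the master owns pipeline construction and lifecycle control.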
In addition to the basic functionality of GMF, various other parts of the Globus-API are encapsulated
in the object-oriented C++ framework with integrated error handling (for services such as GlobusIO, GlobusFTPClient, GlobusCommon, GlobusGRAMClient). The error handling procedures can be easily exchanged
on a per operation basis to meet the user requirements. Besides that, GMF enriches the functionality
of GlobusIO with multiplexed connections and a buffered I/O mode to increase the throughput of I/O
connections. Furthermore, the module framework of GMF is designed to support user-defined checkpoints
that allow migration of a module within a heterogeneous computing environment.
The current version of GMF is an integral part of the Grid Visualization Kernel (GVK) [1]. The idea
of GVK is to build a visualization pipeline over a set of grid resources, and thus enables scientific
visualization on the grid. In addition to GVK, the utilization of GMF is also investigated for the
distributed program analysis environment DeWiz, which delegates the analysis of program state data
to different modules on the grid.
The work described in this abstract is partially supported by the EU CrossGrid Project under contract number IST-2001-32243. We highly appreciate the contributions of our colleagues at GUP Linz.
[1] Paul Heinzlreiter, Dieter Kranzlmueller, "Visualization Services on the Grid", Parallel Processing Letters (PPL), Vol. 13, No. 2, pp. 135-148 (June 2003)
[2] Ian Foster, Carl Kesselman, Steven Tuecke, "The Anatomy of the Grid: Enabling Scalable Virtual Organizations", International J. Supercomputer Applications, 15(3), 2001
Thilo Ernst, Jochen Wauer
Grid Content Evolution and Grid Content Management
A complex distributed Grid-based Science Portal, or a similar platform in which data and model resources from various authors/organizations are integrated on an ongoing basis over substantial time horizons to be offered for shared use in virtual organizations, cannot realistically be considered a "static" system that is "finished" at some defined point in time.
Of course, in projects developing such platforms, designated deliverables and milestones must be
committed to in order to control project progress, make sure demonstrations and pilot operations can
be carried out, etc.
However, in order to be successful beyond the start phase, such platforms also need to offer strong support for further evolution: integration of new data sources and models, versioning, etc.
The VirtualLab project, an ongoing collaboration of Fraunhofer FIRST and DLR (German Aerospace Center)
in which a specific Science Portal has been built and which in the next version will fully rely on
Grid technology, provided valuable insights here.
DLR primarily views this platform as a new technology-transfer channel for identifying and trying out "external application potential" for in-house developed scientific software. Such software is produced and improved at DLR (and at similar author/vendor organizations) not in isolated, infrequent activities but rather on a continuous, day-to-day basis. Unsurprisingly, end users who can easily access this software (and data resources) remotely from their desktops do not expect outdated versions.
It should thus be as easy as possible to integrate new or improved models and data resources into the platform. If this is not easy enough (that is: possible for people with basic programming skills but without a substantial Grid or Web development background), the platform risks "starvation" in the long run.
For these reasons, we are carefully studying the Grid resource integration process in ongoing relevant projects and aim at designing adequate support for this process in the next generation of VirtualLab.
Strong parallels to traditional web content management are expected here - just that the concept of
"content" now extends beyound the traditional interpretation of hypertext and multimedia documents to
cover general data and executable resources as well.
Grid Infrastructure Monitoring and Management - Lessons from the EU DataGrid and GridLab Projects
Grid Performance, Grid Benchmarks, Grid Metrics
Grids, as an emerging infrastructure for novel ways of computation, raise various theoretical and technical questions. Despite intensive research work, these questions are sometimes not even articulated. Is there a common understanding of what grids are? Is there an accepted definition of grid performance? Are traditional performance analysis techniques adequate for grids? The presentation points out problems and questions related to performance analysis in the framework of the new computing paradigm and proposes a new scenario for benchmarking.
Sergio Andreozzi, Antonia Ghiselli, Cristina Vistoli, Sergio Fantinel, Gennaro Tortone, Natascia De Bortoli
GridICE: a Monitoring Service for the Grid
The Grid is a new paradigm of distributed computing that enables the coordination of resources and
services not subject to centralized control. These resources may span multiple administrative domains,
machine architectures, and software boundaries.
The management of this complex system, which is distributed by nature, has to deal with the heterogeneity of the resources and the decentralization of their ownership. An appropriate organization type that can monitor and control a multi-institutional grid is under investigation. Such an organization is called a Grid Operation Center (GOC); its characteristics, capabilities and use cases are being designed. With this paper we want to present to the grid community our research and development results in the area of grid monitoring infrastructure for Grid Operation Centers. Our work benefits from our experience within the WorldGrid event, from the information modeling of grid services related to the GLUE Schema, and from a close collaboration with the LHC Computing Grid (LCG) Project.
We define Grid Monitoring as the activity of measuring significant grid-resource parameters in order to analyze the usage, behavior and performance of the grid, and to detect and notify fault situations, contract violations, and user-defined events.
The current outcome of our activity is a monitoring infrastructure called GridICE. In its first release, we have privileged aspects such as easy integration with the current production grid middleware, and modularity of the components with respect to the separation-of-concerns design principle. The current architecture is structured in six layers, from the initial producers of monitoring data to the final consumers of monitoring information.
The first layer is the measurement service, whose task is to probe resources for the simple/composite metrics defined in the information model. These metrics mostly refer to quality aspects. The second layer is the publisher service, whose task is to offer the gathered data to potential consumers. Our decision was to rely on the available grid information service, that is the Globus MDS 2.x, an LDAP-based solution for access to distributed time-sensitive data. Advantages of this choice are the availability of a distributed query engine and a standard interface to data of different resources. The drawbacks are the need for continuous polling (no event-based data delivery is implemented) and the absence of persistent storage for historical data. With proper design choices, we have managed to mitigate the former limitation, while we solved the latter by introducing a data collector service in the fourth layer. This service is provided with a self-discovery feature that can automatically detect new grid services and configure them to be properly monitored. The fifth layer is a set of two services: a detection and notification service, providing a flexible and configurable means for event detection and notification actions, and a data analyzer service, providing performance analysis, usage levels, and general reports and statistics. The sixth layer is the presentation service, a web-based graphical user interface that offers concise monitoring information. This is designed on a role-based strategy, providing different views depending on the type of information consumer.
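To make the layered flow concrete, the following toy sketch wires together minimal stand-ins for the measurement, publisher, collector, detection/notification and presentation services. Everything here is a hypothetical illustration: the function names, the hostname and the load metric are ours, and the real system works over MDS/LDAP, not in-process calls.

```python
# Toy sketch of the layered monitoring flow:
# probe -> publish -> collect -> detect -> present.

history = []          # data collector: persistent store for history
alerts = []           # detection & notification service output

def probe():          # measurement service: sample a simple metric
    return {"host": "ce01.example.org", "load": 3.2}

def publish(sample):  # publisher service: expose data to consumers
    return dict(sample)

def collect(sample):  # data collector: keep the history MDS lacks
    history.append(sample)

def detect(sample, threshold=2.0):  # raise user-defined events
    if sample["load"] > threshold:
        alerts.append(f"high load on {sample['host']}")

def present():        # presentation service: concise, role-based view
    return {"samples": len(history), "alerts": list(alerts)}

s = publish(probe())
collect(s)
detect(s)
view = present()
```

Note how the collector layer is what gives the pipeline history: the publisher alone (like MDS) only ever sees the current sample.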
The first release of GridICE has already been selected to monitor several multi-institutional grids. The first large deployment started with the LCGpre1-CMS grid. Subsequently, it was selected for LCG1 grid monitoring in both the testing and deployment testbeds. Finally, it has been deployed for the Italian Grid of Sciences.
Our future work will focus on the consolidation and improvement of the current infrastructure. Moreover,
new solutions to overcome the aforementioned limitations will be investigated.
Holger Marten, K.-P. Mickel
GridKa – the German Regional Tier-1 Computing Centre: Status, Strategy
and Future Plans
The Grid Computing Centre Karlsruhe, GridKa, will be one of the 10-15 largest computing centres within the LHC Computing Grid Project. In 2001, 40 groups of German particle physicists initiated the construction of GridKa as a German Regional Tier-1 Centre in the LHC framework, and the official inauguration of GridKa took place at the end of October 2002. Today, GridKa already hosts 460 Linux processors, 110 TeraBytes of usable disk space and 170 TeraBytes of tape storage. Part of these resources are made available to already running high energy physics experiments: BaBar at SLAC, CDF and D0 at FermiLab and Compass at CERN will generate a data volume of about 1/10 of LHC during the next years. They are optimal candidates to test and validate the scaling of the hard- and software infrastructure. In 2007 - at the startup of LHC - GridKa will host about 4000 processors, 1500 TeraBytes online and 3800 TeraBytes tape storage, and co-operate via a few dozens of Gigabit connections with hundreds of other grid installations worldwide. The paper summarizes the organizational structure of GridKa, current strategies for the infrastructure setup, and outlines some of the related R&D projects.
Ladislav Hluchy, Ondrej Habala, Branislav Simo,
Jan Astalos, Viet D. Tran, Miroslav Dobrucky
Grid-based System for Flood Forecasting
This paper presents our experience with, and the current status of, a Collaborative Problem Solving Environment for Flood Forecasting under development as a part of the IST CROSSGRID project. Over the past few years, floods have caused severe damage throughout the world, and much of Europe has been heavily threatened. Therefore, modeling and simulation of flood forecasting, in order to predict floods and to take the necessary preventive measures, has become a very important matter. The environment described here uses Grid technology to interconnect the experts, data and computation resources needed for quick and correct flood management decisions. At the core of the system lies a coupled set of simulation models used to predict precipitation and temperature, hydrological river status, and hydraulic events in target areas. The environment and its web-based interface also provide basic communication tools, enabling its users to cooperate. A Virtual Organization for Flood Forecasting using this environment may consist of several cycle providers, storage providers, end users, experts and developers.
Forecasting of flood events requires quantitative precipitation forecasts as well as forecasts of temperature (to determine snow accumulation/melting). The system makes use of the ALADIN/SLOVAKIA model. ALADIN is a LAM (Limited Area Model) developed jointly by Meteo France and cooperating countries. In the next stage, we use several hydrological simulation models; which model is applied depends on the conditions, situation and territory, and they can also be used in combination. For hydraulic predictions, FESWMS (Finite Element Surface-Water Modeling System) Flo2DH is used, a 2D hydrodynamic, depth-averaged, free-surface, finite element model. Flo2DH computes water surface elevations and flow velocities for both super- and sub-critical flow at nodal points in a finite element mesh representing a body of water (such as a river, harbor, or estuary). Simulation of floods is very computationally expensive: several days of CPU time may be needed to simulate large areas. For critical situations, e.g. when a coming flood is simulated in order to predict which areas will be threatened and to take the necessary preventive measures, long computation times are unacceptable. Therefore, FESWMS Flo2DH was parallelized in order to achieve better performance.
The storage space for simulation outputs and direct measurements used by the application is provided by II SAS. Hourly outputs of meteorological simulation, hydrographs provided by the hydrological part of the cascade and selected hydraulic outputs will be stored. The storage will also hold configuration files for the simulations and some other resources, needed to operate the application. The stored files are accessible through standard Grid tools used in the CrossGrid testbed. We are also working on a common description scheme for these files and a way to store the metadata in a Grid-aware database system. The metadata structure will include detailed information about origin of the file, time of its creation, the person who actually created it, etc. In case the file is the output of a simulation, the metadata will also contain names of the input files, model executable and configuration files.
GRIP: Creating Interoperability between Grids
Roger Menday and Philipp Wieder
GRIP: the Evolution of UNICORE Towards a Service Oriented Grid
The current UNICORE software implements a vertically integrated Grid architecture providing seamless
access to various resources within different Virtual Organizations. The software is deployed and
developed by companies, research and computing centres and projects throughout Europe coordinated by
the UNICORE Forum (http://www.unicore.org).
Interoperability between two different Grid infrastructures, UNICORE and Globus, enlarges the range of resources and services available to each system, and was the motivation for the Grid
Interoperability Project (GRIP, funded in part through EC grant IST-2001-32257). GRIP designed and
implemented an interoperability layer, the capabilities of which have been demonstrated at conferences
and workshops. In addition the project is contributing to the standardization efforts within the
Grid community by participating in or leading Global Grid Forum activities.
With the advent of the Open Grid Services Architecture (OGSA) and - Infrastructure (OGSI) and the
increasing usage of Web Services for the operation of Grids, the focus of the project changed. One
of the benefits Web Services bring to Grid computing is the concept of loosely coupled distributed
services. Merging the idea of "everything being a service" with the achievements of the Grid community
led to Grid Services, enabling a new approach to the design of Grid architectures. The adoption of XML
and the drive for standardisation of OGSI compliant protocols provide the tools to move closer to
the promise of interoperable Grids. An early demonstrator validated the correspondence of
UNICORE's architectural model with the OGSA/I approach and encouraged GRIP to shift its efforts to
start the development of an OGSA/I compliant Grid based on the UNICORE architecture.
In this paper we discuss UNICORE as an example of the evolution of a Grid system towards a service-oriented Grid, primarily focussing on architectural concepts and models. Based on the current
architecture and the enhancements provided by GRIP, we depict first steps already taken to integrate
Web - and Grid Services into UNICORE. This includes the provision of OGSI compliant port types parallel
to the proprietary interfaces as well as the design of XML based protocols. Furthermore we present
the roadmap taken by GRIP to achieve a consistent development towards an OGSA implementation. In
addition to the GRIP related achievements and plans we report on the current status of the OGSA/I
standardisation. We also consider the evolving Web Service standards and relate them to the UNICORE architecture, particularly regarding the recent developments towards a service-oriented Grid.
GRMS: GridLab Resource Management System
Kazimierz Balos, Leszek Bizon, Michal Rozenau, Krzysztof Zielinski
Interoperability Architecture for Grid Networks Monitoring Systems
Grid networks are computing environments in which resource brokering and load balancing require a reliable monitoring system, with interfaces adequate to the area in which they operate and with security mechanisms assuring that confidential data are transferred in a secure and efficient manner. Considering typical grid networks consisting of over a dozen clusters, each with several worker nodes, there is also a need for an efficient way to install, run and maintain such a monitoring system.
The aim of this study is to create monitoring-system interfaces suitable for distributed and heterogeneous environments, especially for clusters and grid networks. The study looks at the development of a scalable and easy-to-maintain system that can be used to expose monitored parameters, such as network traffic and the availability of nodes' infrastructure resources, to external applications for further processing.
This paper covers the topic of hierarchical information aggregation, both in the context of the local area networks in which the computing elements of clusters work, and in the context of the wide area networks in which clusters cooperate and an appropriate protocol for information interchange must be chosen. The approach shown in this article presents current achievements in using Sun's early implementation of the Java Management Extensions (JMX) technology for communication at the cluster level, and Web Services with the SOAP protocol for communication at the grid-network level. It also covers dynamic registration of monitored stations using an open implementation of discovery services, which can be used in all environments under an Open Source license.
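The two-level aggregation described above might be sketched as follows. `NodeAgent`, `ClusterCollector` and `SoapGateway` are hypothetical stand-ins for illustration only; the real system exposes node attributes as JMX MBeans and the grid-level interface as a SOAP Web Service, not plain Python classes.

```python
# Toy sketch of two-level aggregation: per-node agents (JMX level),
# per-cluster collectors, and a grid-level gateway (SOAP level).

class NodeAgent:                      # one per worker node
    def __init__(self, host, load):
        self.host, self.load = host, load
    def attributes(self):             # what an MBean would expose
        return {"host": self.host, "load": self.load}

class ClusterCollector:               # aggregates one cluster's nodes
    def __init__(self, name, agents):
        self.name, self.agents = name, agents
    def summary(self):
        loads = [a.attributes()["load"] for a in self.agents]
        return {"cluster": self.name,
                "nodes": len(loads),
                "avg_load": sum(loads) / len(loads)}

class SoapGateway:                    # grid-level interface
    def __init__(self, collectors):
        self.collectors = collectors
    def get_grid_state(self):         # one call instead of N node queries
        return [c.summary() for c in self.collectors]

gw = SoapGateway([
    ClusterCollector("zeus", [NodeAgent("n1", 0.5), NodeAgent("n2", 1.5)]),
])
state = gw.get_grid_state()
```

The design choice illustrated here is that wide-area consumers see only cluster-level summaries, keeping per-node chatter inside the local network.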
Finally, we present the results of a SOAP Gateway implementation, which can be used for comparison with other existing monitoring-system interfaces. We report on long-term use of such a system, including automated installation on a number of nodes, gathering information through the RMI and SOAP protocols, and software maintenance, including development, debugging and upgrading to newer versions of modules. We also present results on the performance of the monitoring system and its impact on the monitored stations.
The monitoring system presented in this study is adequate where a security infrastructure for SOAP traffic encryption has been deployed. Although this study does not present a security subsystem for this purpose, it is possible to use existing technology coherent with Web Services, in the form of SOAP request interceptors, as covered in the last section.
Witold Alda, Tomasz Wojtowicz, Piotr Bys, Michal Gabor,
Dariusz Gocol, Jacek Kitowski
Java Applications for Web-based Visualisation
of Biological Data
We present three Java applications for the visual presentation of biological data, developed within the PROGRESS project. All programs can be loaded as applets from a given address, but they can also be run as applications under control of the Migrating Desktop, one of the interface tools of the PROGRESS portal. Similarly, the data can be taken directly from a given address, or from the database available through the portal. The first program is designed for simple 3D visualisation of biomolecules: proteins and DNA structures. The 3D graphics is based on the Java3D library. Special attention is paid to details which can help in the visualisation and analysis of secondary structures. The second program is used for two-dimensional visualisation of the results of genome assembly calculations. The third program shows the structure of phylogenetic trees generated on the basis of evolution data.
Master-Worker Workflow Management for Distributed Biomedical
Applications in the GRID
GRID middleware provides basic services and foundation frameworks for
building global environments for scientific computing. Low level
issues are addressed such as security, virtual organizations, data
replication, job submission and execution. The DIANE framework (http://cern.ch/diane) is a generic workflow manager for distributed master-worker applications, which builds on top of the existing GRID middleware, providing high-level facilities and idioms for application development and deployment. DIANE may be easily deployed into a concrete GRID environment such as the Globus Toolkit, or used in standalone clusters with popular workload management systems such as LSF or PBS.
DIANE is a callback framework which controls the job execution,
creates tasks and workers, passes data messages between the master and
the workers and finally integrates the task output. Applications do
not have to open communication channels explicitly -- this setup is
done automatically. Fixing the master-worker parallel computation model limits the generality of the applications but enormously increases the flexibility of the framework itself, in a way completely transparent to applications. Runtime flexibility allows switching between in-process application loading based on shared libraries and IPC-based application execution based on statically linked executables. Both setups have practical implications, and DIANE makes it convenient to choose the appropriate one.
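A callback framework of this kind can be sketched in a few lines. This is a hedged illustration of the master-worker pattern described above, not the actual DIANE API: the callback names (`split`, `work`, `integrate`) and the sequential `run` driver are assumptions made for the sketch.

```python
# Sketch of a master-worker callback framework: the application
# supplies callbacks, the framework owns task creation, dispatch
# and result integration.

class Application:
    """User code supplies callbacks; the framework does the plumbing."""
    def split(self, job):                 # master side: create tasks
        return [job * i for i in range(1, 4)]
    def work(self, task):                 # worker side: process one task
        return task * task
    def integrate(self, results):         # master side: combine output
        return sum(results)

def run(app, job):
    # The real framework would dispatch tasks to remote workers over
    # automatically created channels; here they run sequentially.
    tasks = app.split(job)
    results = [app.work(t) for t in tasks]
    return app.integrate(results)

total = run(Application(), 2)   # tasks [2, 4, 6] -> results [4, 16, 36]
```

Because the application never opens a communication channel itself, the same callbacks could be driven in-process or over IPC, which is exactly the runtime flexibility argued for above.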
The architecture of DIANE is based on a component-container object model, which allows the creation of various Application Adapters to extend the framework easily. The Application Adapters provided by default support Python and C++ bindings, as well as mixing the two in a single job. The paper presents the basic architecture and design principles of DIANE, and benchmark results of a distributed simulation in biomedical applications.
Claus-Juergen Lenz, Detlev Majewski
Meteo-GRID: Performing Local Weather Forecast Using GRID Computing
Today there is an increasing demand for reliable high-resolution short-range (up to 48 hours) weather forecasts for government, industry, traffic and the media. These local forecasts are most valuable in cases of high-impact weather, that is for weather systems such as tropical storms (hurricanes, typhoons), violent extra-tropical storms and severe thunderstorms, which may result in loss of lives and property due to wide-spread flooding and gale-force winds. Many national weather services, such as the Deutscher Wetterdienst (DWD), run regional (local) numerical weather prediction models with a mesh size of 10 km or less up to four times a day to provide the necessary forecast products for their users.
Within the framework of the European Union shared research and technology development project
EUROGRID (Application Testbed for European GRID Computing, IST-1999-20247; funding period: November
2000 until January 2004) the aim of the Deutscher Wetterdienst is to provide local weather prediction
for arbitrary regions in the world via Internet and EUROGRID (subproject Meteo-GRID) using the
relocatable non-hydrostatic numerical weather prediction model LM (Local Model). This ASP (Application
Service Provider) solution will allow virtually anyone to run a high-resolution numerical weather
prediction model on demand for his/her domain of interest and hence to calculate his/her own weather
prediction. For this purpose the user will be able to specify the model domain, grid resolution, initial
date and time, forecast range and forecast products via a Java based Graphical User Interface (GUI).
Taking into account the user specifications the following steps are executed. All steps are performed
within the EUROGRID software environment.
- Derivation of topographical data for the model domain selected by the user from high resolution
(1 km x 1 km) data sets stored in a global geographical information system (GIS) at DWD.
- Preparation of initial and lateral boundary data sets for the Local Model (LM). These data are
derived from analyses and forecasts of the Global Model GME, which are stored in an ORACLE database.
- Transfer of the topographical data and the GME data to a high performance computer within the HPC
GRID of EUROGRID.
- Execution of the LM forecast run on the supercomputer mentioned in the previous step. The job consists of two separate tasks which run in parallel, namely the interpolation program GME2LM and the numerical weather prediction model LM itself.
- Dissemination of the LM forecast data to the user's computer, or visualization of the LM results on a computer within the HPC GRID and transfer of the graphs to the user via the Internet and EUROGRID.
In the presentation, a more detailed description of the steps of the EUROGRID application of the LM will be given. In addition, some specifications and requirements of a numerical weather forecast model on supercomputers will be discussed.
Bob Dobinson, Piotr Golonka, Andreas Hirstius, Mihai Ivanovici,
Catalin Meirosu, Stefan Stancu
Moving the Decimal Point: 10 Gigabit Ethernet between Geneva and
Marian Bubak, Wlodzimierz Funika, Marcin Smetek
and Roland Wismueller
OMIS-compliant Monitoring System for Java-Distributed Applications
A prototype monitoring system, the J-OCM, compliant with the On-line Monitoring Interface Specification (OMIS) [2,3], provides the ability to observe and manipulate the execution of a whole distributed Java application. The major concept of the Java-oriented On-line Monitoring Interface Specification (J-OMIS) [1], which extends the original OMIS and underlies the Java-oriented monitoring system J-OCM, is a set of object types with support for object-specific services. A tool, e.g. a performance analyzer, is provided with access to objects such as node objects, JVM objects, threads and class objects, equipped with appropriate services.
The architecture of the J-OCM comprises a central component, responsible for distributing tool requests and assembling replies, and a distributed part: Local Monitors on the nodes and JVM Local Monitors, agents embedded into the JVM processes. The target Java system is considered in terms of the client-server distributed system architecture, focusing on its components: interface definition, proxy, object manager, naming service, and communication protocol.
In our event-based monitoring system, basic events are captured by sensors inserted in the target system and sent to the monitoring system. The monitoring system then takes some action(s), i.e. a sequence of instructions associated with the event. These actions can either carry out data collection or manipulate the running program.
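The event/action scheme described above can be sketched as follows (a minimal illustration with invented names; the actual J-OCM interfaces differ):

```python
# Minimal sketch of the event/action model: sensors inserted in the target
# system report events to a monitor, which runs every action registered for
# that event. All names here are hypothetical.

class Monitor:
    def __init__(self):
        self._actions = {}   # event name -> list of registered actions
        self.collected = []  # data gathered by data-collection actions

    def register(self, event, action):
        self._actions.setdefault(event, []).append(action)

    def sensor_fired(self, event, payload):
        # Called from a sensor inserted into the target system.
        for action in self._actions.get(event, []):
            action(payload)

monitor = Monitor()
# A data-collection action; a manipulation action could be registered
# for the same event in exactly the same way.
monitor.register("thread_started", lambda p: monitor.collected.append(p))
monitor.sensor_fired("thread_started", {"jvm": 1, "thread": "worker-0"})
print(monitor.collected)
```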
The original event model provided in the OCM has been extended by a Java-specific event submodel which covers the functioning of the basic application and execution entities of a distributed Java application.
A greater part of the designed services underlying the functionality of the J-OCM has been implemented [4,5,6]. The results of this research may be applied to the monitoring of Grid services.
- M. Bubak, W. Funika, P. Metel, R. Orlowski, and R. Wismueller: Towards a Monitoring Interface Specification for Distributed Java Applications. In Proc. 4th Int. Conf. PPAM 2001, Naleczow, Poland, September 2001, LNCS 2328, pp. 315-322, Springer, 2002
- T. Ludwig, R. Wismueller, V. Sunderam, and A. Bode: OMIS - On-line Monitoring Interface Specification (Version 2.0). Shaker Verlag, Aachen, vol. 9, LRR-TUM Research Report Series, (1997)
- R. Wismueller, J. Trinitis and T. Ludwig: A Universal Infrastructure for the Run-time Monitoring of Parallel and Distributed Applications. In: Euro-Par'98, Parallel Processing, volume 1470 of Lecture Notes in Computer Science, pages 173-180, Southampton, UK, September 1998. Springer-Verlag
- M. Bubak, W. Funika, M. Smetek, Z. Kilianski, and R. Wismueller: Request processing in the Java-oriented OMIS Compliant Monitoring System. Presented at 5th Int. Conf. PPAM 2003, Czestochowa, Poland, September 2003 (to be printed)
- M. Bubak, W. Funika, M. Smetek, Z. Kilianski, and R. Wismueller: Event Handling in the J-OCM Monitoring System. Presented at 5th Int. Conf. PPAM 2003, Czestochowa, Poland, September 2003 (to be printed)
- M. Bubak, W. Funika, M. Smetek, Z. Kilianski, and R. Wismueller: Architecture of Monitoring System for Distributed Java Applications. Presented at Euro PVM/MPI Int. Workshop 2003, Venice, Italy, September 2003
Bartosz Balis, Marian Bubak, Wlodzimierz Funika, Marcin Radecki,
Tomasz Szepieniec, Roland Wismueller
OMIS/OCM-G and Other Application Monitoring Approaches for the Grid
While the current Grid technology is oriented more towards batch processing, the CrossGrid project is
focused on interactive applications, where there is a person `in the computing loop'. Monitoring of
interactive applications is only possible in the on-line mode in which the information is immediately
delivered to the visualization tools with low latencies. Only then can the user's interactions be
related to the performance results. On-line monitoring is also essential to enable manipulations on
the target application.
In this paper, we provide an overview of three Grid application monitoring
approaches currently being developed and compare them to our approach based
on OMIS/OCM-G. The projects/systems discussed are GrADS (Autopilot),
GridLab (based on GRM), and DataGrid (GRM).
The GrADS project introduces a framework for Grid application development. Part
of it is the Autopilot toolkit, which can gather real-time application and
infrastructure data and analyse it, and which also allows for modification of
the application's behavior. Autopilot is, however, oriented more towards
automatic steering than towards providing feedback to the programmer. It gives a rather
general view of the application and environment, e.g. to explore patterns in
behaviour rather than to locate a particular performance loss.
The application monitoring system developed within the GridLab project
implements on-line steering guided by performance prediction routines
which derive their results from low-level, infrastructure-related sensors (CPU,
network load). However, this approach is not suitable
for interactive applications. First, it does not allow for manipulations
on the target application. Second, the approach seems to rely only on full
traces. Finally, the semantics of all metrics is fixed, which does not
allow for user-defined metrics.
In the DataGrid project, the GRM monitoring system is introduced. The GRM is a
semi-on-line monitor which collects information about the application and
delivers it to the PROVE visualisation tool. While the GRM/PROVE
environment is well suited for the DataGrid project, where only batch
processing is supported, it is less usable for the monitoring of
interactive applications. First, the R-GMA communication infrastructure
used by GRM is based on Java servlets, which introduce a rather high
communication latency. Second, achieving low latency and low intrusion at
the same time is basically impossible when monitoring is based on trace
data: if the traces are buffered, the latency increases; if not, the
overhead for transmitting the events is too high.
The monitoring infrastructure created in the CrossGrid project is the OCM-G,
a distributed, decentralized, autonomous system, running as a
permanent Grid service and providing monitoring services accessible via the
standardized interface OMIS. The system works in on-line mode and allows for
manipulations. Its general philosophy is to provide a flexible set of
monitoring services which allow the user to define metrics with the semantics
they need, instead of providing a predefined set of high-level metrics.
- B. Balis, M. Bubak, W. Funika, T. Szepieniec, R. Wismueller,
Monitoring and Performance Analysis of Grid Applications. In
Proc. ICCS 2003, St. Petersburg, Russia, June 2003. Springer 2003
- J.S. Vetter and D.A. Reed. Real-time Monitoring, Adaptive Control and
Interactive Steering of Computational Grids. In: The International Journal
of High Performance Computing Applications, vol. 14, 2000
- The GridLab project web pages: http://www.gridlab.org
- Z. Balaton, P. Kacsuk, N. Podhorszki, and F. Vajda. From Cluster
Monitoring to Grid Monitoring Based on GRM. In Proc. Euro-Par 2001 Parallel
Processing, August 2001, Manchester, UK, Springer 2001
OpenMolGRID: Complex Problem Solving in Molecular Design
Porting Applications to Globus Toolkit 3.0 and Designing an OGSI-based Architecture
This tutorial will explore the issues involved in engineering
applications on top of the OGSA and Globus Toolkit 3.0.
The latest major edition of the Globus Toolkit allows developers to
build applications in a grid service oriented architecture. This new
technology based on web services opens a wide range of new
possibilities, but at the same time requires a certain discipline from
the software designer and poses new challenges to the project manager.
We will explore the pros and cons of moving applications into an OGSI-compliant
architecture. The question of whether it makes sense to Grid-enable a
software component can be answered by identifying and comparing the potential
benefits and dangers of such a move.
We will later move on to discuss the features of the Open Grid Services
Architecture. We will demonstrate how to apply them to chosen applications.
Finally, the technology issues will be discussed, and participants will get a
feel for the problems expected at the implementation level.
The workshop will be strongly based on real examples. One of the examples we
wish to explore is the process of designing and building the NeesGRID NTCP
component, the first application of GT3.0 deployed in production.
We encourage the participants of the tutorial to submit their own questions,
problems and suggestions related to their own experience or plans of building
a GT3.0-based application layer infrastructure. We will spend time analyzing
the submitted questions and exploring them during the tutorial. Information is
available at http://perfringo.com/events/
Marian Bubak, Andrzej Jozwik, Maciej Malawski, Katarzyna
Rycerz, Dominik Ziembinski
Porting Irregular and Out-of-core Computations on Grids
Parallelization of irregular problems is a non-trivial task due to the data access pattern,
where data arrays are accessed through one or more levels of indirection arrays. Our work on
the parallelization of such problems resulted in the implementation of the LIP library, which
provided runtime support for irregular MPI programs on clusters.
Our current work concentrates on a feasibility study of a possible migration of the LIP library
towards the Grid environment. The Grid offers more computing power to solve large-scale problems, but
its distributed nature makes programming more difficult and less effective. Our new G-LIP library
aims to support irregular computations on the Grid.
There are two phases in computing irregular problems with the LIP library: inspector and
executor. To support applications with a significant computation effort, these two phases have been
split following a master-worker scheme: inspectors play the role of masters, while workers are the
nodes that process the irregular loop. When the application starts, all data elements are distributed
among the inspectors. The inspectors then examine the data references, build communication schedules,
gather non-local data elements and translate global indices into local ones. After that, each inspector
broadcasts the pre-packed data arrays to its workers. While the workers do the computation, the inspectors
build the communication schedules for the next time step. When the workers finish their job and return
the computed results, the inspectors scatter the received data according to the first communication
schedule built and gather data for the next time step using the previously prepared schedules. This
approach was subject to preliminary tests using MPICH-G and Globus Toolkit 2.x.
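The inspector/executor split described above can be sketched as follows (an illustrative Python sketch with invented names, not the G-LIP API):

```python
# Sketch of the inspector/executor scheme: the inspector analyses the
# indirection array, builds a communication schedule of non-local elements,
# gathers them, and translates global indices; the executor then runs the
# irregular loop over local and gathered ("ghost") data.

def inspector(indirection, local_range, global_data):
    lo, hi = local_range
    # Communication schedule: global indices that must be fetched remotely.
    schedule = sorted({g for g in indirection if not (lo <= g < hi)})
    gathered = {g: global_data[g] for g in schedule}          # gather phase
    translate = {g: ("local", g - lo) if lo <= g < hi else ("ghost", g)
                 for g in set(indirection)}                   # index translation
    return schedule, gathered, translate

def executor(indirection, local_range, global_data, gathered, translate):
    lo, hi = local_range
    local = global_data[lo:hi]
    total = 0
    for g in indirection:            # the irregular loop: acc += data[ind[i]]
        kind, idx = translate[g]
        total += local[idx] if kind == "local" else gathered[idx]
    return total

data = [10, 20, 30, 40, 50, 60]
ind = [0, 4, 1, 5, 0]                # indirection array with non-local accesses
sched, ghost, trans = inspector(ind, (0, 3), data)
print(sched)                         # non-local indices to be communicated
print(executor(ind, (0, 3), data, ghost, trans))
```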
The second improvement to the LIP library is support for adaptive problems. In such problems, data
arrays are accessed via indirection arrays, and the data access patterns change during the computation. Hence
the schedule must be regenerated at each time step, and the communication volume rises because the data
partitioning does not account for data element migration. Since most of the index analysis can be reused,
the communication schedule only needs to be updated, so the cost of the inspector phase is minimized.
Moreover, to keep the communication volume at the same level, shifting data elements across processors should be
- Brezany, P., Bubak, M., Luszczek, P., Malawski, M., Zajac, K., A Runtime Support for Large-Scale
Irregular Computing on Clusters and Grids, Annual Review of Scalable Computing, in: Kwong, Y. C. (Eds.),
Series on Scalable Computing, vol. 5, Singapore University Press and World Scientific, 2003, p
Bartosz Balis, Marian Bubak, Michal Wegiel
Proposal of Adaptation of Legacy C/C++ Software
to Grid Services
The adaptation of legacy C/C++ software to web/Grid services is important from both the scientific
and the commercial point of view. Yet, existing approaches concentrate mainly on the web
services technology. In consequence, they lack a number of crucial concepts and may prove unacceptable
in the Grid services context. Along with the emergence of the Open Grid Services Architecture, a
novel, significantly different approach was introduced. It extends the model of web services by
imposing additional requirements concerning security and lifetime management of stateful service instances.
We focus on the adaptation of existing legacy libraries and applications to work as Grid services.
The clients interact with these services in a standard fashion using WSDL, SOAP and message-level
security. They are expected to follow the paradigm of service factories and create service instances
for their own exclusive usage. The central concept behind the proposed architecture is as follows.
Each created service instance is automatically associated with an internal proxy client. This client
is hidden from the external world and bears responsibility for translating the actual client method
invocations into the underlying native calls. It receives client requests and supplies the corresponding
results via interaction with a specialized proxy service. One instance of a proxy service is
introduced for each service instance created by a client. The proxy client can run on an arbitrary host
with the required legacy libraries installed and can even migrate during its execution. This conforms
to the ideas of brokering and job submission mechanisms: proxy clients are in fact jobs scheduled
for execution on behalf of the corresponding client. This makes it possible to achieve a high degree of
scalability. Obviously, proxy clients need to be implemented in native languages (C/C++) since they
directly cooperate with the legacy code. However, they do not differ from other clients in the
manner in which they perform communication (language-neutral SOAP messages are used).
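The proxy-client idea can be sketched as follows (all names are hypothetical; a dict stands in for both the incoming SOAP message and the dynamically loaded legacy library):

```python
# Illustrative sketch of the proxy client described above: a hidden client
# receives a serialized method invocation from its proxy service and
# translates it into a call on the underlying native library. The dict of
# lambdas stands in for bindings to legacy C/C++ code.

NATIVE_CALLS = {                       # stand-in for the dlopen'ed legacy code
    "matrix_norm": lambda xs: max(abs(x) for x in xs),
    "matrix_scale": lambda xs, k: [x * k for x in xs],
}

def proxy_client_handle(request):
    """Translate one client request into the corresponding native call."""
    func = NATIVE_CALLS[request["method"]]
    result = func(*request["args"])
    # In the real architecture the reply would travel back as a SOAP message.
    return {"id": request["id"], "result": result}

reply = proxy_client_handle({"id": 7, "method": "matrix_scale",
                             "args": ([1, 2, 3], 2)})
print(reply)
```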
There are a number of advantages of the above approach, to name only the most important ones:
- no changes are required in the existing legacy code,
- the solution ensures high scalability and security (no open ports needed,
authentication, authorization, credentials delegation are incorporated),
- high flexibility is achieved by introducing a universal framework,
- substantial development effort is avoided due to the use of tools
facilitating migration to grid services (e.g. gSOAP, Java2WSDL etc.),
- the solution is portable between various grid hosting environments,
- compatibility with grid service requirements (lifetime management, job
submission, notifications, etc.).
As a proof of concept, the presented solution was partially implemented in a project aiming at
adapting the OCM-G Grid application monitoring system to Grid services. Globus Toolkit 3.0,
together with the gSOAP package and its GSI plugin, was employed for this purpose.
- Yan Huang, Ian Taylor, David W. Walker, Robert Davies: Wrapping Legacy Codes for Grid-Based
Applications. To be published in proceedings of the HIPS 2003 workshop.
- Dietmar Kuebler, Wolfgang Eibach (IBM Germany) Adapting Legacy Applications as Web Services.
- The OGSA project homepage: http://www.globus.org/ogsa
Witold Alda, Remigiusz Górecki, Marek Budyn, Michal
Prototype System for Distributed Scientific Visualisation
We present the architecture of a distributed visualisation system which reads data from one or more remote servers and distributes further processing, such as data extraction, mapping and visualisation, across both remote and local computers. The decision how the processing should be distributed is in the user's hands.
The architecture is designed to fulfill the needs of general-purpose visualisation. Thus the system is flexible and can easily be enriched by adding new modules on both the server and the client side. The user can build the visualisation pipeline by picking available modules and constructing his/her own projects and scenarios. The entire system is written using Java/XML technologies. The existing visualisation modules use Java3D, but tests with OpenGL and GL4Java show that these can also be applied in the system without any problems. Currently, prototype modules for the visualisation of chemical molecules are available.
CERN European Organization for Nuclear Research
Review of DataGrid Progress and Plans for the EGEE Project
SAN over WAN - a New Way of Solving the GRID Data Access Bottleneck
After solving user access, job scheduling and system monitoring, remote access to big
amounts of data still remains a challenge in many GRID environments. Due to the bandwidth limitations
and the mismatch between peak and sustained throughput performance of wide area networks (WAN),
big files have to be copied well in advance of running the job, or sometimes even sent on tape to the
targeted GRID node. Obviously, the results have to undergo the same procedure.
This fact sometimes makes it inefficient to run a job on a remote, better suited or less loaded GRID node.
SGI's engineering has spent the last couple of years extending industry-standard storage architectures,
which are traditionally tailored to the needs of commercial applications and environments, to the
slightly different requirements of the technical and scientific world.
The latest storage technology is the SAN (Storage Area Network). The idea was to solve the bandwidth
bottlenecks traditional file servers have due to the fact that they use shared or dedicated TCP/IP
networks for data transport. SANs use their own storage networks based on the Fibre Channel protocol,
which is better suited for high-performance data access to storage devices like RAID arrays and
However, those SANs are limited to local sites, usually buildings, due to the distance limitation of
Fibre Channel. Also, only one system can access the data on one partition of the storage array, and
therefore data sharing between systems again has to be done via TCP/IP networks.
The paper presents a new concept of sharing and accessing data from different nodes in a GRID
environment, co-developed by SGI and LightSand, a company producing gateways for long-distance
network connections. It discusses the implementation details as well as the resulting benefits
for a GRID environment. One of them is that the application has direct access to the data on the
"home" storage array from every GRID node, without the need to copy the data to the GRID node where
the job is executed.
Case studies and other areas of deployment are shown as well.
Konrad Karczewski, Lukasz Kuczynski and
Secure Data Transfer and Replication Mechanisms
in Grid Environments
Data transfer issues are amongst the most important in modern Grid environments. The applications
currently running in Grid environments are becoming more real-life oriented; they are based on and
generate data sets of growing importance and confidentiality. All this implies that data management
systems must provide great reliability and should incorporate mechanisms for data safety and confidentiality.
In a data replication service, confidentiality of data becomes a major problem because of the need
for geographically and logically distributed data storage. This implies choosing only
"trusted" data storage places, which may prove difficult in many cases. More importantly, it negates
the idea of transparent data replication with intelligent load balancing. It is therefore necessary to create
a data replication service providing not only data security by means of replication and/or encrypted
transmission, but also methods of keeping the replicated data confidential.
Confidentiality should be understood on multiple levels. The first is keeping the data on the storage
device unreadable for unauthorised users. The second is keeping unauthorised users unaware even
of the existence of a given data set. This implies the necessity of keeping the replica content unknown even
to the owner of the storage device it is kept on. This requirement imposes the development of a
distributed replica catalog keeping track of all data entering the data management system and providing
unique identifiers replacing the original URIs. Moreover, the catalog service has to include access
control lists and authentication mechanisms to ensure the required level of data privacy.
An additional level of confidentiality can be achieved by data partitioning. In addition to URI encoding,
every data set can be divided into multiple parts stored in different storage elements. The original
data can only be reconstructed when the information about the physical location and ordering of the partitions
is retrieved from the replica catalog by an authorised user. This solution not only allows for greater
confidentiality and security of data but can also improve the performance of the data management system
by making parallel transfers from multiple sources possible.
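The partitioning scheme can be sketched as follows (a minimal illustration; the catalog and storage-element structures are invented for this example):

```python
# Sketch of confidentiality-oriented data partitioning: a data set is split
# across several storage elements under opaque identifiers, and only the
# replica catalog records the location and ordering needed to reconstruct it.

import hashlib

def partition(data: bytes, elements, parts=3):
    chunk = -(-len(data) // parts)            # ceiling division
    catalog = []                              # (order, storage element, id)
    stores = {se: {} for se in elements}      # stand-in for storage elements
    for i in range(parts):
        piece = data[i * chunk:(i + 1) * chunk]
        # Opaque identifier replacing the original URI.
        pid = hashlib.sha256(piece + bytes([i])).hexdigest()[:12]
        se = elements[i % len(elements)]
        stores[se][pid] = piece
        catalog.append((i, se, pid))
    return catalog, stores

def reconstruct(catalog, stores):
    # Only an authorised catalog lookup reveals ordering and locations;
    # the pieces could equally be fetched in parallel.
    return b"".join(stores[se][pid] for _, se, pid in sorted(catalog))

cat, st = partition(b"confidential-data-set", ["se1.example", "se2.example"])
print(reconstruct(cat, st))
```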
In such a confidentiality-oriented environment it is crucial to remove every possible single point of
failure; in particular, it is essential to keep multiple, synchronised instances of the replica catalog.
Here, the development of a distributed replica catalog seems the most promising solution. In addition,
the creation of a distributed data broker is necessary. Its main tasks would be user authentication and
authorisation, passing user requests to the replica catalog, and making decisions about data partitioning,
replication and reconstruction.
Syed Naqvi, Michel Riguidel
Security Risk Analysis for Grid Computing
Security and privacy issues are coming to the fore with the growing size and profile of the
Grid community. The forthcoming generations of the Computational Grid will make a huge number
of computing resources available to a large and wide variety of users. The diversity of applications and
the mass of data being exchanged across Grid resources will attract the attention of hackers to a much
higher extent. A comprehensive security system, capable of responding to any attack on its resources,
is indispensable to guarantee the anticipated adoption of the Grid by both Grid users and resource
providers. In this article, the authors argue that the first brick of an effective plan of
countermeasures against these threats is an analysis of the potential risks associated with Grid computing.
This article presents a pragmatic analysis of the vulnerability of existing Grid systems and the
potential threats posed to their resources once their spectrum of users is broadened. Various existing
Grid projects and their security mechanisms are analyzed. The experience of using common Grid
software and an examination of Grid literature served as the basis for this analysis. Legal loopholes in
the implementation of Grid applications across the geopolitical frontiers, and the ethical issues
that could obstruct the wide acceptance and trustworthiness of Grids are also discussed.
The weaknesses revealed are classified with respect to their sources and possible remedies are discussed.
The results show that the main reason for the vulnerability is the fact that Grid technology has been
little used except by a certain kind of public (mainly academics and government researchers). This
public benefits greatly from being able to share resources on the Grid, and has no intention of harming
the resource owners or fellow users. Thus there was no need to address security in depth. This is all
about to change: the number of people who know about the Grid is growing fast, as are the
worthwhile targets for potential attackers. The security nightmare cannot be avoided unless the problem
is addressed urgently.
This detailed taxonomy of potential threats and sources of vulnerability in existing Grid
architectures is the first milestone on the road to a robust Grid security system. It provides a
comprehensive overview which will enable us to effectively plan countermeasures against the
existing risks. Our future directions include the definition of a Protection Profile (Common Criteria),
followed by the formulation of a comprehensive security policy and finally its implementation.
NB: This research work is a part of ongoing research activities of the Information Technology Security
Group at the Computer Sciences and Networks Department under the patronage of European Union Information
Society Technologies funded projects.
Marian Bubak, Maciej Malawski, Grzegorz Mlynarczyk, Piotr
Nowakowski, Robert Pajak, Michal Turala, Katarzyna Rycerz
Software Development in the EU CrossGrid Project
CrossGrid is one of the largest European projects in Grid research, uniting 21 separate institutions
and funded by the 5th European Framework Programme. As such, it demands the creation and implementation
of custom-tailored software development, testbed integration and quality assurance procedures.
The Project demands that the following be achieved:
- all software developed within CrossGrid should follow a uniform set of rules and coding conventions,
- similar testing and validation procedures should be employed by all partners responsible for
- CrossGrid testbeds need to be prepared on time for software releases, with a uniform set of
middleware, pre-tested and approved by the Project Architecture Team,
- consistent quality assurance must be implemented throughout all stages of the Project, pursuant
to terms and conditions set out by the European Union for large, multinational collaborations.
All the above-mentioned goals can only be realized through a systematic approach to software design,
verification and quality control. This article will outline some aspects of this approach, as developed
by CrossGrid Project Management and implemented throughout the Project Consortium. In particular,
the CrossGrid Quality Assurance plan will be discussed, as it explains the relevant procedures and
organizational bodies dealing with all aspects of CrossGrid software development, testing and
integration. Special attention will be paid to unit testing and the notion of Quality Indicators,
which are very well suited to ensuring quality within a project of this scope and magnitude.
- Eric J. Braude, Software Engineering: An Object-Oriented Perspective (John Wiley & Sons, Inc., 2001)
- The CrossGrid Quality Assurance Plan; available at: http://www.eu-crossgrid.org/Deliverables/1st
- CrossGrid Deliverable D5.2.3 (Standard Operating Procedures); available at:
Maciej Malawski, Marek Wieczorek, Marian Bubak, Elzbieta Richter-Was
Storage and Analysis System for Data Intensive High Energy Physics Applications
The presented work is devoted to the problem of storage and analysis of data originating from the
great experiments of particle physics. Detector simulation software produces a number of large datasets
that are subject to further analysis, resulting in the generation of specific histograms. Managing
such a large number of files becomes a non-trivial task for the researcher. Our work aims to make
this task easier.
We introduce and describe the Lhcmaster, a system storing the data and performing basic analysis of
the data (production of the histograms for each data file).
The Lhcmaster is a database system based on the relational model of data, providing several user
interfaces, including a command-line administrative interface and a web interface used by external users.
The main use case of the Lhcmaster is retrieving files from the database with the help of a graphic
index of the files (the sets of histograms generated for each file). The basic analysis is performed
within the ROOT framework. There is also a mechanism for authentication and authorization of the users,
based on a model of user groups. The system is implemented with basic languages and technologies such as
the MySQL RDBMS and Perl CGI.
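The kind of relational file index described above can be sketched as follows (the schema and names are our illustration, not the actual Lhcmaster schema; sqlite3 stands in for MySQL):

```python
# Sketch of a relational index over simulation data files: each file is
# registered together with the histograms produced for it, so files can be
# located through their graphic index. Table and column names are invented.

import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE datafile (id INTEGER PRIMARY KEY, name TEXT, owner_group TEXT);
    CREATE TABLE histogram (file_id INTEGER REFERENCES datafile(id),
                            title TEXT, image_path TEXT);
""")
db.execute("INSERT INTO datafile VALUES (1, 'atlfast_run_001.root', 'atlas')")
db.execute("INSERT INTO histogram VALUES (1, 'pT spectrum', 'h/1_pt.png')")

# Main use case: find a data file through its histogram index.
row = db.execute("""SELECT d.name FROM datafile d
                    JOIN histogram h ON h.file_id = d.id
                    WHERE h.title = ?""", ("pT spectrum",)).fetchone()
print(row[0])
```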
The Lhcmaster-G is a design for a system equivalent to the Lhcmaster, but realized with completely
different, Grid technologies. We believe that this approach allows us to create a more powerful and
flexible tool, adapted to operate within a distributed and heterogeneous environment.
Furthermore, it would also be possible to add new functionalities to the Lhcmaster-G compared
with the Lhcmaster. In particular, production of new data files and more sophisticated data analysis
might be performed in the future within the Lhcmaster-G. The outline of the Lhcmaster-G assumes that the
system will be based on widespread Grid tools, such as the Globus Toolkit and the European Data
Grid software. The Lhcmaster-G has not been implemented yet, but the present work contains a detailed
description of how the system might be implemented.
- ATLFAST 2.0 a fast simulation package for ATLAS, ATL-PHYS-98-131 13 Nov 1998
- European DataGrid project http://www.eu-datagrid.org
The Grid Visualization Kernel - Scientific Visualization on the Grid
Marian Bubak, Wlodzimierz Funika, Roland
Wismüller, Tomasz Arodz,
The G-PM Performance Measurement Tool for Interactive Grid Applications
The task of performance analysis of interactive
applications, which feature a highly distributed nature and
dynamic changes in the execution environment, calls for
run-time measurement definition, selective
instrumentation, and the use of counter/timer mechanisms
rather than the extensive tracing common to most
performance tools. This implies the necessity to focus on
the interaction of distributed application components and to
provide data meaningful in the context of an application.
To provide this kind of information, the G-PM tool uses
three sources of data: performance measurement data related
to the running application, measured performance data on the
execution environment, and the results of micro-benchmarks,
providing reference values for the performance of the execution environment.
Via the GUI, one can select a preferred measurement type and an
appropriate display type and its features within a dialog. High-level
performance properties are defined by the user, either by
loading them from a file or via the measurement definition dialog.
A user-defined metric is transformed into an appropriate
set of standard metrics. Measurements are realized either by
sampling or by determining the monitoring events relevant to the
value to be measured. Application-specific metrics are
associated with inserted user-defined procedure calls
(probes), which generate special events to be captured by
the monitoring system. Active measurements are created
whenever a concrete measurement is defined.
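The transformation of a user-defined metric into standard counter/timer metrics can be sketched as follows (names invented for illustration; not the actual G-PM interface):

```python
# Sketch of a user-defined, high-level metric expressed in terms of a small
# set of standard counter/timer metrics maintained by the monitoring system.

STANDARD = {"bytes_sent": 0, "comm_time": 0.0}   # standard counters/timers

def record_send(nbytes, seconds):
    # A monitoring event relevant to the values being measured updates
    # the standard metrics.
    STANDARD["bytes_sent"] += nbytes
    STANDARD["comm_time"] += seconds

def bandwidth():
    # User-defined metric, transformed into standard ones:
    # mean send bandwidth = bytes_sent / comm_time.
    return STANDARD["bytes_sent"] / STANDARD["comm_time"]

record_send(4_000_000, 1.0)
record_send(2_000_000, 0.5)
print(bandwidth())   # bytes per second
```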
The tool interacts with the OCM-G and collects
information about a selected application by connecting to
the monitoring system via the monitoring interface and by
sending conditional or unconditional requests to the
monitoring system, which result in obtaining
monitoring data or executing some manipulations on the target application.
- Bubak, M., Funika, W., and Wismueller, R.: The CrossGrid
Performance Analysis Tool for Interactive Grid Applications.
In: Kranzlmueller, D. and Kacsuk, P. and Dongarra, J. and
Volkert, J. (Eds.), Recent Advances in Parallel Virtual
Machine and Message Passing Interface, 9th European PVM/MPI
Users' Group Meeting, September - October 2002, Linz,
Austria, 2474, Lecture Notes in Computer Science, 50-60, Springer-Verlag, 2002
- Balis, B., Bubak, M., Funika, W., Szepieniec, T., and
Wismueller, R.: An Infrastructure for Grid Application
Monitoring. In: Kranzlmueller, D. and Kacsuk, P. and
Dongarra, J. and Volkert, J. (Eds.), Recent Advances in
Parallel Virtual Machine and Message Passing Interface, 9th
European PVM/MPI Users' Group Meeting, September - October
2002, Linz, Austria, 2474, Lecture Notes in Computer
Science, 41-49, Springer-Verlag, 2002
Bartosz Balis, Marian Bubak, Wojciech Rzasa, Tomasz Szepieniec,
Two Aspects of Security Solution for Distributed Systems in the Grid on the Example
of the OCM-G
This paper presents the security solution for the OCM-G, a Grid-enabled monitoring system. Grid
applications are becoming complex, thus tools facilitating the development process are required. The OCM-G
is designed as an agent between such tools and the application processes running on numerous nodes
belonging to distributed Grid sites.
The OCM-G is designed as a distributed and decentralized system; thus the scalability required in the Grid
environment is achieved. The monitoring system consists of two parts: a permanent one, handling multiple
applications of numerous users, and a transient one, belonging to the owner of the monitored application.
The OCM-G is designed to support on-line monitoring, which results in requirements concerning delivery time
between the user and the application processes.
Security issues are essential for the OCM-G, since it supports multiple users and offers extensive means of controlling processes. The monitoring system must not lower the security of the sites it runs on. This paper presents an analysis of a solution proposed to ensure the security of the OCM-G. We distinguish two aspects of the solution: inter-component communication and the forge-component attack. The second aspect results from the fact that secure communication between system components does not by itself ensure the security of the whole system. Results of tests are shown in order to estimate the overhead caused by the solution. We extend the concepts presented in  and describe the idea of different security levels for different network environments. We show that the forge-component security aspect is universal and can be used to secure other systems similar to the OCM-G.
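The forge-component point — that an encrypted channel is useless if the peer at the other end is an impostor — can be illustrated with a minimal authentication sketch. The toy below uses a shared-secret HMAC to reject components that cannot prove knowledge of the site secret; the actual OCM-G solution is based on Grid credentials and differs in detail, so treat every name here as an assumption.

```python
# Illustration of the forge-component idea: beyond securing the channel,
# each component must authenticate itself, or a forged component could join.
# Shared-secret HMAC is a stand-in for the OCM-G's real credential scheme.
import hmac, hashlib, os

SITE_KEY = os.urandom(32)   # secret known only to legitimate components

def register_token(component_id, key):
    """Token a component presents when registering with the system."""
    return hmac.new(key, component_id.encode(), hashlib.sha256).hexdigest()

def accept_component(component_id, token):
    """The system recomputes the token and compares in constant time."""
    expected = register_token(component_id, SITE_KEY)
    return hmac.compare_digest(expected, token)

genuine = register_token("local-monitor-7", SITE_KEY)
forged = register_token("local-monitor-7", os.urandom(32))  # attacker's key
print(accept_component("local-monitor-7", genuine))  # accepted
print(accept_component("local-monitor-7", forged))   # rejected
```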
- Balis, B., Bubak, M., Rzasa, W., Szepieniec, T., and Wismueller, R.: Security in the OCM-G Grid Application Monitoring System, accepted for PPAM'2003, Czestochowa, Poland, 2003
UNICORE - Towards Production Quality Grid Environment
Krzysztof Benedyczak, Michal Wronski
UNICORE Plugins - How to Design Application Specific Interfaces
VAN - Visual Area Network
Interactive and Collaborative Visualization of Large Data Sets in Distributed Environments
Solving the biggest problems in science and industry requires the best minds. However, people are increasingly globally mobile or located far from an organization's advanced computing resources. Visual Area Networking (VAN) solves these challenges by integrating high-productivity computing, visualization, scalable storage, and networking technologies that make it easier to bring people together with the visual information they need to do their jobs effectively. VAN not only allows individuals and groups to solve complex problems while drawing on an organization's best experts; it also delivers these capabilities directly to end users or groups, wherever they may be, so that shared visualization becomes part of the user's standard environment. As a result, users are free to focus on creativity and insight in a collaborative setting rather than on the technical details of computing, visualization, and data management.
Visual Area Networking represents a shift from focusing only on advancing the power needed for the most precise rendering to also considering the location and availability of visualized data across the network. VAN is driven by two core technologies developed by SGI: the SGI Onyx visualization system and a software component called OpenGL Vizserver. OpenGL Vizserver allows users of remote workstations, laptops, and even wireless tablet computers to use existing, unmodified applications to access and control the power of SGI Onyx family visualization systems, and to collaborate with one another using existing visualization applications based on the OpenGL API.
OpenGL Vizserver Architecture
OpenGL Vizserver software has two primary components: a server and a client. The server runs on SGI graphics supercomputers, managing graphics resources (e.g., graphics pipelines) and monitoring the activity of visualization applications. Once a visualization application is started, the OpenGL Vizserver server assigns the application the requested graphics resources and begins serving the application's rendered frames to the OpenGL Vizserver client. This visual serving is the basis of the OpenGL Vizserver technology and SGI's VAN strategy. Only after a visual application has rendered a frame does OpenGL Vizserver intercede and capture that frame. The captured frame can be a small fraction of the original data set size and orders of magnitude less complex, because only the pixels associated with the screen representation of the data are captured.
Each captured frame is compressed using either lossy or lossless data compressors that take advantage of interframe coherency to minimize the amount of data sent to the OpenGL Vizserver clients. Once compressed, the image stream is sent to the client. An OpenGL Vizserver client is a lightweight application that reads the image stream from the OpenGL Vizserver server, uncompresses the stream, and displays the uncompressed image on the client computer. The OpenGL Vizserver client directs all user interaction back to the OpenGL Vizserver server, creating a seamless visualization environment on the client, as if the user were interacting locally with the SGI graphics supercomputer. The OpenGL Vizserver client runs on a variety of operating systems, including IRIX, Linux, Windows, and Solaris, and the client system need not have extensive graphics or computational power.
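The capture-compress-serve loop just described can be sketched in a few lines. The sketch below uses zlib purely as a stand-in for Vizserver's interframe compressors, and the function names are invented for illustration; the point is that only rendered pixels, compressed, cross the network, and the client merely decompresses and displays.

```python
# Minimal sketch of the visual-serving loop: capture the rendered frame,
# compress it, ship it; the client decompresses and displays. zlib here is
# only a stand-in for Vizserver's lossy/lossless interframe compressors.
import zlib

def server_side(framebuffer: bytes) -> bytes:
    # Capture happens only after the application has rendered the frame;
    # only screen pixels are sent, never the original data set.
    return zlib.compress(framebuffer)

def client_side(stream: bytes) -> bytes:
    # A lightweight client: decompress, then blit to the local display.
    return zlib.decompress(stream)

frame = bytes(range(256)) * 16        # pretend 4 KB of rendered pixels
wire = server_side(frame)             # what actually crosses the network
shown = client_side(wire)
print(len(frame), len(wire))          # the wire image is much smaller
```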
Tomasz Kuczynski, Roman Wyrzykowski and Jaroslaw Zola
Web Access to Distributed Condor Pools
The original goal of the WebCI project is the creation of a tool that allows monitoring and management of a Condor pool via the WWW. The main emphasis is on ease of job submission and control, as well as convenient UNIX shell access. Key elements of the project are portal security and platform independence. These requirements constrain us to use only standard system tools.
All of the above leads to the concept of using SSH sessions and the scp tool through pseudo-terminals. Initially, no support for interactive jobs was planned, given the non-persistent nature of the HTTP protocol. The use of a local server that keeps the state of SSH connections between subsequent HTTP transactions allows this functionality to be added and also improves performance.
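The session-keeping trick can be sketched as a pool of long-lived child processes keyed by a session token, so that each stateless HTTP request reuses an already-open connection instead of paying the SSH handshake again. This is a hypothetical sketch, not WebCI's implementation; a plain `cat` subprocess stands in for the ssh pseudo-terminal.

```python
# Sketch: a local server holds live shell sessions across stateless HTTP
# transactions. Each session is a long-lived child process addressed by a
# token; `cat` stands in here for an ssh session on a pseudo-terminal.
import subprocess, uuid

class SessionPool:
    def __init__(self):
        self.sessions = {}

    def open(self, argv):
        """Spawn a long-lived process and hand back a session token."""
        token = str(uuid.uuid4())
        self.sessions[token] = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
        return token

    def send(self, token, line):
        """One 'HTTP request': reuse the live connection, no reconnect cost."""
        proc = self.sessions[token]
        proc.stdin.write(line + "\n")
        proc.stdin.flush()
        return proc.stdout.readline().rstrip("\n")

    def close(self, token):
        proc = self.sessions.pop(token)
        proc.stdin.close()
        proc.wait()

pool = SessionPool()
tok = pool.open(["cat"])              # real WebCI would spawn ssh here
print(pool.send(tok, "condor_q"))     # `cat` simply echoes the command back
print(pool.send(tok, "ls -l"))        # same process: the session persisted
pool.close(tok)
```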
The use of SSH and SCP makes it possible to separate the portal from the access node of the pool. This in turn allows adding the functionality of interacting with and monitoring multiple Condor pools. Not without importance is the ability to seamlessly attach new Condor pools by simply adding the domain or IP address of the access node to the WebCI config file. Every pool may be accessed by an unrestricted number of portals, removing a single point of failure and increasing the availability of the service.
The use of mainly server-side technologies allows the use of a thin client and, in the future, the creation of WAPCI, which will provide full support for mobile devices.
The WebCI system architecture can easily be adapted to Grid structures, thus creating a secure and efficient WWW interface. Among other tasks, this interface will enable monitoring of resources and job queues, job submission and management, exchange of files between a web browser and a user account, and management of files and directories on user accounts. An important advantage of the WebCI Grid portal will be the convenient use of shell commands through a tool similar to Midnight Commander.
Günter Kickinger, Jürgen Hofer, A Min Tjoa,
and Peter Brezany
Workflow Management in GridMiner
Knowledge discovery in data resources (files, file collections, relational databases, XML databases
and semistructured data, etc.) managed within Computational Grids is a challenging research and
development problem. The GridMiner project aims to cover all aspects of knowledge discovery and
implement them as an advanced Grid application. We focus our effort on data mining and On-Line
Analytical Processing (OLAP), two complementary technologies, which, if applied in conjunction,
can provide a highly efficient and powerful data analysis and knowledge discovery solution on the Grid.
Knowledge discovery is a highly interactive process. To achieve appealing results, the user must at all times be able to influence this process by applying different algorithms or adjusting their parameters. It is therefore essential that GridMiner provide a powerful, flexible, and easy-to-use user interface to support the knowledge discovery process.
The research on GridMiner can be divided into two tasks. The first one is to provide a set of Grid services which realize the individual steps of the knowledge discovery process. This set of services
includes, for example, the integration of different data sets, the pre-processing of the data, like
cleaning and normalization, various data mining algorithms, and the presentation of the knowledge
discovered to the user.
A special service class comprises services for data mining on top of OLAP, so-called On-Line Analytical Mining (OLAM), including services for cube creation and interactive OLAP and OLAM services. The second task of our research deals with the integration of all these different services into one system.
The knowledge discovery process can be interpreted as connecting the different available services into a workflow which is executed by an appropriate engine. Despite intensive research on workflow languages for Grid and Web Services, no existing approach is suitable for GridMiner. This is due to the fact that all existing workflow languages aim at building a new orchestrated service out of existing services. This new orchestrated service is then published and can be used by a client like a conventional service. GridMiner needs a highly dynamic workflow concept, in which a client can compose a workflow according to its individual needs.
In our approach, we design a new XML-based language for dynamic service composition, called the Dynamic Service Composition Language (DSCL), and develop a workflow engine called the Dynamic Service Composition Engine (DSCE). DSCL allows the description of a workflow consisting of various Grid services and the specification of parameter values for the individual underlying Grid services.
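The abstract does not give DSCL's concrete syntax, so the XML below is an invented illustration of the idea only: a workflow of Grid services with per-service parameter values and dependencies, which an engine like DSCE could read. The element and attribute names are assumptions, not the actual language.

```python
# Hypothetical DSCL-style document: a workflow of Grid services with
# parameter values, parsed with the standard library. The schema is
# invented for illustration and is not the real DSCL.
import xml.etree.ElementTree as ET

DSCL_EXAMPLE = """
<workflow name="knowledge-discovery">
  <service id="clean" uri="http://example.org/PreprocessingService">
    <param name="strategy">normalize</param>
  </service>
  <service id="mine" uri="http://example.org/DataMiningService" after="clean">
    <param name="algorithm">decision-tree</param>
  </service>
</workflow>
"""

root = ET.fromstring(DSCL_EXAMPLE)
# An engine would derive execution order from the declared dependencies.
steps = [(s.get("id"), s.get("after")) for s in root.iter("service")]
print(steps)
```

A client composing its own workflow would simply emit such a document and hand it to the engine, which is the "highly dynamic" usage mode the abstract contrasts with published, pre-orchestrated services.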
DSCE is implemented as a Grid service and can be controlled interactively by a client, which has the possibility to execute, stop, resume, or even change the workflow and its parameters. Moreover, DSCE can be used for processing in batch mode. The development of Dynamic Service Composition is not exclusively linked to GridMiner: every Grid application that needs highly dynamic workflows can make use of this novel concept.