Abstracts
Presentations
(in alphabetical order)
Bartosz Balis, Marian Bubak, Wlodzimierz Funika, Marcin Radecki,
Tomasz Szepieniec, Roland Wismueller, Tomasz Arodz, and Marcin Kurdziel
A Concept of a Monitoring Infrastructure for
Workflow-Based Grid Applications
The main goal of this work is to design a Grid service for on-line performance monitoring of workflow-based Grid applications. The service is meant to provide information about running applications: the current status of the application and of the Grid infrastructure, the availability of resources, and the correct operation of the services the application comprises. This information could be used by systems that ensure fault-tolerant execution of the application, by scheduling systems, and by the user who observes the execution of the application and takes relevant decisions on its operation.
For workflow-based Grid applications, two aspects of performance monitoring are important: first, monitoring the status of the Grid services composing the workflow and the interactions between them; second, monitoring the internal performance of individual services. The latter aspect is already well addressed by approaches such as CrossGrid's OCM-G/G-PM.
Our focus in this paper is on monitoring applications built as a workflow of interacting Grid services. Support for such applications requires a completely new approach. New, grid-service-related metrics need to be introduced, e.g. the overhead due to communication between services or the computing-power utilization of a service.
We propose the following architecture for the monitoring system for workflow-based applications. First, the individual Grid services to be monitored should be extended with a monitoring interface via which monitoring information about a component can be retrieved. Additionally, this interface may also enable instrumentation or other manipulations of the target component. This implies that parts of the monitoring infrastructure must be integrated into the Grid services themselves. In addition to the local monitoring interfaces in each component, a global monitoring service should be available, itself a Grid service, via which it will be possible to extract global monitoring properties combining monitoring information from the set of components which are parts of a single application. An example of a global monitoring property is the "total communication volume between all components of an application".
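A global monitoring property of this kind can be illustrated with a small sketch; the interface and metric names here are hypothetical, not part of the proposed service:

```python
# Sketch: combining per-service metrics into a global monitoring property.
# MonitoredService and the "comm_bytes" metric name are illustrative.

class MonitoredService:
    """A Grid service extended with a local monitoring interface."""
    def __init__(self, name, metrics):
        self.name = name
        self._metrics = metrics  # e.g. {"comm_bytes": 1200}

    def get_metric(self, key):
        return self._metrics.get(key, 0)

def total_communication_volume(services):
    """Global property: total communication volume between all
    components of a single application."""
    return sum(s.get_metric("comm_bytes") for s in services)

services = [
    MonitoredService("svc-a", {"comm_bytes": 1200}),
    MonitoredService("svc-b", {"comm_bytes": 800}),
]
print(total_communication_volume(services))  # 2000
```

The global monitoring service would obtain the per-component values through the local monitoring interfaces rather than from in-process objects as shown here.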
Currently, we anticipate several clients of the monitoring system. First, a performance analysis and prediction tool may use the data to visualize the application behavior. Second, the workflow composition system might make use of the monitoring services for decision support. Finally, the information about resources could be used by systems that ensure fault-tolerant execution of the application and by scheduling systems.
- The Taverna Project http://taverna.sourceforge.net
- B. Balis, M. Bubak, W. Funika, T. Szepieniec, and R. Wismueller: Monitoring and Performance
Analysis of Grid Applications. In: P.M.A. Sloot et al., editors, Computational Science - ICCS 2003,
vol. 2657 of Lecture Notes in Computer Science, pages 214-224, St. Petersburg, Russia, June 2003.
Antonio Fuentes, Eduardo Huedo, Rubén S. Montero,
Ignacio M. Llorente
A Grid Scheduling Algorithm Considering Dynamic Interconnecting Network Quality
For certain application domains the traditional concept of computing based on a homogeneous, and
centrally managed environment is being displaced by Grid computing, based on the exchange of
information and the sharing of distributed resources by applications. The Globus toolkit has become a
de facto standard in Grid computing. Globus is a core Grid middleware that supports the submission of
applications to remote hosts by providing resource discovery, monitoring, resource allocation and job management.
However, application execution on Grids continues to require a high level of expertise due to its
complex nature. The user is responsible for manually performing all the job submission stages:
system selection and preparation, submission, monitoring, migration and termination. Moreover,
computational Grids are dynamic environments, characterized by highly variable conditions: high
failure rates and changing resource availability, performance and cost.
Therefore, adaptation to changing conditions is needed to achieve a reasonable degree of both
application performance and fault tolerance. In the present work, job adaptation is achieved by
implementing automatic application migration following performance degradation, the discovery of a
better resource, requirement changes, owner decisions or remote resource failure.
The most important step in job scheduling is resource selection, which in turn relies completely on
the information gathered from the Grid. Resource selection usually takes into account the
performance offered by the available resources, but it should also consider the quality of the
interconnecting network in terms of latency and bandwidth between the Grid resources. For example,
bandwidth is very important because the files involved in some application domains, such as particle
physics or bioinformatics, are very large. This is especially relevant in the case of adaptive job
execution, since job migration requires the transfer of large restart files between the compute
hosts. Therefore the quality of the interconnection network has a decisive impact on the overhead
induced by job migration.
In this work, we present a new job scheduling algorithm that takes the interconnecting network
quality into account by dynamically evaluating decisive communication parameters. Our scheduler
gathers dynamic information about remote resources and the network (bandwidth, latency, etc.) in
order to choose the best available resource for job submission. The final contribution will include
experimental results of the scheduling algorithm in a research testbed.
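The selection criterion described above can be sketched roughly as follows; the weighting of compute time against transfer time is an illustrative assumption, not the published algorithm:

```python
# Sketch: ranking Grid resources by estimated compute time plus the time
# to transfer input/restart files over the interconnecting network.
# The cost model below is illustrative, not the paper's algorithm.

def rank(resources, file_size_mb):
    """Return resources sorted best-first by estimated total time."""
    def est_total_time(r):
        # MB -> megabits (x8), divided by bandwidth in Mbit/s, plus latency.
        transfer = file_size_mb / r["bandwidth_mbps"] * 8 + r["latency_s"]
        return r["est_compute_s"] + transfer
    return sorted(resources, key=est_total_time)

resources = [
    {"name": "fast-cpu-slow-net", "est_compute_s": 100,
     "bandwidth_mbps": 10, "latency_s": 0.2},
    {"name": "slow-cpu-fast-net", "est_compute_s": 140,
     "bandwidth_mbps": 1000, "latency_s": 0.01},
]
# With a 1 GB restart file, the better-connected host wins despite
# its slower CPU.
print(rank(resources, file_size_mb=1000)[0]["name"])  # slow-cpu-fast-net
```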
Marian Bubak, Marcin Radecki, Tomasz Szepieniec
A Proposal of Application Failure Detection
and Recovery in the Grid
Large-scale, long-running applications are common in the Grid. The cost when such a job collapses is
high. What is more, the probability that a software or hardware component involved in the
application will fail during the computation is relatively high. Therefore, providing mechanisms for
fault detection, and procedures to preserve the job from total collapse, seems to be a must.
The failures that can occur have diverse characteristics. The source of a problem may lie in the
network, a node, the system configuration or the application itself. A problem may be transient or
permanent. The severity of a fault also differs; e.g. in a master-slave application the collapse of
a slave's node is not as dangerous as the collapse of the master process. Failures range from a
whole site being cut off to a decrease in performance due to an overloaded network link. Detailed
recognition of the fault's characteristics is crucial for choosing the most suitable recovery scenario.
For each kind of failure an appropriate recovery scenario should be worked out. The simplest
scenario is to ignore the failure if the danger is low. As a last resort we could kill and restart
the whole job. A more sophisticated scenario is, for example, the migration of some processes, or
restarting the application from a previous checkpoint. The right decision cannot be taken without
considering the application's programming paradigm since, for example, MPI-based applications allow
different recovery techniques than Grid-service-based ones.
A service that is to handle such broadly defined fault recovery should use all available methods to
monitor the Grid and to perform recovery actions on the application. The list of Grid services that
should be integrated in this activity includes application and infrastructure monitors,
checkpointing and migration services, schedulers and others. These services should be coordinated by
a fault recovery manager so as to profit from their combined abilities to support application fault
tolerance.
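The decision logic of such a fault recovery manager might be sketched as follows; the fault classes and recovery actions are illustrative assumptions, not a defined interface:

```python
# Sketch: mapping a recognized fault's characteristics to a recovery
# scenario, in the spirit of the scenarios described above.
# Fault attributes and action names are illustrative.

def choose_recovery(fault):
    """Pick a recovery scenario based on fault severity, kind and the
    role of the affected process."""
    severity, kind = fault["severity"], fault["kind"]
    if severity == "low":
        return "ignore"                    # simplest scenario
    if kind == "node" and fault.get("role") == "slave":
        return "restart-process"           # slave loss is tolerable
    if kind == "node" and fault.get("role") == "master":
        return "restart-from-checkpoint"   # master loss endangers the job
    if kind == "network":
        return "migrate-processes"         # move away from the bad link
    return "kill-and-restart-job"          # last resort

print(choose_recovery({"severity": "high", "kind": "node",
                       "role": "master"}))  # restart-from-checkpoint
```

In the proposed architecture the inputs would come from the application and infrastructure monitors, and the actions would be delegated to checkpointing, migration and scheduling services.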
- Bubak, M., Zbik, D., van Albada, D., et al.: Portable Library of Migratable Sockets. Scientific
Programming, Volume 9, Number 4, 2001, pp. 211-222
- Balis, B., Bubak, M., Funika, W., Szepieniec, T., and Wismueller, R.: An Infrastructure for Grid
Application Monitoring. In: Kranzlmueller, D., Kacsuk, P., Dongarra, J., Volkert, J. (Eds.): Recent
Advances in Parallel Virtual Machine and Message Passing Interface, Proc. 9th European PVM/MPI
Users' Group Meeting, Linz, Austria, September/October 2002, LNCS 2474, pp. 41-49, 2002
Aleksander Nowinski, Krzysztof Nowinski, Jaroslaw Pytlinski,
Piotr Bala, Krzysztof Benedyczak
Advanced Visualization Capabilities in the UNICORE
UNICORE has become one of the important Grid middleware systems.
The main advantage of the UNICORE environment is its powerful Graphical User Interface, which allows
for job preparation, job submission and job management.
The presented work builds into UNICORE capabilities for graphical postprocessing of results.
Dedicated extensions to the UNICORE client (plugins) have been developed. The graphical capabilities
were introduced in the form of a generic visualization plugin, as well as a plugin for postprocessing
the output of the quantum chemistry code Gaussian.
Vaidy Sunderam, Dawid Kurzyniec
Alternative Frameworks for Cooperative Distributed Computing
Computational resource sharing across multiple administrative
domains is gaining widespread use, driven by the benefits of
aggregation and service-based computing. This project, comprising
the Harness II and H2O systems, introduces a novel model for
cooperative fault-tolerant distributed computing that emphasizes
statelessness, (re)configurability, and interoperability. Sharing
resources or services across administrative boundaries
involves numerous technical challenges, notably those concerning
security and resource allocation. The H2O architecture is designed
to specify sharing relationships on a pairwise basis, thereby
localizing all constraints and agreements between any provider-client
pair and minimizing or even eliminating global distributed state.
Upon this fabric, the Harness II infrastructure serves as an
integrated platform for efficient high performance computing
by aggregating distributed resources.
Current distributed systems and grids are rigid, complex, and brittle.
The H2O and Harness II frameworks, through their minimal-state design
and component architecture that permits reconfiguration of resources as
needed, aim to be flexible and robust. The architectural model that
realizes this mode of distributed computing is described below.
The lowest layer in the Harness II framework is the component hosting
and interaction substrate termed H2O. This layer is characterized by a
lightweight but secure kernel, into which "pluglets" realizing different
functions are loaded - either by the hosting provider, authorized
clients, or third-party resellers. Examples of such pluglets are
high-speed transport modules, parallel programming libraries such as
FT-MPI, or specialized numerical solvers and application components.
Pluglets interact across kernels using a specially developed
communications layer called RMIX, which offers a well-understood and rich
method invocation interface while permitting the use of interoperable
transports. Its programming interface is based on the Remote Method
Invocation paradigm but is language neutral; features such as one-way
and asynchronous method invocations have been added to better support
the needs of high performance distributed computing. The RMIX layer also
ensures secure communications and has inbuilt support for resilience to failures.
Within H2O kernels, pluglets conform to a well-defined interface to
interact with the kernel, and are controlled by the kernel's security
policies and resource consumption restrictions. This design makes it
possible for clients to load domain- or platform-specific pluglets into
provider kernels to cater to application needs, while protecting
resource providers from damage or excess consumption of their resources.
Resource providers specify these constraints when launching kernels via
a portable schema that is XML-based, thereby permitting the publication
and discovery of resources using standardized mechanisms and tools - or
by using simpler and more appropriate means as the situation dictates.
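The pluglet/kernel relationship described above might be sketched as follows; H2O itself is Java-based, so this Python illustration and its names are hypothetical:

```python
# Sketch: a lightweight kernel hosting pluglets under a security policy,
# as described above. The Kernel/Pluglet names are illustrative, not the
# actual H2O API.

class Pluglet:
    """Pluglets conform to a well-defined interface to interact
    with the hosting kernel."""
    def start(self): ...
    def stop(self): ...

class Kernel:
    """A kernel into which pluglets are loaded, controlled by the
    kernel's security policy (here: who may load)."""
    def __init__(self, allowed_loaders):
        self.allowed_loaders = set(allowed_loaders)
        self.pluglets = {}

    def load(self, loader, name, pluglet):
        if loader not in self.allowed_loaders:
            raise PermissionError(f"{loader} may not load pluglets")
        self.pluglets[name] = pluglet

# The provider allows itself and authorized clients to load pluglets;
# anyone else is rejected, protecting the provider's resources.
k = Kernel(allowed_loaders={"provider", "authorized-client"})
k.load("authorized-client", "ft-mpi", Pluglet())
print(sorted(k.pluglets))  # ['ft-mpi']
```

Real H2O kernels additionally enforce resource-consumption restrictions, which this sketch omits.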
The companion poster to this abstract depicts the architectural and
philosophical foundations of the H2O and Harness II frameworks,
highlights the salient features of the resource sharing model, and
outlines prototype implementation and use-case scenarios.
Renata Slota, Darin Nikolow, Jacek Kitowski, Jerzy M. Zaczek
Architecture of the Virtual Storage System for Grid-based Accessing
One of the important problems in grid computing is the development of a
middleware layer which consolidates different kinds of national or
international resources. This is especially important when dealing with
data distributed among different locations; data management is therefore
essential for grid data access.
This paper describes the architecture of the virtual storage system (VSS)
for grid-based access, being developed as one of the tasks of the
SGIgrid project. In contrast to software developed to date, the
architecture is kept as simple as possible, easy to operate and maintain.
The virtual storage system is aimed at integrating the mass storage
facilities used in the computing centers taking part in the project.
HSM-type software is used in these computing centers to provide access to
mass storage hardware such as tape libraries and optical jukeboxes.
The architecture consists of the following main modules:
- VFM - Virtual File Manager, responsible for VSS session authorisation, managing the files stored
in the VSS, resolving virtual file names to physical replica instances, and negotiating data
transfer between the LFM and the client application,
- LFM - Local File Manager, which manages the physical files residing on HSM systems, transfers
data between the HSM system and the client application, and estimates the data access time for a
specified physical file,
- HSM - Hierarchical Storage Manager, the HSM software allowing access to data residing on tertiary
storage,
- MDB - Meta Database,
- OR - Optimization of Replicas, responsible for replica selection based on the criterion of
minimizing data access time,
- API - Application Programming Interface, allowing the client application to communicate with the VSS.
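The replica selection performed by the OR module, using access-time estimates such as those provided by the LFM, can be sketched as follows; the data structures are illustrative:

```python
# Sketch: OR-module replica selection minimizing estimated data access
# time. Replica records are illustrative; in the VSS the estimates would
# come from each site's Local File Manager.

def select_replica(replicas):
    """Pick the physical replica with the lowest estimated access time
    (e.g. a file cached on disk beats one that must be staged from tape)."""
    return min(replicas, key=lambda r: r["est_access_s"])

replicas = [
    {"site": "site-a", "medium": "tape", "est_access_s": 95.0},
    {"site": "site-b", "medium": "disk", "est_access_s": 4.2},
]
print(select_replica(replicas)["site"])  # site-b
```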
Several of our previous achievements are incorporated, such as access time estimation for tertiary
storage systems and index-based retrieval of video sequences. An overview of existing approaches to
the problem will be given in the paper.
- SGIgrid: Large-scale computing and visualization for virtual
laboratory using SGI cluster (in Polish), KBN Project, http://www.wcss.wroc.pl/pb/sgigrid/
- Nikolow, D., Slota, R., Dziewierz, M., Kitowski, J., Access Time
Estimation for Tertiary Storage Systems, in: Monien, B., Feldman, R.
(Eds.), Euro-Par 2002 Parallel Processing, 8th International Euro-Par
Conference Paderborn, Germany, August 27-30, 2002 Proceedings , no. 2400,
Lecture Notes in Computer Science, Springer, 2002, pp. 873-880.
- Nikolow, D., Slota, R., Kitowski, J., Nyczyk, P., Otfinowski, J., Tertiary Storage System for
Index-Based Retrieving of Video Sequences, in: Hertzberger, B., Hoekstra, A., Williams, R. (Eds.),
Proc. Int. Conf. High Performance Computing and Networking, Amsterdam, June 25-27, 2001, Lecture
Notes in Computer Science 2110, pp. 62-71, Springer, 2001.
ASKALON: A Tool Set for Cluster and Grid Computing
Lukasz Dutka, Jacek Kitowski
Automatic Application Builder for Grid Workflow Orchestration
Grid web services are direct corollaries of component architectures; in reality they are good
examples of the application of component ideas to large-scale systems. Thus, the selection of grid
web services can be treated as the selection of components deployed in a grid environment. However,
at present there are not many existing solutions for selecting components on the fly without human
assistance.
One of the rare examples of an expert system applied to semi-automatic workflow building is ,
although that work addresses decision making for commercial purposes. Other examples are presented
in [3-4].
In this paper the Automatic Application Builder is proposed for supporting the user in the selection
of workflow elements (services or program components). The selection of a component or a service
from the repository is based on a rule-based expert system that incorporates the requirements and
the state of the application being built. The rules are developed by a human expert. The advantages
of the system are its flexibility in development (since the components or services can be prepared
independently) and its reasoning about the choice of components or services based on knowledge of
the application requirements.
The proposed approach has already been successfully implemented for large-scale WWW-based
information systems as well as for optimization of data access in grid environments.
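The rule-based selection underlying the Automatic Application Builder might be sketched as follows; the rule form and component attributes are illustrative assumptions, not the system's actual rule language:

```python
# Sketch: rule-based selection of a workflow component from a repository,
# matching expert-authored component descriptions against the current
# requirements. Attribute names are illustrative.

def select_component(repository, requirements):
    """Return the name of the first component whose description
    satisfies every stated requirement, or None if nothing matches."""
    for component in repository:
        if all(component["provides"].get(k) == v
               for k, v in requirements.items()):
            return component["name"]
    return None

repository = [
    {"name": "solver-dense",  "provides": {"task": "solve", "matrix": "dense"}},
    {"name": "solver-sparse", "provides": {"task": "solve", "matrix": "sparse"}},
]
print(select_component(repository, {"task": "solve", "matrix": "sparse"}))
# solver-sparse
```

A real expert system would also weigh the application's state and rank competing candidates rather than take the first match.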
- C. Szyperski, Component Software: Beyond Object-Oriented Programming. ACM
Press and Addison-Wesley, New York, NY, 1998.
- C. Duvel, Establishing rule-based models to implement workflow within
construction organizations, PhD Thesis, UMI 9976532, University of Florida,
- C. Duvel, and R. R. A. Issa, The application of expert systems to controlling
workflow within the construction management environment, Artificial
Intelligence Applications in Civil and Structural Engineering, Eds. (B. Kumar
and B.H.V. Topping), Civil-Comp Press, 1999, pp. 61-72
- L. Dutka and J. Kitowski, Flexible Component Architecture for Information WEB Portals, in:
P. Sloot, D. Abramson, A. Bogdanov, J. Dongarra, A. Zomaya, Y. Gorbachev (Eds.), Proc. Computational
Science - ICCS 2003, Int. Conf. St. Petersburg Russian Federation, Melbourne Australia, June 2-4,
2003, LNCS, vol. 2657, 2003, pp. 629-638
- L. Dutka and J. Kitowski, Application of Component-Expert Technology for Selection of
Data-Handlers in CrossGrid, in: D. Kranzlmueller, P. Kacsuk, J. Dongarra, J. Volkert (Eds.), Proc.
9th European PVM/MPI Users' Group Meeting, September 29 - October 2, 2002, Linz, Austria, LNCS,
vol. 2474, Springer,
- L. Dutka, R. Slota, D. Nikolow and J. Kitowski, Optimization of Data
Access for Grid Environment, 1st European Across Grids Conference February,
13-14, 2003 Universidad de Santiago de Compostela, Spain, LNCS, Springer,
- K. Stockinger, H. Stockinger, L. Dutka, R. Slota, D. Nikolow, J. Kitowski,
Access Cost Estimation for Unified Grid Storage Systems, Supercomputing
2003 IEEE Conf., Nov. 2003, accepted
Ariel Garcia, Marcus Hardt, Yannick Patois, Ulrich Schwickerath
Collaborative Development Tools
Groups of software developers, especially when spread over different physical locations, need a whole range of tools: mailing lists, bug trackers, code versioning, website hosting, nightly builds, and more. The open-source world offers tools for every such requirement, but setting them up consumes manpower - usually a rare resource.
The aim of the savannah project is to provide a software framework running on one central server, managed by a few experts, providing the necessary tools to its users. All services are available with the same sign-on. Currently the following services are provided:
- General information page showing latest news, contact to project members and status about the project activity
- News service: Allows for posting news and for discussion about news
- Mailing forums: similar to Mailinglists but with a web based forum archive
- Download service: for file distribution. File collections can be dynamically created on a daily basis
- Bugtracker: a customizable bug tracking system similar to Bugzilla
- Support Trackers: providing a simple feedback function to the user community
- Patch Manager: for allowing anybody to suggest patches to the sourcecode
- Task Manager: to structure the project into subtasks that can depend upon each other
- CVS: management of the project's assigned CVS repository
- Autobuild: A tool for nightly builds of the code in the CVS repository
On http://gridportal.fzk.de one instance of savannah has been installed within the CrossGrid project. This server is dedicated to the support of Grid- and HEP-related software development.
V.N. Alexandrov and S. Mehmood Hasan
Collaborative Tools for the Grid
The efforts described in this paper are aimed at combining the capabilities of Collaborative Computing with those of Grid Computing. The work aims to complement efforts in Grid Computing by providing human-centred techniques and technologies for facilitating collaborative, computer-based cooperative work.
The notion is to provide interactive and real time visualisation, joint analysis of results and seamless access to data repositories. This cooperative work must be conducted within a coherent and inclusive collaborative environment. This can be achieved by the construction of a virtual work environment on multiple computer systems connected over the grid. The virtual work environment will use the Grid as its underlying infrastructure, using grid security mechanisms to authenticate participants in the collaboratory.
In this setting, participants interact with each other, simultaneously access and operate computer applications, refer to global data repositories or archives, collectively create and manipulate documents, perform computational transformations, and collaboratively visualise the results. The collaborative experience will be enhanced by providing integral support for human audio/video communication.
We will discuss recent work on grid enabled collaborative tools, and outline our plans to investigate and explore innovative enabling technologies to support collaborative, distributed, grid-based problem solving.
Commodity Computing Clusters - Next Generation Supercomputers?
Florian Schintke and Jan Wendler
Computational Fluid Dynamics in the Grid Using FlowGrid
We present the architecture of FlowGrid, a software package to enable Computational Fluid Dynamics
(CFD) applications in the Grid. FlowGrid revolutionizes the way CFD simulations are set up, executed
and monitored. In a network of Grid-enabled CFD centers across Europe, the development and validation
of software and knowledge for Grid-based CFD computations takes place. This CFD Virtual Organization
provides easy and flexible access to CFD resources for the industrial end users.
Computational Grids are ideal for CFD simulations since in general the computational resources planned
for such simulations become either insufficient or underutilized most of the licensed time. The primary
advantage of bundling such resources into a CFD Virtual Organization is the flexibility in providing
on-demand computational power. A special property of CFD simulations is the need for synchronous
communications between the subjobs, making it challenging to execute jobs on the Grid.
The FlowGrid architecture consists of:
- a user client called GenIUS, which performs the task partitioning and operates the Grid for the user
- the middleware FlowServe, which aggregates available resources, allocates and distributes job
requests to computing resources
- the backend, which executes the parallel CFD simulation code on clusters and high-performance computers
- a database that stores meta information about resource availability and costs
- the portal, which provides software to subscribers, allows resource providers to set policies and
prices, and lets the administrator manage the system manually
With GenIUS running on Microsoft Windows and FlowServe running on Linux, these two operating
systems are combined within a single Grid environment. We describe the protocol between GenIUS and
FlowServe, which also covers interfaces of FlowServe to other user frontends to allow an easy integration
of other CFD programs into the Grid.
With FlowServe, preliminary results are provided to the user during runtime, so that they can see
how the simulation converges and discover problems in the simulation as it runs.
Current non-Grid-aware CFD applications allow adaptations to the calculations during runtime; such
adaptations will also be supported by FlowServe.
Marian Bubak, Michal Turala
CrossGrid Project in Its Halfway: Achievements and Challenges
Piotr Nyczyk, Andrzej Ozieblo, Marcin Radecki
CrossGrid Testbed Cluster at ACK CYFRONET AGH
The testbed cluster at ACK Cyfronet AGH in Kraków is a part of the CrossGrid Project testbed network, which involves 16 partners from 9 countries across Europe. Our testbed hardware is based on a rack cluster of 1U Intel dual-processor units produced by RackSaver Inc. The initial configuration of four Intel dual P III nodes has been augmented by 23 dual-Xeon units with two 2.4 GHz Xeon processors, 1 GB memory, a 40 GB disk and 1000 Mb Ethernet ports on each node. Communication between nodes is assured by an HP switch with forty 100 Mb ports and three 1000 Mb uplink ports. Currently the bandwidth of the connection to the national research network is 622 Mbit/s, but it will be increased to 10 Gbit/s in a few months. A dedicated KVM (keyboard, mouse, monitor) 1U unit, incorporated into the cluster rack, is used for monitoring all elements. Disk space has been increased by additional 640MB 1U units (Quardian 4400) and a 4 GB 4U disk array. 10 additional dual-Xeon nodes will be added in a few weeks.
Currently, we are running EDG 1.4 alongside the separately-installed LCG-1 testbed software, with Globus 2 for Grid services and OpenPBS for scheduling. Installation is automated through the LCFGng Installation and Configuration System, with configuration profiles synchronized with a central repository. Several software packages have been installed: ATLAS software for HEP applications, Gaussian, Mathematica and a development environment (C/C++, Java, Fortran). For monitoring the entire cluster, the GANGLIA distributed monitoring system with additional temperature sensors is used.
Marian Bubak, Tomasz Gubala, Maciej Malawski and Katarzyna Rycerz
Design of Distributed Grid Workflow Composition System
The Grid is a complex environment with many geographically distributed resources which may be
connected together in order to execute Grid applications. These applications consist of many
independent and possibly heterogeneous modules, connected together to achieve the required
functionality. Discovering and joining such elements, distributed throughout a vast and frequently
changing Grid environment, can be difficult.
We propose a novel design for a Grid application workflow composition system, intended to support
the user and other systems.
First, we briefly describe our previous prototype solution to the workflow composition problem, the
Application Flow Composer (AFC) system. A brief description of the system architecture and internal
mechanisms is followed by a discussion of the advantages and disadvantages of this first approach.
Taking these into consideration, we propose a new composition system based on the Grid services
concept. It consists of a fully distributed registry storing descriptions of available services,
including information about service semantics. The second part is a semi-automatic composition
system (temporarily called AFC2) which uses the registry to generate the Grid workflows requested by
the user. The AFC2 system is based on peer-to-peer technology with a mechanism of weak peer
migration. The concept of peer specialization enables more efficient service discovery and
introduces the basis for a system learning capability.
The main part of the paper presents the proposed design of the new system and describes it at
various levels of abstraction. Starting from conceptual diagrams, the discussion moves to a more
detailed description, showing both static and dynamic aspects of system behavior. Afterwards, we
discuss possible technologies which can be applied to implement the system, accompanied by a list of
each technology's advantages. We conclude with a discussion of the expected improvements the new
system should demonstrate.
- Bubak, M., T. Gubala, M. Malawski, K. Zajac: Automatic Flow Building for Component Grid
Applications. Presented at PPAM 2003 Conference, Czestochowa, Poland, September 7-10, 2003, to be printed
Andreas Hoheisel and Uwe Der
Dynamic Workflows for Grid Applications
There are several approaches in the Grid computing community to execute not only single tasks on
single Grid resources but also to support workflow schemes that enable the composition and execution
of complex Grid applications. The most commonly used workflow model for this purpose is the
Directed Acyclic Graph (DAG). DAGs have a very simple structure and are easy to use; they possess,
however, two relevant disadvantages: they do not support bidirectional coupling and it is not possible
to explicitly define loops.
Within the establishment of the Fraunhofer Resource Grid, we developed a Grid Job Definition
Language (GJobDL) that is based on the concept of Petri nets instead of DAGs. Petri nets are graphical
representations of the workflow of discrete systems. In contrast to DAGs, which only describe the
dynamical behaviour of the system, Petri nets also describe the system's state. The type of Petri
net we introduce here corresponds to the concept of Petri nets with individual tokens (coloured
Petri nets) and constant arc expressions.
The Grid Job Definition Language is used to describe the workflow of a Grid application on an
abstract level. This description is independent of the Grid infrastructure and defines the
relationships between the software components (transitions) and the data (places). Transitions can
be annotated with conditions that depend on the tokens moving along the arcs of
the Petri net. During the workflow execution, the abstract workflow must be concretized in order
to be mapped onto the real Grid environment. This requires dynamic completion of the workflow based
on actual information. It may be necessary to introduce new tasks - such as data transfers,
deployment of software, authorization requests, and data retrievals. These tasks can be represented
by sub-Petri nets that replace parts of the existing Petri net during runtime of the Grid application.
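The firing of transitions in such a workflow Petri net can be illustrated with a minimal interpreter; this is a simplification for illustration, not the GJobDL implementation, and it ignores coloured tokens and arc expressions:

```python
# Sketch: workflow execution as Petri-net transition firing. A transition
# (software component) fires when all of its input places (data) hold
# tokens; firing consumes input tokens and produces output tokens.

def fire_ready(marking, transitions):
    """Sweep the transitions once in order, firing each one that is
    enabled; return the names of the fired transitions."""
    fired = []
    for t in transitions:
        if all(marking.get(p, 0) > 0 for p in t["in"]):
            for p in t["in"]:
                marking[p] -= 1
            for p in t["out"]:
                marking[p] = marking.get(p, 0) + 1
            fired.append(t["name"])
    return fired

marking = {"input-data": 1}
transitions = [
    {"name": "preprocess", "in": ["input-data"],  "out": ["staged-data"]},
    {"name": "simulate",   "in": ["staged-data"], "out": ["results"]},
]
fired = fire_ready(marking, transitions)
print(fired)  # ['preprocess', 'simulate']
```

Dynamic workflow completion then corresponds to splicing additional transitions (e.g. a data-transfer step) into this net before the dependent transition becomes enabled.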
Only a few Grid initiatives include advanced fault management. Mostly, fault management is
predefined implicitly by the Grid architecture and results in re-scheduling, recovery or migration
of single tasks in case of a fault. We propose a concept for fault management of entire job
workflows, explicitly modelling the fault management within the workflow model. This can be done in
a user-defined way, or automatically by introducing new tasks enabling fault management, based on
predefined fault management strategies.
Andreas Gellrich, Jacek Nowak, Maxim Vorobiev
EDG 1.4 RPM Package Installation on DESY Linux 4
(SuSe 7.2) Machines
The workstations at DESY (Deutsches Elektronen-Synchrotron) run DESY Linux v.4, a customized version
of SuSE Linux v.7.2. It uses AFS user accounts, and most DESY libraries and applications are also
shared through AFS. This customized Linux distribution has been used as the base for an EDG 1.4
testbed installation. The installation was performed manually, following the "EDG installation
guide". Binary RPM packages from the official EDG 1.4 distribution were used. As the only officially
supported platform for EDG 1.4 is RedHat Linux 6.2, many small modifications to the installation
procedure were necessary and many problems had to be fixed. The testbed at DESY proves that it is
possible to install all the main Grid nodes (WN, CE, SE, RB, BDII, RC, UI) on Linux flavors other
than the officially supported one.
In this paper we would like to present the main issues which arise when trying to install EDG 1.4
on DESY Linux v.4, and how to overcome them.
Katarzyna Rycerz, Marian Bubak, Maciej Malawski, Peter Sloot
Execution Support for HLA-based Distributed Interactive Applications
This paper presents the design of a system that supports execution of HLA [1] distributed interactive simulations in an unreliable Grid environment.
The design of the architecture is based on the OGSA [2] concept that
allows for modularity and compatibility with Grid Services already being developed. First of all,
we focus on the part of the system that is responsible for migration of an HLA-connected component or
components of the distributed application in the Grid environment.
The preliminary results can be found in [3]. We present a runtime support Migrator Library (ML) for
easily plugging HLA simulations into the Grid Services Framework. We also present the impact of
execution management (namely migration) on overall system performance.
As HLA [1] is explicitly designed to support interactive distributed simulations, it provides
various services needed for that specific purpose, such as time management useful for time-driven or
event-driven interactive simulations. It also takes care of data distribution management and allows
all application components to see the entire application data space in an efficient way. On the other
hand, the HLA standard does not provide automatic setup of HLA distributed applications. In HLA there
is no mechanism for migrating federates according to the dynamic changes of host loads or failures,
which is essential for Grid applications. In our opinion, the OGSA [2] concept provides a good
starting point for building and connecting independent blocks of different functionality of the
HLA execution management system.
Our solution introduces HLA functionality to the Grid Services framework extended by specialized
high-level Grid Services. This allows for execution control through Grid Service interfaces, while the internal control and data of distributed interactive simulations flow through HLA. The design also
supports migration of federates (components) of HLA applications according to environmental conditions.
In the full version of the paper we also present performance results of migration.
This research is partly funded by the European Commission under the IST-2001-32243 Project "CrossGrid".
[1] HLA specification, http://www.sisostds.org/stdsdev/hla/
[2] Foster I., Kesselman C., Nick J., Tuecke S.: The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. Open Grid Service Infrastructure WG, Global Grid Forum, June 22, 2002
[3] Zajac, K., Bubak, M., Malawski, M. and Sloot, P.: Towards a Grid Management System for HLA-based Interactive Simulations. To appear in: Proceedings of the 7th IEEE International Symposium on Distributed Simulation and Real-Time Applications, 2003
Jaroslaw Wypychowski, Krzysztof Nowinski, Piotr Bala
Generic Plugin - New Concept for Developing Users Interfaces in Grid
UNICORE has become one of the important grid middleware systems. The main advantage of the UNICORE environment is its powerful Graphical User Interface, which allows for job preparation, job submission and job management. These generic features can be extended using the plugin concept, which allows for the creation of application-specific interfaces adjusted to the users' needs. However, plugin development is still time-consuming and requires significant knowledge of Java. We have proposed and developed a generic plugin which allows for the creation of a typical graphical user interface for an application based on an XML definition. This allows even inexperienced users to build a powerful application interface in the grid environment.
Peter Praxmarer, Paul Heinzlreiter, Dieter Kranzlmueller
GMF: A Framework for Module Management on the Grid
Utilization of grid environments often requires parallel and distributed programming to solve a single
but large-scale problem. In the traditional approach the workload of different modules is distributed
over various heterogeneous grid resources, which are then interconnected in some kind of pipeline
or graph structure. The basic functionality to achieve this distribution and interconnection is provided
by the Globus Toolkit [2], which includes software for security, information infrastructure,
resource management, data management, communication, fault detection, and portability. While the
functionality of Globus is fundamental for the application of grid environments, its low-level interface
requires substantial knowledge and effort to utilize it in applications.
The Grid Management Framework GMF addresses this problem by providing a higher level of abstraction on top of Globus services. The aim of GMF is to provide a framework that encapsulates common tasks necessary for building modules within an object-oriented framework, and to instantiate these objects as modules within a configurable pipeline or graph structure over the grid. The main benefit of GMF is thus that the application programmer is freed from the tedious and error-prone tasks necessary to use Globus, and can instead focus on the actual grid-enabled application.
The functionality of GMF is provided by two kinds of modules, an arbitrary set of worker modules and
a master module. The worker modules implement user-defined functionality for solving the application
tasks. In addition, they receive control events from the master module such as start, stop, accept a
connection, connect to another module, perform a user-defined checkpoint (if supported) and migrate
to another resource. Each of these worker modules is instantiated through GRAM by the master module, which is thus capable of generating the worker pipeline (or graph) and controlling its operation. The commands from the master module are forwarded to the worker modules via a separate control connection, which is established at module startup.
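The control protocol between master and workers might be sketched as follows. This is a hedged illustration of the event handling described above, not the GMF API: the class names, command strings and checkpoint format are all hypothetical (the real system dispatches over GRAM-spawned processes and sockets, not in-process calls).

```python
# Illustrative sketch of a master/worker control-event protocol.

class WorkerModule:
    def __init__(self, name):
        self.name = name
        self.running = False
        self.peers = []          # modules this worker is connected to
        self.checkpoint = None

    def handle(self, command, arg=None):
        """Dispatch a control event received from the master."""
        if command == "start":
            self.running = True
        elif command == "stop":
            self.running = False
        elif command == "connect":       # connect to another module
            self.peers.append(arg)
        elif command == "checkpoint":    # user-defined checkpoint
            self.checkpoint = {"peers": list(self.peers)}
        elif command == "migrate":       # hand back state for restart
            self.running = False
            return self.checkpoint
        return None

class Master:
    """Instantiates workers (via GRAM in the real system) and drives
    them through their control connections."""
    def __init__(self):
        self.workers = {}

    def spawn(self, name):
        self.workers[name] = WorkerModule(name)

    def send(self, name, command, arg=None):
        return self.workers[name].handle(command, arg)

m = Master()
m.spawn("filter"); m.spawn("render")
m.send("filter", "start")
m.send("filter", "connect", "render")   # build the pipeline
m.send("filter", "checkpoint")
state = m.send("filter", "migrate")     # state restorable on a new host
```

The point of the sketch is the separation of concerns: workers only implement `handle`, while the master owns pipeline construction and lifecycle control.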
In addition to the basic functionality of GMF, various other parts of the Globus-API are encapsulated
in the object-oriented C++ framework with integrated error handling (for services such as GlobusIO, GlobusFTPClient, GlobusCommon, GlobusGRAMClient). The error handling procedures can be easily exchanged
on a per operation basis to meet the user requirements. Besides that, GMF enriches the functionality
of GlobusIO with multiplexed connections and a buffered I/O mode to increase the throughput of I/O
connections. Furthermore, the module framework of GMF is designed to support user-defined checkpoints
that allow migration of a module within a heterogeneous computing environment.
The current version of GMF is an integral part of the Grid Visualization Kernel (GVK) [1]. The idea
of GVK is to build a visualization pipeline over a set of grid resources, and thus enables scientific
visualization on the grid. In addition to GVK, the utilization of GMF is also investigated for the
distributed program analysis environment DeWiz, which delegates the analysis of program state data
to different modules on the grid.
The work described in this abstract is partially supported by the EU CrossGrid Project under contract number IST-2001-32243. We highly appreciate the contributions of our colleagues at GUP Linz.
[1] Paul Heinzlreiter, Dieter Kranzlmueller, "Visualization Services on the Grid", Parallel Processing Letters (PPL), Vol. 13, No. 2, pp. 135-148 (June 2003)
[2] Ian Foster, Carl Kesselman, Steven Tuecke, "The Anatomy of the Grid: Enabling Scalable Virtual Organizations", International J. Supercomputer Applications, 15(3), 2001
Thilo Ernst, Jochen Wauer
Grid Content Evolution and Grid Content Management
A complex distributed Grid-based Science Portal, or a similar platform in which data and model resources from various authors/organizations are integrated on an ongoing basis over substantial time horizons to be offered for shared use in virtual organizations, cannot realistically be considered a "static" system that is "finished" at some defined point in time.
Of course, in projects developing such platforms, designated deliverables and milestones must be
committed to in order to control project progress, make sure demonstrations and pilot operations can
be carried out, etc.
However, in order to be successful beyond the start phase, such platforms also need to offer strong support for further evolution: integration of new data sources and models, versioning, etc.
The VirtualLab project, an ongoing collaboration of Fraunhofer FIRST and DLR (German Aerospace Center)
in which a specific Science Portal has been built and which in the next version will fully rely on
Grid technology, provided valuable insights here.
DLR primarily views this platform as a new technology-transfer channel for identifying and trying out "external application potential" for in-house developed scientific software. Such software is produced and improved at DLR (and at similar author/vendor organizations) not in isolated, infrequent activities but rather on a continuous, day-to-day basis. Unsurprisingly, end users who can easily access this software (and data resources) remotely from their desktops do not expect outdated versions.
It should thus be as easy as possible to integrate new or improved models and data resources into the platform. If this is not easy enough (that is: possible for people with basic programming skills but without a substantial Grid or Web development background), the platform risks "starvation" in the long run.
For these reasons, we are carefully studying the Grid resource integration process in ongoing relevant projects and aim at designing adequate support for this process in the next generation of VirtualLab.
Strong parallels to traditional web content management are expected here - just that the concept of
"content" now extends beyound the traditional interpretation of hypertext and multimedia documents to
cover general data and executable resources as well.
Grid Infrastructure Monitoring and Management - Lessons from the EU DataGrid and GridLab Projects
Grid Performance, Grid Benchmarks, Grid Metrics
Grids, as an emerging infrastructure for novel ways of computation, raise various theoretical and technical questions. Despite intensive research work, these questions are sometimes not even articulated. Is there a common understanding of what grids are? Is there an accepted definition of grid performance? Are traditional performance analysis techniques adequate for grids? The presentation points out problems and questions related to performance analysis in the framework of the new computing paradigm and proposes a new scenario for benchmarking.
Sergio Andreozzi, Antonia Ghiselli, Cristina Vistoli, Sergio Fantinel, Gennaro Tortone, Natascia De Bortoli
GridICE: a Monitoring Service for the Grid
The Grid is a new paradigm of distributed computing that enables the coordination of resources and
services not subject to centralized control. These resources may span multiple administrative domains,
machine architectures, and software boundaries.
The management of this complex system, which is distributed by nature, has to deal with the heterogeneity of the resources and the decentralization of their ownership. An appropriate organization type that can monitor and control a multi-institutional grid is under investigation. Such an organization is called a Grid Operation Center (GOC); its characteristics, capabilities and use cases are being designed. With this paper we want to present to the grid community our research and development results in the area of grid monitoring infrastructure for Grid Operation Centers. Our work benefits from our experience within the WorldGrid event, from the information modeling of grid services related to the GLUE Schema, and from a close collaboration with the LHC Computing Grid (LCG) Project.
We define Grid Monitoring as the activity of measuring significant grid-resource parameters in order to analyze the usage, behavior and performance of the grid, and to detect and notify fault situations, contract violations, and user-defined events.
The current outcome of our activity is a monitoring infrastructure called GridICE. In its first release, we have privileged aspects such as easy integration with the current production grid middleware, and modularity of the components with respect to the separation-of-concerns design principle. The current architecture is structured in six layers, from the initial producers of monitoring data to the final consumers of monitoring information.
The first layer is the measurement service, whose task is to probe resources for the simple/composite metrics defined in the information model. These metrics mostly refer to quality aspects. The second layer is the publisher service, whose task is to offer the gathered data to potential consumers. Our decision was to rely on the available grid information service, that is the Globus MDS 2.x, an LDAP-based solution for access to distributed time-sensitive data. Advantages of this choice are the availability of a distributed query engine and a standard interface to data of different resources. The drawbacks are the need for continuous polling (no event-based data delivery is implemented) and the absence of persistent storage for historical data. With proper design choices, we have managed to mitigate the former limitation, while we solved the latter by introducing a data collector service in the fourth layer. This service is provided with a self-discovery feature that can automatically detect new grid services and configure them to be properly monitored. The fifth layer is a set of two services: a detection and notification service, providing a flexible and configurable means for event detection and notification actions, and a data analyzer service, providing performance analysis, usage levels, and general reports and statistics. The sixth layer is the presentation service, a web-based graphical user interface that offers concise monitoring information. This is designed on a role-based strategy, providing different views depending on the type of information consumer.
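To make the layered flow concrete, the following toy sketch wires together minimal stand-ins for the measurement, publisher, collector, detection/notification and presentation services. Everything here is a hypothetical illustration: the function names, the hostname and the load metric are ours, and the real system works over MDS/LDAP, not in-process calls.

```python
# Toy sketch of the layered monitoring flow:
# probe -> publish -> collect -> detect -> present.

history = []          # data collector: persistent store for history
alerts = []           # detection & notification service output

def probe():          # measurement service: sample a simple metric
    return {"host": "ce01.example.org", "load": 3.2}

def publish(sample):  # publisher service: expose data to consumers
    return dict(sample)

def collect(sample):  # data collector: keep the history MDS lacks
    history.append(sample)

def detect(sample, threshold=2.0):  # raise user-defined events
    if sample["load"] > threshold:
        alerts.append(f"high load on {sample['host']}")

def present():        # presentation service: concise, role-based view
    return {"samples": len(history), "alerts": list(alerts)}

s = publish(probe())
collect(s)
detect(s)
view = present()
```

Note how the collector layer is what gives the pipeline history: the publisher alone (like MDS) only ever sees the current sample.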
The first release of GridICE has already been selected to monitor several multi-institutional grids. The first large deployment started with the LCGpre1-CMS grid. Subsequently, it was selected for LCG1 grid monitoring in both the testing and deployment testbeds. Finally, it has been deployed for the Italian Grid of Sciences.
Our future work will focus on the consolidation and improvement of the current infrastructure. Moreover,
new solutions to overcome the aforementioned limitations will be investigated.
Holger Marten, K.-P. Mickel
GridKa – the German Regional Tier-1 Computing Centre: Status, Strategy
and Future Plans
The Grid Computing Centre Karlsruhe, GridKa, will be one of the 10-15 largest computing centres within the LHC Computing Grid Project. In 2001, 40 groups of German particle physicists initiated the construction of GridKa as a German Regional Tier-1 Centre in the LHC framework, and the official inauguration of GridKa took place at the end of October 2002. Today, GridKa already hosts 460 Linux processors, 110 TeraBytes of usable disk space and 170 TeraBytes of tape storage. Part of these resources are made available to already running high energy physics experiments: BaBar at SLAC, CDF and D0 at FermiLab and Compass at CERN will generate a data volume of about 1/10 of LHC during the next years. They are optimal candidates to test and validate the scaling of the hard- and software infrastructure. In 2007 - at the startup of LHC - GridKa will host about 4000 processors, 1500 TeraBytes online and 3800 TeraBytes tape storage, and co-operate via a few dozens of Gigabit connections with hundreds of other grid installations worldwide. The paper summarizes the organizational structure of GridKa, current strategies for the infrastructure setup, and outlines some of the related R&D projects.
Ladislav Hluchy, Ondrej Habala, Branislav Simo,
Jan Astalos, Viet D. Tran, Miroslav Dobrucky
Grid-based System for Flood Forecasting
This paper presents our experience with, and the current status of, a Collaborative Problem Solving Environment for Flood Forecasting under development as a part of the IST CROSSGRID project. Over the past few years, floods have caused severe damage throughout the world, and much of Europe has been heavily threatened. Therefore, modeling and simulation of flood forecasting, in order to predict floods and to take the necessary preventive measures, has become a very important matter. The environment described here uses Grid technology to interconnect the experts, data and computation resources needed for quick and correct flood management decisions. At the core of the system lies a coupled set of simulation models used to predict precipitation and temperature, hydrological river status, and hydraulic events in target areas. The environment and its web-based interface also provide basic communication tools, enabling its users to cooperate. A Virtual Organization for Flood Forecasting using this environment may consist of several cycle providers, storage providers, end users, experts and developers.
Forecasting of flood events requires quantitative precipitation forecasts as well as forecasts of temperature (to determine snow accumulation/melting). The system makes use of the ALADIN/SLOVAKIA model. ALADIN is a LAM (Limited Area Model) developed jointly by Meteo France and cooperating countries. In the next stage, we use several hydrological simulation models; which model is applied depends on the conditions, situation and territory, and they can also be used in combination. For hydraulic predictions, FESWMS (Finite Element Surface-Water Modeling System) Flo2DH is used, a 2D hydrodynamic, depth-averaged, free-surface, finite element model. Flo2DH computes water surface elevations and flow velocities for both super- and sub-critical flow at nodal points in a finite element mesh representing a body of water (such as a river, harbor, or estuary). Simulation of floods is very computationally expensive: several days of CPU time may be needed to simulate large areas. For critical situations, e.g. when a coming flood is simulated in order to predict which areas will be threatened and to take the necessary preventive measures, long computation times are unacceptable. Therefore, FESWMS Flo2DH was parallelized in order to achieve better performance.
The storage space for simulation outputs and direct measurements used by the application is provided by II SAS. Hourly outputs of meteorological simulation, hydrographs provided by the hydrological part of the cascade and selected hydraulic outputs will be stored. The storage will also hold configuration files for the simulations and some other resources, needed to operate the application. The stored files are accessible through standard Grid tools used in the CrossGrid testbed. We are also working on a common description scheme for these files and a way to store the metadata in a Grid-aware database system. The metadata structure will include detailed information about origin of the file, time of its creation, the person who actually created it, etc. In case the file is the output of a simulation, the metadata will also contain names of the input files, model executable and configuration files.
GRIP: Creating Interoperability between Grids
Roger Menday and Philipp Wieder
GRIP: the Evolution of UNICORE Towards a Service Oriented Grid
The current UNICORE software implements a vertically integrated Grid architecture providing seamless
access to various resources within different Virtual Organizations. The software is deployed and
developed by companies, research and computing centres and projects throughout Europe coordinated by
the UNICORE Forum (http://www.unicore.org).
Interoperability between two different Grid infrastructures, UNICORE and Globus, enlarges the range of resources and services available to each system, and was the motivation for the Grid
Interoperability Project (GRIP, funded in part through EC grant IST-2001-32257). GRIP designed and
implemented an interoperability layer, the capabilities of which have been demonstrated at conferences
and workshops. In addition the project is contributing to the standardization efforts within the
Grid community by participating in or leading Global Grid Forum activities.
With the advent of the Open Grid Services Architecture (OGSA) and - Infrastructure (OGSI) and the
increasing usage of Web Services for the operation of Grids, the focus of the project changed. One
of the benefits Web Services bring to Grid computing is the concept of loosely coupled distributed
services. Merging the idea of "everything being a service" with the achievements of the Grid community
led to Grid Services, enabling a new approach to the design of Grid architectures. The adoption of XML
and the drive for standardisation of OGSI compliant protocols provide the tools to move closer to
the promise of interoperable Grids. An early demonstrator validated the correspondence of
UNICORE's architectural model with the OGSA/I approach and encouraged GRIP to shift its efforts to
start the development of an OGSA/I compliant Grid based on the UNICORE architecture.
In this paper we discuss UNICORE as an example of the evolution of a Grid system towards a service-oriented Grid, primarily focussing on architectural concepts and models. Based on the current
architecture and the enhancements provided by GRIP, we depict first steps already taken to integrate
Web - and Grid Services into UNICORE. This includes the provision of OGSI compliant port types parallel
to the proprietary interfaces as well as the design of XML based protocols. Furthermore we present
the roadmap taken by GRIP to achieve a consistent development towards an OGSA implementation. In
addition to the GRIP related achievements and plans we report on the current status of the OGSA/I
standardisation. We also consider the evolving Web Service standards and relate them to the UNICORE architecture, particularly regarding the recent developments towards a service-oriented Grid.
GRMS: GridLab Resource Management System
Kazimierz Balos, Leszek Bizon, Michal Rozenau, Krzysztof Zielinski
Interoperability Architecture for Grid Networks Monitoring Systems
Grid networks are computing environments in which resource brokering and load balancing require a reliable monitoring system, with interfaces adequate to the area in which they operate and with security mechanisms assuring that confidential data are transferred in a secure and efficient manner. Considering typical grid networks consisting of over a dozen clusters, each with several worker nodes, there is also a need for an efficient way to install, run and maintain such a monitoring system.
The aim of this study is to create monitoring-system interfaces suitable for distributed and heterogeneous environments, especially for clusters and grid networks. The study looks at the development of a scalable and easy-to-maintain system that can be used to expose monitored parameters, such as network traffic and the availability of nodes' infrastructure resources, to external applications for further processing.
This paper covers the topic of hierarchical information aggregation, both in the context of the local area networks in which the computing elements of clusters work, and in the context of the wide area networks in which clusters cooperate and an appropriate protocol for information interchange must be chosen. The approach shown in this article presents current achievements in using Sun's early implementation of the Java Management Extensions (JMX) technology for communication at the cluster level, and Web Services with the SOAP protocol for communication at the grid-network level. It also covers dynamic registration of monitored stations using an open implementation of discovery services, which can be used in all environments under an Open Source license.
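The two-level aggregation described above might be sketched as follows. `NodeAgent`, `ClusterCollector` and `SoapGateway` are hypothetical stand-ins for illustration only; the real system exposes node attributes as JMX MBeans and the grid-level interface as a SOAP Web Service, not plain Python classes.

```python
# Toy sketch of two-level aggregation: per-node agents (JMX level),
# per-cluster collectors, and a grid-level gateway (SOAP level).

class NodeAgent:                      # one per worker node
    def __init__(self, host, load):
        self.host, self.load = host, load
    def attributes(self):             # what an MBean would expose
        return {"host": self.host, "load": self.load}

class ClusterCollector:               # aggregates one cluster's nodes
    def __init__(self, name, agents):
        self.name, self.agents = name, agents
    def summary(self):
        loads = [a.attributes()["load"] for a in self.agents]
        return {"cluster": self.name,
                "nodes": len(loads),
                "avg_load": sum(loads) / len(loads)}

class SoapGateway:                    # grid-level interface
    def __init__(self, collectors):
        self.collectors = collectors
    def get_grid_state(self):         # one call instead of N node queries
        return [c.summary() for c in self.collectors]

gw = SoapGateway([
    ClusterCollector("zeus", [NodeAgent("n1", 0.5), NodeAgent("n2", 1.5)]),
])
state = gw.get_grid_state()
```

The design choice illustrated here is that wide-area consumers see only cluster-level summaries, keeping per-node chatter inside the local network.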
Finally, we present the results of a SOAP Gateway implementation, which can be used for comparison with other existing monitoring-system interfaces. We report on long-term use of such a system, including automated installation on a number of nodes, gathering information through the RMI and SOAP protocols, and software maintenance, including development, debugging and upgrading to newer versions of modules. We also present results on the performance of the monitoring system and its impact on the monitored stations.
The monitoring system presented in this study is adequate where a security infrastructure for SOAP traffic encryption has been deployed. Although this study does not present a security subsystem for this purpose, it is possible to use existing technology coherent with Web Services, in the form of SOAP request interceptors, as covered in the last section.
Witold Alda, Tomasz Wojtowicz, Piotr Bys, Michal Gabor,
Dariusz Gocol, Jacek Kitowski
Java Applications for Web-based Visualisation
of Biological Data
We present three Java applications for the visual presentation of biological data, developed within the PROGRESS project. All programs can be loaded as applets from a given address, but they can also be run as applications under control of the Migrating Desktop, one of the interface tools of the PROGRESS portal. Similarly, the data can be taken directly from a given address, or from the database available through the portal. The first program is designed for simple 3D visualisation of biomolecules: proteins and DNA structures. The 3D graphics is based on the Java3D library. Special attention is paid to details which can help in the visualisation and analysis of secondary structures. The second program is used for two-dimensional visualisation of the results of genome assembly calculations. The third program shows the structure of phylogenetic trees generated on the basis of evolution data.
Master-Worker Workflow Management for Distributed Biomedical
Applications in the GRID
GRID middleware provides basic services and foundation frameworks for
building global environments for scientific computing. Low level
issues are addressed such as security, virtual organizations, data
replication, job submission and execution. The DIANE framework (http://cern.ch/diane) is a generic workflow manager for distributed master-worker applications, which builds on top of the existing GRID middleware, providing high-level facilities and idioms for application development and deployment. DIANE may be easily deployed into a concrete GRID environment such as the Globus Toolkit, or used in standalone clusters with popular workload management systems such as LSF or PBS.
DIANE is a callback framework which controls the job execution,
creates tasks and workers, passes data messages between the master and
the workers and finally integrates the task output. Applications do
not have to open communication channels explicitly -- this setup is
done automatically. Fixing the master-worker parallel computation model limits the generality of the applications but enormously increases the flexibility of the framework itself, in a way completely transparent to applications. Runtime flexibility allows switching between in-process application loading based on shared libraries and IPC-based application execution based on statically linked executables. Both setups have practical implications, and DIANE makes it convenient to choose the appropriate one.
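A callback framework of this kind can be sketched in a few lines. This is a hedged illustration of the master-worker pattern described above, not the actual DIANE API: the callback names (`split`, `work`, `integrate`) and the sequential `run` driver are assumptions made for the sketch.

```python
# Sketch of a master-worker callback framework: the application
# supplies callbacks, the framework owns task creation, dispatch
# and result integration.

class Application:
    """User code supplies callbacks; the framework does the plumbing."""
    def split(self, job):                 # master side: create tasks
        return [job * i for i in range(1, 4)]
    def work(self, task):                 # worker side: process one task
        return task * task
    def integrate(self, results):         # master side: combine output
        return sum(results)

def run(app, job):
    # The real framework would dispatch tasks to remote workers over
    # automatically created channels; here they run sequentially.
    tasks = app.split(job)
    results = [app.work(t) for t in tasks]
    return app.integrate(results)

total = run(Application(), 2)   # tasks [2, 4, 6] -> results [4, 16, 36]
```

Because the application never opens a communication channel itself, the same callbacks could be driven in-process or over IPC, which is exactly the runtime flexibility argued for above.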
The architecture of DIANE is based on a component-container object model, which allows the creation of various Application Adapters to extend the framework easily. The Application Adapters provided by default support Python and C++ bindings, as well as mixing the two in a single job. The paper presents the basic architecture and design principles of DIANE, and benchmark results of a distributed simulation in biomedical applications.
Claus-Juergen Lenz, Detlev Majewski
Meteo-GRID: Performing Local Weather Forecast Using GRID Computing
Today there is an increasing demand for reliable high-resolution short-range (up to 48 hours) weather forecasts for government, industry, traffic and the media. These local forecasts are most valuable in cases of high-impact weather, that is for weather systems such as tropical storms (hurricanes, typhoons), violent extra-tropical storms and severe thunderstorms, which may result in loss of lives and property due to wide-spread flooding and gale-force winds. Many national weather services, such as the Deutscher Wetterdienst (DWD), run regional (local) numerical weather prediction models with a mesh size of 10 km or less up to four times a day to provide the necessary forecast products for their users.
Within the framework of the European Union shared research and technology development project
EUROGRID (Application Testbed for European GRID Computing, IST-1999-20247; funding period: November
2000 until January 2004) the aim of the Deutscher Wetterdienst is to provide local weather prediction
for arbitrary regions in the world via Internet and EUROGRID (subproject Meteo-GRID) using the
relocatable non-hydrostatic numerical weather prediction model LM (Local Model). This ASP (Application
Service Provider) solution will allow virtually anyone to run a high-resolution numerical weather
prediction model on demand for his/her domain of interest and hence to calculate his/her own weather
prediction. For this purpose the user will be able to specify the model domain, grid resolution, initial
date and time, forecast range and forecast products via a Java based Graphical User Interface (GUI).
Taking into account the user specifications the following steps are executed. All steps are performed
within the EUROGRID software environment.
- Derivation of topographical data for the model domain selected by the user from high resolution
(1 km x 1 km) data sets stored in a global geographical information system (GIS) at DWD.
- Preparation of initial and lateral boundary data sets for the Local Model (LM). These data are
derived from analyses and forecasts of the Global Model GME, which are stored in an ORACLE database.
- Transfer of the topographical data and the GME data to a high performance computer within the HPC
GRID of EUROGRID.
- Execution of the LM forecast run on the supercomputer mentioned in the previous step. The job consists of two separate tasks which run in parallel, namely the interpolation program GME2LM and the numerical weather prediction model LM itself.
- Dissemination of the LM forecast data to the user's computer, or visualization of the LM results on a computer within the HPC GRID and transfer of the graphs to the user via the Internet and EUROGRID.
In the presentation, a more detailed description of the steps of the EUROGRID application of the LM will be given. In addition, some specifications and requirements of a numerical weather forecast model on supercomputers will be discussed.
Bob Dobinson, Piotr Golonka, Andreas Hirstius, Mihai Ivanovici,
Catalin Meirosu, Stefan Stancu
Moving the Decimal Point: 10 Gigabit Ethernet between Geneva and
Marian Bubak, Wlodzimierz Funika, Marcin Smetek
and Roland Wismueller
OMIS-compliant Monitoring System for Java-Distributed Applications
A prototype monitoring system, the J-OCM, compliant with the On-line Monitoring Interface Specification (OMIS) [2,3], provides the ability to observe and manipulate the execution of a whole distributed Java application. The major concept of the Java-oriented On-line Monitoring Interface Specification (J-OMIS) [1], which extends the original OMIS and underlies the Java-oriented monitoring system J-OCM, is a set of object types with support for object-specific services. A tool, e.g. a performance analyzer, is provided with access to objects such as node objects, JVM objects, threads and class objects, equipped with appropriate services.
The architecture of the J-OCM comprises a central component, responsible for distributing tool requests and assembling replies, and a distributed part: Local Monitors on the nodes and JVM Local Monitors, agents embedded into the JVM processes. The target Java system is considered in terms of the client-server distributed system architecture, focusing on its components: interface definition, proxy, object manager, naming service, and communication protocol.
In our event-based monitoring system, basic events are captured by sensors inserted in the target system and sent to the monitoring system. The monitoring system then takes some action(s), i.e. a sequence of instructions associated with the event. These actions can either carry out data collection or manipulate the running program.
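The event/action scheme described above can be sketched as follows (a minimal illustration with invented names; the actual J-OCM interfaces differ):

```python
# Minimal sketch of the event/action model: sensors inserted in the target
# system report events to a monitor, which runs every action registered for
# that event. All names here are hypothetical.

class Monitor:
    def __init__(self):
        self._actions = {}   # event name -> list of registered actions
        self.collected = []  # data gathered by data-collection actions

    def register(self, event, action):
        self._actions.setdefault(event, []).append(action)

    def sensor_fired(self, event, payload):
        # Called from a sensor inserted into the target system.
        for action in self._actions.get(event, []):
            action(payload)

monitor = Monitor()
# A data-collection action; a manipulation action could be registered
# for the same event in exactly the same way.
monitor.register("thread_started", lambda p: monitor.collected.append(p))
monitor.sensor_fired("thread_started", {"jvm": 1, "thread": "worker-0"})
print(monitor.collected)
```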
The original event model provided in the OCM has been extended by a Java-specific event submodel which covers the functioning of the basic application and execution entities of a distributed Java application.
A greater part of the designed services underlying the functionality of the J-OCM has been implemented [4,5,6]. The results of this research may be applied to the monitoring of Grid services.
- M. Bubak, W. Funika, P. Metel, R. Orlowski, and R. Wismueller: Towards a Monitoring Interface Specification for Distributed Java Applications. In Proc. 4th Int. Conf. PPAM 2001, Naleczow, Poland, September 2001, LNCS 2328, pp. 315-322, Springer, 2002
- T. Ludwig, R. Wismueller, V. Sunderam, and A. Bode: OMIS - On-line Monitoring Interface Specification (Version 2.0). Shaker Verlag, Aachen, vol. 9, LRR-TUM Research Report Series, (1997)
- R. Wismueller, J. Trinitis and T. Ludwig: A Universal Infrastructure for the Run-time Monitoring of Parallel and Distributed Applications. In: Euro-Par'98, Parallel Processing, volume 1470 of Lecture Notes in Computer Science, pages 173-180, Southampton, UK, September 1998. Springer-Verlag
- M. Bubak, W. Funika, M. Smetek, Z. Kilianski, and R. Wismueller: Request processing in the Java-oriented OMIS Compliant Monitoring System. Presented at 5th Int. Conf. PPAM 2003, Czestochowa, Poland, September 2003 (to be printed)
- M. Bubak, W. Funika, M. Smetek, Z. Kilianski, and R. Wismueller: Event Handling in the J-OCM Monitoring System. Presented at 5th Int. Conf. PPAM 2003, Czestochowa, Poland, September 2003 (to be printed)
- M. Bubak, W. Funika, M. Smetek, Z. Kilianski, and R. Wismueller: Architecture of Monitoring System for Distributed Java Applications. Presented at Euro PVM/MPI Int. Workshop 2003, Venice, Italy, September 2003
Bartosz Balis, Marian Bubak, Wlodzimierz Funika, Marcin Radecki,
Tomasz Szepieniec, Roland Wismueller
OMIS/OCM-G and Other Application Monitoring Approaches for the Grid
While the current Grid technology is oriented more towards batch processing, the CrossGrid project is
focused on interactive applications, where there is a person `in the computing loop'. Monitoring of
interactive applications is only possible in the on-line mode in which the information is immediately
delivered to the visualization tools with low latencies. Only then can the user's interactions be
related to the performance results. On-line monitoring is also essential to enable manipulations on
the target application.
In this paper, we provide an overview of three Grid application monitoring
approaches currently being developed and compare them to our approach based
on OMIS/OCM-G. The projects/systems discussed are GrADS (Autopilot),
GridLab (based on GRM), and DataGrid (GRM).
The GrADS project introduces a framework for Grid application development. Part
of it is the Autopilot toolkit, which can gather real-time application and
infrastructure data and analyse it, and which also allows for modification of
the application's behavior. Autopilot is, however, oriented more towards
automatic steering than towards providing feedback to the programmer. It gives a rather
general view of the application and environment, e.g. to explore patterns in
behaviour rather than to locate a particular performance loss.
The application monitoring system developed within the GridLab project
implements on-line steering guided by performance prediction routines
which derive their results from low-level, infrastructure-related sensors (CPU,
network load). However, this approach is not suitable
for interactive applications. First, it does not allow for manipulations
on the target application. Second, the approach seems to rely only on full
traces. Finally, the semantics of all metrics is fixed, which does not
allow for user-defined metrics.
In the DataGrid project, the GRM monitoring system is introduced. The GRM is a
semi-on-line monitor which collects information about the application and
delivers it to the PROVE visualisation tool. While the GRM/PROVE
environment is well suited for the DataGrid project, where only batch
processing is supported, it is less usable for the monitoring of
interactive applications. First, the R-GMA communication infrastructure
used by GRM is based on Java servlets, which introduce a rather high
communication latency. Second, achieving low latency and low intrusion at
the same time is basically impossible when monitoring is based on trace
data: if the traces are buffered, the latency increases; if not, the
overhead for transmitting the events is too high.
The monitoring infrastructure created in the CrossGrid project is the OCM-G,
a distributed, decentralized, autonomous system, running as a
permanent Grid service and providing monitoring services accessible via the
standardized interface OMIS. The system works in on-line mode and allows for
manipulations. Its general philosophy is to provide a flexible set of
monitoring services which allow the user to define metrics with the semantics
they need, instead of providing a predefined set of high-level metrics.
- B. Balis, M. Bubak, W. Funika, T. Szepieniec, R. Wismueller,
Monitoring and Performance Analysis of Grid Applications. In
Proc. ICCS 2003, St. Petersburg, Russia, June 2003. Springer 2003
- J.S. Vetter and D.A. Reed. Real-time Monitoring, Adaptive Control and
Interactive Steering of Computational Grids. In: The International Journal
of High Performance Computing Applications, vol. 14, 2000
- The GridLab project web pages: http://www.gridlab.org
- Z. Balaton, P. Kacsuk, N. Podhorszki, and F. Vajda. From Cluster
Monitoring to Grid Monitoring Based on GRM. In Proc. Euro-Par 2001 Parallel
Processing, August 2001, Manchester, UK, Springer 2001
OpenMolGRID: Complex Problem Solving in Molecular Design
Porting Applications to Globus Toolkit 3.0 and Designing an OGSI-based Architecture
This tutorial will explore the issues involved in engineering
applications on top of the OGSA and Globus Toolkit 3.0.
The latest major edition of the Globus Toolkit allows developers to
build applications in a grid service oriented architecture. This new
technology based on web services opens a wide range of new
possibilities, but at the same time requires a certain discipline from
the software designer and poses new challenges to the project manager.
We will explore the pros and cons of moving applications into an OGSI-compliant
architecture. The question of whether it makes sense to Grid-enable a
software component can be answered by identifying and comparing the potential
benefits and dangers of such a move.
We will later move on to discuss the features of the Open Grid Services
Architecture. We will demonstrate how to apply them to chosen applications.
Finally, the technology issues will be discussed, and participants will get a
feel for the problems expected at the implementation level.
The workshop will be strongly based on real examples. One of the examples we
wish to explore is the process of designing and building the NeesGRID NTCP
component, the first application of GT3.0 deployed in production.
We encourage the participants of the tutorial to submit their own questions,
problems and suggestions related to their own experience or plans of building
a GT3.0-based application layer infrastructure. We will spend time analyzing
the submitted questions and exploring them during the tutorial. Information is
available at http://perfringo.com/events/
Marian Bubak, Andrzej Jozwik, Maciej Malawski, Katarzyna
Rycerz, Dominik Ziembinski
Porting Irregular and Out-of-core Computations on Grids
Parallelization of irregular problems is a non-trivial task due to the data access pattern,
where data arrays are accessed through one or more levels of indirection arrays. Our work on
the parallelization of such problems resulted in the implementation of the LIP library, which
provided runtime support for irregular MPI programs on clusters.
Our current work concentrates on a feasibility study of a possible migration of the LIP library
towards the Grid environment. The Grid offers more computing power to solve large-scale problems, but
its distributed nature makes programming more difficult and less effective. Our new G-LIP library
aims to support irregular computations on the Grid.
There are two phases in computing irregular problems with the LIP library: inspector and
executor. To support applications with a significant computation effort, these two phases have been
split following a master-worker scheme: inspectors play the role of masters, while workers are the
nodes that process the irregular loop. When the application starts, all data elements are distributed
among the inspectors. The inspectors then examine the data references, build communication schedules,
gather non-local data elements and translate global indices into local ones. After that, each inspector
broadcasts the pre-packed data arrays to its workers. While the workers do the computation, the inspectors
build the communication schedules for the next time step. When the workers finish their job and return
the computed results, the inspectors scatter the received data according to the first communication
schedule built and gather data for the next time step using the previously prepared schedules. This
approach was subject to preliminary tests using MPICH-G and Globus Toolkit 2.x.
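The inspector/executor split described above can be sketched as follows (an illustrative Python sketch with invented names, not the G-LIP API):

```python
# Sketch of the inspector/executor scheme: the inspector analyses the
# indirection array, builds a communication schedule of non-local elements,
# gathers them, and translates global indices; the executor then runs the
# irregular loop over local and gathered ("ghost") data.

def inspector(indirection, local_range, global_data):
    lo, hi = local_range
    # Communication schedule: global indices that must be fetched remotely.
    schedule = sorted({g for g in indirection if not (lo <= g < hi)})
    gathered = {g: global_data[g] for g in schedule}          # gather phase
    translate = {g: ("local", g - lo) if lo <= g < hi else ("ghost", g)
                 for g in set(indirection)}                   # index translation
    return schedule, gathered, translate

def executor(indirection, local_range, global_data, gathered, translate):
    lo, hi = local_range
    local = global_data[lo:hi]
    total = 0
    for g in indirection:            # the irregular loop: acc += data[ind[i]]
        kind, idx = translate[g]
        total += local[idx] if kind == "local" else gathered[idx]
    return total

data = [10, 20, 30, 40, 50, 60]
ind = [0, 4, 1, 5, 0]                # indirection array with non-local accesses
sched, ghost, trans = inspector(ind, (0, 3), data)
print(sched)                         # non-local indices to be communicated
print(executor(ind, (0, 3), data, ghost, trans))
```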
The second improvement to the LIP library is support for adaptive problems. In such problems, data
arrays are accessed via indirection arrays, and the data access patterns change during the computation. Hence
the schedule must be regenerated at each time step, and the communication volume rises because the data
partitioning does not account for data element migration. Since most of the index analysis can be reused,
the communication schedule only needs to be updated, so the cost of the inspector phase is minimized.
Moreover, to keep the communication volume at the same level, shifting data elements across processors should be
- Brezany, P., Bubak, M., Luszczek, P., Malawski, M., Zajac, K., A Runtime Support for Large-Scale
Irregular Computing on Clusters and Grids, Annual Review of Scalable Computing, in: Kwong, Y. C. (Eds.),
Series on Scalable Computing, vol. 5, Singapore University Press and World Scientific, 2003, p
Bartosz Balis, Marian Bubak, Michal Wegiel
Proposal of Adaptation of Legacy C/C++ Software
to Grid Services
The adaptation of legacy C/C++ software to web/Grid services is important from both the scientific
and the commercial point of view. Yet, existing approaches concentrate mainly on the web
services technology. In consequence, they lack a number of crucial concepts and may prove unacceptable
in the Grid services context. Along with the emergence of the Open Grid Services Architecture, a
novel, significantly different approach was introduced. It extends the model of web services by
imposing additional requirements concerning security and lifetime management of stateful service instances.
We focus on the adaptation of existing legacy libraries and applications to work as Grid services.
The clients interact with these services in a standard fashion using WSDL, SOAP and message-level
security. They are expected to follow the paradigm of service factories and create service instances
for their own exclusive usage. The central concept behind the proposed architecture is as follows.
Each created service instance is automatically associated with an internal proxy client. This client
is hidden from the external world and bears responsibility for translating the actual client method
invocations into the underlying native calls. It receives client requests and supplies the corresponding
results via interaction with a specialized proxy service. One instance of a proxy service is
introduced for each service instance created by a client. The proxy client can run on an arbitrary host
with the required legacy libraries installed and can even migrate during its execution. This conforms
to the ideas of brokering and job submission mechanisms: proxy clients are in fact jobs scheduled
for execution on behalf of the corresponding client. This makes it possible to achieve a high degree of
scalability. Obviously, proxy clients need to be implemented in native languages (C/C++) since they
directly cooperate with the legacy code. However, they do not differ from other clients in the
manner in which they perform communication (language-neutral SOAP messages are used).
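The proxy-client idea can be sketched as follows (all names are hypothetical; a dict stands in for both the incoming SOAP message and the dynamically loaded legacy library):

```python
# Illustrative sketch of the proxy client described above: a hidden client
# receives a serialized method invocation from its proxy service and
# translates it into a call on the underlying native library. The dict of
# lambdas stands in for bindings to legacy C/C++ code.

NATIVE_CALLS = {                       # stand-in for the dlopen'ed legacy code
    "matrix_norm": lambda xs: max(abs(x) for x in xs),
    "matrix_scale": lambda xs, k: [x * k for x in xs],
}

def proxy_client_handle(request):
    """Translate one client request into the corresponding native call."""
    func = NATIVE_CALLS[request["method"]]
    result = func(*request["args"])
    # In the real architecture the reply would travel back as a SOAP message.
    return {"id": request["id"], "result": result}

reply = proxy_client_handle({"id": 7, "method": "matrix_scale",
                             "args": ([1, 2, 3], 2)})
print(reply)
```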
There are a number of advantages of the above approach, to name only the most important ones:
- no changes are required in the existing legacy code,
- the solution ensures high scalability and security (no open ports needed,
authentication, authorization, credentials delegation are incorporated),
- high flexibility is achieved by introducing a universal framework,
- substantial development effort is avoided due to the use of tools
facilitating migration to grid services (e.g. gSOAP, Java2WSDL etc.),
- the solution is portable between various grid hosting environments,
- compatibility with grid service requirements (lifetime management, job
submission, notifications, etc.).
As a proof of concept, the presented solution was partially implemented in a project aiming at
adapting the OCM-G Grid application monitoring system to Grid services. Globus Toolkit 3.0,
together with the gSOAP package and its GSI plugin, was employed for this purpose.
- Yan Huang, Ian Taylor, David W. Walker, Robert Davies: Wrapping Legacy Codes for Grid-Based
Applications. To be published in proceedings of the HIPS 2003 workshop.
- Dietmar Kuebler, Wolfgang Eibach (IBM Germany) Adapting Legacy Applications as Web Services.
- The OGSA project homepage: http://www.globus.org/ogsa
Witold Alda, Remigiusz Górecki, Marek Budyn, Michal
Prototype System for Distributed Scientific Visualisation
We present the architecture of a distributed visualisation system which reads data from one or more remote servers and distributes further processing, such as data extraction, mapping and visualisation, across both remote and local computers. The decision how the processing should be distributed is in the user's hands.
The architecture is designed to fulfill the needs of general-purpose visualisation. Thus the system is flexible and can easily be enriched by adding new modules on both the server and the client side. The user can build the visualisation pipeline by picking available modules and constructing his/her own projects and scenarios. The entire system is written using Java/XML technologies. The existing visualisation modules use Java3D, but tests with OpenGL and GL4Java show that these can also be applied in the system without any problems. Currently, prototype modules for the visualisation of chemical molecules are available.
CERN European Organization for Nuclear Research
Review of DataGrid Progress and Plans for the EGEE Project
SAN over WAN - a New Way of Solving the GRID Data Access Bottleneck
After solving user access, job scheduling and system monitoring, remote access to big
amounts of data still remains a challenge in many GRID environments. Due to the bandwidth limitations
and the mismatch between peak and sustained throughput performance of wide area networks (WAN),
big files have to be copied well in advance of running the job, or sometimes even sent on tape to the
targeted GRID node. Obviously, the results have to undergo the same procedure.
This fact sometimes makes it inefficient to run a job on a remote, better suited or less loaded GRID node.
SGI's engineering has spent the last couple of years extending industry-standard storage architectures,
which are traditionally tailored to the needs of commercial applications and environments, to the
slightly different requirements of the technical and scientific world.
The latest storage technology is the SAN (Storage Area Network). The idea was to solve the bandwidth
bottlenecks traditional file servers have due to the fact that they use shared or dedicated TCP/IP
networks for data transport. SANs use their own storage networks based on the Fibre Channel protocol,
which is better suited for high-performance data access to storage devices like RAID arrays and
However, those SANs are limited to local sites, usually buildings, due to the distance limitation of
Fibre Channel. Also, only one system can access the data on one partition of the storage array, and
therefore data sharing between systems again has to be done via TCP/IP networks.
The paper presents a new concept of sharing and accessing data from different nodes in a GRID
environment, co-developed by SGI and LightSand, a company producing gateways for long-distance
network connections. It discusses the implementation details as well as the resulting benefits
for a GRID environment. One of them is that the application has direct access to the data on the
"home" storage array from every GRID node, without the need to copy the data to the GRID node where
the job is executed.
Case studies and other areas of deployment are shown as well.
Konrad Karczewski, Lukasz Kuczynski and
Secure Data Transfer and Replication Mechanisms
in Grid Environments
Data transfer issues are amongst the most important in modern Grid environments. The applications
currently running in Grid environments are becoming more real-life oriented; they are based on and
generate data sets of growing importance and confidentiality. All this implies that data management
systems must provide great reliability and should incorporate mechanisms for data safety and confidentiality.
In a data replication service, confidentiality of data becomes a major problem because of the need
for geographically and logically distributed data storage. This implies choosing only
"trusted" data storage places, which may prove difficult in many cases. More importantly, it negates
the idea of transparent data replication with intelligent load balancing. It is therefore necessary to create
a data replication service providing not only data security by means of replication and/or encrypted
transmission, but also methods of keeping the replicated data confidential.
Confidentiality should be understood on multiple levels. The first is keeping the data on the storage
device unreadable for unauthorised users. The second is keeping unauthorised users unaware even
of the existence of a given data set. This implies the necessity of keeping the replica content unknown even
to the owner of the storage device it is kept on. This requirement imposes the development of a
distributed replica catalog keeping track of all data entering the data management system and providing
unique identifiers replacing the original URIs. Moreover, the catalog service has to include access
control lists and authentication mechanisms to ensure the required level of data privacy.
An additional level of confidentiality can be achieved by data partitioning. In addition to URI encoding,
every data set can be divided into multiple parts stored in different storage elements. The original
data can only be reconstructed when the information about the physical location and ordering of the partitions
is retrieved from the replica catalog by an authorised user. This solution not only allows for greater
confidentiality and security of data but can also improve the performance of the data management system
by making parallel transfers from multiple sources possible.
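The partitioning scheme can be sketched as follows (a minimal illustration; the catalog and storage-element structures are invented for this example):

```python
# Sketch of confidentiality-oriented data partitioning: a data set is split
# across several storage elements under opaque identifiers, and only the
# replica catalog records the location and ordering needed to reconstruct it.

import hashlib

def partition(data: bytes, elements, parts=3):
    chunk = -(-len(data) // parts)            # ceiling division
    catalog = []                              # (order, storage element, id)
    stores = {se: {} for se in elements}      # stand-in for storage elements
    for i in range(parts):
        piece = data[i * chunk:(i + 1) * chunk]
        # Opaque identifier replacing the original URI.
        pid = hashlib.sha256(piece + bytes([i])).hexdigest()[:12]
        se = elements[i % len(elements)]
        stores[se][pid] = piece
        catalog.append((i, se, pid))
    return catalog, stores

def reconstruct(catalog, stores):
    # Only an authorised catalog lookup reveals ordering and locations;
    # the pieces could equally be fetched in parallel.
    return b"".join(stores[se][pid] for _, se, pid in sorted(catalog))

cat, st = partition(b"confidential-data-set", ["se1.example", "se2.example"])
print(reconstruct(cat, st))
```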
In such a confidentiality-oriented environment it is crucial to remove every possible single point of
failure; in particular, it is essential to keep multiple, synchronised instances of the replica catalog.
Here, the development of a distributed replica catalog seems the most promising solution. In addition,
the creation of a distributed data broker is necessary. Its main tasks would be user authentication and
authorisation, passing user requests to the replica catalog, and making decisions about data partitioning,
replication and reconstruction.
Syed Naqvi, Michel Riguidel
Security Risk Analysis for Grid Computing
Security and privacy issues are coming to the fore with the growing size and profile of the
Grid community. The forthcoming generations of the Computational Grid will make a huge number
of computing resources available to a large and wide variety of users. The diversity of applications and
the mass of data being exchanged across Grid resources will attract the attention of hackers to a much
higher extent. A comprehensive security system, capable of responding to any attack on its resources,
is indispensable to guarantee the anticipated adoption of the Grid by both Grid users and resource
providers. In this article, the authors argue that the first brick of an effective plan of
countermeasures against these threats is an analysis of the potential risks associated with Grid computing.
This article presents a pragmatic analysis of the vulnerability of existing Grid systems and the
potential threats posed to their resources once their spectrum of users is broadened. Various existing
Grid projects and their security mechanisms are analyzed. The experience of using common Grid
software and an examination of Grid literature served as the basis for this analysis. Legal loopholes in
the implementation of Grid applications across the geopolitical frontiers, and the ethical issues
that could obstruct the wide acceptance and trustworthiness of Grids are also discussed.
The weaknesses revealed are classified with respect to their sources and possible remedies are discussed.
The results show that the main reason for the vulnerability is the fact that Grid technology has been
little used except by a certain kind of public (mainly academics and government researchers). This
public benefits greatly from being able to share resources on the Grid, and has no intention of harming
the resource owners or fellow users. Thus there was no need to address security in depth. This is all
about to change: the number of people who know about the Grid is growing fast, as are the
worthwhile targets for potential attackers. The security nightmare cannot be avoided unless the problem
is addressed urgently.
This detailed taxonomy of potential threats and sources of vulnerability in existing Grid
architectures is the first milestone on the road to a robust Grid security system. It provides a
comprehensive overview which will enable us to effectively plan countermeasures against the
existing risks. Our future directions include the definition of a Protection Profile (Common Criteria),
followed by the formulation of a comprehensive security policy and finally its implementation.
NB: This research work is a part of ongoing research activities of the Information Technology Security
Group at the Computer Sciences and Networks Department under the patronage of European Union Information
Society Technologies funded projects.
Marian Bubak, Maciej Malawski, Grzegorz Mlynarczyk, Piotr
Nowakowski, Robert Pajak, Michal Turala, Katarzyna Rycerz
Software Development in the EU CrossGrid Project
CrossGrid is one of the largest European projects in Grid research, uniting 21 separate institutions
and funded by the 5th European Framework Programme. As such, it demands the creation and implementation
of custom-tailored software development, testbed integration and quality assurance procedures.
The Project demands that the following be achieved:
- all software developed within CrossGrid should follow a uniform set of rules and coding conventions,
- similar testing and validation procedures should be employed by all partners responsible for
- CrossGrid testbeds need to be prepared on time for software releases, with a uniform set of
middleware, pre-tested and approved by the Project Architecture Team,
- consistent quality assurance must be implemented throughout all stages of the Project, pursuant
to terms and conditions set out by the European Union for large, multinational collaborations.
All the above-mentioned goals can only be realized through a systematic approach to software design,
verification and quality control. This article will outline some aspects of this approach, as developed
by CrossGrid Project Management and implemented throughout the Project Consortium. In particular,
the CrossGrid Quality Assurance plan will be discussed, as it explains the relevant procedures and
organizational bodies dealing with all aspects of CrossGrid software development, testing and
integration. Special attention will be paid to unit testing and the notion of Quality Indicators,
which are very well suited to ensuring quality within a project of this scope and magnitude.
- Eric J. Braude, Software Engineering: An Object-Oriented Perspective (John Wiley & Sons, Inc., 2001)
- The CrossGrid Quality Assurance Plan; available at: http://www.eu-crossgrid.org/Deliverables/1st
- CrossGrid Deliverable D5.2.3 (Standard Operating Procedures); available at:
Maciej Malawski, Marek Wieczorek, Marian Bubak, Elzbieta Richter-Was
Storage and Analysis System for Data Intensive High Energy Physics Applications
The presented work is devoted to the problem of storage and analysis of data originating from the
great experiments of particle physics. Detector simulation software produces a number of large datasets
that are subject to further analysis, resulting in the generation of specific histograms. Managing
such a large number of files becomes a non-trivial task for the researcher. Our work aims to make
this task easier.
We introduce and describe the Lhcmaster, a system storing the data and performing basic analysis of
the data (production of the histograms for each data file).
The Lhcmaster is a database system based on the relational model of data, providing several user
interfaces, including a command-line administrative interface and a web interface used by external users.
The main use case of the Lhcmaster is retrieving files from the database with the help of a graphic
index of the files (the sets of histograms generated for each file). The basic analysis is performed
within the ROOT framework. There is also a mechanism for authentication and authorization of the users,
based on a model of user groups. The system is implemented with basic languages and technologies such as
the MySQL RDBMS and Perl CGI.
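The kind of relational file index described above can be sketched as follows (the schema and names are our illustration, not the actual Lhcmaster schema; sqlite3 stands in for MySQL):

```python
# Sketch of a relational index over simulation data files: each file is
# registered together with the histograms produced for it, so files can be
# located through their graphic index. Table and column names are invented.

import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE datafile (id INTEGER PRIMARY KEY, name TEXT, owner_group TEXT);
    CREATE TABLE histogram (file_id INTEGER REFERENCES datafile(id),
                            title TEXT, image_path TEXT);
""")
db.execute("INSERT INTO datafile VALUES (1, 'atlfast_run_001.root', 'atlas')")
db.execute("INSERT INTO histogram VALUES (1, 'pT spectrum', 'h/1_pt.png')")

# Main use case: find a data file through its histogram index.
row = db.execute("""SELECT d.name FROM datafile d
                    JOIN histogram h ON h.file_id = d.id
                    WHERE h.title = ?""", ("pT spectrum",)).fetchone()
print(row[0])
```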
The Lhcmaster-G is a design for a system equivalent to the Lhcmaster, but realized with completely
different, Grid technologies. We believe that this approach allows us to create a more powerful and
flexible tool, adapted to operate within a distributed and heterogeneous environment.
Furthermore, it would also be possible to add new functionalities to the Lhcmaster-G compared
with the Lhcmaster. In particular, production of new data files and more sophisticated data analysis
might be performed in the future within the Lhcmaster-G. The outline of the Lhcmaster-G assumes that the
system will be based on widespread Grid tools, such as the Globus Toolkit and the European Data
Grid software. The Lhcmaster-G has not been implemented yet, but the present work contains a detailed
description of how the system might be implemented.
- ATLFAST 2.0 a fast simulation package for ATLAS, ATL-PHYS-98-131 13 Nov 1998
- European DataGrid project http://www.eu-datagrid.org
The Grid Visualization Kernel - Scientific Visualization on the Grid
Marian Bubak, Wlodzimierz Funika, Roland
Wismüller, Tomasz Arodz,
The G-PM Performance Measurement Tool for Interactive Grid Applications
The task of performance analysis of interactive
applications, which feature a highly distributed nature and
dynamic changes in the execution environment, calls for
run-time measurement definition, selective
instrumentation, and the use of counter/timer mechanisms
rather than the extensive tracing common to most
performance tools. This implies the necessity to focus on
the interaction of distributed application components and to
provide data meaningful in the context of an application.
To provide this kind of information, the G-PM tool uses
three sources of data: performance measurement data related
to the running application, measured performance data on the
execution environment, and the results of micro-benchmarks,
providing reference values for the performance of the execution environment.
Via the GUI, one can select a preferred measurement type and an
appropriate display type and its features within a dialog. High-level
performance properties are defined by the user, either by
loading them from a file or via the measurement definition dialog.
A user-defined metric is transformed into an appropriate
set of standard metrics. Measurements are realized either by
sampling or by determining the monitoring events relevant to the
value to be measured. Application-specific metrics are
associated with inserted user-defined procedure calls
(probes), which generate special events to be captured by
the monitoring system. Active measurements are created
whenever a concrete measurement is defined.
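The transformation of a user-defined metric into standard counter/timer metrics can be sketched as follows (names invented for illustration; not the actual G-PM interface):

```python
# Sketch of a user-defined, high-level metric expressed in terms of a small
# set of standard counter/timer metrics maintained by the monitoring system.

STANDARD = {"bytes_sent": 0, "comm_time": 0.0}   # standard counters/timers

def record_send(nbytes, seconds):
    # A monitoring event relevant to the values being measured updates
    # the standard metrics.
    STANDARD["bytes_sent"] += nbytes
    STANDARD["comm_time"] += seconds

def bandwidth():
    # User-defined metric, transformed into standard ones:
    # mean send bandwidth = bytes_sent / comm_time.
    return STANDARD["bytes_sent"] / STANDARD["comm_time"]

record_send(4_000_000, 1.0)
record_send(2_000_000, 0.5)
print(bandwidth())   # bytes per second
```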
The tool interacts with the OCM-G and collects
information about a selected application by connecting to
the monitoring system via the monitoring interface and by
sending conditional or unconditional requests to the
monitoring system, which result in obtaining
monitoring data or executing some manipulations on the target application.
- Bubak, M., Funika, W., and Wismueller, R.: The CrossGrid
Performance Analysis Tool for Interactive Grid Applications.
In: Kranzlmueller, D. and Kacsuk, P. and Dongarra, J. and
Volkert, J. (Eds.), Recent Advances in Parallel Virtual
Machine and Message Passing Interface, 9th European PVM/MPI
Users' Group Meeting, September - October 2002, Linz,
Austria, 2474, Lecture Notes in Computer Science, 50-60, Springer-Verlag, 2002
- Balis, B., Bubak, M., Funika, W., Szepieniec, T., and
Wismueller, R.: An Infrastructure for Grid Application
Monitoring. In: Kranzlmueller, D. and Kacsuk, P. and
Dongarra, J. and Volkert, J. (Eds.), Recent Advances in
Parallel Virtual Machine and Message Passing Interface, 9th
European PVM/MPI Users' Group Meeting, September - October
2002, Linz, Austria, 2474, Lecture Notes in Computer
Science, 41-49, Springer-Verlag, 2002
Bartosz Balis, Marian Bubak, Wojciech Rzasa, Tomasz Szepieniec,
Two Aspects of Security Solution for Distributed Systems in the Grid on the Example
of the OCM-G
This paper presents the security solution for the OCM-G, a Grid-enabled monitoring system. Grid
applications are becoming complex, thus tools facilitating the development process are required. The OCM-G
is designed as an agent between such tools and the application processes running on numerous nodes
belonging to distributed Grid sites.
The OCM-G is designed as a distributed and decentralized system; thus the scalability required in the Grid
environment is achieved. The monitoring system consists of two parts: a permanent one, handling multiple
applications of numerous users, and a transient one, belonging to the owner of the monitored application.
The OCM-G is designed to support on-line monitoring, which results in requirements concerning delivery time
between the user and the application processes.
Security issues are essential for the OCM-G, since it supports multiple users and offers extensive means of controlling processes. The monitoring system must not lower the security of the sites it runs on. This paper presents an analysis of a solution proposed to ensure the security of the OCM-G. We distinguish two aspects of the solution: inter-component communication and the forge-component attack. The second aspect results from the fact that secure communication between system components does not by itself ensure the security of the whole system. Results of tests are shown in order to estimate the overhead caused by the solution. We extend the concepts presented in  and describe the idea of different security levels for different network environments. We show that the forge-component security aspect is universal and can be used to secure other systems similar to the OCM-G.
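The forge-component point — that an encrypted channel is useless if the peer at the other end is an impostor — can be illustrated with a minimal authentication sketch. The toy below uses a shared-secret HMAC to reject components that cannot prove knowledge of the site secret; the actual OCM-G solution is based on Grid credentials and differs in detail, so treat every name here as an assumption.

```python
# Illustration of the forge-component idea: beyond securing the channel,
# each component must authenticate itself, or a forged component could join.
# Shared-secret HMAC is a stand-in for the OCM-G's real credential scheme.
import hmac, hashlib, os

SITE_KEY = os.urandom(32)   # secret known only to legitimate components

def register_token(component_id, key):
    """Token a component presents when registering with the system."""
    return hmac.new(key, component_id.encode(), hashlib.sha256).hexdigest()

def accept_component(component_id, token):
    """The system recomputes the token and compares in constant time."""
    expected = register_token(component_id, SITE_KEY)
    return hmac.compare_digest(expected, token)

genuine = register_token("local-monitor-7", SITE_KEY)
forged = register_token("local-monitor-7", os.urandom(32))  # attacker's key
print(accept_component("local-monitor-7", genuine))  # accepted
print(accept_component("local-monitor-7", forged))   # rejected
```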
- Balis, B., Bubak, M., Rzasa, W., Szepieniec, T., and Wismueller, R.: Security in the OCM-G Grid Application Monitoring System, accepted for PPAM'2003, Czestochowa, Poland, 2003
UNICORE - Towards Production Quality Grid Environment
Krzysztof Benedyczak, Michal Wronski
UNICORE Plugins - How to Design Application Specific Interfaces
VAN - Visual Area Network
Interactive and Collaborative Visualization of Large Data Sets in Distributed Environments
Solving the biggest problems in science and industry requires the best minds. However, people are increasingly globally mobile or located far from an organization's advanced computing resources. Visual Area Networking (VAN) solves these challenges by integrating high-productivity computing, visualization, scalable storage, and networking technologies that make it easier to bring people together with the visual information they need to do their jobs effectively. VAN not only allows individuals and groups to solve complex problems while drawing on an organization's best experts; it also delivers these capabilities directly to end users or groups, wherever they may be, so that shared visualization becomes part of the user's standard environment. As a result, users are free to focus on creativity and insight in a collaborative setting rather than on the technical details of computing, visualization, and data management.
Visual Area Networking represents a shift from focusing only on advancing the power needed for the most precise rendering to also considering the location and availability of visualized data across the network. VAN is driven by two core technologies developed by SGI: the SGI Onyx visualization system and a software component called OpenGL Vizserver. OpenGL Vizserver allows users of remote workstations, laptops, and even wireless tablet computers to use existing, unmodified applications to access and control the power of SGI Onyx family visualization systems, and to collaborate with one another using existing visualization applications based on the OpenGL API.
OpenGL Vizserver Architecture
OpenGL Vizserver software has two primary components: a server and a client. The server runs on SGI graphics supercomputers, managing graphics resources (e.g., graphics pipelines) and monitoring the activity of visualization applications. Once a visualization application is started, the OpenGL Vizserver server assigns the application the requested graphics resources and begins serving the application's rendered frames to the OpenGL Vizserver client. This visual serving is the basis of the OpenGL Vizserver technology and SGI's VAN strategy. Only after a visual application has rendered a frame does OpenGL Vizserver intercede and capture that frame. The captured frame can be a small fraction of the original data set size and orders of magnitude less complex, because only the pixels associated with the screen representation of the data are captured.
Each captured frame is compressed using either lossy or lossless data compressors that take advantage of interframe coherency to minimize the amount of data sent to the OpenGL Vizserver clients. Once compressed, the image stream is sent to the client. An OpenGL Vizserver client is a lightweight application that reads the image stream from the OpenGL Vizserver server, uncompresses the stream, and displays the uncompressed image on the client computer. The OpenGL Vizserver client directs all user interaction back to the OpenGL Vizserver server, creating a seamless visualization environment on the client, as if the user were interacting locally with the SGI graphics supercomputer. The OpenGL Vizserver client runs on a variety of operating systems, including IRIX, Linux, Windows, and Solaris, and the client system need not have extensive graphics or computational power.
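The capture-compress-serve loop just described can be sketched in a few lines. The sketch below uses zlib purely as a stand-in for Vizserver's interframe compressors, and the function names are invented for illustration; the point is that only rendered pixels, compressed, cross the network, and the client merely decompresses and displays.

```python
# Minimal sketch of the visual-serving loop: capture the rendered frame,
# compress it, ship it; the client decompresses and displays. zlib here is
# only a stand-in for Vizserver's lossy/lossless interframe compressors.
import zlib

def server_side(framebuffer: bytes) -> bytes:
    # Capture happens only after the application has rendered the frame;
    # only screen pixels are sent, never the original data set.
    return zlib.compress(framebuffer)

def client_side(stream: bytes) -> bytes:
    # A lightweight client: decompress, then blit to the local display.
    return zlib.decompress(stream)

frame = bytes(range(256)) * 16        # pretend 4 KB of rendered pixels
wire = server_side(frame)             # what actually crosses the network
shown = client_side(wire)
print(len(frame), len(wire))          # the wire image is much smaller
```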
Tomasz Kuczynski, Roman Wyrzykowski and Jaroslaw Zola
Web Access to Distributed Condor Pools
The original goal of the WebCI project is the creation of a tool that allows monitoring and management of a Condor pool via the WWW. The main emphasis is on ease of job submission and control, as well as convenient UNIX shell access. Key elements of the project are portal security and platform independence. These requirements constrain us to use only standard system tools.
All of the above leads to the concept of using SSH sessions and the scp tool through pseudo-terminals. Initially, no support for interactive jobs was planned, given the non-persistent nature of the HTTP protocol. The use of a local server that keeps the state of SSH connections between subsequent HTTP transactions allows this functionality to be added and also improves performance.
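The session-keeping trick can be sketched as a pool of long-lived child processes keyed by a session token, so that each stateless HTTP request reuses an already-open connection instead of paying the SSH handshake again. This is a hypothetical sketch, not WebCI's implementation; a plain `cat` subprocess stands in for the ssh pseudo-terminal.

```python
# Sketch: a local server holds live shell sessions across stateless HTTP
# transactions. Each session is a long-lived child process addressed by a
# token; `cat` stands in here for an ssh session on a pseudo-terminal.
import subprocess, uuid

class SessionPool:
    def __init__(self):
        self.sessions = {}

    def open(self, argv):
        """Spawn a long-lived process and hand back a session token."""
        token = str(uuid.uuid4())
        self.sessions[token] = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
        return token

    def send(self, token, line):
        """One 'HTTP request': reuse the live connection, no reconnect cost."""
        proc = self.sessions[token]
        proc.stdin.write(line + "\n")
        proc.stdin.flush()
        return proc.stdout.readline().rstrip("\n")

    def close(self, token):
        proc = self.sessions.pop(token)
        proc.stdin.close()
        proc.wait()

pool = SessionPool()
tok = pool.open(["cat"])              # real WebCI would spawn ssh here
print(pool.send(tok, "condor_q"))     # `cat` simply echoes the command back
print(pool.send(tok, "ls -l"))        # same process: the session persisted
pool.close(tok)
```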
The use of SSH and SCP makes it possible to separate the portal from the access node of the pool. This in turn allows adding the functionality of interacting with and monitoring multiple Condor pools. Not without importance is the ability to seamlessly attach new Condor pools by simply adding the domain or IP address of the access node to the WebCI config file. Every pool may be accessed by an unrestricted number of portals, removing a single point of failure and increasing the availability of the service.
The use of mainly server-side technologies allows the use of a thin client and, in the future, the creation of WAPCI, which will provide full support for mobile devices.
The WebCI system architecture can easily be adapted to Grid structures, thus creating a secure and efficient WWW interface. Among other tasks, this interface will enable monitoring of resources and job queues, job submission and management, exchange of files between a web browser and a user account, and management of files and directories on user accounts. An important advantage of the WebCI Grid portal will be the convenient use of shell commands through a tool similar to Midnight Commander.
Günter Kickinger, Jürgen Hofer, A Min Tjoa,
and Peter Brezany
Workflow Management in GridMiner
Knowledge discovery in data resources (files, file collections, relational databases, XML databases
and semistructured data, etc.) managed within Computational Grids is a challenging research and
development problem. The GridMiner project aims to cover all aspects of knowledge discovery and
implement them as an advanced Grid application. We focus our effort on data mining and On-Line
Analytical Processing (OLAP), two complementary technologies, which, if applied in conjunction,
can provide a highly efficient and powerful data analysis and knowledge discovery solution on the Grid.
Knowledge discovery is a highly interactive process. To achieve appealing results, the user must at all times be able to influence this process by applying different algorithms or adjusting their parameters. It is therefore essential that GridMiner provide a powerful, flexible, and easy-to-use user interface to support the knowledge discovery process.
The research on GridMiner can be divided into two tasks. The first one is to provide a set of Grid services which realize the individual steps of the knowledge discovery process. This set of services
includes, for example, the integration of different data sets, the pre-processing of the data, like
cleaning and normalization, various data mining algorithms, and the presentation of the knowledge
discovered to the user.
A special service class comprises services for data mining on top of OLAP, so-called On-Line Analytical Mining (OLAM), including services for cube creation and interactive OLAP and OLAM services. The second task of our research deals with the integration of all these different services into one system.
The knowledge discovery process can be interpreted as connecting the different available services into a workflow which is executed by an appropriate engine. Despite intensive research on workflow languages for Grid and Web Services, no existing approach is suitable for GridMiner. This is due to the fact that all existing workflow languages aim at building a new orchestrated service out of existing services. This new orchestrated service is then published and can be used by a client like a conventional service. GridMiner needs a highly dynamic workflow concept, in which a client can compose a workflow according to its individual needs.
In our approach, we design a new XML-based language for dynamic service composition, called the Dynamic Service Composition Language (DSCL), and develop a workflow engine called the Dynamic Service Composition Engine (DSCE). DSCL allows the description of a workflow consisting of various Grid services and the specification of parameter values for the individual underlying Grid services.
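The abstract does not give DSCL's concrete syntax, so the XML below is an invented illustration of the idea only: a workflow of Grid services with per-service parameter values and dependencies, which an engine like DSCE could read. The element and attribute names are assumptions, not the actual language.

```python
# Hypothetical DSCL-style document: a workflow of Grid services with
# parameter values, parsed with the standard library. The schema is
# invented for illustration and is not the real DSCL.
import xml.etree.ElementTree as ET

DSCL_EXAMPLE = """
<workflow name="knowledge-discovery">
  <service id="clean" uri="http://example.org/PreprocessingService">
    <param name="strategy">normalize</param>
  </service>
  <service id="mine" uri="http://example.org/DataMiningService" after="clean">
    <param name="algorithm">decision-tree</param>
  </service>
</workflow>
"""

root = ET.fromstring(DSCL_EXAMPLE)
# An engine would derive execution order from the declared dependencies.
steps = [(s.get("id"), s.get("after")) for s in root.iter("service")]
print(steps)
```

A client composing its own workflow would simply emit such a document and hand it to the engine, which is the "highly dynamic" usage mode the abstract contrasts with published, pre-orchestrated services.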
DSCE is implemented as a Grid service and can be controlled interactively by a client, which has the possibility to execute, stop, resume, or even change the workflow and its parameters. Moreover, DSCE can be used for processing in batch mode. The development of Dynamic Service Composition is not exclusively linked to GridMiner: every Grid application that needs highly dynamic workflows can make use of this novel concept.