D. Göhr, C. Grimm, N. Jensen, S. Piger, J. Wiebelitz
A Novel Approach to Protect Grids with Firewalls
Abstract:
Motivation
The communication requirements of common Grid middlewares, with their extensive demand for unhindered
communication, run contrary to the concept of legacy firewalls. These devices are normally statically configured
to accept or deny certain packets or communication streams. Advanced firewalls include application-level
gateways that forward only packets conforming to the expected message flow of the related protocol. But in the near future,
an implementation of Grid protocols (for example GridFTP) in application-aware firewalls is unlikely, as these
protocols are not commonly used in the Internet and are currently of limited commercial interest. To
leverage the use of firewalls in the Grid communities, one must take novel approaches. Today, dynamic
configuration, in terms of a controlled opening of firewall resources, is one of the major issues in Grid security.
Previous work
One way is to configure firewalls dynamically, without the necessity to alter the operating system, by means of
a proxy system. Grid applications that must open a communication channel to a system guarded
by one or more firewalls could request the opening of certain ports for the duration of the session from the
proxy system. The proxies would configure the firewalls and restore the previous configuration after the end of
the session. The advantage is total transparency to the involved firewalls. The disadvantage is the
complexity of the proxy system, owing to the extensive knowledge it requires of the different configuration
protocols of the firewalls.
Another way is the use of in-path signaling over a common, authenticated protocol. Applications
or communication sources could use the protocol to signal the need for certain communication paths to
middleboxes, for example firewalls. The IETF MIDCOM working group has specified such mechanisms in RFC 3303
and RFC 3304.
A novel approach
A drawback of these approaches is that existing components must be modified. A better solution would be software that establishes
an authenticated, tamper-proof communication channel between two Grid nodes without affecting "Grid-unaware"
firewalls. The complete Grid communication would take place over this tunnel. Our proposal is to use
IPSec AH in conjunction with existing LDAP directories holding the Grid X.509 certificates. IPSec can use the
certificates to set up an authenticated tunnel to the destination Grid node. Firewalls can accept traffic to the
nodes because IPSec implementations are sufficiently secure. Only minor modifications of the firewall
rules are necessary. Another benefit is that "application-unawareness" reduces load on the firewall.
The paper summarizes the advantages of the approach, which is a feasible way to enhance Grid security.
Pawel Jurczyk, Maciej Golenia, Maciej Malawski, Dawid Kurzyniec, Marian Bubak, Vaidy
S. Sunderam
A System for Distributed Computing Based on H2O and JXTA
Abstract:
H2O is a Java-based, component-oriented, lightweight resource-sharing platform for metacomputing [1]. It allows
deployment of services into a container not only by the container owner, but also by any authorized client. As a
communication mechanism, H2O uses RMIX.
JXTA technology is a set of open protocols that allows any connected device on the network to communicate and
collaborate in a P2P manner [2].
The main goal of this work is to build a uniform global computational network using the H2O distributed computing
framework and JXTA P2P technology. This computational network will give users new possibilities in building
and utilizing distributed computing systems: H2O kernels behind firewalls will become accessible, and
group management in JXTA will make it possible to create virtual groups of kernels, enabling
dynamically created ad-hoc collaborations.
Our current implementation of H2O over JXTA allows users to export kernels with JXTA endpoints. This
allows H2O metacomputing applications to run seamlessly across private networks and NATs, using JXTA
as the underlying connection technology. Communication between H2O kernels within the JXTA network was
made possible by adding a JXTA socket provider to RMIX. JXTA socket factories are used by RMIX to enable
remote method invocations in the P2P environment.
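The idea of such a provider can be sketched as follows. The real RMIX provider SPI differs; here the standard java.rmi client socket factory interface merely illustrates the plug-in point, and the step that opens a JXTA connection is hidden behind an assumed helper interface.

    import java.io.IOException;
    import java.io.Serializable;
    import java.net.Socket;
    import java.rmi.server.RMIClientSocketFactory;

    // Illustrative only: RMIX's actual provider interface differs; java.rmi's
    // factory is used here just to show where a transport plugs in.
    public class JxtaClientSocketFactory implements RMIClientSocketFactory, Serializable {

        // Stand-in for the JXTA layer (hypothetical): a real provider would open
        // a net.jxta.socket.JxtaSocket to the peer advertising the endpoint.
        public interface PeerSocketOpener {
            Socket open(String peerId, int port) throws IOException;
        }

        private final transient PeerSocketOpener opener;

        public JxtaClientSocketFactory(PeerSocketOpener opener) {
            this.opener = opener;
        }

        public Socket createSocket(String host, int port) throws IOException {
            // "host" names a JXTA peer/pipe rather than a DNS host, so calls
            // can traverse NATs and firewalls via JXTA relays.
            return opener.open(host, port);
        }
    }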
At present, we focus our work on the discovery of H2O kernels. We plan to create a service that holds information
about kernels currently registered in the local and JXTA networks. Next, we will elaborate a mechanism for
measuring network delay times from the Name Service to H2O kernels.
References
[1] D. Kurzyniec, T. Wrzosek, D. Drzewiecki, and V. S. Sunderam, "Towards Self-Organizing Distributed
Computing Frameworks: The H2O Approach", Parallel Processing Letters, vol.13, no.2, pp. 273--290, 2003.
[2] The JXTA Project, http://www.jxta.org/
Tomasz Gubala, Marian Bubak, Maciej Malawski, Katarzyna Rycerz
Abstract Workflow Composition in K-WfGrid Project Environment
Abstract:
This paper presents a new tool supporting workflow composition for Grid applications [1]. The Workflow
Composition Tool (WCT) is developed to tackle the dynamic workflow problem. In order to make
runtime workflow rescheduling and optimization possible, the tool is designed to compose abstract
workflows of applications. To this end, we define a new building block of abstract workflows: a service class,
which (logically) contains all the published services implementing a certain interface.
As a service class is just a description of an interface - a piece of functionality provided by any service within
the class - the WCT is concerned only with the functional composition of abstract workflows. Each element of a
workflow is added after successfully matching the requirements with the (functional) capabilities of a particular
service class. Multiple resultant abstract workflows may arise when many service
classes conform to a certain set of requirements.
The main input to the WCT is a description of the data (results) which should be produced by the future
application. It is also possible to upload an incomplete workflow as input, to be completed. The main output of
the composition process is a description of several abstract workflows, each in a distinct document. During its
operation the tool extensively uses an external service registry which provides descriptions of available service
classes (implemented service interfaces). The WCT contacts the registry and queries it in order to obtain
the descriptions of interest.
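A minimal sketch of this backward, functional composition might look as follows; all type and method names are hypothetical illustrations, not the actual WCT or registry API.

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    // Hypothetical sketch: start from the requested result type, find service
    // classes able to produce it, then recurse on their input requirements.
    class Composer {
        interface ServiceClass {
            List requiredInputs();              // data types it needs first
        }
        interface Registry {
            List findProducers(String dataType); // matching service classes
        }

        private final Registry registry;
        Composer(Registry registry) { this.registry = registry; }

        // Returns one candidate abstract workflow (a nested list of service
        // classes) per service class able to produce the requested data type.
        List compose(String resultType) {
            List alternatives = new ArrayList();
            for (Iterator it = registry.findProducers(resultType).iterator(); it.hasNext();) {
                ServiceClass sc = (ServiceClass) it.next();
                List workflow = new ArrayList();
                workflow.add(sc);
                for (Iterator in = sc.requiredInputs().iterator(); in.hasNext();) {
                    // Each unsatisfied input spawns a recursive composition step.
                    workflow.add(compose((String) in.next()));
                }
                alternatives.add(workflow);
            }
            return alternatives;
        }
    }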
This paper discusses the most important issues and problems related to the process of abstract
workflow composition in the Grid. It also shows how such a tool may cooperate within a wider environment of
workflow construction and execution, on the example of the European K-WfGrid project [2].
References
[1] M. Bubak, T. Gubała, M. Kapałka, M. Malawski, K. Rycerz, Workflow Composer and Service Registry for
Grid Applications. FGCS - Future Generation Computer Systems, accepted.
[2] K-WfGrid: www.kwfgrid.net
Piotr Grabowski, Bartosz Lewandowski, Jarek Nabrzyski
Access to Grid Services for Mobile Users
Abstract:
The article examines the problem of giving Grid users the possibility to access their applications and
resources from any place, using mobile devices. In our approach the devices (mobile phones,
PDAs) are incorporated as clients of Grid services. Moreover, because of the well-known limitations of mobile
devices and the "heavy weight" of the protocols, standards and technologies used in Grids, our approach
introduces a gateway between the client and the Grid. This central point in our architecture is called the Mobile
Command Center (MCC). The MCC is written as a portlet with separate presentation layers for mobiles
and standard web browsers - this allows us to reuse portal services. Within the aforementioned model,
communication between clients and the gateway is performed over the HTTP protocol (native for Java 2 Micro
Edition (J2ME) enabled mobile devices) in a client/server architecture. The communication between the
gateway and Grid services is also performed in a client/server architecture, where the MCC acts as a client
whose requests are served by GSI-secured Web Services on the Grid side.
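On the device side, this model reduces to plain HTTP calls. A minimal J2ME sketch follows; the gateway URL and its query interface are assumptions for illustration.

    import java.io.InputStream;
    import javax.microedition.io.Connector;
    import javax.microedition.io.HttpConnection;

    // Minimal client side of the MCC model: the MIDlet talks plain HTTP to the
    // gateway; URL and parameters are illustrative, not the real MCC interface.
    public class MccClient {
        public String fetchJobList(String sessionId) throws Exception {
            HttpConnection conn = (HttpConnection) Connector.open(
                "http://portal.example.org/mcc?action=listJobs&session=" + sessionId);
            try {
                conn.setRequestMethod(HttpConnection.GET);
                InputStream in = conn.openInputStream();
                StringBuffer sb = new StringBuffer();
                int c;
                while ((c = in.read()) != -1) sb.append((char) c);
                return sb.toString();   // compact, device-friendly representation
            } finally {
                conn.close();
            }
        }
    }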
The gateway (which is aware of the mobile client's limitations, presented during an initial handshake)
gathers the requested information from different Grid services, adapts it to the client's needs and abilities, and sends it
back to the mobile device. The Grid services accessible from the gateway can be divided into two
groups. The first group consists of standard GridLab grid services, whose responses have to be translated
inside the MCC. This group can be represented by the GridLab Resource Management System (GRMS), which
is used for application/simulation steering. The second group consists of services dedicated to mobile
devices. An example of these is the Visualization Service for Mobiles (VSfM), which can produce visualizations
(scaled down in resolution and color depth) that can be displayed on mobile device screens. Another service
accessible from the mobile device via the MCC gateway is the Message Box Service (MBS). It is used for
storing, sending and managing different kinds of messages/notifications for users. It can also be used for
sending notifications from different grid applications and for registering new visualizations produced by the GridLab
Visualization Service. A newly registered visualization can then be accessed from the mobile device with the
use of the MCC and the VSfM.
Pawel Plaszczak, Chris Wilk, Matt Miller
Adapting Insecure Applications to Grid Environment Based
on the Experience of Enhancing DataTurbine with the GSI Security Layer
Abstract:
RBNB DataTurbine from Creare is a dynamic data server providing high-performance data streaming and data
stream dispatching. So far, DataTurbine has mostly been used inside firewalls, where performance had priority
over security.
To cope with the open grid requirements of the NEESgrid project, Gridwise Technologies helped to integrate
GSI security functionality with DataTurbine. Using X.509 certificates and the GSI scheme, users can
now securely stream data between the servers belonging to their virtual organization (VO). What is more, the
new Grid security layer is transparent, and users with properly configured certificates can operate their old
streaming applications exactly as they did before.
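Since GSI is an extension of the GSS-API, the flavor of such an integration can be sketched with the standard Java GSS classes. The service name below is a placeholder, and a real GSI deployment would supply a GSI-specific GSSManager and credentials; this is a sketch, not the actual integration code.

    import org.ietf.jgss.GSSContext;
    import org.ietf.jgss.GSSManager;
    import org.ietf.jgss.GSSName;

    public class SecureStreamSetup {
        public static GSSContext initiate(String server) throws Exception {
            GSSManager manager = GSSManager.getInstance();
            GSSName peer = manager.createName(server, GSSName.NT_HOSTBASED_SERVICE);
            GSSContext ctx = manager.createContext(
                peer, null, null, GSSContext.DEFAULT_LIFETIME);
            ctx.requestMutualAuth(true);   // both sides present certificate-backed credentials
            ctx.requestConf(true);         // optional privacy for streamed frames
            byte[] token = new byte[0];
            // Tokens produced here are shuttled over the existing connection,
            // so legacy streaming code above this layer stays unchanged.
            token = ctx.initSecContext(token, 0, token.length);
            // ...loop, exchanging tokens with the acceptor, until ctx.isEstablished()
            return ctx;
        }
    }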
In this paper we analyze our development and integration work, trying to answer the question: how much
effort is needed to turn a typical insecure application into a grid-aware one while introducing as few changes
as possible to the installations already in operation?
Hong-Linh Truong, Bartosz Balis, Thomas Fahringer, Marian Bubak
Adaptive and Integrated Monitoring System for Knowledge Grid Workflows
Abstract:
Given the dynamics and diversity of the Grid, Grid monitoring systems have to collect and handle diverse
types of data. Moreover, they must be capable of self-management in order to cope with the frequently
changing structure of the Grid. Although tremendous effort has been spent on developing monitoring systems for
the Grid, existing monitoring systems still support only limited types of sensors and do not focus on self-management
aspects.
We present an adaptive and integrated monitoring framework which is intended to collect and deliver arbitrary
monitoring information about a variety of resources and workflow applications. The monitoring system follows the
architecture of sensor networks and peer-to-peer systems. Sensors are adaptive and controllable; they use
rules to control the collection and measurement of monitoring data and to react appropriately to
changes in the underlying systems. It is also possible to enable, disable or change the behavior of sensors via an
external request. Both event-driven and demand-driven sensors are supported. Sensor managers are
organized into a super-peer model, and monitoring data is stored in distributed locations. Thus, the monitoring
system can cope with the dynamics of the Grid, and monitoring data is widely disseminated and highly
available.
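The sensor contract suggested by this design can be sketched as follows; all interfaces are hypothetical illustrations, not the framework's actual API.

    // Hypothetical sketch: rule-driven collection, external control, and
    // both delivery modes, as described above.
    interface Rule {}                 // e.g. sampling rate, thresholds (illustrative)
    interface MonitoringData {}
    interface EventConsumer { void deliver(MonitoringData data); }

    interface Sensor {
        void enable();                // sensors can be switched on...
        void disable();               // ...and off via external requests
        void applyRule(Rule rule);    // rules steer collection and measurement
    }

    interface EventDrivenSensor extends Sensor {
        void subscribe(EventConsumer consumer);   // push on event occurrence
    }

    interface DemandDrivenSensor extends Sensor {
        MonitoringData measure();                 // pull on explicit demand
    }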
The monitoring framework unifies diverse types of monitoring data, such as performance measurements of
applications, status of hardware infrastructure components, workflow execution status, etc., in a single system.
A generic sensor library supporting adaptive and controllable, event-driven and demand-driven sensors, and
an event infrastructure will be developed so that any client can publish and subscribe to certain events which
may occur in monitored entities. Clients will discover the monitoring service which provides the data of
interest, and will be able to query and subscribe to a variety of types of data through a uniform interface.
K. Bertels, K. Sigdel, B. Pour Ebrahimi, S. Vassiliadis
Adaptive Matchmaking in Distributed Computing
Abstract:
The sharing of resources is a main motivation for constructing distributed systems in which multiple computers
are connected by a communication network. The problem in such systems is how resource allocation should
be done in the case where some resources are lying idle and could be linked with other overloaded nodes in
the network. This assumes some kind of matchmaking, i.e. the process of finding an appropriate provider of
resources for a requester. Given the variety of approaches to matchmaking, it is important to be able to
determine the conditions under which particular mechanisms perform better than others. A framework
that defines a set of criteria can be used to assess the usefulness of a particular mechanism. Examples of
such criteria are scalability, flexibility, robustness and throughput. Previous research of matchmaking
mechanisms shows that systems having completely centralized or completely localized mechanisms each
have their deficiencies. On the continuum from centralized to p2p mechanisms we are interested in designing
a mechanism that enables the network to change its internal matchmaking mechanism from p2p to a more
centralized form or vice versa whenever that is required. This should allow the distributed system to adapt itself
dynamically to changing conditions and always have the most appropriate infrastructure available. Evidently,
the framework mentioned before will provide the necessary check points to induce such modifications to the
infrastructure. This approach boils down to, for instance, the idea of multiple matchmakers, where the entire
system can be partitioned into segments such that every segment has a matchmaker. Agents can then interact
with each other in a local neighborhood, and matchmaking in different segments can take place in parallel.
Toward this aim, we need to find the conditions under which either approach is best suited. Previous studies
on centralized matchmaking show that there exists a certain population size beyond which no further
improvement is generated and below which matching efficiency goes down.
The influence of the population size of each segment should therefore be studied in view of matching
rate, matching time and matching quality. Different matching functions, varying in complexity and in the
information used for matching, should be evaluated based on the above-mentioned criteria. The current research also
takes into account varying circumstances under which the network operates. Events such as highly unequal
task and resource distribution, communication failures, etc., are introduced and studied. The current research
will involve large-scale experiments using environments such as PlanetLab and Globus. These platforms provide
infrastructures enabling extensible, cooperative, and secure resource sharing across multiple domains
through the concept of virtual machines.
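The adaptation rule under study can be sketched as follows; the types are illustrative stand-ins, and the thresholds are placeholders for the very conditions the research aims to determine empirically.

    // Hypothetical sketch: a segment matches locally while its population stays
    // inside an efficient range, and defers to a central matchmaker otherwise.
    class AdaptiveSegment {
        interface Request {}
        interface Match {}
        interface Matchmaker { Match match(Request request); }

        private static final int MIN_LOCAL = 50;    // too few partners below this
        private static final int MAX_LOCAL = 5000;  // local matching degrades above
        private final Matchmaker local;
        private final Matchmaker central;

        AdaptiveSegment(Matchmaker local, Matchmaker central) {
            this.local = local;
            this.central = central;
        }

        Match find(Request request, int population) {
            if (population >= MIN_LOCAL && population <= MAX_LOCAL) {
                return local.match(request);   // p2p-style, in-segment matching
            }
            return central.match(request);     // fall back to the centralized form
        }
    }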
Marcin Pilarski
Adaptive Services Grid towards PlanetLab Research in Poland
Abstract:
The goal of the Adaptive Services Grid (ASG) project is to develop a prototype of an open development platform for adaptive
service registration, discovery, composition, and enactment. ASG aims at creating an open development
platform for service composition and enactment rather than another computational grid infrastructure. This
paper describes the forthcoming use case scenarios and the motivation for using PlanetLab as a
testbed. PlanetLab is a geographically distributed platform for deploying planetary-scale network services.
Users get access to one or more isolated "slices" of PlanetLab's global resources via a concept called
distributed virtualization. Aspects such as the historical timeline, current status and background activity of
PlanetLab research will also be presented.
Bartlomiej Balcerek, Tomasz Kowal
Advanced Authentication Service for Complex Grid Environments
Abstract:
We present a centralized authentication service for Globus-based grids, initially developed for the SGIGrid project.
The approach is to deliver advanced authentication-related remote methods for grid users and services.
Through an integrated authentication server, we propose a centralized credential repository and service for
grid environments supplied with Globus' GSI, as a replacement for the authentication data dispersion introduced by the GSI
standard. This approach enables easy management and maintenance of user credentials and makes it
possible to deliver advanced authentication methods for standard and non-standard grid entities, like RAD,
VLAB or VUS, which are developed within the SGIGrid project. The server is implemented in Web Service
technology and allows invoking methods such as the generation of proxy certificates or the verification of user
credentials.
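A hypothetical Java view of the remote methods such a server might expose; names and signatures are illustrative only, not the actual service interface.

    // Illustrative sketch of the Web Service operations described above.
    interface AuthService {
        // Generates a short-lived proxy certificate for the authenticated user.
        byte[] generateProxyCertificate(String userDn, int lifetimeSeconds);

        // Verifies the presented credential chain against the central repository.
        boolean verifyCredentials(byte[] certificateChain);
    }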
References:
1. SGIgrid: Large-scale computing and visualization for virtual laboratory using SGI cluster (in Polish), KBN
Project, http://www.wcss.wroc.pl/pb/sgigrid/
2. Tomasz Kowal, Bartlomiej Balcerek, SGIGrid project report: "Authorization and authentication server
Authproxyd" (in Polish)
Wlodzimierz Funika, Marcin Koch, Allen D. Malony, Sameer Shende, Marcin Smetek, Roland Wismüller
An Approach to the Performance Visualization of Distributed Applications
Abstract:
A key issue in designing and deploying effective distributed applications is the ability to monitor the execution
and measure its performance. A standard approach to this problem, based on pre-execution instrumentation
and post-execution measurement, data analysis and visualization, produces huge volumes of monitoring data
whose processing is very time-consuming. As a result, there is a distinct need for performance analysis
systems that can perform on-the-fly monitoring and also allow application control and interaction at run-time.
J-OCM is one of the few monitoring systems that provide these features. It can supply performance data from
a distributed application to any tool which requests the services provided by J-OCM via J-OMIS, a Java-related
extension to the OMIS interface specification. Such an approach allows J-OCM to co-operate with many tools
and evolve separately, without loss of compatibility. Up to now there have been no complete performance
monitoring tools using J-OCM for performance analysis. As a result, it has been impossible to make full
use of on-line application monitoring for performance measurement and analysis. On the other hand, there
exist very powerful tools for computational steering, such as SCIRun, that apply advanced visualization of
complex data in real-time.
The paper presents an approach to visualizing performance data provided by J-OCM. We use the TAU
performance system with the Paravis performance visualization package built within the SCIRun environment.
TAU has mainly been used on single parallel systems for both post-execution and online analysis. Here we
aim at making TAU/Paravis collaborate with J-OCM. The TAU/Paravis package consists of several modules which
are used to build a complete visualization tool. Such a component approach makes the whole system very
flexible and easy to extend. We use this ability to extend the package.
We implemented additional modules placed between J-OCM and TAU/Paravis. These modules form a
separate SCIRun package. Because TAU and J-OCM are data-incompatible, the modules provide the
necessary data conversion. Additionally, they allow the developer to choose which elements of the application are
to be monitored. The scope of performance visualization may comprise selected nodes of the distributed
system, JVMs, threads, classes and methods. There are modules to gather different kinds of performance
data, such as thread synchronization, method execution times, dependencies in method calls, and communication
(Remote Method Invocation - RMI). Data can be summarized over a time interval or can be momentary.
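The conversion step can be sketched as follows; the J-OCM and TAU types shown are illustrative stand-ins, not the real APIs of either system.

    // Hypothetical sketch of the intermediate modules' job: filter the selected
    // elements, then map J-OCM events onto TAU-style profile records.
    class JocmToTauConverter {
        interface JocmEvent {
            String methodName();
            int threadId();
            double duration();
        }
        interface MonitoringFilter { boolean accepts(JocmEvent event); }
        static class TauRecord {
            final String name; final int thread; double exclusiveTime;
            TauRecord(String name, int thread) { this.name = name; this.thread = thread; }
        }

        private final MonitoringFilter filter;  // selects nodes/JVMs/threads/methods
        JocmToTauConverter(MonitoringFilter filter) { this.filter = filter; }

        TauRecord convert(JocmEvent event) {
            if (!filter.accepts(event)) return null;   // element not selected
            TauRecord record = new TauRecord(event.methodName(), event.threadId());
            record.exclusiveTime = event.duration();
            return record;
        }
    }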
Finally, we present an extension towards the monitoring of distributed Java applications based on web
services. Web services are becoming a more and more popular paradigm of distributed programming,
and this is why the ability to measure their performance is so important. We intend to check the adaptability of J-OCM-based
monitoring tools to environments in which web services are provided.
References:
- W. Funika, M. Bubak, M.Smętek, and R. Wismüller. An OMIS-based Approach to Monitoring Distributed
Java Applications. In Yuen Chung Kwong, editor, Annual Review of Scalable Computing, volume 6, chapter 1.
pp. 1-29, World Scientific Publishing Co. and Singapore University Press, 2004.
- A. D. Malony, S. Shende, and R. Bell, "Online Performance Observation of Large-Scale Parallel
Applications", Proc. Parco 2003 Symposium, Elsevier B.V., Sept. 2003.
- C. Johnson, S. Parker, "The SCIRun Parallel Scientific Computing Problem Solving Environment" Proc.
Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999.
Marian Bubak, Piotr Nowakowski
Analysis of Implementation of Next Generation Grids Visions in FP6 Grid Technology Projects
Abstract:
A group of experts in the area of Grid technologies and their applications has presented its vision of the directions
of research in this field in two previous reports [1,2]. These reports, starting with an analysis of requirements
and usage scenarios, present a kind of research agenda and a list of the most important topics from three
points of view: end-user, architectural, and software.
Following the slew of Grid projects implemented as part of the IST priority within the 5th Framework Program,
the EC has decided to carry on with Grid-related research work as part of the 6th FP, starting in September
2004 [3]. These Grid research projects, 12 in all, can broadly be divided into the following groups:
- underpinning research and Grid architectures,
- applications driving business growth and scientific innovations,
- new technologies for problem solving,
- building the European research area.
The paper aims at presenting the emerging trends in European Grid research. An analysis of the adherence of
the objectives declared by these new projects to the Grid Expert Group recommendations and proposed
research agenda is presented.
References
[1] Next Generation Grid(s). European Grid Research 2005-2010. Expert Group Report, June 2003,
http://www.cordis.lu/ist/grids/
[2] Next Generation Grids 2. Requirements and Options for European Grids Research 2005-2010. Expert Group
Report, July 2004, http://www.cordis.lu/ist/grids/
[3] European Grid Technology Days 2004, http://www.nextgrid.org/events/
Jarek Nabrzyski
Building Grid Infrastructures and Developing Grid Applications with the GridLab Toolkit
Abstract:
The GridLab project is one of the biggest European research and development undertakings in application tools and
middleware for Grid environments. GridLab is funded by the European Commission under the 5th Framework Programme. GridLab will produce a
set of application-oriented Grid services and toolkits providing capabilities such as dynamic resource brokering, monitoring, data
management, security, information, adaptive services and more. Services are accessed using the Grid Application Toolkit (GAT). The GAT provides
applications with access to various GridLab services, resources, specific libraries, tools, etc., in such a way that end-users, and
especially application developers, can build and run applications on the
Grid without needing to know details about the runtime environment in advance. Applications use the GAT through a fixed GAT API.
All GridLab technologies fit into the GridLab architecture which
defines a cleanly layered environment. On the highest layer (called User Space) there is GAT (Application oriented high level API to
complex and dynamic Grid Environments) and GridSphere (Grid-Portal development framework). The Middleware layer (called Capability Space)
covers the whole range of Grid capabilities as required by applications, users and administrators, such as: GRMS (Grid Resource
Management and Brokering Service), Data Access and Management (Grid Services for data management and access), GAS (Grid Authorization
Service), iGrid (GridLab Information Services), Delphoi (Grid Network Monitoring & Performance Prediction Service), Mercury (Grid Monitoring
infrastructure), Visualization (Grid Visualization Services), Mobile Services (Grid Services supporting wireless technologies). GridLab
technologies, on the one hand, help real end-users to develop and run their grid-enabled applications; on the other hand, they help grid
infrastructure designers to build fully production-grade grid environments. In this presentation we will show the strengths and the methodologies
of using the GridLab Toolkit.
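The flavor of GAT-based access can be sketched as follows, assuming the Java GAT binding published by GridLab; the URIs are placeholders and exact signatures may differ from the release in use.

    import org.gridlab.gat.GAT;
    import org.gridlab.gat.GATContext;
    import org.gridlab.gat.URI;

    // Minimal sketch: copy a remote file without hard-coding the transport.
    public class GatCopy {
        public static void main(String[] args) throws Exception {
            GATContext context = new GATContext();
            org.gridlab.gat.io.File file = GAT.createFile(
                context, new URI("gsiftp://host.example.org/data/input.dat"));
            // The GAT engine picks a suitable adaptor (GridFTP, HTTP, local...)
            // at runtime, so application code stays middleware-neutral.
            file.copy(new URI("file:///tmp/input.dat"));
            GAT.end();
        }
    }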
J. Jurkiewicz, P. Bala
Building Grid Services Using UNICORE Middleware
Abstract:
Grid middleware is designed to provide access to remote high-performance resources. The most important
paradigm is seamless access to the resources. The user is offered a number of tools, ranging from sets of
scripts to graphical applications launched on the client workstation. The latter approach motivated the
development of the UNICORE grid middleware. As a result, the user obtains the advanced UNICORE Client, which
allows the user to access distributed resources.
A significant advantage of the UNICORE Client is its flexible graphical user interface which, through plugin
technology, allows for easy development of application-specific interfaces. The CPMD, Gaussian, Amber,
Gamess and Database Access plugins are good examples.
Unfortunately, this technology cannot be used in the Web Services environment, and adaptation of the
application-specific interfaces, if not done properly, would be a time-consuming task. In order to overcome these
disadvantages, we have developed web services tools to access the UNICORE grid. The main idea of the
presented approach is to move the application-specific interface (plugin) from the UNICORE Client to the web
services environment. The solution is based on a portal hosting environment which allows access to resources
in a way similar to the UNICORE Client.
The user obtains similar functionality; however, access to the distributed resources is easier and hides
unnecessary details. The presented approach brings grid services functionality to the UNICORE middleware.
At the moment, the main communication protocol is based on the existing components, in particular the UNICORE
Protocol Layer (UPL), adopting the Abstract Job Object paradigm.
In the future, the standard web services protocols (SOAP-based) will be used, which will provide high
interoperability with other grid-services-based solutions such as Globus.
This work is supported by the European Commission under the IST UniGrids project (No. 004279). The financial
support of the State Committee for Scientific Research is also acknowledged.
Jiri Sitera, Ludek Matyska, Ales Krenek, Miroslav Ruda, Michal Vocu, Zdenek Salvet, Milos Mulac
Capability and Attribute Based GRID Monitoring Architecture
Abstract:
The Grid Monitoring Architecture (GMA), as proposed by the GGF, is an architecture specification for a general Grid
monitoring infrastructure. While designed as a basis for interoperable systems, most actual implementations
do not allow the inclusion of third-party infrastructure elements. We propose a GMA extension adding support for
(virtual) overlay infrastructures, through explicit meta-description of data requirements and infrastructure
element capabilities.
The extension consists of three main parts: data attributes, infrastructure component capabilities and a match-making
process. The key part of our proposal is the idea that all data (events) should have associated
attributes, i.e., a metadata description of the required behavior of the monitoring infrastructure when processing the
data. The metadata description is thus extended to describe not only data types and structures but also
constraints on data handling and usage. The data attributes are complemented by component capabilities,
which describe infrastructure element features and specific behavior for data processing.
The extended GMA also defines a match-making process which virtually connects data with components while
taking into account both the expressed data requirements and the declared component capabilities. This match-making
functionality is part of a generalized directory service. The data are thus guaranteed to be seen and
manipulated only by those parts of the infrastructure that are capable of appropriate processing or that can
guarantee trust, persistence or other requirements.
A major advantage of the proposed extension, called capability-based GMA (CGMA), is that components with
different capabilities and features can coexist and provide their services relatively independently, but under one
unifying framework (which can be seen as a meta-infrastructure). The goal is to provide a way to
incorporate and use specialized and optimized components rather than to look for ways to create components
general enough to fit all, often contradictory, needs.
In CGMA, data sources describe not only the types of data they deal with but also the conditions under
which the data may be passed to the infrastructure. Producers adopt data attributes and register them
together with their own capabilities, while consumers register their capabilities as part of their search for
appropriate data sources. The directory (its match-making part) uses attributes and capabilities to find
appropriate components for proper consumer/producer interaction. As more complex infrastructure elements,
with both consumer and producer capabilities, are added, the same process is able to create a path for "event
flow" through the infrastructure. This way, specific overlays on top of the monitoring infrastructure are created
and maintained.
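The core of the match-making can be sketched as a coverage test; here attributes and capabilities are reduced to string tokens for illustration, whereas the actual proposal attaches structured metadata to events and components.

    import java.util.Iterator;
    import java.util.Set;

    // Hypothetical sketch: a component qualifies only if it can honor every
    // attribute (data-handling requirement) attached to the data.
    class MatchMaker {
        boolean matches(Set dataAttributes, Set componentCapabilities) {
            for (Iterator it = dataAttributes.iterator(); it.hasNext();) {
                String required = (String) it.next();
                // e.g. "persistent", "trusted", "encrypted-transport"
                if (!componentCapabilities.contains(required)) {
                    return false;   // component cannot honor a data requirement
                }
            }
            return true;
        }
    }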
The proposal is based on experience gained during the development of the Logging and Bookkeeping (LB) service for
the EU DataGrid and, recently, EGEE projects. We will show the motivation and key advantages of the extended
monitoring architecture on real LB use cases, explaining how it can solve the deficiencies of current GMA
implementations such as R-GMA.
The paper concludes with a proposal for a capability-based GMA prototype. The proposal builds on the R-GMA
relational model and extends its mediator with match-making capabilities. This implementation will be
shown to cover all the LB use cases, while leaving space for co-existence with other GMA implementations,
including R-GMA itself.
Radoslaw Januszewski, Gracjan Jankowski, Rafal Mikolajczak
Checkpoint/Restart Mechanism for Multiprocess Applications Implemented under SGI Grid Project
Abstract:
One of the most needed tools for achieving high availability and fault tolerance in Grid
computing in the HPC and HTC areas is a checkpoint/restart and migration mechanism. This mechanism
can be used when there is a need to turn off a production system for maintenance purposes or when the
system fails. Another very important issue is the ability to perform dynamic load balancing between computational
nodes. To achieve process migration in distributed systems, a migration-aware checkpointing
mechanism must be present on the nodes. There are a few system-level checkpoint/restart tools available for
commercial operating systems (e.g. for IRIX and UNICOS); moreover, there are some packages for 32-bit Linux
systems as well.
The paper describes the checkpoint/restart package which was developed as a part of the SGIgrid project
("High Performance Computing and Visualization with the SGI Grid for Virtual Laboratory Applications", project
no. 6 T11 0052 2002 C/05836). The project was co-funded by the State Committee for Scientific Research and
SGI.
We present the architecture of the tools, which were developed on an SGI Altix 3300 server system with four
Intel Itanium2 processors working under a Linux OS based on kernel version 2.4. Our package provides kernel-level
checkpointing. It consists of a set of tools and kernel modules that allow saving the state of a job and
restarting it later.
Additionally, we describe in detail solutions to some interesting issues that we encountered during this project,
e.g. the virtualization mechanism which ensures coherency of the recovered application, its surrounding
environment and resources. The solutions are considered in two aspects: as a general checkpoint/restart
problem, and as implementation problems pertaining to Itanium limitations and features. The document also covers
some portability issues in the context of virtualization and checkpointing mechanisms.
The checkpointing package, designed for kernel version 2.4.20-sgi220r3, is available for download. The
package implements the following functionality:
- support for single- and multi-process applications
- support for 32- and 64-bit applications
- support for interprocess communication
- virtualization mechanisms solving migration issues
Key Words: Grids, checkpointing, system architecture, resource virtualization.
Pawel Plaszczak, Dominik Lupinski
Choosing and Integrating Various Grid Resource Managers with GT4 - the Early Experience
Abstract:
This presentation serves as an introduction to one of our research projects related to grid technologies
available on today's market, as well as those that are about to become ready. The paper documents early
experiences in integrating Globus Toolkit 4 with state-of-the-art industry schedulers.
In the long term, this research will give us insight into heterogeneous Grid solutions for
computationally intensive industrial applications. As of the date of the conference, however, we are
going to present our results as work in progress.
First, we portray the environment used in conducting this research. The set of virtualized
computers has been created with the "User Mode Linux" software.
Next, we depict the operation of compute clusters functioning under various resource management
schemes. Our research covers scheduling systems such as PBS/Torque and Sun N1 Grid Engine.
Then, we put emphasis on the effort necessary to integrate those schedulers. The beta version of the
Globus Toolkit 4 will be evaluated as the solution, with stress on the latest GRAM version, also
compared to the alternative solution from Globus 2.4.
As the last integration step, a central installation of the Community Scheduler Framework (CSF) will be
evaluated as a final layer interacting with the end-users of this set of VO resources.
Lukasz Kuczynski, Konrad Karczewski, Roman Wyrzykowski
CLUSTERIX Data Management System
Abstract:
Nowadays grid applications deal with large volumes of data. This creates the need for effective data management
solutions. For the CLUSTERIX project, CDMS (the CLUSTERIX Data Management System) is being
developed. User requirements and an analysis of existing implementations have been the foundation for the
development of CDMS. Special attention has been paid to making the system user-friendly and efficient,
allowing for the creation of a robust Data Storage System.
Taking into account grid-specific networking conditions - differing bandwidth, current load and network
technologies between geographically distant sites - CDMS tries to optimize data throughput via replication and
replica selection techniques. Another key feature to be considered during grid service implementation is fault
tolerance. In CDMS, the modular design and distributed operation model assure the elimination of single points of
failure. In particular, multiple instances of the Data Broker run simultaneously, and their coherence is
assured by a synchronization subsystem.
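The replica selection idea can be sketched as follows; the types are illustrative stand-ins, not the CDMS API.

    import java.util.Iterator;
    import java.util.List;

    // Hypothetical sketch: among the replicas of a file, pick the one with the
    // best currently estimated throughput towards the requesting site.
    class ReplicaSelector {
        interface Replica { String site(); }
        interface NetworkMonitor { double estimatedThroughput(String from, String to); }

        Replica selectBest(List replicas, NetworkMonitor monitor, String clientSite) {
            Replica best = null;
            double bestRate = -1.0;
            for (Iterator it = replicas.iterator(); it.hasNext();) {
                Replica r = (Replica) it.next();
                // Estimate combines link bandwidth, current load and technology.
                double rate = monitor.estimatedThroughput(r.site(), clientSite);
                if (rate > bestRate) { bestRate = rate; best = r; }
            }
            return best;
        }
    }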
E. Imamagic, B. Radic, D. Dobrenic
CRO-GRID Infrastructure: Project Overview and Perspectives
Abstract:
We have witnessed the tremendous success of production grids and various grid applications in the last decade.
Motivated by previous and current grid initiatives and the rising demands of the science community for computing
resources, the CRO-GRID project was initiated. CRO-GRID is a national multi-project whose goals are the introduction
of grid and cluster technologies to the science community and industry, and the development of a distributed computing
environment and grid applications.
The CRO-GRID multi-project consists of three projects: CRO-GRID Infrastructure, CRO-GRID Mediator and CRO-GRID
Applications. CRO-GRID Mediator develops a distributed middleware system based on the latest Web Services
Resource specifications.
CRO-GRID Applications is responsible for the design and implementation of applications using the CRO-GRID
Mediator system and existing grid middleware, programming environments and tools.
CRO-GRID Infrastructure is responsible for building the underlying cluster and grid infrastructure for the two other
projects to use. Furthermore, CRO-GRID Infrastructure will assure infrastructure maintenance and optimization
and provide support to grid- and cluster-related projects. The status of the CRO-GRID Infrastructure project is
described below.
In the first six months, a thorough investigation of existing cluster technologies was completed. The investigation
focused on particular cluster subsystems (such as job management systems, monitoring tools, etc.) and
was followed by the selection of the most appropriate cluster technologies. The selected cluster technologies were
implemented on five clusters placed in scientific centers and universities in four cities. In coordination with the
Giga CARNet project, the clusters were connected with gigabit links.
Currently we are analyzing contemporary grid technologies and projects. We plan to finish the grid
technology evaluation and decide which technologies will be used by the end of this year. At the same time we will
implement basic grid services on the existing clusters.
Afterwards we will start upgrading grid functionalities based on the specific needs of the CRO-GRID
Applications project. In the following two years, we will focus our work on grid and cluster maintenance and
optimization, community outreach and linking with international initiatives.
Patryk Lason, Andrzej Ozieblo, Marcin Radecki, Tomasz Szepieniec
CrossGrid and EGEE Installations at Cyfronet: Present and Future
Abstract:
In keeping with the trend toward grid technology, Cyfronet is actively involved in recent EU Grid projects. In
these projects the centre undertakes a variety of activities, which include providing a testbed for software
developers, maintenance of computing resources and support for partners across Europe. Moreover, many
significant responsibilities lie with Cyfronet's teams, e.g. guaranteeing an appropriate level of services, ensuring
the proper operation of the computing infrastructure in the region and certifying new resources as they are deployed.
To fulfil these mandates, a considerable hardware infrastructure has been established, including 40 dual-processor
Intel Xeon-based nodes (IA32) and 20 dual Itanium-2-based nodes. Currently, the IA32
processors are used by the CrossGrid and LCG projects, while the IA64 part is being utilized for the initial
installation for the EGEE project. In the future we plan to merge all the computing resources into one resource
pool to be used by a range of applications.
During the last years we have gained much experience, which allows us to play a key role in the propagation of
Grid technology in our region. The plans are to improve comprehension among scientists of the advantages and
possible profits that can be gained from grid technology.
Renata Slota, Lukasz Skital, Darin Nikolow, Jacek Kitowski
Data Archivization Aspects with SGI Grid
Abstract:
The SGI Grid project [1] aims to design and implement broadband services for remote access to expensive
laboratory equipment, a backup computational center and a remote data-visualization service. Virtual Laboratory
(VLab) experiments and visualization tasks very often produce huge amounts of data, which have to be stored
or archived. Data archivization aspects are addressed by the Virtual Storage System (VSS). Its main goal is to
integrate storage resources distributed among computational centers into a common system and to provide a
storage service for all SGI Grid applications. VSS can use a variety of storage resources, like databases, file
systems, and tertiary storage (tape libraries and optical jukeboxes managed by Hierarchical Storage Management -
HSM - systems).
In the paper we present the data archivization functionalities of the VSS, its architecture and user interfaces.
VSS offers standard operations which let the user store and retrieve data files organized in meta-directories.
Besides these standard operations, some archivization-specific functionalities are implemented: access time
estimation, file fragment access, file ordering and fast access to large files residing on tapes. Access time
depends on the storage resource type and state; it can vary from milliseconds to minutes, or even tens of minutes,
for data archived in tape libraries. VSS offers Access Time Estimation for HSM systems, which provides the
approximate access time to a given file. The user can order a file or file fragment, which means informing VSS when
he will need to access the file. If the file resides on tapes or other slow media, the system will copy it to a cache,
which will minimize the access time to the file. Access to large files residing on tapes is accelerated by
incorporating file fragmentation. Automatic replication has been implemented as a method of data access
optimization. Implementation details of VSS are presented in [2].
VSS is equipped with the following user interfaces: a Web portal, a text console, a Java API, and VSSCommander (a
Java API application). The Web portal is the VSS interface which hides all system details and provides an easy way
of using VSS. The text console is an interface for VSS developers; it provides more details than the other interfaces
and is not intended to be used by end users. The Java API is the programmer's interface, which allows easy
development of VSS client applications. VSSCommander is a Java application written using the VSS Java API;
it offers access to VSS in a manner similar to WinSCP or Windows Commander.
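A hypothetical session against the Java API might look as follows; class and method names are illustrative only, not the actual VSS interface.

    // Illustrative sketch combining the archivization-specific operations:
    // access time estimation, file ordering, and fragment access.
    class VssExample {
        interface VssApi {
            long estimateAccessTime(String path);              // seconds, HSM-aware
            void orderFile(String path, long neededAtMillis);  // request staging
            byte[] readFragment(String path, long offset, int length);
        }

        static byte[] fetchHeader(VssApi vss, String path) {
            // Ask the estimator before deciding how to schedule the access.
            if (vss.estimateAccessTime(path) > 60) {
                // Order the file so it is staged from tape to disk cache in time.
                vss.orderFile(path, System.currentTimeMillis() + 3600L * 1000L);
            }
            // Retrieve only the fragment that is actually needed.
            return vss.readFragment(path, 0L, 4096);
        }
    }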
The VSS, as a part of SGI Grid, is already in the deployment phase and is used as a production system; its
functional and performance tests are also presented.
[1] SGI Grid - http://www.wcss.wroc.pl/pb/sgigrid/en/index.php
[2] D. Nikolow, R. Slota, J. Kitowski, L. Skital, "Virtual Storage System for the Grid Environment", 4th Int.
Conf. on Computational Science, Krakow, June 6-9, 2004, LNCS vol.3036, pp. 458-461.
http://www.cyfronet.krakow.pl/iccs2004/
Ondrej Habala, Marek Ciglan, Ladislav Hluchy
Data Management in the FloodGrid Application
Abstract:
We would like to present the data management tasks and tools used in the flood prediction application of the
CROSSGRID project. The project is aimed at improving support for interactive applications in the Grid.
Apart from providing several tools for the development and support of Grid applications, it also contains a testbed
with several testing applications. One of these is the FloodGrid application.
The grid prediction application of CROSSGRID, called FloodGrid, aims to connect several potential
actors interested in flood prediction - data providers, infrastructure providers and users. The application
consists of a computational core, a workflow manager, two user interfaces and a data management suite. The
project is based on Grid technology, especially the Globus Toolkit 2.4 and 3.2 and the architecture and
technology of the EU DataGrid project.
The application's core is a cascade of several simulation models. At the beginning of the cascade is a
meteorological prediction model, which receives its boundary conditions from outside the cascade and
computes temperature and precipitation predictions. These are then used in the second stage of the cascade -
a hydrological model, which computes flow volume at selected points of the target river basin. This is then
processed in the last stage of the cascade by a hydraulic model, which uses an actual terrain model to
compute water flow in the area. Where the water reaches areas outside of the river basin, a flood is expected.
The computational core is interfaced to the environment and to FloodGrid users by several other components:
the FloodGrid portal, which provides the user interface to the whole application; a workflow service for the control of
the cascade; and a data management suite.
The data management software of FloodGrid is equipped with facilities that support the transport and storage of
input data for the simulation cascade, the cataloguing of all available files in the environment and easy access to
them. It also provides radar imagery and data from measurement stations directly to the FloodGrid portal. It
builds on the software available in the CROSSGRID testbed, which is interconnected with the portal and
augmented with a metadata catalog for easy lookup of needed files. The whole suite consists of this metadata
catalog (in the form of an OGSI grid service), data delivery software and EDG replica management (part of the
testbed). All these parts are connected with the user interface of FloodGrid (the portal) and may be used from
there, as well as automatically by the workflow service and simulation jobs. The metadata catalog is divided
into two parts - a reusable grid service interface and an application-specific relational database. The interface
may be used with any application, provided that the underlying database provides several tables with
information about its structure in a form expected by the interface. The interface itself is accessible via several
service calls for metadata management. Each metadata item describes a file (identified by its GUID from the
EDG replica manager). New metadata items may be added, existing items may be removed, and GUID lookup is
performed by specifying a set of constraints.
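A hypothetical Java view of these service calls might look as follows; the real interface is an OGSI grid service, and these names and the example constraints are illustrative.

    import java.util.List;
    import java.util.Properties;

    // Illustrative sketch of the catalog's constraint-based lookup.
    interface MetadataCatalog {
        void addItem(String guid, Properties metadata);   // describe one file
        void removeItem(String guid);
        // Constraint-based lookup, e.g. {model=hydraulic, date=2004-07-15},
        // returning the GUIDs understood by the EDG replica manager.
        List findGuids(Properties constraints);
    }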
P. Rek, M. Kopec, Z. Gdaniec, L. Popenda, R.W. Adamiak, M. Wolski, M. Lawenda,
N. Meyer, M. Stroinski
Digital Science Library for Nuclear Magnetic Resonance Spectroscopy
Abstract:
The need for (and possibilities of) electronic publishing and presentation of digital content grew along with the
development of the Internet. Nowadays users have the possibility of using more and more
sophisticated tools designed to create, browse and search for electronic documents. These tools perform an
important role in the global information infrastructure, and can benefit the education and scientific
environments.
The Digital Science Library (DSL) is created on the basis of the Data Management System (DMS), which was
developed for the PROGRESS project. Its main functionality, which is storing and presenting data in grid
environments, was extended with functions specific to the requirements of the Virtual Laboratory. In its
concept, the Digital Science Library allows storing and presenting data used by the virtual laboratories. This data
can be input information used for scientific experiments as well as the results of performed experiments.
Another important aspect is the capability of storing various types of publications and documents, which are
often created in the scientific process. This type of functionality, which is well known to all digital library users,
is also provided by the DSL. One of the main examples of using DSL in practice is the Virtual Laboratory of
NMR.
This work presents the general assumptions, project design and implementation of the Digital Science Library
for the purpose of Nuclear Magnetic Resonance (NMR) spectroscopy data. NMR spectroscopy is a unique
experimental technique that is widely used in physics, organic and inorganic chemistry, biochemistry as well
as in medicine.
The analysis of one- or multidimensional homo- and heteronuclear spectra obtained in the course of an NMR
experiment can provide information about the chemical shifts of the nuclei, scalar coupling constants, residual
dipolar coupling constants and the relaxation times T1 and T2. All of these data can be stored in the presented
database, which also offers tools for performing quick and optimal searches through the repository.
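For illustration, a stored record might carry fields such as the following; this is a hypothetical sketch, not the actual DSL schema.

    // Illustrative data record covering the quantities listed above.
    class NmrRecord {
        String nucleus;              // e.g. "1H", "13C"
        double chemicalShiftPpm;     // chemical shift of the nucleus
        double scalarCouplingHz;     // scalar (J) coupling constant
        double residualDipolarHz;    // residual dipolar coupling constant
        double t1Seconds;            // relaxation time T1
        double t2Seconds;            // relaxation time T2
    }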
Compared to the other NMR databases available through the Internet, like BioMagResBank, NMR datasets
bank, NMRShiftDB, SDBS and Spectra Online, DSL is more suitable for teaching, since it contains the entire
range of information about the performance and analysis of the NMR experiment.
W. Dzwinel and K. Boryczko
Discrete Particle Simulations Using Shared Memory Clusters
Abstract:
We discuss the use of current shared-memory systems for discrete-particle modeling of heterogeneous
mesoscopic complex fluids in irregular geometries. This has been demonstrated by way of mesoscopic blood
flow in various capillary vessels.
The plasma is represented by fluid particles. The fluid particle model (FPM) is a discrete-particle method,
a more developed, mesoscopic version of the molecular dynamics (MD) technique. Unlike in MD, where the
particles represent atoms and molecules, in FPM fluid particles are employed. The fluid particles mimic
"lumps of fluid", which interact with each other not only via central conservative forces, as in MD, but via
non-central, dissipative and stochastic forces as well. Because of the three times greater communication load per
FPM particle than per MD atom, reconfiguration of the system becomes very time-consuming. The other
blood constituents are made of "solid" particles interacting via harmonic forces. The tests were performed on
the same system employing two million fluid and "solid" particles. We show that irregular boundary conditions
and the heterogeneity of the particle fluid inhibit efficient implementation of the model on superscalar processors.
We improve the efficiency almost threefold by reducing the effect of computational imbalance using a simple
load-balancing scheme.
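For orientation, the pairwise force in DPD-style fluid particle models is customarily decomposed as

    \mathbf{F}_{ij} = \mathbf{F}^{C}_{ij} + \mathbf{F}^{D}_{ij} + \mathbf{F}^{R}_{ij},
    \qquad
    \mathbf{F}^{D}_{ij} = -\gamma\, w^{D}(r_{ij})\,(\mathbf{e}_{ij}\cdot\mathbf{v}_{ij})\,\mathbf{e}_{ij},
    \qquad
    \mathbf{F}^{R}_{ij} = \sigma\, w^{R}(r_{ij})\,\xi_{ij}\,\mathbf{e}_{ij},

where \mathbf{F}^{C}_{ij} is the central conservative force, w^{D} and w^{R} are distance-dependent weight functions, \mathbf{e}_{ij} and \mathbf{v}_{ij} are the unit separation vector and relative velocity, and \xi_{ij} is a symmetric random variable. The full FPM additionally carries the non-central components mentioned above; the symbols here are the customary DPD ones, not taken from the paper.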
Additionally, in employing MPI on shared-memory machines, we have constructed a simple middleware library
to simplify parallelization. The efficiency of the particle code depends critically on memory latency. As an
example of the application of small shared-memory clusters and a GRID environment to solving very complex
problems, we demonstrate the results of modeling red blood cell clotting in blood flow in capillary vessels due
to fibrin aggregation.
Waclaw Kus, Tadeusz Burczynski
Distributed Evolutionary Algorithm Plugin for UNICORE System
Abstract:
The shape optimization problem of structures can be solved using methods based on sensitivity analysis
information or non-gradient methods based on genetic algorithms [5]. Applications of evolutionary algorithms in
optimization need only information about the values of an objective (fitness) function. The fitness function is
calculated for each chromosome in each generation by solving the boundary-value problem by means of the
finite element method (FEM) [3,8] or the boundary element method (BEM) [3]. This approach does not need
information about the gradient of the fitness function and gives a high probability of finding the global
optimum. The main drawback of this approach is the long computation time. The application of
distributed evolutionary algorithms can shorten the time of calculations [1,6].
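The master loop of such a distributed evaluation can be sketched as follows; all types are illustrative stand-ins, with the remote FEM/BEM solve hidden behind an assumed fitness service.

    import java.util.Iterator;
    import java.util.List;

    // Hypothetical sketch: each fitness value is an independent boundary-value
    // problem solve, which is exactly what can be farmed out to grid workers.
    class DistributedEa {
        interface Chromosome { void setFitness(double f); }
        interface Population {
            List chromosomes();
            Population selectAndVary();    // selection, crossover, mutation
        }
        interface FitnessService {
            double evaluate(Chromosome c); // remote FEM/BEM solve
        }

        private final FitnessService workers;
        DistributedEa(FitnessService workers) { this.workers = workers; }

        Population evolve(Population pop, int generations) {
            for (int g = 0; g < generations; g++) {
                for (Iterator it = pop.chromosomes().iterator(); it.hasNext();) {
                    Chromosome c = (Chromosome) it.next();
                    // Evaluations within one generation are independent, so
                    // they can run in parallel on distributed resources.
                    c.setFitness(workers.evaluate(c));
                }
                pop = pop.selectAndVary();
            }
            return pop;
        }
    }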
The use of grid techniques in optimization can lead to improvements in hardware and software utilization. Other
advantages of grids are simple and uniform end-user communication portals/programs.
The first evolutionary optimization tests [4] were performed using the Condor package [2]. The idea presented in
the paper is to prepare a group of plugins and programs for the evolutionary optimization of structures using the
UNICORE environment [7]. The distributed evolutionary algorithm plugin for the UNICORE environment is
presented. Some numerical tests of the optimization of structures are presented in the paper.
[1] T. Burczynski, W. Kus, Optimization of structures using distributed and parallel evolutionary algorithms,
Parallel Processing and Applied Mathematics, PPAM 2003, Revised Papers, Lecture Notes in Computer
Science 3019, Springer, pp. 572-579, 2004.
[2] Condor, High Throughput Computing, http://www.cs.wisc.edu/condor/
[3] M. Kleiber (ed.), Handbook of Computational Solid Mechanics, Springer-Verlag, 1998.
[4] W. Kus, T. Burczynski, Computer implementation of the coevolutionary algorithm with the Condor scheduler,
KAEIOG 2004, Kazimierz, 2004.
[5] Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs. Springer-Verlag, Berlin,
1996.
[6] R. Tanese, Distributed Genetic Algorithms. Proc. 3rd ICGA, pp.434-439, Ed. J.D. Schaffer. San Mateo,
USA, 1989.
[7] UNICORE Plus Final Report - Uniform Interface to Computing Resources, Joint Project Report for the
BMBF Project UNICORE Plus, Grant Number: 01 IR 001 A-D, 2003.
[8] O. C. Zienkiewicz, R. L. Taylor, The Finite Element Method. The Basis, Vol. 1-2, Butterworth, Oxford, 2000
Witold Alda, Bartlomiej Balcerek, Maciej Dyczkowski,
Grzegorz Kowaliczek, Stanislaw Polak, Rafal Wcislo, Adrian Zachara
Distributed Visualization System Based on Web-Services
Abstract:
In the paper we present the architecture and implementation of a general-purpose distributed visualization
system based on web-services tools. The data files which contain information for 3D visualization can reach
enormous sizes in current computer simulations and experiments. In practice, they can only be gathered and
stored on remote servers with large storage devices.
The system presented in the paper can interactively convert files stored on remote servers into the common X3D
format, possibly extracting only the parts which are currently needed, thus reducing the number of bytes
transferred to the client's workstation. In the implementation layer we use contemporary Java-based tools, such as
J2EE, EJB (Enterprise JavaBeans) and the JBoss application server on the remote side, as well as the Java3D
graphics library for local (client) rendering. The architecture of the system utilizes design patterns, component
technology and XML notation in order to achieve clarity and flexibility of the project. The current version of the
system is supplied with sample readers/converters for the widely used PDB files - for storing molecular information
of proteins - and DEM files - suitable for keeping elevation data of digital 3D geographical maps. However, due
to the modular architecture of the system, it is fairly easy to add new readers.
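The plug-in point implied by this modular design can be sketched as a small interface; the names are hypothetical, not the system's actual API.

    import java.io.IOException;
    import java.io.InputStream;

    // Illustrative sketch: each reader turns a native format (PDB, DEM, ...)
    // into an X3D fragment, optionally restricted to the currently needed part.
    interface Reader {
        boolean canRead(String fileName);   // e.g. name ends with ".pdb"
        String toX3d(InputStream data, String partQuery) throws IOException;
    }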
Krzysztof Kurowski, Bogdan Ludwiczak, Jarek Nabrzyski, Ariel Oleksiak, Juliusz Pukacki
Dynamic Grid Scenarios with Grid REsource Management System in Action
Abstract:
The GridLab Resource Management System (GRMS) is an open-source meta-scheduling system, developed under
the GridLab project. It provides mechanisms for resource management in a Grid environment: job submission,
job migration, job control, application checkpointing, gathering information about submitted jobs, etc.
GRMS operates on behalf of a user, using his or her credentials for authentication and authorization. In
cooperation with other Grid middleware services, and using low-level core services, GRMS tries to deal with the
dynamic nature of the Grid environment.
GRMS has some basic mechanisms which allow more complex scenarios for job management to be implemented.
The basic pieces of job management are job submission and job migration with application-level checkpointing.
Based on these, and with the strong assistance of the dynamic resource discovery mechanism implemented within
GRMS, it is possible to provide more advanced and dynamic scenarios based on job rescheduling methods.
The rescheduling policy checkpoints and migrates already running jobs in order to release the amount of
resources required by a job pending in the GRMS queue. The rescheduling method that is used consists of
several steps. First, resources that meet the static requirements have to be found. If such resources are
available, a list of the jobs that are running on those resources and were submitted by GRMS is created. Then the
system tries to determine which jobs' migration can bring the required result. This step consists of two
actions. First, GRMS searches for jobs after whose termination the pending job could be started immediately.
Second, the selected jobs are analyzed again to check which of them can be migrated to other available
resources, taking into account the requirements of these jobs and the resources available at the moment.
In the next step the best job for migration is chosen by evaluating the available machines and the jobs found for
migration. The selected job is then checkpointed and migrated to the new location, and the original job is submitted to
the machine on which the migrated job was running.
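These steps can be summarized in the following sketch; all types are illustrative stand-ins for GRMS internals, and a real implementation would rank feasible plans by evaluating machines and jobs rather than return the first one found.

    import java.util.Iterator;
    import java.util.List;

    // Hypothetical sketch of the rescheduling decision described above.
    class Rescheduler {
        interface Job {}
        interface Resource {}
        interface GrmsState {
            List matchStaticRequirements(Job pending);  // step 1
            List grmsJobsOn(List resources);            // step 2
            boolean freesEnough(Job candidate, Job pending);
            Resource findMigrationTarget(Job candidate);
        }
        static class Plan {
            final Job toMigrate; final Resource target; final Job pending;
            Plan(Job m, Resource t, Job p) { toMigrate = m; target = t; pending = p; }
        }

        Plan planFor(Job pending, GrmsState state) {
            List resources = state.matchStaticRequirements(pending);
            if (resources.isEmpty()) return null;
            List running = state.grmsJobsOn(resources);
            for (Iterator it = running.iterator(); it.hasNext();) {
                Job candidate = (Job) it.next();
                // Would terminating this job let the pending job start at once?
                if (!state.freesEnough(candidate, pending)) continue;
                // Can the candidate itself be migrated elsewhere right now?
                Resource target = state.findMigrationTarget(candidate);
                if (target != null) {
                    // Checkpoint and migrate the candidate, then submit the
                    // pending job on the machine it vacated.
                    return new Plan(candidate, target, pending);
                }
            }
            return null;   // no beneficial migration found
        }
    }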
The presented rescheduling technique is very useful and brings much better results compared to the other possible
mechanisms - keeping the job in a queue, or submitting the job to an overloaded machine. We will also compare this
method to the popular backfilling mechanism.
Speakers/Presenters: Jason Novotny, Michael Russell
GridSphere Portal Framework (GridLab)
Abstract:
Grid portals build upon the familiar Web portal model to offer virtual
communities a single point of access to computational resources
(clusters, data servers, applications, scientific instruments and
computing services) as well as the information and business services
traditionally offered on the Web. The main idea is to leverage the Web
to enable scientists and engineers to solve larger and more complex
problems with the computing services made available to them on the
Grid. The Web is becoming increasingly attractive as a means for
delivering applications given the evolution in Grid computing towards
the use of Web service technologies. Moreover, widely-adopted standards
are making the Web a more viable platform for developing applications.
For example, the portlet JSR standard (Java Specification Request 168)
defines a way for packaging and sharing user interface components, or
"portlets", over the Web. The portlet JSR standard makes it easier to
deliver a wide variety of applications to end-users and helps to
encourage interoperability between those applications.
In this tutorial, we'll describe the GridSphere Portal Framework.
GridSphere is an open-source, JSR-compliant portlet container for
hosting and developing portlets. We'll highlight the base classes and
tools GridSphere offers for developing sophisticated portlets that
provide a consistent, professional look & feel. We'll then show how
GridSphere can be easily customized and administered online to provide
a complete portal solution. Next, we'll discuss the set of portlets and
tools the GridSphere Project offers for developing and administering
Grid-enabled portals. We'll conclude our tutorial with a brief
discussion of how the Numerical Relativity Portal, under construction
at the Albert Einstein Institute, utilizes GridSphere to enable
physicists to manage their physics applications on the Grid. The
Numerical Relativity Portal provides just one example of how GridSphere
can be used to develop a Grid portal tailored to the needs of a
particular application community.
The intended audience: Web and Grid application developers; scientists,
engineers, and system administrators using or interested in using Grid
technologies. Attendees should have some familiarity with using or
developing Web portals and a basic understanding of Grid computing
concepts. Attendees will get an introduction on how they can use
GridSphere to develop standards-based Grid portals.
Dieter Kranzlmueller
The Austrian Grid Initiative and the Central European Grid
Consortium - Combining National and Regional Efforts
Abstract:
The hype surrounding the term grid computing seems to be calming down.
Industry has started to take "the grid" seriously, not only as a buzzword
for marketing, but as a technology mature enough to be actually deployed.
Grid computing itself seems to be in a state of consolidation, with
every nation engaged in its own national grid initiative, and large-scale
projects such as EGEE promising production grids on a
never-before-seen scale.
This contribution describes the author's view of the current situation
in grid computing, with a focus on the national Austrian Grid program and
the regional Central European Grid Consortium (CEGC). Starting with the
EU CrossGrid project, in many respects a predecessor of
today's and future grid projects, the characteristics of available and
deployed grid computing technology will be reviewed, leading to an outlook
on the plans for the Austrian Grid initiative and
the potential impact of regional consortia such as CEGC.
Pawel Plaszczak
The State and the Future of the Commercial Grid Computing
Abstract:
In this presentation, we look at Grid computing as a set of technologies enabling resource virtualization, bringing the world towards a ubiquitous
market of digital services. We will look at how grids are implemented today and in which industries they are used. We will present a handful of use cases of commercial
"enterprise grid" systems. These are based on commercial and open-source Grid products available on the market. We will next look at what
obstacles are faced by those who try to implement partner grids.
These obstacles will be further analyzed to see which pieces of technology are still missing.
We will then try to make an educated guess at how open grid systems of the near future may work and what roles will be available for the companies and institutions wishing to participate in the Grid-based
market.
The presentation is a summary of the year-long research which the author conducted for his Savvy Manager's Guide to Grid Computing, to be
published in 2005 by Morgan Kaufmann.
The author heads Gridwise Technologies, a consulting firm
specializing in Grid solutions. Prior to founding Gridwise, he worked for the Globus Project.