Cracow, Poland
December 12-15, 2004

 

Abstracts
(in alphabetical order)



D. Göhr, C. Grimm, N. Jensen, S. Piger, J. Wiebelitz

A Novel Approach to Protect Grids with Firewalls

Abstract:

Motivation
The communication requirements of common Grid middleware, with its extensive demand for unhindered communication, run contrary to the concept of legacy firewalls. These devices are normally statically configured to accept or deny certain packets or communication streams. Advanced firewalls include application-level gateways that forward only packets that fit the course of events of the related protocol. In the near future, however, an implementation of Grid protocols (for example GridFTP) in application-aware firewalls is unlikely, as these protocols are not commonly used in the Internet and are currently of limited commercial interest. To leverage the use of firewalls in the Grid communities, novel approaches must be taken. Today, dynamic configuration, in the sense of a controlled opening of firewall resources, is one of the major issues in Grid security.

Previous work
One way is to configure firewalls dynamically, without the need to alter their operating systems, by means of a proxy system. A Grid application that must open a communication channel to a system guarded by one or more firewalls requests the opening of certain ports from the proxy system for the duration of the session. The proxies configure the firewalls and restore the previous configuration after the end of the session. The advantage is total transparency to the involved firewalls; the disadvantage is the complexity of the proxy system, which requires extensive knowledge of the different configuration protocols of the firewalls.
Another way is the use of in-path signaling with a common, authenticated protocol. Applications or communication sources use the protocol to signal the need for certain communication paths to middleboxes, for example firewalls. The IETF MIDCOM working group has specified such mechanisms in RFCs 3303 and 3304.

A novel approach
A drawback of both approaches is that existing components must be modified. A better solution would establish an authenticated, tamper-proof communication channel between two Grid nodes without affecting "Grid-unaware" firewalls; the complete Grid communication would take place over this tunnel. Our proposal is to use IPSec AH in conjunction with existing LDAP directories for the Grid X.509 certificates. IPSec can use the certificates to set up an authenticated tunnel to the destination Grid node. Firewalls can accept traffic to the nodes because IPSec implementations are sufficiently secure, so only minor modifications of the firewall rules are necessary. Another benefit is that this "application-unawareness" reduces the load on the firewall.

The paper summarizes the advantages of the approach, which is a feasible way to enhance Grid security.




Pawel Jurczyk, Maciej Golenia, Maciej Malawski, Dawid Kurzyniec, Marian Bubak, Vaidy S. Sunderam

A System for Distributed Computing Based on H2O and JXTA

Abstract:

H2O is a Java-based, component-oriented, lightweight resource-sharing platform for metacomputing [1]. It allows services to be deployed into a container not only by the container owner, but by any authorized client. As its communication mechanism, H2O uses RMIX.

JXTA technology is a set of open protocols that allow any connected device on the network to communicate and collaborate in a P2P manner [2].

The main goal of this work is to build a uniform global computational network using the H2O distributed computing framework and JXTA P2P technology. This computational network will give users new possibilities in building and utilizing distributed computing systems: H2O kernels behind firewalls will become accessible, and group management in JXTA will make it possible to create virtual groups of kernels, enabling dynamically created ad-hoc collaborations.

Our current implementation of H2O over JXTA allows users to export kernels with JXTA endpoints. This lets H2O metacomputing applications run seamlessly across private networks and NATs, using JXTA as the underlying connection technology. Communication between H2O kernels within the JXTA network was made possible by adding a JXTA socket provider to RMIX; JXTA socket factories are used by RMIX to enable remote method invocations in the P2P environment.
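
The abstract mentions a JXTA socket provider for RMIX but shows no code; the following is a minimal sketch of how an RMI client socket factory could be backed by JXTA sockets, assuming the JXTA 2.x net.jxta.socket.JxtaSocket API. The class name and constructor arguments are illustrative, not the actual RMIX provider interface.

```java
import java.io.IOException;
import java.net.Socket;
import java.rmi.server.RMIClientSocketFactory;
import net.jxta.peergroup.PeerGroup;
import net.jxta.protocol.PipeAdvertisement;
import net.jxta.socket.JxtaSocket;

// Sketch of a socket factory that carries remote method invocations over
// a JXTA virtual pipe instead of a plain TCP connection, so that calls
// can traverse NATs and firewalls via JXTA relay/rendezvous peers.
// (A real RMIX provider would also need to handle serialization of the
// factory and provide the server-side counterpart; both are omitted.)
public class JxtaClientSocketFactory implements RMIClientSocketFactory {
    private final PeerGroup group;           // the JXTA peer group we joined
    private final PipeAdvertisement pipeAdv; // advertises the target kernel's pipe

    public JxtaClientSocketFactory(PeerGroup group, PipeAdvertisement pipeAdv) {
        this.group = group;
        this.pipeAdv = pipeAdv;
    }

    public Socket createSocket(String host, int port) throws IOException {
        // host and port are ignored: in JXTA the pipe advertisement, not
        // an IP address, identifies the destination H2O kernel.
        return new JxtaSocket(group, pipeAdv);
    }
}
```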

At present we focus our work on the discovery of H2O kernels. We plan to create a service that holds information about the kernels currently registered in the local and JXTA networks. Next, we will elaborate a mechanism for measuring network delay times from the Name Service to the H2O kernels.

References
[1] D. Kurzyniec, T. Wrzosek, D. Drzewiecki, and V. S. Sunderam, "Towards Self-Organizing Distributed Computing Frameworks: The H2O Approach", Parallel Processing Letters, vol.13, no.2, pp. 273--290, 2003.
[2] The JXTA Project, http://www.jxta.org/




Tomasz Gubala, Marian Bubak, Maciej Malawski, Katarzyna Rycerz

Abstract Workflow Composition in K-WfGrid Project Environment

Abstract:

This paper presents a new tool supporting workflow composition for Grid applications [1]. The Workflow Composition Tool (WCT) is developed to tackle the dynamic workflow problem. In order to make runtime workflow rescheduling and optimization possible, the tool is designed to compose abstract workflows of applications. To this end we define a new element for building abstract workflows: a service class, which (logically) contains all the published services implementing a certain interface.

As a service class is just a description of an interface, a piece of functionality provided by any service within the class, the WCT is concerned only with the functional composition of abstract workflows. Each element of the workflow is added after the requirements are successfully matched with the (functional) capabilities of a particular service class. Several alternative abstract workflows may result when many service classes conform to a given set of requirements.

The main input to the WCT is a description of the data (results) which the future application should produce. It is also possible to upload an incomplete workflow as input, to be completed. The main output of the composition process is a description of several abstract workflows, each in a distinct document. During its operation the tool makes extensive use of an external service registry which provides descriptions of the available service classes (implemented service interfaces); the WCT contacts and queries the registry to obtain the relevant descriptions.
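
A rough, hypothetical illustration of the composition idea described above: working backwards from the requested results, match each needed data type against the outputs of registered service classes. The registry interface and all names are assumptions for the sketch, not the actual WCT API (in particular, the real tool keeps all matching candidates, producing several alternative workflows).

```java
import java.util.*;

// Hypothetical registry of service classes: each class describes an
// interface with declared input and output data types.
interface ServiceRegistry {
    List<ServiceClass> findByOutput(String dataType);
}

record ServiceClass(String name, List<String> inputs, List<String> outputs) {}

public class AbstractComposer {
    private final ServiceRegistry registry;

    public AbstractComposer(ServiceRegistry registry) { this.registry = registry; }

    // Work backwards from the requested results: for every data item not
    // yet produced by a chosen step, pick a service class that can produce
    // it and recurse on that class's own inputs. Data with no producer is
    // treated as input the user must supply.
    public List<ServiceClass> compose(Collection<String> requestedResults) {
        List<ServiceClass> workflow = new ArrayList<>();
        Deque<String> needed = new ArrayDeque<>(requestedResults);
        Set<String> resolved = new HashSet<>();
        while (!needed.isEmpty()) {
            String dataType = needed.pop();
            if (!resolved.add(dataType)) continue;   // already handled
            List<ServiceClass> candidates = registry.findByOutput(dataType);
            if (candidates.isEmpty()) continue;      // user-supplied input
            ServiceClass chosen = candidates.get(0); // WCT would branch here
            workflow.add(chosen);
            needed.addAll(chosen.inputs());          // now satisfy its inputs
        }
        Collections.reverse(workflow);               // producers before consumers
        return workflow;
    }
}
```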

This paper discusses the most important issues and problems related to the process of abstract workflow composition in the Grid. It also presents how such a tool may cooperate within a wider environment for workflow construction and execution, using the example of the European K-WfGrid project [2].

References
[1] M. Bubak, T. Gubała, M. Kapałka, M. Malawski, K. Rycerz, Workflow Composer and Service Registry for Grid Applications. FGCS Journal of Grid Computing, accepted
[2] K-WfGrid: www.kwfgrid.net




Piotr Grabowski, Bartosz Lewandowski, Jarek Nabrzyski

Access to Grid Services for Mobile Users

Abstract:

The article examines the problem of giving Grid users the possibility to access their applications and resources from any place, using mobile devices. In our approach the devices (mobile phones, PDAs) are incorporated as clients of Grid services. Moreover, because of the well-known limitations of mobile devices and the "heavy weight" of the protocols, standards and technologies used in Grids, our approach assumes a gateway between the client and the Grid. This central point in our architecture is called the Mobile Command Center (MCC). The MCC is written as a portal portlet with separate presentation layers for mobile devices and standard web browsers, which allows us to reuse portal services. Within this model, communication between the clients and the gateway is performed over the HTTP protocol (native for Java 2 Micro Edition (J2ME) enabled mobile devices) in a client/server architecture. Communication between the gateway and the Grid services is also client/server: the MCC acts as a client whose requests are served by GSI-secured Web Services on the Grid side.
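
For illustration, a minimal J2ME request to such a gateway could look as follows. Only the javax.microedition.io API usage is standard; the gateway URL, query parameters and response format are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.microedition.io.Connector;
import javax.microedition.io.HttpConnection;

// Minimal MIDP client call: ask a (hypothetical) MCC gateway for the
// status of a job. The gateway talks to GSI-secured Web Services on the
// Grid side and returns a response trimmed to the device's abilities.
public class MccClient {
    public String fetchJobStatus(String jobId) throws IOException {
        String url = "http://mcc.example.org/gateway?action=jobStatus&id=" + jobId;
        HttpConnection conn = (HttpConnection) Connector.open(url);
        InputStream in = null;
        try {
            conn.setRequestMethod(HttpConnection.GET);
            in = conn.openInputStream();
            StringBuffer sb = new StringBuffer();  // CLDC has no StringBuilder
            int ch;
            while ((ch = in.read()) != -1) {
                sb.append((char) ch);
            }
            return sb.toString();
        } finally {
            if (in != null) in.close();
            conn.close();
        }
    }
}
```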

The gateway (which is made aware of the mobile client's limitations during an initial handshake) gathers the requested information from the different Grid services, adapts it to the client's needs and abilities, and sends it back to the mobile device. The Grid services accessible from the gateway can be divided into two groups. The first group consists of standard GridLab Grid services, whose responses have to be translated inside the MCC; it is represented by the GridLab Resource Management System (GRMS), which is used for application/simulation steering. The second group consists of services dedicated to mobile devices. An example is the Visualization Service for Mobiles (VSfM), which can produce visualizations (scaled down in resolution and color depth) that can be displayed on mobile device screens. Another service accessible from the mobile device via the MCC gateway is the Message Box Service (MBS). It is used for storing, sending and managing different kinds of messages and notifications for users. It can also be used for sending notifications from different Grid applications and for registering new visualizations produced by the GridLab Visualization Service; a newly registered visualization can then be accessed from the mobile device with the use of the MCC and the VSfM.




Pawel Plaszczak, Chris Wilk, Matt Miller

Adapting Insecure Applications to Grid Environment Based on the Experience of Enhancing DataTurbine with the GSI Security Layer

Abstract:

RBNB DataTurbine from Creare is a dynamic data server providing high-performance data streaming and data stream dispatching. So far, DataTurbine has mostly been used inside firewalls, where performance had priority over security.

To cope with the open Grid requirements of the NEESgrid project, Gridwise Technologies helped to integrate GSI security functionality with DataTurbine. Using X.509 certificates and the GSI scheme, users can now securely stream data between the servers belonging to their virtual organization (VO). What is more, the new Grid security layer is transparent: users with properly set up certificates can operate their old streaming applications exactly as they did before. In this paper we analyze our development and integration work, trying to answer the question of how much effort is needed to turn a typical insecure application into a grid-aware one while introducing as few changes as possible to the installations already in operation.
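
GSI authentication is built on the GSS-API. As a hedged illustration of the kind of handshake such a security layer performs before any stream data flows, the sketch below establishes a client-side security context with the standard org.ietf.jgss classes; the mechanism OID is assumed (GSI defines its own GSS mechanism), and the actual integration used the Globus Java libraries rather than this skeleton.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import org.ietf.jgss.*;

// Sketch: mutual authentication over an already-open connection, using
// length-prefixed GSS token framing. With GSI, the default credential
// behind this handshake is the user's X.509 proxy certificate.
public class SecureHandshake {
    public GSSContext authenticate(String serverPrincipal,
                                   DataInputStream in, DataOutputStream out)
            throws GSSException, IOException {
        GSSManager manager = GSSManager.getInstance();
        Oid mech = new Oid("1.3.6.1.4.1.3536.1.1"); // assumed GSI mechanism OID
        GSSName serverName = manager.createName(serverPrincipal, null);
        GSSContext context = manager.createContext(serverName, mech,
                null /* default credential */, GSSContext.DEFAULT_LIFETIME);
        context.requestMutualAuth(true);            // both sides prove identity

        byte[] inToken = new byte[0];
        while (!context.isEstablished()) {
            byte[] outToken = context.initSecContext(inToken, 0, inToken.length);
            if (outToken != null) {                 // send our token to the server
                out.writeInt(outToken.length);
                out.write(outToken);
                out.flush();
            }
            if (!context.isEstablished()) {         // read the server's reply
                inToken = new byte[in.readInt()];
                in.readFully(inToken);
            }
        }
        return context;
    }
}
```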




Hong-Linh Truong, Bartosz Balis, Thomas Fahringer, Marian Bubak

Adaptive and Integrated Monitoring System for Knowledge Grid Workflows

Abstract:

Given the dynamics and diversity of the Grid, Grid monitoring systems have to collect and handle diverse types of data. Moreover, they must be capable of self-management in order to cope with the frequently changing structure of the Grid. Although tremendous effort has been spent on developing monitoring systems for the Grid, existing monitoring systems still support only limited types of sensors and do not focus on self-management aspects.

We present an adaptive and integrated monitoring framework intended to collect and deliver arbitrary monitoring information about a variety of resources and workflow applications. The monitoring system follows the architecture of sensor networks and peer-to-peer systems. Sensors are adaptive and controllable: they use rules to control the collection and measurement of monitoring data and to react appropriately to changes in the underlying systems. It is also possible to enable, disable or change the behavior of sensors via an external request. Both event-driven and demand-driven sensors are supported. Sensor managers are organized into a super-peer model, and monitoring data is stored in distributed locations. Thus the monitoring system can cope with the dynamics of the Grid, and monitoring data is widely disseminated and highly available.

The monitoring framework unifies diverse types of monitoring data, such as performance measurements of applications, the status of hardware infrastructure components, workflow execution status, etc., in a single system. A generic sensor library supporting adaptive and controllable, event-driven and demand-driven sensors, and an event infrastructure will be developed, so that any client can publish and subscribe to events occurring in monitored entities. Clients will discover the monitoring service which provides the data of interest, and will be able to query and subscribe to a variety of data types through a uniform interface.
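
The following Java sketch illustrates what a rule-controlled, adaptive sensor of the kind described above might look like; all names are invented for illustration and do not come from the paper.

```java
import java.util.List;

// Hypothetical adaptive sensor: it can be enabled or disabled by an
// external request, and a set of rules decides whether a measurement may
// actually be taken (e.g. rate limits, load on the monitored host).
interface Rule {
    boolean permitsCollection(long lastCollectedMillis);
}

record Measurement(String entity, String metric, double value, long timestamp) {}

abstract class AdaptiveSensor {
    private volatile boolean enabled = true;   // changeable via external request
    private final List<Rule> rules;
    private long lastCollected;

    protected AdaptiveSensor(List<Rule> rules) { this.rules = rules; }

    public void enable()  { enabled = true; }
    public void disable() { enabled = false; }

    // Demand-driven entry point: collect only if enabled and every rule agrees.
    public Measurement collectOnDemand() {
        if (!enabled) return null;
        for (Rule r : rules) {
            if (!r.permitsCollection(lastCollected)) return null;
        }
        lastCollected = System.currentTimeMillis();
        return measure();
    }

    // An event-driven sensor would call measure() from a callback instead.
    protected abstract Measurement measure();
}
```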




K. Bertels, K. Sigdel, B.Pour Ebrahimi, S. Vassiliadis

Adaptive Matchmaking in Distributed Computing

Abstract:

The sharing of resources is a main motivation for constructing distributed systems in which multiple computers are connected by a communication network. The problem in such systems is how resource allocation should be done when some resources lie idle and could be linked with overloaded nodes in the network. This assumes some kind of matchmaking, that is, the process of finding an appropriate provider of resources for a requester. Given the variety of approaches to matchmaking, it is important to be able to determine the conditions under which particular mechanisms perform better than others. A framework that defines a set of criteria can be used to assess the usefulness of a particular mechanism; examples of such criteria are scalability, flexibility, robustness and throughput.

Previous research on matchmaking mechanisms shows that systems with completely centralized or completely localized mechanisms each have their deficiencies. On the continuum from centralized to P2P mechanisms, we are interested in designing a mechanism that enables the network to change its internal matchmaking mechanism from P2P to a more centralized form, or vice versa, whenever required. This should allow the distributed system to adapt itself dynamically to changing conditions and always have the most appropriate infrastructure available. Evidently, the framework mentioned above provides the necessary checkpoints to induce such modifications to the infrastructure. This approach boils down to, for instance, the idea of multiple matchmakers, where the entire system is partitioned into segments such that every segment has a matchmaker; agents can then interact with each other in a local neighborhood, and matchmaking in different segments can take place in parallel (a schematic sketch follows below).

Toward this aim, we need to find the conditions under which either approach is best suited. Previous studies on centralized matchmaking show that there exists some population size beyond which no further improvement is generated, and below which the matching efficiency goes down. The influence of the population size of each segment should therefore be studied with respect to matching rate, matching time and matching quality. Different matching functions, varying in complexity and in the information used for matching, should be evaluated against the criteria mentioned above. The current research also takes into account varying circumstances under which the network operates: events such as highly unequal task and resource distributions, communication failures, etc. are introduced and studied. The research will include large-scale experiments using environments such as PlanetLab and Globus; these platforms provide infrastructures for extensible, cooperative and secure resource sharing across multiple domains through the concept of virtual machines.
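
A toy sketch of the segment-local matchmaking idea (all types hypothetical): each segment's matchmaker pairs resource requests with providers registered in its own neighborhood, so different segments can match in parallel, while a failed local match could be escalated to a neighboring segment or a more central matchmaker.

```java
import java.util.*;

// Hypothetical segment-local matchmaker: requests are matched only against
// providers registered in the same segment, so segments work independently.
class SegmentMatchmaker {
    record Offer(String providerId, int capacity) {}
    record Request(String requesterId, int demand) {}

    private final List<Offer> offers = new ArrayList<>();

    public synchronized void register(Offer offer) { offers.add(offer); }

    // A deliberately simple matching function: the first offer whose
    // capacity covers the demand. Comparing richer functions (scoring,
    // partial matches) is part of the study described above.
    public synchronized Optional<Offer> match(Request request) {
        Iterator<Offer> it = offers.iterator();
        while (it.hasNext()) {
            Offer o = it.next();
            if (o.capacity() >= request.demand()) {
                it.remove();             // the provider is now busy
                return Optional.of(o);
            }
        }
        return Optional.empty();         // escalate outside this segment
    }
}
```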




Marcin Pilarski

Adaptive Services Grid towards PlanetLab Research in Poland

Abstract:

The goal of the Adaptive Services Grid (ASG) project is to develop a prototype of an open development platform for adaptive service registration, discovery, composition and enactment. ASG aims at creating an open development platform for service composition and enactment rather than yet another computational Grid infrastructure. This paper describes the forthcoming use-case scenarios and the motivation for using PlanetLab as a testbed. PlanetLab is a geographically distributed platform for deploying planetary-scale network services; its users get access to one or more isolated "slices" of PlanetLab's global resources via a concept called distributed virtualization. The historical timeline, current status and background activity of PlanetLab research will also be presented.




Bartlomiej Balcerek, Tomasz Kowal

Advanced Authentication Service for Complex Grid Environments

Abstract:

We present a centralized authentication service for Globus-based Grids, initially developed for the SGIGrid project. The approach is to deliver advanced authentication-related remote methods to Grid users and services. Through an integrated authentication server, we propose a centralized credential repository and service for GSI-equipped Grid environments, as a replacement for the dispersion of authentication data introduced by the GSI standard. This approach enables easy management and maintenance of user credentials and makes it possible to deliver advanced authentication methods to standard and non-standard Grid entities, such as RAD, VLAB or VUS, which are developed within the SGIGrid project. The server is implemented in Web Service technology and allows clients to invoke methods such as the generation of proxy certificates or the verification of user credentials.

References:
[1] SGIgrid: Large-scale computing and visualization for virtual laboratory using SGI cluster (in Polish), KBN Project, http://www.wcss.wroc.pl/pb/sgigrid/
[2] T. Kowal, B. Balcerek, SGIGrid project report: "Authorization and authentication server Authproxyd" (in Polish)




Wlodzimierz Funika, Marcin Koch, Allen D. Malony, Sameer Shende, Marcin Smetek, Roland Wismüller

An Approach to the Performance Visualization of Distributed Applications

Abstract:

A key issue in designing and deploying effective distributed applications is the ability to monitor the execution and measure its performance. A standard approach to this problem, based on pre-execution instrumentation and post-execution measurement, data analysis and visualization, produces huge volumes of monitoring data whose processing is very time-consuming. As a result, there is a distinct need for performance analysis systems that can perform on-the-fly monitoring and also allow application control and interaction at run-time. J-OCM is one of the few monitoring systems that provide these features. It can supply performance data from a distributed application to any tool which requests the services provided by J-OCM via J-OMIS, a Java-related extension of the OMIS interface specification. Such an approach allows J-OCM to cooperate with many tools and evolve separately, without loss of compatibility. Up to now there have been no complete performance monitoring tools using J-OCM for performance analysis; as a result, it has been impossible to make full use of on-line application monitoring for performance measurement and analysis. On the other hand, there exist very powerful tools for computational steering, such as SCIRun, that apply advanced visualization to complex data in real time.

The paper presents an approach to visualizing the performance data provided by J-OCM. We use the TAU performance system with the Paravis performance visualization package built on the SCIRun environment. TAU has mainly been used on single parallel systems for both post-execution and online analysis; here we aim at making TAU/Paravis collaborate with J-OCM. The TAU/Paravis package consists of several modules which are used to build a complete visualization tool. Such a component approach makes the whole system very flexible and easy to extend, and we use this ability to extend the package.

We implemented additional modules placed between J-OCM and TAU/Paravis; these modules form a separate SCIRun package. Because TAU and J-OCM are data-incompatible, the modules provide the necessary data conversion. Additionally, they allow the developer to choose which elements of the application are to be monitored. The scope of performance visualization may comprise selected nodes of the distributed system, JVMs, threads, classes and methods. There are modules to gather different kinds of performance data, such as thread synchronization, method execution times, dependencies in method calls, and communication (Remote Method Invocation, RMI). Data can be summarized over a time interval or can be instantaneous. Finally, we present an extension towards the monitoring of distributed Java applications based on web services. Web services are becoming a more and more popular paradigm of distributed programming, which is why the ability to measure their performance is so important. We intend to check the adaptability of J-OCM-based monitoring tools to environments in which web services are provided.

References:

  1. W. Funika, M. Bubak, M.Smętek, and R. Wismüller. An OMIS-based Approach to Monitoring Distributed Java Applications. In Yuen Chung Kwong, editor, Annual Review of Scalable Computing, volume 6, chapter 1. pp. 1-29, World Scientific Publishing Co. and Singapore University Press, 2004.
  2. A. D. Malony, S. Shende, and R. Bell, "Online Performance Observation of Large-Scale Parallel Applications", Proc. Parco 2003 Symposium, Elsevier B.V., Sept. 2003.
  3. C. Johnson, S. Parker, "The SCIRun Parallel Scientific Computing Problem Solving Environment" Proc. Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999.




Marian Bubak, Piotr Nowakowski

Analysis of Implementation of Next Generation Grids Visions in FP6 Grid Technology Projects

Abstract:

A group of experts in the area of Grid technologies and their applications has presented its vision of the directions of research in this field in two reports [1,2]. Starting from an analysis of requirements and usage scenarios, these reports present a kind of research agenda and list the most important topics from three points of view: end-user, architectural, and software.

Following the slew of Grid projects implemented as part of the IST priority within the 5th Framework Programme, the EC has decided to carry on with Grid-related research work as part of the 6th FP, starting in September 2004 [3]. These Grid research projects, 12 in all, can broadly be divided into the following groups:

  • underpinning research and Grid architectures,
  • applications driving business growth and scientific innovations,
  • new technologies for problem solving,
  • building the European research area.
The paper presents the emerging trends in European Grid research, together with an analysis of how closely the objectives declared by these new projects adhere to the Grid Expert Group's recommendations and proposed research agenda.

References:
[1] Next Generation Grid(s). European Grid Research 2005-2010. Expert Group Report, June 2003, http://www.cordis.lu/ist/grids/
[2] Next Generation Grids 2. Requirements and Options for European Grids Research 2005-2010. Expert Group Report, July 2004, http://www.cordis.lu/ist/grids/
[3] European Grid Technology Days 2004, http://www.nextgrid.org/events/




Jarek Nabrzyski

Building Grid Infrastructures and Developing Grid Applications with the GridLab Toolkit

Abstract:

The GridLab project is one of the biggest European research and development undertakings in the development of application tools and middleware for Grid environments. GridLab is funded by the European Commission under the 5th Framework Programme. GridLab will produce a set of application-oriented Grid services and toolkits providing capabilities such as dynamic resource brokering, monitoring, data management, security, information, adaptive services and more.

Services are accessed using the Grid Application Toolkit (GAT). The GAT provides applications with access to the various GridLab services, resources, specific libraries, tools, etc., in such a way that end-users, and especially application developers, can build and run applications on the Grid without needing to know the details of the runtime environment in advance. Applications use the GAT through a fixed GAT API.

All GridLab technologies fit into the GridLab architecture, which defines a cleanly layered environment. The highest layer (called User Space) contains the GAT (an application-oriented, high-level API to complex and dynamic Grid environments) and GridSphere (a Grid portal development framework). The middleware layer (called Capability Space) covers the whole range of Grid capabilities required by applications, users and administrators, such as: GRMS (Grid Resource Management and Brokering Service), Data Access and Management (Grid services for data management and access), GAS (Grid Authorization Service), iGrid (GridLab Information Services), Delphoi (Grid Network Monitoring and Performance Prediction Service), Mercury (Grid monitoring infrastructure), Visualization (Grid visualization services) and Mobile Services (Grid services supporting wireless technologies).

GridLab technologies help real end-users to develop and run their grid-enabled applications on the one hand, and on the other hand they help Grid infrastructure designers to build full production Grid environments. In this presentation we will show the strengths of the GridLab Toolkit and the methodologies of using it.




J. Jurkiewicz, P. Bala

Building Grid Services Using UNICORE Middleware

Abstract:

Grid middleware is designed to provide access to remote high-performance resources; the most important paradigm is seamless access to those resources. The user is offered a number of tools, ranging from sets of scripts to graphical applications launched on the client workstation. The latter approach motivated the development of the UNICORE Grid middleware, in which the user obtains the advanced UNICORE Client providing access to distributed resources.

A significant advantage of the UNICORE Client is its flexible graphical user interface which, through plugin technology, allows for easy development of application-specific interfaces; the CPMD, Gaussian, Amber, Gamess and Database Access plugins are good examples.

Unfortunately, this technology cannot be used in the Web Services environment, and the adaptation of the application-specific interfaces, if not done properly, would be a time-consuming task. In order to overcome these disadvantages, we have developed web services tools to access the UNICORE Grid. The main idea of the presented approach is to move the application-specific interface (plugin) from the UNICORE Client to the web services environment. The solution is based on a portal hosting environment which allows users to access resources similarly to the way the UNICORE Client does.

The user obtains similar functionality, but access to the distributed resources is easier and unnecessary details are hidden. The presented approach brings Grid services functionality to the UNICORE middleware. At the moment, the main communication protocol is based on the existing components, in particular the UNICORE Protocol Layer (UPL) adopting the Abstract Job Object paradigm.

In the future, standard web services protocols (SOAP-based) will be used, which will provide high interoperability with other Grid-services-based solutions such as Globus. This work is supported by the European Commission under the IST UniGrids project (No. 004279). The financial support of the State Committee for Scientific Research is also acknowledged.




Jiri Sitera, Ludek Matyska, Ales Krenek, Miroslav Ruda, Michal Vocu, Zdenek Salvet, Milos Mulac

Capability and Attribute Based GRID Monitoring Architecture

Abstract:

The Grid Monitoring Architecture (GMA), as proposed by the GGF, is an architecture specification for a general Grid monitoring infrastructure. While designed as a basis for interoperable systems, most actual implementations do not allow the inclusion of third-party infrastructure elements. We propose a GMA extension adding support for (virtual) overlay infrastructures, through an explicit meta-description of data requirements and infrastructure element capabilities.

The extension consists of three main parts: data attributes, infrastructure component capabilities, and the matchmaking process. The key part of our proposal is the idea that all data (events) should have associated attributes, i.e., a metadata description of the required behavior of the monitoring infrastructure when processing the data. The metadata description is thus extended to describe not only data types and structures but also constraints on data handling and usage. The data attributes are complemented by component capabilities, which describe infrastructure element features and specific behavior for data processing.

The extended GMA also defines a matchmaking process which virtually connects data with components while taking into account both the expressed data requirements and the declared component capabilities. This matchmaking functionality is part of a generalized directory service. The data are thus guaranteed to be seen and manipulated only by those parts of the infrastructure that are capable of appropriate processing or that can guarantee trust, persistence or other requirements.

A major advantage of the proposed extension, called capability-based GMA (CGMA), is that components with different capabilities and features can coexist and provide their services relatively independently, but under one unifying framework (which can be seen as a meta-infrastructure). The goal is to provide a way to incorporate and use specialized and optimized components, rather than look for ways to create components general enough to fit all, often contradictory, needs.

In CGMA, data sources describe not only the types of data they deal with but also the conditions under which the data may be passed to the infrastructure. Producers adopt data attributes and register them together with their own capabilities, while consumers register their capabilities as part of their search for appropriate data sources. The directory (its matchmaking part) uses the attributes and capabilities to find appropriate components for proper consumer/producer interaction. As more complex infrastructure elements, with both consumer and producer capabilities, are added, the same process is able to create a path for the "event flow" through the infrastructure. In this way, specific overlays on top of the monitoring infrastructure are created and maintained; a toy sketch of the matchmaking follows below.
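
A hedged toy sketch of attribute/capability matchmaking: an event is routed only to components whose declared capabilities cover every attribute its metadata requires. All types are invented for illustration and are not the CGMA interfaces.

```java
import java.util.*;

// Hypothetical CGMA-style directory: every event carries attributes
// (required handling constraints, e.g. "persistent-storage" or
// "encrypted-transport"), every component declares capabilities, and the
// directory connects the two only when all requirements are satisfied.
class CapabilityDirectory {
    record Component(String name, Set<String> capabilities) {}

    private final List<Component> components = new ArrayList<>();

    void register(Component c) { components.add(c); }

    // Return the components allowed to process data with these attributes.
    List<Component> match(Set<String> requiredAttributes) {
        List<Component> eligible = new ArrayList<>();
        for (Component c : components) {
            if (c.capabilities().containsAll(requiredAttributes)) {
                eligible.add(c);
            }
        }
        return eligible;
    }
}
```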

The proposal is based on experience gained during the development of the Logging and Bookkeeping (LB) service for the EU DataGrid and, recently, EGEE projects. We will show the motivation for and key advantages of the extended monitoring architecture on real LB use cases, explaining how it can solve the deficiencies of current GMA implementations such as R-GMA.

The paper concludes with a proposal for a capability-based GMA prototype. The proposal builds on the R-GMA relational model and extends its mediator with the matchmaking capabilities. This implementation will be shown to cover all the LB use cases, while leaving space for coexistence with other GMA implementations, including R-GMA itself.




Radoslaw Januszewski, Gracjan Jankowski, Rafal Mikolajczak

Checkpoint/Restart Mechanism for Multiprocess Applications Implemented under SGI Grid Project

Abstract:

One of the most needed tools for achieving high availability and fault tolerance in Grid computing, in both the HPC and HTC areas, is a checkpoint/restart and migration mechanism. Such a mechanism can be used when a production system has to be turned off for maintenance, or when the system fails. Another very important issue is the ability to balance load dynamically between computational nodes. To achieve process migration in distributed systems, a migration-aware checkpointing mechanism must be available on the nodes. A few system-level checkpoint/restart tools are available for commercial operating systems (e.g. for IRIX and UNICOS), and there are some packages for 32-bit Linux systems as well.

The paper describes the checkpoint/restart package developed as part of the SGIgrid project ("High Performance Computing and Visualization with the SGI Grid for Virtual Laboratory Applications", project no. 6 T11 0052 2002 C/05836). The project was co-funded by the State Committee for Scientific Research and SGI.

We present the architecture of the tools, which were developed on an SGI Altix 3300 server with four Intel Itanium2 processors running Linux based on kernel version 2.4. Our package provides kernel-level checkpointing. It consists of a set of tools and kernel modules that allow saving the state of a job and restarting it later.

Additionally, we describe in detail solutions to some interesting issues encountered during this project, e.g. the virtualization mechanism which ensures the coherency of the recovered application, its surrounding environment and resources. The solutions are considered in two aspects: as general checkpoint/restart problems and as implementation problems pertaining to Itanium limitations and features. The document also covers portability in the context of virtualization and checkpointing mechanisms. A checkpointing package designed for kernel version 2.4.20-sgi220r3 is available for download; it implements the following functionality:

  • support for single- and multi-process applications
  • support for 32- and 64-bit applications
  • support for interprocess communication
  • virtualization mechanisms that solve migration issues
Key Words: Grids, checkpointing, system architecture, resource virtualization.




Pawel Plaszczak, Dominik Lupinski

Choosing and Integrating Various Grid Resource Managers with GT4 - the Early Experience

Abstract:

This presentation serves as an introduction to one of our research projects related to Grid technologies available on today's market, as well as those about to become ready. The paper documents early experiences in integrating the Globus Toolkit 4 with state-of-the-art industry schedulers.

In the long term, this research will give us insight into heterogeneous Grid solutions for computationally intensive industrial applications. At the time of the conference, however, we will be presenting our results as work in progress.

First, we portray the environment used in conducting this research: a set of virtualized computers built with the User Mode Linux software.

Next, we depict the operation of compute clusters functioning under various resource management schemes. Our research covers scheduling systems such as PBS/Torque and Sun N1 Grid Engine, and we put emphasis on the effort necessary to integrate those schedulers. The beta version of the Globus Toolkit 4 will be evaluated as the integration solution, with stress on the latest GRAM version, compared also with the alternative solution from Globus 2.4.

As the last integration step, a central installation of the Community Scheduler Framework (CSF) will be evaluated as the final layer interacting with the end-users of this set of VO resources.




Lukasz Kuczynski, Konrad Karczewski, Roman Wyrzykowski

CLUSTERIX Data Management System

Abstract:

Nowadays Grid applications deal with large volumes of data, which creates the need for effective data management solutions. The CLUSTERIX Data Management System (CDMS) is being developed for the CLUSTERIX project. User requirements and an analysis of existing implementations have been the foundation for the development of CDMS. Special attention has been paid to making the system user-friendly and efficient, allowing for the creation of a robust data storage system.

Taking into account Grid-specific networking conditions - different bandwidths, current load and network technologies between geographically distant sites - CDMS tries to optimize data throughput via replication and replica selection techniques (see the sketch below). Another key feature to be considered during Grid service implementation is fault tolerance. In CDMS, the modular design and distributed operation model eliminate any single point of failure. In particular, multiple instances of the Data Broker run simultaneously, and their coherence is assured by a synchronization subsystem.
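
As a rough illustration of the replica selection mentioned above (all names hypothetical), a broker might rank the known replicas of a file by the throughput it expects towards the requesting site:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical replica selection: given all known replicas of a file,
// pick the one with the best estimated effective transfer rate.
class ReplicaSelector {
    record Replica(String storageUrl, double bandwidthMBps, double loadFactor) {}

    Optional<Replica> selectBest(List<Replica> replicas) {
        return replicas.stream()
                // effective throughput estimate: raw bandwidth to the site,
                // scaled down by the current load of the storage element
                .max(Comparator.comparingDouble(
                        r -> r.bandwidthMBps() * (1.0 - r.loadFactor())));
    }
}
```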




E. Imamagic, B. Radic, D. Dobrenic

CRO-GRID Infrastructure: Project Overview and Perspectives

Abstract:

We have witnessed the tremendous success of production Grids and various Grid applications in the last decade. Motivated by previous and current Grid initiatives and by the rising demand of the science community for computing resources, the CRO-GRID project was initiated. CRO-GRID is a national multi-project whose goals are the introduction of Grid and cluster technologies to the science community and industry, and the development of a distributed computing environment and Grid applications.

The CRO-GRID multi-project consists of three projects: CRO-GRID Infrastructure, CRO-GRID Mediator and CRO-GRID Applications. CRO-GRID Mediator develops a distributed middleware system based on the latest Web Service Resource specifications.

CRO-GRID Applications is responsible for the design and implementation of applications using the CRO-GRID Mediator system and existing Grid middleware, programming environments and tools.

CRO-GRID Infrastructure is responsible for building the underlying cluster and Grid infrastructure for the other two projects to use. Furthermore, CRO-GRID Infrastructure will assure infrastructure maintenance and optimization, and provide support to Grid- and cluster-related projects. The status of the CRO-GRID Infrastructure project is described below.

In the first six months, a thorough investigation of existing cluster technologies was completed. The investigation focused on particular cluster subsystems (such as job management systems, monitoring tools, etc.) and was followed by the selection of the most appropriate cluster technologies. The selected technologies were implemented on five clusters located at scientific centers and universities in four cities. In coordination with the Giga CARNet project, the clusters were connected with gigabit links.

Currently we are analyzing contemporary Grid technologies and projects. We plan to finish the evaluation of Grid technologies, and to decide which will be used, by the end of this year. At the same time we will implement basic Grid services on the existing clusters.

Afterwards we will start upgrading Grid functionality based on the specific needs of the CRO-GRID Applications project. In the following two years we will focus our work on Grid and cluster maintenance and optimization, community outreach, and linking with international initiatives.




Patryk Lason, Andrzej Ozieblo, Marcin Radecki, Tomasz Szepieniec

CrossGrid and EGEE Installations at Cyfronet: Present and Future

Abstract:

In keeping with the trend toward Grid technology, Cyfronet is actively involved in recent EU Grid projects. Within these projects the centre undertakes a variety of activities, including providing a testbed for software developers, maintaining computing resources and supporting partners across Europe. Moreover, many significant responsibilities lie with Cyfronet's teams, e.g. guaranteeing an appropriate level of services, ensuring the proper operation of the computing infrastructure in the region, and certifying new resources as they are deployed. To fulfil these mandates, a considerable hardware infrastructure has been established, including 40 dual-processor Intel Xeon (IA32) nodes and 20 dual Itanium-2 (IA64) nodes. Currently the IA32 nodes are used by the CrossGrid and LCG projects, while the IA64 part is being used for the initial installation for the EGEE project. In the future we plan to merge all the computing resources into one resource pool to be used by a range of applications. Over the last years we have gained much experience, which allows us to play a key role in the propagation of Grid technology in our region. Our plans are to improve the comprehension among scientists of the advantages and possible profits of Grid technology.




Renata Slota, Lukasz Skital, Darin Nikolow, Jacek Kitowski

Data Archivization Aspects with SGI Grid

Abstract:

The SGI Grid project [1] aims to design and implement broadband services for remote access to expensive laboratory equipment, a backup computational center and a remote data-visualization service. Virtual Laboratory (VLab) experiments and visualization tasks very often produce huge amounts of data which have to be stored or archived. Data archivization aspects are addressed by the Virtual Storage System (VSS). Its main goal is to integrate storage resources distributed among computational centers into a common system and to provide a storage service for all SGI Grid applications. VSS can use a variety of storage resources, such as databases, file systems and tertiary storage (tape libraries and optical jukeboxes managed by Hierarchical Storage Management (HSM) systems).

In the paper we present the data archivization functionality of the VSS, together with its architecture and user interfaces.

VSS offers standard operations which let the user store and retrieve data files organized in meta-directories. Besides these standard operations, some archivization-specific functionalities are implemented: access time estimation, file fragment access, file ordering, and fast access to large files residing on tapes. Access time depends on the storage resource type and state; it can vary from milliseconds to minutes, or even tens of minutes, for data archived in tape libraries. VSS offers access time estimation for HSM systems, providing the approximate access time for a given file. The user can order a file or file fragment, i.e. inform VSS when he will need to access it; if the file resides on tape or other slow media, the system copies it to a cache, which minimizes the access time. Access to large files residing on tapes is further accelerated by file fragmentation. Automatic replication has been implemented as a method of data access optimization. Implementation details of the VSS are presented in [2].

VSS is equipped with the following user interfaces: a Web portal, a text console, a Java API, and VSSCommander (a Java API application). The Web portal hides all system details and provides an easy way of using VSS. The text console is an interface for VSS developers; it exposes more details than the other interfaces and is not intended for end users. The Java API is the programmers' interface, allowing easy development of VSS client applications. VSSCommander is a Java application written with the VSS Java API; it offers access to VSS in a manner similar to WinSCP or Windows Commander.
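
The abstract names a Java API but does not document it; purely as a hypothetical illustration of the archivization-specific operations listed above (access time estimation and file ordering), client code could look like this:

```java
// Hypothetical VSS client API, reconstructed only for illustration; none
// of these signatures are documented in the abstract.
interface VssSession extends AutoCloseable {
    long estimateAccessTime(String path);             // seconds until readable
    void orderFile(String path, long neededAtMillis); // pre-stage from tape
    void retrieve(String path, String localPath);
    void close();
}

public class VssExample {
    static void fetchWithStaging(VssSession session, String file) {
        // Ask VSS how long access would take: milliseconds for disk-cached
        // data, up to tens of minutes for files archived on tape.
        long seconds = session.estimateAccessTime(file);
        if (seconds > 60) {
            // Order the file: tell VSS when it will be needed, so it can be
            // copied from tape to the disk cache ahead of time.
            session.orderFile(file, System.currentTimeMillis() + 3_600_000);
        } else {
            session.retrieve(file, "/tmp/raw.dat");
        }
    }
}
```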

The VSS, as a part of SGI Grid, is already in the deployment phase and is used as a production system; its functional and performance tests are also presented.

[1] SGI Grid - http://www.wcss.wroc.pl/pb/sgigrid/en/index.php
[2] D. Nikolow, R. Slota, J. Kitowski, L. Skital, "Virtual Storage System for the Grid Environment", 4th Int. Conf. on Computational Science, Krakow, June 6-9, 2004, LNCS vol.3036, pp. 458-461. http://www.cyfronet.krakow.pl/iccs2004/




Ondrej Habala, Marek Ciglan, Ladislav Hluchy

Data Management in the FloodGrid Application

Abstract:

We present the data management tasks and tools used in the flood prediction application of the CROSSGRID project. The project aims to improve support for interactive applications in the Grid; apart from providing several tools for the development and support of Grid applications, it also maintains a testbed with several testing applications. One of these is the FloodGrid application. This Grid prediction application aims to connect several potential actors interested in flood prediction: data providers, infrastructure providers and users. The application consists of a computational core, a workflow manager, two user interfaces and a data management suite. It builds on Grid technology, especially the Globus Toolkit 2.4 and 3.2 and the EU DataGrid project.

The application's core is a cascade of several simulation models. At the beginning of the cascade is a meteorological prediction model, which receives its boundary conditions from outside the cascade and computes temperature and precipitation predictions. These are then used in the second stage of the cascade, a hydrological model, which computes flow volume at selected points of the target river basin. Its output is processed in the last stage of the cascade by a hydraulic model, which uses an actual terrain model to compute water flow in the area; where the water reaches areas outside of the river basin, a flood is expected. The computational core is interfaced to the environment and to FloodGrid users by several other components: the FloodGrid portal, which provides the user interface to the whole application; the workflow service, which controls the cascade; and a data management suite.

The data management software of FloodGrid provides facilities that support the transport and storage of input data for the simulation cascade, the cataloguing of all files available in the environment, and easy access to them. It also delivers radar imagery and data from measurement stations directly to the FloodGrid portal. It builds on the software available in the CROSSGRID testbed, which is interconnected with the portal and augmented with a metadata catalog for easy lookup of needed files. The whole suite consists of this metadata catalog (in the form of an OGSI Grid service), data delivery software and EDG replica management (part of the testbed). All these parts are connected to the user interface of FloodGrid (the portal) and may be used from there, as well as automatically by the workflow service and simulation jobs. The metadata catalog is divided into two parts: a reusable Grid service interface and an application-specific relational database. The interface may be used with any application, provided that the underlying database provides several tables describing its structure in the form expected by the interface. The interface itself is accessible via several service calls for metadata management. Each metadata item describes a file, identified by its GUID from the EDG replica manager; new metadata items may be added, existing items may be removed, and GUID lookup is performed by specifying a set of constraints.
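
A hedged sketch of the catalog operations just described (adding and removing items, and looking up GUIDs by constraints); the method names and types are hypothetical, not the actual OGSI service signatures.

```java
import java.util.*;

// Hypothetical metadata catalog facade: each item maps a file GUID (from
// the EDG replica manager) to descriptive key/value metadata.
interface MetadataCatalog {
    void add(String guid, Map<String, String> metadata);
    void remove(String guid);
    // Constraint-based lookup: GUIDs of all files whose metadata contains
    // every requested key/value pair.
    List<String> lookup(Map<String, String> constraints);
}

class InMemoryCatalog implements MetadataCatalog {
    private final Map<String, Map<String, String>> items = new HashMap<>();

    public void add(String guid, Map<String, String> metadata) {
        items.put(guid, new HashMap<>(metadata));
    }

    public void remove(String guid) { items.remove(guid); }

    public List<String> lookup(Map<String, String> constraints) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : items.entrySet()) {
            if (e.getValue().entrySet().containsAll(constraints.entrySet())) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```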




P. Rek, M. Kopec, Z. Gdaniec, L. Popenda, R.W. Adamiak, M. Wolski, M. Lawenda, N. Meyer, M. Stroinski

Digital Science Library for Nuclear Magnetic Resonance Spectroscopy

Abstract:

The need for, and the possibilities of, electronic publishing and presentation of digital content grew along with the development of the Internet. Nowadays users have at their disposal more and more sophisticated tools designed to create, browse and search for electronic documents. These tools perform an important role in the global information infrastructure and can benefit the education and scientific environments.

The Digital Science Library (DSL) is created on the basis of the Data Management System (DMS), which was developed for the PROGRESS project. Its main functionality, storing and presenting data in Grid environments, was extended with functions specific to the requirements of the Virtual Laboratory. In its concept, the Digital Science Library allows storing and presenting the data used by virtual laboratories. This data can be input information used for scientific experiments as well as the results of performed experiments. Another important aspect is the capability of storing various types of publications and documents, which are often created in the scientific process. This type of functionality, well known to all digital library users, is also provided by the DSL. One of the main examples of using DSL in practice is the Virtual Laboratory of NMR.

This work presents the general assumptions, project design and implementation of the Digital Science Library for the purposes of Nuclear Magnetic Resonance (NMR) spectroscopy data. NMR spectroscopy is a unique experimental technique that is widely used in physics, organic and inorganic chemistry, biochemistry, as well as in medicine.

The analysis of one- or multidimensional homo- and heteronuclear spectra obtained in the course of an NMR experiment can provide information about the chemical shifts of the nuclei, scalar coupling constants, residual dipolar coupling constants and the relaxation times T1 and T2. All of these data can be stored in the presented database, which also offers tools for performing quick and optimal searches through the repository. Compared to the other NMR databases available through the Internet, like BioMagResBank, the NMR data-sets bank, NMRShiftDB, SDBS and Spectra Online, DSL is more suitable for teaching, since it contains the entire range of information about the performance and analysis of an NMR experiment.




W. Dzwinel and K. Boryczko

Discrete Particle Simulations Using Shared Memory Clusters

Abstract:

We discuss the use of current shared-memory systems for discrete-particle modeling of heterogeneous mesoscopic complex fluids in irregular geometries, demonstrated by way of mesoscopic blood flow in various capillary vessels.

The plasma is represented by fluid particles. The fluid particle model (FPM) is a discrete-particle method, a more developed, mesoscopic version of the molecular dynamics (MD) technique. Unlike in MD, where the particles represent atoms and molecules, in FPM fluid particles are employed. The fluid particles mimic "lumps of fluid", which interact with each other not only via central conservative forces, as in MD, but also with non-central, dissipative and stochastic forces (a schematic force decomposition is given below). Because the communication load per FPM particle is three times greater than per MD atom, the reconfiguration of the system becomes very time consuming. The other blood constituents are made of "solid" particles interacting with harmonic forces. The tests were performed on a system employing two million fluid and "solid" particles. We show that irregular boundary conditions and the heterogeneity of the particle fluid inhibit efficient implementation of the model on superscalar processors. We improve the efficiency almost threefold by reducing the computational imbalance with a simple load-balancing scheme.
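
Schematically, and only as an illustration (this decomposition is standard for dissipative-particle methods and is not quoted from the paper), the total force on fluid particle $i$ is a sum of pairwise conservative, dissipative and random contributions:

```latex
% Schematic DPD/FPM-style force decomposition (illustrative only):
\mathbf{F}_i = \sum_{j \neq i} \left( \mathbf{F}^{C}_{ij} + \mathbf{F}^{D}_{ij} + \mathbf{F}^{R}_{ij} \right)
```

where $\mathbf{F}^{C}_{ij}$ is a central conservative force, $\mathbf{F}^{D}_{ij}$ a velocity-dependent dissipative force and $\mathbf{F}^{R}_{ij}$ a stochastic force; in the fluid particle model the dissipative and random parts also carry the non-central components mentioned above.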

Additionally, when employing MPI on shared-memory machines, we have constructed a simple middleware library to simplify parallelization. The efficiency of the particle code depends critically on memory latency. As an example of the application of small shared-memory clusters and a GRID environment to very complex problems, we demonstrate the results of modeling red blood cell clotting, due to fibrin aggregation, in blood flow in capillary vessels.




Waclaw Kus, Tadeusz Burczynski

Distributed Evolutionary Algorithm Plugin for UNICORE System

Abstract:

The shape optimization problem for structures can be solved using methods based on sensitivity analysis information, or by non-gradient methods based on genetic algorithms [5]. Applications of evolutionary algorithms in optimization need only information about the values of an objective (fitness) function. The fitness function is calculated for each chromosome in each generation by solving the boundary-value problem by means of the finite element method (FEM) [3,8] or the boundary element method (BEM) [3]. This approach does not need information about the gradient of the fitness function and gives a high probability of finding the global optimum. Its main drawback is the long computation time; distributed evolutionary algorithms can shorten the time of calculations [1,6].

The use of Grid techniques in optimization can lead to improvements in hardware and software utilization. Other advantages of Grids are simple and uniform end-user communication portals and programs.

The first evolutionary optimization tests [4] were performed using the Condor package [2]. The idea presented in the paper is to prepare a group of plugins and programs for the evolutionary optimization of structures using the UNICORE environment [7]. The distributed evolutionary algorithm plugin for the UNICORE environment is described (a schematic sketch of the distributed fitness evaluation follows below), and some numerical tests of structural optimization are presented in the paper.
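
A minimal, hypothetical sketch of the master-worker pattern underlying such a plugin: the fitness values of one generation are evaluated in parallel, each evaluation standing in for an FEM/BEM solve that the real plugin would submit through UNICORE rather than to local threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// Toy skeleton of distributed fitness evaluation for an evolutionary
// algorithm: the expensive step (one FEM/BEM boundary-value solve per
// chromosome) is farmed out in parallel.
public class DistributedEa {
    interface FitnessJob {
        double evaluate(double[] chromosome);   // runs the FEM/BEM solver
    }

    private final ExecutorService pool = Executors.newFixedThreadPool(8);

    public double[] evaluateGeneration(List<double[]> population, FitnessJob solver)
            throws InterruptedException, ExecutionException {
        List<Future<Double>> futures = new ArrayList<>();
        for (double[] chromosome : population) {
            futures.add(pool.submit(() -> solver.evaluate(chromosome)));
        }
        double[] fitness = new double[futures.size()];
        for (int i = 0; i < fitness.length; i++) {
            fitness[i] = futures.get(i).get();  // wait for each job to finish
        }
        return fitness;
    }
}
```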

[1] T. Burczynski, W. Kus, Optimization of structures using distributed and parallel evolutionary algorithms Parallel Processing and Applied Mathematics, PPAM2003, Revised papers, Lecture Notes on Computational Sciences 3019, Springer, pp. 572-579, 2004.
[2] Condor, High Throughput Computing, http://www.cs.wisc.edu/condor/
[3] M. Kleiber (ed.), Handbook of Computational Solid Mechanics, Springer-Verlag, 1998.
[4] W. Kus, T. Burczynski Computer implementation of the coevolutionary algorithm with Condor scheduler, KAEIOG 2004, Kazimierz, 2004.
[5] Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs. Springer-Verlag, Berlin, 1996.
[6] R. Tanese, Distributed Genetic Algorithms. Proc. 3rd ICGA, pp.434-439, Ed. J.D. Schaffer. San Mateo, USA, 1989.
[7] UNICORE Plus Final Report - Uniform Interface to Computing Resources, Joint Project Report for the BMBF Project UNICORE Plus, Grant Number: 01 IR 001 A-D, 2003.
[8] O. C. Zienkiewicz, R. L. Taylor, The Finite Element Method. The Basis, Vol. 1-2, Butterworth, Oxford, 2000




Witold Alda, Bartlomiej Balcerek, Maciej Dyczkowski, Grzegorz Kowaliczek, Stanislaw Polak, Rafal Wcislo, Adrian Zachara

Distributed Visualization System Based on Web-Services

Abstract:

In the paper we present the architecture and implementation of a general-purpose distributed visualization system based on web-services tools. The data files containing information for 3D visualization can reach enormous sizes in current computer simulations and experiments; in practice they can only be gathered and stored on remote servers with large storage devices.

The system presented in the paper can interactively convert files stored on remote servers into the common X3D format, extracting only the parts which are needed at the moment and thus reducing the number of bytes transferred to the client's workstation. In the implementation layer we use contemporary Java-based tools, such as J2EE, EJB (Enterprise JavaBeans) and the JBoss application server on the remote side, as well as the Java3D graphics library for local (client) rendering. The architecture of the system utilizes design patterns, component technology and XML notation in order to achieve clarity and flexibility. The current version of the system is supplied with sample readers/converters for the widely used PDB files, which store molecular information on proteins, and DEM files, suitable for keeping elevation data of digital 3D geographical maps. Due to the modular architecture of the system, however, it is fairly easy to add new readers.




Krzysztof Kurowski, Bogdan Ludwiczak, Jarek Nabrzyski, Ariel Oleksiak, Juliusz Pukacki

Dynamic Grid Scenarios with Grid REsource Management System in Action

Abstract:

The GridLab Resource Management System (GRMS) is an open-source meta-scheduling system developed under the GridLab project. It provides mechanisms for resource management in Grid environments: job submission, job migration, job control, application checkpointing, gathering information about submitted jobs, etc. GRMS operates on behalf of a user, using his or her credentials for authentication and authorization. In cooperation with other Grid middleware services, and using low-level core services, GRMS tries to deal with the dynamic nature of the Grid environment.

GRMS has basic mechanisms which allow more complex job management scenarios to be implemented. The basic pieces of job management are job submission and job migration with application-level checkpointing. Based on these, and with the strong assistance of the dynamic resource discovery mechanism implemented within GRMS, it is possible to provide more advanced and dynamic scenarios based on job rescheduling. The rescheduling policy checkpoints and migrates already running jobs in order to release the resources required by a job pending in the GRMS queue. The rescheduling method consists of several steps. First, resources that meet the static requirements have to be found. If such resources are available, a list of the GRMS-submitted jobs running on them is created. The system then tries to determine which jobs could be migrated to bring the required result. This step consists of two actions: first, GRMS searches for jobs after whose termination the pending job could be started immediately; second, the selected jobs are analyzed again to check which of them can be migrated to other available resources, taking into account the requirements of these jobs and the resources available at the moment. In the next step, the best job for migration is chosen by evaluating the available machines and the jobs found for migration. The selected job is then checkpointed and migrated to its new location, and the pending job is submitted to the machine on which the migrated job was running. A compressed sketch of this decision follows below.
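
A compressed, hypothetical sketch of the rescheduling decision just described; all types are invented for illustration:

```java
import java.util.List;
import java.util.Optional;

// Outline of the GRMS-style rescheduling choice: free a machine for a
// pending job by checkpointing and migrating one already-running job.
class Rescheduler {
    record Machine(String name, boolean meetsStaticRequirements) {}
    record Job(String id, Machine host, boolean checkpointable) {}

    Optional<Job> pickJobToMigrate(List<Machine> machines, List<Job> runningJobs,
                                   List<Machine> idleMachines) {
        if (idleMachines.isEmpty()) {
            return Optional.empty();          // nowhere to move a victim job
        }
        // Step 1: machines that statically satisfy the pending job.
        List<Machine> candidates = machines.stream()
                .filter(Machine::meetsStaticRequirements)
                .toList();
        // Steps 2-3: running GRMS jobs on those machines whose departure
        // would let the pending job start, and which can themselves be
        // checkpointed and restarted elsewhere.
        // Step 4: evaluate and pick the "best" victim; a real policy scores
        // machines and jobs, this sketch simply takes the first candidate.
        return runningJobs.stream()
                .filter(j -> candidates.contains(j.host()))
                .filter(Job::checkpointable)
                .findFirst();
        // The caller then checkpoints and migrates the chosen job and
        // submits the pending job to the vacated machine.
    }
}
```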

The presented rescheduling technique is very useful and brings much better results than the other possible mechanisms: keeping the job in a queue, or submitting it to an overloaded machine. We will also compare this method to the popular backfilling mechanism.




Speakers/Presenters: Jason Novotny, Michael Russell

GridSphere Portal Framework (GridLab)

Abstract:

Grid portals build upon the familiar Web portal model to offer virtual communities a single point of access to computational resources (clusters, data servers, applications, scientific instruments and computing services) as well as to the information and business services traditionally offered on the Web. The main idea is to leverage the Web to enable scientists and engineers to solve larger and more complex problems with the computing services made available to them on the Grid. The Web is becoming increasingly attractive as a means of delivering applications, given the evolution of Grid computing towards the use of Web service technologies. Moreover, widely adopted standards are making the Web a more viable platform for developing applications. For example, the portlet JSR standard (Java Specification Request 168) defines a way of packaging and sharing user interface components, or "portlets", over the Web. The portlet JSR standard makes it easier to deliver a wide variety of applications to end-users and helps to encourage interoperability between those applications.

In this tutorial, we'll describe the GridSphere Portal Framework. GridSphere is an open-source, portlet-JSR-compliant container for hosting and developing portlets. We'll highlight the base classes and tools GridSphere offers for developing sophisticated portlets that provide a consistent, professional look and feel. We'll then show how GridSphere can be easily customized and administered online to provide a complete portal solution. Next, we'll discuss the set of portlets and tools the GridSphere Project offers for developing and administering Grid-enabled portals. We'll conclude our tutorial with a brief discussion of how the Numerical Relativity Portal, under construction at the Albert Einstein Institute, utilizes GridSphere to enable physicists to manage their physics applications on the Grid. The Numerical Relativity Portal is just one example of how GridSphere can be used to develop a Grid portal tailored to the needs of a particular application community.
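
For readers unfamiliar with JSR 168, a minimal portlet against the standard javax.portlet API looks like the sketch below (the class name and greeting are, of course, just an example):

```java
import java.io.IOException;
import java.io.PrintWriter;
import javax.portlet.GenericPortlet;
import javax.portlet.PortletException;
import javax.portlet.RenderRequest;
import javax.portlet.RenderResponse;

// Minimal JSR 168 portlet: a reusable user-interface fragment that a
// compliant container such as GridSphere can host alongside other portlets.
public class HelloGridPortlet extends GenericPortlet {
    protected void doView(RenderRequest request, RenderResponse response)
            throws PortletException, IOException {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<p>Hello from a JSR 168 portlet, "
                + request.getRemoteUser() + "</p>");
    }
}
```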

The intended audience: Web and Grid application developers; scientists, engineers, and system administrators using or interested in using Grid technologies. Attendees should have some familiarity with using or developing Web portals and a basic understanding of Grid computing concepts. Attendees will get an introduction on how they can use GridSphere to develop standards-based Grid portals.




Dieter Kranzlmueller

The Austrian Grid Initiative and the Central European Grid Consortium - Combining National and Regional Efforts

Abstract:

The hype surrounding the term Grid computing seems to be calming down. Industry has started to take "the Grid" seriously, not only as a marketing buzzword but as a technology mature enough to actually be deployed. Grid computing itself seems to be in a state of consolidation, with every nation engaged in its own national Grid initiative, and large-scale projects such as EGEE promising production Grids on never before seen levels.

This contribution describes the author's view of the current situation in Grid computing, with a focus on the national Austrian Grid program and the regional Central European Grid Consortium (CEGC). Starting with the EU CrossGrid project, in many aspects a predecessor of today's and future Grid projects, the characteristics of available and deployed Grid computing technology will be reviewed, leading to an outlook on the plans of the Austrian Grid initiative and the potential impact of regional consortia such as the CEGC.




Pawel Plaszczak

The State and the Future of the Commercial Grid Computing

Abstract:

In this presentation we look at Grid computing as a set of technologies enabling resource virtualization and bringing the world towards a ubiquitous market of digital services. We look at how grids are implemented today and in which industries they are used, and we present a handful of use cases of commercial "enterprise grid" systems, based on the commercial and open-source Grid products available on the market. We then look at the obstacles faced by those who try to implement partner grids, and analyze them further to see which pieces of technology are still missing. Finally, we make an educated guess at how the open Grid systems of the near future may work, and what roles will be available for the companies and institutions wishing to participate in the Grid-based market.

The presentation is a summary of the year-long research which the author conducted for his Savvy Manager's Guide to Grid Computing, to be published in 2005 by Morgan Kaufmann. The author heads Gridwise Technologies, a consulting firm specializing in Grid solutions; prior to founding Gridwise, he worked for the Globus Project.




Coorganizers:
Institute of Nuclear Physics Polish Academy of Sciences (IFJ PAN)
Institute of Computer Science AGH
ATM S.A.
SEP O. Krakow