Cracow, Poland
October 27-29, 2003

 

Abstracts and Presentations
(in alphabetical order)



Bartosz Balis, Marian Bubak, Wlodzimierz Funika, Marcin Radecki, Tomasz Szepieniec, Roland Wismueller, Tomasz Arodz, and Marcin Kurdziel

A Concept of a Monitoring Infrastructure for
Workflow-Based Grid Applications


Abstract:

The main goal of this work is to design a Grid service for on-line performance monitoring of Grid workflow-based applications. The service is meant to provide information about running applications: the current status of the application and the Grid infrastructure, the availability of resources, and the correct operation of the services the application comprises. This information could be used by systems that ensure fault-tolerant execution of the application, by scheduling systems, and by the user who observes the execution of the application and takes relevant decisions on its operation [1].

In the case of workflow-based Grid applications [1], two aspects of performance monitoring are important: first, monitoring the status of the Grid services composing the workflow and the interactions between them; second, monitoring the internal performance of individual services. The latter aspect is already well addressed by approaches such as CrossGrid's OCM-G/G-PM [2].

Our focus in this paper is on monitoring applications built as a workflow of interacting Grid services. Support for such applications requires a completely new approach. New, grid-service-related metrics need to be introduced, e.g. the overhead due to communication between services, or the computing-power utilization of a service.

We propose the following architecture for the monitoring system for workflow-based applications. First, the individual Grid services to be monitored should be extended with a monitoring interface via which monitoring information about a component can be retrieved. Additionally, this interface may also enable instrumentation or other manipulations of the target component. This implies that parts of the monitoring infrastructure must be integrated into the grid services themselves. In addition to the local monitoring interfaces in each component, a global monitoring service should be available, which will itself be a grid service, and via which it will be possible to extract global monitoring properties combining monitoring information from a set of components belonging to a single application. An example of a global monitoring property is the "total communication volume between all components of an application".
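As a rough illustration of this two-level design, the sketch below pairs a per-service monitoring interface with a global service that aggregates the "total communication volume" property. All names and signatures are our own illustrative assumptions, not part of the proposed system.

```java
import java.util.List;

// Hypothetical per-service monitoring interface; the method names are
// illustrative only.
interface ServiceMonitor {
    long bytesSentTo(String peerServiceId); // bytes sent to a named peer service
    double cpuUtilization();                // CPU utilization of the host, in [0, 1]
}

// Hypothetical global monitoring service combining per-service data.
class GlobalMonitor {
    // The "total communication volume" example from the abstract:
    // sum of the traffic from every component to every other component
    // of one application.
    static long totalCommunicationVolume(List<ServiceMonitor> components,
                                         List<String> componentIds) {
        long total = 0;
        for (ServiceMonitor m : components)
            for (String peer : componentIds)
                total += m.bytesSentTo(peer);
        return total;
    }
}
```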

Currently, we anticipate several clients of the monitoring system. First, a performance analysis and prediction tool may use the data to visualize the application behavior. Second, a workflow composition system might make use of the monitoring services for decision support. Finally, the information about resources could be used to ensure fault-tolerant execution of the application, by scheduling systems, etc.

References:

  1. The Taverna Project http://taverna.sourceforge.net
  2. B. Balis, M. Bubak, W. Funika, T. Szepieniec, and R. Wismueller: Monitoring and Performance Analysis of Grid Applications. In: P.M.A. Sloot et al., editors, Computational Science - ICCS 2003, vol. 2657 of Lecture Notes in Computer Science, pages 214-224, St. Petersburg, Russia, June 2003. Springer-Verlag

Presentation
(PowerPoint: 148 KB)


Antonio Fuentes, Eduardo Huedo, Rubén S. Montero, Ignacio M. Llorente

A Grid Scheduling Algorithm Considering Dynamic Interconnecting Network Quality

Abstract:

For certain application domains the traditional concept of computing, based on a homogeneous and centrally managed environment, is being displaced by Grid computing, based on the exchange of information and the sharing of distributed resources by applications. The Globus toolkit has become a de facto standard in Grid computing. Globus is a core Grid middleware that supports the submission of applications to remote hosts by providing resource discovery, monitoring, resource allocation and job control services.

However, application execution on Grids continues to require a high level of expertise due to their complex nature. The user is responsible for manually performing all the job submission stages in order to achieve any functionality, namely: system selection and preparation, submission, monitoring, migration and termination. Moreover, computational Grids are dynamic environments, characterized by highly variable conditions: high failure rates and varying resource availability, performance and cost. Therefore, adaptation to changing conditions is needed to achieve a reasonable degree of both application performance and fault tolerance. In the present work, job adaptation is achieved by implementing automatic application migration following performance degradation, discovery of a better resource, requirement changes, owner decisions or remote resource failure.

The most important step in job scheduling is resource selection, which in turn relies completely on the information gathered from the grid. Resource selection usually takes into account the performance offered by the available resources, but it should also consider the quality of the interconnecting network in terms of latency and bandwidth between the Grid resources. For example, bandwidth is very important because the files involved in some application domains, like Particle Physics or Bioinformatics, are very large. This fact is especially relevant in the case of adaptive job execution, since job migration requires the transfer of large restart files between the compute hosts. Therefore, the quality of the interconnection network has a decisive impact on the overhead induced by job migration.

In this work, we present a new job scheduling algorithm that takes the quality of the interconnecting network into account by dynamically evaluating decisive communication parameters. Our scheduler gathers dynamic information about remote resources and the network (bandwidth, latency, ...) in order to choose the best available resource for job submission. The final contribution will include experimental results of the scheduling algorithm in a research testbed.
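A toy illustration of why such network-aware ranking matters is sketched below: a candidate host's raw performance is discounted by the estimated time to move a restart file to it. The formula, numbers and field names are our own assumptions, not the authors' algorithm.

```java
// Illustrative resource ranking that weighs compute performance against
// network quality; everything here is a made-up sketch.
class ResourceRank {
    static double rank(double flops, double bandwidthMBps,
                       double latencyMs, double restartFileMB) {
        // Estimated time to move the restart file to the candidate host.
        double transferTime = restartFileMB / bandwidthMBps + latencyMs / 1000.0;
        // Higher is better: raw performance discounted by migration cost.
        return flops / (1.0 + transferTime);
    }

    public static void main(String[] args) {
        // Two candidate hosts, 500 MB restart file:
        double a = rank(4.0e9, 1.0, 50.0, 500.0);  // fast CPU, slow link
        double b = rank(2.5e9, 10.0, 5.0, 500.0);  // slower CPU, fast link
        System.out.println(a > b ? "choose A" : "choose B");
    }
}
```

With these made-up numbers, the host with the faster network wins despite its slower CPU, which is exactly the effect the abstract argues for.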

Presentation
(PDF: 0.6 MB)


Marian Bubak, Marcin Radecki, Tomasz Szepieniec

A Proposal of Application Failure Detection
and Recovery in the Grid


Abstract:

Large-scale, long-running applications are common in the Grid, and the cost when such a job collapses is high. What is more, the probability that a software or hardware component involved in the application will fail during the computation is relatively high. Therefore, providing mechanisms for fault detection, together with procedures that preserve the job from total collapse, seems to be a must.

The failures that can occur have diverse characteristics. The source of a problem may lie in the network, a node, the system configuration or the application itself. A problem may be transient or permanent. The severity of a fault also differs from job to job: e.g. in a master-slave application the collapse of a slave's node is not as dangerous as the collapse of the master process. Failures range from a whole site being cut off to a decrease in performance due to an overloaded network link. Detailed recognition of fault characteristics is crucial for choosing the most suitable recovery strategy.

For each kind of failure, an appropriate recovery scenario should be worked out. The simplest scenario is just to ignore the failure if the danger is low. As a last resort we could kill and restart the whole job. A more sophisticated scenario is, for example, the migration of some processes, or restarting the application from a previous checkpoint. The right decision cannot be taken without considering the application's programming paradigm since, for example, MPI-based applications allow different recovery techniques than Grid-service-based ones.

A service facing such broadly defined fault recovery should use all available methods to monitor the Grid and to perform recovery actions on the application. The list of Grid services that should be integrated in this activity includes application and infrastructure monitors, checkpointing and migration services, schedulers and others. These services should be coordinated by a fault recovery manager, so as to profit from their combined abilities to support application fault tolerance.
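The mapping from fault characteristics to recovery actions described above can be pictured as a small decision table; the sketch below is our own illustrative reading of the scenarios, not the authors' design.

```java
// Toy decision table mapping fault characteristics to recovery actions;
// the categories and actions are illustrative assumptions.
enum Severity { LOW, MEDIUM, HIGH }

class RecoveryManager {
    static String decide(Severity severity, boolean isTransient,
                         boolean masterAffected) {
        if (severity == Severity.LOW) return "ignore failure";
        if (isTransient && !masterAffected) return "wait and retry";
        if (!masterAffected) return "migrate affected processes";
        return "restart job from last checkpoint"; // last resort
    }

    public static void main(String[] args) {
        // Slave node lost in a master-slave job: migration suffices.
        System.out.println(decide(Severity.MEDIUM, false, false));
        // Master process lost: restart from checkpoint.
        System.out.println(decide(Severity.HIGH, false, true));
    }
}
```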

References:

  1. Bubak, M., Zbik, D., van Albada, D., et al.: Portable Library of Migratable Sockets. Scientific Programming, Volume 9, Number 4/2001, pp. 211-222
  2. Balis, B., Bubak, M., Funika, W., Szepieniec, T., and Wismueller, R.: An Infrastructure for Grid Application Monitoring. In: Kranzlmueller, D., Kacsuk, P., Dongarra, J., Volkert, J. (Eds.): Recent Advances in Parallel Virtual Machine and Message Passing Interface, Proc. 9th European PVM/MPI Users' Group Meeting, Linz, Austria, September/October 2002, LNCS 2474, pp. 41-49, 2002

Presentation
(PowerPoint: 162 KB)


Aleksander Nowinski, Krzysztof Nowinski, Jaroslaw Pytlinski, Piotr Bala, Krzysztof Benedyczak

Advanced Visualization Capabilities in UNICORE

Abstract:

UNICORE is becoming one of the important grid middleware systems. The main advantage of the UNICORE environment is its powerful Graphical User Interface, which allows for job preparation, job submission and job management. The presented work builds into UNICORE capabilities for graphical postprocessing of results. Dedicated extensions to the UNICORE client (plugins) have been developed. The graphical capabilities were introduced in the form of a generic visualization plugin, as well as a plugin for postprocessing the output of the quantum chemistry code Gaussian.




Vaidy Sunderam, Dawid Kurzyniec

Alternative Frameworks for Cooperative Distributed Computing

Abstract:

Computational resource sharing across multiple administrative domains is gaining widespread use, driven by the benefits of aggregation and service-based computing. This project, comprising the Harness II and H2O systems, introduces a novel model for cooperative fault-tolerant distributed computing that emphasizes statelessness, (re)configurability, and interoperability. Sharing resources or services across administrative boundaries involves numerous technical challenges, notably those concerning security and resource allocation. The H2O architecture is designed to specify sharing relationships on a pairwise basis, thereby localizing all constraints and agreements between any provider-client pair and minimizing or even eliminating global distributed state. Upon this fabric, the Harness II infrastructure serves as an integrated platform for efficient high performance computing by aggregating distributed resources. Current distributed systems and grids are rigid, complex, and brittle. The H2O and Harness II frameworks, through their minimal-state design and component architecture that permits reconfiguration of resources as needed, aim to be flexible and robust. The architectural model that realizes this mode of distributed computing is described below.

The lowest layer in the Harness II framework is the component hosting and interaction substrate termed H2O. This layer is characterized by a lightweight but secure kernel, into which "pluglets" realizing different functions are loaded - either by the hosting provider, authorized clients, or third-party resellers. Examples of such pluglets are high-speed transport modules, parallel programming libraries such as FT-MPI, or specialized numerical solvers and application components. Pluglets interact across kernels using a specially developed communications layer called RMIX, which offers a well-understood and rich method invocation interface while permitting the use of interoperable transports. Its programming interface is based on the Remote Method Invocation paradigm but is language neutral; features such as one-way and asynchronous method invocations have been added to better support the needs of high performance distributed computing. The RMIX layer also ensures secure communications and has inbuilt support for resilience to communication failures.

Within H2O kernels, pluglets conform to a well-defined interface to interact with the kernel, and are controlled by the kernel's security policies and resource consumption restrictions. This design makes it possible for clients to load domain- or platform-specific pluglets into provider kernels to cater to application needs, while protecting resource providers from damage or excess consumption of their resources. Resource providers specify these constraints when launching kernels via a portable schema that is XML-based, thereby permitting the publication and discovery of resources using standardized mechanisms and tools - or by using simpler and more appropriate means as the situation dictates. The companion poster to this abstract depicts the architectural and philosophical foundations of the H2O and Harness II frameworks, highlights the salient features of the resource sharing model, and outlines prototype implementation and use-case scenarios.

Presentation
(PowerPoint: 250 KB)
 


Renata Slota, Darin Nikolow, Jacek Kitowski, Jerzy M. Zaczek

Architecture of the Virtual Storage System for Grid-based Access

Abstract:

One of the important problems for grid computing is the development of the middleware layer that consolidates different kinds of national and international resources. This is especially important when dealing with data distributed among different locations. Therefore, data management is essential for grid data access.

This paper describes the architecture of the virtual storage system (VSS) for grid-based access, being developed as one of the tasks of the SGIgrid project [1]. In contrast to software developed to date, the architecture is kept as simple as possible, easy to operate and maintain. The virtual storage system is aimed at integrating the mass storage facilities used in the computing centers taking part in the project. HSM-type software is used in these computing centers to provide access to mass storage hardware such as tape libraries and optical jukeboxes.

The architecture consists of the following main modules:

  • VFM - Virtual File Manager, which is responsible for VSS session authorisation, managing the files stored in the VSS, resolving virtual file names to physical replica instances, and negotiating the data transfer between the LFM and the client application,
  • LFM - Local File Manager, which manages the physical files residing on HSM systems, transfers data between the HSM system and the client application, and estimates the data access time for a specified physical file,
  • HSM - Hierarchical Storage Manager, the HSM software allowing access to data residing on tertiary storage,
  • MDB - Meta Database,
  • OR - Optimization of Replicas, which is responsible for replica selection based on the criterion of minimizing the data access time,
  • API - Application Programming Interface, allowing the client application to communicate with the VSS.
Several of our previous achievements are incorporated, such as access time estimation for tertiary storage systems [2] and index-based retrieval of video sequences [3]. An overview of existing approaches to the problem will be given in the paper.
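To make the role of the OR module concrete, the sketch below picks the replica with the smallest access time estimate, as an LFM might report it; the types, names and numbers are our own illustrative assumptions, not the VSS implementation.

```java
import java.util.List;

// Sketch of the replica-selection idea behind the OR module: choose the
// physical replica with the smallest estimated access time.
class ReplicaSelector {
    record Replica(String url, double estimatedAccessTimeSec) {}

    static Replica best(List<Replica> replicas) {
        Replica chosen = replicas.get(0);
        for (Replica r : replicas)
            if (r.estimatedAccessTimeSec() < chosen.estimatedAccessTimeSec())
                chosen = r;
        return chosen;
    }

    public static void main(String[] args) {
        var rs = List.of(
            new Replica("hsm://siteA/file42", 120.0), // replica on tape
            new Replica("hsm://siteB/file42", 2.5));  // replica in disk cache
        System.out.println(best(rs).url());           // hsm://siteB/file42
    }
}
```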

Bibliography
  1. SGIgrid: Large-scale computing and visualization for virtual laboratory using SGI cluster (in Polish), KBN Project, http://www.wcss.wroc.pl/pb/sgigrid/
  2. Nikolow, D., Slota, R., Dziewierz, M., Kitowski, J., Access Time Estimation for Tertiary Storage Systems, in: Monien, B., Feldman, R. (Eds.), Euro-Par 2002 Parallel Processing, 8th International Euro-Par Conference Paderborn, Germany, August 27-30, 2002 Proceedings , no. 2400, Lecture Notes in Computer Science, Springer, 2002, pp. 873-880.
  3. Nikolow, D., Slota, R., Kitowski, J., Nyczyk, P., Otfinowski, J., "Tertiary Storage System for Index-Based Retrieving of Video Sequences", in: Hertzberger, B., Hoekstra, A., Williams, R. (Eds.), Proc. Int. Conf. High Performance Computing and Networking, Amsterdam, June 25-27, 2001, Lecture Notes in Computer Science 2110, pp. 62-71, Springer, 2001.

Presentation
(PowerPoint: 7.16 MB)


Thomas Fahringer

ASKALON: A Tool Set for Cluster and Grid Computing

Abstract:

Presentation
(PowerPoint: 6.28 MB)


Lukasz Dutka, Jacek Kitowski

Automatic Application Builder for Grid Workflow Orchestration

Abstract:

Grid web services are direct corollaries of component architectures. In fact, they are good examples of the application of component ideas [1] to large-scale systems. Thus, the selection of grid web services can be treated as the selection of components deployed in a grid environment. However, at present there are not many existing solutions for selecting components on the fly without human interaction.

One of the rare examples of an expert system applied to semi-automatic workflow building is [2], although it addresses decision making for commercial purposes. Other examples are presented in [3-4].

In this paper the Automatic Application Builder is proposed for supporting the user in the selection of workflow elements (services or program components). The selection of a component or a service from the repository is based on a rule-based expert system that incorporates the requirements and state of the application to be built. The rules are developed by a human expert. The advantages of the system are its flexibility in development (since the components or services can be prepared independently) and its reasoning about the choice of components or services according to knowledge of the application requirements.
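To convey the flavor of rule-based component selection, here is a minimal sketch in which each rule pairs a condition over application requirements with a candidate component; all names are illustrative assumptions, not the actual AAB implementation.

```java
import java.util.*;
import java.util.function.Predicate;

// Minimal rule-based component selection: the first rule whose condition
// holds for the application requirements nominates its component.
class ComponentSelector {
    record Rule(Predicate<Map<String, String>> when, String component) {}

    static Optional<String> select(List<Rule> rules, Map<String, String> reqs) {
        return rules.stream()
                .filter(r -> r.when().test(reqs))
                .map(Rule::component)
                .findFirst(); // first matching rule wins
    }

    public static void main(String[] args) {
        List<Rule> rules = List.of(
            new Rule(r -> "tape".equals(r.get("storage")), "HsmDataHandler"),
            new Rule(r -> true, "DefaultDataHandler")); // fallback rule
        // Prints HsmDataHandler for a tape-storage requirement.
        System.out.println(select(rules, Map.of("storage", "tape")).get());
    }
}
```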

The proposed approach has already been successfully implemented for large-scale WWW-based information systems [5] as well as for optimization of access to grid storage [6-8].

Bibliography

  1. C. Szyperski, Component Software: Beyond Object-Oriented Programming. ACM Press and Addison-Wesley, New York, NY, 1998.
  2. http://www.connective-edi.com/news/CTI_OctNewsLetter_103001.asp
  3. C. Duvel, Establishing rule-based models to implement workflow within construction organizations, PhD Thesis, UMI 9976532, University of Florida, 1999
  4. C. Duvel, and R. R. A. Issa, The application of expert systems to controlling workflow within the construction management environment, Artificial Intelligence Applications in Civil and Structural Engineering, Eds. (B. Kumar and B.H.V. Topping), Civil-Comp Press, 1999, pp. 61-72
  5. L. Dutka and J. Kitowski, Flexible Component Architecture for Information WEB Portals, in: P. Sloot, D. Abramson, A. Bogdanov, J. Dongarra, A. Zomaya, Y. Gorbachev (Eds.), Proc. Computational Science - ICCS 2003, Int. Conf. St. Petersburg Russian Federation, Melbourne Australia, June 2-4, 2003, LNCS, vol. 2657, 2003, pp. 629-638
  6. L. Dutka and J. Kitowski, Application of Component-Expert Technology for Selection of Data-Handlers in CrossGrid, in: D. Kranzlmüller, P. Kacsuk, J. Dongarra, J. Volkert (Eds.), Proc. 9th European PVM/MPI Users' Group Meeting, Sept. 29 - Oct. 2, 2002, Linz, Austria, LNCS, vol. 2474, Springer, 2002, pp. 25-32
  7. L. Dutka, R. Slota, D. Nikolow and J. Kitowski, Optimization of Data Access for Grid Environment, 1st European Across Grids Conference February, 13-14, 2003 Universidad de Santiago de Compostela, Spain, LNCS, Springer, 2003, submitted
  8. K. Stockinger, H. Stockinger, L. Dutka, R. Slota, D. Nikolow, J. Kitowski, Access Cost Estimation for Unified Grid Storage Systems, Supercomputing 2003 IEEE Conf., Nov. 2003, accepted

Presentation
(PowerPoint: 3.58 MB)
 


Ariel Garcia, Marcus Hardt, Yannick Patois, Ulrich Schwickerath

Collaborative Development Tools

Abstract:

Groups of software developers, especially when spread over different physical locations, need a whole range of tools for managing mailing lists, bug trackers, code versioning, website hosting, nightly builds, and more. The open-source world offers a wealth of tools that allow a solution to be set up for every requirement. However, doing so consumes manpower - usually a scarce resource.

The aim of the Savannah project is to provide a software framework, running on one central server and managed by a few experts, that provides the necessary tools to its users. All services are available with a single sign-on. Currently the following services are provided:

  • General information page showing the latest news, contact information for project members and the status of project activity
  • News service: allows for posting news and for discussion about news items
  • Mailing forums: similar to mailing lists but with a web-based forum archive
  • Download service: for file distribution; file collections can be dynamically created on a daily basis
  • Bugtracker: a customizable bug tracking system similar to Bugzilla
  • Support Trackers: providing a simple feedback function for the user community
  • Patch Manager: allows anybody to suggest patches to the source code
  • Task Manager: to structure the project into subtasks that can depend upon each other
  • CVS: management of the CVS repository assigned to the project
  • Autobuild: a tool for nightly builds of the code in the CVS repository
One instance of Savannah has been installed at http://gridportal.fzk.de within the CrossGrid project. This server is dedicated to the support of Grid- and HEP-related software development.

Presentation
(PDF: 182 KB)


V.N. Alexandrov and S. Mehmood Hasan

Collaborative Tools for the Grid

Abstract:

Our efforts described in this paper are aimed at combining the capabilities of Collaborative Computing with those of Grid Computing. The work aims to complement efforts in Grid Computing by providing human-centred techniques and technologies for facilitating collaborative, computer-based cooperative work.

The notion is to provide interactive and real time visualisation, joint analysis of results and seamless access to data repositories. This cooperative work must be conducted within a coherent and inclusive collaborative environment. This can be achieved by the construction of a virtual work environment on multiple computer systems connected over the grid. The virtual work environment will use the Grid as its underlying infrastructure, using grid security mechanisms to authenticate participants in the collaboratory.

In this setting, participants interact with each other, simultaneously access and operate computer applications, refer to global data repositories or archives, collectively create and manipulate documents, perform computational transformations, and collaboratively visualise the results. The collaborative experience will be enhanced by providing integral support for human audio/video communication.

We will discuss recent work on grid enabled collaborative tools, and outline our plans to investigate and explore innovative enabling technologies to support collaborative, distributed, grid-based problem solving.

Presentation
(PowerPoint: 178 KB)


Pawel Pisarczyk

Commodity Computing Clusters - Next Generation Supercomputers?

Abstract:

Presentation
(PowerPoint: 883 KB)


Florian Schintke and Jan Wendler

Computational Fluid Dynamics in the Grid Using FlowGrid

Abstract:

We present the architecture of FlowGrid, a software package to enable Computational Fluid Dynamics (CFD) applications in the Grid. FlowGrid revolutionizes the way CFD simulations are set up, executed and monitored. In a network of Grid-enabled CFD centers across Europe, the development and validation of software and knowledge for Grid-based CFD computations takes place. This CFD Virtual Organization provides easy and flexible access to CFD resources for the industrial end users.

Computational Grids are ideal for CFD simulations since, in general, the computational resources planned for such simulations turn out to be either insufficient or underutilized for most of the licensed time. The primary advantage of bundling such resources into a CFD Virtual Organization is the flexibility of providing on-demand computational power. A special property of CFD simulations is the need for synchronous communication between the subjobs, which makes it challenging to execute such jobs on the Grid.

The FlowGrid architecture consists of:

  • a user client called GenIUS, which performs the task partitioning and operates the Grid for the user
  • the middleware FlowServe, which aggregates available resources and allocates and distributes job requests to computing resources
  • the backend, which executes the parallel CFD simulation code on clusters and high-performance computers
  • a database that stores meta-information about resource availability and costs
  • the portal, which provides software to subscribers, allows resource providers to set policies and prices, and lets the administrator manage the system manually
With GenIUS running on Microsoft Windows and FlowServe running on Linux, these two operating systems are combined within a single Grid environment. We describe the protocol between GenIUS and FlowServe, which also covers interfaces of FlowServe to other user frontends to allow an easy integration of other CFD programs into the Grid.

With FlowServe, preliminary results are provided to the user at runtime, so that the user can see how the simulation converges and can discover problems in the simulation while it is running. Current non-Grid-aware CFD applications allow the calculations to be adapted at runtime; this kind of adaptation will also be supported by FlowServe.

Presentation
(PDF: 960 KB)


Marian Bubak, Michal Turala

CrossGrid Project in Its Halfway: Achievements and Challenges

Abstract:

Presentation
(PowerPoint: 6.36 MB)


Piotr Nyczyk, Andrzej Ozieblo, Marcin Radecki

CrossGrid Testbed Cluster at ACK CYFRONET AGH

Abstract:

The testbed cluster at ACK Cyfronet AGH in Kraków is a part of the CrossGrid Project testbed network, which involves 16 partners from 9 countries across Europe. Our testbed hardware is based on a rack cluster of 1U Intel dual-processor units produced by RackSaver Inc. The initial configuration of four Intel dual P III nodes has been augmented by 23 dual-Xeon units, each with two 2.4 GHz Xeon processors, 1 GB memory, a 40 GB disk and 1000 Mb Ethernet ports. Communication between nodes is provided by an HP switch with forty 100 Mb ports and three 1000 Mb uplink ports. Currently the bandwidth of the connection to the national research network is 622 Mbit/s, but it will be increased to 10 Gbit/s in a few months. A dedicated KVM (keyboard, mouse, monitor) 1U unit, incorporated into the cluster rack, is used for monitoring all elements. Disk space has been increased by additional 640 MB 1U units (Quardian 4400) and a 4 GB 4U disk array. 10 additional dual-Xeon nodes will be added in a few weeks.

Currently we are running EDG 1.4 together with the separately installed LCG-1 testbed software, with Globus 2 for Grid services and OpenPBS for scheduling. Installation is automated through the LCFGng installation and configuration system, with configuration profiles synchronized with a central repository. Several software packages have been installed: ATLAS for HEP applications, Gaussian, Mathematica and a development environment (C/C++, Java, Fortran). For monitoring the entire cluster, the GANGLIA distributed system with additional temperature sensors is used.

Presentation
(PDF: 586 KB)
 


Olivier Martin

DataTAG Presentation

Abstract:

Presentation
(PowerPoint: 3.04 MB)


Marian Bubak, Tomasz Gubala, Maciej Malawski and Katarzyna Rycerz

Design of Distributed Grid Workflow Composition System

Abstract:

The Grid is a complex environment with many geographically distributed resources which may be connected together in order to execute Grid applications. These Grid applications consist of many independent and possibly heterogeneous modules, connected together to achieve the required functionality. Discovering and joining such elements, distributed throughout a vast and frequently changing Grid environment, can be difficult.

We propose a novel design for a Grid application workflow composition system, intended to support the user and other systems.

First, we briefly describe our previous prototype solution to the workflow composition problem, the Application Flow Composer (AFC) system. The brief description of the system architecture and internal mechanisms is followed by a discussion of the advantages and disadvantages of this first approach [1]. Taking these into consideration, we propose a new composition system based on the Grid services concept. It consists of a fully distributed registry storing descriptions of available services, including information about service semantics. The second part is a semi-automatic composition system (temporarily called AFC2), which uses the registry to generate Grid workflows requested by the user. The AFC2 system is based on peer-to-peer technology with a mechanism of weak peer migration. The concept of peer specialization enables more efficient service discovery and provides the basis for the system's learning capability.

The main part of the paper presents the proposed design of the new system and describes it at various levels of abstraction. Starting from conceptual diagrams, the discussion moves to a more detailed description, showing both the static and the dynamic aspects of system behavior. Afterwards, we discuss possible technologies which can be applied to the system implementation, accompanied by a list of each technology's advantages.

We conclude with a discussion of the expected improvements the new system should demonstrate.

References:

  1. Bubak, M., T. Gubala, M. Malawski, K. Zajac: Automatic Flow Building for Component Grid Applications. Presented at PPAM 2003 Conference, Czestochowa, Poland, September 7-10, 2003, to be printed in LNCS

Presentation
(PDF: 173 KB)


Andreas Hoheisel and Uwe Der

Dynamic Workflows for Grid Applications

Abstract:

There are several approaches in the Grid computing community to execute not only single tasks on single Grid resources but also to support workflow schemes that enable the composition and execution of complex Grid applications. The most commonly used workflow model for this purpose is the Directed Acyclic Graph (DAG). DAGs have a very simple structure and are easy to use; they possess, however, two relevant disadvantages: they do not support bidirectional coupling and it is not possible to explicitly define loops.

Within the establishment of the Fraunhofer Resource Grid, we developed a Grid Job Definition Language (GJobDL) that is based on the concept of Petri nets instead of DAGs. Petri nets are graphical representations of the workflow of discrete systems. In contrast to DAGs, which only describe the dynamic behaviour of the system, Petri nets also describe the system's state. The type of Petri net introduced here corresponds to the concept of Petri nets with individual tokens (coloured Petri nets) and constant arc expressions.

The Grid Job Definition Language is used to describe the workflow of a Grid application on an abstract level. This description is independent of the Grid infrastructure and defines the relationships between the software components (transitions) and the data (places). Transitions can be annotated with conditions that depend on the tokens moving along the arcs of the Petri net. During workflow execution, the abstract workflow must be concretized in order to be mapped onto the real Grid environment. This requires dynamic completion of the workflow based on up-to-date information. It may be necessary to introduce new tasks - such as data transfers, deployment of software, authorization requests, and data retrievals. These tasks can be represented by sub-Petri nets that replace parts of the existing Petri net during the runtime of the Grid application.
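The basic firing rule behind such a workflow net is easy to state in code. The toy executor below treats places as token counters and fires a transition only when all of its input places are marked; a real coloured net would additionally carry token values and arc expressions. Everything here is an illustrative simplification, not GJobDL itself.

```java
import java.util.*;

// Toy Petri-net executor: places hold tokens (data), a transition
// (software component) fires only when all of its input places are marked.
class PetriWorkflow {
    Map<String, Integer> marking = new HashMap<>(); // place -> token count

    boolean fire(List<String> inputs, List<String> outputs) {
        for (String p : inputs)
            if (marking.getOrDefault(p, 0) == 0) return false; // not enabled
        for (String p : inputs) marking.merge(p, -1, Integer::sum);
        for (String p : outputs) marking.merge(p, 1, Integer::sum);
        return true;
    }

    public static void main(String[] args) {
        PetriWorkflow net = new PetriWorkflow();
        net.marking.put("rawData", 1);
        // "preprocess" consumes rawData and produces cleanData: fires (true).
        System.out.println(net.fire(List.of("rawData"), List.of("cleanData")));
        // Firing again fails (false); a loop in the net, which DAGs cannot
        // express, would explicitly route a token back to the input place.
        System.out.println(net.fire(List.of("rawData"), List.of("cleanData")));
    }
}
```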

Only a few Grid initiatives include advanced fault management. Mostly, fault management is implicitly predefined by the Grid architecture and results in the re-scheduling, recovery or migration of single tasks in case of a fault. We propose a concept for the fault management of entire job workflows, in which the fault management is explicitly modelled within the workflow model. This can be done in a user-defined manner, or automatically by introducing new fault-management tasks based on fault management templates.

Presentation
(PDF: 1.3 MB)


Andreas Gellrich, Jacek Nowak, Maxim Vorobiev

EDG 1.4 RPM Package Installation on DESY Linux 4
(SuSE 7.2) Machines


Abstract:

The workstations at DESY (Deutsches Elektronen-Synchrotron) run DESY Linux v.4, a customized version of SuSE Linux v.7.2. It uses AFS user accounts, and most of the DESY libraries and applications are also shared through AFS. This customized Linux distribution has been used as the base for an EDG 1.4 testbed installation. The installation was performed following the manual installation instructions from the "EDG installation guide". Binary RPM packages from the official EDG 1.4 distribution were used. As the only officially supported platform for EDG 1.4 is RedHat Linux 6.2, a lot of small modifications to the installation procedure are necessary and many problems need to be fixed. The testbed at DESY proves that it is possible to install all the main Grid nodes (WN, CE, SE, RB, BDII, RC, UI) on Linux flavors other than the officially supported one.

In this paper we would like to present the main issues which arise when trying to install EDG 1.4 on DESY Linux v.4, and how to overcome them.

Presentation
(PDF: 252 KB)


Klaus-Dieter Oertel

EuroGrid Overview

Abstract:

Presentation
(PowerPoint: 1.63 MB)


Katarzyna Rycerz, Marian Bubak, Maciej Malawski, Peter Sloot

Execution Support for HLA-based Distributed Interactive Applications

Abstract:

This paper presents the design of a system that supports the execution of HLA [1] distributed interactive simulations in an unreliable Grid environment. The design of the architecture is based on the OGSA [2] concept, which allows for modularity and compatibility with Grid Services already being developed. First of all, we focus on the part of the system that is responsible for the migration of one or more HLA-connected components of a distributed application in the Grid environment. Preliminary results can be found in [3]. We present a runtime-support Migrator Library (ML) for easily plugging HLA simulations into the Grid Services Framework. We also present the impact of execution management (namely migration) on overall system performance.

As HLA [1] is explicitly designed to support interactive distributed simulations, it provides various services needed for that specific purpose, such as the time management useful for time-driven or event-driven interactive simulations. It also takes care of data distribution management and allows all application components to see the entire application data space in an efficient way. On the other hand, the HLA standard does not provide automatic setup of HLA distributed applications. In HLA there is no mechanism for migrating federates in response to dynamic changes of host loads or failures, which is essential for Grid applications. In our opinion, the OGSA [2] concept provides a good starting point for building and connecting independent blocks of the different functionality of an HLA execution management system.

Our solution introduces HLA functionality into the Grid Services framework, extended by specialized high-level Grid Services. This allows for execution control through Grid Service interfaces, while the internal control and data of the distributed interactive simulation flow through HLA. The design also supports the migration of federates (components) of HLA applications according to environmental conditions. In the full version of the paper we also present performance results for migration.

This research is partly funded by the European Commission under the IST-2001-32243 project "CrossGrid".

References:

  1. HLA specification, http://www.sisostds.org/stdsdev/hla/
  2. Foster I., Kesselman C., Nick J., Tuecke S.: The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. Open Grid Service Infrastructure WG, Global Grid Forum, June 22, 2002
  3. Zajac, K., Bubak, M., Malawski, M. and Sloot, P.: Towards a Grid Management System for HLA-based Interactive Simulations. To appear in: Proceedings of the 7th IEEE International Symposium on Distributed Simulation and Real-Time Applications, 2003

Presentation
(PDF: 291 KB)
 


Jaroslaw Wypychowski, Krzysztof Nowinski, Piotr Bala

Generic Plugin - a New Concept for Developing User Interfaces in the Grid

Abstract:

UNICORE is becoming one of the important grid middleware systems. The main advantage of the UNICORE environment is its powerful Graphical User Interface, which allows for job preparation, job submission and job management. These generic features can be extended using the plugin concept, which allows the creation of application-specific interfaces adjusted to the users' needs. However, plugin development is still time-consuming and requires significant knowledge of Java. We have proposed and developed a generic plugin which allows the typical graphical user interface for an application to be created from an XML definition. This allows even inexperienced users to build a powerful application interface in the grid environment.




Peter Praxmarer, Paul Heinzlreiter, Dieter Kranzlmueller

GMF: A Framework for Module Management on the Grid

Abstract:

Utilization of grid environments often requires parallel and distributed programming to solve a single but large-scale problem. In the traditional approach the workload of different modules is distributed over various heterogeneous grid resources, which are then interconnected in some kind of pipeline or graph structure. The basic functionality to achieve this distribution and interconnection is provided by the Globus-Toolkit [2], which includes software for security, information infrastructure, resource management, data management, communication, fault detection, and portability. While the functionality of Globus is fundamental for the application of grid environments, its low-level interface requires substantial knowledge and effort to utilize it in applications.

The Grid Management Framework GMF addresses this problem by providing a higher level of abstraction over Globus services. The aim of GMF is to build a framework that encapsulates common tasks necessary for building modules within an object-oriented framework, and to instantiate these objects as modules within a configurable pipeline or graph structure over the grid. The main benefit of GMF is thus that the application programmer is freed from the tedious and error-prone tasks necessary to use Globus and can instead focus on the actual grid-enabled application.

The functionality of GMF is provided by two kinds of modules: an arbitrary set of worker modules and a master module. The worker modules implement user-defined functionality for solving the application tasks. In addition, they receive control events from the master module, such as start, stop, accept a connection, connect to another module, perform a user-defined checkpoint (if supported) and migrate to another resource. Each of these worker modules is instantiated through GRAM by the master module, which is thus capable of generating the worker pipeline (or graph) and controlling its operation. The commands from the master module are forwarded to the worker modules via a separate control connection, which is established at module startup.
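The control events listed above suggest a worker-side interface roughly like the following. GMF itself is a C++ framework; Java is used here only for uniformity with the other sketches in this document, and the signatures are our own guesses for illustration, not GMF's actual API.

```java
// Rough rendering of the control interface a GMF worker module appears to
// expose to the master; all signatures are illustrative assumptions.
interface WorkerModule {
    void start();                          // begin processing
    void stop();                           // halt processing
    void acceptConnection(String fromId);  // accept a link from another module
    void connectTo(String toId);           // open a link to another module
    boolean checkpoint(String uri);        // user-defined checkpoint, if supported
    void migrateTo(String resource);       // restart from checkpoint elsewhere
}
```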

In addition to the basic functionality of GMF, various other parts of the Globus API are encapsulated in the object-oriented C++ framework with integrated error handling (for services such as GlobusIO, GlobusFTPClient, GlobusCommon, GlobusGRAMClient). The error handling procedures can easily be exchanged on a per-operation basis to meet user requirements. Besides that, GMF enriches the functionality of GlobusIO with multiplexed connections and a buffered I/O mode to increase the throughput of I/O connections. Furthermore, the module framework of GMF is designed to support user-defined checkpoints that allow the migration of a module within a heterogeneous computing environment.

The current version of GMF is an integral part of the Grid Visualization Kernel (GVK) [1]. The idea of GVK is to build a visualization pipeline over a set of grid resources, thus enabling scientific visualization on the grid. In addition to GVK, the use of GMF is also being investigated for the distributed program analysis environment DeWiz, which delegates the analysis of program state data to different modules on the grid.

Acknowledgement:
The work described in this abstract is partially supported by the EU CrossGrid Project under contract number IST-2001-32243. We highly appreciate the contribution of our colleagues at GUP Linz, most notably Herbert Rosmanith.

References:

  1. Paul Heinzlreiter, Dieter Kranzlmueller, "Visualization Services on the Grid", Parallel Processing Letters (PPL), Vol.13, No.2, pp. 135-148 (June 2003)
  2. Ian Foster, Carl Kesselman, Steven Tuecke, "The Anatomy of the Grid: Enabling Scalable Virtual Organizations", International J. Supercomputer Applications, 15(3), 2001

Presentation
(PDF: 300 KB)


Thilo Ernst, Jochen Wauer

Grid Content Evolution and Grid Content Management

Abstract:

A complex, distributed, Grid-based Science Portal or similar platform, in which data and model resources from various authors and organizations are integrated on an ongoing basis over substantial time horizons, to be offered for shared use in virtual organizations, cannot realistically be considered a "static" system which is "finished" at some defined point in time.

Of course, in projects developing such platforms, designated deliverables and milestones must be committed to in order to control project progress, make sure demonstrations and pilot operations can be carried out, etc.

However, in order to be successful beyond the start phase, such platforms also need to offer strong support for further evolution - the integration of new data sources and models, versioning, etc.

The VirtualLab project, an ongoing collaboration of Fraunhofer FIRST and DLR (German Aerospace Center), in which a specific Science Portal has been built and which in its next version will fully rely on Grid technology, has provided valuable insights here.

DLR primarily views this platform as a new technology transfer channel for identifying and trying out "external application potential" for in-house developed scientific software. Such software is produced and improved at DLR (and likewise at similar author/vendor organizations) not in isolated, infrequent activities but rather on a continuous, day-to-day basis. Unsurprisingly, end users who can easily access this software (and data resources) remotely from their desktops do not expect outdated material.

It thus should be as easy as possible to integrate new or improved models and data resources into the platform. If this is not easy enough (that is: possible for people with basic programming skills but without substantial Grid or Web development background), the platform risks "starvation" in the long run.

For these reasons, we are carefully studying the Grid resource integration process in ongoing relevant projects and aim at designing adequate support for this process in the next generation of VirtualLab.

Strong parallels to traditional web content management are expected here - just that the concept of "content" now extends beyond the traditional interpretation of hypertext and multimedia documents to cover general data and executable resources as well.

Presentation
(PowerPoint: 430 KB)


Ludek Matyska

Grid Infrastructure Monitoring and Management - Lessons from the EU DataGrid and GridLab Projects

Abstract:

Presentation
(PDF: 1.38 MB)


Zsolt Nemeth

Grid Performance, Grid Benchmarks, Grid Metrics

Abstract:

Grids, as an emerging infrastructure for novel ways of computing, raise various theoretical and technical questions. Despite intensive research work, these questions are sometimes not even articulated. Is there a common understanding of what grids are? Is there an accepted definition of grid performance? Are traditional performance analysis techniques adequate for grids? The presentation points out problems and questions related to performance analysis within the framework of the new computing paradigm and proposes a new scenario for benchmarking.

Presentation
(PowerPoint: 537 KB)


Sergio Andreozzi, Antonia Ghiselli, Cristina Vistoli, Sergio Fantinel, Gennaro Tortone, Natascia De Bortoli

GridICE: a Monitoring Service for the Grid

Abstract:

The Grid is a new paradigm of distributed computing that enables the coordination of resources and services not subject to centralized control. These resources may span multiple administrative domains, machine architectures, and software boundaries.

The management of this complex system, that is distributed by nature, has to deal with the heterogeneity of the resources and the decentralization of the ownership. An appropriate organization type that can monitor and control a multi-institutional grid is under investigation. Such an organization is called Grid Operation Center (GOC). Its characteristics, capabilities and use cases are being designed.

With this paper we want to present to the grid community our research and development results in the area of grid monitoring infrastructure for Grid Operation Centers. Our work gains from our experience with the WorldGrid event, with the information modeling of grid services related to the GLUE Schema, and from a close collaboration with the LHC Computing Grid (LCG) Project.

We define Grid Monitoring as the activity of measuring significant grid-resource-related parameters in order to analyze the usage, behavior and performance of the grid, and to detect and give notice of fault situations, contract violations and user-defined events.

The current outcome of our activity is a monitoring infrastructure called GridICE. In its first release, we have privileged aspects such as easy integration with the current production grid middleware, and the modularity of the components in line with the separation-of-concerns design principle. The current architecture is structured in six layers, from the initial producers of monitoring data to the final consumers of monitoring information.

The first layer is the measurement service, whose task is to probe resources for the simple and composite metrics defined in the information model. These metrics mostly refer to quality aspects. The second layer is the publisher service, whose task is to offer the gathered data to potential consumers. Our decision was to rely on the available grid information service, that is, the Globus MDS 2.x, an LDAP-based solution for access to distributed time-sensitive data. The advantages of this choice are the availability of a distributed query engine and a standard interface to data from different resources. The drawbacks are the need for continuous polling (no event-based data delivery is implemented) and the absence of persistent storage for historical data. With proper design choices we have managed to soften the former limitation, while we solved the latter by introducing a data collector service in the fourth layer. This service is provided with a self-discovery feature that can automatically detect new grid services and configure them to be properly monitored. The fifth layer is a set of two services: a detection and notification service, providing a flexible and configurable means for event detection and notification actions, and a data analyzer service, providing performance analysis, usage levels, and general reports and statistics. The sixth layer is the presentation service, a web-based graphical user interface that offers concise monitoring information. This is designed on a role-based strategy, providing different views depending on the type of information consumer.

GridICE, release one, has already been selected to monitor several multi-institutional grids. The first large deployment started with the LCGpre1-CMS grid. Afterwards, it was selected for LCG1 grid monitoring in both the testing and the deployment testbeds. Finally, it has been deployed for the Italian Grid of Sciences.

Our future work will focus on the consolidation and improvement of the current infrastructure. Moreover, new solutions to overcome the aforementioned limitations will be investigated.




Holger Marten, K.-P. Mickel

GridKa – the German Regional Tier-1 Computing Centre: Status, Strategy and Future Plans

Abstract:

The Grid Computing Centre Karlsruhe, GridKa, will be one of the 10-15 largest computing centres within the LHC Computing Grid Project. In 2001, 40 groups of German particle physicists initiated the construction of GridKa as a German Regional Tier-1 Centre in the LHC framework, and the official inauguration of GridKa took place at the end of October 2002. Today, GridKa already hosts 460 Linux processors, 110 TeraBytes of usable disk space and 170 TeraBytes of tape storage. Part of these resources is made available to already-running high energy physics experiments: BaBar at SLAC, CDF and D0 at FermiLab, and Compass at CERN will generate a data volume of about 1/10 of that of LHC during the next years. They are optimal candidates to test and validate the scaling of the hardware and software infrastructure. In 2007 - at the startup of LHC - GridKa will host about 4000 processors, 1500 TeraBytes of online and 3800 TeraBytes of tape storage, and will co-operate via a few dozen Gigabit connections with hundreds of other grid installations worldwide. The paper summarizes the organizational structure of GridKa and current strategies for the infrastructure setup, and outlines some of the related R&D projects.

Presentation
(PowerPoint: 15.1 MB)


Ladislav Hluchy, Ondrej Habala, Branislav Simo, Jan Astalos, Viet D. Tran, Miroslav Dobrucky

Grid-based System for Flood Forecasting

Abstract:

This paper presents our experience with, and the current status of, a Collaborative Problem Solving Environment for Flood Forecasting under development as a part of the IST CROSSGRID project. Over the past few years, floods have caused severe damage throughout the world, and much of Europe has been heavily threatened. Therefore, modeling and simulation for flood forecasting, in order to predict floods and take the necessary preventive measures, has become a very important matter. The environment described here uses Grid technology to interconnect the experts, data and computational resources needed for quick and correct flood management decisions. At the core of the system lies a coupled set of simulation models used to predict precipitation and temperature, hydrological river status and hydraulic events in target areas. The environment and its web-based interface also provide basic communication tools enabling its users to cooperate. A Virtual Organization for Flood Forecasting using this environment may consist of several cycle providers, storage providers, end users, experts and developers.

Forecasting of flood events requires quantitative precipitation forecasts as well as forecasts of temperature (to determine snow accumulation/melting). The system makes use of the ALADIN/SLOVAKIA model. ALADIN is a LAM (Limited Area Model) developed jointly by Meteo France and cooperating countries. In the next stage we use several hydrological simulation models; which model is applied depends on the conditions, needs, situation and territory, and the models can also be used in combination. For hydraulic predictions, FESWMS (Finite Element Surface-Water Modeling System) Flo2DH is used, which is a 2D hydrodynamic, depth-averaged, free-surface, finite element model. Flo2DH computes water surface elevations and flow velocities for both super- and sub-critical flow at nodal points in a finite element mesh representing a body of water (such as a river, harbor, or estuary). Simulation of floods is very computationally expensive: several days of CPU time may be needed to simulate large areas. For critical situations, e.g. when an oncoming flood is simulated in order to predict which areas will be threatened and to take the necessary preventive measures, long computation times are unacceptable. Therefore, FESWMS Flo2DH was parallelized in order to achieve better performance.

The storage space for simulation outputs and the direct measurements used by the application is provided by II SAS. Hourly outputs of the meteorological simulation, hydrographs provided by the hydrological part of the cascade, and selected hydraulic outputs will be stored. The storage will also hold configuration files for the simulations and some other resources needed to operate the application. The stored files are accessible through the standard Grid tools used in the CrossGrid testbed. We are also working on a common description scheme for these files and on a way to store the metadata in a Grid-aware database system. The metadata structure will include detailed information about the origin of the file, the time of its creation, the person who created it, etc. In case the file is the output of a simulation, the metadata will also contain the names of the input files, the model executable and the configuration files.

Presentation
(PowerPoint: 10.6 MB)


Roger Menday

GRIP: Creating Interoperability between Grids

Abstract:

Presentation
(PDF: 618 KB)


Roger Menday and Philipp Wieder

GRIP: the Evolution of UNICORE Towards a Service Oriented Grid

Abstract:

The current UNICORE software implements a vertically integrated Grid architecture providing seamless access to various resources within different Virtual Organizations. The software is deployed and developed by companies, research and computing centres and projects throughout Europe coordinated by the UNICORE Forum (http://www.unicore.org).

Interoperability between two different Grid infrastructures, UNICORE and Globus, enlarges the range of resources and services available to each system, and was the motivation for the Grid Interoperability Project (GRIP, funded in part through EC grant IST-2001-32257). GRIP designed and implemented an interoperability layer, the capabilities of which have been demonstrated at conferences and workshops. In addition, the project is contributing to the standardization efforts within the Grid community by participating in or leading Global Grid Forum activities.

With the advent of the Open Grid Services Architecture (OGSA) and Infrastructure (OGSI) and the increasing usage of Web Services for the operation of Grids, the focus of the project changed. One of the benefits Web Services bring to Grid computing is the concept of loosely coupled distributed services. Merging the idea of "everything being a service" with the achievements of the Grid community led to Grid Services, enabling a new approach to the design of Grid architectures. The adoption of XML and the drive for the standardisation of OGSI-compliant protocols provide the tools to move closer to the promise of interoperable Grids. An early demonstrator validated the correspondence of UNICORE's architectural model with the OGSA/I approach and encouraged GRIP to shift its efforts towards the development of an OGSA/I-compliant Grid based on the UNICORE architecture.

In this paper we discuss UNICORE as an example of the evolution of a Grid system towards a service-oriented Grid, primarily focusing on architectural concepts and models. Based on the current architecture and the enhancements provided by GRIP, we describe the first steps already taken to integrate Web and Grid Services into UNICORE. This includes the provision of OGSI-compliant port types in parallel to the proprietary interfaces, as well as the design of XML-based protocols. Furthermore, we present the roadmap adopted by GRIP to achieve consistent development towards an OGSA implementation. In addition to the GRIP-related achievements and plans, we report on the current status of OGSA/I standardisation. We also consider the evolving Web Service standards and relate them to the UNICORE architecture, particularly with regard to the recent developments towards a service-oriented Grid architecture.

Presentation
(PDF: 314 KB)


Jarek Nabrzyski

GRMS: GridLab Resource Management System

Abstract:

Presentation
(PowerPoint: 2.18 MB)


Kazimierz Balos, Leszek Bizon, Michal Rozenau, Krzysztof Zielinski

Interoperability Architecture for Grid Networks Monitoring Systems

Abstract:

Grid networks are computing environments in which resource brokering and load balancing require a reliable monitoring system, with interfaces adequate to the area in which they operate and with security mechanisms ensuring that confidential data are transferred in a secure and efficient manner. Considering that a typical grid network consists of over a dozen clusters, each with several worker nodes, there is also a need for an efficient way to install, run and maintain such a monitoring system.

The aim of this study is to create monitoring-system interfaces suitable for distributed and heterogeneous environments, especially clusters and grid networks. The study covers the development of a scalable and easy-to-maintain system that exposes monitored parameters, such as network traffic and the availability of nodes' infrastructure resources, to external applications for further processing.

This paper covers the topic of hierarchical information aggregation, both in the context of the local area networks where the computing elements of clusters work, and in the context of the wide area networks where clusters cooperate and an appropriate protocol for information interchange must be chosen. The approach shown in this article presents our current achievements in using Sun's early implementation of the Java Management Extensions (JMX) technology for communication at the cluster level, and Web Services with the SOAP protocol for communication at the grid network level. It also covers a way of dynamically registering monitored stations using an open implementation of discovery services, which can be used in all environments running under an Open Source license.
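To illustrate the cluster-level mechanism, the following minimal Java sketch (our illustration, not the authors' code; the MBean, attribute and ObjectName choices are assumed) registers a worker node's monitored parameters as a standard JMX MBean, which a cluster-level agent can query and re-publish over SOAP at the grid level:

    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    // Management interface; JMX derives the attribute names (CpuLoad, BytesIn)
    // from these getters. Both names are purely illustrative.
    interface NodeLoadMBean {
        double getCpuLoad();
        long getBytesIn();
    }

    class NodeLoad implements NodeLoadMBean {
        public double getCpuLoad() { return 0.42; }  // a real sensor would measure this
        public long getBytesIn() { return 1024; }    // e.g., network traffic on the node
    }

    public class ClusterAgentSketch {
        public static void main(String[] args) throws Exception {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("cluster:type=NodeLoad,node=worker01");
            server.registerMBean(new NodeLoad(), name);
            // A SOAP gateway at the grid level would now read the attributes:
            System.out.println("CpuLoad = " + server.getAttribute(name, "CpuLoad"));
        }
    }

One MBean per monitored node keeps the cluster-level agent a thin aggregator; the SOAP gateway then exposes only the aggregated view to the wide area network.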

Finally, we present the results of the SOAP Gateway implementation, which can be compared with other existing monitoring-system interfaces. We report on long-term use of the system, including automated installation on a number of nodes, information gathering through the RMI and SOAP protocols, and software maintenance covering development, debugging and upgrades to newer versions of the modules. We also present measurements of the monitoring system's performance and of its impact on the monitored stations.

The monitoring system presented in this study is adequate where a security infrastructure for SOAP traffic encryption is in place. Although this study does not present a security subsystem for this purpose, existing technology coherent with Web Services, in the form of SOAP request interceptors, can be used; this is covered in the last section.

Presentation
(PowerPoint: 900 KB)


Witold Alda, Tomasz Wojtowicz, Piotr Bys, Michal Gabor, Dariusz Gocol, Jacek Kitowski

Java Applications for Web-based Visualisation
of Biological Data


Abstract:

We present three Java applications for the visual presentation of biological data, developed within the PROGRESS project. All programs can be loaded as applets from the given address, but they can also be run as applications under the control of the migrating desktop, one of the interface tools in the PROGRESS portal. Similarly, the data can be taken directly from the given address or from the database available through the portal. The first program is designed for simple 3D visualisation of biomolecules: proteins and DNA structures. The 3D graphics is based on the Java3D library. Special attention is paid to details which help the visualisation and analysis of secondary structures. The second program is used for two-dimensional visualization of the results of genome assembly calculations. The third program shows the structure of phylogenetic trees generated on the basis of evolution data.

Presentation
(JPEG: 867 KB)
 


Jakub Moscicki

Master-Worker Workflow Management for Distributed Biomedical Applications in the Grid

Abstract:

GRID middleware provides basic services and foundation frameworks for building global environments for scientific computing, addressing low-level issues such as security, virtual organizations, data replication, and job submission and execution. The DIANE framework (http://cern.ch/diane) is a generic workflow manager for distributed master-worker applications, which builds on top of existing GRID middleware, providing high-level facilities and idioms for application development and deployment. DIANE may easily be deployed into a concrete GRID environment such as the Globus Toolkit, or used in standalone clusters with popular workload management systems such as LSF or PBS.

DIANE is a callback framework which controls the job execution, creates tasks and workers, passes data messages between the master and the workers, and finally integrates the task output. Applications do not have to open communication channels explicitly -- this setup is done automatically. Fixing the master-worker parallel computation model limits the generality of the applications but enormously increases the flexibility of the framework itself, in a way completely transparent to applications. Runtime flexibility allows switching between in-process application loading based on shared libraries and IPC-based application execution based on statically linked executables. Both setups have practical implications, and DIANE makes it easy to choose the appropriate one.
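The callback idiom can be sketched as follows (a single-process Java illustration of the general pattern, not DIANE's actual API, which is exposed through Python and C++ Application Adapters); the framework owns task distribution and messaging, while the application only supplies task creation, processing and integration:

    import java.util.ArrayDeque;
    import java.util.Queue;

    // Application-side callbacks; interface names here are invented.
    interface MasterCallbacks<T, R> {
        Queue<T> createTasks();      // split the job into task messages
        void integrate(R result);    // merge one worker result into the output
    }

    interface WorkerCallback<T, R> {
        R execute(T task);           // process a single task
    }

    public class MiniMasterWorker {
        // The framework drives execution; applications never open channels.
        static <T, R> void run(MasterCallbacks<T, R> master, WorkerCallback<T, R> worker) {
            Queue<T> tasks = master.createTasks();
            while (!tasks.isEmpty()) {
                // In-process stand-in for the master/worker messaging layer.
                master.integrate(worker.execute(tasks.poll()));
            }
        }

        public static void main(String[] args) {
            Queue<Integer> q = new ArrayDeque<>();
            for (int i = 1; i <= 5; i++) q.add(i);
            run(new MasterCallbacks<Integer, Integer>() {
                    public Queue<Integer> createTasks() { return q; }
                    public void integrate(Integer r) { System.out.println("result: " + r); }
                },
                t -> t * t);  // worker: square each task payload
        }
    }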

The architecture of DIANE is based on a component-container object model, which allows various Application Adapters to be created to enhance the framework easily. The Application Adapters provided by default support Python and C++ bindings, as well as mixing the two in a single job.

The paper presents the basic architecture and design principles of DIANE, together with benchmark results of distributed simulations in biomedical applications.

back



Claus-Juergen Lenz, Detlev Majewski

Meteo-GRID: Performing Local Weather Forecast Using GRID Computing

Abstract:

Today there is an increasing demand for reliable high-resolution short-range (up to 48 hours) weather forecasts for government, industry, traffic and media. These local forecasts are most valuable in cases of high-impact weather, that is, for weather systems such as tropical storms (hurricanes, typhoons), violent extra-tropical storms and severe thunderstorms, which may result in loss of life and property due to widespread flooding and gale-force winds. Many national weather services, such as the Deutscher Wetterdienst (DWD), run regional (local) numerical weather prediction models with a mesh size of 10 km or less up to four times a day to provide the necessary forecast products for the general public.

Within the framework of the European Union shared research and technology development project EUROGRID (Application Testbed for European GRID Computing, IST-1999-20247; funding period: November 2000 until January 2004), the aim of the Deutscher Wetterdienst is to provide local weather prediction for arbitrary regions of the world via the Internet and EUROGRID (subproject Meteo-GRID), using the relocatable non-hydrostatic numerical weather prediction model LM (Local Model). This ASP (Application Service Provider) solution will allow virtually anyone to run a high-resolution numerical weather prediction model on demand for his/her domain of interest and hence to calculate his/her own weather prediction. For this purpose the user will be able to specify the model domain, grid resolution, initial date and time, forecast range and forecast products via a Java-based Graphical User Interface (GUI).

Taking the user's specifications into account, the following steps are executed, all within the EUROGRID software environment:

  • Derivation of topographical data for the model domain selected by the user from high resolution (1 km x 1 km) data sets stored in a global geographical information system (GIS) at DWD.
  • Preparation of initial and lateral boundary data sets for the Local Model (LM). These data are derived from analyses and forecasts of the Global Model GME, which are stored in an ORACLE data base at DWD.
  • Transfer of the topographical data and the GME data to a high performance computer within the HPC GRID of EUROGRID.
  • Execution of the LM forecast run on the supercomputer mentioned in the foregoing step. The job consists of two separate tasks which run in parallel, namely an interpolation program GME2LM and the numerical weather prediction model LM itself.
  • Dissemination of the LM forecast data to the user's computer or visualization of the LM results on a computer within the HPC GRID and transfer of the graphs to the user using the Internet and EUROGRID software.
The presentation will describe the steps of this EUROGRID application of the LM in more detail, and will also discuss some specifications and requirements of a numerical weather forecast model on supercomputers.
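As a rough illustration of the parallel structure of step 4, the interpolation program and the forecast model can be viewed as two concurrent tasks coupled by a data hand-over. The Java sketch below is our simplification, not DWD code; the 6-hourly hand-over granularity and all names are assumptions:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class Gme2LmPipeline {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> boundaryData = new ArrayBlockingQueue<>(4);

            // GME2LM: interpolates GME output to the LM grid as it becomes available.
            Thread gme2lm = new Thread(() -> {
                try {
                    for (int h = 0; h <= 48; h += 6)
                        boundaryData.put("boundary data for forecast hour " + h);
                } catch (InterruptedException ignored) {}
            });

            // LM: consumes boundary data while integrating the forecast forward.
            Thread lm = new Thread(() -> {
                try {
                    for (int h = 0; h <= 48; h += 6)
                        System.out.println("LM integrating with " + boundaryData.take());
                } catch (InterruptedException ignored) {}
            });

            gme2lm.start(); lm.start();
            gme2lm.join(); lm.join();
        }
    }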

Presentation
(PowerPoint: 6.78 MB)


Bob Dobinson, Piotr Golonka, Andreas Hirstius, Mihai Ivanovici, Catalin Meirosu, Stefan Stancu

Moving the Decimal Point: 10 Gigabit Ethernet between Geneva and Amsterdam

Abstract:

Presentation
(PowerPoint: 1,7 MB)


Marian Bubak, Wlodzimierz Funika, Marcin Smetek and Roland Wismueller

OMIS-compliant Monitoring System for Java-Distributed Applications

Abstract:

A prototype monitoring system, the J-OCM, compliant with the On-line Monitoring Interface Specification (OMIS) [2,3], provides the ability to observe and manipulate the execution of a whole distributed Java application. The Java-oriented On-line Monitoring Interface Specification (J-OMIS) [1], which extends the original OMIS and underlies the J-OCM, is built around a set of object types with support for object-specific services. A tool, e.g. a performance analyzer, is given access to objects such as node objects, JVM objects, threads and class objects, each provided with appropriate services.

The architecture of the J-OCM [6] comprises a central component, responsible for distributing tool requests and assembling replies, and a distributed part: Local Monitors on the nodes, and JVM Local Monitors, agents embedded into the JVM processes. The target Java system is considered in terms of a client-server distributed architecture, focusing on its components: interface definition, proxy, object manager, naming service, and communication protocol.

In our event-based monitoring system, basic events are captured by sensors inserted into the target system and are sent to the monitoring system. The monitoring system takes some action(s), i.e. a sequence of instructions associated with the event. These actions can either carry out data collection or manipulate the running program. The original event model provided in the OCM has been extended by a Java-specific event submodel which covers the functioning of the basic application and execution entities of a distributed Java application.
A large part of the designed services underlying the functionality of the J-OCM has been implemented [4,5,6]. The results of this research may be applied to the monitoring of grid services.
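The event/action mechanism can be pictured with a small Java sketch (an illustration of the pattern only; the event names and the registration interface are invented, not taken from J-OMIS): a tool registers actions for an event type, and a sensor in the target JVM raises events that trigger them.

    import java.util.*;
    import java.util.function.Consumer;

    public class EventActionSketch {
        private final Map<String, List<Consumer<String>>> actions = new HashMap<>();

        // A tool registers actions (data collection or manipulation) for an event type.
        void onEvent(String type, Consumer<String> action) {
            actions.computeIfAbsent(type, t -> new ArrayList<>()).add(action);
        }

        // Called by a sensor inserted into the target system.
        void raise(String type, String detail) {
            actions.getOrDefault(type, List.of()).forEach(a -> a.accept(detail));
        }

        public static void main(String[] args) {
            EventActionSketch monitor = new EventActionSketch();
            monitor.onEvent("thread_started", d -> System.out.println("collect: " + d));
            monitor.raise("thread_started", "JVM worker-1, thread 17");
        }
    }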

References:

  1. M. Bubak, W. Funika, P. Metel, R. Orlowski, and R. Wismueller: Towards a Monitoring Interface Specification for Distributed Java Applications. In Proc. 4th Int. Conf. PPAM 2001, Naleczow, Poland, September 2001, LNCS 2328, pp. 315-322, Springer, 2002
  2. T. Ludwig, R. Wismueller, V. Sunderam, and A. Bode: OMIS - On-line Monitoring Interface Specification (Version 2.0). Shaker Verlag, Aachen, vol. 9, LRR-TUM Research Report Series, (1997)
    http://wwwbode.in.tum.de/~omis/OMIS/Version-2.0/version-2.0.ps.gz
  3. R. Wismueller, J. Trinitis and T. Ludwig: A Universal Infrastructure for the Run-time Monitoring of Parallel and Distributed Applications. In: Euro-Par'98, Parallel Processing, volume 1470 of Lecture Notes in Computer Science, pages 173-180, Southampton, UK, September 1998. Springer-Verlag
  4. M. Bubak, W. Funika, M. Smetek, Z. Kilianski, and R. Wismueller: Request processing in the Java-oriented OMIS Compliant Monitoring System. Presented at 5th Int. Conf. PPAM 2003, Czestochowa, Poland, September 2003 (to be printed)
  5. M. Bubak, W. Funika, M. Smetek, Z. Kilianski, and R. Wismueller: Event Handling in the J-OCM Monitoring System. Presented at 5th Int. Conf. PPAM 2003, Czestochowa, Poland, September 2003 (to be printed)
  6. M. Bubak, W. Funika, M. Smetek, Z. Kilianski, and R. Wismueller: Architecture of Monitoring System for Distributed Java Applications. Presented at Euro PVM/MPI Int. Workshop 2003, Venice, Italy, September 2003

Presentation
(GIF: 2.8 MB)
 


Bartosz Balis, Marian Bubak, Wlodzimierz Funika, Marcin Radecki, Tomasz Szepieniec, Roland Wismueller

OMIS/OCM-G and Other Application Monitoring Approaches for the Grid

Abstract:

While current Grid technology is oriented more towards batch processing, the CrossGrid project focuses on interactive applications, where there is a person `in the computing loop'. Monitoring of interactive applications is only possible in the on-line mode, in which the information is delivered to the visualization tools immediately, with low latency. Only then can the user's interactions be related to the performance results. On-line monitoring is also essential to enable manipulations on the target application.

In this paper, we provide an overview of three grid application monitoring approaches currently being developed and compare them with our approach based on OMIS/OCM-G [1]. The projects/systems discussed are GrADS (Autopilot) [2], GridLab (based on GRM) [3], and DataGrid (GRM) [4].

The GrADS project introduces a framework for grid application development. Part of it is the Autopilot toolkit, which can gather real-time application and infrastructure data and analyse it, and which also allows the application's behavior to be modified. Autopilot is, however, oriented more towards automatic steering than towards providing feedback to the programmer. It gives a rather general view of the application and environment, e.g., for exploring patterns in behaviour rather than locating a particular performance loss.

The application monitoring system developed within the GridLab project implements on-line steering guided by performance prediction routines which derive their results from low-level, infrastructure-related sensors (CPU, network load). However, this approach is not suitable for interactive applications. First, it does not allow for manipulations on the target application. Second, the approach seems to rely only on full traces. Finally, the semantics of all metrics is fixed, which does not allow for user-defined metrics.

In the DataGrid project, the GRM monitoring system is introduced. The GRM is a semi-on-line monitor which collects information about the application and delivers it to the PROVE visualisation tool. While the GRM/PROVE environment is well suited to the DataGrid project, where only batch processing is supported, it is less usable for the monitoring of interactive applications. First, the R-GMA communication infrastructure used by GRM is based on Java servlets, which introduce a rather high communication latency. Second, achieving low latency and low intrusion at the same time is basically impossible when monitoring is based on trace data: if the traces are buffered, the latency increases; if not, the overhead for transmitting the events is too high.

The monitoring infrastructure created in the CrossGrid project is the OCM-G, a distributed, decentralized, autonomous system, running as a permanent Grid service and providing monitoring services accessible via the standardized interface OMIS. The system works in on-line mode and allows for manipulations. Its general philosophy is to provide a flexible set of monitoring services which allow metrics to be defined with the semantics the user needs, instead of providing a predefined set of high-level metrics.

References:

  1. B. Balis, M. Bubak, W. Funika, T. Szepieniec, R. Wismueller, Monitoring and Performance Analysis of Grid Applications. In Proc. ICCS 2003, St. Petersburg, Russia, June 2003. Springer 2003
  2. J.S. Vetter and D.A. Reed. Real-time Monitoring, Adaptive Control and Interactive Steering of Computational Grids. In: The International Journal of High Performance Computing Applications, vol. 14, 2000
  3. The GridLab project web pages: http://www.gridlab.org
  4. Z. Balaton, P. Kacsuk, N. Podhorszki, and F. Vajda. From Cluster Monitoring to Grid Monitoring Based on GRM. In Proc. Euro-Par 2001 Parallel Processing, August 2001, Manchester, UK, Springer 2001

Presentation
(PDF: 323 KB)


Mathilde Romberg

OpenMolGRID: Complex Problem Solving in Molecular Design

Abstract:

Presentation
(PDF: 391 KB)


Pawel Plaszczak

Porting Applications to Globus Toolkit 3.0 and Designing an OGSI-based Architecture

Abstract:

This tutorial will explore the issues involved in engineering applications on top of OGSA and the Globus Toolkit 3.0.

The latest major edition of the Globus Toolkit allows developers to build applications in a grid service oriented architecture. This new technology based on web services opens a wide range of new possibilities, but at the same time requires a certain discipline from the software designer and poses new challenges to the project manager.

We will explore the pros and cons of moving applications to an OGSI-compliant architecture. The question of whether it makes sense to grid-enable a software component can be answered by identifying and comparing the potential benefits and dangers of such a move.

We will then move on to discuss the features of the Open Grid Services Architecture and demonstrate how to apply them to chosen applications.
Finally, the technology issues will be discussed, and participants will get a feel for the problems to be expected at the implementation level.
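For orientation, the factory/instance idiom at the heart of OGSI can be sketched in plain Java (a conceptual sketch with invented names, not the GT3.0 classes): a client asks a factory for a fresh, stateful service instance and is responsible for destroying it when done.

    import java.util.HashMap;
    import java.util.Map;

    public class FactoryIdiom {
        interface GridServiceInstance {
            String invoke(String request);
            void destroy();                      // explicit lifetime management
        }

        static class Factory {
            private final Map<String, GridServiceInstance> instances = new HashMap<>();
            private int next = 0;

            String create() {                    // returns a handle to a fresh instance
                String handle = "instance-" + (next++);
                instances.put(handle, new GridServiceInstance() {
                    private final StringBuilder state = new StringBuilder();
                    public String invoke(String request) {
                        state.append(request).append(';');   // per-client state
                        return "state so far: " + state;
                    }
                    public void destroy() { instances.remove(handle); }
                });
                return handle;
            }

            GridServiceInstance lookup(String handle) { return instances.get(handle); }
        }

        public static void main(String[] args) {
            Factory f = new Factory();
            String h = f.create();
            System.out.println(f.lookup(h).invoke("job A"));
            f.lookup(h).destroy();
        }
    }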

The workshop will be strongly based on real examples. One of the examples we wish to explore is the process of designing and building the NeesGRID NTCP component, the first application of GT3.0 deployed in production.

We encourage the participants of the tutorial to submit their own questions, problems and suggestions related to their own experience or plans for building a GT3.0-based application-layer infrastructure. We will spend time analyzing the submitted questions and exploring them during the tutorial. Information is available at http://perfringo.com/events/

Presentation
(PowerPoint: 524 KB)


Marian Bubak, Andrzej Jozwik, Maciej Malawski, Katarzyna Rycerz, Dominik Ziembinski

Porting Irregular and Out-of-core Computations on Grids

Abstract:

Parallelization of irregular problems is a non-trivial task due to the data access pattern, in which data arrays are accessed through one or more levels of indirection arrays. Our work on the parallelization of such problems resulted in the implementation of the LIP [1] library, which provides runtime support for irregular MPI programs on clusters.

Our current work concentrates on a feasibility study of migrating the LIP library to the Grid environment. The Grid offers more computing power to solve large-scale problems, but its distributed nature makes programming more difficult and less effective. Our new G-LIP library aims to support irregular computations on the Grid.

Computing an irregular problem with the LIP library has two phases: the inspector and the executor. To support applications with significant computational effort, these two phases have been split following a master-worker scheme: inspectors play the role of masters, while workers are the nodes that process the irregular loop. When the application starts, all data elements are distributed among the inspectors. The inspectors then examine the data references, build communication schedules, gather non-local data elements and translate global indexes into local ones. After that, each inspector broadcasts pre-packed data arrays to its workers. While the workers do the computation, the inspectors build the communication schedules for the next time step. When the workers finish their job and return the computed results, the inspectors scatter the received data according to the first communication schedule and gather data for the next time step using the previously prepared schedules. This approach was subjected to preliminary tests using MPICH-G and Globus Toolkit 2.x.
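The inspector/executor split can be illustrated on a single irregular loop (a toy single-JVM Java sketch with assumed data; in LIP the gather phase is realized with MPI communication rather than a local map):

    import java.util.*;

    public class InspectorExecutor {
        public static void main(String[] args) {
            // Irregular loop a[k] += b[idx[k]], with idx known only at run time.
            int[] idx = {0, 7, 2, 7, 5};          // indirection array
            int localLo = 0, localHi = 4;         // this node owns b[0..3]
            double[] bLocal = {1, 2, 3, 4};
            double[] a = new double[idx.length];

            // Inspector: scan the data references once and build the
            // communication schedule (the distinct non-local indices).
            SortedSet<Integer> schedule = new TreeSet<>();
            for (int i : idx) if (i < localLo || i >= localHi) schedule.add(i);

            // Gather phase: fetch remote elements once (stubbed here as index*10).
            Map<Integer, Double> remote = new HashMap<>();
            for (int i : schedule) remote.put(i, i * 10.0);

            // Executor: the irregular loop itself, now free of communication.
            for (int k = 0; k < idx.length; k++) {
                int i = idx[k];
                double b = (i >= localLo && i < localHi) ? bLocal[i] : remote.get(i);
                a[k] += b;
            }
            System.out.println(Arrays.toString(a));  // [1.0, 70.0, 3.0, 70.0, 50.0]
        }
    }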

The second improvement to the LIP library is support for adaptive problems, in which the access patterns of data arrays accessed via indirection arrays change during the computation. The schedule must therefore be regenerated at each time step, and the communication volume rises because the data partitioning does not account for the migration of data elements. Since most of the index analysis can be reused, the communication schedule need only be updated, which minimizes the cost of the inspector phase. Moreover, to keep the communication volume at the same level, shifting data elements across processors should be enabled.

References:

  1. Brezany, P., Bubak, M., Luszczek, P., Malawski, M., Zajac, K., A Runtime Support for Large-Scale Irregular Computing on Clusters and Grids, Annual Review of Scalable Computing, in: Kwong, Y. C. (Eds.), Series on Scalable Computing, vol. 5, Singapore University Press and World Scientific, 2003, pp. 30-64

Presentation
(GIF: 981 KB)
 


Bartosz Balis, Marian Bubak, Michal Wegiel

Proposal of Adaptation of Legacy C/C++ Software
to Grid Services


Abstract:

The adaptation of legacy C/C++ software to web/grid services is important from both the scientific [1] and the commercial [2] point of view. Yet, existing approaches concentrate mainly on the web services technology. In consequence, they lack a number of crucial concepts and may prove unacceptable in the grid services context. Along with the emergence of the Open Grid Services Architecture [3], a novel, significantly different approach was introduced. It extends the model of web services by imposing additional requirements concerning security and the lifetime management of stateful service instances.

We focus on the adaptation of existing legacy libraries and applications to work as grid services. Clients interact with these services in a standard fashion using WSDL, SOAP and message-level security. They are expected to follow the paradigm of service factories and to create service instances for their own exclusive use. The central concept behind the proposed architecture is as follows. Each created service instance is automatically associated with an internal proxy client. This client is hidden from the external world and is responsible for translating the actual client method invocations into the underlying native calls. It receives client requests and supplies the corresponding results via interaction with a specialized proxy service; one instance of a proxy service is introduced for each service instance created by a client. The proxy client can run on an arbitrary host with the required legacy libraries installed and can even migrate during its execution. This conforms to the ideas of brokering and job submission mechanisms: proxy clients are in fact jobs scheduled for execution on behalf of the corresponding client, which makes a high degree of scalability achievable. Obviously, proxy clients need to be implemented in native languages (C/C++), since they cooperate directly with the legacy code. However, they do not differ from other clients in the way they communicate (language-neutral SOAP messages are used).

There are a number of advantages of the above approach, to name only the most important ones:

  • no changes are required in the existing legacy code,
  • the solution ensures high scalability and security (no open ports needed; authentication, authorization and credentials delegation are incorporated),
  • high flexibility is achieved by introducing a universal framework,
  • substantial development effort is avoided due to the use of tools facilitating migration to grid services (e.g. gSOAP, Java2WSDL etc.),
  • the solution is portable between various grid hosting environments,
  • compatibility with grid service requirements (lifetime management, job submission, notifications, etc.).
As a proof of concept, the presented solution was partially implemented in a project aimed at adapting the grid application monitoring system OCM-G to grid services. Globus Toolkit 3.0, together with the gSOAP package and the GSI plugin, was employed for this purpose.
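The core of the proxy scheme can be sketched as follows (a deliberately simplified Java illustration: the SOAP transport is replaced by direct method calls, the native library by a stub, and all names are invented):

    import java.util.ArrayDeque;
    import java.util.Queue;

    public class LegacyProxySketch {
        // Stands in for the per-instance proxy service that external clients reach via SOAP.
        static class ProxyService {
            final Queue<String> pending = new ArrayDeque<>();
            String lastResult;
            void submit(String request) { pending.add(request); }
        }

        // Stands in for the legacy C/C++ library; a real proxy client would use JNI here.
        static String nativeCall(String request) { return "legacy(" + request + ")"; }

        public static void main(String[] args) {
            ProxyService service = new ProxyService();
            service.submit("computeSpectrum(sample42)");   // the external client's call

            // The proxy client: runs near the legacy installation, possibly as a
            // job scheduled on the client's behalf, and drains pending requests.
            while (!service.pending.isEmpty()) {
                service.lastResult = nativeCall(service.pending.poll());
            }
            System.out.println(service.lastResult);        // legacy(computeSpectrum(sample42))
        }
    }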

References:
  1. Yan Huang, Ian Taylor, David W. Walker, Robert Davies: Wrapping Legacy Codes for Grid-Based Applications. To be published in the proceedings of the HIPS 2003 workshop.
  2. Dietmar Kuebler, Wolfgang Eibach (IBM Germany) Adapting Legacy Applications as Web Services. http://www-106.ibm.com/developerworks/webservices/library/ws-legacy/
  3. The OGSA project homepage: http://www.globus.org/ogsa

Presentation
(PowerPoint: 225 KB)


Witold Alda, Remigiusz Górecki, Marek Budyn, Michal Ciesielski,
Jacek Kitowski


Prototype System for Distributed Scientific Visualisation

Abstract:

We present the architecture of a distributed visualisation system which reads data from one or more remote servers and distributes the further processing, such as data extraction, mapping and visualisation, across both remote and local computers. The decision on how the processing is distributed rests with the user. The architecture is designed to fulfill the needs of general-purpose visualisation; thus the system is flexible and can easily be enriched by adding new modules on both the server and client sides. The user can build the visualisation pipeline by picking available modules and constructing his/her own projects and scenarios. The entire system is written using Java/XML technologies. The existing visualisation modules use Java3D, but tests with OpenGL and GL4Java show that these can also be applied in the system without any problems. Currently, prototype modules for the visualisation of chemical molecules are available.

Presentation
(JPEG: 715 KB)
 


Fabrizio Gagliardi
CERN European Organization for Nuclear Research


Review of DataGrid Progress and Plans for the EGEE Project

Abstract:


Presentation
(PowerPoint: 2,54 MB)


Wolfgang Mertz

SAN over WAN - a New Way of Solving the GRID Data Access Bottleneck

Abstract:

After user access, job scheduling and system monitoring have been dealt with, remote access to large amounts of data still remains a challenge in many GRID environments. Due to bandwidth limitations and the mismatch between the peak and sustained throughput of wide area networks (WANs), big files have to be copied well in advance of running a job, or sometimes even sent on tape to the targeted GRID node. Obviously, the results have to undergo the same procedure.

This fact sometimes makes it inefficient to run a job on a remote, better suited or less loaded GRID node.

SGI's engineers have spent the last couple of years extending industry-standard storage architectures, which are traditionally tailored to the needs of commercial applications and environments, to the slightly different requirements of the technical and scientific world.

The latest storage technology is the SAN (Storage Area Network). The idea was to solve the bandwidth bottlenecks traditional file servers have due to the fact that they use shared or dedicated TCP/IP networks for data transport. SANs use their own storage networks based on the Fibre Channel protocol, which is better suited to high-performance data access to storage devices such as RAID arrays and tape libraries.

However, those SANs are limited to local sites, usually buildings, due to the distance limitation of Fibre Channel. Also, only one system can access the data on a given partition of the storage array, and therefore data sharing between systems again has to be done via TCP/IP networks.

The paper presents a new concept of sharing and accessing data from different nodes in a GRID environment which has been co-developed by SGI and LightSand, a company producing gateways for long distance network connections. It discusses the implementation details and also the resulting benefits for a GRID environment. One of those is that the application has direct access to the data on the "home" storage array from every GRID node without the need of copying the data to the GRID node where the job is executed.

Case studies and other areas of deployment are shown as well.

Presentation
(PowerPoint: 1.88 MB)


Konrad Karczewski, Lukasz Kuczynski and Roman Wyrzykowski

Secure Data Transfer and Replication Mechanisms
in Grid Environments


Abstract:

Data transfer issues are amongst the most important in modern Grid environments. The applications currently running on Grids are becoming more real-life oriented; they are based on, and generate, data sets of growing importance and confidentiality. All this implies that data management systems must provide great reliability and should incorporate mechanisms for data safety and confidentiality.

In a data replication service, the confidentiality of data becomes a major problem because of the need for geographically and logically distributed data storage. One option is to choose only "trusted" data storage places, which may prove difficult in many cases; more importantly, it negates the idea of transparent data replication with intelligent load balancing. It is therefore necessary to create a data replication service that provides not only data security by means of replication and/or encrypted transmission, but also methods of keeping the replicated data confidential.

Confidentiality should be understood on multiple levels. The first is keeping the data on the storage device illegible to unauthorised users. The second is keeping unauthorised users unaware even of the existence of a given data set. This implies the necessity of keeping the replica content unknown even to the owner of the storage device on which it is kept. This requirement calls for the development of a distributed replica catalog that keeps track of all data entering the data management system and provides unique identifiers replacing the original URIs. Moreover, the catalog service has to include access control lists and authentication mechanisms to ensure the required level of data privacy.

An additional level of confidentiality can be achieved by data partitioning. In addition to URI encoding, every data set can be divided into multiple parts stored on different storage elements. The original data can only be reconstructed when the information about the physical location and ordering of the partitions is retrieved from the replica catalog by an authorised user. This solution not only allows for greater confidentiality and security of data, but can also improve the performance of the data management system by making parallel transfers from multiple sources possible.
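A toy Java sketch of this partitioning scheme (our illustration with assumed names; real parts would additionally be encrypted and placed on distinct, distributed storage elements):

    import java.util.*;

    public class PartitionSketch {
        public static void main(String[] args) {
            byte[] dataSet = "confidential measurements".getBytes();
            int parts = 3;
            Map<String, byte[]> storageElements = new HashMap<>(); // id -> stored part
            List<String> catalogEntry = new ArrayList<>();          // ordered ids, catalog-only

            int chunk = (dataSet.length + parts - 1) / parts;
            for (int p = 0; p < parts; p++) {
                int from = p * chunk, to = Math.min(from + chunk, dataSet.length);
                String id = UUID.randomUUID().toString();           // opaque, reveals nothing
                storageElements.put(id, Arrays.copyOfRange(dataSet, from, to));
                catalogEntry.add(id);
            }

            // Reconstruction by an authorised user who may read the catalog entry;
            // the parts could also be fetched from the different sites in parallel.
            StringBuilder restored = new StringBuilder();
            for (String id : catalogEntry) restored.append(new String(storageElements.get(id)));
            System.out.println(restored);   // confidential measurements
        }
    }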

In such a confidentiality-oriented environment it is crucial to remove every possible single point of failure; in particular, it is essential to keep multiple, synchronised instances of the replica catalog, for which a distributed replica catalog seems the most promising solution. In addition, a distributed data broker is necessary; its main tasks would be user authentication and authorisation, passing user requests to the replica catalog, and making decisions about data partitioning, replication and reconstruction.

Presentation
(PDF: 2,39 MB)


Syed Naqvi, Michel Riguidel

Security Risk Analysis for Grid Computing

Abstract:

The security and privacy issues are coming to the fore with the growing size and profile of the Grid community. The forthcoming generations of the Computational Grid will make available a huge number of computing resources to a large and wide variety of users. The diversity of applications and mass of data being exchanged across the Grid resources will attract the attention of hackers to a much higher extent. A comprehensive security system, capable of responding to any attack on its resources, is indispensable to guarantee the anticipated adoption of Grid by both the Grid users and the resource providers. In this article, the authors argue that the first brick of an effective plan of countermeasures against these threats is an analysis of the potential risks associated with Grid computing.

This article presents a pragmatic analysis of the vulnerability of existing Grid systems and the potential threats posed to their resources once their spectrum of users is broadened. Various existing Grid projects and their security mechanisms are analyzed. The experience of using common Grid software and an examination of the Grid literature served as the basis for this analysis. Legal loopholes in the implementation of Grid applications across geopolitical frontiers, and the ethical issues that could obstruct the wide acceptance and trustworthiness of Grids, are also discussed. The weaknesses revealed are classified with respect to their sources, and possible remedies are discussed. The results show that the main reason for the vulnerability is the fact that Grid technology has so far been used by only a certain kind of public (mainly academics and government researchers). This public benefits greatly from being able to share resources on the Grid, and has no intention of harming the resource owners or fellow users; thus there was no need to address security in depth. This is all about to change. The number of people who know about the Grid is growing fast, as is the number of worthwhile targets for potential attackers. The security nightmare cannot be avoided unless the problem is addressed urgently.

This detailed taxonomy of potential threats and the sources of vulnerability in the existing Grid architectures is the first milestone on the road to a robust Grid security system. It provides a comprehensive overview which shall enable us to effectively plan the countermeasures against the existing risk. Our future direction includes the definition of a Protection Profile (Common Criteria) followed by the formulation of a comprehensive security policy and finally its implementation.

NB: This research work is a part of ongoing research activities of the Information Technology Security Group at the Computer Sciences and Networks Department under the patronage of European Union Information Society Technologies funded projects.

back



Marian Bubak, Maciej Malawski, Grzegorz Mlynarczyk, Piotr Nowakowski, Robert Pajak, Michal Turala, Katarzyna Rycerz

Software Development in the EU CrossGrid Project

Abstract:

CrossGrid is one of the largest European projects in Grid research, uniting 21 separate institutions and funded by the 5th European Framework Programme. As such, it demands the creation and implementation of custom-tailored software development, testbed integration and quality assurance procedures. The Project requires that the following be achieved:

  • all software developed within CrossGrid should follow a uniform set of rules and coding conventions,
  • similar testing and validation procedures should be employed by all partners responsible for developing software,
  • CrossGrid testbeds need to be prepared on time for software releases, with a uniform set of middleware, pre-tested and approved by the Project Architecture Team,
  • consistent quality assurance must be implemented throughout all stages of the Project, pursuant to terms and conditions set out by the European Union for large, multinational collaborations.
All the above-mentioned goals can only be realized through a systematic approach to software design, verification and quality control. This article will outline some aspects of this approach, as developed by CrossGrid Project Management and implemented throughout the Project Consortium. In particular, the CrossGrid Quality Assurance Plan will be discussed, as it defines the relevant procedures and organizational bodies dealing with all aspects of CrossGrid software development, testing and integration. Special attention will be paid to unit testing and to the notion of Quality Indicators, which are very well suited to ensuring quality within a project of this scope and magnitude.

References:
  1. Eric J. Braude, Software Engineering: An Object-Oriented Perspective (John Wiley & Sons, Inc., 2001)
  2. The CrossGrid Quality Assurance Plan; available at: http://www.eu-crossgrid.org/Deliverables/1st Year-revised_deliverables/CG5.2-D5.2.1-v3.0-CYF055-QualityAssurancePlan.pdf
  3. CrossGrid Deliverable D5.2.3 (Standard Operating Procedures); available at: http://www.eu-crossgrid.org/Deliverables

Presentation
(PDF: 292 KB)
 


Maciej Malawski, Marek Wieczorek, Marian Bubak, Elzbieta Richter-Was

Storage and Analysis System for Data Intensive High Energy Physics Applications

Abstract:

The presented work is devoted to the problem of the storage and analysis of data originating from the large experiments of particle physics. Detector simulation software produces a number of large datasets that are subject to further analysis, resulting in the generation of specific histograms [1]. Managing such a large number of files becomes a non-trivial task for the researcher. Our work aims to make this problem easier.

We introduce and describe the Lhcmaster, a system that stores the data and performs basic analysis of it (production of the histograms for each data file).

The Lhcmaster is a database system based on the relational data model, providing several user interfaces, including a command-line administrative interface and a web interface used by external users. The main use case of the Lhcmaster is retrieving files from the database via a graphical index of files (the sets of histograms generated for each file). The basic analysis is performed within the ROOT framework. There is also a mechanism for user authentication and authorization, based on a model of user groups. The system is implemented with basic languages and technologies such as the MySQL RDBMS and Perl CGI.

The Lhcmaster-G is a proposed system equivalent to the Lhcmaster, but realized with completely different, Grid-based technologies. We believe that this approach allows a more powerful and flexible tool to be created, adapted to operation within a distributed and heterogeneous environment. Furthermore, it would also be possible to add new functionality to the Lhcmaster-G compared with the Lhcmaster; in particular, the production of new data files and more sophisticated data analysis might be performed in the future within the Lhcmaster-G. The outline of the Lhcmaster-G assumes that the system will be based on widespread Grid tools, such as the Globus Toolkit and the European DataGrid [2]. The Lhcmaster-G has not been implemented yet, but the present work contains a detailed description of how the system might be implemented.

References:

  1. ATLFAST 2.0 - a fast simulation package for ATLAS, ATL-PHYS-98-131, 13 Nov 1998
  2. European DataGrid project http://www.eu-datagrid.org

Presentation
(PDF: 360 KB)
 


Dieter Kranzlmueller

The Grid Visualization Kernel - Scientific Visualization on the Grid

Abstract:

Presentation
(PowerPoint: 34,4 MB)


Marian Bubak, Wlodzimierz Funika, Roland Wismüller, Tomasz Arodz,
Marcin Kurdziel


The G-PM Performance Measurement Tool for Interactive Grid Applications

Abstract:

Performance analysis of interactive applications, which are highly distributed and subject to dynamic changes in the execution environment, calls for run-time measurement definition, selective instrumentation, and the use of counters/timers rather than the extensive tracing common to most performance tools. This implies the necessity to focus on the interaction of distributed application components and to provide data that is meaningful in the context of the application. To provide this kind of information, the G-PM tool uses three sources of data: performance measurement data related to the running application, measured performance data on the execution environment, and the results of micro-benchmarks, which provide reference values for the performance of the execution environment.

Via the GUI, one can select a preferred measurement type, an appropriate display type and related options within a dialog. High-level performance properties are defined by the user, either by loading them from a file or via the measurement definition dialog. A user-defined metric is transformed into an appropriate set of standard metrics. Measurements are realized either by sampling or by identifying the monitoring events relevant to the value to be measured. Application-specific metrics rely on inserted user-defined procedure calls, probes, which generate special events to be captured by the monitoring system. Active measurements are created whenever a concrete measurement is defined.
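The transformation of a user-defined metric into standard metrics can be pictured as follows (an illustrative Java sketch, not G-PM code; metric names and values are invented):

    import java.util.Map;
    import java.util.function.ToDoubleFunction;

    public class MetricSketch {
        public static void main(String[] args) {
            // Standard metrics as delivered by the monitoring system (values faked here).
            Map<String, Double> standard = Map.of(
                    "bytes_sent", 4.0e6,
                    "wall_time_s", 8.0);

            // User-defined metric: communication bandwidth = bytes_sent / wall_time.
            ToDoubleFunction<Map<String, Double>> bandwidth =
                    m -> m.get("bytes_sent") / m.get("wall_time_s");

            System.out.printf("bandwidth = %.0f B/s%n", bandwidth.applyAsDouble(standard));
        }
    }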

The tool interacts with the OCM-G to collect information about a selected application: it connects to the monitoring system via the monitoring interface and sends conditional or unconditional requests, which result in monitoring data being obtained or manipulations being executed on the application.

References:

  1. Bubak, M., Funika, W., and Wismueller, R.: The CrossGrid Performance Analysis Tool for Interactive Grid Applications. In: Kranzlmueller, D. and Kacsuk, P. and Dongarra, J. and Volkert, J. (Eds.), Recent Advances in Parallel Virtual Machine and Message Passing Interface, 9th European PVM/MPI Users' Group Meeting, September - October 2002, Linz, Austria, 2474, Lecture Notes in Computer Science, 50-60, Springer-Verlag, 2002
  2. Balis, B., Bubak, M., Funika, W., Szepieniec, T., and Wismueller, R.: An Infrastructure for Grid Application Monitoring. In: Kranzlmueller, D. and Kacsuk, P. and Dongarra, J. and Volkert, J. (Eds.), Recent Advances in Parallel Virtual Machine and Message Passing Interface, 9th European PVM/MPI Users' Group Meeting, September - October 2002, Linz, Austria, 2474, Lecture Notes in Computer Science, 41-49, Springer-Verlag, 2002

Presentation
(GIF: 1.96 MB)
 


Bartosz Balis, Marian Bubak, Wojciech Rzasa, Tomasz Szepieniec, Roland Wismueller

Two Aspects of Security Solution for Distributed Systems in the Grid on the Example of the OCM-G

Abstract:

This paper presents a security solution for the OCM-G, a Grid-enabled monitoring system. Grid applications are becoming complex; thus, tools facilitating the development process are required. The OCM-G is designed as an agent between such tools and the application processes running on numerous nodes belonging to distributed Grid sites.

The OCM-G is designed as a distributed and decentralized system, thus achieving the scalability required in the Grid environment. The monitoring system consists of two parts: a permanent one, handling multiple applications of numerous users, and a transient one, belonging to the owner of the monitored application. The OCM-G is designed to support on-line monitoring, which results in requirements on delivery times between the user and the application processes.

Security issues are essential for the OCM-G, given its support for multiple users and its wide-ranging abilities to control processes. The monitoring system must not lower the security of the site.

This paper presents an analysis of a solution proposed to ensure the security of the OCM-G. We distinguish two aspects of the solution: inter-component communication and forged-component attacks. The second aspect results from the fact that secure communication between system components does not ensure the security of the whole system. Test results are shown in order to estimate the overhead caused by the solution. We extend the concepts presented in [1] and describe the idea of different security levels for different network environments. We show that the forged-component security aspect is universal and can be used to secure other systems similar to the OCM-G.

References:

  1. Balis B., Bubak M., Rzasa W., Szepieniec T., Wismuller R.: Security in the OCM-G Grid Application Monitoring System, accepted for PPAM'2003 Czestochowa, Poland, 2003

Presentation
(PowerPoint: 559 KB)


Piotr Bala

UNICORE - Towards Production Quality Grid Environment

Abstract:

Presentation
(PowerPoint: 3,61 MB)


Krzysztof Benedyczak, Michal Wronski

UNICORE Plugins - How to Design Application Specific Interfaces

Abstract:

Presentation
(PowerPoint: 757 KB)


Tomas Hebelka

VAN - Visual Area Network
Interactive and Collaborative Visualization of Large Data Sets in Distributed Environments


Abstract:

Solving the biggest problems in science and industry requires the best minds. However, people are increasingly globally mobile or in locations remote from an organization's advanced computing resources. Visual Area Networking (VAN) solves these challenges by integrating high-productivity computing, visualization, scalable storage and networking technologies that make it easier to bring people together with the visual information they need to do their jobs effectively. VAN not only allows individuals and groups to solve complex problems while drawing on the company's best experts; it also delivers these capabilities directly to the end users or groups, wherever they may be, so that shared visualization becomes part of the user's standard environment. As a result, users are free to focus on creativity and insight in a collaborative setting rather than on the technical details of computing, visualization, and data management.

Visual area networking represents a shift from focusing only on advancing the power needed for the most precise rendering to also considering the location and availability of visualized data across the network. VAN is driven by two core technologies developed by SGI: the SGI Onyx visualization system and a new software component called OpenGL Vizserver. OpenGL Vizserver allows users of remote workstations, laptops, and even wireless tablet computers to use existing, unmodified applications to access and control the power of the SGI Onyx family of visualization systems and to collaborate with one another using existing visualization applications based on the OpenGL API.

OpenGL Vizserver Architecture
OpenGL Vizserver software has two primary components: a server and a client. The server runs on SGI graphics supercomputers, managing graphics resources (e.g., graphics pipelines) and monitoring the visualization application activity. Once a visualization application is started, the OpenGL Vizserver server assigns the application the requested graphics resources and begins serving the application's rendered frames to the OpenGL Vizserver client. This visual serving is the basis of the OpenGL Vizserver technology and SGI's VAN strategy. Only after a visual application has rendered a frame does OpenGL Vizserver intercede and capture that frame. The captured frame can be a small fraction of the original data set size and orders of magnitude less complex, because only the pixels associated with the screen representation of the data are captured.

Each captured frame is compressed using either lossy or lossless data compressors that take advantage of interframe coherency to minimize the amount of data sent to the OpenGL Vizserver clients. Once compressed, the image stream is sent to the client. An OpenGL Vizserver client is a lightweight application that reads the image stream from the OpenGL Vizserver server, uncompresses the stream, and displays the uncompressed image on the client computer. The OpenGL Vizserver client directs all user interaction back to the OpenGL Vizserver server, creating a seamless visualization environment on the client, as if the user were interacting locally with the SGI graphics supercomputer. The OpenGL Vizserver client runs on a variety of operating systems, including IRIX, Linux, Windows and Solaris, and the client system need not have extensive graphics or computational power.
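The economics of visual serving can be made concrete with a schematic sketch (ours, not SGI code): each outgoing frame costs at most the screen's pixel count rather than the data-set size, and interframe coherence shrinks it further, modelled here by shipping only the bytes that changed since the previous frame:

    import java.util.ArrayList;
    import java.util.List;

    public class VisualServingSketch {
        public static void main(String[] args) {
            byte[] prev = new byte[16];                 // previous frame's pixels
            byte[] frame = prev.clone();
            frame[3] = 7; frame[9] = 2;                 // a small on-screen change

            List<int[]> packet = new ArrayList<>();     // (offset, value) pairs to ship
            for (int i = 0; i < frame.length; i++)
                if (frame[i] != prev[i]) packet.add(new int[]{i, frame[i]});

            System.out.println("changed pixels sent: " + packet.size() + " of " + frame.length);
        }
    }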

Presentation
(PowerPoint: 5.07 MB)


Tomasz Kuczynski, Roman Wyrzykowski and Jaroslaw Zola

Web Access to Distributed Condor Pools

Abstract:

The original goal of the WebCI project is the creation of a tool that allows monitoring and management of a Condor pool via the WWW.

The main emphasis is on ease of job submission and control, as well as convenient UNIX shell access. Key elements of the project are portal security and platform independence; these requirements constrain us to use only standard system tools.

All of the above leads to the concept of using SSH sessions and the scp tool through pseudo terminals.

Initially, no support for interactive jobs was planned, given the non-persistent nature of the HTTP protocol. The use of a local server that keeps SSH connection state between subsequent HTTP transactions allows this functionality to be added and brings a performance gain.
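The flavour of the mechanism can be shown with the JSch SSH library (a minimal sketch; WebCI itself drives the standard ssh/scp system tools through pseudo terminals, and the host, user and command below are placeholders). The session object is what the local server would keep alive between HTTP transactions:

    import com.jcraft.jsch.ChannelExec;
    import com.jcraft.jsch.JSch;
    import com.jcraft.jsch.Session;
    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    public class CondorPoolProbe {
        public static void main(String[] args) throws Exception {
            Session session = new JSch().getSession("griduser", "access-node.example.org", 22);
            session.setPassword(System.getenv("SSH_PASS")); // demo only; key-based auth preferable
            session.setConfig("StrictHostKeyChecking", "no");
            session.connect();

            ChannelExec exec = (ChannelExec) session.openChannel("exec");
            exec.setCommand("condor_q");                    // query the pool's job queue
            BufferedReader out = new BufferedReader(new InputStreamReader(exec.getInputStream()));
            exec.connect();
            for (String line; (line = out.readLine()) != null; ) System.out.println(line);

            exec.disconnect();
            session.disconnect();                           // a caching server would keep this open
        }
    }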

The use of SSH and SCP makes it possible to separate the portal from the pool's access node, which in turn allows the functionality of interacting with and monitoring multiple Condor pools to be added.

Also important is the ability to seamlessly attach new Condor pools by simply adding the domain or IP address of the access node to the WebCI config file. Every pool may be accessed by an unrestricted number of portals, removing a single point of failure and increasing system stability.

The use of mainly server-side technologies allows a thin client to be used and, in the future, WAPCI to be created, which will provide full support for mobile devices.

It is possible to easily adapt the WebCI system architecture to Grid structures, thus creating a secure and efficient WWW interface. Among other tasks, this interface will enable the monitoring of resources and job queues, job submission and management, the exchange of files between a web browser and a user account, and the management of files and directories on user accounts. An important advantage of the WebCI Grid portal will be the convenient use of shell commands via a tool similar to Midnight Commander.

back



Günter Kickinger, Jürgen Hofer, A Min Tjoa, and Peter Brezany

Workflow Management in GridMiner

Abstract:

Knowledge discovery in data resources (files, file collections, relational databases, XML databases and semistructured data, etc.) managed within Computational Grids is a challenging research and development problem. The GridMiner project aims to cover all aspects of knowledge discovery and implement them as an advanced Grid application. We focus our effort on data mining and On-Line Analytical Processing (OLAP), two complementary technologies, which, if applied in conjunction, can provide a highly efficient and powerful data analysis and knowledge discovery solution on the Grid.

Knowledge discovery is a highly interactive process. To achieve appealing results, the user must at all times be able to influence this process by applying different algorithms or adjusting their parameters. It is therefore essential that GridMiner provide a powerful, flexible and simple-to-use interface to support the knowledge discovery process.

The research on GridMiner can be divided into two tasks. The first is to provide a set of Grid services which realize the individual steps of the knowledge discovery process. This set of services includes, for example, the integration of different data sets, the pre-processing of the data (such as cleaning and normalization), various data mining algorithms, and the presentation of the discovered knowledge to the user.

A special service class comprises services for data mining on top of OLAP, so-called On-Line Analytical Mining (OLAM), including services for cube creation as well as interactive OLAP and OLAM services. The second task of our research deals with the integration of all these different services into one system: the knowledge discovery process can be interpreted as connecting the different available services into a workflow which is executed by an appropriate engine.

Despite the intensive research on workflow languages for Grid and Web Services, no existing approach is suitable for GridMiner. This is because all existing workflow languages target the building of a new orchestrated service out of existing services; the orchestrated service is then published and can be used by a client like a conventional service. GridMiner needs a highly dynamic workflow concept, in which a client can compose the workflow according to its individual needs.

In our approach, we design a new language for dynamic service composition, called the Dynamic Service Composition Language (DSCL), which is based on XML, and develop a workflow engine called the Dynamic Service Composition Engine (DSCE). DSCL allows the description of a workflow consisting of various Grid services and the specification of parameter values for the individual underlying Grid services. DSCE is implemented as a Grid service and can be controlled interactively by a client, which has the ability to execute, stop, resume or even change the workflow and its parameters. Moreover, DSCE can be used for processing in batch mode. The development of Dynamic Service Composition is not exclusively linked to GridMiner: every Grid application which needs highly dynamic workflows can make use of this novel concept.
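The dynamic-composition property can be pictured with a tiny Java sketch (purely conceptual; DSCL documents are XML and the real DSCE is a Grid service): the client keeps hold of the engine and may still alter the workflow it is about to execute:

    import java.util.concurrent.ConcurrentLinkedQueue;

    public class DsceSketch {
        final ConcurrentLinkedQueue<Runnable> workflow = new ConcurrentLinkedQueue<>();

        void execute() {                 // batch mode: drain whatever the workflow holds
            for (Runnable step; (step = workflow.poll()) != null; ) step.run();
        }

        public static void main(String[] args) {
            DsceSketch engine = new DsceSketch();
            engine.workflow.add(() -> System.out.println("integrate data sets"));
            engine.workflow.add(() -> System.out.println("pre-process: clean, normalise"));
            // The client may still change the workflow before (or, in the real
            // engine, during) execution:
            engine.workflow.add(() -> System.out.println("run data mining algorithm"));
            engine.execute();
        }
    }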

Presentation
(PowerPoint: 223 KB)


Coorganizers: Institute of Nuclear Physics, Institute of Computer Science UST, ATM S.A.