Selected Publications

2016
Th. Lippert, D. Mallmann, M. Riedel
Scientific Big Data Analytics by HPC
Publication Series of the John von Neumann Institute for Computing (NIC), NIC Series Vol. 48, ISBN 978-3-95806-109-5, pp. 1-10, 2016
[ PID ] [ Juelich ]
Abstract:
Storing, managing, sharing, curating, and especially analyzing huge amounts of data have gained immense visibility and importance in industry and economy as well as in science and research. Industry and economy exploit ’Big Data’ for predictive analysis, increased efficiency of infrastructures, customer segmentation, and tailored services. In science, Big Data allows for addressing problems with complexities that were previously impossible to deal with. The amounts of data are growing exponentially in many areas and are becoming a drastic challenge for infrastructures, software systems, analysis methods, and support structures, as well as for funding agencies and legislation. In this contribution, we argue that the Helmholtz Association, with its objective to build and operate large-scale experiments, facilities, and research infrastructures, has a key role in tackling the pressing Scientific Big Data Analytics (SBDA) challenge. DataLabs and SimLabs, sustained on a long-term basis in Helmholtz, can bring research groups together on a synergistic level and can transcend the boundaries between different communities. This makes it possible to translate methods and tools between different domains as well as from fundamental research to applications and industry. We present an SBDA framework concept touching on its infrastructure building blocks, the targeted user groups, and the expected benefits, also concerning industry aspects. Finally, we give a preliminary account of the call for “Expressions of Interest” by the John von Neumann Institute for Computing concerning Scientific Big Data Analytics by HPC.

-
2015
G. Cavallaro, M. Riedel, M. Richerzhagen, J.A. Benediktsson, A. Plaza
On Understanding Big Data Impacts in Remotely Sensed Image Classification Using Support Vector Machine Methods
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Issue 99, pp. 1-13, 2015
[ DOI ] [ Juelich ]
Abstract:
Owing to recent advances in the sensor resolution onboard different Earth observation platforms, remote sensing is an important source of information for mapping and monitoring natural and man-made land covers. Of particular importance are the increasing amounts of available hyperspectral data originating from airborne and satellite sensors such as AVIRIS, HyMap, and Hyperion with very high spectral resolution (i.e., a high number of spectral channels) containing rich information for a wide range of applications. A relevant example is the separation of different types of land-cover classes using the data in order to understand, e.g., impacts of natural disasters or changes of city buildings over time. More recently, such increases in data volume, velocity, and variety contributed to the term big data, which stands for challenges shared with many other scientific disciplines. On one hand, the amount of available data is increasing in a way that raises the demand for automatic data analysis elements, since many of the available data collections are massively underutilized for lack of experts for manual investigation. On the other hand, proven statistical methods (e.g., dimensionality reduction) driven by manual approaches have a significant impact in reducing big data toward smaller smart data, contributing to the more recently used terms data value and veracity (i.e., less noise and lower dimensions that capture the most important information). This paper takes stock of which proven statistical data mining methods in remote sensing contribute to smart data analysis processes in the light of possible automation as well as scalable and parallel processing techniques. We focus on parallel support vector machines (SVMs) as one of the best out-of-the-box classification methods.
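To make the classification setting concrete, the following is a minimal sketch of SVM-based pixel classification with scikit-learn; the synthetic spectra, class count, and kernel parameters are illustrative assumptions, not the data or code of the paper.

```python
# Minimal sketch (not the paper's code): SVM classification of
# hyperspectral-like pixel vectors, using synthetic data in place of
# real AVIRIS/HyMap/Hyperion scenes.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_pixels, n_bands, n_classes = 3000, 200, 5   # ~200 spectral channels

# Synthetic spectra: each land-cover class gets a distinct mean spectrum.
means = rng.normal(0, 1, size=(n_classes, n_bands))
labels = rng.integers(0, n_classes, size=n_pixels)
spectra = means[labels] + rng.normal(0, 0.8, size=(n_pixels, n_bands))

X_train, X_test, y_train, y_test = train_test_split(
    spectra, labels, test_size=0.3, random_state=0)

# Feature scaling matters for RBF kernels on spectral data.
scaler = StandardScaler().fit(X_train)
clf = SVC(kernel="rbf", C=10.0, gamma="scale")
clf.fit(scaler.transform(X_train), y_train)
print("test accuracy:", clf.score(scaler.transform(X_test), y_test))
```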
-
M. Goetz, C. Bodenstein, M. Riedel
HPDBSCAN – Highly Parallel DBSCAN
Proceedings of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC2015), Machine Learning in HPC Environments (MLHPC) Workshop, Austin, TX, USA, to be published
[ Event ]
Abstract:
Clustering algorithms in the field of data mining are used to aggregate similar objects into common groups. One of the best-known of these algorithms is DBSCAN. Its distinct design enables the search for an a priori unknown number of arbitrarily shaped clusters while at the same time allowing noise to be filtered out. Due to its sequential formulation, the parallelization of DBSCAN presents a challenge. In this paper we present a new parallel approach which we call HPDBSCAN. It employs three major techniques in order to break the sequentiality, empower workload balancing, and speed up neighborhood searches in distributed parallel processing environments: i) a computation split heuristic for domain decomposition, ii) a data index preprocessing step, and iii) a rule-based cluster merging scheme. As a proof of concept we implemented HPDBSCAN as an OpenMP/MPI hybrid application. Using real-world data sets, such as a point cloud from the old town of Bremen, Germany, we demonstrate that our implementation is able to achieve a significant speed-up and scale-up in common HPC setups. Moreover, we compare our approach with previous attempts to parallelize DBSCAN, showing an order of magnitude improvement in terms of computation time and memory consumption.
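To illustrate the data-index idea named above, here is a small single-process Python sketch (a toy under stated assumptions, not the HPDBSCAN OpenMP/MPI code): points are bucketed into a grid of cells with side length eps, so an eps-neighborhood query only has to inspect a point's own cell and its adjacent cells.

```python
# Toy illustration of spatial-grid indexing for DBSCAN-style
# eps-neighborhood queries: only 3^d adjacent cells can contain
# eps-neighbors of a point, so searches stay local.
from collections import defaultdict
from itertools import product
import numpy as np

def build_grid(points, eps):
    grid = defaultdict(list)
    for i, p in enumerate(points):
        grid[tuple((p // eps).astype(int))].append(i)
    return grid

def region_query(points, grid, eps, i):
    """Indices of all points within eps of points[i]."""
    cell = tuple((points[i] // eps).astype(int))
    neighbors = []
    for offset in product((-1, 0, 1), repeat=points.shape[1]):
        for j in grid.get(tuple(c + o for c, o in zip(cell, offset)), []):
            if np.linalg.norm(points[i] - points[j]) <= eps:
                neighbors.append(j)
    return neighbors

pts = np.random.default_rng(1).uniform(0, 10, size=(1000, 2))
grid = build_grid(pts, eps=0.5)
print(len(region_query(pts, grid, 0.5, 0)), "neighbors of point 0")
```

In the parallel setting, cells also give a natural unit for domain decomposition: each process owns a block of cells plus a halo of neighboring cells, which is the intuition behind the computation split heuristic.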
-
G. Cavallaro, M. Riedel, M. Goetz, C. Bodenstein, M. Richerzhagen, P. Glock, J.A. Benediktsson
Scalable Developments for Big Data Analytics in Remote Sensing
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Milan, Italy, 2015
[ DOI ] [ Juelich ]
Abstract:
Big Data Analytics methods take advantage of techniques from the fields of data mining, machine learning, and statistics, with a focus on analysing large quantities of data (aka ‘big datasets’) with modern technologies. Big data sets appear in remote sensing in the sense of large volumes, but also in the sense of an ever-increasing number of spectral bands (i.e., high-dimensional data). Remote sensing has traditionally used the above-described techniques for a wide variety of applications such as classification (e.g., land cover analysis using different spectral bands from satellite data), but more recently scalability challenges arise when using traditional (often serial) methods. This paper addresses observed scalability limits when using support vector machines (SVMs) for classification and discusses scalable and parallel developments used in concrete application areas of remote sensing. Different approaches based on massively parallel methods are discussed, as well as recent developments in parallel methods.
-
M. Goetz, M. Richerzhagen, G. Cavallaro, C. Bodenstein, P. Glock, M. Riedel, J.A. Benediktsson
On Scalable Data Mining Techniques for Earth Science
Sixth Workshop on Data Mining in Earth System Science (DMESS 2015), International Conference on Computational Science (ICCS), Reykjavik, Iceland, 2015, pp. 2188-2197
[ DOI ] [ Juelich ]
Abstract:
One of the observations made in earth data science is the massive increase of data volume (e.g., higher-resolution measurements) and dimensionality (e.g., hyperspectral bands). Traditional data mining tools (Matlab, R, etc.) are becoming inadequate for the analysis of these datasets, as they are unable to process or even load the data. Parallel and scalable techniques, though, bear the potential to overcome these limitations. In this contribution we therefore evaluate said techniques in a High Performance Computing (HPC) environment on the basis of two earth science case studies: (a) Density-Based Spatial Clustering of Applications with Noise (DBSCAN) for automated outlier detection and noise reduction in a 3D point cloud, and (b) land cover type classification using multi-class Support Vector Machines (SVMs) in multi-spectral satellite images. The paper compares implementations of the algorithms in traditional data mining tools with HPC realizations and ’big data’ technology stacks. Our analysis reveals that many of them are not yet suited to deal with the coming challenges of data mining tasks in earth sciences.
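As a usage-level sketch of case study (a), the snippet below applies DBSCAN to a synthetic 3D point cloud and discards the points labeled as noise; the eps and min_samples values and the synthetic cloud are illustrative assumptions, not those of the study.

```python
# Hedged sketch: DBSCAN-based outlier removal in a point cloud.
# Points assigned the label -1 are noise and can be dropped.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(3)
cloud = np.vstack([rng.normal(0, 0.3, size=(500, 3)),   # dense structure
                   rng.uniform(-5, 5, size=(50, 3))])   # scattered noise

labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(cloud)
denoised = cloud[labels != -1]
print(f"kept {len(denoised)} of {len(cloud)} points")
```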
-
M. Riedel, M. Goetz, M. Richerzhagen, P. Glock, C. Bodenstein, A.S. Memon, M.S. Memon
Scalable and Parallel Machine Learning Algorithms for Statistical Data Mining – Practice & Experience
Proceedings of the 38th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO, Opatija, Croatia, 2015
[ DOI ] [ Juelich ]
Abstract:
Many scientific datasets (e.g., in the earth sciences, medical sciences, etc.) grow with respect to their volume or their dimensionality due to the ever-increasing quality of measurement devices. This contribution specifically focuses on how these datasets can take advantage of new ’big data’ technologies and frameworks that are often based on parallelization methods. Lessons learned with medical and earth science data applications that require parallel clustering and classification techniques, such as support vector machines (SVMs) and density-based spatial clustering of applications with noise (DBSCAN), are a substantial part of the contribution. In addition, selected experiences with related ’big data’ approaches and concrete mining techniques (e.g., dimensionality reduction, feature selection, and extraction methods) are addressed too. In order to overcome the identified challenges, we outline an architecture framework design that we implement with openly available tools in order to enable scalable and parallel machine learning applications in distributed systems.
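As a minimal sketch of one of the mining techniques listed above (dimensionality reduction), assuming synthetic low-rank data rather than the actual medical or earth science datasets:

```python
# Hedged sketch: PCA-based dimensionality reduction, turning
# high-dimensional measurements into a smaller representation.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
latent = rng.normal(size=(5000, 10))             # 10 hidden factors
mixing = rng.normal(size=(10, 100))              # spread over 100 dims
X = latent @ mixing + 0.1 * rng.normal(size=(5000, 100))

pca = PCA(n_components=0.95)                     # keep 95% of the variance
X_small = pca.fit_transform(X)
print(X.shape, "->", X_small.shape)              # e.g. (5000, 100) -> (5000, 10)
```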
-
M.S. Memon, M. Riedel, C. Koeritz, A. Grimshaw
Interoperable Job Execution and Data Access through UNICORE and the Global Federated File System
Proceedings of the 38th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO, Opatija, Croatia, 2015
[ DOI ] [ Juelich ]
Abstract:
Computing middlewares play a vital role in abstracting the complexities of backend resources by providing seamless access to heterogeneous execution management services. Scientific communities are taking advantage of such technologies to focus on science rather than dealing with the technical intricacies of accessing resources. Multi-disciplinary communities often bring dynamic requirements which are not trivial to realize, specifically attaining massively parallel data processing on supercomputing resources, which requires access to large data sets from widely distributed and dynamic sources located across organizational boundaries. In order to support this scenario, we present a combination that integrates the UNICORE middleware and the Global Federated File System. Furthermore, the paper gives an architectural and implementation perspective on the UNICORE extension and its interaction with the Global Federated File System space through computing, data, and security standards.
-
T.K. Samuel, S. Wan, P.V. Coveney, M. Riedel, S. Memon, S. Gesing, N. Wilkins-Diehr
Overview of XSEDE-PRACE collaborative projects in 2014
Proceedings of the 2015 XSEDE Conference: Scientific Advancements Enabled by Enhanced Cyberinfrastructure, 2015
[ DOI ] [ Juelich ]
Abstract:
In this paper we give a brief overview of the three projects that were chosen for XSEDE-PRACE collaboration in 2014. We begin with an introduction to the XSEDE and PRACE organizations and the motivation for a collaborative effort between these two organizations. We then describe the three projects involved in this collaboration, providing an overview of the projects themselves and what was in scope for the collaboration. We also outline the hurdles and issues faced during this unique collaborative effort and discuss the benefits the projects derived from it. We finally outline the steps envisioned for future XSEDE-PRACE collaborative efforts.
-
2014
A. S. Memon, J. Jensen, A. Cernivec, K. Benedyczak, M. Riedel
Federated Authentication and Credential Translation in the EUDAT Collaborative Data Infrastructure
IEEE First International Workshop on Cloud Federation Management (Identity, Resources, and Applications), London, UK, 2014
[ Event ] [ Juelich ]
Abstract:
One of the challenges in a distributed data infrastructure is how users authenticate to the infrastructure and how their authorisations are tracked. Each user community comes with its own established practices, all different, and users are put off if they need to use new, difficult tools. From the perspective of the infrastructure project, the level of assurance must be high enough, and it should not be necessary to reimplement an authentication and authorisation infrastructure (AAI). In the EUDAT project, we chose to implement a mostly loosely coupled approach based on the outcomes of the Contrail and UNICORE projects. We have preferred a practical approach, combining the outcomes of several projects that have contributed parts of the puzzle. The present paper describes the experiences with the integration of these parts. Eventually, we aim to have a full framework which will enable us to easily integrate new user communities and new services.
-
M.S. Memon, M. Riedel, F. Janetzko, B. Demeler, G. Gorbet, S. Marru, A. Grimshaw, L. Gunathilake, R. Singh, N. Attig, Th. Lippert
Advancements of the UltraScan Scientific Gateway for Open Standards-Based Cyberinfrastructures
Concurrency and Computation: Practice and Experience, Vol. 26, 13, pp. 2280-2291, 2014
[ DOI ] [ Juelich ]
Abstract:
The UltraScan data analysis application is a software package that is able to take advantage of computational resources in order to support the interpretation of analytical ultracentrifugation experiments. Since 2006, the UltraScan scientific gateway has been used with Web browsers in TeraGrid by scientists studying the solution properties of biological and synthetic molecules. UltraScan supports its users with a scientific gateway in order to leverage the power of supercomputing. In this contribution, we focus on several advancements of the UltraScan scientific gateway architecture with standardized job management, while retaining its lightweight design and end-user interaction experience. This paper also presents insights into a production deployment of UltraScan in Europe. The approach is based on open standards with respect to job management and submissions to the Extreme Science and Engineering Discovery Environment in the USA and to similar infrastructures in Europe, such as the European Grid Infrastructure or the Partnership for Advanced Computing in Europe (PRACE). Our implementation takes advantage of the Apache Airavata framework for scientific gateways, which lays the foundation for easy integration into several other scientific gateways.
-
G. Cavallaro, M. Riedel, J.A. Benediktsson, M. Goetz, T. Runarsson, K. Jonasson, Th. Lippert
Smart Data Analytics Methods for Remote Sensing Applications
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS) / 35th Canadian Symposium on Remote Sensing, Quebec, Canada, 2014
[ DOI ] [ Juelich ]
Abstract:
The big data analytics approach has emerged, which can be interpreted as extracting information from large quantities of scientific data in a systematic way. In order to have a more concrete understanding of this term, we refer to its refinement as smart data analytics: examining large quantities of scientific data to uncover hidden patterns and unknown correlations, or to extract information in cases where there is no exact formula (e.g., known physical laws). Our concrete big data problem is the classification of land cover types in image-based datasets that have been created using remote sensing technologies, because the resolution can be high (i.e., large volumes) and there are various data types such as panchromatic, or different bands like red, green, blue, and near infrared (i.e., large variety). We investigate various smart data analytics methods that take advantage of machine learning algorithms (i.e., support vector machines) and state-of-the-art parallelization approaches in order to overcome the limitations of big data processing with non-scalable serial approaches.
-
M. Riedel, A. Memon, M. Memon
High Productivity Data Processing Analytics Methods with Applications
Proceedings of the 37th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO, Opatija, Croatia, 2014
[ DOI ] [ Juelich ]
Abstract:
The term ‘big data analytics’ emerged in order to engage with the ever-increasing amount of scientific and engineering data using general analytics techniques that support the often more domain-specific data analysis process. It is recognized that the big data challenge can only be adequately addressed when knowledge from various fields such as data mining, machine learning algorithms, parallel processing, and data management practices is effectively combined. This paper thus describes some of the ‘smart data analytics methods’ that enable high-productivity data processing of large quantities of scientific data in order to enhance data analysis efficiency. The paper aims to provide new insights into how these various fields can be successfully combined. Contributions of this paper include the concretization of the cross-industry standard process for data mining (CRISP-DM) process model in scientific environments using concrete machine learning algorithms (e.g., support vector machines that enable data classification) or data mining mechanisms (e.g., outlier detection in measurements). Serial and parallel approaches to specific data analysis challenges are discussed in the context of concrete earth science application data sets. Solutions also include various data visualizations that enable better insight into the corresponding data analytics and analysis process.
-
2013
M. Riedel, P. Wittenburg, J. Reetz, M. van de Sanden, J. Rybicki, B. von St. Vieth, G. Fiameni, G. Mariani, A. Michelini, C. Cacciari, W. Elbers, D. Broeder, R. Verkerk, E. Erastova, M. Lautenschlaeger, R. Budig, H. Thielmann, P. Coveney, S. Zasada, A. Haidar, O. Buechner, C. Manzano, S. Memon, S. Memon, H. Helin, J. Suhonen, D. Lecarpentier, K. Koski and Th. Lippert
A Data Infrastructure Reference Model with Applications: Towards Realization of a ScienceTube Vision with a Data Replication Service
Journal of Internet Services and Applications, Volume 4, Issue 1, 2013
[ DOI ] [ Juelich ]
Abstract:
A wide variety of scientific user communities have been working with data for many years and thus already have a wide variety of data infrastructures in production today. The aim of this paper is thus not to create one new general data architecture that would fail to be adopted by every individual user community. Instead, this contribution aims to design a reference model with abstract entities that is able to federate existing concrete infrastructures under one umbrella. A reference model is an abstract framework for understanding significant entities and the relationships between them, and thus helps in understanding existing data infrastructures when comparing them in terms of functionality, services, and boundary conditions. An architecture derived from such a reference model can then be used to create a federated architecture that builds on the existing infrastructures and aligns them to a major common vision. This common vision is named ‘ScienceTube’ as part of this contribution and determines the high-level goal that the reference model aims to support. This paper describes how a well-focused use case around data replication and its related activities in the EUDAT project [4] provides a first step towards this vision. Concrete stakeholder requirements arising from scientific end users, such as those of the European Strategy Forum on Research Infrastructures (ESFRI) projects, underpin this contribution with clear evidence that the EUDAT activities are bottom-up, thus providing real solutions towards the so often only described ‘high-level big data challenges’. The federated approach, taking advantage of community and data centers (with large computational resources), further describes how data replication services enable data-intensive computing on terabytes or even petabytes of data emerging from ESFRI projects.
-
M. Riedel, M. Memon, A. Memon, G. Fiameni, C. Cacciari, and Th. Lippert
High Productivity Processing - Engaging in Big Data around Distributed Computing
Proceedings of the 36th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO, Opatija, Croatia, 2013, to be published
[ Event ]
Abstract:
The steadily increasing amount of scientific data and the analysis of 'big data' is a fundamental characteristic in the context of computational simulations that are based on numerical methods or known physical laws. This represents both an opportunity and a challenge on different levels for traditional distributed computing approaches, architectures, and infrastructures. On the lowest level, data-intensive computing is a challenge since CPU speed has surpassed the I/O capabilities of HPC resources, and on the higher levels complex cross-disciplinary data sharing is envisioned via data infrastructures in order to bring together the fragmented answers to societal challenges. This paper highlights how these levels share the demand for 'high productivity processing' of 'big data', including the sharing and analysis of large-scale science data sets. The paper describes approaches such as the high-level European data infrastructure EUDAT as well as low-level requirements arising from HPC simulations used in distributed computing. The paper aims to address the fact that big data analysis methods such as computational steering and visualization, map-reduce, R, and others exist, but substantial research and evaluation still need to be done to achieve scientific insights with them in the context of traditional distributed computing infrastructures.
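For readers unfamiliar with the map-reduce model mentioned above, a toy word count in plain Python (a conceptual sketch only, with no cluster framework involved) looks like this:

```python
# Conceptual map-reduce sketch: map emits (key, value) pairs,
# shuffle groups values by key, reduce aggregates each group.
from collections import defaultdict
from functools import reduce

records = ["big data analysis", "data infrastructures", "big data"]

# Map: emit (word, 1) pairs.
mapped = [(word, 1) for line in records for word in line.split()]

# Shuffle: group values by key.
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce: aggregate each key's values.
counts = {key: reduce(lambda a, b: a + b, values)
          for key, values in groups.items()}
print(counts)   # {'big': 2, 'data': 3, 'analysis': 1, 'infrastructures': 1}
```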
-
S.M. Memon, M. Riedel, F. Janetzko, N. Attig, Th. Lippert, B. Demeler, G. Gorbet, S. Marru, R. Singh, L. Gunathilake and A. Grimshaw
Improvements of the UltraScan Scientific Gateway to Enable Computational Jobs on Large-scale and Open-standards based Cyberinfrastructures
Proceedings of the XSEDE 2013 - Gateway to Discovery - Conference, San Diego, USA, 2013, to be published
[ Event ]
Abstract:
The UltraScan data analysis application is a software package that is able to take advantage of computational resources in order to support the interpretation of analytical ultracentrifugation (AUC) experiments. Since 2006, the UltraScan scientific gateway has been used with ordinary Web browsers in TeraGrid by scientists studying the solution properties of biological and synthetic molecules. Unlike other applications, UltraScan is implemented on a gateway architecture and leverages the power of supercomputing to extract very high resolution information from the experimental data. In this contribution, we focus on several improvements of the UltraScan scientific gateway that enable standardized job submission and management on computational resources, while retaining its lightweight design so as not to disturb the established working habits of its end users. This paper further presents a walkthrough of the architectural design, including one real deployment of UltraScan in Europe. The aim is to provide evidence for the added value of open standards and the resulting interoperability, enabling UltraScan application submissions not only to resources offered in the US cyberinfrastructure Extreme Science and Engineering Discovery Environment (XSEDE), but also to similar infrastructures in Europe and around the world. The use of the Airavata framework for scientific gateways within our approach bears the potential to have an impact on several other scientific gateways too.
-
M. Memon, S. Holl, B. Schuller, M. Riedel, A. Grimshaw
Enhancing the Performance of Workflow Execution in e-Science Environments by Using the Standards-Based Parameter Sweep Model
Proceedings of the XSEDE 2013 - Gateway to Discovery - Conference, San Diego, USA, 2013, to be published
[ Event ]
Abstract:
Certain scientific use cases possess complex requirements for having Grid jobs executed in collections where the jobs' requests contain only some variation in different parts. These scenarios can easily be tackled by a single job request which abstracts this variation and can represent the whole collection. The Open Grid Forum (OGF) standards community modeled this requirement through the Job Submission and Description Language (JSDL) Parameter Sweep specification, which takes a modular approach to handle different variations of parameter sweeps (e.g., document and file sweeps). In this paper we present the UNICORE server environment implementing this specification, built upon its existing JSDL implementation. In order to validate our approach to the parameter sweep extension, we have taken a scientific use case and empirically evaluated it with and without the parameter sweep extensions applied.
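The core idea of a parameter sweep (one request expanding into a job collection) can be sketched in a few lines of Python; the command template and sweep values below are hypothetical, and real JSDL sweeps are expressed in XML rather than code.

```python
# Hedged sketch of the parameter-sweep expansion idea: one template
# plus sweep definitions yields the full collection of concrete jobs.
from itertools import product

template = "simulate --input {input} --tolerance {tol}"
sweep = {
    "input": ["run1.dat", "run2.dat"],   # document/file sweep
    "tol":   [1e-3, 1e-4, 1e-5],         # value sweep
}

keys = list(sweep)
jobs = [template.format(**dict(zip(keys, combo)))
        for combo in product(*(sweep[k] for k in keys))]

for job in jobs:                         # 2 x 3 = 6 concrete jobs
    print(job)
```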
-
M. Riedel, S. Memon, F. Janetzko, N. Attig, Th. Lippert, B. Demeler, G. Gorbet, S. Marru and R. Singh
On Enabling Hydrodynamics Data Analysis of Analytical Ultracentrifugation Experiments
Proceedings of the UNICORE Summit 2013, Leipzig, Germany, 2013, to be published
[ Event ]
Abstract:
Since the early 1990s, digital data acquisition from the analytical ultracentrifuge has laid the foundation for analysing sedimentation data using computational resources. The UltraScan data analysis application became a well-known multi-platform software package that is able to take advantage of such resources in order to support the interpretation of analytical ultracentrifugation (AUC) experiments. UltraScan not only provides guidance for the design of sedimentation experiments, but also addresses data management challenges that arise from the wealth of AUC data. For example, it includes a laboratory information management system (LIMS) implemented as a relational database and several scientific domain-specific data analysis methods. These methods allow for designing, analysing, interpreting, and visualizing sedimentation equilibrium and velocity experiments in a graphical interface. More recently, an UltraScan scientific gateway was developed to make the UltraScan software package and its inherent data analysis methods available to a broader community, including new and less experienced scientific users, thus abstracting from the low-level technical details of computational resources. The UltraScan scientific gateway is used across the world, with its dominant users in the US and Europe. This paper describes the architectural design approach to enable modern hydrodynamics data analysis on computational resources that offer access via the UNICORE middleware system. The challenges to overcome in this application enabling process are threefold. First, the UltraScan scientific gateway needs to be used to retain the working practice of scientists. Second, the architectural design approach should be re-usable in other scientific gateways than just UltraScan. Third, mature and broadly used open standards should be used in order to provide the flexibility to use the approach with a wide variety of computational resources that are exposed through standards-compliant middleware. The full paper describes in more detail how we overcome these challenges with the following briefly described techniques. In order to abstract from the complexity of the wide variety of computational resources, middleware systems, and standard protocols, we used the Apache Airavata software package, a framework specifically designed to build scientific gateways. It provides functionality to construct, execute, manage, and monitor applications on computational resources and is integrated into the UltraScan scientific gateway. We augmented this framework with clients for broadly used standards such as the Basic Execution Service (BES) and the Job Submission and Description Language (JSDL) that enable computational job submission to resources that offer standards-compliant middleware such as UNICORE, GENESIS, or GridSAM. Since Apache Airavata is considered for use by other scientific gateways as well, this solution has the potential to enable other scientific applications besides UltraScan. This paper, however, focuses on a specific deployment and implementation of the aforementioned solution. It describes how the UltraScan scientific gateway can take advantage of the Apache Airavata framework without changing the way scientists have worked with it over the years. Furthermore, we describe how the standards-compliant approach enables UltraScan data analysis using High Performance Computing (HPC) methods (i.e. the Message Passing Interface for parallel programming) to support 2-dimensional spectrum analysis, genetic algorithms, and Monte Carlo analysis techniques in UNICORE environments. Although the paper specifically focuses on one concrete computational resource (i.e. the JUROPA system at the Juelich Supercomputing Centre) to illustrate the solution and its complex challenges, we further outline how this approach influences activities in the context of the US Extreme Science and Engineering Discovery Environment (XSEDE) infrastructure. A key summary of this contribution is provided with lessons learned on the application enabling process and with making the case that the UNICORE community needs to work more closely with domain-specific scientific gateway developers to increase the overall usage of UNICORE by orders of magnitude.
-
2012
D. Mallmann, B. von St. Vieth, M. Riedel, J. Rybicki, K. Koski, D. Lecarpentier, P. Wittenburg
EUDAT - Towards a pan-European Collaborative Data Infrastructure
Inside, Vol. 10, 1, pp. 84-85, 2012
[ Online ] [ Juelich ]
Abstract:
On October 1, 2011 the EUDAT project was launched to target a pan-European solution to the challenge of data proliferation in Europe's scientific and research communities. Aiming to contribute to the production of a Collaborative Data Infrastructure driven by researchers' needs, the project is coordinated by CSC - IT Center for Science, Finland, and co-funded by the European Commission's Framework Programme 7. EUDAT aims at providing Europe's scientific and research communities with a sustainable pan-European infrastructure for improved access to scientific data. Burgeoning volumes of valuable and complex data - newly available from powerful new scientific instruments, simulations and the digitization of library resources - represent a fantastic opportunity for science, but have created new challenges related to data management, access and preservation. The EUDAT consortium comprises 25 European partners, including data centers, technology providers, research communities and funding agencies from 13 countries, who will work together to deliver a Collaborative Data Infrastructure that can sustainably meet future researchers' needs.
-
C. Aiftimiei, A. Aimar, A. Ceccanti, M. Cecchi, A. Di Meglio, F. Estrella, P. Fuhrmann, E. Giorgio, B. Konya, J. Nilsen, M. Riedel, J. White
Towards Next Generations of Software for Distributed Infrastructures: The European Middleware Initiative
Proceedings of the 8th IEEE Conference on e-Science, Chicago, USA, 2012, ISBN 978-1-4673-4467-8
[ IEEE ] [ Juelich ]
Abstract:
The last two decades have seen an exceptional increase in the available networking, computing and storage resources. Scientific research communities have exploited these enhanced capabilities, developing large-scale collaborations supported by distributed infrastructures. In order to enable the usage of such infrastructures, several middleware solutions have been created. However, such solutions, having been developed separately, have often resulted in incompatible middlewares and infrastructures. The European Middleware Initiative (EMI) is a collaboration, started in 2010, among the major European middleware providers (ARC, dCache, gLite, UNICORE), aiming to consolidate and evolve the existing middleware stacks, facilitating their interoperability and their deployment on large distributed infrastructures, and establishing at the same time a sustainable model for the future maintenance and evolution of the middleware components. This paper presents the strategy followed to achieve these goals: after an analysis of the situation before EMI, an overview of the development strategy is given, followed by the most notable technical results, grouped according to the four development areas (Compute, Data, Infrastructure, Security). The rigorous process ensuring the quality of the provided software is then illustrated, followed by a description of the release process and of the relations with the user communities. The last section provides an outlook to the future, focusing on the ongoing actions looking toward the sustainability of the activities.

-
A. Di Meglio, F. Estrella, M. Riedel
On realizing the concept study ScienceSoft of the European Middleware Initiative: Open Software for Open Science
Proceedings of the 8th IEEE Conference on e-Science, Chicago, USA, 2012, ISBN 978-1-4673-4467-8
[ IEEE ] [ Juelich ]
Abstract:
In September 2011 the European Middleware Initiative (EMI) started discussing the feasibility of creating an open source community for science with other projects like EGI, StratusLab, OpenAIRE, iMarine, and IGE, SMEs like DCore, Maat, SixSq, and SharedObjects, and communities like WLCG and LSGC. The general idea of establishing an open source community dedicated to software for scientific applications was understood and appreciated by most people. However, the lack of a precise definition of goals and scope is a limiting factor that has also made many people sceptical of the initiative. In order to understand more precisely what such an open source initiative should do and how, EMI has started a more formal feasibility study around a concept called ScienceSoft – Open Software for Open Science. A group of people from the interested parties was created in December 2011 to be the ScienceSoft Steering Committee, with the short-term mandate to formalize the discussions about the initiative and produce a document with an initial high-level description of the motivations, issues and possible solutions and a general plan to make it happen. The conclusions of the initial investigation were presented at CERN in February 2012 at a ScienceSoft Workshop organized by EMI. Presentations of ScienceSoft have also been made on various other occasions: in Amsterdam in January 2012 at the EGI Workshop on Sustainability, in Taipei in February at the ISGC 2012 conference, and in Munich in March at the EGI/EMI Conference and at OGF 34. This paper provides information on the ScienceSoft concept study as an overview distributed to the broader scientific community for critique.
-
M.S. Memon, J. Rybicki, M. Riedel, A.S. Memon, E. Yen
Bridging the Gaps: Federation of Clouds Using Grid Services and Standards
Proceedings of the 35th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO, Opatija, Croatia, Ed.: P. Biljanovic, Z. Butkovic, K. Skala, S. Golubic, N. Bogunovic, S. Ribaric, M. Cicin-Sain, D. Cisic, Z. Hutinski, M. Baranovic, M. Mauher, J. Ulemek, IEEE, ISBN 978-1-4673-2577-6, pp. 411-416, 2012
[ IEEE ] [ Juelich ]
Abstract:
Clouds have emerged as a new paradigm for accessing compute, storage, and networked resources in a secure and cost-effective manner. Their major benefits are seen in the commercial domain, with key features such as on-demand and more flexible resource provisioning, pay-per-use, and customized application environments. Research communities such as High Energy Physics (HEP), Biology, and Neuroscience are also investigating the applicability of Clouds, with their strengths and weaknesses, in scientific environments. In this paper we show that in scientific environments there are certain areas where cloud services should be exploited to support challenging e-Science requirements. Among them are support for virtual communities, dynamic service and resource discovery, identity and resource federation, and access to data catalogues. The Grid community has actively contributed to addressing some of these issues; thus, we propose to reuse existing efforts to complement Cloud services with Grid computing best practices, production services, and experiences, including standardization. In this paper we provide guidelines on how to realize multi-cloud federated deployments based on a survey of existing Grid technologies, augmented with lessons learned in scientific environments. The contribution focuses on the areas of compute, data, information, and security. We also show potential benefits that scientists can gain by adopting the proposed solutions in cloud-based deployments.
-
S. A. Memon, M. Riedel, L. Field, G. Szigeti and I. Marton
EMIR: An EMI Service Registry for Federated Grid Infrastructures
Proceedings of Science, Proceedings of the EGI Community Forum 2012 / EMI Second Technical Conference, Munich, Germany, 2012
[ Online ] [ Juelich ]
Abstract:
The European Middleware Initiative (EMI) is a European project that represents a collaboration of four middlewares, namely ARC, dCache, gLite, and UNICORE. All these middleware services should be easily deployable in a Grid infrastructure. However, the immediate challenge is the discovery of those services in a particular infrastructure, which is typically done via so-called registries. This is a major requirement for operational systems and for the middleware itself. Existing registries such as the ARC Information Index or the UNICORE registry are designed to index middleware-specific services. Given their centralized nature, the scope of these registries can become limited when considering a federated infrastructure that relies on services from different technology providers. Distributed Grid infrastructures such as EGI are federated; therefore, a unified registry should reflect this requirement. In this paper, a common registry, EMIR, is proposed, which attempts to overcome the challenges of federation, robustness, and the performance implications of ever-expanding Grids.
-
S.M. Memon, M. Riedel, B. Hagemeier, B. Schuller and M. Carpené
UNICORE EMI-Execution Service Realization towards Improved Open Standards
Proceedings of Science, Proceedings of the EGI Community Forum 2012 / EMI Second Technical Conference, Munich, Germany, 2012
[ Online ] [ Juelich ]
Abstract:
The EMI project unites a set of production Grid middleware technologies providing scientific communities with secure access to distributed and heterogeneous compute and data resources. Within the EMI compute area, job management and monitoring are considered to be the most significant areas of work. Based on earlier Open Grid Forum (OGF) Production Grid Infrastructure (PGI) activities, the existing standards and their adoption in the domain of job management on distributed computing infrastructures have been reviewed. As a consequence, several advanced execution service concepts have been identified that influenced the EMI-ES specification. The goal of this paper is to present the concepts of the EMI-ES interface and its information model, which is required to manage, monitor, and model activities in production Grids. In this paper, we delineate the architectural details of EMI-ES and one of its ‘proof of concept’ realizations in UNICORE. The feedback from these activities is already part of the standardization process in OGF, and this paper puts existing Grid standards in context by comparing them with the proposed specification.
-
M. Riedel, J. Rybicki and A. Di Meglio
Software for Distributed Systems - The EMI Product Portfolio
Proceedings of Science, Proceedings of the EGI Community Forum 2012 / EMI Second Technical Conference, Munich, Germany, 2012
[ Online ] [ Juelich ]
Abstract:
The European Middleware Initiative (EMI) brings together ARC, dCache, gLite, and UNICORE to provide a harmonised set of products and streamlined releases to the DCI community. While there are many technical solutions around, EMI is one of the key players in providing software for the large-scale distributed systems that are operated around the world today. With products and solutions in various technical areas such as compute, data, information, and security, it is interesting to understand that these products also implement many of the principles and paradigms of distributed systems. This contribution provides an overview of the EMI product portfolio, focusing on its key features and their role in distributed systems, based on comparisons with known literature such as the books by Tanenbaum.
-
M. Riedel, A. Grimshaw, Th. Lippert
UNICORE 2020 – Strategic Options for the Future
Proceedings of the UNICORE Summit 2012, Dresden, Germany, ed.: V. Huber, R. Müller-Pfefferkorn, M. Romberg, Forschungszentrum Jülich, 2012, IAS Series Vol. 15., ISBN 978-3-89336-829-7, 2012
[ IAS Series ] [ Juelich ]
Abstract:
International e-Infrastructures offer a wide variety of Information and Communications Technology (ICT) services that federate computing, storage, networking and other hardware in order to create an ’innovative toolset’ for multidisciplinary research and engineering. UNICORE services are known to be secure, reliable, and fast, providing researchers all over the world with powerful software that enables the use of those e-Infrastructures as a ’commodity tool’ in daily geographically distributed activities. As a key technology provider of the European Grid Infrastructure (EGI), UNICORE is available as part of the Unified Middleware Distribution (UMD), serving the needs of researchers that mainly require High Throughput Computing (HTC). On the other end of the scale, UNICORE offers specifically optimized resources within the Partnership for Advanced Computing in Europe (PRACE) today. Beyond Europe, UNICORE installations are emerging more and more, such as within the Extreme Science and Engineering Discovery Environment (XSEDE), the US multi-disciplinary e-Infrastructure (aka cyberinfrastructure) offering both HTC and HPC resources. The grand challenges in science, engineering, and society that need to be solved towards 2020 and beyond will increasingly require both geographical and intellectual collaboration across multiple disciplines. International e-Infrastructures are considered to be one key toolset to tackle those grand challenges, and this contribution outlines several options for how UNICORE can remain a ’technology of choice’ towards 2020. A strategic roadmap is presented that illustrates the role of UNICORE alongside the European Commission’s (EC) vision for Europe in 2020, including the opportunities that arise for UNICORE in the context of the Digital Agenda for Europe. The roadmap also includes how UNICORE can play a role in tackling, or rather contributing processing toward the avoidance of, the ’big data waves’ arising from the wide variety of e-Infrastructure users emerging from the European Strategy Forum on Research Infrastructures (ESFRI).
-
2011
M. Riedel, B. Demuth
UNICORE in XSEDE: Towards a large-scale scientific Environment based on Open Standards
Inside, Vol. 9, 2, pp. 52-53, 2011
[ Online ] [ Juelich ]
Abstract:
Starting in 2001, the National Science Foundation program TeraGrid developed into one of the world’s largest and most comprehensive Grid projects, offering resources and services to more than 10,000 scientists. Its successor, the Extreme Science and Engineering Discovery Environment (XSEDE, www.xsede.org), started in July 2011 and is expected to exceed the previous program in terms of service quality while lowering technological entry barriers at the same time. These and other goals are to be achieved in the project’s five-year grant period with an overall budget of $121 million. Among the partnership of 17 institutions, the Jülich Supercomputing Centre (JSC) is the only organization located outside the USA.
-
M. Riedel
e-Science Infrastructure Interoperability Guide: The Seven Steps Toward Interoperability for e-Science
Book chapter, Guide to e-Science, Next Generation Scientific Research and Discovery, Computer Communications and Networks, Part 3, edited by Xiaoyu Yang, Lizhe Wang and Wei Jie, Springer, ISBN 978-0-85729-438-8, pp. 233-264, 2011
[ DOI ] [ Juelich ]
Abstract:
This chapter investigates challenges and provides proven solutions in the context of e-science infrastructure interoperability, because we want to guide world-wide infrastructure interoperability efforts. The chapter illustrates how an increasing number of e-scientists can take advantage of using different types of e-science infrastructures jointly for their e-research activities. The goal is to give readers working in computationally-driven research infrastructures (e.g., within ESFRI scientific user community projects) the opportunity to transfer the processes to their particular situations. Hence, although the examples and processes of this chapter are closely aligned with specific setups in Europe, many lessons learned can actually be used in similar environments, potentially arising from ESFRI projects that seek to use the computational resources within EGI and PRACE via their own research infrastructures, techniques, and tools. Furthermore, we emphasize that readers should get a sense of interoperability thinking and its benefits, especially using sustainable standards-based approaches. For several decades, traditional scientific computing has been seen as a third pillar alongside theory and experiment, and for ten years the Grid community has provided a solid e-science infrastructure basement for these pillars to achieve e-science. E-science is known for new kinds of collaboration in key areas of science through resource sharing using those infrastructures. But a closer look reveals that this basement is realized by a wide variety of e-science infrastructures today, while we observe an increasing demand by e-scientists to use more than one infrastructure to achieve e-science. A rather new ‘e-science design pattern’ in this context is the use of algorithms through scientific workflows that use both concepts of High Throughput Computing (HTC) and High Performance Computing (HPC) on production e-science infrastructures today. This chapter illustrates ways and examples of realizing this infrastructure interoperability e-science design pattern and therefore reviews existing reference models and architectures that are known to promote interoperability, such as the OGF Open Grid Services Architecture (OGSA), the Common Component Architecture (CCA), and the OASIS Service Component Architecture (SCA). The review of these reference models and architectures provides insights into the numerous limitations arising from not having suitable reference models in the community, or from following proprietary approaches in case-by-case interoperability efforts without using any standards at all. As its main contribution, this chapter therefore presents a concrete seven-step plan to guide infrastructure interoperability processes. So far, reference models in Grids only address component-level interoperability aspects such as concrete functionality and semantics. In contrast, we turn the whole process of reaching production e-science infrastructure interoperability into a concrete seven-step plan while ensuring concrete production Grid impact. This impact is in turn another important contribution of this chapter, which we can see in the light of separating the ‘e-science hype’ from ‘e-science production infrastructure reality’ today. Hence, this chapter not only presents how technical interoperability can be achieved on today’s production infrastructures, but also gives insights into operational, policy and sustainability aspects, thus giving complementary guidance for world-wide Grids and emerging research infrastructures (i.e., ESFRIs or other virtual science communities) as well as their technology providers and e-scientists. The chapter illustrates how the aforementioned steps can significantly support the process of establishing Grid interoperability, and it furthermore gives concrete examples for each step in the context of real e-research problems and activities. Hence, the chapter also puts the processes in context with interoperability field studies and use cases in the fields of fusion science (EUFORIA) and bio-informatics (WISDOM and the Virtual Physiological Human).
-
W. Gentzsch, D. Girou, A. Kennedy, H. Lederer, J. Reetz, M. Riedel, A. Schott, A. Vanni, M. Vazquez, J. Wolfrat
DEISA - Distributed European Infrastructure for Supercomputing Applications
Journal of Grid Computing, Vol. 9, Issue 2, pp. 259-277, 2011
[ DOI ] [ Juelich ]
Abstract:
The paper presents an overview of the current research and achievements of the DEISA project, with a focus on the general concept of the infrastructure, the operational model, application projects and science communities, the DEISA Extreme Computing Initiative, user and application support, operations and technology, services, collaborations and interoperability, and the use of standards and policies. The paper concludes with a discussion about the long-term sustainability of the DEISA infrastructure.
-
M. Riedel, M.S. Memon, A.S. Memon, D. Mallmann, Th. Lippert, D. Kranzlmueller, A. Streit
e-Science Infrastructure Integration Invariants to Enable HTC and HPC Interoperability Applications
Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW), 2011 IEEE International Symposium, Shanghai, China, IEEE, ISBN 978-1-61284-425-1, pp. 922-931, 2011
[ DOI ] [ Juelich ]
Abstract:
During the past decade, significant international and broader interdisciplinary research has increasingly been carried out by global collaborations that often share resources within a single production e-science infrastructure. More recently, the increasing complexity of e-science applications that embrace multiple physical models (i.e., multi-physics) and consider longer and more detailed simulation runs as well as a larger range of scales (i.e., multi-scale) is creating a steadily growing demand for cross-infrastructure operations that take advantage of multiple e-science infrastructures with a greater variety of resource types. Since interoperable e-science infrastructures are still not seamlessly provided today, we proposed in earlier work the Infrastructure Interoperability Reference Model (IIRM), which represents a trimmed-down version of the Open Grid Services Architecture (OGSA) in terms of functionality and complexity, while on the other hand being more specifically useful for production and thus easier to implement. This contribution focuses on several important reference model invariants that are often neglected when infrastructure integration activities are performed, thus hindering seamless interoperability in many aspects. In order to indicate the relevance of our invariant definitions, we provide insights into two accompanying cross-infrastructure use cases from the bio-informatics and fusion science domains.
-
A. Di Meglio, M. Riedel, M.S. Memon, C. Loomis, D. Salomoni
Grids and Clouds Integration and Interoperability: An Overview
Proceedings of Science, Proceedings of the International Symposium on Grids and Clouds and the Open Grid Forum, ISGC 2011 & OGF 31, Taipei, Taiwan, 2011
[ Online ] [ Juelich ]
Abstract:
Are grids and clouds different solutions to the same problems? Or are they simply different aspects of the same solution? Or maybe different solutions to different problems? Are they independent or complementary? Can and should they be used together, or is one a replacement for the other? Starting from the most accepted definitions of grids and clouds, this presentation describes the main differences and commonalities between the two models and the typical scenarios where grids and clouds can be used together or even merged into a common set of technologies and services. The talk gives an overview of the work being done in various contexts to make grids and clouds interoperable or integrable. Technological and operational aspects like virtualization, security, dynamic provisioning and standardization are briefly assessed. Finally, the current work and future directions on cloud and grid integration explored by a number of projects like EGI-InSPIRE, EMI and StratusLab in the context of European Research Infrastructures are introduced.
-
M. Riedel, A. Streit, D. Kranzlmüller, D. Mallmann, Th. Lippert
Requirements of an e-Science Infrastructure Interoperability Reference Model
Proceedings of the 34th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO, Opatija, Croatia, IEEE, ISBN 978-1-4577-0996-8, pp. 221 - 226, 2011
[ IEEE ] [ Juelich ]
Abstract:
Many production Grid and e-science infrastructures have been offering their broad range of resources via services to end users for the past several years, with an increasing number of scientific applications that require access to a wide variety of resources and services in multiple Grids. But the vision of world-wide federated Grid infrastructures, in analogy to the electrical power grid, is still not seamlessly realized today. This is partly due to the fact that Grids provide a much greater variety of services (job management, data management, information, security, etc.) in comparison with the electrical power grid, but also to the rather slow adoption of the Open Grid Services Architecture (OGSA) concept, initially defined as the major Grid reference model architecture roughly one decade ago. This contribution critically reviews OGSA and other related reference models in the field while pointing to significant requirements for an e-science infrastructure interoperability reference model that satisfies the needs of end users today. We give insights into our findings on the core factors of such reference model requirements, including the important major indicators to overcome the known limitations of OGSA. We then compare OGSA and the identified factors with our approach to give evidence for its applicability, relevance, and impact in European production infrastructures today.
-
2010
M. Riedel, B. Schuller, M. Rambadt, M.S. Memon, A.S. Memon, A. Streit, F. Wolf, Th. Lippert, S.J. Zasada, S. Manos, P.V. Coveney, D. Kranzlmüller
Exploring the Potential of Using Multiple e-Science Infrastructures with Emerging Open Standards-based e-Health Research Tools
Proceedings of the 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Melbourne, Australia, pp. 341 - 348, 2010
[ DOI ] [ Juelich ]
Abstract:
E-health makes use of information and communication methods and the latest e-research tools to support the understanding of body functions. E-scientists in this field already take advantage of a single infrastructure to perform computationally intensive investigations of the human body that tend to consider each of the constituent parts separately, without taking into account the multiple important interactions between them. But these important interactions imply an increasing complexity of applications that embrace multiple physical models (i.e., multi-physics) and consider a larger range of scales (i.e., multi-scale), thus creating a steadily growing demand for interoperable infrastructures that allow for innovative new application types that jointly use different infrastructures for one application. But interoperable infrastructures are still not seamlessly provided, and we argue that this is due to the absence of a realistically implementable infrastructure interoperability reference model that is based on lessons learned from e-science usage. Therefore, the goal of this paper is to explore the potential of using multiple infrastructures for one scientific goal, with a particular focus on e-health. Since e-scientists are gaining more interest in using multiple infrastructures, there is a clear demand for interoperability between them to enable their use with one e-research tool. The paper highlights work in the context of an e-health blood flow application, while the reference model is applicable to other e-science applications as well.
-
M. Riedel, M.S. Memon, A.S. Memon, A. Streit, F. Wolf, Th. Lippert, B. Konya, A. Konstaninov, O. Smirnova, M. Marzolla, L. Zangrando, J. Watzl, D. Kranzlmüller
Improvements of Common Open Grid Standards to Increase High Throughput and High Performance Computing Effectiveness on Large-scale Grid and e-Science Infrastructures
Proceedings of the 7th High-Performance Grid Computing (HPGC) Workshop at International Parallel and Distributed Processing Symposium (IPDPS), Atlanta, USA, pp. 1-7, 2010
[ DOI ] [ Juelich ]
Abstract:
Grid and e-science infrastructure interoperability is increasingly in demand for Grid applications, but interoperability based on common open standards adopted by Grid middlewares is only starting to emerge on Grid infrastructures and is not broadly provided today. In earlier work we have shown how open standards can be improved by lessons learned from cross-Grid applications that require access to both High Throughput Computing (HTC) and High Performance Computing (HPC) resources. This paper provides more insights into several concepts, with a particular focus on effectively describing Grid jobs in order to satisfy the demands of e-scientists and their cross-Grid applications. Based on lessons learned over years of interoperability setups between production Grids such as EGEE, DEISA, and NorduGrid, we illustrate how common open Grid standards (i.e. JSDL and GLUE2) can take cross-Grid application experience into account.
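For readers who have not seen JSDL in the wild, the following minimal sketch (Python standard library only) builds the kind of job description document the abstract refers to; the namespaces are the published GGF ones, while the executable, argument, and CPU count are invented placeholders, not values from the paper.

```python
# Minimal JSDL job description sketch; namespaces are the published GGF
# ones, executable and CPU count are invented placeholders.
import xml.etree.ElementTree as ET

JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
ET.register_namespace("jsdl", JSDL)
ET.register_namespace("jsdl-posix", POSIX)

job = ET.Element(f"{{{JSDL}}}JobDefinition")
desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")

app = ET.SubElement(desc, f"{{{JSDL}}}Application")
posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
ET.SubElement(posix, f"{{{POSIX}}}Executable").text = "/usr/bin/simulate"
ET.SubElement(posix, f"{{{POSIX}}}Argument").text = "--steps=1000"

# On an HPC Grid this requests a parallel slot count; on an HTC Grid a
# farming job would typically ask for a single CPU per task.
res = ET.SubElement(desc, f"{{{JSDL}}}Resources")
cpus = ET.SubElement(res, f"{{{JSDL}}}TotalCPUCount")
ET.SubElement(cpus, f"{{{JSDL}}}Exact").text = "128"

print(ET.tostring(job, encoding="unicode"))
```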
-
B. Schuller, M. Riedel, A. Streit
Recent Advances in the UNICORE 6 Middleware
Inside, Vol. 8, 1, pp. 46-49, 2010
[ Online ] [ Juelich ]
Abstract:
The UNICORE Grid system provides a seamless, secure and intuitive access to distributed computational and data resources such as supercomputers, clusters, and large server farms. UNICORE serves as a solid basis in many European and international research projects that use existing UNICORE components to implement advanced features, higher-level services, and support for scientific and business applications from a growing range of domains. Since its initial release in August 2006, the current version UNICORE 6 has been significantly enhanced with new components, features and standards, offering a wide range of functionality to its users from science and industry. After giving a brief overview of UNICORE 6, this article introduces some of these new features and components.
-
A. Streit, P. Bala, A. Beck-Ratzka, K. Benedyczak, S. Bergmann, R. Breu, J. M. Daivandy, B. Demuth, A. Eifer, A. Giesler, B. Hagemeier, S. Holl, V. Huber, N. Lamla, D. Mallmann, A. S. Memon, M. S. Memon, M. Rambadt, M. Riedel, M. Romberg, B. Schuller, T. Schlauch, A. Schreiber, T. Soddemann, W. Ziegler
UNICORE 6 - Recent and Future Advancements
Annals of Telecommunications, Springer, Volume 65, Issue 11-12, pp. 757-762, 2010
[ DOI ] [ Juelich ]
Abstract:
UNICORE is a European Grid Technology with more than 10 years of history. Originating from the Supercomputing domain, the latest version UNICORE 6 has turned into a general-purpose Grid technology that follows established standards and offers a rich set of features to its users. The paper starts with an architectural insight into UNICORE 6, highlighting the workflow features, standards and the different clients. Next, the current state of advancement is presented by describing recent developments. The paper closes with an outlook on future planned developments.
-
M. Riedel, W. Frings, Th. Eickermann, S. Habbinga, P. Gibbon, D. Mallmann, A. Streit, F. Wolf, Th. Lippert
Collaborative Interactivity in Parallel HPC Applications
Proceedings of the Instrumenting the Grid (InGrid) 2008 Workshop, Island of Ischia, Italy, published in F. Davoli et al. (eds.), Remote Instrumentation and Virtual Laboratories, Springer, pp. 249-262, 2010
[ DOI ] [ Juelich ]
Abstract:
Large-scale scientific research has relied for the past several years on the collaborative use of massive computational power, fast networks, and large storage capacities provided by e-science infrastructures (e.g. DEISA, EGEE, etc.). Especially within e-science infrastructures driven by high-performance computing (HPC) such as DEISA, collaborative online visualization and computational steering (COVS) has become an important technique to provide HPC applications with interactivity and visualized feedback mechanisms. In earlier work we have shown a prototype COVS technique implementation based on the Visualization Interface Toolkit (VISIT) and the Grid middleware of DEISA, UNICORE (Uniform Interface to Computing Resources). Since then the approach has grown into a broader COVS framework. More recently, we investigated the impact of using the computational steering capabilities of the COVS framework implementation in UNICORE on large-scale HPC systems (i.e. an IBM Blue Gene/P with 65536 processors) and the use of attribute-based authorization. In this paper we emphasize the improved collaborative features of the COVS framework and present new insights into how we deal with dynamic management of n participants, transparency of Grid resources, and virtualization of end-user hosts. We also show that our interactive approach to HPC systems fully supports the single sign-on feature required in Grid and e-science infrastructures.
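The "dynamic management of n participants" mentioned above can be pictured with a toy session registry. This is our own illustration under assumed names, not code from the COVS framework itself:

```python
# Toy illustration (not COVS code): a session registry that admits and
# removes visualization participants dynamically, as a collaboration
# server must do for n end-users joining and leaving a steering session.
class CovsSession:
    def __init__(self, simulation_id):
        self.simulation_id = simulation_id
        self.participants = {}          # user_id -> display endpoint

    def join(self, user_id, endpoint):
        # In the real framework, admission rides on the Grid
        # middleware's single sign-on credentials.
        self.participants[user_id] = endpoint

    def leave(self, user_id):
        self.participants.pop(user_id, None)

    def broadcast(self, frame):
        # Deliver a visualization update to every connected participant.
        return [(ep, frame) for ep in self.participants.values()]

session = CovsSession("bluegene-run-42")
session.join("alice", "vis-host-a:7000")
session.join("bob", "vis-host-b:7000")
print(session.broadcast("timestep-001"))
```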
-
M. Riedel, A. Streit, D. Mallmann, F. Wolf, and Th. Lippert
Experiences and Requirements for Interoperability between HTC- and HPC-driven e-Science Infrastructures
Proceedings of the Korea e-Science All Hands Meeting, Workshop HPC for e-Science, Daejeon, Korea, 2008, published in: Future Application and Middleware Technology on e-Science, Ok-Hwan Byeon, Jang Hyuk Kwon, Thom Dunning, Kum Won Cho and Aurore Savoy-Navarro (Editors), Springer, ISBN 978-1-4419-1724-9, pp. 113-123, 2010
[ DOI ] [ Juelich ]
Abstract:
Recently, more and more e-science projects require resources in more than one production e-science infrastructure, especially when using HTC and HPC concepts together in one scientific workflow. But the interoperability of these infrastructures is still not seamlessly provided today, and we argue that this is due to the absence of a realistically implementable reference model in Grids. Therefore, the fundamental goal of this paper is to identify requirements that allow for the definition of the core building blocks of an interoperability reference model that represents a trimmed-down version of OGSA in terms of functionality and is less complex, more fine-grained, and thus easier to implement. The identified requirements are underpinned with experience gained from world-wide interoperability efforts.
-
S. Holl, M. Riedel, B. Demuth, M. Romberg and A. Streit
Supporting Scientific Biological Applications with Seamless Database Access in Interoperable E-Science Infrastructures
Proceedings of the First International Conference on Bioinformatics, Valencia, Spain, 2010, ed.: A.L.N. Fred, J. Filipe, H. Gamboa, INSTICC Press, ISBN 978-989-674-019-1, pp. 203 - 206, 2010
[ INSTICC ] [ Juelich ]
Abstract:
In the last decade, scientific biological applications have become very well integrated into e-Science environments; they are typically very demanding in terms of both computational capabilities and data capacities. Data capacities are particularly important in bioinformatics, since large amounts of biological data already exist in all biological areas and much more will be produced. This data needs to be analysed and evaluated with the help of biological applications, for which computational capabilities are in turn very important. The data is typically stored in biological databases, and recently also in private biological databases. Furthermore, researchers would like to use these databases collaboratively across interoperable e-Science environments. To provide scientists with centralized access to distributed computational resources and these database resources, this paper presents the development of database access within a graphical client, based on the infrastructure interoperability reference model. This approach is fundamental, as most graphical clients lack support for easy access to, and combined usage of, both computational and database resources.
-
W. Gentzsch, A. Kennedy, H. Lederer, G. Pringle, J. Reetz, M. Riedel, B. Schuller, A. Streit, and J. Wolfrat
DEISA: e-Science in a Collaborative, Secure, Interoperable and User-Friendly Environment
Proceedings of e-Challenges 2010 Conference, Warsaw, Poland, ISBN: 978-1-905824-20-5, pp. 1-10, 2010
[ IEEE ]
Abstract:
The paper presents recent results and current research of the DEISA project, with a focus on major contributions to the field of e-Infrastructures, collaborative working environments, security and identity management, interoperability and standardisation, and high performance computing applications. The aim is to focus on research areas and results so far not published in such detail elsewhere. In addition, we describe three examples of high performance computing applications which strongly benefitted from running on the DEISA infrastructure, as represented by the DEISA Extreme Computing Initiative (DECI) or, more recently, by so-called virtual communities and their scientific endeavours.
-
M. S. Memon, M. Riedel, A. S. Memon, F. Wolf, A. Streit, Th. Lippert, M. Plociennik, M. Owsiak, D. Tskhakaya, Ch. Konz
Lessons Learned From Jointly Using HTC- and HPC-driven e-Science Infrastructures in Fusion Science
2010 International Conference on Information and Emerging Technologies (ICIET), Karachi, Pakistan, ISBN 978-1-4244-8001-2, pp. 1-6, 2010
[ IEEE ]
Abstract:
The interoperability of e-Science infrastructures like DEISA/PRACE and EGEE/EGI is increasingly in demand for a wide variety of cross-Grid applications, but interoperability based on common open standards adopted by Grid middleware is only starting to emerge and is not broadly provided today. In earlier work, we have shown how refined open standards form a reference model, based on careful academic analysis of lessons learned from production cross-Grid applications that require access to both High Throughput Computing (HTC) and High Performance Computing (HPC) resources. This paper provides insights into several concepts of this reference model, with a particular focus on findings from using HPC and HTC resources with the fusion application BIT1 and with a cross-infrastructure workflow based on the HELENA and ILSA fusion applications. Based on lessons learned over years of production and experimental interoperability setups between production Grids like EGEE, DEISA, and NorduGrid, we illustrate how open Grid standards (e.g. OGSA-BES, JSDL, GLUE2, etc.) can be used to overcome several limitations of the production architecture of the EUFORIA framework, paving the way to a more standards-based and thus more maintainable and efficient solution.
-
M. Riedel, A.S. Memon, M.S. Memon, S. Holl, D. Mallmann, N. Lamla, A. Streit, Th. Lippert
The Key Role of the UNICORE Technology in European Distributed Computing Infrastructures Supporting e-Science Applications in the Decades to Come
Proceedings of the UNICORE Summit 2010, Jülich, Germany, ed.: A. Streit, M. Romberg, D. Mallmann, Forschungszentrum Jülich, 2010, IAS Series Vol. 5., ISBN 978-3-89336-661-3, pp. 83 - 94, 2010
[ IAS Series ] [ Juelich ]
Abstract:
The interoperability of worldwide distributed computing infrastructures (i.e. EGI, PRACE, NAREGI, TeraGrid, etc.) represents an increasing demand of real e-science applications, but interoperability based on common open standards adopted by Grid technologies is only starting to emerge on these infrastructures and is not broadly provided today. In earlier work we have shown how emerging open standards can be improved by lessons learned from cross-infrastructure applications that require access to both High Throughput Computing (HTC) and High Performance Computing (HPC) resources. This contribution provides more insights into several concepts that promote interoperability, with a focus on the UNICORE technology in general and its key role in satisfying the demands of e-scientists and their cross-infrastructure applications in European computing, as well as in data-driven science in particular.
-
A. Streit, S. Bergmann, R. Breu, J. Daivandy, B. Demuth, A. Giesler, B. Hagemeier, S. Holl, V. Huber, D. Mallmann, A.S. Memon, M.S. Memon, R. Menday, M. Rambadt, M. Riedel, M. Romberg, B. Schuller, and Th. Lippert
UNICORE 6 - A European Grid Technology
L. Grandinetti, G. Joubert, W. Gentzsch (Eds.), Trends in High Performance and Large Scale Computing, IOS Press, Advances in Parallel Computing Vol. 18, ISBN 978-1-60750-073-5, pp. 157 - 173, 2010
[ DOI ] [ Juelich ]
Abstract:
This paper is about UNICORE, a European Grid Technology with more than 10 years of history. Originating from the Supercomputing domain, the latest version UNICORE 6 has matured into a general-purpose Grid technology that follows established Grid and Web services standards and offers a rich set of features to its users. An architectural insight into UNICORE is given, highlighting the workflow features as well as the different client options. The paper closes with a set of example use cases and e-infrastructures where the UNICORE technology is used today.
2009
M. Riedel, E. Laure, T. Soddemann, L. Field, J.P. Navarro, J. Casey, M. Litmaath, J.P. Baud, B. Koblitz, C. Catlett, D. Skow, C. Zheng, PM. Papadopoulos, M. Katz, N. Sharma, O. Smirnova, B. Kónya, P. Arzberger, F. Würthwein, A.S. Rana, T. Martin, M. Wan, V. Welch, T. Rimovsky, S. Newhouse, A. Vanni, Y. Tanaka, Y. Tanimura, T. Ikegami, D. Abramson, C. Enticott, G. Jenkins, R. Pordes, S. Timm, G. Moont, M. Aggarwal, D. Colling, O. van der Aa, A. Sim, V. Natarajan, A. Shoshani, J. Gu, G. Galang, R. Zappi, L. Magnoni, V. Ciaschini, M. Pace, V. Venturi, M. Marzolla, P. Andreetto, B. Cowles, S. Wang, Y. Saeki, H. Sato, S. Matsuoka, P. Uthayopas, S. Sriprayoonsakul, O. Koeroo, M. Viljoen, L. Pearlman, S. Pickles, D. Wallom, G. Moloney, J. Lauret, J. Marsteller, P. Sheldon, S. Pathak, S. De Witt, J. Mencák, J. Jensen, M. Hodges, D. Ross, S. Phatanapherom, G. Netzer, A.R. Gregersen, M. Jones, S. Chen, P. Kacsuk, A. Streit, D. Mallmann, F. Wolf, T. Lippert, T. Delaitre, E. Huedo, N. Geddes
Interoperation of World-Wide Production e-Science Infrastructures
Concurrency and Computation: Practice and Experience, Vol. 21, 8, pp. 961-990, 2009
[ DOI ] [ Juelich ]
Abstract:
Many production Grid and e-Science infrastructures have begun to offer services to end-users during the past several years with an increasing number of scientific applications that require access to a wide variety of resources and services in multiple Grids. Therefore, the Grid Interoperation Now—Community Group of the Open Grid Forum—organizes and manages interoperation efforts among those production Grid infrastructures to reach the goal of a world-wide Grid vision on a technical level in the near future. This contribution highlights fundamental approaches of the group and discusses open standards in the context of production e-Science infrastructures.
-
M. Riedel, F. Wolf, D. Kranzlmüller, A. Streit, T. Lippert
Research Advances by Using Interoperable e-Science Infrastructures - The Infrastructure Interoperability Reference Model Applied in e-Science
Cluster Computing, Vol. 12, 4, pp. 357-372, 2009
[ DOI ] [ Juelich ]
Abstract:
Computational simulations, and thus scientific computing, are the third pillar alongside theory and experiment in today's science. The term e-science evolved as a new research field that focuses on collaboration in key areas of science using next generation computing infrastructures (i.e. so-called e-science infrastructures) to extend the potential of scientific computing. During the past years, significant international and broader interdisciplinary research has increasingly been carried out by global collaborations that often share a single e-science infrastructure. More recently, the increasing complexity of e-science applications that embrace multiple physical models (i.e. multi-physics) and consider a larger range of scales (i.e. multi-scale) is creating a steadily growing demand for world-wide interoperable infrastructures that allow for new innovative types of e-science by jointly using different kinds of e-science infrastructures. But interoperable infrastructures are still not seamlessly provided today, and we argue that this is due to the absence of a realistically implementable infrastructure reference model. Therefore, the fundamental goal of this paper is to provide insights into our proposed infrastructure reference model, which represents a trimmed-down version of OGSA in terms of functionality and complexity while being more specific and thus easier to implement. The proposed reference model is underpinned with experiences gained from e-science applications that achieve research advances by using interoperable e-science infrastructures.
-
M. Riedel and G. Terstyanszky
Grid Interoperability for e-Research
Special Issue 'Interoperability' of the Journal of Grid Computing, Vol. 7, No. 3, pp. 285-286, 2009
[ DOI ] [ Juelich ]
Abstract:
Scientific computing is the third pillar alongside theory and experiment in science and engineering today. The term e-science evolved as a new research field that focuses on collaboration in key areas of science using next generation computing infrastructures such as Grids to extend the potential of scientific computing. More recently, the increasing complexity of e-science applications that embrace multiple physical models (i.e. multi-physics) and consider a larger range of scales (i.e. multi-scale) is creating a steadily growing demand for world-wide interoperable Grid infrastructures that allow for new innovative types of e-science by jointly using a broader variety of computational resources. Since such interoperable Grid infrastructures are still not seamlessly provided today, the topic 'Grid interoperability' has emerged as a broader research field in the last couple of years. This journal special issue highlights selected contributions to the greater research field of Grid interoperability in general and provides an interesting set of information about world-wide projects that work in this particular research field. It thus represents a good supplement to the proceedings of the International Grid Interoperability and Interoperation Workshops (IGIIW) that we have organized in the past.
-
D. Becker, M. Riedel, A. Streit, F. Wolf
Grid-Based Workflow Management for Automatic Performance Analysis of Massively Parallel Applications
Proceedings of the 3rd CoreGRID Workshop on Grid Middleware, 2008, Barcelona, Spain / ed.: N. Meyer, D. Talia, R. Yahyapour. - Springer, 2009, ISBN 978-0-387-85965-1, pp. 103 - 118, 2009
[ DOI ] [ Juelich ]
Abstract:
Many Grid infrastructures have begun to offer services to end-users during the past several years, with an increasing number of complex scientific applications and software tools that require seamless access to different Grid resources via Grid middleware within one workflow. End-users of the rather HPC-driven DEISA Grid infrastructure not only take advantage of Grid workflow management capabilities for massively parallel applications to solve critical problems of high complexity (e.g. protein folding, global weather prediction), but also leverage software tools to achieve satisfactory application performance on contemporary massively parallel machines (e.g., IBM Blue Gene/P). In this context, event tracing is a technique widely used by software tools, with a broad spectrum of applications ranging from performance analysis, performance prediction and modeling to debugging. In particular, automatic performance analysis has emerged as a powerful and robust instrument to make the optimization of parallel applications both more effective and more efficient. Automatic performance analysis involves multiple steps that can naturally leverage the workflow capabilities in Grids. In this paper, we present how this approach is implemented using the workflow management capabilities of the UNICORE Grid middleware, which is deployed on DEISA, and demonstrate with a Grid application that the approach taken is feasible.
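As a rough illustration of why automatic performance analysis maps onto Grid workflows, the following sketch chains the typical steps sequentially; the step names are our own simplification, not the paper's concrete UNICORE workflow definition:

```python
# Hypothetical sketch of the multi-step structure that automatic
# performance analysis maps onto Grid workflow capabilities; step names
# are illustrative only.
steps = [
    ("instrument", "insert measurement probes into the parallel code"),
    ("run", "execute the instrumented application, recording event traces"),
    ("analyze", "search the traces automatically for inefficiency patterns"),
    ("report", "stage the analysis result back to the end-user"),
]

def run_workflow(steps):
    # Each step starts only after its predecessor finishes -- exactly the
    # dependency pattern a Grid workflow engine enforces.
    for name, action in steps:
        print(f"[{name}] {action}")

run_workflow(steps)
```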
-
M.S. Memon, A.S. Memon, M. Riedel, A. Streit, F. Wolf
Enabling Grid Interoperability by Extending HPC-driven Job Management with an Open Standard Information Model
Eighth IEEE/ACIS International Conference on Computer and Information Science (ICIS), Shanghai, China, ISBN 978-0-7695-3641-5, pp. 506 - 511, 2009
[ IEEE ] [ Juelich ]
Abstract:
Many e-science applications already take advantage of numerous e-science infrastructures that have evolved differently over the last couple of years. Along with this evolution, we still observe slow adoption of the Open Grid Services Architecture (OGSA) concept, and thus interoperability between these infrastructures is still not seamlessly provided today. We argue that this is due to the absence of a realistically implementable reference model in Grids. In this contribution, we present our approach as one element of this reference model, focusing on the missing link between two emerging standards in the fields of job management and information models in order to facilitate common open standards-based Grid interoperability.
-
M. Riedel, A. Streit, Th. Lippert, F. Wolf, D. Kranzlmüller
Concepts and Design of an Interoperability Reference Model for Scientific- and Grid Computing Infrastructures
Proceedings of the Applied Computing Conference, in Mathematical Methods and Applied Computing, Volume II, WSEAS Press, ISBN 978-960-474-124-3, pp. 691 - 698, 2009
[ ACM ]
Abstract:
Many production Grid and e-science infrastructures have been offering their broad range of resources via services to end-users for the past several years, with an increasing number of scientific applications that require access to a wide variety of resources and services in multiple Grids. But the vision of world-wide federated Grid infrastructures, in analogy to the electrical power grid, is still not seamlessly realized today. This is partly because Grids provide a much greater variety of services (job management, data management, data transfer, etc.) than the electrical power grid, but also because the emerging open standards still need to be improved for production usage. This paper points exactly to these improvements with a well-defined design of an infrastructure interoperability reference model based on open standards that are refined with experience gained from production Grid interoperability use cases. This contribution gives insights into the core building blocks in general, but focuses in particular on the computing building blocks of the reference model.
-
F. Hedman, M. Riedel, P. Mucci, G. Netzer, A. Gholami, M. S. Memon, A. S. Memon, Z. A. Shah
Benchmarking of Integrated OGSA-BES with the Grid Middleware
Proceedings of 4th UNICORE Summit 2008, Springer Lecture Notes in Computer Science (LNCS) 5415, Euro-Par 2008 workshop proceedings, pp. 113-122, 2009
[ DOI ] [ Juelich ]
Abstract:
This paper evaluates the performance of the emerging OGF standard OGSA Basic Execution Service (BES) on three fundamentally different Grid middleware platforms: UNICORE 5/6, Globus Toolkit 4, and gLite. The particular focus is on the OGSA-BES implementation of UNICORE 6. A comparison is made with baseline measurements for UNICORE 6 and Globus Toolkit 4 using the legacy job submission interfaces. Our results show that the BES components are comparable in performance to the existing legacy interfaces. We also have a strong indication that other factors, attributable to the supporting infrastructure, have a bigger impact on performance than the BES components.
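The measurement pattern behind such a benchmark can be sketched as follows; `submit_job` is a hypothetical stand-in for a real OGSA-BES or legacy submission call, and the endpoint URL is invented:

```python
# Hedged sketch of the benchmarking pattern: submit N no-op jobs against
# a job-submission endpoint and record wall-clock latency per submission.
import time
import statistics

def submit_job(endpoint):
    # Placeholder for a CreateActivity (or legacy) call to the endpoint.
    time.sleep(0.01)

def benchmark(endpoint, n=50):
    latencies = []
    for _ in range(n):
        t0 = time.perf_counter()
        submit_job(endpoint)
        latencies.append(time.perf_counter() - t0)
    return statistics.mean(latencies), statistics.stdev(latencies)

mean, stdev = benchmark("https://unicore6.example.org:8080/bes")
print(f"mean={mean * 1000:.1f} ms  stdev={stdev * 1000:.1f} ms")
```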
-
S. Holl, M. Riedel, B. Demuth, M. Romberg, A. Streit, V. Kasam
Life Science Application Support in an Interoperable E-Science Environment
Proceedings of IEEE Computer-based Medical Systems (CBMS) 2009, special Track: Healthgrid Computing - Applications to Biomedical Research and Healthcare, pp. 1 - 8, 2009
[ IEEE ] [ Juelich ]
Abstract:
In the last decade, life science applications have become more and more integrated into e-Science environments; they are typically very demanding, both in terms of computational capabilities and data capacities. Especially the access to life science applications embedded in such environments via Grid clients still constitutes a major hurdle for scientists who do not have an IT background. Life science applications often comprise a whole set of small programs instead of a single executable. Many of the graphical Grid clients are not perfectly suited for these types of applications, as they often assume that Grid jobs will run a single executable instead of a set of chained executions (i.e. sequences). This means that in order to execute a sequence of multiple programs on a single Grid resource, piping data from one program to the next, the user would have to run a hand-written shell script. Otherwise each program is independently scheduled as a Grid job, which causes unnecessary file transfers between the jobs, even if they are scheduled on the same resource. We present a generic solution to this problem and provide a reference implementation that seamlessly integrates with the Grid middleware UNICORE. Our approach focuses on a comfortable user interface for the creation of such program sequences, validated in UNICORE-driven HPC-based Grids. We applied our approach to provide support for the usage of the AMBER package (a widely-used collection of programs for molecular dynamics simulations) within Grid workflows. We finally provide a scientific use case of our approach that leverages the interoperability of two different scientific infrastructures and represents an instance of the infrastructure interoperability reference model.
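The core problem the paper addresses, running chained programs on one resource while piping data between them, looks like this at the process level on a Unix-like system; `echo` and `tr` merely stand in for AMBER-style preparation and simulation steps:

```python
# Two chained programs on the same resource, piping data directly,
# instead of scheduling each as a separate Grid job with file transfers.
import subprocess

p1 = subprocess.Popen(["echo", "input coordinates"], stdout=subprocess.PIPE)
p2 = subprocess.Popen(["tr", "a-z", "A-Z"], stdin=p1.stdout,
                      stdout=subprocess.PIPE)
p1.stdout.close()  # allow p1 to receive SIGPIPE if p2 exits first
output, _ = p2.communicate()
print(output.decode())
```

The client described in the paper generates exactly such sequences for the user, so no hand-written shell script is needed.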
2008
M. Riedel, A. Streit, F. Wolf, T. Lippert, D. Kranzlmüller
Classification of Different Approaches for e-Science Applications in Next Generation Computing Infrastructures
Proceedings of the 4th IEEE Conference on e-Science, Indianapolis, USA, pp. 198-205, 2008
[ DOI ] [ Juelich ]
Abstract:
Simulation, and thus scientific computing, is the third pillar alongside theory and experiment in today's science and engineering. The term e-science evolved as a new research field that focuses on collaboration in key areas of science using next generation infrastructures to extend the powers of scientific computing. This paper contributes to the field of e-science as a study of how scientists actually work within currently existing Grid and e-science infrastructures. Across numerous different scientific applications, we identified several common approaches with similar characteristics in different domains. These approaches are described together with a classification of how to perform e-science in next generation infrastructures. The paper is thus a survey that provides an overview of the e-science research domain.
-
M. Riedel, A.S. Memon, M.S. Memon, D. Mallmann, A. Streit, F. Wolf, Th. Lippert, V. Venturi, P. Andreetto, M. Marzolla, A. Ferraro, A. Ghiselli, F. Hedman, Zeeshan A. Shah, J. Salzemann, A. Da Costa, V. Breton, V. Kasam, M. Hofmann-Apitius, D. Snelling, S. van de Berghe, V. Li, S. Brewer, A. Dunlop, N. De Silva
Improving e-Science with Interoperability of the e-Infrastructures EGEE and DEISA
Proceedings of the 31st International Convention MIPRO, Conference on Grid and Visualization Systems (GVS), Opatija, Croatia, Croatian Society for Information and Communication Technology, Electronics and Microelectronics, ISBN 978-953-233-036-6, pp. 225 - 231, 2008
[ CiteSeerX ] [ Juelich ]
Abstract:
In the last couple of years, many e-Science infrastructures have begun to offer production services to e-Scientists with an increasing number of applications that require access to different kinds of computational resources. Within Europe, two rather different multi-national e-Science infrastructures evolved over time, namely the Distributed European Infrastructure for Supercomputing Applications (DEISA) and Enabling Grids for E-SciencE (EGEE). DEISA provides access to massively parallel systems such as supercomputers that are well suited for scientific applications requiring many interactions between their typically high numbers of CPUs. EGEE, on the other hand, provides access to a world-wide Grid of university clusters and PC pools that are well suited for farming applications that require few or even no interactions between the distributed CPUs. While DEISA uses the HPC-driven Grid technology UNICORE, EGEE is based on the gLite Grid middleware optimized for farming jobs. Both had little adoption of open standards, and therefore the two systems are technically non-interoperable, which means that no e-Scientist can easily leverage the DEISA and EGEE infrastructures with one suitable client environment for scientific applications. This paper argues that future interoperability of such large e-Science infrastructures is required to improve e-Science in general and to increase the real scientific impact of world-wide Grids in particular. We discuss the interoperability achieved by the OMII-Europe project, which fundamentally improved the interoperability between UNICORE and gLite by using open standards. We also outline one specific scientific scenario of the WISDOM initiative that actually benefits from the recently established interoperability.
-
M. Riedel, D. Mallmann, A. Streit
Usage of the UNICORE Grid Technology in Scientific and Economic Domains
In Proceedings of the Workshop on Grid and Scientific and Engineering Applications (Grid&SEA) at Applications of Mathematics in Engineering and Economics (AMEE), Sozopol, Bulgaria, AIP Conference Proceedings 1067, pp. 559-566, 2008
[ DOI ] [ Juelich ]
Abstract:
In the past years, many scientific and economic applications from various domains have taken advantage of Grid infrastructures that share storage or computational resources such as supercomputers or clusters across multiple organizations. Especially within Grid infrastructures driven by high-performance computing (HPC) such as the Distributed European Infrastructure for Supercomputing Applications (DEISA), the UNICORE Grid middleware has become an important tool for seamless access to distributed resources, providing strong security and workflow capabilities. This paper highlights different usage models of UNICORE from a wide variety of scientific applications and provides insights into approaches in economic scenarios with UNICORE.
-
M. Riedel, W. Frings, S. Habbinga, Th. Eickermann, D. Mallmann, A. Streit, F. Wolf, Th. Lippert
Extending the Collaborative Online Visualization and Steering Framework for Computational Grids with Attribute-based Authorization
In Proceedings of the 9th IEEE/ACM International Conference on Grid Computing (Grid 2008), Tsukuba, Japan, pp. 104 - 111, 2008
[ IEEE ] [ Juelich ]
Abstract:
Especially within grid infrastructures driven by high-performance computing (HPC), collaborative online visualization and steering (COVS) has become an important technique to dynamically steer the parameters of a parallel simulation or to just share the outcome of simulations via visualizations with geographically dispersed collaborators. In earlier work, we have presented a COVS framework reference implementation based on the UNICORE grid middleware used within DEISA. This paper lists current limitations of the COVS framework design and implementation related to missing fine-grained authorization capabilities that are required during collaborative COVS sessions. Such capabilities use end-user information about roles, project membership, or participation in a dedicated virtual organization (VO). We outline solutions and present a design and implementation of our architecture extension that uses attribute authorities such as the recently developed virtual organization membership service (VOMS) based on the security assertion markup language (SAML).
-
M. Riedel, B. Schuller, D. Mallmann, R. Menday, A. Streit, B. Tweddell, M.S. Memon, A.S. Memon, B. Demuth, Th. Lippert, D. Snelling, S. van den Berghe, V. Li, M. Drescher, A. Geiger, G. Ohme, K. Benedyczak, P. Bala, R. Ratering, A. Lukichev
Web Services Interfaces and Open Standards Integration into the European UNICORE 6 Grid Middleware
In Proceedings of 2007 Middleware for Web Services (MWS 2007) Workshop at 11th International IEEE EDOC Conference "The Enterprise Computing Conference", 2007, Annapolis, USA, IEEE Computer Society, ISBN 978-0-7695-3338-4, pp. 57-60, 2008
[ IEEE ] [ Juelich ]
Abstract:
The UNICORE grid system provides seamless, secure and intuitive access to distributed grid resources. In recent years, UNICORE 5 has been used as a well-tested grid middleware system in production grids (e.g. DEISA, D-Grid) and at many supercomputer centers world-wide. Beyond this production usage, UNICORE serves as a solid basis in many European and international research projects and business scenarios from T-Systems, Philips Research, Intel, Fujitsu and others. To foster ongoing developments in multiple projects, UNICORE is available as open source under the BSD license at SourceForge. More recently, the new Web services-based UNICORE 6 has become available, which is based on open standards such as Web Services Addressing (WS-A) and the Web Services Resource Framework (WS-RF) and thus conforms to the Open Grid Services Architecture (OGSA) of the Open Grid Forum (OGF). In this paper we present the evolution from the production UNICORE 5 to the open standards-based UNICORE 6 and its various Web services-based interfaces. The paper describes the integration of emerging open standards such as OGSA-BES and OGSA-RUS and thus provides an overview of UNICORE 6.
-
V. Venturi, M. Riedel, A.S. Memon, M.S. Memon, F. Stagni, B. Schuller, D. Mallmann, B. Tweddell, A. Gianoli, V. Ciaschini, S. van de Berghe, D. Snelling, A. Streit
Using SAML-based VOMS for Authorization within Web Services-based UNICORE Grids
In Proceedings of 3rd UNICORE Summit 2007 in Springer Lecture Notes in Computer Science (LNCS) 4854, Euro-Par 2007 Workshops: Parallel Processing, pp.112-120, 2008
[ DOI ] [ Juelich ]
Abstract:
In recent years, the Virtual Organization Membership Service (VOMS) emerged within Grid infrastructures, providing the dynamic, fine-grained access control needed to enable resource sharing across Virtual Organizations (VOs). VOMS makes it possible to manage authorization information in a VO scope to enforce agreements established between VOs and resource owners. VOMS is used for authorization in the EGEE and OSG infrastructures and is a core component of the respective middleware stacks gLite and VDT. While a module for supporting VOMS is also available as part of the authorization service of the Globus Toolkit, there is currently no support for VO-level authorization within the new Web services-based UNICORE 6. This paper describes the evolution of VOMS towards an open-standards-compliant service based on the Security Assertion Markup Language (SAML), which in turn provides mechanisms to fill the VO-level authorization service gap within Web services-based UNICORE Grids. In addition, the SAML-based VOMS allows for cross-middleware VO management through open standards.
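To make the mechanism concrete, the following sketch builds the skeleton of a SAML attribute assertion of the kind a SAML-based attribute authority issues; the namespace is the real SAML 2.0 one, but the subject, attribute name, and VO value are invented, and a real assertion would additionally carry an issuer, conditions, and an XML signature:

```python
# Skeleton of a SAML attribute assertion carrying VO membership;
# all concrete values below are invented for illustration.
import xml.etree.ElementTree as ET

SAML = "urn:oasis:names:tc:SAML:2.0:assertion"
ET.register_namespace("saml", SAML)

assertion = ET.Element(f"{{{SAML}}}Assertion")
subj = ET.SubElement(assertion, f"{{{SAML}}}Subject")
ET.SubElement(subj, f"{{{SAML}}}NameID").text = "CN=Jane Doe,O=Example Org"

# The attribute statement is what a relying service (e.g. a Grid site's
# authorization component) evaluates to grant or deny access.
stmt = ET.SubElement(assertion, f"{{{SAML}}}AttributeStatement")
attr = ET.SubElement(stmt, f"{{{SAML}}}Attribute", Name="VOMemberOf")
ET.SubElement(attr, f"{{{SAML}}}AttributeValue").text = "/biomed/Role=researcher"

print(ET.tostring(assertion, encoding="unicode"))
```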
2007
M. Riedel, B. Schuller, A. Streit
The European UNICORE 6 Grid Middleware
Inside, Vol. 5, 1, pp. 32-35, 2007
[ Online ] [ Juelich ]
Abstract:
The UNICORE Grid system provides seamless, secure and intuitive access to distributed Grid resources such as supercomputers, clusters, and large server farms. In recent years, UNICORE 5 has been used as a well-tested Grid middleware in scientific production Grids (e.g. DEISA, D-Grid) and for business use cases (e.g. T-Systems, Philips Research). In addition, UNICORE serves as a solid basis in many European and international research projects that use existing UNICORE components to implement advanced features, higher-level services, and support for scientific and business applications from a growing range of domains. More recently, the new Web services-based UNICORE 6 has become available in beta state. It is based on common open standards that have emerged from various standardization bodies such as OASIS (Organization for the Advancement of Structured Information Standards) and OGF (Open Grid Forum).
-
M. Riedel, Th. Eickermann, S. Habbinga, W. Frings, P. Gibbon, D. Mallmann, A. Streit, Th. Lippert, F. Wolf, W. Schiffmann, A. Ernst, R. Spurzem, W.E. Nagel
Computational Steering and Online Visualization of Scientific Applications on Large-Scale HPC Systems within e-Science Infrastructures
In Proceedings of 3rd IEEE International Conference on e-Science and Grid Computing, Bangalore, India, IEEE Computer Society, ISBN 0-7695-3064-8, pp. 483-490, 2007
[ IEEE ] [ Juelich ]
Abstract:
In the past several years, many scientific applications from various domains have taken advantage of e-science infrastructures that share storage or computational resources such as supercomputers, clusters or PC server farms across multiple organizations. Especially within e-science infrastructures driven by high-performance computing (HPC) such as DEISA, online visualization and computational steering (COVS) has become an important technique to save compute time on shared resources by dynamically steering the parameters of a parallel simulation. This paper argues that future supercomputers in the Petaflop/s performance range with up to 1 million CPUs will create an even stronger demand for seamless computational steering technologies. We discuss upcoming challenges for the development of scalable HPC applications and the limits of future storage/IO technologies in the context of next generation e-science infrastructures and outline potential solutions.
-
M. Riedel, Th. Eickermann, W. Frings, S. Dominiczak, D. Mallmann, Th. Düssel, A. Streit, P. Gibbon, F. Wolf, W. Schiffmann, Th. Lippert
Design and Evaluation of a Collaborative Online Visualization and Steering Framework Implementation for Computational Grids
In Proceedings of the 8th IEEE/ACM International Conference on Grid Computing (Grid 2007), Austin, Texas, ISBN 1-4244-1560-8, pp.169 - 177, 2007
[ IEEE ] [ Juelich ]
Abstract:
Today's large-scale scientific research often relies on the collaborative use of a Grid or e-Science infrastructure (e.g. DEISA, EGEE, TeraGrid, OSG) with computational, storage, or other types of physical resources. One of the goals of these emerging infrastructures is to support the work of scientists with advanced problem-solving tools. Many e-Science applications within these infrastructures aim at simulations of a scientific problem on powerful parallel computing resources. Typically, a researcher first performs a simulation for some fixed amount of time and then analyses the results in a separate post-processing step, for instance by viewing them as visualizations. In earlier work we have described early prototypes of a Collaborative Online Visualization and Steering (COVS) Framework in Grids that performs both simulation and visualization at the same time (online) to increase the efficiency of e-Scientists. This paper evaluates the evolved, mature reference implementation of the COVS framework design that is ready for production usage within Web services-based Grid and e-Science infrastructures.
-
M. Riedel, W. Frings, S. Dominiczak, Th. Eickermann, Th. Düssel, P. Gibbon, D. Mallmann, F. Wolf, and W. Schiffmann
Requirements and Design of a Collaborative Online Visualization and Steering Framework for Grid and e-Science Infrastructures
In Proceedings of German e-Science Conference 2007, Baden-Baden, Germany, 2007
[ CiteSeerX ] [ Juelich ]
Abstract:
Many production e-Science infrastructures (e.g. DEISA, D-Grid) have begun to offer a wide variety of services for end-users during the past several years. Many e-Scientists solve their scientific problems by using parallel computing applications on clusters, and collaborative online visualization and steering (COVS) is known as a tool for analyzing and better understanding these applications. In the absence of a widely accepted COVS framework within Grids, visualizations are often created using proprietary technologies assuming a dedicated scenario. This motivates analyzing the common requirements in order to provide a blueprint for a more general COVS framework that can be integrated into Grid middleware systems such as UNICORE, gLite, or the Globus Toolkit. These requirements lead to a design that was successfully implemented as a higher-level service in UNICORE and presented at numerous venues such as the Open Grid Forum 19 and 20, Euro-Par 2006, Supercomputing 2006, and DEISA trainings.
-
M. Marzolla, P. Andreetto, V. Venturi, A. Ferraro, A.S. Memon, M.S. Memon, B. Tweddell, M. Riedel, D. Mallmann, A. Streit, S. van de Berghe, V. Li, D. Snelling, K. Stamou, Z.A. Shah, F. Hedman
Open Standards-based Interoperability of Job Submission and Management Interfaces across the Grid Middleware Platforms gLite and UNICORE
In Proceedings of the International Grid Interoperability and Interoperation Workshop (IGIIW) 2007 at the 3rd IEEE International Conference on e-Science and Grid Computing, Bangalore, India, IEEE Computer Society, ISBN 0-7695-3064-8, pp. 592-599, 2007
[ IEEE ] [ Juelich ]
Abstract:
In a distributed grid environment with ambitious service demands, the job submission and management interfaces provide functionality of major importance. Emerging e-science and grid infrastructures such as EGEE and DEISA rely on highly available services that are capable of managing scientific jobs. It is the adoption of emerging open standard interfaces which allows grid resources to be distributed in such a way that their actual service implementations or grid technologies are not isolated from each other, especially when these resources are deployed in different e-science infrastructures that consist of different types of computational resources. This paper motivates the interoperability of these infrastructures and discusses solutions. We describe the adoption of various open standards that recently emerged from the Open Grid Forum (OGF) in the field of job submission and management by two well-known grid technologies, namely gLite and UNICORE. This has a fundamental impact on the interoperability between these technologies and thus within the next generation e-science infrastructures that rely on them.
-
M.S. Memon, A.S. Memon, M. Riedel, B. Schuller, D. Mallmann, B. Tweddell, A. Streit, S. van den Berghe, D. Snelling, V. Li, M. Marzolla, P. Andreetto
Enhanced Resource Management Capabilities using Standardized Job Management and Data Access Interfaces within UNICORE Grids
In Proceedings of 3rd Workshop on Scheduling and Resource Management for Parallel and Distributed Systems (SRMPDS), The 13th International Conference on Parallel and Distributed Systems (ICPADS), Hsinchu, Taiwan, IEEE Computer Society, ISBN 978-1-4244-1889-3, pp. 1-6, 2007
[ IEEE ] [ Juelich ]
Abstract:
Many existing Grid technologies and resource management systems lack a standardized job submission interface in Grid environments or e-Infrastructures. Even if the same language for job description is used, the interface for job submission often differs in each of these technologies. The emergence of the standardized Job Submission Description Language (JSDL) as well as the OGSA Basic Execution Services (OGSA-BES) paves the way to improve the interoperability of all these technologies, enabling cross-Grid job submission and better resource management capabilities. In addition, the ByteIO standards provide useful mechanisms for data access that can be used in conjunction with these improved resource management capabilities. This paper describes the integration of these standards into the recently released UNICORE 6 Grid middleware, which is based on open standards such as the Web Services Resource Framework (WS-RF) and WS-Addressing (WS-A).
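The relationship between the two standards can be sketched briefly: an OGSA-BES CreateActivity request simply wraps a JSDL job definition inside an activity document. The namespaces below are the published GGF/OGF ones; the empty job body is a placeholder:

```python
# An OGSA-BES CreateActivity request wrapping a JSDL job definition.
import xml.etree.ElementTree as ET

BES = "http://schemas.ggf.org/bes/2006/08/bes-factory"
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
ET.register_namespace("bes", BES)
ET.register_namespace("jsdl", JSDL)

req = ET.Element(f"{{{BES}}}CreateActivity")
doc = ET.SubElement(req, f"{{{BES}}}ActivityDocument")
jobdef = ET.SubElement(doc, f"{{{JSDL}}}JobDefinition")
ET.SubElement(jobdef, f"{{{JSDL}}}JobDescription")  # JSDL body goes here

# This tree would travel as the SOAP body of the CreateActivity
# operation; the response carries a reference for monitoring and
# controlling the created activity.
print(ET.tostring(req, encoding="unicode"))
```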
-
W. Frings, M. Riedel, A. Streit, D. Mallmann, S. van den Berghe, D. Snelling, and V. Li
LLview: User-Level Monitoring in Computational Grids and e-Science Infrastructures
In Proceedings of German e-Science Conference 2007, Baden-Baden, Germany, 2007
[ CiteSeerX ] [ Juelich ]
Abstract:
Large-scale scientific research often relies on the collaborative use of Grid and e-Science infrastructures that offer a wide variety of Grid resources for scientists. While many production Grid projects and e-Science infrastructures have begun to offer services for the usage of computational resources to end-users during the past several years, the absence of a widely accepted standard for tracing resource usage of Grid users has led to different technologies among the infrastructures. Recently, the Open Grid Forum developed a set of emerging standard specifications, namely the Usage Record Format (URF) and the Resource Usage Service (RUS), that aim to manage and expose usage tracings. In this paper, we present the integration of these standards into the UNICORE Grid middleware, which lays the foundation for valuable tools in the area of accounting and monitoring. We present the development of Grid extensions for the LLview application, which allow monitoring the utilization (e.g. usage of cluster nodes per user) of Grid resources controlled by Grid middleware systems such as UNICORE.
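For illustration, a minimal usage record in the spirit of the OGF Usage Record Format might look as follows; the namespace is the published one, while the user name and resource figures are invented:

```python
# Minimal sketch of a usage record for accounting/monitoring; values
# are invented placeholders.
import xml.etree.ElementTree as ET

URF = "http://schema.ogf.org/urf/2003/09/urf"
ET.register_namespace("urf", URF)

rec = ET.Element(f"{{{URF}}}UsageRecord")
user = ET.SubElement(rec, f"{{{URF}}}UserIdentity")
ET.SubElement(user, f"{{{URF}}}GlobalUserName").text = "CN=Jane Doe,O=Example"
ET.SubElement(rec, f"{{{URF}}}WallDuration").text = "PT3600S"  # ISO 8601 duration
ET.SubElement(rec, f"{{{URF}}}NodeCount").text = "32"

print(ET.tostring(rec, encoding="unicode"))
```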
2006
M. Riedel, R. Menday, A. Streit, and P. Bala
A DRMAA-based Target System Interface Framework for UNICORE
In Proceedings of 2nd Workshop on Scheduling and Resource Management for Parallel and Distributed Systems (SRMPDS), The 12th International Conference on Parallel and Distributed Systems (ICPADS), Minneapolis, IEEE Computer Society, ISBN 0-7695-2612-8, pp. 133 - 138, 2006
[ IEEE ] [ Juelich ]
Abstract:
The UNICORE grid technology provides seamless, secure, and intuitive access to distributed grid resources. UNICORE is a full-grown and well-tested grid middleware system that is used in daily production and research projects worldwide. The success of the UNICORE technology can at least partially be explained by its three-tier architecture. In this paper, we present the evolution of the tier that is mainly used for job and resource management. This evolution integrates the Distributed Resource Management Application API (DRMAA) of the Global Grid Forum, providing UNICORE with a standardized interface to underlying resource management systems and other grid systems.
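As a hedged sketch of what a standardized DRMAA interface buys, the snippet below submits and waits for a job through the third-party `drmaa` Python binding, which wraps a site's DRMAA C library; it assumes such a library is installed and configured, and the command is a placeholder:

```python
# Submit a job and wait for it via DRMAA; the same client code works
# against any resource management system that ships a DRMAA library.
import drmaa

with drmaa.Session() as session:
    jt = session.createJobTemplate()
    jt.remoteCommand = "/bin/sleep"
    jt.args = ["10"]
    job_id = session.runJob(jt)
    # Block until the resource management system reports completion.
    info = session.wait(job_id, drmaa.Session.TIMEOUT_WAIT_FOREVER)
    print(job_id, "exited with status", info.exitStatus)
    session.deleteJobTemplate(jt)
```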
-
M. Riedel and D. Mallmann
Standardization Processes of the UNICORE Grid System
In Proceedings of 1st Austrian Grid Symposium, Schloss Hagenberg, Austria, ISBN:3-85403-210-2, pp. 191 - 203, 2006
[ CiteSeerX ] [ Juelich ]
Abstract:
The UNICORE Grid system has been developed since the late 1990s to support distributed computing applications and emerging Grid infrastructures. Over the years, UNICORE has evolved into a full-grown and well-tested Grid middleware system, which today is used in daily production at many supercomputing centers worldwide. The UNICORE technology also serves as a solid basis in many European and international research projects. In this paper, we present issues surrounding the integration of standards into the UNICORE Grid system. We summarize the principal characteristics of the latest Web services-based Unicore/GS release, which provides significant enhancements in the areas of interoperability, standards compliance and functionality.
-
R. Ratering, A. Lukichev, M. Riedel, D. Mallmann, A. Vanni, C. Cacciari, S. Lanzarini, K. Benedyczak, M. Borcz, R. Kluszcynski, P. Bala, G. Ohme
GridBeans: Support e-Science and Grid Applications
Proceedings of the 2nd IEEE International Conference on e-Science and Grid Computing, Amsterdam, Netherlands, pp. 46-54, 2006
[ DOI ] [ Juelich ]
Abstract:
Large-scale scientific research often relies on the collaborative use of Grid and e-Science infrastructures that provide computational or storage related resources. One of the ideas of these modern infrastructures is to facilitate the routine interaction of scientists and their workflows with advanced problem solving tools and computational resources. While many production Grid projects and e-Science infrastructures have begun to offer services for the usage of resources to end-users during the past several years, the corresponding emerging standards defined by GGF and OASIS still appear to be in flux. In this paper, we present the GridBean technology that bridges the gap between the constantly changing basic Grid or e-Science infrastructures and the need of stable application development environments for the Grid users.
2005
M. Riedel, D. Mallmann, and A. Streit
Enhancing Scientific Workflows with Secure Shell Functionality in UNICORE Grids
Proceedings of the 1st IEEE International Conference on e-Science and Grid Computing, Melbourne, Australia, pp. 132-139, 2005
[ DOI ] [ Juelich ]
Abstract:
The UNICORE grid technology provides seamless, secure and intuitive access to distributed grid resources such as computational or storage-related resources. Its extensible character through application-specific plug-ins and its enhancements developed in various European-funded projects have led to a UNICORE technology that is used in daily production at many supercomputer centers and research facilities world-wide today. In this paper we present an enhancement that provides the dynamic capabilities of a secure shell terminal within the UNICORE grid technology while retaining single sign-on. This enhancement allows the dynamic work behavior of scientists, and existing scientific applications, to be integrated more closely into the usual UNICORE workflow and therefore into collaborative grid environments. As a well-known tool in the scientific community, a secure shell terminal provides the most flexible way of working on remote systems, in a way that no graphical user interface or advanced tooling in grid computing can ever provide.
-
M. Riedel, V. Sander, Ph. Wieder, and J. Shan
Web Services Agreement based Resource Negotiation in UNICORE
Proceedings of the 2005 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'05), Hamid R. Arabnia (ed.), CSREA Press, pp. 31-37, 2005
[ WorldCat ] [ Juelich ]
Abstract:
Service Level Agreements provide the foundation to negotiate a distinct Quality of Service level between the provider and the consumer of a service. Since the Grid community is adopting concepts of Service-Oriented Architectures and Web Services are capturing their place within the Grid landscape, resource management within Grids increasingly evolves towards the management of resources represented as services. To allow for this, the Global Grid Forum is developing the Web Services Agreement specification to support the standardised creation and negotiation of guarantees related to services. This paper illustrates the integration of a Web Services Agreement-based resource management framework into the UNICORE Grid system, a development motivated by the system's transition towards a service-oriented Grid and the limitations of the current solution.
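A toy model of the negotiation building blocks may help: an agreement offer combines service description terms (what is requested) with guarantee terms (what quality of service must be honored), and the responder checks whether it can commit. This is our own simplification, not the WS-Agreement schema:

```python
# Toy model (ours) of a WS-Agreement-style offer and a provider-side
# acceptance check; all names and values are illustrative.
offer = {
    "Name": "batch-run-offer",
    "Context": {"Initiator": "client", "Responder": "provider"},
    "Terms": {
        "ServiceDescriptionTerms": {"cpus": 64, "wall_time_minutes": 120},
        "GuaranteeTerms": [
            {"objective": "start_by", "value": "2005-06-01T09:00:00Z"},
        ],
    },
}

def responder_accepts(offer, free_cpus=128):
    # Accept only if the requested capacity fits the provider's state;
    # a real responder would also evaluate the guarantee terms.
    return offer["Terms"]["ServiceDescriptionTerms"]["cpus"] <= free_cpus

print(responder_accepts(offer))
```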
-
M. Riedel, V. Sander, and M. Fidler
Self-managing Functions in Web Services Agreement-based Autonomic Grids
Proceedings of 1st IEEE/IFIP International Workshop on Autonomic Grid Networking and Management (AGNM`05), 1st International Week on Management of Networks and Services (Manweek05), Barcelona, pp. 11-20, 2005
[ Juelich ]
Abstract:
Recently, the Web Services Agreement work performed within the Global Grid Forum introduced a generic approach that, in principle, provides an abstraction layer for creating distributed service management approaches by managing service level agreements within autonomic resources. Such resources were modelled in a service-oriented way and comprise the functionality to create Multi-Protocol Label Switching tunnels as well as routing and signaling functionality for the establishment of Label Switched Paths. We describe self-management capabilities, such as self-creation, self-healing and self-monitoring, that enable autonomic resources in Grids and thus lay the foundation for distributed network control planes in autonomic Grids and end-to-end management planes for emerging Autonomic Grid Networking.
-
M. Riedel
Object Migration Pattern: Towards Stateful Web Services in Grid Environments
Proceedings of the 3rd IEEE European Conference on Web Services (IEEE ECOWS 2005), Växjö, Sweden, ISBN: 0-7695-2484-2, Online Publication, 2005
[ CiteSeerX ] [ Juelich ]
Abstract:
There is a real demand to migrate existing software architectures and business process implementations towards modern Service-Oriented Architectures. In practice, Web Services are the technology most used to implement such Service-Oriented Architectures. Recently, developments in this area are often combined with the advantages of Grid computing through the use of Open Grid Services Architecture concepts. This leads to stateful Web Services that are quite similar to stateful objects in object-oriented systems. In this paper, we formalize the details of an Object Migration Pattern that describes how an existing object-oriented system can be migrated, through the use of Open Grid Services Architecture concepts, to the completely heterogeneous and distributed systems that characterize modern Grids. This pattern lays the foundation for advanced tooling for the recently proposed Web Services Resource Framework and thus allows an effective use of Grid resources such as supercomputers or clusters via dedicated services.
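The "stateful Web Services" idea the pattern targets can be sketched in a few lines: the service stays stateless while state lives in addressable resources, with an identifier playing the role of a WS-Addressing endpoint reference. This is our illustration, not code from the paper:

```python
# Our sketch of the WS-RF-style separation of service and state: each
# created resource gets an identifier that clients use on later calls.
import uuid

class SimulationService:
    def __init__(self):
        self._resources = {}            # resource-id -> mutable state

    def create_resource(self, params):
        rid = str(uuid.uuid4())         # stands in for an endpoint reference
        self._resources[rid] = {"params": params, "steps_done": 0}
        return rid

    def advance(self, rid, steps):
        self._resources[rid]["steps_done"] += steps

    def get_property(self, rid, name):
        # Analogous to querying a WS-RF resource property.
        return self._resources[rid][name]

svc = SimulationService()
rid = svc.create_resource({"grid_points": 1024})
svc.advance(rid, 10)
print(svc.get_property(rid, "steps_done"))
```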
-
A. Streit, D. Erwin, T. Lippert, D. Mallmann, R. Menday, M. Rambadt, M. Riedel, M. Romberg, B. Schuller, P. Wieder
UNICORE - From Project Results to Production Grids
Grid Computing: The New Frontier of High Performance Computing, Advances in Parallel Computing, Vol. 14, pp. 357-376, Elsevier B.V., 2005
[ DOI ] [ Juelich ]
Abstract:
The UNICORE Grid technology provides seamless, secure and intuitive access to distributed Grid resources. In this paper we present the recent evolution from project results to production Grids. At the beginning, UNICORE was developed as prototype software in two projects funded by the German research ministry (BMBF). Over the following years, in various European-funded projects, UNICORE evolved into a full-grown and well-tested Grid middleware system, which today is used in daily production at many supercomputing centers worldwide. Beyond this production usage, the UNICORE technology serves as a solid basis in many European and international research projects, which use existing UNICORE components to implement advanced features, high-level services, and support for applications from a growing range of domains. In order to foster these ongoing developments, UNICORE is available as open source under the BSD license at SourceForge, where new releases are published on a regular basis. This paper reviews the UNICORE achievements so far and gives a glimpse of the UNICORE roadmap.