Publications

DistBack: A low-overhead distributed back-up architecture with Snapshot support

Thomas Mager, Ernst W Biersack

19th IEEE International Workshop on Local and Metropolitan Area Networks. April 10-12, 2013 - Brussels, Belgium.

Abstract

There exist many distributed storage systems tolerating failures of participating nodes. However, they require high amounts of metadata and do not focus on a user's need to easily recover a snapshot of their data. In this paper, we describe DistBack, a distributed back-up system that involves always-on home network gateways with the assistance of a reliable data center. We separate the system into swarms in order to ease monitoring and limit the scope of data requests. DistBack introduces index files, which comprise the metadata necessary to recover a snapshot. To increase efficiency, we embed small files into these index files. We show that this is reasonable due to the low amount of storage space they account for, which in our case is less than 0.1%. As a result, DistBack requires less metadata to relocate data. It supports snapshot-based back-up and provides solutions for storing files of different sizes.
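The index-file idea in the abstract above can be illustrated with a minimal sketch. It is not DistBack's actual format: the threshold value, the dictionary layout, and the function names `build_index` and `restore` are all assumptions made for illustration.

```python
import hashlib

# Hypothetical cutoff: files at or below this size are embedded in the index.
# The value is illustrative only and does not come from the paper.
SMALL_FILE_THRESHOLD = 4096  # bytes


def build_index(files, threshold=SMALL_FILE_THRESHOLD):
    """Build a snapshot index from a mapping of path -> bytes.

    Small files are stored inline in the index; larger files are
    referenced by a content hash and stored elsewhere.
    """
    index = {}
    for path, data in sorted(files.items()):
        if len(data) <= threshold:
            index[path] = {"inline": data}
        else:
            digest = hashlib.sha256(data).hexdigest()
            index[path] = {"ref": digest, "size": len(data)}
    return index


def restore(index, fetch_block):
    """Recover a snapshot from an index.

    fetch_block(ref) is an assumed callback that returns the bytes of a
    file stored outside the index.
    """
    restored = {}
    for path, entry in index.items():
        if "inline" in entry:
            restored[path] = entry["inline"]
        else:
            restored[path] = fetch_block(entry["ref"])
    return restored
```

With this layout, restoring a snapshot needs one index lookup per file, and small files need no further request at all.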

Edge-centric Computing: Vision and Challenges

Pedro García-López, Alberto Montresor, Dick Epema, Anwitaman Datta, Teruo Higashino, Adriana Iamnitchi, Marinho Barcellos, Pascal Felber, Etienne Riviere

ACM SIGCOMM Computer Communication Review

Abstract

In many aspects of human activity, there has been a continuous struggle between the forces of centralization and decentralization. Computing exhibits the same phenomenon; we have gone from mainframes to PCs and local networks in the past, and over the last decade we have seen a centralization and consolidation of services and applications in data centers and clouds. We posit that a new shift is necessary. Technological advances such as powerful dedicated connection boxes deployed in most homes, high-capacity mobile end-user devices, and powerful wireless networks, along with growing user concerns about trust, privacy, and autonomy, require taking the control of computing applications, data, and services away from some central nodes (the "core") to the other logical extreme (the "edge") of the Internet. We also posit that this development can help blur the boundary between man and machine, and embrace social computing in which humans are part of the computation and decision-making loop, resulting in a human-centered system design. We refer to this vision of human-centered, edge-device-based computing as Edge-centric Computing. In this position paper, we elaborate on this vision and present the research challenges associated with its implementation.

 

Erasure-Coded Byzantine Storage with Separate Metadata

Elli Androulaki, Christian Cachin, Dan Dobre, and Marko Vukolić

Principles of Distributed Systems, 18th International Conference, OPODIS 2014, Cortina d'Ampezzo, Italy, December

Abstract

Although many distributed storage protocols have been introduced, a solution that combines the strongest properties in terms of availability, consistency, fault-tolerance, storage complexity, and concurrency has been elusive so far. Combining these properties is difficult, especially if the resulting solution is required to be efficient and incur low cost. We present AWE, the first erasure-coded distributed implementation of a multi-writer multi-reader read/write register object that is, at the same time: (1) asynchronous, (2) wait-free, (3) atomic, (4) amnesic (i.e., nodes store a bounded number of values), and (5) Byzantine fault-tolerant (BFT), using the optimal number of nodes. AWE maintains metadata separately from bulk data, which is encoded into fragments with a k-out-of-n erasure code and stored on dedicated data nodes that support only simple reads and writes. Furthermore, AWE is the first BFT storage protocol that uses only n = 2t + k data nodes to tolerate t Byzantine faults, for any k ≥ 1. Metadata, on the other hand, is stored using an atomic snapshot object, which may be realized from 3t + 1 metadata nodes for tolerating t Byzantine faults. AWE is efficient and uses only lightweight cryptographic hash functions. Moreover, we show that hash functions are needed by any BFT distributed storage protocol that stores the bulk data on 3t or fewer data nodes.
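To make the node counts in the abstract above concrete, here is a small arithmetic sketch. The formulas n = 2t + k and 3t + 1 come from the abstract; the function names are hypothetical, and the storage blowup of n/k is an assumption that each fragment is 1/k of the value size, ignoring metadata and hashes.

```python
from fractions import Fraction


def awe_node_counts(t, k):
    """Return (data_nodes, metadata_nodes) for t Byzantine faults.

    data_nodes = 2t + k and metadata_nodes = 3t + 1, as stated in the abstract.
    """
    if t < 0 or k < 1:
        raise ValueError("need t >= 0 and k >= 1")
    return 2 * t + k, 3 * t + 1


def storage_blowup(t, k):
    """Total stored bytes per byte of user data, as an exact fraction.

    Assumes a k-out-of-n code whose fragments are each 1/k of the value
    size, so storing n fragments costs n/k. Metadata is not counted.
    """
    data_nodes, _ = awe_node_counts(t, k)
    return Fraction(data_nodes, k)


if __name__ == "__main__":
    for t, k in [(1, 1), (1, 2), (2, 3)]:
        n, m = awe_node_counts(t, k)
        print(f"t={t} k={k}: data nodes={n}, metadata nodes={m}, "
              f"blowup={storage_blowup(t, k)}")
```

Under these assumptions, increasing k for a fixed t adds data nodes but lowers the storage cost per byte, since the blowup is (2t + k)/k.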

eWave: Leveraging Energy-Awareness for In-line Deduplication Clusters

Raúl Gracia-Tinedo, Marc Sánchez-Artigas, Pedro García-López

7th ACM International Systems and Storage Conference. June 10-12, 2014. Haifa, Israel.

Abstract

In-line deduplication clusters provide high-throughput and scalable storage/archival services to enterprises and organizations. Unfortunately, high throughput comes at the cost of activating several storage nodes on each request, due to the parallel nature of superchunk routing. This may prevent storage nodes from exploiting disk standby times to save energy, even during low-load periods. We aim to enable deduplication clusters to exploit load valleys to save disk energy. To this end, we explore the feasibility of deferred writes, diverted access, and workload consolidation in this setting.

We materialize our insights in eWave, a novel energy-efficient storage middleware for deduplication clusters. The main goal of eWave is to enable the energy-aware operation of deduplication clusters without modifying the deduplication layer. Via extensive simulations and experiments in an 8-machine cluster, we show that eWave reduces disk energy by 16% to 60% in common scenarios, with moderate impact on performance during low-load periods.

Giving form to social cloud storage through experimentation: Issues and insights


Raúl Gracia-Tinedo, Marc Sánchez-Artigas, Aleix Ramírez, Adrián Moreno-Martínez, Xavier León, Pedro García-López

Future Generation Computer Systems, Vol. 40. 2014, pp. 1-16.

Abstract

In the last few years, we have seen a rapid expansion of social networking. Digital relationships between individuals are becoming capital for turning to one another for communication and collaboration. These online relationships are creating new opportunities to define socially oriented computing models. In this paper, we propose to leverage these relationships to form a dynamic "social cloud" for storage. While at first glance the concept of a social cloud looks very appealing, a deeper analysis brings out many problems, particularly in data availability. To overcome this issue, we propose that members of the social cloud use, in addition to digital friends, online storage services like Amazon S3 to store data and improve data availability. Through a real deployment on our campus, we study what aspects give form to the definition of social cloud storage and determine the difficulty of realizing this concept in the real world. Our analysis reveals interesting insights into how to reap the full potential of socially oriented storage.

Giving Wings to Your Data: A First Experience of Personal Cloud Interoperability

Raúl Gracia-Tinedo, Cristian Cotes, Edgar Zamora-Gómez, Genís Ortiz, Adrián Moreno-Martínez, Marc Sánchez-Artigas, Pedro García-López, Raquel Sánchez, Alberto Gómez, Anastasio Illana

Elsevier Future Generation Computer Systems (2017)

Abstract

Personal Clouds are becoming increasingly popular storage services for end-users and organizations. However, the competition among Personal Clouds, their proprietary nature, and the heterogeneity of synchronization protocols have led to a complete lack of interoperability among them. Regrettably, this situation prevents users from sharing data transparently across multiple providers. Even worse, the lack of interoperability carries serious risks, such as vendor lock-in, in which users get trapped in a single provider due to the cost of switching to another one.

In this work, we contribute DataWings, the first interoperability protocol for Personal Clouds. DataWings consists of an authentication management protocol and a storage API for file storage, synchronization, and sharing that adhere to the current authentication (OAuth) and REST standards, respectively. Moreover, we demonstrate the feasibility of DataWings by implementing the protocol in various providers (NEC, StackSync, eyeOS) and performing a real deployment evaluated with real trace replays of production systems (UbuntuOne, NEC). To our knowledge, this is the first real-world experience of Personal Cloud interoperability.

Our experiments provide new insights on the performance implications that different types of user activity and the underlying sharing network topology have on the implementation of our protocol. We conclude that DataWings is flexible enough to leverage interoperability for heterogeneous Personal Clouds, opening the door for a broader adoption by other vendors.

Hybris: Efficient and Robust Hybrid Cloud Storage

Dan Dobre, Paolo Viotti, Marko Vukolić

5th annual ACM Symposium on Cloud Computing. November 3-5, 2014. Seattle.

Abstract

Besides its well-known benefits, commodity cloud storage also raises concerns that include security, reliability, and consistency. We present the Hybris key-value store, the first robust hybrid cloud storage system, which aims to address these concerns by leveraging both private and public cloud resources.

Hybris robustly replicates metadata on trusted private premises (private cloud), separately from data, which is dispersed (using replication or erasure coding) across multiple untrusted public clouds. Hybris keeps the metadata stored on private premises on the order of a few dozen bytes per key, avoiding a scalability bottleneck at the private cloud. In turn, the hybrid design allows Hybris to efficiently and robustly tolerate cloud outages, as well as potential malice in clouds, without overhead. Namely, to tolerate up to f malicious clouds, in the common case of the Hybris variant with data replication, writes replicate data across f + 1 clouds, whereas reads involve a single cloud. In the worst case, only up to f additional clouds are used. This is considerably better than earlier multi-cloud storage systems, which required a costly 3f + 1 clouds to mask f potentially malicious clouds. Finally, Hybris leverages strong metadata consistency to guarantee strong data consistency to Hybris applications without any modifications to the eventually consistent public clouds.
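The cloud counts quoted in the abstract above can be tabulated with a short sketch. The common-case figures (f + 1 for writes, 1 for reads) and the 3f + 1 baseline are taken from the abstract; the worst-case values assume that "up to f additional clouds" applies to both writes and reads, which is one reading of the text. The function name `hybris_clouds` is hypothetical.

```python
def hybris_clouds(f):
    """Clouds contacted to tolerate up to f malicious clouds (replication variant)."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return {
        "write_common": f + 1,       # from the abstract
        "read_common": 1,            # from the abstract
        "write_worst": (f + 1) + f,  # assumption: f additional clouds on writes
        "read_worst": 1 + f,         # assumption: f additional clouds on reads
        "prior_systems": 3 * f + 1,  # earlier multi-cloud systems, per the abstract
    }


if __name__ == "__main__":
    for f in (1, 2, 3):
        print(f, hybris_clouds(f))
```

For f = 1, this gives 2 clouds per write and 1 per read in the common case, compared with 4 clouds for the earlier systems mentioned in the abstract.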

We implemented Hybris in Java and evaluated it using a series of micro and macrobenchmarks. Our results show that Hybris significantly outperforms comparable multi-cloud storage systems and approaches the performance of barebone commodity public cloud storage.

Hybris: Efficient and Robust Hybrid Cloud Storage

Dan Dobre, Paolo Viotti, Marko Vukolić

Work in progress session, 12th USENIX Conference on File and Storage Technologies

Impact of Instance Seeking Strategies on Resource Allocation in Cloud Data Centers

Hao Zhuang, Xin Liu, Zhonghong Ou, Karl Aberer

IEEE 6th International Conference on Cloud Computing. Santa Clara Marriott, CA, USA, 2013.

Abstract

With the prosperity of cloud computing, an increasing number of Small and Medium-sized Enterprises (SMEs) move their business to public clouds such as Amazon EC2. To help tenants deploy services in the cloud, researchers either conduct performance evaluations or design mechanisms and software on seeking virtual machines of better performance. However, few studies have investigated the impact of instance seeking strategies on resource allocation in clouds if every tenant starts to apply the same method to find the better-performing virtual machine. In this paper, we propose a cloud and a tenant model in order to simulate the process of tenants' seeking better-performing instances in the cloud. We discuss, implement and evaluate six cloud resource allocation strategies and five instance seeking strategies. We perform the evaluation via simulation based on real data traces. Our results show that instance seeking strategies can cause the exhaustion of better-performing instances and significant request growth in the cloud. Furthermore, we find that tenants could save time and budget through collaborative seeking strategies. Finally, we discuss the implications of our findings from perspectives of both tenants and providers.

Implicit BPM: a Business Process Platform for Transparent Workflow Weaving

Rubén Mondéjar, Pedro García-López, Carles Pairot, Enric Brull

Business Process Management (BPM). 2014. pp. 168-183.

Abstract

The integration of business processes into existing applications involves considerable development efforts and costs for IT departments. This precludes the pervasive implementation of BPM in organizations where important applications remain isolated from the existing workflows.

In this paper, we introduce a novel concept, Workflow Weaving, based on non-intrusive techniques, which achieves transparent integration of business processes into organizational applications. This concept relies on BPM standards, Aspect Oriented Programming, and Web patterns to transparently weave business models among current web applications. A prototype platform is presented, which includes our design of a distributed architecture, and a natural and expressive DSL.
