Pergamos - Library and Information Center of National and Kapodistrian University of Athens

Unit:

Track / specialization: Information and Data Management (ΔΕΔ)
Informatics

Deposit date:

2021-01-11

Year:

2021

Author:

Mouratidis Theofilos

Supervisors info:

Mema Roussopoulos, Professor, Department of Informatics and Telecommunications

Original Title:

Optimizing the recovery of data consistency gossip algorithms on distributed object-store systems (CEPH)

Languages:

English
Greek

Translated title:

Optimizing the recovery of data consistency gossip algorithms on distributed object-store systems (CEPH)

Summary:

The volume of data on the internet is growing rapidly, and systems for storing and preserving it are on the rise. Ceph is a distributed storage system for handling large amounts of data. It was initially developed by Sage Weil (Red Hat) and has been gaining popularity over the years. Ceph is used for big data storage in large organizations such as Cisco, CERN and Deutsche Telekom. Popular as it is, Ceph is like any other distributed system in that its individual components fail over time. When that happens, recovery mechanisms must step in to resolve the resulting issues. In this thesis, we introduce a new way to synchronize data between replicas and make it consistent, by identifying and filtering out unchanged objects. Ceph's current recovery algorithm is a durable yet simple implementation with respect to disk access and memory consumption. As technology evolves and faster storage solutions emerge (e.g. PCIe SSDs), practices such as Write-Ahead Logging (WAL) for data consistency can introduce new problems. Logging thousands of write operations per second in a degraded cluster can rapidly increase memory consumption and cause a storage node to fail (degraded is a cluster state in which a storage node is down for any reason). Although Ceph now supports an upper limit on the number of entries in its WAL, this limit is often reached, which invalidates the log because any new entries are lost. The system is then left to check every object of the replicas in order to synchronize them, which is a very slow process. Hence, we introduce Merkle trees as an alternative to Bloom filters, so that the recovery procedure can identify regions where objects were not modified and thus reduce the recovery time. The recovery process has an observable impact on users' IO bandwidth, and their overall experience can be improved by reducing the cluster's recovery times.
The benchmarks show a performance increase of 10% to 400%, depending on how many objects were affected while a node was down.
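
The idea of using Merkle trees to skip unmodified regions can be illustrated with a minimal Python sketch. It is not the thesis implementation and does not use any Ceph API: the function names (build_tree, diff_leaves), the leaf encoding (one byte string per object) and the choice of SHA-256 are assumptions made for illustration only. Each replica hashes its objects into a tree, and the comparison descends only into subtrees whose hashes differ.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an assumption, not taken from the thesis)."""
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return a list of levels: level 0 holds leaf hashes, the last holds the root."""
    level = [_h(x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            left = level[i]
            # An odd node at the end of a level is paired with itself.
            right = level[i + 1] if i + 1 < len(level) else left
            nxt.append(_h(left + right))
        level = nxt
        levels.append(level)
    return levels


def diff_leaves(a, b):
    """Return indices of leaves that differ between two trees of equal size."""
    if len(a[0]) != len(b[0]):
        raise ValueError("this sketch assumes both replicas hold the same number of objects")
    if not a[0]:
        return []
    changed = []
    stack = [(len(a) - 1, 0)]  # start at the root: (level, index)
    while stack:
        lvl, idx = stack.pop()
        if a[lvl][idx] == b[lvl][idx]:
            continue  # identical hash: the whole subtree is unchanged, skip it
        if lvl == 0:
            changed.append(idx)
            continue
        for child in (2 * idx, 2 * idx + 1):
            if child < len(a[lvl - 1]):
                stack.append((lvl - 1, child))
    return sorted(changed)


# Hypothetical replicas: one object was rewritten while the other node was down.
replica_a = [f"obj{i}-v1".encode() for i in range(8)]
replica_b = list(replica_a)
replica_b[5] = b"obj5-v2"

stale = diff_leaves(build_tree(replica_a), build_tree(replica_b))
```

In this example only the path from the root to object 5 is descended; subtrees with matching hashes are never visited, which is the property the summary relies on to reduce recovery time when few objects changed.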

Main subject category:

Technology - Computer science

Keywords:

optimization, gossip algorithms, object-store, Merkle trees, Bloom filters

Index:

Yes

Number of index pages:

Contains images:

Yes

Number of references:

Number of pages:
