Autotuning InfluxDB with Regression Techniques

Postgraduate Thesis uoadl:3390965

Unit:
Specialization in Data, Information and Knowledge Management
Informatics
Deposit date:
2024-03-01
Year:
2024
Author:
Karageorgou Ioanna
Supervisors info:
Alexis Delis, Professor, NKUA
Original Title:
Autotuning InfluxDB with Regression Techniques
Languages:
English
Translated title:
Autotuning InfluxDB with Regression Techniques
Summary:
In today's data-driven landscape,
the performance of database systems
plays a crucial role in the functionality
of various applications and services.
As data volumes continue to surge,
optimizing database performance
has become a critical concern
for organizations across diverse sectors.
This thesis focuses on the intricate task
of enhancing the performance
of the time-series database, InfluxDB.
The primary goal of this research is
to provide a systematic method for
tuning InfluxDB's configuration parameters
to improve query performance.
We aim to uncover insights that can be
readily applied by database administrators,
operators, and organizations
to extract optimal performance
from their InfluxDB deployments.
The foundation of this study is
a comprehensive examination of four key configuration parameters
and how they influence query rates,
which makes their fine-tuning a vital consideration.
Our approach involves a two-step process,
employing machine learning to predict query rates
and validating these predictions
through practical experiments under various workloads.
Notably, our findings challenge conventional assumptions,
revealing nuanced aspects of database tuning.
For instance, we observe that smaller memory allocations
can outperform larger configurations,
underscoring the importance of empirical testing.
Our evaluation of the default configuration
under a read-heavy workload shows the robustness
of InfluxDB's default settings,
as none of the adjustments to the "cache-snapshot-memory-size"
parameter outperformed the default configuration.
In essence, this research sheds light on the
complex relationship between database parameters
and real-world workloads,
providing valuable insights
into performance optimization.
Main subject category:
Technology - Computer science
Keywords:
influxdb, knobs, machine learning, benchmark
Index:
Yes
Number of index pages:
2
Contains images:
Yes
Number of references:
21
Number of pages:
70
File:
File access is restricted only to the intranet of UoA.

Master_Thesis.pdf
872 KB