Summary:
MLInspect [1] is a debugging tool for data science pipelines. It detects issues such as data leakage, unintended data transformations, and distribution shifts in ML pipelines by automatically inspecting their data flow. The tool extracts a Directed Acyclic Graph (DAG) that lets users trace every stage of pipeline processing and training, and it can observe and debug pipelines even when they fail to work properly. Inspections can be attached to specific nodes of this DAG so that potential problems are detected and diagnosed where they arise. These inspections are either predefined or user-defined: users can write custom inspections and extend the tool to meet their specific requirements, which makes MLInspect a versatile tool for a wide range of debugging and monitoring tasks.
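To illustrate the inspection workflow, the following sketch instruments a pipeline with mlinspect, loosely following the library's public examples; the pipeline file name and inspected columns are placeholders, and exact signatures may differ between versions:

    from mlinspect import PipelineInspector
    from mlinspect.inspections import HistogramForColumns
    from mlinspect.checks import NoBiasIntroducedFor

    # Instrument the pipeline, attach inspections and checks to its DAG
    # nodes, and run it. "pipeline.py" and the column names are placeholders.
    inspector_result = (
        PipelineInspector
        .on_pipeline_from_py_file("pipeline.py")
        .add_required_inspection(HistogramForColumns(["age_group", "race"]))
        .add_check(NoBiasIntroducedFor(["age_group", "race"]))
        .execute()
    )

    dag = inspector_result.dag                    # the extracted pipeline DAG
    results = inspector_result.dag_node_to_inspection_results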
Explainable AI (XAI) is a collection of methods and processes that allow human users to understand and trust the results of machine learning algorithms. XAI helps describe an AI model, its impact, and its potential biases, supporting accuracy, fairness, transparency, and reliable decision-making. Several approaches exist for interpreting a machine learning model, including LIME, SHAP, Integrated Gradients, Accumulated Local Effects (ALE), and Individual Conditional Expectation (ICE) plots. Each can yield different insights into the model in question, so they should be applied with the different setup steps of an ML pipeline in mind.
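As an illustration of one such method, the following minimal sketch applies SHAP to an already trained classifier; the names clf, X_train, and X_test are placeholders for objects produced by a pipeline:

    import shap

    # clf is a fitted scikit-learn classifier; X_train and X_test are pandas
    # DataFrames produced earlier in the pipeline (placeholder names).
    explainer = shap.Explainer(clf.predict_proba, X_train)  # model-agnostic explainer
    shap_values = explainer(X_test.iloc[:10])               # attributions for 10 rows
    print(shap_values.values.shape)                         # (10, n_features, n_classes)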
This thesis enhances MLInspect by integrating explainability features. It patches methods such as LIME and SHAP, among those described above, into the extracted DAG for visualisation. Additionally, it persists trained models in the DAG, facilitating subsequent inspections: when triggered, a new inspection retrieves the persisted model image from the DAG, executes the explainability methods, and stores the results alongside the model node. This structured approach integrates explainability seamlessly into debugging and deepens the understanding of machine learning models.
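This pattern can be sketched as a custom inspection of the kind described above; the class and attribute names below are purely illustrative and do not reflect mlinspect's actual internals:

    import pickle
    import shap

    class ExplainabilityInspection:
        """Hypothetical inspection: explain the model persisted at a DAG node."""

        def visit_estimator_node(self, dag_node):
            # Deserialise the model image stored with the node (assumed pickle).
            model = pickle.loads(dag_node.model_image)
            # A small training-data sample assumed to be persisted alongside it.
            background = dag_node.train_data_sample
            explainer = shap.Explainer(model.predict_proba, background)
            # Store the attributions next to the model node, as described above.
            dag_node.inspection_results["shap"] = explainer(background)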
Keywords:
Explainability, Directed Acyclic Graph (DAG), Model Persistence, Inspection, Model Interpretation