Responsible and Explainable AI

TrustyAI: Powering fair and transparent machine learning

August 30, 2024
Tarcisio Oliveira

    According to the Britannica Dictionary, ethics is the area of study that deals with ideas about what is good and bad behavior. It is a branch of philosophy concerned with what is morally right or wrong, and its principles are applied to many fields of knowledge, such as engineering. Ethics rests on pillars such as integrity, responsibility, objectivity, fairness, confidentiality, honesty, and veracity.

    As intelligent systems exert increasing influence over critical decisions across multiple domains, the need for transparency and fairness is paramount. The application of ethical principles to artificial intelligence (AI) becomes essential, as it protects against discrimination and guarantees the explainability of decisions, thus promoting the construction of a more inclusive, reliable, and socially responsible ecosystem.

    AI ethics encompasses the principles and guidelines that govern the responsible development, deployment, and use of artificial intelligence applications. It involves ensuring fairness and mitigating bias in machine learning algorithms, promoting transparency and explainability in decision-making processes, safeguarding privacy rights, establishing accountability for results and predictions, ensuring security, and maximizing the beneficial impact of AI on society while minimizing potential harm.

    There are several challenges to implementing these principles, such as algorithmic biases (language bias, gender bias, stereotyping, and so on), open source, transparency, accountability, and regulation.

    Responsible AI

    Responsible AI refers to the practice of designing, developing, deploying, and using artificial intelligence systems in a way that aligns with ethical principles, social values, and legal standards. It involves ensuring that AI technologies are developed and implemented in a way that maximizes their benefits while minimizing risks and negative impacts on individuals, communities, and society as a whole.

    The implementation of Responsible AI is concerned with aspects such as fairness, transparency, accountability, privacy, and security.

    Explainable AI

    Explainable AI refers to the ability of artificial intelligence systems to explain their decisions to humans in a clear and interpretable way. The goal of Explainable AI is to increase the transparency, accountability, and trustworthiness of AI technologies, especially in critical domains where decisions impact people's lives, such as healthcare, finance, and law enforcement.

    Explainable AI techniques vary depending on the type of AI model and application. They may include feature importance analysis implemented with algorithms such as Deep Learning Important FeaTures (DeepLIFT), as well as model-agnostic approaches such as Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) for validating and explaining predictions.
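
    To make LIME's core idea concrete before the demo: perturb an input, weight the perturbed samples by their proximity to it, fit a simple linear surrogate on that weighted neighborhood, and read the coefficients as local feature attributions. The following is a minimal, self-contained sketch of that idea, not the algorithm as implemented by any particular library; black_box_predict and all parameter values are hypothetical placeholders:

    import numpy as np
    from sklearn.linear_model import Ridge
    
    def lime_style_attribution(black_box_predict, x, num_samples = 1000, scale = 0.5):
        # black_box_predict: hypothetical function mapping a 2D array of inputs
        # to a 1D array of scores; it stands in for any opaque model
        rng = np.random.default_rng(0)
        # Sample the neighborhood of x with Gaussian perturbations
        neighborhood = x + rng.normal(0.0, scale, size = (num_samples, x.shape[0]))
        predictions  = black_box_predict(neighborhood)
        # Weight samples by proximity: closer samples influence the surrogate more
        distances = np.linalg.norm(neighborhood - x, axis = 1)
        weights   = np.exp(-(distances ** 2) / (2 * scale ** 2))
        # Fit an interpretable linear surrogate on the weighted neighborhood
        surrogate = Ridge(alpha = 1.0)
        surrogate.fit(neighborhood, predictions, sample_weight = weights)
        # The coefficients approximate each feature's local influence on the output
        return surrogate.coef_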

    Responsible AI and Explainable AI

    Responsible AI looks at an AI system during the planning stages, while Explainable AI looks at its results after they are computed. Responsible AI and Explainable AI can work together to produce better AI.

    In Figure 1, we have a simplified example of an AI model development pipeline. In general, we can only improve results and watch for drift once the model is running in production, and we often cannot know why the model made its decisions, predictions, or recommendations.

    Figure 1: AI pipeline. Set Goals flows into Data Engineering, then into AI Development, then into AI Decision; AI Decision feeds back into Data Engineering (feedback: model drift).

    In Figure 2, we have the same pipeline, but with a Responsible AI stage that aims to apply ethical principles and an Explainable AI stage that seeks to understand and explain the decisions made by the model.

    Figure 2: Responsible and Explainable AI pipeline. Set Goals flows into Responsible AI, then Data Engineering, then Explainable AI, then AI Development, then AI Decision; AI Decision feeds back into Data Engineering (feedback: model drift, bias detection, explainable results).

    TrustyAI

    TrustyAI is a versatile tool designed to provide explanations of decision-making services and predictive models through the following capabilities:

    • Explainability: Enrich model execution information through XAI algorithms, such as LIME and SHAP.
    • Tracing and accountability: Extract, collect, and publish metadata for auditing and compliance.
    • Runtime monitoring: Expose services in dashboards to assess data from both a business and an operational perspective.

    Show me the code!

    The best way to introduce TrustyAI is through a demonstration. The code for this implementation is available in the tarcis-io/demo_trustyai repository, in the Jupyter notebook trustyai.ipynb.

    For this demo, we will use the loan dataset available on Kaggle. After downloading it, we can create the dataset object using the pandas library:

    from pandas import read_csv
    
    dataset = read_csv('loan_data_set.csv')
    dataset.head()

    The head method should return the first records of the dataset, as shown in Figure 3.

    Figure 3: The first records of the loan dataset.

    The next step is to prepare the dataset, creating the features and labels that will be used to train the machine learning classification model. To do this, we will use the scikit-learn library:

    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing   import LabelEncoder
    
    label_encoder = LabelEncoder()
    
    features = dataset.drop('Loan_ID', axis = 1).drop('Loan_Status', axis = 1)
    features = features.apply(label_encoder.fit_transform)
    
    labels = dataset[['Loan_Status']]
    labels = labels.apply(label_encoder.fit_transform)
    
    features_train, features_test, labels_train, labels_test = train_test_split(features, labels)
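
    As a side note on the split above, train_test_split also accepts a random_state argument for reproducibility and a stratify argument to preserve the label distribution in both splits. A hedged variant, with illustrative parameter values:

    features_train, features_test, labels_train, labels_test = train_test_split(
        features,
        labels,
        test_size    = 0.25,   # illustrative; 0.25 is also the scikit-learn default
        random_state = 42,     # makes the split reproducible across runs
        stratify     = labels  # keeps the approved/unapproved ratio in both splits
    )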

    A classification model is a supervised machine learning method in which the algorithm tries to predict the correct label for given input data. In this case, the model will try to predict whether a loan was approved (column Loan_Status of the dataset) based on the applicant's attributes (columns Gender, Married, Dependents, Education, etc.). To create and train it, we will use the xgboost library:

    from xgboost import XGBClassifier
    
    model = XGBClassifier()
    model.fit(features_train, labels_train)

    To check the model's accuracy, we can use the score method, which returns a value between 0 and 1:

    model.score(features_test, labels_test)
    0.7012987012987013

    Accuracy of the machine learning model

    It is possible to improve the accuracy of the model by applying data techniques such as treating missing values and feature engineering.
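
    As an illustration of that note, here is a minimal, hedged sketch of one such technique: imputing missing values with scikit-learn's SimpleImputer before encoding, plus one engineered feature. The imputation strategy and the Total_Income feature are illustrative assumptions, not tuned for this dataset:

    import pandas as pd
    from sklearn.impute import SimpleImputer
    
    # Fill missing values with each column's most frequent value (illustrative
    # choice; most_frequent also works for the categorical columns here)
    imputer      = SimpleImputer(strategy = 'most_frequent')
    raw_features = dataset.drop('Loan_ID', axis = 1).drop('Loan_Status', axis = 1)
    imputed      = pd.DataFrame(imputer.fit_transform(raw_features), columns = raw_features.columns)
    
    # A hypothetical engineered feature: total household income
    imputed['Total_Income'] = imputed['ApplicantIncome'].astype(float) + imputed['CoapplicantIncome'].astype(float)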

    Now, we will run predictions on two randomly chosen test examples. The output should be a classification of each loan as unapproved or approved (values 0 and 1, respectively, depending on the label_encoder or other encoder used in the preparation phase):

    example_0 = features_test.iloc[0]
    example_1 = features_test.iloc[1]
    
    predictions = model.predict([example_0, example_1])
    example_0
    Gender                 0
    Married                0
    Dependents             0
    Education              0
    Self_Employed          0
    ApplicantIncome      290
    CoapplicantIncome      0
    LoanAmount            89
    Loan_Amount_Term       8
    Credit_History         0
    Property_Area          1
    Name: 69, dtype: int64
    example_1
    Gender                 1
    Married                1
    Dependents             3
    Education              1
    Self_Employed          0
    ApplicantIncome       96
    CoapplicantIncome     66
    LoanAmount           122
    Loan_Amount_Term       8
    Credit_History         1
    Property_Area          0
    Name: 340, dtype: int64
    predictions
    array([0, 1])

    We can see that the model classified applicant example_0 as loan not approved and applicant example_1 as loan approved.
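
    To map the numeric predictions back to the dataset's original labels, one option is the encoder's inverse_transform. This sketch assumes label_encoder was last fit on the Loan_Status column, which holds here because labels contains only that column:

    # Map 0/1 back to the original Loan_Status values
    label_encoder.inverse_transform(predictions)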

    However, why did the model classify applicant example_0 as loan not approved and example_1 as loan approved? Which attributes were taken into consideration, and which were decisive? Was Gender a decisive parameter?

    To try to explain this model's predictions, we will use Explainable AI techniques from the trustyai package. First, we need to create a new explainable model that wraps the previous one:

    from trustyai.model import Model
    
    trustyai_model = Model(
        fn              = model.predict,
        feature_names   = list(features_train),
        output_names    = list(labels_train),
        dataframe_input = True
    )

    Next, we need to create an explainer to interpret the predictions made by the model. To use the LIME algorithm, we can create a LimeExplainer and use it as follows:

    from trustyai.explainers import LimeExplainer
    
    lime_explainer = LimeExplainer()
    
    lime_explanation_0 = lime_explainer.explain(
        model   = trustyai_model,
        inputs  = example_0.astype(float),
        outputs = predictions[0]
    )
    
    lime_explanation_0.as_html()['Loan_Status']

    The LIME explanation of the model's prediction for example_0 is presented in Figure 4. For this example, the attribute Credit_History (column Saliency, in red) strongly impacted the model's decision for loan not approved.

    Figure 4: LIME explanation of the prediction for example_0.
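
    Beyond the HTML view, the saliency scores can also be inspected programmatically; a hedged sketch, assuming the trustyai explanation object exposes an as_dataframe() accessor mirroring as_html():

    # Per-feature saliency scores as a pandas DataFrame (assumed accessor)
    lime_explanation_0.as_dataframe()['Loan_Status']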

    Similarly, to use the SHAP algorithm, we can create a SHAPExplainer and use it as follows:

    from trustyai.explainers import SHAPExplainer
    
    shap_explainer = SHAPExplainer(background = features_train[:100].astype(float))
    
    shap_explanation_0 = shap_explainer.explain(
        model   = trustyai_model,
        inputs  = example_0.astype(float),
        outputs = predictions[0]
    )
    
    shap_explanation_0.as_html()['Loan_Status']

    The SHAP explanation of the model's prediction for example_0 is presented in Figure 5. Just like the LIME algorithm, SHAP identifies the Credit_History attribute as the strongest influence on the model's decision (column SHAP Value, in red).

    Figure 5: SHAP explanation of the prediction for example_0.
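
    For contrast with the unapproved case, the same explainer can be pointed at the approved example; a minimal sketch reusing the objects defined above:

    # Explain the approved prediction (example_1) the same way for comparison
    shap_explanation_1 = shap_explainer.explain(
        model   = trustyai_model,
        inputs  = example_1.astype(float),
        outputs = predictions[1]
    )
    
    shap_explanation_1.as_html()['Loan_Status']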

    Conclusion

    Increasingly, intelligent systems will decide aspects of our lives, such as whether we are approved for a loan or matched to a job vacancy. The ability to explain that decision-making is extremely important, even more so from an ethical perspective.

    Explainable AI brings together a series of algorithms, such as LIME and SHAP, that can help with these explanations. TrustyAI is an open source tool that lets us apply these algorithms to our models and integrate them into the model development lifecycle.

    Responsible AI practices such as ethical guidelines, transparency, and bias mitigation are essential to promoting trustworthy AI systems that benefit society. By prioritizing these principles, we can build AI that is not only technically advanced, but also safe, fair, and aligned with human values and needs.
