Beyond Explainable AI: How to Build Trust in AI Systems

According to McKinsey, AI could add $13 trillion to the global economy by 2030. Despite this explosive growth, companies are struggling to scale up their AI efforts and to introduce algorithms or products that are free of bias. Widespread adoption of AI requires new standards and practices. Explainable AI (XAI) is the most popular approach to tackling the black-box nature of machine learning models. However, it is time to explore new methods of building trust in AI systems.

As a next step in establishing trust, researchers are focusing on assessing the biases of AI systems. This article presents a framework for assessing trust at different stages of a typical data science workflow. We argue that bias assessment for automated decision-making systems needs to go beyond XAI.

Why we need to go beyond XAI

Bias is an integral part of the human experience. Because our judgment is biased, AI systems inherit these biases from us. It is therefore important for leaders to understand the organizational and cultural barriers that AI initiatives face and to work to mitigate them by educating the workforce on ethics, changing traditional mindsets, and bringing innovations without prejudice.

We have seen many racial, gender and other biases in AI systems. For example, e-commerce giant Amazon scrapped a model used to score job applicants after it was found to penalize women. Content personalization and ad ranking systems have also been in the dock for racial and gender profiling.

The bias begins long before the product is deployed. In machine learning, this is called “model bias”. A machine learning model with 100% accuracy is not necessarily stable or genuinely learning; ML practitioners and data scientists should also check for biases in the data, algorithm, and model before production.
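
As one illustration of such a pre-production check, the sketch below compares how often a model predicts the positive outcome for each group. It is a minimal example rather than a method prescribed here: the function names `selection_rates` and `demographic_parity_gap`, the toy data, and the choice of demographic parity as the metric are all assumptions made for illustration.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Return the share of positive (1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_gap(predictions, groups):
    """Largest difference in selection rate between any two groups."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Toy data (hypothetical): 1 = shortlisted, 0 = rejected.
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["m", "m", "m", "m", "f", "f", "f", "f"]

rates = selection_rates(predictions, groups)
gap = demographic_parity_gap(predictions, groups)
print(rates)  # per-group selection rates
print(gap)    # 0.0 would mean every group is selected at the same rate
```

A gap close to zero does not prove a model is fair, since demographic parity is only one of several competing fairness definitions, but a large gap is a cheap early signal that the data or model deserves a closer look.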

XAI is a good tool for describing model predictions in a human-interpretable way. For example, feature attribution methods such as SHAP and LIME can, in most cases, make black-box machine learning models explainable. But can XAI single-handedly create trustworthy AI?

To build trust in AI, we need tools to mitigate bias. Ethical AI is primarily concerned with detecting bias before and after model predictions. However, the bias mitigation prowess of Ethical AI is minimal and we have yet to develop robust and easily deployable techniques.

Below we look at the phases of a typical data science workflow where trust can be well defined and assessed. Much of the discussion centers around introducing an empirical framework to assess trust in AI systems and the metrics used for this purpose at different stages of the data science workflow.

Any data science concern, such as technical robustness, security, data governance, accountability, or societal and environmental well-being, can be integrated seamlessly into our framework.

The Human-AI Trust Platform is inspired by a research paper by Alon Jacovi on the topic of formalizing trust in AI. First, we define the concept of contractual trust.

Contractual trust

Let’s start with a formal definition of trust given by Mayer: if A anticipates that B will act in the best interest of A, and accepts vulnerability to B’s actions, then A trusts B. Here, A can be an organization, user, or system that entrusts another organization or data scientist (B) with creating a trustworthy AI model.

However, collaboration between A and B carries risks. Even so, A anticipates that B will carry out its work in the best interest of A despite the vulnerabilities of the processes executed by B. These vulnerabilities are probable, and A is aware of them. For example, a fair AI model may be A’s anticipation, and a decline in model performance may be B’s vulnerability in the course of obtaining a fair AI model. Trust exists if and only if these anticipations and vulnerabilities coexist and are recognized by both A and B.

Contractual trust is a concept that dictates and quantifies Trustworthy AI. This is done by incorporating a set of phases into the data science workflow. The customer and the data science organization agree on the orchestration and purpose of these phases. In other words, the customer can define an anticipation for each phase and check whether the phase meets predefined conditions or states. At the same time, the data science organization can define the probable vulnerabilities that may arise in pursuit of that anticipation.

Once the anticipation and vulnerabilities are defined and recognized, we can form an action plan consisting of several checkpoints to be assessed at different stages of the data science workflow. Checkpoints give the customer a mechanism for evaluating the data science team. They can be weighted differently depending on customer requirements. For example, if a customer requires data anonymization as a primary constraint, this may outweigh model performance or model bias. A clear advantage of checkpoints is that we can evaluate every step with customer input and ensure transparency in the development process. This way we can assign a Human-AI Trust Score for the entire data science pipeline.
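
The checkpoint weighting described above can be sketched as a weighted average. How the Human-AI Trust Score is actually calculated is not specified here, so everything in this example is an assumption for illustration: the `Checkpoint` class, the 0-to-1 score scale, the example weights, and the weighted-mean formula itself.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    name: str      # what is assessed at this stage of the workflow
    weight: float  # importance agreed with the customer (hypothetical scale)
    score: float   # customer's assessment: 0.0 (not met) to 1.0 (fully met)


def trust_score(checkpoints):
    """Weighted mean of checkpoint scores (an assumed formula)."""
    total_weight = sum(c.weight for c in checkpoints)
    if total_weight <= 0:
        raise ValueError("at least one checkpoint needs a positive weight")
    return sum(c.weight * c.score for c in checkpoints) / total_weight


# Hypothetical plan in which data anonymization outweighs the other checks.
checkpoints = [
    Checkpoint("data anonymization", weight=3.0, score=1.0),
    Checkpoint("model bias", weight=1.0, score=0.5),
    Checkpoint("model performance", weight=1.0, score=0.5),
]

score = trust_score(checkpoints)
print(round(score, 2))
```

In this toy plan, the heavily weighted anonymization checkpoint pulls the overall score well above the 0.5 that the other two checkpoints would produce on their own, which mirrors the idea that a primary customer constraint can outweigh performance or bias.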


The framework can be summarized as follows:

  1. Two-way solution: the customer and the data science team are part of the framework.
  2. Customers can clearly state their expectations, called anticipation in the framework.
  3. The data science team can list possible flaws that can occur, called vulnerabilities.
  4. The trust framework is well defined by a list of actions.
  5. The entire framework takes into account the interests of both parties.
  6. Checkpoints will allow for step-by-step evaluation by customers and ensure transparency.
  7. Based on the client’s assessment, checkpoints can be updated or new ones can be inserted.
  8. The framework has a scoring mechanism that evaluates the entire workflow with both parties involved.

This article is written by a member of the AIM Leaders Council. AIM Leaders Council is an invitation-only forum of senior executives from the data science and analytics industry.
