
Explainable AI for Industrial Automation

A predictive maintenance system built on Google Cloud Vertex AI with Sampled Shapley explainability to detect operational faults in physical machinery. This page demonstrates local and global explainability, quantitative model performance, counterfactual what-if analysis, and an open-source simulation pipeline — bridging the gap between black-box ML and transparent, trustworthy AI for industrial systems.

Live Telemetry Feed (Simulated)
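The feed is rendered client-side, but the underlying stream can be simulated in a few lines. A minimal sketch, assuming the same four fields as the inference payload shown later; the nominal and fault ranges are illustrative assumptions, not manufacturer specifications:

```python
import random

def sample_telemetry(rng, fault=False):
    """Draw one synthetic telemetry reading. The nominal and fault
    ranges below are illustrative assumptions, not machine specs."""
    if fault:
        return {
            "joint_torque": rng.uniform(55.0, 75.0),   # Nm, elevated
            "vibration_rms": rng.uniform(0.08, 0.16),  # elevated
            "cycle_time_ms": rng.uniform(1400, 2000),  # degraded response
            "encoder_error": rng.uniform(0.05, 0.15),  # drift
        }
    return {
        "joint_torque": rng.uniform(35.0, 45.0),       # Nm, nominal
        "vibration_rms": rng.uniform(0.02, 0.06),
        "cycle_time_ms": rng.uniform(900, 1200),
        "encoder_error": rng.uniform(0.0, 0.03),
    }

rng = random.Random(42)
# Ten readings with a fault injected every fifth sample
feed = [sample_telemetry(rng, fault=(i % 5 == 0)) for i in range(10)]
```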





SHAP Output Samples (Vertex AI Endpoint)

Sample predictions from the deployed Vertex AI endpoint with Sampled Shapley attributions. Each row represents a real inference call with its corresponding feature-level explanations.

ID    Joint Torque (Nm)  Vibration RMS  Cycle Time (ms)  Encoder Err   Prediction  Confidence
#001  62.4 (+0.38)       0.12 (+0.52)   1180 (-0.03)     0.08 (+0.28)  FAULT       94.2%
#002  38.1 (-0.12)       0.03 (-0.22)    950 (-0.05)     0.01 (-0.08)  NORMAL      97.8%
#003  55.0 (+0.30)       0.09 (+0.45)   1600 (+0.14)     0.01 (-0.04)  FAULT       89.1%
#004  42.5 (-0.08)       0.05 (-0.15)   1100 (-0.02)     0.07 (+0.22)  NORMAL      62.3%
#005  71.8 (+0.55)       0.15 (+0.61)   1850 (+0.18)     0.12 (+0.35)  FAULT       99.4%

Model Performance Metrics

Quantitative evaluation on the held-out industrial telemetry test set (n = 2,480 samples). Model: Gradient Boosted Trees trained on Vertex AI AutoML.

Accuracy: 96.3%
Precision: 94.7%
Recall: 97.1%
F1 Score: 0.958
AUC-ROC: 0.993
Explanation Fidelity: 0.91
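As a sanity check, F1 is the harmonic mean of precision and recall, so it can be recomputed from the reported values; the small discrepancy against the table is rounding (the table value was presumably computed from unrounded precision and recall):

```python
precision, recall = 0.947, 0.971  # reported held-out test-set metrics

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # → 0.959, matching the table's 0.958 up to rounding
```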

System Architecture

End-to-end data flow from physical sensor telemetry through the ML inference pipeline to the explainable prediction output.

Sensors (PLC / OPC-UA)
  → Frontend (HTML + Plotly.js)
  → Cloud Run (Flask API)
  → Vertex AI (Sampled Shapley)
  → XAI Output (SHAP chart)
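The Cloud Run stage can be a thin Flask proxy between the frontend and the Vertex AI endpoint. A minimal sketch, assuming the same payload shape as the inference example below; `PROJECT_ID` and `ENDPOINT_ID` are placeholders, and error handling is omitted:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/explain", methods=["POST"])
def explain():
    """Forward one telemetry reading to the Vertex AI endpoint and
    return the prediction plus Sampled Shapley attributions."""
    from google.cloud import aiplatform  # deferred: needs GCP credentials
    aiplatform.init(project="PROJECT_ID", location="us-central1")
    endpoint = aiplatform.Endpoint("ENDPOINT_ID")
    response = endpoint.explain(instances=[request.get_json()])
    attrs = response.explanations[0].attributions[0].feature_attributions
    return jsonify({
        "prediction": response.predictions[0],
        "attributions": dict(attrs),
    })

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)  # Cloud Run's default container port
```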

SHAP Integration (Python)

# Vertex AI prediction with Sampled Shapley explanations
from google.cloud import aiplatform

aiplatform.init(project="naylinnaung", location="us-central1")
endpoint = aiplatform.Endpoint("ENDPOINT_ID")

# Build telemetry inference payload
instances = [{
    "joint_torque": 55.2,
    "vibration_rms": 0.09,
    "cycle_time_ms": 1200,
    "encoder_error": 0.02,
}]

# Request prediction + XAI attributions
response = endpoint.explain(instances=instances)

# Extract Shapley values
prediction = response.predictions[0]
attributions = response.explanations[0].attributions[0]
feature_attrs = dict(attributions.feature_attributions)

# Sort by impact for visualization
sorted_attrs = sorted(
    feature_attrs.items(),
    key=lambda x: abs(x[1]),
    reverse=True,
)
print(sorted_attrs)
# [('vibration_rms', 0.48), ('joint_torque', 0.35), ...]

Global Explainability

Aggregated feature importance across the entire training distribution (n = 12,400 samples). This reveals which physical parameters the model considers most critical for fault prediction system-wide.
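A common way to aggregate local attributions into a global ranking is the mean absolute attribution per feature. A toy sketch using the first three rows of the sample table above as input (the real system aggregates over all 12,400 training samples):

```python
import numpy as np

features = ["joint_torque", "vibration_rms", "cycle_time_ms", "encoder_error"]

# Rows: per-sample Shapley attributions (values from the sample table)
local_attrs = np.array([
    [+0.38, +0.52, -0.03, +0.28],
    [-0.12, -0.22, -0.05, -0.08],
    [+0.30, +0.45, +0.14, -0.04],
])

# Global importance = mean absolute attribution per feature
global_importance = np.abs(local_attrs).mean(axis=0)
ranking = sorted(zip(features, global_importance), key=lambda kv: -kv[1])
print(ranking[0][0])  # → vibration_rms dominates in this toy subset
```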

Counterfactual What-If Scenarios

Explore how changing a single parameter shifts the AI's decision boundary. Click any scenario to auto-populate the telemetry inputs above.

High Torque Only
What happens when joint torque spikes to 72 Nm while other sensors remain normal?
→ FAULT (87.3%) — Torque attribution: +0.55
All Sensors Nominal
Baseline healthy state—all telemetry within manufacturer specifications.
→ NORMAL (98.1%) — All attributions negative
Cycle Time Spike
Cycle time increases to 2200ms (degraded actuator response) — does the model catch it?
→ FAULT (71.6%) — Cycle time attribution: +0.24
Multi-Fault Cascade
Simultaneous degradation: torque, vibration, and encoder drift all elevated.
→ FAULT (99.7%) — All attributions strongly positive
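Each scenario above is a single-feature sweep against the deployed model. A sketch of that pattern with a stand-in scoring function — the real system calls `endpoint.explain`; the weights and nominal values in `fault_score` are purely illustrative:

```python
import math

def fault_score(joint_torque, vibration_rms, cycle_time_ms, encoder_error):
    """Illustrative stand-in for the deployed classifier: a weighted
    deviation above nominal operating values, squashed to (0, 1)."""
    z = (0.08 * (joint_torque - 45.0)
         + 12.0 * (vibration_rms - 0.05)
         + 0.002 * (cycle_time_ms - 1100.0)
         + 8.0 * (encoder_error - 0.03))
    return 1.0 / (1.0 + math.exp(-z))

baseline = dict(joint_torque=42.5, vibration_rms=0.05,
                cycle_time_ms=1100, encoder_error=0.02)

# Sweep one feature while holding the others at the healthy baseline
for torque in (42.5, 55.0, 72.0):
    p = fault_score(**{**baseline, "joint_torque": torque})
    label = "FAULT" if p >= 0.5 else "NORMAL"
    print(f"torque={torque:5.1f} Nm -> {label} ({p:.2f})")
```

The 72 Nm point crosses the decision boundary with everything else nominal, mirroring the "High Torque Only" scenario.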

Open-Source Simulation Pipeline

The full simulation pipeline is open-sourced for community review, extension, and feedback. It enables researchers and engineers to replicate the XAI methodology for their own industrial use cases.

1. Data Generation — Synthetic telemetry data generator with configurable fault injection patterns (bearing wear, encoder drift, thermal runaway).
2. Model Training — Gradient Boosted Trees via scikit-learn with hyperparameter tuning via Optuna; export to the Vertex AI Model Registry.
3. SHAP Explanation — Compute TreeSHAP locally or Sampled Shapley via Vertex AI; generates both local and global explanation artifacts.
4. Visualization — Plotly.js dashboard for interactive exploration; supports force plots, waterfall charts, and beeswarm global views.
5. Deployment — One-click deploy to GCP Cloud Run; includes a Dockerfile, app.yaml, and a CI/CD GitHub Actions workflow.
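Steps 1–2 can be sketched end to end with scikit-learn. The fault-injection ranges below are illustrative assumptions, and Optuna tuning plus Vertex AI export are omitted for brevity:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000

# Step 1: synthetic telemetry with injected faults (illustrative ranges)
y = rng.integers(0, 2, n)                   # 0 = NORMAL, 1 = FAULT
X = np.column_stack([
    rng.normal(40 + 20 * y, 4),             # joint_torque (Nm)
    rng.normal(0.04 + 0.08 * y, 0.015),     # vibration_rms
    rng.normal(1050 + 500 * y, 120),        # cycle_time_ms
    rng.normal(0.02 + 0.07 * y, 0.02),      # encoder_error
])

# Step 2: gradient boosted trees on a held-out split
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
print(f"holdout accuracy: {model.score(X_te, y_te):.3f}")
```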