From noise to value – the role of data in Predictive Maintenance
Impressive Predictive Maintenance dashboards look great in presentations, but in practice, every model is only as good as the data that feed it. According to ARC Advisory Group, up to 87% of industrial AI projects never move beyond the pilot phase, and the average annual losses caused by poor data quality reach USD 12.9 million.
No wonder that during Smart RDM workshops, half of the questions are no longer about algorithms but about… tags: “How do I know that P3_TEMP_A is the same signal as FILLER-TEMP-123?”. This article will show you how to create a “single source of truth” – a logically structured, secure, and real-time data repository that becomes the foundation not only for Predictive Maintenance analytics but for the entire digital factory ecosystem.
At ConnectPoint, since data quality is our top priority, our Smart RDM analytical platform goes beyond dashboards. We implement complete Predictive Maintenance use cases — from tag and data model standardization, through streaming ingestion and model training, to recommendations, alerts, and maintenance tasks with full status tracking — ensuring that predictions translate into real downtime reduction.
What does “Data Maturity” mean and where to start?
Data maturity simply reflects how well an organization manages its data — from collection to decision-making. At the lowest level, data are scattered, inconsistent, and unreliable. At the highest, they are integrated, standardized, up-to-date, and connected to business processes. This maturity determines whether a plant truly benefits from AI or just from “pretty charts.”
Before jumping into predictive models or AI, it’s worth understanding what constitutes a solid data architecture in industry. Each of the following stages represents a step toward organization, standardization, and securing information flowing from devices, sensors, and IT/OT systems.
They can be seen as layers of a “data maturity pyramid” – from diagnosing the current state to integration, codification, quality control, and cybersecurity. Each plays a specific role in building a reliable Predictive Maintenance ecosystem.
1. OT/IT Data Audit – the Factory X-Ray
Data maturity doesn’t start with building a data warehouse but with understanding what you already have. During ConnectPoint’s audits, we first map all devices and parameters — from vibration sensors to process variables — often based on ISA-95 levels or other industry-specific data organization standards.
It often turns out that the biggest gaps are not in hardware but in processes: PLC transmissions are not redundant, SCADA stores history for only some tags, and CMMS includes failure codes unknown to IT.
The audit results in a “heat map” of pain points showing which data gaps threaten production KPIs and where investment brings the fastest ROI.
2. IIoT Layer – the gateway that speaks every dialect
If the audit is an X-ray, the edge gateway is the heart of the new architecture. Its task is to communicate with diverse protocols — from legacy Modbus to modern OPC UA — and transform them into a lightweight, event-driven MQTT stream.
In practice, this means that vibration data from a hydraulic press and temperature from a filler PLC reach the same queue in under five seconds.
With dual buffering, no data are lost even during temporary network failures, while X.509 certificates ensure that no unauthorized packet is injected.
This stage — not the Machine Learning model — determines whether a clean river of data will flow further, or a muddy puddle.
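The gateway behavior described above can be sketched in a few lines. This is a minimal, illustrative model (not Smart RDM’s actual implementation): readings from different protocols are normalized into one lightweight event format, and a local ring buffer stands in for the dual buffering that protects against network failures. All class and field names are assumptions for the sketch.

```python
import json
import time
from collections import deque

class EdgeGateway:
    """Illustrative edge gateway: normalizes readings from different
    protocols into one event format and buffers locally when the
    broker is unreachable (all names here are hypothetical)."""

    def __init__(self, buffer_size=10_000):
        # Local ring buffer: the "dual buffering" safety net.
        self.buffer = deque(maxlen=buffer_size)
        self.published = []  # stands in for the MQTT queue

    def normalize(self, source, tag, value, unit, ts=None):
        """Turn a raw reading into a lightweight, event-driven message."""
        return {
            "source": source,   # e.g. "modbus" or "opcua"
            "tag": tag,
            "value": value,
            "unit": unit,
            "ts": ts or time.time(),
        }

    def publish(self, event, broker_up=True):
        """Publish to the queue, or buffer the event if the link is down."""
        payload = json.dumps(event)
        if broker_up:
            self.flush()  # drain anything buffered during the outage first
            self.published.append(payload)
        else:
            self.buffer.append(payload)

    def flush(self):
        while self.buffer:
            self.published.append(self.buffer.popleft())

gw = EdgeGateway()
gw.publish(gw.normalize("modbus", "PRESS_VIB_X", 2.1, "mm/s"), broker_up=False)
gw.publish(gw.normalize("opcua", "FILLER_TEMP", 37.0, "degC"), broker_up=True)
print(len(gw.published))  # → 2: the buffered event was flushed first
```

In a production gateway the `published` list would be a real MQTT client (for example Eclipse Paho) over TLS with X.509 client certificates, and the buffer would persist to disk so a power cycle does not lose it.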
3. Hot / Warm / Cold Data Repository – three temperatures, three purposes
Once the stream flows, it needs proper storage. The Hot / Warm / Cold architecture works like a boiler, thermos, and refrigerator:
- Hot stores “boiling” data from the last 48–72 hours – used for alarms and operator dashboards.
- Warm holds averaged minutes or hours, ideal for shift reports where trends matter more than micro-anomalies.
- Cold – the cheapest layer – archives months and years of history for ML models and audits.
An ARC study (“Industrial Data Infrastructure 2025”) showed that a three-tier retention model reduces total storage costs by about 30%, because:
- The freshest, most frequently used data stay on fast (but expensive) NVMe storage.
- Older data are compressed and automatically offloaded to cheaper object storage.
A side effect is faster queries: in the Hot layer, average response time drops from 220 ms to 140 ms, enabling near real-time alerts and smoother operator screens.
4. Signal Codification – the lingua franca of your factory
Even perfectly retained, millisecond-accessible data are useless if the data model doesn’t clearly specify what each signal measures and in which unit.
That’s why the ISA-5.1 standard (Instrumentation Symbols and Identification) promotes a clear Object–Section–Sensor–Unit scheme.
For example, LN3_Filler_Press_bar immediately tells you this is pressure (Press) measured in bar, at the filler station (Filler) on line 3.
Without this clarity, engineers may compare unrelated data, and algorithms lose context during model training.
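A naming scheme like this is machine-checkable, which is what makes it valuable: every incoming tag can be parsed and rejected if it does not conform. A minimal sketch, assuming the Object–Section–Sensor–Unit pattern from the example above (the measurement dictionary is hypothetical):

```python
import re

# Hypothetical dictionary mapping sensor codes to physical quantities.
QUANTITIES = {"Press": "pressure", "Temp": "temperature", "Vib": "vibration"}

# Object (line) _ Section _ Sensor _ Unit, e.g. LN3_Filler_Press_bar
TAG_PATTERN = re.compile(
    r"^(?P<obj>LN\d+)_(?P<section>[A-Za-z]+)_(?P<sensor>[A-Za-z]+)_(?P<unit>\w+)$"
)

def parse_tag(tag):
    """Split an Object_Section_Sensor_Unit tag into named parts,
    rejecting anything that breaks the scheme."""
    m = TAG_PATTERN.match(tag)
    if m is None:
        raise ValueError(f"Tag {tag!r} does not follow the naming scheme")
    parts = m.groupdict()
    parts["quantity"] = QUANTITIES.get(parts["sensor"], "unknown")
    return parts

print(parse_tag("LN3_Filler_Press_bar"))
# → {'obj': 'LN3', 'section': 'Filler', 'sensor': 'Press',
#    'unit': 'bar', 'quantity': 'pressure'}
```

A non-conforming legacy tag such as `P3_TEMP_A` fails the parse immediately, which is exactly the point: the ambiguity is caught at ingestion, not discovered during model training.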
The data model, part of the Central Process Data Repository, harmonizes data and provides business context. With a unified production database, operators, planners, and data scientists finally speak the same language, and every change — who, when, why — is logged and auditable.
As a result, Predictive Maintenance models learn not from mysterious “variables X1, X2,” but from clearly described signals tied to real machines and processes.
5. Process Map – the relationships algorithms need
Tag codification tells us what we measure; the process map tells us where and how everything is connected.
Imagine a plant layout showing not only each sensor’s location but also which line elements interact — both mechanically and energetically.
The Smart RDM data model follows this principle: it defines what belongs to what (line → module → element → sensor) and how components cooperate (e.g., pump → motor).
In Smart RDM, this network of dependencies is stored in the Central Data Repository: each record represents a physical asset, and each attribute — a real measurement.
When a PdM model detects increased pump vibration, it doesn’t raise a blind alert. It checks the map — how many motors drive the pump, the coupling order, and which component could cause resonance.
If dependencies point to the motor, the system assigns the event to the correct asset and creates a maintenance work order exactly where needed.
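The traversal described above can be sketched as a walk over two small relations: the containment hierarchy (line → module → element → sensor) and the functional dependencies (which component drives which). The asset names and map structure below are purely illustrative, not the Smart RDM schema:

```python
# Toy asset map: containment hierarchy plus functional "driven by"
# relations, as in the pump → motor example (all names illustrative).
HIERARCHY = {
    "LN3": ["Filler"],
    "Filler": ["Pump_P3"],
    "Pump_P3": ["VIB_SENSOR_7"],
}
DRIVEN_BY = {"Pump_P3": ["Motor_M1"]}  # the motor drives the pump

def asset_for_sensor(sensor, hierarchy=HIERARCHY):
    """Find the physical asset a sensor is mounted on."""
    for parent, children in hierarchy.items():
        if sensor in children:
            return parent
    return None

def root_cause_candidates(sensor):
    """Alert on a sensor → owning asset → upstream components driving it."""
    asset = asset_for_sensor(sensor)
    return [asset] + DRIVEN_BY.get(asset, [])

print(root_cause_candidates("VIB_SENSOR_7"))  # → ['Pump_P3', 'Motor_M1']
```

With this lookup, an elevated vibration reading is no longer an anonymous alarm: the work order can be attached to the pump or, if the dependency check implicates it, to the motor upstream.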
This “intelligent map” enables:
- Fewer false alarms — the system understands how the machine works.
- Faster root cause diagnosis — from hours to minutes.
- Better model learning — seeing how replacing one component affects others.
The result: Predictive Maintenance stops being a generator of anonymous alerts and becomes a technical advisor pointing to the most probable cause of failure.
6. Data Quality Validation – Real-Time Control
Where data streams arrive at thousands of records per second, the cost of an undetected error compounds with every second it goes uncaught.
Smart RDM enables real-time “auditing” of process data and assessing its analytical usefulness. Below are some examples of input data quality tests:
| Test | What It Checks | Example Reaction |
| --- | --- | --- |
| Unit Mismatch | Does the unit match the tag dictionary? | If TEMP_C suddenly arrives as 310 (Kelvin), the system converts to 37 °C, logs incident DQ-001, and notifies the data steward. |
| Dead-Band | Has the signal remained static longer than allowed (e.g., 10 min)? | If the filler flowmeter reads 0.00 l/min for 600 s, it triggers a “Suspect Sensor Freeze” alert for maintenance. |
| Timestamp Drift | Does the sensor timestamp match the NTP server (± 500 ms tolerance)? | When deviation > 0.5 s, the gateway resynchronizes and flags the record as “stale.” |
| Range & Spike | Is the value within the process window and without spikes > 5 × σ? | A sudden vibration jump from 2 mm/s to 20 mm/s triggers a “Possible Bearing Failure” notification. |
Each test can run as a microservice, ensuring high throughput without compromising analytical performance.
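Three of the tests from the table can be sketched as small, stateless functions, which is what makes them easy to run as independent microservices. The thresholds and the incident code are the example values from the table; the conversion and flagging logic is an illustrative assumption, not Smart RDM’s actual rule set:

```python
import statistics

def check_unit(expected_unit, value, unit):
    """Unit-mismatch test: convert Kelvin to Celsius when a degC tag
    arrives in K, and return an incident note (rule is illustrative)."""
    if expected_unit == "degC" and unit == "K":
        return round(value - 273.15, 2), "DQ-001: unit converted K -> degC"
    return value, None

def check_dead_band(values, max_static_s=600, sample_period_s=1):
    """Dead-band test: flag a signal that has been frozen at one value
    for longer than max_static_s seconds."""
    if len(values) * sample_period_s >= max_static_s and len(set(values)) == 1:
        return "Suspect Sensor Freeze"
    return None

def check_spike(history, value, sigma_limit=5):
    """Range & spike test: flag values more than sigma_limit standard
    deviations away from the recent mean."""
    mean = statistics.mean(history)
    sigma = statistics.pstdev(history) or 1e-9  # avoid division issues
    if abs(value - mean) > sigma_limit * sigma:
        return "Possible anomaly"
    return None

print(check_unit("degC", 310.0, "K"))   # → (36.85, 'DQ-001: unit converted K -> degC')
print(check_dead_band([0.0] * 600))     # → Suspect Sensor Freeze
print(check_spike([2.0, 2.1, 1.9, 2.0], 20.0))  # → Possible anomaly
```

Because each function takes only the data it needs and holds no shared state, the checks can be scaled out horizontally and applied per-tag without slowing the main analytical pipeline.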
Results are well-documented: IBM Research found that automatic unit and range validation reduces erroneous records by up to 80%, while McKinsey & Co. reported that implementing dead-band and drift-time tests cuts false process alarms by 25–30%.
Thanks to these mechanisms, predictive models like those in Smart RDM learn from premium-grade data, allowing maintenance teams to focus on causes — not noise filtering.
7. Data Governance and Cybersecurity – who holds the vault key?
The NIS2 Directive (Network & Information Security) from 2023 requires critical service operators — including industrial plants — to maintain a complete OT asset catalog (machines, sensors, controllers, networks) and documented Change Management procedures.
This means every PLC configuration, tag name, or alarm threshold change must leave a trace in the system and be reproducible during security audits.
Smart RDM enables RACI-based role separation, for example:
- Responsible – Line Manager
- Accountable – Maintenance Manager
- Consulted – CISO
- Informed – Operators
The Cost of Inconsistent Tagging and Data Models – Profit & Loss Account
Before looking at numbers, understand that a single naming or unit error in one tag can “infect” the entire decision chain — from shop-floor alerts to board-level reports.
The following table shows how lack of standardization translates into real financial impact:
| Area | Impact of Poor Data Quality | Average Cost / Risk |
| --- | --- | --- |
| AI/ML Projects | 87% never reach production | Lost CAPEX, lost revenues |
| Operational Finance | USD 12.9 million annual loss per company | Excess inventory, delayed decisions |
| Macro Scale (U.S.) | USD 3.1 trillion cost of bad data | Higher supply-chain costs |
Order in Data Is the Fastest Path to ROI in Predictive Maintenance
When signals have clear names, models understand dependencies, and data acquisition runs flawlessly, Predictive Maintenance stops being a black box and starts delivering tangible value.
The company saves on downtime, shortens reaction times, and builds a digital organizational memory.
If your factory still loses data in a maze of protocols and spreadsheets — start with an audit. Within three months, you can have a solid foundation where Predictive Maintenance becomes not a gadget, but a competitive advantage.
Want to See How You Compare to the Best?
Download the Comprehensive Predictive Maintenance Implementation Guide and bring order to your data — from unique tag identification to OT segmentation — and turn chaos into actionable intelligence.