Input data formats
This section describes the required and optional input data formats.
Time series data format
The primary input is time series data from process sensors, provided as comma-separated value (CSV) files.
Format requirements:
The first row must be a header line, with the first column label being Time.
The first column must contain timestamps in UNIX time (seconds since epoch).
Remaining columns contain raw measurement data for each process tag – no normalization is needed, as this is handled automatically during preprocessing.
Example:
Time,TAG_001,TAG_002,TAG_003
1546300800,45.2,101.3,7.81
1546300860,45.5,101.1,7.79
1546300920,45.3,101.4,7.82
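A file can be checked against the format requirements above before it is submitted. The following is a minimal sketch using only the Python standard library. The function name validate_time_series is hypothetical and not part of the tool. The sketch also assumes that every cell must be numeric and non-empty, which this section does not state explicitly.

```python
import csv
import io


def validate_time_series(text):
    """Check the format rules above; return (header, rows) or raise ValueError.

    Hypothetical helper for illustration only, not part of the tool.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    # Rule 1: the first row is a header whose first label is "Time".
    if not header or header[0].strip() != "Time":
        raise ValueError("first column label must be 'Time'")
    rows = []
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # ignore completely blank lines
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
        try:
            # Rule 2: the first column is a UNIX timestamp in seconds.
            timestamp = float(row[0])
            # Rule 3: remaining columns are raw measurements (assumed numeric).
            values = [float(v) for v in row[1:]]
        except ValueError as exc:
            raise ValueError(f"line {line_no}: non-numeric value") from exc
        rows.append((timestamp, values))
    return header, rows


sample = """Time,TAG_001,TAG_002,TAG_003
1546300800,45.2,101.3,7.81
1546300860,45.5,101.1,7.79
1546300920,45.3,101.4,7.82
"""
header, rows = validate_time_series(sample)
```

To check a file on disk, read its contents into a string first and pass that string to the function.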
Descriptive labels
Optional descriptive labels can be associated with each process data tag.
Provide these as a CSV file named tag_descriptions.csv with two columns:
First column: Tag name
Second column: Description
Example:
Tag name,Description
TAG_001,Reactor temperature
TAG_002,Feed flow rate
TAG_003,Product pH
These descriptions are used in output reports and graph labels to make results more readable.
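To illustrate how such a file maps tags to readable labels, the sketch below loads the two columns into a dictionary. The function name load_tag_descriptions is hypothetical, and falling back to the raw tag name when no description exists is an assumption about reasonable behavior, not a documented feature of the tool.

```python
import csv
import io


def load_tag_descriptions(text):
    """Return a {tag name: description} dict from tag_descriptions.csv content.

    Hypothetical helper for illustration only, not part of the tool.
    """
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the "Tag name,Description" header row
    # Keep only rows that have both a tag name and a description.
    return {row[0].strip(): row[1].strip() for row in reader if len(row) >= 2}


sample_descriptions = """Tag name,Description
TAG_001,Reactor temperature
TAG_002,Feed flow rate
TAG_003,Product pH
"""
descriptions = load_tag_descriptions(sample_descriptions)

# Assumed fallback: use the raw tag name when no description is available.
label_known = descriptions.get("TAG_001", "TAG_001")
label_unknown = descriptions.get("TAG_004", "TAG_004")
```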
Connectivity information
Optionally, you can constrain the analysis to consider only specific connections between process elements. This is useful when plant topology information is available.
Note
Adding connectivity information is not always beneficial. In some cases it can produce poorer root cause analysis, because higher-order connections may play an important role in amplifying a node’s centrality score. Use this option with care and compare results with and without connectivity constraints.