API Reference

faultmap package

faultmap.config_setup

Setup functions used to read configuration files.

class faultmap.config_setup.Locations(data_loc, config_loc, save_loc, infodynamics_loc)[source]

Directories used for data, configuration, results, and JIDT.

Parameters:

data_loc (Path)
config_loc (Path)
save_loc (Path)
infodynamics_loc (Path)

data_loc: Path: Alias for field number 0

config_loc: Path: Alias for field number 1

save_loc: Path: Alias for field number 2

infodynamics_loc: Path: Alias for field number 3

class faultmap.config_setup.CaseSetup(save_loc, case_config_dir, case_dir, infodynamics_loc)[source]

Directories resolved for a specific case run.

Parameters:

save_loc (Path)
case_config_dir (Path)
case_dir (Path)
infodynamics_loc (Path)

save_loc: Path: Alias for field number 0

case_config_dir: Path: Alias for field number 1

case_dir: Path: Alias for field number 2

infodynamics_loc: Path: Alias for field number 3

faultmap.config_setup.ensure_existence(location, make=True)[source]

Parameters:

location (str | Path)
make (bool)

Return type:

Path

faultmap.config_setup.get_locations(mode='cases')[source]

Gets all required directories related to the specified mode.

TODO: Remove the need for this by using proper test fixtures

Parameters:: mode (string) – Either ‘test’ or ‘cases’. Specifies whether the test or user configurable cases directories should be set. Test directories are read from test_config.json which is bundled with the code, while cases directories are read from case_config.json which must be created by the user.
Returns:: A named tuple containing data_loc, config_loc, save_loc, and infodynamics_loc paths.
Return type:: Locations

faultmap.config_setup.run_setup(mode, case)[source]

Gets all required directories from the case configuration file.

Parameters:

mode (Literal['test', 'tests', 'cases']) – Either ‘test’ or ‘cases’. Specifies whether the test or user configurable cases directories should be set. Test directories are read from test_config.json which is bundled with the code, while cases directories are read from case_config.json which must be created by the user.
case (str) – The name of the case that is to be run. Points to dictionary in either test or case config files.

Returns:

CaseSetup named tuple containing save_loc, case_config_dir, case_dir, and infodynamics_loc paths.

Return type:

CaseSetup

faultmap.data_processing

Data processing support tasks.

faultmap.data_processing.shuffle_data(input_data)[source]

Returns a (seeded) randomly shuffled array of data. The data input needs to be a two-dimensional numpy array.

Parameters:: input_data (ndarray[tuple[Any, ...], dtype[_ScalarT]])
Return type:: ndarray[tuple[Any, …], dtype[_ScalarT]]

faultmap.data_processing.gen_iaaft_surrogates(data, iterations)[source]

Generates iterative amplitude adjusted Fourier transform (IAAFT) surrogates.

Parameters:

data (ndarray[tuple[Any, ...], dtype[_ScalarT]])
iterations (int)

Return type:

ndarray[tuple[Any, …], dtype[_ScalarT]]

class faultmap.data_processing.ResultReconstructionData(mode, case)[source]

Creates a data object from files and/or function definitions for use in array creation methods.

Parameters:

mode (Literal['test', 'tests', 'cases'])
case (str)

setup_scenario(scenario)[source]

Retrieves data particular to each scenario for the case being investigated.

Parameters:: scenario (str)

faultmap.data_processing.process_aux_file(filename, bias_correct=True, mi_scale=False, allow_neg=False)[source]

Processes an auxiliary file and returns a list of affected_vars, the weight_array, and the relative significance weight array.

Parameters:

filename (string) – Path to the auxiliary file to process.
allow_neg (bool) – If true, allows negative values in the final weight arrays; otherwise sets them to zero.
bias_correct (bool) – If true, subtracts the mean of the null distribution from the final value in the weight array.

faultmap.data_processing.create_arrays(data_dir, variables, bias_correct, mi_scale, generate_diffs)[source]

data_dir is the location of the auxiliary data and weights folders for the specific case that is under investigation

variables is the list of variables

Parameters:: data_dir (Path)

faultmap.data_processing.create_signtested_directionalarrays(datadir, writeoutput)[source]

Checks whether the directional weight arrays have corresponding absolute positive entries, writes another version with zeros if absolutes are negative.

datadir is the location of the auxdata and weights folders for the specific case that is under investigation

tsfilename is the file name of the original time series data file used to generate each case and is only used for generating a list of variables

faultmap.data_processing.extract_trends(datadir, writeoutput)[source]

datadir is the location of the weight_array and delay_array folders for the specific case that is under investigation

tsfilename is the file name of the original time series data file used to generate each case and is only used for generating a list of variables

faultmap.data_processing.result_reconstruction(mode, case)[source]

Reconstructs the weight_array and delay_array for different weight types from data generated by run_weightcalc process.

WIP: For transient cases, generates difference arrays between boxes.

The results are written to the same folders where the files are found.

Parameters:

mode (Literal['test', 'tests', 'cases'])
case (str)

Return type:

None

faultmap.data_processing.trend_extraction(mode, case, write_output)[source]

Extracts dynamic trend of weights and delays out of weight_array and delay_array results between multiple boxes generated by the run_createarrays process for transient cases.

The results are written to the trends results directory.

Parameters:

mode (Literal['test', 'tests', 'cases'])
case (str)
write_output (bool)

Return type:

None

faultmap.data_processing.writecsv(filename, items, header=None)[source]

Write CSV directly

Parameters:

filename (str | Path)
items (list | ndarray[tuple[Any, ...], dtype[_ScalarT]])
header (list[str] | ndarray[tuple[Any, ...], dtype[_ScalarT]] | None)

Return type:

None

faultmap.data_processing.bandgap(min_freq, max_freq, vardata)[source]: Bandgap filter based on FFT/IFFT concatenation

faultmap.data_processing.bandgapfilter_data(raw_tsdata, normalised_tsdata, variables, low_freq, high_freq, saveloc, case, scenario)[source]: Bandgap filter data between the specified high and low frequencies. Also writes filtered data to standard format for easy analysis in other software, for example TOPCAT.

faultmap.data_processing.subtract_mean(inputdata_raw)[source]

Subtracts mean from input data.

Parameters:: inputdata_raw (ndarray[tuple[Any, ...], dtype[_ScalarT]])
Return type:: ndarray[tuple[Any, …], dtype[_ScalarT]]

faultmap.data_processing.read_connectionmatrix(connection_loc)[source]

Imports the connection scheme for the data. The format of the CSV file should be:

empty space, var1, var2, etc… (first row)
var1, value, value, value, etc… (second row)
var2, value, value, value, etc… (third row)
etc…

value = 1 if column variable points to row variable (causal relationship)
value = 0 otherwise

Parameters:: connection_loc (str | Path)
Return type:: tuple[ndarray[tuple[Any, …], dtype[_ScalarT]], list[str]]
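
To make the connection-matrix layout concrete, here is a minimal sketch that parses a small file of this form with the standard library. The helper name parse_connection_csv and the three-variable example are hypothetical. Nested lists stand in for the NumPy array that read_connectionmatrix returns.

```python
import csv
import io

# Hypothetical three-variable connection file in the documented layout.
EXAMPLE_CSV = """,var1,var2,var3
var1,0,0,0
var2,1,0,0
var3,0,1,0
"""


def parse_connection_csv(text):
    """Return (matrix, variables) parsed from connection-matrix CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    variables = rows[0][1:]  # skip the empty top-left cell
    matrix = [[int(value) for value in row[1:]] for row in rows[1:]]
    return matrix, variables


matrix, variables = parse_connection_csv(EXAMPLE_CSV)

# matrix[row][col] == 1 means the column variable points to the row variable.
edges = [
    (variables[col], variables[row])
    for row in range(len(variables))
    for col in range(len(variables))
    if matrix[row][col] == 1
]
```

In this example the edge list reads as (cause, effect) pairs: var1 drives var2, and var2 drives var3.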

faultmap.data_processing.read_scale_limits(scaling_loc)[source]

Imports the scale limits for the data. The format of the CSV file should be:

var, low, nominal, high, vartype (first row)
var1, float, float, float, [‘D’, ‘S’] (second row)
var2, float, float, float, [‘D’, ‘S’] (third row)
etc…

Type ‘D’ indicates a disturbance variable, for which the maximum deviation will be used.
Type ‘S’ indicates a state variable, for which the minimum deviation will be used.

Parameters:: scaling_loc (str | Path)

faultmap.data_processing.read_biasvector(biasvector_loc)[source]: Imports the bias vector for faultmap purposes. The format of the CSV file should be: var1, var2, etc … (first row) bias1, bias2, etc … (second row)

faultmap.data_processing.read_header_values_datafile(location)[source]

This method reads a CSV data file of the form:

header, header, header, etc… (first row)
value, value, value, etc… (second row)
etc…

Parameters:: location (str | Path)
Return type:: tuple[ndarray[tuple[Any, …], dtype[_ScalarT]], list[str]]

faultmap.data_processing.read_matrix(matrix_loc)[source]

This method reads a matrix scheme for a specific scenario.

Might need to pad matrix with zeros if it is non-square

Parameters:: matrix_loc (str | Path)
Return type:: ndarray[tuple[Any, …], dtype[_ScalarT]]

faultmap.data_processing.build_graph(variables, gain_matrix, connections, bias_vector)[source]

Builds a directed graph using the given variables, gain matrix, connections, and bias vector.

Parameters:

variables (list) – A list of variable names.
gain_matrix (numpy.ndarray) – A 2D array of gains.
connections (numpy.ndarray) – A 2D array of connections.
bias_vector (numpy.ndarray) – A 1D array of biases.

Returns:

A directed graph with weights and biases.

Return type:

networkx.DiGraph

faultmap.data_processing.rank_backward(variables, gainmatrix, connections, biasvector, dummyweight, dummycreation)[source]

This method adds a unit gain node to all nodes with an out-degree of 1 in order for the relative scale to be retained. Therefore all nodes with pointers should have 2 or more edges pointing away from them.

It uses the number of dummy variables to construct these gain, connection and variable name matrices.

faultmap.data_processing.get_box_endates(clean_df, window, overlap, freq)[source]

Gets the end dates of boxes from a dataframe that are continuous over the window and guaranteed to have a maximum overlap.

clean_df: clean dataframe with nan assigned to all bad data
window: size of window in steps at desired frequency
overlap: size of minimum overlap desired in steps at desired frequency

faultmap.data_processing.get_continuous_boxes(clean_df, window, overlap, freq)[source]

Splits a DataFrame into continuous boxes of a specified window size and overlap.

Parameters:

clean_df (pandas.DataFrame) – The DataFrame to split.
window (int) – Window size in number of time steps.
overlap (float) – Overlap between windows as a fraction.
freq (str) – Frequency of the time series, e.g. ‘1H’.

Returns:

(array_boxes, boxdates) where array_boxes is a list of arrays per box and boxdates is a list of arrays with start/end timestamps per box.

Return type:

tuple

faultmap.data_processing.split_time_series_data(input_data, sample_rate, box_size, box_num)[source]

Splits the input data into arrays useful for analyzing the change of weights over time.

Parameters:

input_data (numpy.ndarray) – A numpy array containing values for a single variable after sub-sampling.
sample_rate (float) – The rate of sampling in time units (after sub-sampling).
box_size (int) – The size of each returned dataset in time units.
box_num (int) – The number of boxes that need to be analyzed.

Returns:

A list of numpy arrays, where each array represents a box of data.

Return type:

list

Notes

Boxes are evenly distributed over the provided dataset. The boxes will overlap if box_size * box_num is more than the simulated time, and will have spaces between them if it is less.
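
The note on overlapping and spaced boxes can be illustrated with a small sketch. It assumes that sample_rate is the time between samples and that box start positions are spread linearly between the first and last valid start. Both are assumptions about the behaviour, not a transcription of the library code. The helper name split_into_boxes is hypothetical, and a plain list stands in for the NumPy array.

```python
def split_into_boxes(samples, sample_rate, box_size, box_num):
    """Split a sequence into box_num evenly spread boxes (illustrative only)."""
    box_len = int(round(box_size / sample_rate))  # samples per box
    if box_len > len(samples):
        raise ValueError("box_size exceeds the available data")
    if box_num == 1:
        starts = [0]
    else:
        # Spread start indices evenly between the first and last valid start.
        last_start = len(samples) - box_len
        starts = [round(i * last_start / (box_num - 1)) for i in range(box_num)]
    return [samples[start:start + box_len] for start in starts]


signal = list(range(100))

# 4 boxes of 40 time units cover more than 100 units, so they overlap.
overlapping = split_into_boxes(signal, sample_rate=1.0, box_size=40, box_num=4)

# 4 boxes of 10 time units cover less than 100 units, so gaps appear.
spaced = split_into_boxes(signal, sample_rate=1.0, box_size=10, box_num=4)
```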

faultmap.data_processing.calc_signal_entropy(var_data, weight_calc_data, estimator='kernel')[source]

Calculates single signal differential entropies by making use of the JIDT continuous box-kernel implementation.

Parameters:

weight_calc_data (WeightCalcData)
estimator (Literal['gaussian', 'kernel', 'kozachenko'])

Return type:

float

faultmap.data_processing.vectorselection(data, timelag, sub_samples, k=1, l=1)[source]

Generates sets of vectors from tags time series data for calculating transfer entropy.

For notation references see Shu2013.

Takes into account the time lag (number of samples between vectors of the same variable).

In this application the prediction horizon (h) is set equal to the time lag.

The first vector in the data array should be the samples of the variable to be predicted (x), while the second vector should be the samples of the variable used to make the prediction (y).

sub_samples is the number of samples in the dataset used to calculate the transfer entropy between two vectors and must satisfy sub_samples <= samples.

The required number of samples is extracted from the end of the vector. If the vector is longer than the number of samples specified plus the desired time lag, then the remainder of the data will be discarded.

k refers to the dimension of the historical data to be predicted (x)

l refers to the dimension of the historical data used to do the prediction (y)

Parameters:

data (ndarray[tuple[Any, ...], dtype[_ScalarT]])
timelag (int)
sub_samples (int)
k (int)
l (int)

Return type:

tuple[ndarray[tuple[Any, …], dtype[_ScalarT]], ndarray[tuple[Any, …], dtype[_ScalarT]], ndarray[tuple[Any, …], dtype[_ScalarT]]]
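
For the simplest case, k = l = 1, the selection described for vectorselection can be sketched as follows. The helper name select_vectors and the order of the returned vectors are assumptions. Plain lists stand in for NumPy arrays so that the indexing is easy to follow.

```python
def select_vectors(x, y, timelag, sub_samples):
    """Build (x_pred, x_hist, y_hist) for k = l = 1 (illustrative only)."""
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same length")
    if sub_samples + timelag > n:
        raise ValueError("not enough samples for this time lag")
    # Values to be predicted: the last sub_samples entries of x.
    x_pred = x[n - sub_samples:]
    # Histories: the same window moved back by the time lag (= horizon h).
    x_hist = x[n - sub_samples - timelag:n - timelag]
    y_hist = y[n - sub_samples - timelag:n - timelag]
    return x_pred, x_hist, y_hist


x = [10, 11, 12, 13, 14, 15, 16, 17]
y = [20, 21, 22, 23, 24, 25, 26, 27]
x_pred, x_hist, y_hist = select_vectors(x, y, timelag=2, sub_samples=4)
```

Samples that precede the history window are discarded, which matches the statement that the required samples are taken from the end of the vector.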

faultmap.datagen

Generates various test and demo data sets.

faultmap.datagen.connection_matrix_maker(n_dims)[source]

Parameters:: n_dims (int)
Return type:: Callable[[], tuple[list[str], ndarray]]

faultmap.datagen.connectionmatrix_2x2(): Generates a 2x2 connection matrix for use in test.

faultmap.datagen.connectionmatrix_4x4(): Generates a 4x4 connection matrix for use in test.

faultmap.datagen.connectionmatrix_5x5(): Generates a 5x5 connection matrix for use in test.

faultmap.datagen.seed_random(method, seed, samples)[source]

Set random seed.

Parameters:

method (Callable)
seed (int)
samples (int)

Return type:

ndarray[tuple[Any, …], dtype[_ScalarT]]

faultmap.datagen.seed_randn(seed, samples)

Set random seed.

Parameters:

seed (int)
samples (int)

Return type:

ndarray[tuple[Any, …], dtype[_ScalarT]]

faultmap.datagen.seed_rand(seed, samples)

Set random seed.

Parameters:

seed (int)
samples (int)

Return type:

ndarray[tuple[Any, …], dtype[_ScalarT]]

faultmap.datagen.autoreg_gen(params)[source]

Generates an autoregressive set of vectors.

A constant seed is used for testing comparison purposes.

Parameters:: params (list[int | float])
Return type:: ndarray[tuple[Any, …], dtype[_ScalarT]]

faultmap.datagen.delay_gen(params)[source]

Generates a normally distributed random data vector and a pure delay companion.

Parameters:: params (list) – List with the first entry being the sample length of the returned signals and the second entry the delay between them.
Returns:: data – Array containing the generated signals arranged in columns.
Return type:: numpy.ndarray
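
A pure-delay pair of this kind can be sketched with the standard library alone. The helper name delayed_pair, the seed argument, and the column order (cause first, delayed copy second) are assumptions made for illustration. A list of rows stands in for the returned NumPy array.

```python
import random


def delayed_pair(samples, delay, seed=0):
    """Return rows [cause, affected] where affected lags cause by delay."""
    rng = random.Random(seed)
    # Draw enough values so both signals have the requested length.
    base = [rng.gauss(0.0, 1.0) for _ in range(samples + delay)]
    cause = base[delay:]       # cause[t] = base[t + delay]
    affected = base[:samples]  # affected[t] = base[t] = cause[t - delay]
    return [[c, a] for c, a in zip(cause, affected)]


data = delayed_pair(samples=100, delay=5)
cause = [row[0] for row in data]
affected = [row[1] for row in data]
```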

faultmap.datagen.random_gen(params, n=2)[source]

Generates n independent random data vectors.

Parameters:

params (list[int])
n (int)

Return type:

ndarray[tuple[Any, …], dtype[_ScalarT]]

faultmap.datagen.autoreg_datagen(delay, timelag, samples, sub_samples, k=1, l=1)[source]

Generates autoreg data for a specific timelag (equal to prediction horizon) for a set of autoregressive data.

sub_samples is the number of samples in the dataset used to calculate the transfer entropy between two vectors (taken from the end of the dataset) and must satisfy sub_samples <= samples.

Currently only supports k = 1; l = 1

You can search through a set of time lags in an attempt to identify the original delay. The transfer entropy should have a maximum value when timelag = delay used to generate the autoregressive dataset.

faultmap.datagen.sinusoid_shift_gen(params, period=100, noise_amplitude=0.1, n_signals=5, add_noise=False)[source]

Generates sinusoid signals, optionally with added uniform noise. The signals are shifted by a quarter period.

Parameters:

params (list) – List with the first (and only) entry being the sample length of the returned signals.
period (int, default=100) – The period of the sinusoid in terms of samples.
noise_amplitude (float, default=0.1) – A multiplier for mean-centred uniform noise to be added to the signal. The amplitude of the sine is unity.
n_signals (int, default=5) – How many signals to return.
add_noise (bool, default=False) – If True, noise is added to the sinusoidal signals.

Returns:

data – Array containing the generated signals arranged in columns.

Return type:

numpy.ndarray
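
As an illustration of the quarter-period shift, the sketch below builds noise-free sinusoids in which each signal leads the previous one by a quarter period. The direction of the shift and the helper name shifted_sinusoids are assumptions. Noise is left out to keep the example deterministic.

```python
import math


def shifted_sinusoids(samples, period=100, n_signals=5):
    """Return rows of n_signals unit sinusoids, each shifted by period / 4."""
    quarter = period / 4
    return [
        [
            math.sin(2 * math.pi * (t + j * quarter) / period)
            for j in range(n_signals)
        ]
        for t in range(samples)
    ]


data = shifted_sinusoids(samples=200)

# Column 1 at time t equals column 0 a quarter period (25 samples) later.
first_row = data[0]
```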

faultmap.datagen.sinusoid_gen(params, period=100, noise_amplitude=1.0)[source]

Generates sinusoid signals, optionally with added uniform noise. The signals are shifted by a quarter period.

Parameters:

params (list) – List with the first (and only) entry being the sample length of the returned signals.
period (int, default=100) – The period of the sinusoid in terms of samples.
noise_amplitude (float, default=1.0) – A multiplier for mean-centred uniform noise to be added to the signal. The amplitude of the sine is unity.

Returns:

data – Array containing the generated signals arranged in columns.

Return type:

numpy.ndarray

faultmap.datagen.firstorder_gen(params, period=0.01, noiseamp=1.0)[source]: Simple first order transfer function affected variable with sinusoid cause.

faultmap.infodynamics

Methods used in the calculation of transfer entropy. A JIDT wrapper.

faultmap.infodynamics.check_jvm(infodynamics_path)[source]

Check if the Java Virtual Machine (JVM) is started and start it if it is not.

Parameters:: infodynamics_path (str) – The file path to the infodynamics jar file.
Returns:: None

faultmap.infodynamics.setup_te(infodynamics_path, method, **parameters)[source]

Prepares the teCalc class of the Java Infodynamics Toolkit (JIDT) in order to calculate transfer entropy according to the kernel or Kraskov estimator method. Also supports discrete transfer entropy calculation.

Parameters:

infodynamics_path (Path)
method (Literal['kernel', 'kraskov', 'discrete'])

faultmap.infodynamics.calc_te(infodynamics_path, calc_method, affected_data, causal_data, **parameters)[source]

Calculates the transfer entropy for a specific time lag (equal to prediction horizon) between two sets of time series data.

This implementation makes use of the infodynamics toolkit: https://jlizier.github.io/jidt/

The transfer entropy should have a maximum value when time lag = delay used to generate an autoregressive dataset, or will otherwise indicate the dead time between data indicating a causal relationship.

faultmap.infodynamics.setup_mi_calculator(infodynamics_path, method, **parameters)[source]

Instantiates a mutual information class of the Java Infodynamics Toolkit (JIDT) to calculate mutual information according to the kernel or Kraskov estimator method. Also supports discrete mutual information calculation.

The Kraskov method is the recommended method and also provides methods for auto-embedding. The max corr AIS auto-embedding method will be enabled as the default.

Parameters:

infodynamics_path (Path)
method (Literal['kernel', 'kraskov', 'discrete'])

faultmap.infodynamics.setup_entropy_calculator(infodynamics_path, estimator='kernel', kernel_bandwidth=0.1, multivariate=False)[source]

Instantiates an entropy calculator from a class of the Java Infodynamics Toolkit (JIDT) to calculate differential entropy (continuous signals) according to the estimation method specified.

Parameters:

infodynamics_path (path) – Location of infodynamics.jar
estimator (string, default='kernel') – Either ‘kernel’ or ‘gaussian’. Specifies the estimator to use in determining the required probability density functions.
kernel_bandwidth (float) – The width of the kernels for the kernel method. If normalisation is performed, these are in terms of standard deviation, otherwise absolute.
multivariate (bool, default=False) – Indicates whether the entropy is to be calculated on a univariate or multivariate signal.

Returns:

entropy_calc

Return type:

EntropyCalculator JIDT object

faultmap.infodynamics.calc_entropy(entropy_calculator, data, estimator)[source]

Estimates the entropy of a single signal.

Parameters:

entropy_calculator (EntropyCalculator JIDT object) – The estimation method is determined during initialisation of this object beforehand.
data (one-dimensional numpy.ndarray) – The uni-variate signal.
estimator (Literal['gaussian', 'kernel', 'kozachenko'])

Returns:

entropy – The entropy of the signal.

Return type:

float

Notes

The entropy calculated with the Gaussian estimator is in nats, while that calculated by the kernel estimator is in bits. Nats can be converted to bits by dividing by ln(2).
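
The unit conversion can be checked with a short worked example. It uses the standard closed-form differential entropy of a Gaussian signal, 0.5 * ln(2 * pi * e * variance) in nats, as the input value. The helper names are hypothetical, and the sketch does not call JIDT.

```python
import math


def gaussian_entropy_nats(variance):
    """Closed-form differential entropy of a Gaussian, in nats."""
    return 0.5 * math.log(2 * math.pi * math.e * variance)


def nats_to_bits(value_in_nats):
    """Convert an entropy value from nats to bits."""
    return value_in_nats / math.log(2)


h_nats = gaussian_entropy_nats(1.0)  # unit-variance signal
h_bits = nats_to_bits(h_nats)
```

For a unit-variance signal this gives roughly 1.42 nats, or about 2.05 bits.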

faultmap.weightcalc

This module provides methods for calculating the gains (weights) of edges connecting variables in the directed graph.

Calculation of both Pearson’s correlation and transfer entropy is supported. Transfer entropy is calculated according to the global average of local entropy method. All weights are optimized with respect to time shifts between the time series data vectors (i.e. cross-correlated).

The delay giving the maximum weight is returned, together with the maximum weights.

All weights are tested for significance. The Pearson’s correlation weights are tested for significance according to the parameters presented by Bauer2005. The transfer entropy weights are tested for significance using a non-parametric rank-order method using surrogate data generated according to the iterative amplitude adjusted Fourier transform method (iAAFT).
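
The delay optimisation described above can be sketched for the correlation case as follows. This is a simplified illustration in plain Python, not the module's implementation. The helper names pearson and best_delay, the test signal, and the choice to rank delays by absolute correlation are all assumptions.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def best_delay(cause, affected, delays):
    """Return (delay, weight) with the largest absolute correlation."""
    results = []
    for delay in delays:
        # Pair cause[t] with affected[t + delay].
        shifted_cause = cause[:len(cause) - delay]
        shifted_affected = affected[delay:]
        results.append((delay, pearson(shifted_cause, shifted_affected)))
    return max(results, key=lambda item: abs(item[1]))


rng = random.Random(1)
cause = [rng.gauss(0.0, 1.0) for _ in range(500)]
# Hypothetical affected signal: the cause delayed by 7 samples.
affected = [0.0] * 7 + cause[:-7]

delay, weight = best_delay(cause, affected, delays=range(0, 15))
```

Because the affected signal is an exact delayed copy, the search recovers a delay of 7 with a weight of essentially 1.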

class faultmap.weightcalc.WeightCalcData(mode, case, single_entropies, fft_calc, do_multiprocessing, use_gpu)[source]

Creates a data object from files or functions for use in weight calculation methods.

Parameters:

mode (Literal['test', 'tests', 'cases'])
case (str)
single_entropies (bool)
fft_calc (bool)
do_multiprocessing (bool)

scenario_data(scenario_name)[source]

Retrieves data particular to each scenario for the case being investigated.

Parameters:: scenario_name (str) – Name of scenario to retrieve data for. Should be defined in config file.

faultmap.weightcalc.writecsv_weightcalc(filename, items, header)[source]: CSV writer customized for use in weightcalc function.

faultmap.weightcalc.calculate_weights(weight_calc_data, method, scenario, write_output)[source]

Determines the maximum weight between two variables by searching through a specified set of delays.

Parameters:

weight_calc_data (WeightCalcData)
method (str) – Can be one of the following:
‘cross_correlation’
‘partial_correlation’ – does not support time delays
‘transfer_entropy_kernel’
‘transfer_entropy_kraskov’
scenario (str)
write_output (bool)

TODO: Fix partial correlation method to make use of time delays

faultmap.weightcalc.weight_calc(mode, case, writeoutput=False, single_entropies=False, calc_fft=False, do_multiprocessing=False, use_gpu=False)[source]

Reports the maximum weight as well as the associated delay obtained by shifting the affected variable behind the causal variable by a specified set of delays.

Parameters:

mode (str) – Either ‘test’ or ‘cases’. Test data are generated dynamically and stored in specified folders. Case data are read from file and stored under organized headings in the saveloc directory specified in config.json.
case (str) – The name of the case that is to be run. Points to dictionary in either test or case config files.
single_entropies (bool) – Flags whether the entropies of single signals should be calculated.
calc_fft (bool) – Indicates whether the FFT of all individual signals should be calculated.
do_multiprocessing (bool) – Indicates whether the weight calculation operations should run in parallel processing mode where all available CPU cores are utilized.
writeoutput (bool)
use_gpu (bool)

Return type:

None

Notes

Supports calculating weights according to either correlation or transfer entropy metrics.

faultmap.weightcalc_onesource

Calculates weight and auxiliary data for each source variable and writes to files.

All weight data file output writers are now called at this level, making the process interruption tolerant up to a single source variable analysis.

faultmap.weightcalc_onesource.writecsv_weightcalc(filename, datalines, header)[source]: CSV writer customized for writing weights.

faultmap.weightcalc_onesource.readcsv_weightcalc(filename)[source]: CSV reader customized for reading weights.

faultmap.weightcalculators

This module stores the weight calculator classes used by the weightcalc module.

faultmap.weightcalculators.flexiblemethod(method)[source]: Decorator to allow methods to be defined as either static or instance methods.

class faultmap.weightcalculators.WeightCalculator(weight_calc_data, *_)[source]

Abstract base class for weight calculators.

Parameters:: weight_calc_data (WeightCalcData)

calculate_surrogate_weight(*args, **kwargs)[source]

Calculates surrogate weights for significance testing.

Parameters:

args (Any)
kwargs (Any)

Return type:

Any

abstractmethod calculate_significance_threshold(*args, **kwargs)[source]

Calculates the significance threshold for the weight between two vectors containing time series data.

Parameters:

args (Any)
kwargs (Any)

Return type:

Any

abstractmethod report(*args, **kwargs)[source]

Calculates and reports the relevant output for each combination of variables tested.

Parameters:

args (Any)
kwargs (Any)

Return type:

Any

class faultmap.weightcalculators.CorrelationWeightCalculator(weight_calc_data)[source]

Implementation of WeightCalculator for correlation-based weight calculation.

Calculates correlation using covariance with optional standardisation and de-trending. This allows the effect of Skogestad scaling to be reflected in final result.

Parameters:: weight_calc_data (WeightCalcData)

static calculate_weight(source_var_data, destination_var_data, *_)[source]: Calculates the correlation between two vectors containing time series data.

calculate_surrogate_weight(source_var, destination_var, box, trials)[source]

Calculates surrogate correlation values for significance threshold purposes.

Two methods for generating surrogate data are available: iAAFT (Schreiber 2000a) or random_shuffle in time.

Returns a list of surrogate correlation values of length num.

Parameters:

source_var (str)
destination_var (str)
box (ndarray)
trials (int)

thresh_rankorder(surr_corr, surr_dirindex)[source]

Calculates the minimum threshold required for a correlation value to be considered significant.

Makes use of a 95% single-sided certainty and a rank-order method. This corresponds to taking the maximum transfer entropy from 19 surrogate transfer entropy calculations as the threshold, see Schreiber2000a.

Alternatively, the second highest from 38 observations can be taken, etc.

thresh_stdevs(surr_corr, surr_dirindex, stdevs)[source]

Calculates the minimum threshold required for a transfer entropy value to be considered significant.

Makes use of a six sigma Gaussian check as done in Bauer2005 with 30 samples of surrogate data.

calculate_significance_threshold(source_var, destination_var, box, _)[source]: Calculates the significance threshold for the weight between two vectors containing time series data.

report(source_var_index, destination_var_index, weight_list, box, _)[source]: Calculates and reports the relevant output for each combination of variables tested.

class faultmap.weightcalculators.TransferEntropyWeightCalculator(weight_calc_data, estimator)[source]

Transfer entropy based weight calculation.

Parameters:

weight_calc_data (WeightCalcData)
estimator (Literal['kernel', 'kraskov', 'discrete'])

calculate_weight(cause_var_data, affected_var_data, *_)[source]: Calculates the transfer entropy between two vectors containing time series data.

report(source_var_index, destination_var_index, weight_list, box, proplist, milist)[source]: Calculates and reports the relevant output for each combination of variables tested.

calculate_surrogate_weight(source_var, destination_var, box, delay_index, trials)[source]

Calculates surrogate transfer entropy values for significance threshold purposes.

Two methods for generating surrogate data are available: iAAFT (Schreiber 2000a) or random_shuffle in time.

Returns a list of surrogate transfer entropy values of length num.

Parameters:

delay_index (int)
trials (int)

static threshold_rankorder(surrogate_directional_weights, surrogate_absolute_weights)[source]

Calculates the minimum threshold required for a transfer entropy value to be considered significant.

Makes use of a 95% single-sided certainty and a rank-order method. This corresponds to taking the maximum transfer entropy from 19 surrogate transfer entropy calculations as the threshold, see Schreiber2000a.

Alternatively, the second highest from 38 observations can be taken, etc.
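
The rank-order idea can be sketched in a few lines. A measured weight counts as significant only if it exceeds the chosen order statistic of the surrogate weights. The helper name rank_order_threshold and its single-list signature are hypothetical. The documented method takes directional and absolute surrogate weights separately. The surrogate values below are made up for illustration.

```python
def rank_order_threshold(surrogate_weights, rank=1):
    """Return the rank-th largest surrogate weight (rank=1 is the maximum)."""
    if not 1 <= rank <= len(surrogate_weights):
        raise ValueError("rank must be between 1 and the number of surrogates")
    return sorted(surrogate_weights, reverse=True)[rank - 1]


# Hypothetical surrogate transfer entropy values from 19 trials.
surrogates = [0.01 * i for i in range(1, 20)]

threshold = rank_order_threshold(surrogates)
second_highest = rank_order_threshold(surrogates, rank=2)

# A measured weight is significant only if it exceeds the threshold.
is_significant = 0.25 > threshold
```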

thresh_stdevs(surr_te_directional, surr_te_absolute, stdevs)[source]

Calculates the minimum threshold required for a transfer entropy value to be considered significant.

Makes use of a six sigma Gaussian check as done in Bauer2005 with 30 samples of surrogate data.
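
A standard-deviation based threshold of this kind is commonly formed as the surrogate mean plus a multiple of the surrogate standard deviation. The sketch below assumes that form and uses the sample standard deviation. Both are assumptions rather than a transcription of the method. The helper name stdev_threshold and the input values are hypothetical.

```python
import statistics


def stdev_threshold(surrogate_weights, stdevs):
    """Return mean + stdevs * sample standard deviation of the surrogates."""
    mean = statistics.mean(surrogate_weights)
    spread = statistics.stdev(surrogate_weights)
    return mean + stdevs * spread


# Hypothetical surrogate weights; the multiplier is the caller's choice.
surrogates = [1.0, 2.0, 3.0, 4.0, 5.0]
threshold = stdev_threshold(surrogates, stdevs=2)
```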

calculate_significance_threshold(source_var, destination_var, box, delay)[source]: Calculates the significance threshold for the weight between two vectors containing time series data.

faultmap.noderank

faultmap.graphreduce

Receives a weighted directed graph in GML format and deletes all edges that connect nodes that are also connected via some other path. Only the longest paths are retained.

The graph should be available in the “graphs” directory in the case data folder. A reduced graph will have the same title as the original file with the suffix “_simplified”.

A <casename>_graphreduce.json configuration file needs to be available in the case directory root.

class faultmap.graphreduce.GraphReduceData(mode, case)[source]

Creates a data object from files and/or function definitions for use in graph reduce method.

scenariodata(scenario)[source]: Retrieves data particular to each scenario for the case being investigated.

faultmap.graphreduce.compute_edge_threshold(graph, percentile)[source]: Calculates the threshold that should be used to delete edges from the original graph based on determined templates.

faultmap.graphreduce.delete_lowval_edges(graph, weight_threshold, remove_self_loops=True)[source]: Deletes all edges with weight below the threshold value. Also deletes all self-looping edges.

faultmap.graphreduce.decompose(input_, output_)[source]: Decomposes (flattens) a list of lists into a simple list.

faultmap.graphreduce.delete_loworder_edges(graph, max_depth, weight_discretion)[source]

Returns a simplified graph with higher order connections eliminated. All self-loops are also deleted.

The level up to which the search for higher order connections should be completed is indicated by the ‘max_depth’ parameter. A value of 1 means that children of children will be investigated, while a value of 2 means that children of children of children will be included in the search, and so on. If max_depth is set to “full”, then the search is completed until no more children are found.

If the ‘weight_discretion’ boolean is True, a higher order connection between a source node and a child will not be eliminated if this connection weight is higher than the weight of the connection between the last higher-order child to the destination node under question.
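
The depth-limited elimination can be sketched on a plain adjacency dictionary. This illustration assumes an acyclic graph apart from self-loops, checks every edge against the original graph, and leaves out the weight_discretion rule entirely. The helper names has_indirect_path and remove_shortcut_edges are hypothetical, and a dict stands in for the graph object that the library works with.

```python
def has_indirect_path(graph, source, target, max_depth):
    """True if target is reachable from source via 2..max_depth + 1 edges."""

    def search(node, edges_left, visited):
        for child in graph.get(node, ()):
            if child in visited:
                continue
            # Skip only the direct source -> target edge itself.
            if node == source and child == target:
                continue
            if child == target:
                return True
            if edges_left > 1 and search(child, edges_left - 1, visited | {child}):
                return True
        return False

    return search(source, max_depth + 1, {source})


def remove_shortcut_edges(graph, max_depth):
    """Drop self-loops and edges that duplicate a longer path."""
    nodes = set(graph) | {c for children in graph.values() for c in children}
    depth = len(nodes) if max_depth == "full" else max_depth
    return {
        node: [
            child
            for child in children
            if child != node  # drop self-loops
            and not has_indirect_path(graph, node, child, depth)
        ]
        for node, children in graph.items()
    }


# a -> c is a shortcut for a -> b -> c, and c has a self-loop.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["c"]}
reduced = remove_shortcut_edges(graph, max_depth=1)
```

With max_depth=1 only children of children are examined, which is enough to remove the a -> c shortcut in this example.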

faultmap.networkgen

faultmap.type_definitions

Type definitions used throughout the library.