proxi.utils package

proxi.utils.distance module

Distance functions for proxi project.

proxi.utils.distance.abs_correlation(x, y)

Compute absolute correlation distance between two vectors.

Parameters: x (array_like, Shape(N,)) – First input vector. y (array_like, Shape(N,)) – Second input vector.
proxi.utils.distance.abs_kendall(x, y)

Compute absolute Kendall correlation (tau) distance between two vectors.

Parameters: x (array_like, Shape(N,)) – First input vector. y (array_like, Shape(N,)) – Second input vector.
proxi.utils.distance.abs_spearmann(x, y)

Compute absolute Spearman correlation (spc) distance between two vectors.

Parameters: x (array_like, Shape(N,)) – First input vector. y (array_like, Shape(N,)) – Second input vector.
proxi.utils.distance.neg_correlation(x, y)

Compute negative correlation distance between two vectors.

Parameters: x (array_like, Shape(N,)) – First input vector. y (array_like, Shape(N,)) – Second input vector.
Returns: 1 if pcc is positive. Otherwise, the distance is 1+pcc(x,y).
proxi.utils.distance.neg_kendall(x, y)

Compute negative Kendall correlation (tau) distance between two vectors.

Parameters: x (array_like, Shape(N,)) – First input vector. y (array_like, Shape(N,)) – Second input vector.
Returns: 1 if tau is positive. Otherwise, the distance is 1+tau(x,y).
proxi.utils.distance.neg_spearmann(x, y)

Compute negative Spearman correlation (spc) distance between two vectors.

Parameters: x (array_like, Shape(N,)) – First input vector. y (array_like, Shape(N,)) – Second input vector.
Returns: 1 if spc is positive. Otherwise, the distance is 1+spc(x,y).
proxi.utils.distance.pos_correlation(x, y)

Compute positive correlation distance between two vectors.

Parameters: x (array_like, Shape(N,)) – First input vector. y (array_like, Shape(N,)) – Second input vector.
Returns: 1 if pcc is negative. Otherwise, the distance is 1-pcc(x,y).
proxi.utils.distance.pos_kendall(x, y)

Compute positive Kendall correlation (tau) distance between two vectors.

Parameters: x (array_like, Shape(N,)) – First input vector. y (array_like, Shape(N,)) – Second input vector.
Returns: 1 if tau is negative. Otherwise, the distance is 1-tau(x,y).
proxi.utils.distance.pos_spearmann(x, y)

Compute positive Spearman correlation (spc) distance between two vectors.

Parameters: x (array_like, Shape(N,)) – First input vector. y (array_like, Shape(N,)) – Second input vector.
Returns: 1 if spc is negative. Otherwise, the distance is 1-spc(x,y).
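
The pos_* and neg_* distances above apply the same rule to a different correlation coefficient. The following is a minimal sketch of that rule using the Pearson coefficient from numpy.corrcoef. The helper names are hypothetical and not part of proxi, and the absolute variant (1 - |r|) is an assumption, since its formula is not stated in this reference.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient (pcc) of two 1-D vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


def pos_distance(r):
    """1 if r is negative; otherwise 1 - r (rule stated for the pos_* functions)."""
    return 1.0 if r < 0 else 1.0 - r


def neg_distance(r):
    """1 if r is positive; otherwise 1 + r (rule stated for the neg_* functions)."""
    return 1.0 if r > 0 else 1.0 + r


def abs_distance(r):
    """ASSUMPTION: 1 - |r|; this reference does not spell the formula out."""
    return 1.0 - abs(r)


x = [1.0, 2.0, 3.0, 4.0]
y = [8.0, 6.0, 4.0, 2.0]  # perfectly anti-correlated with x, so r is about -1
r = pearson(x, y)
print(pos_distance(r), neg_distance(r), abs_distance(r))
```

The same three helpers would apply unchanged to a Kendall tau or Spearman coefficient in place of r.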

proxi.utils.misc module

Miscellaneous Python methods for proxi project.

proxi.utils.misc.aggregate_graphs(G, min_num_edges, is_weighted=False)

Aggregate the adjacency matrices of graphs defined over the same set of nodes.

Parameters: G (list of array_like matrices of shape (N,N)) – List of adjacency matrices. min_num_edges (int) – Minimum number of edges between two nodes required to keep an edge between them in the aggregated graph. is_weighted (bool, optional, default = False) – Whether to compute a weighted aggregated graph.
Returns: rVal – The aggregated graph. W – Edge weights (None if is_weighted is False).
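
A minimal sketch of the aggregation described above, assuming that an edge is kept when at least min_num_edges of the input graphs contain it and that the weight of a kept edge is that count. Both choices, and the function name, are assumptions for illustration rather than the proxi implementation.

```python
import numpy as np


def aggregate_graphs_sketch(G, min_num_edges, is_weighted=False):
    """Hypothetical stand-in for aggregate_graphs; not the proxi code."""
    # Count, for every node pair, how many graphs contain that edge.
    counts = np.sum([np.asarray(A) != 0 for A in G], axis=0)
    # Keep an edge only if enough graphs contain it (assumed inclusive >=).
    rVal = (counts >= min_num_edges).astype(int)
    # ASSUMPTION: the weight of a kept edge is its count across the graphs.
    W = counts * rVal if is_weighted else None
    return rVal, W


G = [
    np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]]),
    np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]]),
    np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]]),
]
rVal, W = aggregate_graphs_sketch(G, min_num_edges=2, is_weighted=True)
print(rVal)  # only the edge between nodes 0 and 1 appears in >= 2 graphs
print(W)     # that edge appears in all 3 graphs
```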

Aggregate the adjacency matrices of a list of graphs G and use the aggregated graph to decide which edges in the base graph A to keep. All graphs are assumed to be defined over the same set of nodes.

Parameters: A (array_like, shape(N,N)) – Adjacency matrix of the base graph. G (list of array_like matrices of shape (N,N)) – List of adjacency matrices. min_num_votes (int) – Minimum number of edges between two nodes in the aggregated graph required to keep their edge (if it exists) in the base graph.
Returns: rVal (array_like, shape(N,N)) – Adjacency matrix of the filtered base graph. W (array_like, shape(N,N)) – Edge weights associated with the rVal graph.
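
The vote-based filtering described above could be sketched as follows. The function name is hypothetical, and the inclusive threshold and the use of vote counts as weights are assumptions, not confirmed behaviour of proxi.

```python
import numpy as np


def filter_edges_sketch(A, G, min_num_votes):
    """Hypothetical helper: keep edges of A that enough graphs in G support."""
    A = np.asarray(A)
    # Votes per node pair: number of graphs in G that contain the edge.
    votes = np.sum([np.asarray(g) != 0 for g in G], axis=0)
    keep = votes >= min_num_votes   # assumed inclusive threshold
    rVal = np.where(keep, A, 0)     # drop edges of A with too few votes
    # ASSUMPTION: weights are the vote counts of the surviving edges.
    W = np.where(rVal != 0, votes, 0)
    return rVal, W


A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])  # fully connected base graph
G = [
    np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]]),
    np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]]),
]
rVal, W = filter_edges_sketch(A, G, min_num_votes=2)
print(rVal)  # only the edge between nodes 0 and 1 has 2 votes
print(W)
```
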
proxi.utils.misc.save_graph(A, nodes_id, out_file, create_using=None)

Save the graph in graphml format.

Parameters: A (array_like, shape(N,N)) – Adjacency matrix of the base graph. nodes_id (array_like, shape(N,)) – List of node IDs. out_file (file or string) – File or filename to write. Filenames ending in .gz or .bz2 will be compressed. create_using (Networkx Graph object, optional, default is Graph) – User-specified Networkx graph type. Accepted types are:

Graph type            Networkx class
Undirected Simple     Graph
Directed Simple       DiGraph
With Self-loops       Graph, DiGraph
With Parallel edges   MultiGraph, MultiDiGraph

Notes

This implementation, based on the networkx write_graphml method, does not support mixed graphs (directed and undirected edges together), hyperedges, nested graphs, or ports.

proxi.utils.misc.summarize_graph(G)

Report basic summary statistics of a networkx graph object.

Parameters: G (graph) – A networkx graph object.
Returns: A dictionary of basic graph properties.
proxi.utils.misc.jaccard_graph_similarity(G1, G2)

Compute Jaccard similarity between two graphs over the same set of nodes.

Parameters: G1 (graph) – A networkx graph object. G2 (graph) – A networkx graph object.
Returns: s – Jaccard similarity between two graphs over the same set of nodes.
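
A minimal sketch of a Jaccard similarity between two graphs, assuming it is computed over their undirected edge sets (shared edges divided by all distinct edges). The function name is hypothetical, and it takes plain edge lists so that it runs without networkx. With networkx graphs, G1.edges() and G2.edges() could be passed instead.

```python
def jaccard_edge_similarity(edges1, edges2):
    """Jaccard similarity of two undirected edge sets: |E1 & E2| / |E1 | E2|."""
    # frozenset makes (u, v) and (v, u) compare equal for undirected edges.
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    union = e1 | e2
    if not union:
        return 1.0  # ASSUMPTION: two edgeless graphs count as identical
    return len(e1 & e2) / len(union)


edges_g1 = [("a", "b"), ("b", "c"), ("c", "d")]
edges_g2 = [("b", "a"), ("c", "d"), ("a", "d")]
print(jaccard_edge_similarity(edges_g1, edges_g2))  # 2 shared / 4 distinct = 0.5
```
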
proxi.utils.misc.get_graph_object(A, nodes_id=None)

Construct a networkx graph object given an adjacency matrix and node IDs.

Parameters: A (array_like, shape(N,N)) – Adjacency matrix of the base graph. nodes_id (array_like, shape(N,), optional, default = None) – List of node IDs.
Returns: A networkx graph object.
proxi.utils.misc.get_collable_name(func)

Return the name of a callable function.

Parameters: func (callable) – A callable function.
Returns: The name of the callable function.

Notes

str(func) returns <function neg_correlation at 0x1085cdd08>.
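
As the note suggests, str(func) includes the memory address, so a readable name has to come from elsewhere. A minimal sketch, assuming the name is read from the callable's __name__ attribute. The helper below is hypothetical and not the proxi implementation.

```python
def callable_name(func):
    """Return a readable name for a callable (hypothetical helper)."""
    # Functions, classes, and builtins expose __name__; other callable objects
    # (e.g. functools.partial instances) do not, so fall back to the class name.
    return getattr(func, "__name__", type(func).__name__)


def neg_correlation(x, y):
    """Placeholder that only shares its name with the proxi function."""
    return 0.0


print(str(neg_correlation))            # e.g. <function neg_correlation at 0x...>
print(callable_name(neg_correlation))  # neg_correlation
```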

proxi.utils.process module

Pre-processing methods for proxi project.

proxi.utils.process.filter_OTUs_by_name(data, OTUs_to_keep, OTUs_column)

Keep only the OTUs in the OTUs_to_keep list.

Parameters: data (DataFrame) – Input data as a pandas DataFrame object. Each row is an OTU and each column is a sample. OTUs_to_keep (list) – List of OTU IDs to select from the input dataframe. OTUs_column (string) – Name of the DataFrame column that contains the OTU IDs (i.e., node IDs).
Returns: A dataframe derived from the input data by keeping only rows with the specified OTU IDs.

MAD is defined as the median of the absolute deviations from the data’s median: MAD = median(|x - median(x)|)

Parameters: x (array_like, Shape(N,)) – Input 1-D array.
Returns: The median of the absolute deviations (MAD) of x.
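
The MAD definition above translates directly into numpy. This is a sketch with a hypothetical function name, not the proxi code.

```python
import numpy as np


def mad(x):
    """Median of the absolute deviations from the median of x."""
    x = np.asarray(x, dtype=float)
    return float(np.median(np.abs(x - np.median(x))))


# The median is 2, so the absolute deviations are 1, 1, 0, 0, 2, 4, 7.
print(mad([1, 1, 2, 2, 4, 6, 9]))  # median of the deviations is 1.0
```
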
proxi.utils.process.get_non_zero_percentage(x)

The fraction of non-zero values in a 1-D array x.

Parameters: x (array_like, Shape(N,)) – Input 1-D array.
Returns: The percentage of non-zero elements in x.
proxi.utils.process.get_variance(x)

Compute the variance of an input vector x. Variance is the average of the squared deviations from the mean: var = mean(abs(x - x.mean())**2)

Parameters: x (array_like, Shape(N,)) – Input 1-D array.
Returns: The variance of x.
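
The variance formula above divides by N, not N - 1, so it is the population variance. A short check, under that reading, that it agrees with numpy.var at its default ddof=0:

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Formula from the description: mean of the squared deviations from the mean.
var = np.mean(np.abs(x - x.mean()) ** 2)

print(var)        # the mean is 5 and the squared deviations sum to 32, so 32 / 8 = 4.0
print(np.var(x))  # numpy's default (ddof=0) gives the same value
```
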
proxi.utils.process.select_top_OTUs(data, score_function, threshold, OTUs_column)

Filter OTUs using a scoring function and return top k OTUs or OTUs with scores greater than a threshold score.

Parameters: data (DataFrame) – Input data as a pandas DataFrame object. Each row is an OTU and each column is a sample. score_function (callable) – Unsupervised scoring function (e.g., variance or percentage of non-zeros) applied to each OTU. threshold (float) – If threshold > 1, return the top threshold OTUs (i.e., k = threshold). Otherwise, return OTUs with score > threshold. OTUs_column (string) – Name of the DataFrame column that contains the OTU IDs (i.e., node IDs).
Returns: A dataframe with the selected OTUs.
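
The two meanings of threshold are easiest to see side by side. The following is a sketch of the selection rule as described above, not the proxi implementation. The function names and the sample data are hypothetical, and scoring only the non-ID columns is an assumption.

```python
import pandas as pd


def select_top_sketch(data, score_function, threshold, OTUs_column):
    """Hypothetical re-implementation of the selection rule described above."""
    # Score each OTU (row) using only the sample columns.
    samples = data.drop(columns=[OTUs_column])
    scores = samples.apply(lambda row: score_function(row.to_numpy()), axis=1)
    if threshold > 1:
        # Treat threshold as k: keep the k highest-scoring OTUs.
        keep = scores.nlargest(int(threshold)).index
    else:
        # Treat threshold as a score cut-off.
        keep = scores[scores > threshold].index
    return data.loc[keep.sort_values()]


def non_zero_fraction(x):
    """Fraction of non-zero entries in a 1-D array."""
    return (x != 0).mean()


data = pd.DataFrame({
    "OTU": ["otu1", "otu2", "otu3"],
    "s1": [0, 5, 1],
    "s2": [0, 3, 1],
    "s3": [2, 1, 0],
    "s4": [0, 4, 0],
})
# Scores are 0.25, 1.0 and 0.5 for otu1, otu2 and otu3.
print(select_top_sketch(data, non_zero_fraction, 0.6, "OTU"))  # score > 0.6: otu2
print(select_top_sketch(data, non_zero_fraction, 2, "OTU"))    # top 2: otu2, otu3
```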

proxi.utils.similarity module

Similarity functions for proxi project.

proxi.utils.similarity.abs_Kendall(x, y)

Compute absolute Kendall correlation similarity between two vectors.

Parameters: x (array_like, Shape(N,)) – First input vector. y (array_like, Shape(N,)) – Second input vector.
Returns: |kendalltau(x,y)|
proxi.utils.similarity.abs_pcc(x, y)

Compute absolute Pearson correlation similarity between two vectors.

Parameters: x (array_like, Shape(N,)) – First input vector. y (array_like, Shape(N,)) – Second input vector.
Returns: |pcc(x,y)|
proxi.utils.similarity.abs_spc(x, y)

Compute absolute Spearman correlation similarity between two vectors.

Parameters: x (array_like, Shape(N,)) – First input vector. y (array_like, Shape(N,)) – Second input vector.
Returns: |spearmanr(x,y)|
proxi.utils.similarity.distance_to_similarity(x, y, dist_func)

Convert the distance functions in scipy.spatial.distance into similarity functions.

Parameters: x (array_like, Shape(N,)) – First input vector. y (array_like, Shape(N,)) – Second input vector. dist_func (callable) – Callable distance function (e.g., any distance function in scipy.spatial.distance).
Returns: Similarity between x and y.
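
This reference does not state how a distance is mapped to a similarity. The sketch below assumes the common mapping 1 / (1 + d), which turns a distance of 0 into a similarity of 1 and decreases towards 0 as the distance grows; proxi may use a different mapping. The function names are hypothetical, and a plain numpy Euclidean distance stands in for scipy.spatial.distance.euclidean.

```python
import numpy as np


def distance_to_similarity_sketch(x, y, dist_func):
    """ASSUMPTION: similarity = 1 / (1 + distance); proxi may differ."""
    return 1.0 / (1.0 + dist_func(x, y))


def euclidean(x, y):
    """Euclidean distance, standing in for scipy.spatial.distance.euclidean."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.linalg.norm(diff))


print(distance_to_similarity_sketch([0, 0], [3, 4], euclidean))  # distance 5, so 1/6
print(distance_to_similarity_sketch([1, 2], [1, 2], euclidean))  # distance 0, so 1.0
```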