proxi.utils package

Submodules

proxi.utils.distance module

Distance functions for the proxi project.

proxi.utils.distance.abs_correlation(x, y)

Compute the absolute correlation distance between two vectors.

Parameters:

x (array_like, Shape(N,)) – First input vector.
y (array_like, Shape(N,)) – Second input vector.

Returns:

1 - |pcc(x, y)|, where pcc is the Pearson correlation coefficient.
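As a rough illustration of the formula above, the following standard-library sketch computes a Pearson coefficient by hand and turns it into the distance 1 - |pcc(x, y)|. The names pearson and abs_correlation_sketch are hypothetical and are not part of proxi; the package's own implementation may differ.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # Raises ZeroDivisionError if either input is constant.
    return cov / (sx * sy)

def abs_correlation_sketch(x, y):
    """Hypothetical stand-in for the documented formula 1 - |pcc(x, y)|."""
    return 1.0 - abs(pearson(x, y))

# Perfectly correlated and perfectly anti-correlated vectors are both
# at (numerically) zero distance, because the sign of pcc is discarded.
d_same = abs_correlation_sketch([1, 2, 3, 4], [2, 4, 6, 8])
d_anti = abs_correlation_sketch([1, 2, 3, 4], [8, 6, 4, 2])
```

Because the sign is dropped, this distance treats strong negative and strong positive association as equally "close".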

proxi.utils.distance.abs_kendall(x, y)

Compute the absolute Kendall correlation (tau) distance between two vectors.

Parameters:

x (array_like, Shape(N,)) – First input vector.
y (array_like, Shape(N,)) – Second input vector.

Returns:

1 - |tau(x, y)|
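To make the tau-based distance concrete, here is a minimal sketch that counts concordant and discordant pairs directly. It assumes tau means the tau-b variant (the default variant of scipy.stats.kendalltau); which variant proxi actually uses is not stated here. The function names are hypothetical.

```python
import math
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall's tau-b, computed by direct O(N^2) pair counting."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both vectors: not counted by tau-b
        if dx == 0:
            only_x_tied += 1
        elif dy == 0:
            only_y_tied += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = math.sqrt((concordant + discordant + only_x_tied)
                      * (concordant + discordant + only_y_tied))
    # Raises ZeroDivisionError if either input is constant.
    return (concordant - discordant) / denom

def abs_kendall_sketch(x, y):
    """Hypothetical stand-in for the documented formula 1 - |tau(x, y)|."""
    return 1.0 - abs(kendall_tau_b(x, y))

# 8 concordant and 2 discordant pairs out of 10 give tau = 0.6.
tau_example = kendall_tau_b([1, 2, 3, 4, 5], [1, 3, 2, 5, 4])
dist_example = abs_kendall_sketch([1, 2, 3, 4, 5], [1, 3, 2, 5, 4])
```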

proxi.utils.distance.abs_spearmann(x, y)

Compute the absolute Spearman correlation (spc) distance between two vectors.

Parameters:

x (array_like, Shape(N,)) – First input vector.
y (array_like, Shape(N,)) – Second input vector.

Returns:

1 - |spc(x, y)|
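Spearman correlation is the Pearson correlation of the ranks of the data. The sketch below follows that definition with average ranks for ties; all function names are hypothetical, and proxi's own implementation may rely on a library routine instead.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

def abs_spearman_sketch(x, y):
    """Hypothetical stand-in for the documented formula 1 - |spc(x, y)|."""
    return 1.0 - abs(spearman(x, y))

# A monotone but non-linear relationship still has spc = 1, so distance 0.
d_monotone = abs_spearman_sketch([1, 2, 3, 4], [1, 4, 9, 16])
```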

proxi.utils.distance.neg_correlation(x, y)

Compute the negative correlation distance between two vectors.

Parameters:

x (array_like, Shape(N,)) – First input vector.
y (array_like, Shape(N,)) – Second input vector.

Returns:

1 if pcc(x, y) is positive; otherwise 1 + pcc(x, y).

proxi.utils.distance.neg_kendall(x, y)

Compute the negative Kendall correlation (tau) distance between two vectors.

Parameters:

x (array_like, Shape(N,)) – First input vector.
y (array_like, Shape(N,)) – Second input vector.

Returns:

1 if tau(x, y) is positive; otherwise 1 + tau(x, y).

proxi.utils.distance.neg_spearmann(x, y)

Compute the negative Spearman correlation (spc) distance between two vectors.

Parameters:

x (array_like, Shape(N,)) – First input vector.
y (array_like, Shape(N,)) – Second input vector.

Returns:

1 if spc(x, y) is positive; otherwise 1 + spc(x, y).

proxi.utils.distance.pos_correlation(x, y)

Compute the positive correlation distance between two vectors.

Parameters:

x (array_like, Shape(N,)) – First input vector.
y (array_like, Shape(N,)) – Second input vector.

Returns:

1 if pcc(x, y) is negative; otherwise 1 - pcc(x, y).

proxi.utils.distance.pos_kendall(x, y)

Compute the positive Kendall correlation (tau) distance between two vectors.

Parameters:

x (array_like, Shape(N,)) – First input vector.
y (array_like, Shape(N,)) – Second input vector.

Returns:

1 if tau(x, y) is negative; otherwise 1 - tau(x, y).

proxi.utils.distance.pos_spearmann(x, y)

Compute the positive Spearman correlation (spc) distance between two vectors.

Parameters:

x (array_like, Shape(N,)) – First input vector.
y (array_like, Shape(N,)) – Second input vector.

Returns:

1 if spc(x, y) is negative; otherwise 1 - spc(x, y).
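The abs, pos, and neg families in this module differ only in how a correlation value r in [-1, 1] is mapped to a distance. The sketch below isolates those three mappings; the helper names are hypothetical and r stands for any of pcc, tau, or spc.

```python
def abs_distance(r):
    """1 - |r|: strong correlation of either sign gives a small distance."""
    return 1.0 - abs(r)

def pos_distance(r):
    """1 if r is negative, otherwise 1 - r: only positive correlation counts."""
    return 1.0 if r < 0 else 1.0 - r

def neg_distance(r):
    """1 if r is positive, otherwise 1 + r: only negative correlation counts."""
    return 1.0 if r > 0 else 1.0 + r

# The same correlation value yields different distances in each family.
for r in (-0.75, 0.0, 0.75):
    print(r, abs_distance(r), pos_distance(r), neg_distance(r))
```

All three mappings return values in [0, 1], and all three return 1 for r = 0, so uncorrelated vectors are maximally distant in every family.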

proxi.utils.misc module

Miscellaneous Python methods for the proxi project.

proxi.utils.misc.aggregate_graphs(G, min_num_edges, is_weighted=False)

Aggregate the adjacency matrices of graphs defined over the same set of nodes.

Parameters:

G (list of array_like matrices of shape (N,N)) – list of adjacency matrices.
min_num_edges (int) – minimum number of edges between two nodes, counted across the graphs in G, required to keep an edge between them in the aggregated graph.
is_weighted (bool, optional, default = False) – whether to compute a weighted aggregated graph.

Returns:

rVal – aggregated graph.
W – edge weights (None if is_weighted is False).
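The aggregation described above can be pictured as edge voting: each input graph votes for the edges it contains, and an edge survives when it collects enough votes. The following sketch uses plain nested lists. It assumes that the threshold comparison is "greater than or equal" and that the weight of a kept edge is the fraction of graphs containing it; neither detail is confirmed by this page, and aggregate_graphs_sketch is a hypothetical name.

```python
def aggregate_graphs_sketch(graphs, min_num_edges, is_weighted=False):
    """Vote-based aggregation of square adjacency matrices (lists of lists)."""
    n = len(graphs[0])
    # votes[i][j] = number of input graphs with a non-zero entry at (i, j)
    votes = [[sum(1 for g in graphs if g[i][j] != 0) for j in range(n)]
             for i in range(n)]
    # Assumption: an edge is kept when votes >= min_num_edges.
    agg = [[1 if votes[i][j] >= min_num_edges else 0 for j in range(n)]
           for i in range(n)]
    weights = None
    if is_weighted:
        # Assumption: weight = fraction of graphs that contain the edge.
        weights = [[votes[i][j] / len(graphs) if agg[i][j] else 0.0
                    for j in range(n)] for i in range(n)]
    return agg, weights

g1 = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
g2 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
g3 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]

# Only the edge between nodes 0 and 1 appears in at least two graphs.
agg, weights = aggregate_graphs_sketch([g1, g2, g3], min_num_edges=2)
agg_w, weights_w = aggregate_graphs_sketch([g1, g2, g3], 2, is_weighted=True)
```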

proxi.utils.misc.filter_edges_by_votes(A, G, min_num_votes)

Aggregate the adjacency matrices of a list of graphs G and use the aggregated graph to decide which edges in the base graph A to keep. All graphs are assumed to be defined over the same set of nodes.

Parameters:

A (array_like, shape(N,N)) – adjacency matrix of the base graph.
G (list of array_like matrices of shape (N,N)) – list of adjacency matrices.
min_num_votes (int) – minimum number of edges (votes) between two nodes in the aggregated graph required to keep their edge (if it exists) in the base graph.

Returns:

rVal (array_like, shape(N,N)) – adjacency matrix of the filtered base graph.
W (array_like, shape(N,N)) – edge weights associated with the rVal graph.
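A minimal sketch of this filtering step, under stated assumptions: an edge of the base graph is kept when at least min_num_votes of the graphs in G contain it, and the returned weights are the raw vote counts of the kept edges. Both choices are illustrative guesses, and filter_edges_by_votes_sketch is a hypothetical name rather than the package's function.

```python
def filter_edges_by_votes_sketch(base, graphs, min_num_votes):
    """Keep the edges of `base` that enough of `graphs` also contain."""
    n = len(base)
    votes = [[sum(1 for g in graphs if g[i][j] != 0) for j in range(n)]
             for i in range(n)]
    # Assumption: keep a base edge when votes >= min_num_votes.
    filtered = [[base[i][j] if base[i][j] != 0 and votes[i][j] >= min_num_votes
                 else 0 for j in range(n)] for i in range(n)]
    # Assumption: the weight of a kept edge is its vote count.
    weights = [[votes[i][j] if filtered[i][j] != 0 else 0 for j in range(n)]
               for i in range(n)]
    return filtered, weights

base = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # fully connected base graph
g1 = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
g2 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
g3 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]

filtered, weights = filter_edges_by_votes_sketch(base, [g1, g2, g3], 2)
```

Unlike plain aggregation, this never adds an edge that is absent from the base graph; the voting graphs can only remove edges.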

proxi.utils.misc.save_graph(A, nodes_id, out_file, create_using=None)

Save the graph in GraphML format.

Parameters:

A (array_like, shape(N,N)) – adjacency matrix of the base graph.
nodes_id (array_like, shape(N,)) – list of node IDs.
out_file (file or string) – File or filename to write. Filenames ending in .gz or .bz2 will be compressed.
create_using (Networkx Graph object, optional, default is Graph) – User-specified Networkx graph type. Accepted types are:

Undirected simple: Graph
Directed simple: DiGraph
With self-loops: Graph, DiGraph
With parallel edges: MultiGraph, MultiDiGraph

Notes

This implementation, based on the networkx write_graphml method, does not support mixed graphs (directed and undirected edges together), hyperedges, nested graphs, or ports.

proxi.utils.misc.summarize_graph(G)

Report basic summary statistics of a networkx graph object.

Parameters:

G (graph) – A networkx graph object.

Returns:

A dictionary of basic graph properties.

proxi.utils.misc.jaccard_graph_similarity(G1, G2)

Compute Jaccard similarity between two graphs over the same set of nodes.

Parameters:

G1 (graph) – A networkx graph object.
G2 (graph) – A networkx graph object.

Returns:

s – Jaccard similarity between two graphs over the same set of nodes.
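One common way to define Jaccard similarity between two graphs on the same nodes is over their edge sets: the number of shared edges divided by the number of edges in either graph. The sketch below takes that reading, which is an assumption about what the function computes. It works on plain edge lists rather than networkx objects, treats edges as undirected, and returns 1.0 for two empty graphs by convention. The name jaccard_edge_similarity is hypothetical.

```python
def jaccard_edge_similarity(edges1, edges2):
    """|E1 & E2| / |E1 | E2| for undirected edge lists of (u, v) pairs."""
    # frozenset makes (u, v) and (v, u) the same undirected edge.
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    union = e1 | e2
    if not union:
        return 1.0  # assumption: two edgeless graphs count as identical
    return len(e1 & e2) / len(union)

edges_a = [("a", "b"), ("b", "c"), ("c", "d")]
edges_b = [("b", "a"), ("c", "d"), ("d", "e")]

# Two shared edges out of four distinct edges overall.
similarity = jaccard_edge_similarity(edges_a, edges_b)
```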

proxi.utils.misc.get_graph_object(A, nodes_id=None)

Construct a networkx graph object given an adjacency matrix and node IDs.

Parameters:

A (array_like, shape(N,N)) – adjacency matrix of the base graph.
nodes_id (array_like, shape(N,)) – list of node IDs.

Returns:

A networkx graph object.

proxi.utils.misc.get_collable_name(func)

Return the name of a callable function.

Parameters:

func (callable) – The function whose name is returned.

Returns:

The name of the callable function.

Notes

For comparison, str(func) returns a representation such as <function neg_correlation at 0x1085cdd08> rather than the bare name.
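In plain Python, the bare name of a function is available through its __name__ attribute. The sketch below shows one way such a helper could work; the fallbacks for functools.partial objects and callable instances are additions made for illustration, and callable_name is a hypothetical name, not the package's implementation.

```python
import functools

def callable_name(func):
    """Return a readable name for a function or other callable."""
    # Illustrative extra: unwrap functools.partial to the wrapped function.
    if isinstance(func, functools.partial):
        func = func.func
    name = getattr(func, "__name__", None)
    # Illustrative extra: callable instances have no __name__,
    # so fall back to the class name.
    return name if name is not None else type(func).__name__

def neg_correlation(x, y):
    """Placeholder with the same name as the function used in the Notes."""
    return 0.0

class Scorer:
    def __call__(self, x):
        return x

name = callable_name(neg_correlation)
```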

proxi.utils.process module

Pre-processing methods for the proxi project.

proxi.utils.process.filter_OTUs_by_name(data, OTUs_to_keep, OTUs_column)

Keep only the OTUs in the OTUs_to_keep list.

Parameters:

data (DataFrame) – Input data as a pandas DataFrame object. Each row is an OTU and each column is a sample.
OTUs_to_keep (list) – List of OTU IDs to select from the input DataFrame.
OTUs_column (string) – Name of the DataFrame column that contains the OTU IDs (i.e., node IDs).

Returns:

A DataFrame derived from the input data by keeping only rows with the specified OTU IDs.

proxi.utils.process.get_MAD(x)

Compute the median absolute deviation (MAD) of x. MAD is defined as the median of the absolute deviations from the data’s median: MAD = median(|x - median(x)|).

Parameters:

x (array_like, Shape(N,)) – Input 1-D array.

Returns:

The median of the absolute deviations (MAD) of x.
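The definition above translates directly into two median computations. This standard-library sketch (mad_sketch is a hypothetical name) is unscaled: it applies no consistency constant such as 1.4826.

```python
from statistics import median

def mad_sketch(x):
    """Median of the absolute deviations from the median of x."""
    m = median(x)
    return median(abs(v - m) for v in x)

# The median is 2, the absolute deviations are [1, 1, 0, 0, 2, 4, 7],
# and the median of those deviations is 1.
mad_value = mad_sketch([1, 1, 2, 2, 4, 6, 9])
```

Because it is built from medians, MAD is far less sensitive to a single extreme value than the variance is.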

proxi.utils.process.get_non_zero_percentage(x)

Compute the fraction of non-zero values in a 1-D array x.

Parameters:

x (array_like, Shape(N,)) – Input 1-D array.

Returns:

The percentage of non-zero elements in x.
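A minimal sketch of this score, assuming the result is expressed as a fraction in [0, 1] as the summary line says, rather than on a 0 to 100 scale. The name non_zero_fraction is hypothetical.

```python
def non_zero_fraction(x):
    """Fraction of entries in x that are not zero (assumed range: 0 to 1)."""
    if len(x) == 0:
        raise ValueError("x must not be empty")
    return sum(1 for v in x if v != 0) / len(x)

# Two of the four entries are non-zero.
fraction = non_zero_fraction([0, 3, 0, 5])
```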

proxi.utils.process.get_variance(x)

Compute the variance of an input vector x. Variance is the average of the squared deviations from the mean: var = mean(abs(x - x.mean())**2)

Parameters:

x (array_like, Shape(N,)) – Input 1-D array.

Returns:

The variance of x.
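The formula above divides by N rather than N - 1, which makes it the population variance. The sketch below spells it out in plain Python and compares it with statistics.pvariance; variance_sketch is a hypothetical name.

```python
from statistics import pvariance

def variance_sketch(x):
    """Average of the squared deviations from the mean (divides by N)."""
    mean = sum(x) / len(x)
    return sum(abs(v - mean) ** 2 for v in x) / len(x)

values = [1, 2, 3, 4]
# The mean is 2.5 and the squared deviations sum to 5, so the variance is 5 / 4.
var = variance_sketch(values)
matches_stdlib = abs(var - pvariance(values)) < 1e-12
```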

proxi.utils.process.select_top_OTUs(data, score_function, threshold, OTUs_column)

Filter OTUs using a scoring function and return the top k OTUs or the OTUs with scores greater than a threshold score.

Parameters:

data (DataFrame) – Input data as a pandas DataFrame object. Each row is an OTU and each column is a sample.
score_function (callable) – Unsupervised scoring function (e.g., variance or percentage of non-zeros) applied to each OTU.
threshold (float) – if threshold > 1, return the top threshold OTUs. Otherwise, return OTUs with score > threshold.
OTUs_column (string) – Name of the DataFrame column that contains the OTU IDs (i.e., node IDs).

Returns:

DataFrame with the selected OTUs.
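The two-mode threshold rule can be sketched without pandas by representing the table as a dictionary from OTU ID to its per-sample values. The dictionary layout, the helper names, and the choice to preserve input order are assumptions made for illustration only.

```python
def non_zero_fraction(values):
    """Example score: fraction of non-zero entries."""
    return sum(1 for v in values if v != 0) / len(values)

def select_top_sketch(rows, score_function, threshold):
    """rows maps OTU ID -> list of per-sample values."""
    scores = {otu: score_function(values) for otu, values in rows.items()}
    if threshold > 1:
        # threshold acts as k: keep the k highest-scoring OTUs
        k = int(threshold)
        keep = set(sorted(scores, key=scores.get, reverse=True)[:k])
    else:
        # threshold acts as a score cut-off: keep OTUs scoring above it
        keep = {otu for otu, s in scores.items() if s > threshold}
    # Assumption: the original row order is preserved in the output.
    return {otu: values for otu, values in rows.items() if otu in keep}

rows = {
    "OTU_1": [0, 0, 0, 4],   # score 0.25
    "OTU_2": [1, 2, 0, 3],   # score 0.75
    "OTU_3": [5, 5, 5, 5],   # score 1.0
}

top_two = select_top_sketch(rows, non_zero_fraction, threshold=2)
above_cutoff = select_top_sketch(rows, non_zero_fraction, threshold=0.8)
```

Note that a threshold of exactly 1 falls into the cut-off branch under the rule as documented, so it selects only OTUs whose score is strictly greater than 1.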

proxi.utils.similarity module

Similarity functions for the proxi project.

proxi.utils.similarity.abs_Kendall(x, y)

Compute absolute Kendall correlation similarity between two vectors.

Parameters:

x (array_like, Shape(N,)) – First input vector.
y (array_like, Shape(N,)) – Second input vector.

Returns:

|kendalltau(x, y)|

proxi.utils.similarity.abs_pcc(x, y)

Compute absolute Pearson correlation similarity between two vectors.

Parameters:

x (array_like, Shape(N,)) – First input vector.
y (array_like, Shape(N,)) – Second input vector.

Returns:

|pcc(x, y)|

proxi.utils.similarity.abs_spc(x, y)

Compute absolute Spearman correlation similarity between two vectors.

Parameters:

x (array_like, Shape(N,)) – First input vector.
y (array_like, Shape(N,)) – Second input vector.

Returns:

|spearmanr(x, y)|

proxi.utils.similarity.distance_to_similarity(x, y, dist_func)

Convert a distance function, such as those in scipy.spatial.distance, into a similarity function.

Parameters:

x (array_like, Shape(N,)) – First input vector.
y (array_like, Shape(N,)) – Second input vector.
dist_func (callable) – callable distance function (e.g., any distance function in scipy.spatial.distance).

Returns:

Similarity between x and y.
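This page does not state which conversion formula is used. One common choice, shown here purely as an assumption, is similarity = 1 / (1 + distance), which maps a distance of 0 to a similarity of 1 and large distances toward 0. The sketch uses math.dist (Euclidean distance) as a stand-in for a scipy.spatial.distance function, and distance_to_similarity_sketch is a hypothetical name.

```python
import math

def distance_to_similarity_sketch(x, y, dist_func):
    """Assumed conversion: similarity = 1 / (1 + distance)."""
    d = dist_func(x, y)
    if d < 0:
        raise ValueError("dist_func must return a non-negative distance")
    return 1.0 / (1.0 + d)

# Euclidean distance between (0, 0) and (3, 4) is 5, so similarity is 1/6.
sim_far = distance_to_similarity_sketch([0, 0], [3, 4], math.dist)
sim_same = distance_to_similarity_sketch([1, 2], [1, 2], math.dist)
```

Other conversions, such as 1 - d for distances bounded by 1, are equally plausible; the right one depends on the range of the distance function passed in.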
