proxi.utils package

Submodules

proxi.utils.distance module

Distance functions for the proxi project.

proxi.utils.distance.abs_correlation(x, y)

Compute absolute correlation distance between two vectors.

Parameters:
  • x (array_like, Shape(N,)) – First input vector.
  • y (array_like, Shape(N,)) – Second input vector.
Returns:
  1 - |pcc(x, y)|, where pcc is the Pearson correlation coefficient of x and y.
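The formula above can be illustrated with a minimal standard-library sketch. The names pearson and abs_correlation_sketch are hypothetical stand-ins written for this example; they are not the library's implementation, which may differ in details such as input validation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def abs_correlation_sketch(x, y):
    """Hypothetical stand-in: distance = 1 - |pcc(x, y)|."""
    return 1.0 - abs(pearson(x, y))


# Perfectly anti-correlated vectors have |pcc| = 1, so their distance is 0.
d = abs_correlation_sketch([1, 2, 3, 4], [8, 6, 4, 2])
```

Because the absolute value is taken, strongly negative and strongly positive correlations both yield small distances.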

proxi.utils.distance.abs_kendall(x, y)

Compute absolute Kendall correlation (tau) distance between two vectors.

Parameters:
  • x (array_like, Shape(N,)) – First input vector.
  • y (array_like, Shape(N,)) – Second input vector.
Returns:
  1 - |tau(x, y)|

proxi.utils.distance.abs_spearmann(x, y)

Compute absolute Spearman correlation (spc) distance between two vectors.

Parameters:
  • x (array_like, Shape(N,)) – First input vector.
  • y (array_like, Shape(N,)) – Second input vector.
Returns:
  1 - |spc(x, y)|

proxi.utils.distance.neg_correlation(x, y)

Compute negative correlation distance between two vectors.

Parameters:
  • x (array_like, Shape(N,)) – First input vector.
  • y (array_like, Shape(N,)) – Second input vector.
Returns:
  1 if pcc(x, y) is positive; otherwise, 1 + pcc(x, y).

proxi.utils.distance.neg_kendall(x, y)

Compute negative Kendall correlation (tau) distance between two vectors.

Parameters:
  • x (array_like, Shape(N,)) – First input vector.
  • y (array_like, Shape(N,)) – Second input vector.
Returns:
  1 if tau(x, y) is positive; otherwise, 1 + tau(x, y).

proxi.utils.distance.neg_spearmann(x, y)

Compute negative Spearman correlation (spc) distance between two vectors.

Parameters:
  • x (array_like, Shape(N,)) – First input vector.
  • y (array_like, Shape(N,)) – Second input vector.
Returns:
  1 if spc(x, y) is positive; otherwise, 1 + spc(x, y).

proxi.utils.distance.pos_correlation(x, y)

Compute positive correlation distance between two vectors.

Parameters:
  • x (array_like, Shape(N,)) – First input vector.
  • y (array_like, Shape(N,)) – Second input vector.
Returns:
  1 if pcc(x, y) is negative; otherwise, 1 - pcc(x, y).

proxi.utils.distance.pos_kendall(x, y)

Compute positive Kendall correlation (tau) distance between two vectors.

Parameters:
  • x (array_like, Shape(N,)) – First input vector.
  • y (array_like, Shape(N,)) – Second input vector.
Returns:
  1 if tau(x, y) is negative; otherwise, 1 - tau(x, y).

proxi.utils.distance.pos_spearmann(x, y)

Compute positive Spearman correlation (spc) distance between two vectors.

Parameters:
  • x (array_like, Shape(N,)) – First input vector.
  • y (array_like, Shape(N,)) – Second input vector.
Returns:
  1 if spc(x, y) is negative; otherwise, 1 - spc(x, y).
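The abs, neg, and pos variants in this module apply the same three transformations to a correlation coefficient r (pcc, tau, or spc). The sketch below shows only those transformations; the function names are hypothetical and written for this example, and computing r itself is left out.

```python
def abs_distance(r):
    """Small when the correlation is strong in either direction."""
    return 1.0 - abs(r)


def neg_distance(r):
    """Small only for strong negative correlation; positive r maps to 1."""
    return 1.0 if r > 0 else 1.0 + r


def pos_distance(r):
    """Small only for strong positive correlation; negative r maps to 1."""
    return 1.0 if r < 0 else 1.0 - r


# The same coefficient gives different distances under each variant.
r = -0.75
distances = (abs_distance(r), neg_distance(r), pos_distance(r))
```

With r = -0.75, the absolute and negative variants both give 0.25, while the positive variant gives the maximum distance of 1.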

proxi.utils.misc module

Miscellaneous Python methods for the proxi project.

proxi.utils.misc.aggregate_graphs(G, min_num_edges, is_weighted=False)

Aggregate the adjacency matrices of graphs defined over the same set of nodes.

Parameters:
  • G (list of array_like matrices of shape (N,N)) – list of adjacency matrices.
  • min_num_edges (int) – min number of edges between two nodes required to keep an edge between them in the aggregated graph.
  • is_weighted (bool, optional, default = False) – whether to compute a weighted aggregated graph.
Returns:

  • rVal – aggregated graph.
  • W – edge weights (None if is_weighted is False).
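One way to read the aggregation described above is as vote counting over the input graphs. The following is a minimal sketch using nested lists instead of arrays. It assumes that an edge is kept when at least min_num_edges graphs contain it (an inclusive threshold) and that the weight of a kept edge is its vote count; neither detail is stated in this reference, and aggregate_graphs_sketch is a hypothetical name.

```python
def aggregate_graphs_sketch(graphs, min_num_edges, is_weighted=False):
    """Hypothetical stand-in for vote-based aggregation of adjacency matrices."""
    n = len(graphs[0])
    # votes[i][j] = number of input graphs with an edge between i and j
    votes = [[sum(1 for g in graphs if g[i][j] != 0) for j in range(n)]
             for i in range(n)]
    # Assumption: inclusive threshold on the vote count
    aggregated = [[1 if votes[i][j] >= min_num_edges else 0 for j in range(n)]
                  for i in range(n)]
    weights = None
    if is_weighted:
        # Assumption: the weight of a kept edge is its vote count
        weights = [[votes[i][j] if aggregated[i][j] else 0 for j in range(n)]
                   for i in range(n)]
    return aggregated, weights


g1 = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
g2 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
g3 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
agg, w = aggregate_graphs_sketch([g1, g2, g3], min_num_edges=2, is_weighted=True)
```

Here only the edge between nodes 0 and 1 appears in at least two graphs, so it is the only edge in the aggregated graph.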

proxi.utils.misc.filter_edges_by_votes(A, G, min_num_votes)

Aggregate the adjacency matrices of a list of graphs G and use the aggregated graph to decide which edges in the base graph A to keep. All graphs are assumed to be defined over the same set of nodes.

Parameters:
  • A (array_like, shape(N,N)) – adjacency matrix of the base graph.
  • G (list of array_like matrices of shape (N,N)) – list of adjacency matrices.
  • min_num_votes (int) – minimum number of edges between two nodes in the aggregated graph required to keep their edge (if it exists) in the base graph.
Returns:

  • rVal (array_like, shape(N,N)) – adjacency matrix of the filtered base graph.
  • W (array_like, shape(N,N)) – edge weights associated with the rVal graph.
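The filtering step can be sketched in the same vote-counting style. This example assumes an inclusive threshold and uses vote counts as the weights of the surviving edges; both are assumptions made for illustration, and filter_edges_by_votes_sketch is a hypothetical name rather than the library function.

```python
def filter_edges_by_votes_sketch(base, graphs, min_num_votes):
    """Hypothetical stand-in: keep a base edge only if enough graphs vote for it."""
    n = len(base)
    # votes[i][j] = number of graphs in `graphs` with an edge between i and j
    votes = [[sum(1 for g in graphs if g[i][j] != 0) for j in range(n)]
             for i in range(n)]
    # Assumption: an existing base edge survives when votes >= min_num_votes
    filtered = [[base[i][j] if base[i][j] != 0 and votes[i][j] >= min_num_votes
                 else 0 for j in range(n)] for i in range(n)]
    # Assumption: weights of surviving edges are their vote counts
    weights = [[votes[i][j] if filtered[i][j] != 0 else 0 for j in range(n)]
               for i in range(n)]
    return filtered, weights


base = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
voters = [
    [[0, 1, 1], [1, 0, 0], [1, 0, 0]],
    [[0, 1, 0], [1, 0, 1], [0, 1, 0]],
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
]
kept, kept_weights = filter_edges_by_votes_sketch(base, voters, min_num_votes=2)
```

Edges that exist in the base graph but receive fewer than min_num_votes votes are dropped; edges absent from the base graph are never added.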

proxi.utils.misc.save_graph(A, nodes_id, out_file, create_using=None)

Save the graph in graphml format.

Parameters:
  • A (array_like, shape(N,N)) – adjacency matrix of the graph.
  • nodes_id (array_like, shape(N,)) – list of node IDs.
  • out_file (file or string) – File or filename to write. Filenames ending in .gz or .bz2 will be compressed.
  • create_using (NetworkX graph object, optional, default is Graph) – User-specified NetworkX graph type. Accepted types are:

      Undirected simple: Graph
      Directed simple: DiGraph
      With self-loops: Graph, DiGraph
      With parallel edges: MultiGraph, MultiDiGraph

Notes

This implementation, based on the networkx write_graphml method, does not support mixed graphs (directed and undirected edges together), hyperedges, nested graphs, or ports.

proxi.utils.misc.summarize_graph(G)

Report basic summary statistics of a networkx graph object.

Parameters:
  • G (graph) – A networkx graph object.
Returns:
  A dictionary of basic graph properties.

proxi.utils.misc.jaccard_graph_similarity(G1, G2)

Compute Jaccard similarity between two graphs over the same set of nodes.

Parameters:
  • G1 (graph) – A networkx graph object.
  • G2 (graph) – A networkx graph object.
Returns:
  Jaccard similarity between two graphs over the same set of nodes.

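As an illustration, Jaccard similarity between two graphs on the same nodes can be computed as the size of the intersection of their edge sets divided by the size of the union. The sketch below assumes undirected graphs given as plain edge lists rather than networkx objects, and it assumes that two graphs with no edges have similarity 1; the function name is hypothetical, and the library may compute the measure differently.

```python
def jaccard_edge_similarity(edges_a, edges_b):
    """Hypothetical stand-in: |E_a & E_b| / |E_a | E_b| for undirected edges."""
    # frozenset makes (u, v) and (v, u) the same undirected edge
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    union = a | b
    if not union:
        return 1.0  # assumption: two edgeless graphs are treated as identical
    return len(a & b) / len(union)


edges_1 = [(1, 2), (2, 3), (3, 4)]
edges_2 = [(2, 1), (3, 4), (4, 5)]
similarity = jaccard_edge_similarity(edges_1, edges_2)
```

The two example graphs share two of four distinct edges, giving a similarity of 0.5.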
proxi.utils.misc.get_graph_object(A, nodes_id=None)

Construct a networkx graph object given an adjacency matrix and node IDs.

Parameters:
  • A (array_like, shape(N,N)) – adjacency matrix of the graph.
  • nodes_id (array_like, shape(N,), optional, default = None) – list of node IDs.
Returns:
  A networkx graph object.

proxi.utils.misc.get_collable_name(func)

Return the name of a callable function.

Parameters:
  • func (callable) – A callable function.
Returns:
  The name of the callable function.

Notes

For comparison, str(func) returns a string such as <function neg_correlation at 0x1085cdd08>, which includes the object's memory address.
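The difference between the two representations can be seen with a short sketch. It assumes the name is read from the function's __name__ attribute, which is one common approach; callable_name is a hypothetical helper and may not match how the library extracts the name.

```python
def callable_name(func):
    """Hypothetical helper: return a readable name for a callable."""
    # Plain functions expose __name__; fall back to the type name otherwise.
    return getattr(func, "__name__", type(func).__name__)


def neg_correlation(x, y):
    """Placeholder function used only to demonstrate name lookup."""
    return None


name = callable_name(neg_correlation)  # the bare name, without an address
text = str(neg_correlation)            # includes a memory address
```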

proxi.utils.process module

Pre-processing methods for the proxi project.

proxi.utils.process.filter_OTUs_by_name(data, OTUs_to_keep, OTUs_column)

Keep only the OTUs listed in OTUs_to_keep.

Parameters:
  • data (DataFrame) – Input data as a pandas DataFrame object. Each row is an OTU and each column is a sample.
  • OTUs_to_keep (list) – List of OTU IDs to select from the input dataframe.
  • OTUs_column (string) – Name of the DataFrame column that contains the OTU IDs (i.e., node IDs).
Returns:
  A dataframe derived from the input data by keeping only rows with the specified OTU IDs.

proxi.utils.process.get_MAD(x)

Compute the median absolute deviation (MAD) of x. MAD is defined as the median of the absolute deviations from the data's median: MAD = median(|x - median(x)|).

Parameters:
  • x (array_like, Shape(N,)) – Input 1-D array.
Returns:
  The median of the absolute deviations (MAD) of x.

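The definition translates directly into a few lines of standard-library Python. The function name mad_sketch is hypothetical and the example is a sketch of the definition, not the library's code.

```python
from statistics import median


def mad_sketch(x):
    """Median of the absolute deviations from the median of x."""
    center = median(x)
    return median(abs(v - center) for v in x)


# median is 2; absolute deviations are [1, 1, 0, 0, 2, 4, 7]; their median is 1
mad_value = mad_sketch([1, 1, 2, 2, 4, 6, 9])
```

Unlike the variance, MAD is not dominated by a single extreme value such as the 9 in this example.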
proxi.utils.process.get_non_zero_percentage(x)

Compute the fraction of non-zero values in a 1-D array x.

Parameters:
  • x (array_like, Shape(N,)) – Input 1-D array.
Returns:
  The percentage of non-zero elements in x.

proxi.utils.process.get_variance(x)

Compute the variance of an input vector x. Variance is the average of the squared deviations from the mean: var = mean(abs(x - x.mean())**2).

Parameters:
  • x (array_like, Shape(N,)) – Input 1-D array.
Returns:
  The variance of x.

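The two scoring functions above can be sketched without any dependencies. The names below are hypothetical. The variance follows the formula given in the description (a population variance, dividing by N), and the non-zero score is returned as a fraction in [0, 1], which is an assumption about the scale used.

```python
def variance_sketch(x):
    """Average of the squared deviations from the mean (divides by N)."""
    mean = sum(x) / len(x)
    return sum(abs(v - mean) ** 2 for v in x) / len(x)


def non_zero_fraction_sketch(x):
    """Share of entries that are non-zero (assumed to be a 0..1 fraction)."""
    return sum(1 for v in x if v != 0) / len(x)


counts = [0, 1, 0, 3]
var_value = variance_sketch([1, 2, 3, 4])
nz_value = non_zero_fraction_sketch(counts)
```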
proxi.utils.process.select_top_OTUs(data, score_function, threshold, OTUs_column)

Filter OTUs using a scoring function and return top k OTUs or OTUs with scores greater than a threshold score.

Parameters:
  • data (DataFrame) – Input data as a pandas DataFrame object. Each row is an OTU and each column is a sample.
  • score_function (callable) – Unsupervised scoring function (e.g., variance or percentage of non-zeros) applied to each OTU.
  • threshold (float) – If threshold > 1, return the top threshold OTUs (threshold is treated as the count k). Otherwise, return OTUs with score > threshold.
  • OTUs_column (string) – Name of the DataFrame column that contains the OTU IDs (i.e., node IDs).
Returns:
  A dataframe with the selected OTUs.
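The two meanings of threshold can be made concrete with a small sketch. It works on a plain dict of precomputed scores rather than a DataFrame, and it assumes that "top" means highest score first; select_top_sketch and the OTU IDs are hypothetical.

```python
def select_top_sketch(scores, threshold):
    """Hypothetical stand-in: scores maps OTU ID -> score."""
    if threshold > 1:
        # threshold is interpreted as k: keep the k highest-scoring OTUs
        k = int(threshold)
        ranked = sorted(scores, key=scores.get, reverse=True)
        return ranked[:k]
    # otherwise keep every OTU whose score exceeds the threshold
    return [otu for otu, score in scores.items() if score > threshold]


scores = {"otu_a": 0.9, "otu_b": 0.2, "otu_c": 0.6, "otu_d": 0.4}
top_two = select_top_sketch(scores, 2)
above_cutoff = select_top_sketch(scores, 0.3)
```

One consequence of this convention is that a score cutoff greater than 1 cannot be expressed, since any such value is read as a count.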

proxi.utils.similarity module

Similarity functions for the proxi project.

proxi.utils.similarity.abs_Kendall(x, y)

Compute absolute Kendall correlation similarity between two vectors.

Parameters:
  • x (array_like, Shape(N,)) – First input vector.
  • y (array_like, Shape(N,)) – Second input vector.
Returns:
  |kendalltau(x, y)|

proxi.utils.similarity.abs_pcc(x, y)

Compute absolute Pearson correlation similarity between two vectors.

Parameters:
  • x (array_like, Shape(N,)) – First input vector.
  • y (array_like, Shape(N,)) – Second input vector.
Returns:
  |pcc(x, y)|

proxi.utils.similarity.abs_spc(x, y)

Compute absolute Spearman correlation similarity between two vectors.

Parameters:
  • x (array_like, Shape(N,)) – First input vector.
  • y (array_like, Shape(N,)) – Second input vector.
Returns:
  |spearmanr(x, y)|

proxi.utils.similarity.distance_to_similarity(x, y, dist_func)

Convert the distance functions in scipy.spatial.distance into similarity functions.

Parameters:
  • x (array_like, Shape(N,)) – First input vector.
  • y (array_like, Shape(N,)) – Second input vector.
  • dist_func (callable) – callable distance function (e.g., any distance function in scipy.spatial.distance).
Returns:
  Similarity between x and y.
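This reference does not state which conversion is applied. One common choice, assumed here purely for illustration, is similarity = 1 / (1 + distance), which maps a distance of 0 to a similarity of 1 and decreases as the distance grows. The sketch uses math.dist from the standard library as the distance function in place of a scipy.spatial.distance function; distance_to_similarity_sketch is a hypothetical name.

```python
import math


def distance_to_similarity_sketch(x, y, dist_func):
    """Hypothetical stand-in using the assumed conversion 1 / (1 + d)."""
    return 1.0 / (1.0 + dist_func(x, y))


# Euclidean distance between (0, 0) and (3, 4) is 5, so the similarity is 1/6.
sim_far = distance_to_similarity_sketch([0, 0], [3, 4], math.dist)
sim_same = distance_to_similarity_sketch([1, 2], [1, 2], math.dist)
```

Any callable taking two vectors and returning a non-negative number can be passed as dist_func under this convention.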

Module contents