proxi.utils package
Submodules
proxi.utils.distance module
Distance functions for proxi project.
-
proxi.utils.distance.abs_correlation(x, y)
Compute the absolute correlation distance between two vectors.
Parameters:
- x (array_like, shape (N,)) – First input vector.
- y (array_like, shape (N,)) – Second input vector.
Returns: The absolute correlation distance between x and y.
-
proxi.utils.distance.abs_kendall(x, y)
Compute the absolute Kendall correlation (tau) distance between two vectors.
Parameters:
- x (array_like, shape (N,)) – First input vector.
- y (array_like, shape (N,)) – Second input vector.
Returns: The absolute Kendall correlation distance between x and y.
-
proxi.utils.distance.abs_spearmann(x, y)
Compute the absolute Spearman correlation (spc) distance between two vectors.
Parameters:
- x (array_like, shape (N,)) – First input vector.
- y (array_like, shape (N,)) – Second input vector.
Returns: The absolute Spearman correlation distance between x and y.
-
proxi.utils.distance.neg_correlation(x, y)
Compute the negative correlation distance between two vectors.
Parameters:
- x (array_like, shape (N,)) – First input vector.
- y (array_like, shape (N,)) – Second input vector.
Returns: 1 if the Pearson correlation coefficient pcc(x, y) is positive; otherwise, the distance is 1 + pcc(x, y).
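The rule stated for neg_correlation can be sketched in a few lines. This is a minimal illustration of the documented formula, not proxi's actual source; the helper name neg_correlation_sketch and the use of numpy.corrcoef to obtain the Pearson coefficient are assumptions.

```python
import numpy as np

def neg_correlation_sketch(x, y):
    """Hypothetical helper: 1 if pcc is positive, otherwise 1 + pcc."""
    # np.corrcoef returns the 2x2 correlation matrix; [0, 1] is pcc(x, y).
    pcc = np.corrcoef(x, y)[0, 1]
    return 1.0 if pcc > 0 else 1.0 + pcc

x = np.array([1.0, 2.0, 3.0, 4.0])
print(neg_correlation_sketch(x, x))   # positively correlated vectors are far apart
print(neg_correlation_sketch(x, -x))  # anti-correlated vectors are close together
```

Under this distance only negatively correlated pairs end up close to each other, which is what makes it usable for building a graph of negative associations.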
-
proxi.utils.distance.neg_kendall(x, y)
Compute the negative Kendall correlation (tau) distance between two vectors.
Parameters:
- x (array_like, shape (N,)) – First input vector.
- y (array_like, shape (N,)) – Second input vector.
Returns: 1 if tau is positive; otherwise, the distance is 1 + tau(x, y).
-
proxi.utils.distance.neg_spearmann(x, y)
Compute the negative Spearman correlation (spc) distance between two vectors.
Parameters:
- x (array_like, shape (N,)) – First input vector.
- y (array_like, shape (N,)) – Second input vector.
Returns: 1 if spc is positive; otherwise, the distance is 1 + spc(x, y).
-
proxi.utils.distance.pos_correlation(x, y)
Compute the positive correlation distance between two vectors.
Parameters:
- x (array_like, shape (N,)) – First input vector.
- y (array_like, shape (N,)) – Second input vector.
Returns: 1 if the Pearson correlation coefficient pcc(x, y) is negative; otherwise, the distance is 1 - pcc(x, y).
-
proxi.utils.distance.pos_kendall(x, y)
Compute the positive Kendall correlation (tau) distance between two vectors.
Parameters:
- x (array_like, shape (N,)) – First input vector.
- y (array_like, shape (N,)) – Second input vector.
Returns: 1 if tau is negative; otherwise, the distance is 1 - tau(x, y).
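The positive Kendall distance can be illustrated with SciPy's kendalltau. This is a sketch of the documented rule under the assumption that tau is computed the way scipy.stats.kendalltau computes it; the helper name pos_kendall_sketch is hypothetical and not part of proxi.

```python
from scipy.stats import kendalltau

def pos_kendall_sketch(x, y):
    """Hypothetical helper: 1 if tau is negative, otherwise 1 - tau."""
    tau, _pvalue = kendalltau(x, y)
    return 1.0 if tau < 0 else 1.0 - tau

same_order = pos_kendall_sketch([1, 2, 3, 4], [1, 2, 3, 4])      # identical ranking
one_swap = pos_kendall_sketch([1, 2, 3, 4], [1, 3, 2, 4])        # one discordant pair
reversed_order = pos_kendall_sketch([1, 2, 3, 4], [4, 3, 2, 1])  # fully reversed ranking
print(same_order, one_swap, reversed_order)
```

Identical rankings give a distance of 0, a fully reversed ranking is clipped to 1, and partially concordant rankings fall in between.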
-
proxi.utils.distance.pos_spearmann(x, y)
Compute the positive Spearman correlation (spc) distance between two vectors.
Parameters:
- x (array_like, shape (N,)) – First input vector.
- y (array_like, shape (N,)) – Second input vector.
Returns: 1 if spc is negative; otherwise, the distance is 1 - spc(x, y).
proxi.utils.misc module
Miscellaneous Python methods for proxi project.
-
proxi.utils.misc.aggregate_graphs(G, min_num_edges, is_weighted=False)
Aggregate the adjacency matrices of graphs defined over the same set of nodes.
Parameters:
- G (list of array_like matrices of shape (N, N)) – List of adjacency matrices.
- min_num_edges (int) – Minimum number of edges between two nodes required to keep an edge between them in the aggregated graph.
- is_weighted (bool, optional, default = False) – Whether to compute a weighted aggregated graph.
Returns:
- rVal – Aggregated graph.
- W – Edge weights (None if is_weighted is False).
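One way to read the aggregation described above is as a vote count per node pair. The sketch below assumes that an edge is kept when at least min_num_edges of the input graphs contain it, and that the weights are the raw vote counts; both choices, and the name aggregate_graphs_sketch, are assumptions rather than proxi's confirmed behavior.

```python
import numpy as np

def aggregate_graphs_sketch(G, min_num_edges, is_weighted=False):
    """Hypothetical helper: vote-count aggregation of adjacency matrices."""
    # Binarize every adjacency matrix and count, per node pair,
    # how many graphs contain that edge.
    votes = np.sum([np.asarray(A) != 0 for A in G], axis=0)
    aggregated = (votes >= min_num_edges).astype(int)
    # Assumed weighting: the number of votes on each surviving edge.
    W = votes * aggregated if is_weighted else None
    return aggregated, W

A1 = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
A2 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
A3 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
agg, W = aggregate_graphs_sketch([A1, A2, A3], min_num_edges=2, is_weighted=True)
print(agg)
print(W)
```

Here only the edge between nodes 0 and 1 appears in at least two graphs, so it is the only one that survives.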
-
proxi.utils.misc.filter_edges_by_votes(A, G, min_num_votes)
Aggregate the adjacency matrices of a list of graphs G and use the aggregated graph to decide which edges in the base graph A to keep. All graphs are assumed to be defined over the same set of nodes.
Parameters:
- A (array_like, shape (N, N)) – Adjacency matrix of the base graph.
- G (list of array_like matrices of shape (N, N)) – List of adjacency matrices.
- min_num_votes (int) – Minimum number of edges between two nodes in the aggregated graph required to keep their edge (if it exists) in the base graph.
Returns:
- rVal (array_like, shape (N, N)) – Adjacency matrix of the filtered base graph.
- W (array_like, shape (N, N)) – Edge weights associated with the rVal graph.
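The filtering step can be sketched as a mask over the base graph. This assumes that an edge of A survives when at least min_num_votes of the graphs in G also contain it and that the returned weights are the vote counts; the helper name and both assumptions are illustrative only.

```python
import numpy as np

def filter_edges_by_votes_sketch(A, G, min_num_votes):
    """Hypothetical helper: keep base-graph edges backed by enough votes."""
    A = np.asarray(A)
    # Number of graphs in G that contain each edge.
    votes = np.sum([np.asarray(g) != 0 for g in G], axis=0)
    keep = (A != 0) & (votes >= min_num_votes)
    filtered = np.where(keep, A, 0)
    W = np.where(keep, votes, 0)  # assumed weighting: vote counts
    return filtered, W

base = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
voters = [
    [[0, 1, 1], [1, 0, 0], [1, 0, 0]],
    [[0, 1, 0], [1, 0, 1], [0, 1, 0]],
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
]
filtered, weights = filter_edges_by_votes_sketch(base, voters, min_num_votes=2)
print(filtered)
print(weights)
```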
-
proxi.utils.misc.save_graph(A, nodes_id, out_file, create_using=None)
Save the graph in GraphML format.
Parameters:
- A (array_like, shape (N, N)) – Adjacency matrix of the base graph.
- nodes_id (array_like, shape (N,)) – List of node IDs.
- out_file (file or string) – File or filename to write. Filenames ending in .gz or .bz2 will be compressed.
- create_using (NetworkX Graph object, optional, default is Graph) – User-specified NetworkX graph type. Accepted types are:
  Undirected, simple: Graph
  Directed, simple: DiGraph
  With self-loops: Graph, DiGraph
  With parallel edges: MultiGraph, MultiDiGraph
Notes
This implementation, based on the networkx write_graphml method, does not support mixed graphs (directed and undirected edges together), hyperedges, nested graphs, or ports.
-
proxi.utils.misc.summarize_graph(G)
Report basic summary statistics of a networkx graph object.
Parameters:
- G (graph) – A networkx graph object.
Returns: A dictionary of basic graph properties.
-
proxi.utils.misc.jaccard_graph_similarity(G1, G2)
Compute the Jaccard similarity between two graphs over the same set of nodes.
Parameters:
- G1 (graph) – A networkx graph object.
- G2 (graph) – A networkx graph object.
Returns: The Jaccard similarity between the two graphs over the same set of nodes.
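For two graphs on the same nodes, a common reading of Jaccard similarity is the size of the shared edge set divided by the size of the combined edge set. The sketch below works on plain edge lists to stay self-contained; that it matches proxi's exact definition is an assumption, and the helper name is hypothetical.

```python
def jaccard_edge_similarity_sketch(edges1, edges2):
    """Hypothetical helper: |E1 & E2| / |E1 | E2| for undirected edges."""
    # frozenset makes (u, v) and (v, u) compare equal.
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    union = e1 | e2
    if not union:
        return 1.0  # convention chosen here for two edgeless graphs
    return len(e1 & e2) / len(union)

sim = jaccard_edge_similarity_sketch(
    [("a", "b"), ("b", "c")],
    [("b", "a"), ("c", "d")],
)
print(sim)  # one shared edge out of three distinct edges
```

With networkx graph objects, the edge lists could be taken from G1.edges() and G2.edges().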
-
proxi.utils.misc.get_graph_object(A, nodes_id=None)
Construct a networkx graph object given an adjacency matrix and node IDs.
Parameters:
- A (array_like, shape (N, N)) – Adjacency matrix of the base graph.
- nodes_id (array_like, shape (N,)) – List of node IDs.
Returns: A networkx graph object.
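Building a labeled graph from an adjacency matrix can be sketched with two standard networkx calls, from_numpy_array and relabel_nodes. Whether proxi uses these exact calls is an assumption, and get_graph_object_sketch is a hypothetical stand-in for the documented function.

```python
import networkx as nx
import numpy as np

def get_graph_object_sketch(A, nodes_id=None):
    """Hypothetical helper: adjacency matrix -> networkx Graph."""
    graph = nx.from_numpy_array(np.asarray(A))
    if nodes_id is not None:
        # Replace the default integer labels 0..N-1 with the supplied IDs.
        graph = nx.relabel_nodes(graph, dict(enumerate(nodes_id)))
    return graph

A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
graph = get_graph_object_sketch(A, nodes_id=["otu1", "otu2", "otu3"])
print(graph.number_of_nodes(), graph.number_of_edges())
```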
-
proxi.utils.misc.get_collable_name(func)
Return the name of a callable function.
Parameters:
- func (callable function) – The function whose name is returned.
Returns: The name of the callable function.
Notes
str(func) returns a string such as <function neg_correlation at 0x1085cdd08>.
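The difference between str(func) and the bare function name can be shown with Python's standard __name__ attribute. This is a sketch of one way to obtain the name, not necessarily how proxi does it; callable_name_sketch and the fallback to the type name are assumptions.

```python
def callable_name_sketch(func):
    """Hypothetical helper: return a readable name for a callable."""
    # Plain functions expose __name__; fall back to the type name for
    # callables that do not (e.g. functools.partial objects).
    return getattr(func, "__name__", type(func).__name__)

def neg_correlation(x, y):
    """Placeholder function used only to demonstrate name lookup."""
    return 0.0

print(str(neg_correlation))                   # includes a memory address
print(callable_name_sketch(neg_correlation))  # just the function name
```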
proxi.utils.process module
Pre-processing methods for proxi project.
-
proxi.utils.process.filter_OTUs_by_name(data, OTUs_to_keep, OTUs_column)
Keep only the OTUs in the OTUs_to_keep list.
Parameters:
- data (DataFrame) – Input data as a pandas DataFrame object. Each row is an OTU and each column is a sample.
- OTUs_to_keep (list) – List of OTU IDs to select from the input DataFrame.
- OTUs_column (string) – Name of the DataFrame column that contains the OTU IDs (i.e., node IDs).
Returns: A DataFrame derived from the input data by keeping only rows with the specified OTU IDs.
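Row selection by OTU ID maps naturally onto pandas' Series.isin. The sketch below assumes a DataFrame with one ID column and one column per sample; the column name OTU_ID, the sample data, and the helper name are made up for illustration.

```python
import pandas as pd

def filter_otus_by_name_sketch(data, OTUs_to_keep, OTUs_column):
    """Hypothetical helper: keep rows whose ID is in OTUs_to_keep."""
    return data[data[OTUs_column].isin(OTUs_to_keep)]

data = pd.DataFrame({
    "OTU_ID": ["otu1", "otu2", "otu3"],  # hypothetical ID column
    "sample1": [5, 0, 2],
    "sample2": [3, 1, 0],
})
kept = filter_otus_by_name_sketch(data, ["otu1", "otu3"], "OTU_ID")
print(kept)
```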
-
proxi.utils.process.get_MAD(x)
Compute the median absolute deviation (MAD), defined as the median of the absolute deviations from the data's median: MAD = median(abs(x - median(x))).
Parameters:
- x (array_like, shape (N,)) – Input 1-D array.
Returns: The median of the absolute deviations (MAD) of x.
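The MAD definition translates directly into NumPy. This is a minimal sketch of the stated definition; the helper name mad_sketch is hypothetical.

```python
import numpy as np

def mad_sketch(x):
    """Hypothetical helper: median of absolute deviations from the median."""
    x = np.asarray(x, dtype=float)
    return np.median(np.abs(x - np.median(x)))

values = [1, 2, 3, 4, 100]
print(mad_sketch(values))  # the outlier 100 barely affects the result
```

Because both steps use the median, a single extreme value has little influence, which is why MAD is often preferred over variance for skewed abundance data.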
-
proxi.utils.process.get_non_zero_percentage(x)
Compute the fraction of non-zero values in a 1-D array x.
Parameters:
- x (array_like, shape (N,)) – Input 1-D array.
Returns: The percentage of non-zero elements in x.
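Counting non-zero entries is a one-liner with NumPy. The sketch returns a fraction between 0 and 1; whether proxi scales the result to 0 to 100 is not stated in the text, so that choice is an assumption, as is the helper name.

```python
import numpy as np

def non_zero_fraction_sketch(x):
    """Hypothetical helper: share of non-zero entries, in the range [0, 1]."""
    x = np.asarray(x)
    return np.count_nonzero(x) / x.size

counts = [0, 1, 0, 3]
print(non_zero_fraction_sketch(counts))  # two of the four entries are non-zero
```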
-
proxi.utils.process.get_variance(x)
Compute the variance of an input vector x. Variance is the average of the squared deviations from the mean: var = mean(abs(x - x.mean())**2)
Parameters:
- x (array_like, shape (N,)) – Input 1-D array.
Returns: The variance of x.
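The variance formula given above is the population variance, which is what numpy.var computes with its default ddof=0. The sketch below spells the formula out and compares it with numpy.var; the helper name is hypothetical.

```python
import numpy as np

def variance_sketch(x):
    """Hypothetical helper: mean(abs(x - x.mean())**2)."""
    x = np.asarray(x, dtype=float)
    return np.mean(np.abs(x - x.mean()) ** 2)

sample = [1, 2, 3, 4]
print(variance_sketch(sample), np.var(sample))  # the two values agree
```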
-
proxi.utils.process.select_top_OTUs(data, score_function, threshold, OTUs_column)
Filter OTUs using a scoring function and return the top k OTUs or the OTUs with scores greater than a threshold score.
Parameters:
- data (DataFrame) – Input data as a pandas DataFrame object. Each row is an OTU and each column is a sample.
- score_function (callable function) – Unsupervised scoring function (e.g., variance or percentage of non-zeros) applied to each OTU.
- threshold (float) – If threshold > 1, return the top threshold OTUs. Otherwise, return the OTUs with score > threshold.
- OTUs_column (string) – Name of the DataFrame column that contains the OTU IDs (i.e., node IDs).
Returns: A DataFrame with the selected OTUs.
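The two modes of threshold (top-k when it exceeds 1, score cutoff otherwise) can be sketched with pandas. This illustrates the documented rule, not proxi's implementation; the helper name, the OTU_ID column, the sample data, and the choice to score each row over its sample columns are assumptions.

```python
import numpy as np
import pandas as pd

def select_top_otus_sketch(data, score_function, threshold, OTUs_column):
    """Hypothetical helper: top-k selection or score cutoff per OTU row."""
    # Score each OTU (row) over its sample columns only.
    samples = data.drop(columns=[OTUs_column])
    scores = samples.apply(
        lambda row: score_function(row.to_numpy(dtype=float)), axis=1
    )
    if threshold > 1:
        # Interpret threshold as k: keep the k highest-scoring OTUs.
        return data.loc[scores.nlargest(int(threshold)).index]
    # Otherwise keep OTUs whose score exceeds the threshold.
    return data[scores > threshold]

data = pd.DataFrame({
    "OTU_ID": ["otu1", "otu2", "otu3"],  # hypothetical ID column
    "sample1": [10, 1, 4],
    "sample2": [0, 1, 0],
})
top2 = select_top_otus_sketch(data, np.var, 2, "OTU_ID")
dense = select_top_otus_sketch(
    data, lambda v: np.count_nonzero(v) / v.size, 0.75, "OTU_ID"
)
print(top2)
print(dense)
```

The first call ranks OTUs by variance and keeps the two most variable ones; the second keeps only OTUs that are non-zero in more than 75% of samples.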
proxi.utils.similarity module
Similarity functions for proxi project.
-
proxi.utils.similarity.abs_Kendall(x, y)
Compute the absolute Kendall correlation similarity between two vectors.
Parameters:
- x (array_like, shape (N,)) – First input vector.
- y (array_like, shape (N,)) – Second input vector.
Returns: The absolute Kendall correlation similarity between x and y.
-
proxi.utils.similarity.abs_pcc(x, y)
Compute the absolute Pearson correlation similarity between two vectors.
Parameters:
- x (array_like, shape (N,)) – First input vector.
- y (array_like, shape (N,)) – Second input vector.
Returns: The absolute Pearson correlation similarity between x and y.
-
proxi.utils.similarity.abs_spc(x, y)
Compute the absolute Spearman correlation similarity between two vectors.
Parameters:
- x (array_like, shape (N,)) – First input vector.
- y (array_like, shape (N,)) – Second input vector.
Returns: The absolute Spearman correlation similarity between x and y.
-
proxi.utils.similarity.distance_to_similarity(x, y, dist_func)
Convert a distance function from scipy.spatial.distance into a similarity function.
Parameters:
- x (array_like, shape (N,)) – First input vector.
- y (array_like, shape (N,)) – Second input vector.
- dist_func (callable) – Callable distance function (e.g., any distance function in scipy.spatial.distance).
Returns: The similarity between x and y.
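The text does not say which conversion is applied, so the sketch below uses one common choice, similarity = 1 / (1 + distance), purely as an assumption. The helper name is hypothetical; scipy.spatial.distance.euclidean serves as the example dist_func.

```python
from scipy.spatial.distance import euclidean

def distance_to_similarity_sketch(x, y, dist_func):
    """Hypothetical helper. ASSUMED conversion: 1 / (1 + distance)."""
    return 1.0 / (1.0 + dist_func(x, y))

far = distance_to_similarity_sketch([0.0, 0.0], [3.0, 4.0], euclidean)
same = distance_to_similarity_sketch([1.0, 2.0], [1.0, 2.0], euclidean)
print(far, same)
```

With this conversion, identical vectors get similarity 1 and the value decays toward 0 as the distance grows; other monotone decreasing mappings would serve the same purpose.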