bioneuralnet.clustering.correlated_pagerank
Functions
- Retrieves a global logger configured to write to 'bioneuralnet.log' at the project root.
- Pearson correlation coefficient and p-value for testing non-correlation.
Classes
- alias of
- Special type indicating an unconstrained type.
- Command-line reporter
- PageRank Class for Clustering Nodes Based on Personalized PageRank.
- class bioneuralnet.clustering.correlated_pagerank.CorrelatedPageRank(graph: Graph, omics_data: DataFrame, phenotype_data: DataFrame, alpha: float = 0.9, max_iter: int = 100, tol: float = 1e-06, k: float = 0.5, tune: bool = False, gpu: bool = False, seed: int | None = None)[source]
Bases: object
PageRank Class for Clustering Nodes Based on Personalized PageRank.
This class handles the execution of the Personalized PageRank algorithm and identification of clusters based on sweep cuts.
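The two ideas named above, personalized PageRank and a sweep cut, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the class's implementation: the helper names `personalized_pagerank` and `sweep_cut` are hypothetical, the graph is a plain adjacency dictionary, and the sweep objective is conductance alone (the source does not say which objective `CorrelatedPageRank` optimizes). The defaults mirror the `alpha`, `max_iter` and `tol` values in the class signature.

```python
from typing import Dict, Hashable, List, Tuple

# Undirected weighted graph as adjacency maps: node -> {neighbor: edge weight}.
Graph = Dict[Hashable, Dict[Hashable, float]]


def personalized_pagerank(
    graph: Graph,
    personalization: Dict[Hashable, float],
    alpha: float = 0.9,
    max_iter: int = 100,
    tol: float = 1e-6,
) -> Dict[Hashable, float]:
    """Power iteration for PageRank with a teleport (personalization) vector."""
    total = sum(personalization.values())
    if total <= 0:
        raise ValueError("personalization weights must sum to a positive value")
    teleport = {n: personalization.get(n, 0.0) / total for n in graph}
    degree = {n: sum(nbrs.values()) for n, nbrs in graph.items()}
    rank = dict(teleport)
    for _ in range(max_iter):
        # With probability (1 - alpha) the walk restarts at the teleport distribution.
        new_rank = {n: (1.0 - alpha) * teleport[n] for n in graph}
        for n, nbrs in graph.items():
            if degree[n] == 0:
                # Isolated node: send its mass back to the teleport distribution.
                for m in graph:
                    new_rank[m] += alpha * rank[n] * teleport[m]
                continue
            for m, w in nbrs.items():
                new_rank[m] += alpha * rank[n] * w / degree[n]
        delta = sum(abs(new_rank[n] - rank[n]) for n in graph)
        rank = new_rank
        if delta < tol:
            break
    return rank


def sweep_cut(graph: Graph, rank: Dict[Hashable, float]) -> Tuple[List[Hashable], float]:
    """Return the prefix of the degree-normalized ranking with the lowest conductance."""
    degree = {n: sum(nbrs.values()) for n, nbrs in graph.items()}
    total_volume = sum(degree.values())
    order = sorted(
        (n for n in graph if degree[n] > 0 and rank[n] > 0),
        key=lambda n: rank[n] / degree[n],
        reverse=True,
    )
    inside = set()
    volume = 0.0
    cut = 0.0
    best_size, best_phi = 0, float("inf")
    for size, n in enumerate(order, start=1):
        to_inside = sum(w for m, w in graph[n].items() if m in inside)
        to_outside = sum(w for m, w in graph[n].items() if m not in inside and m != n)
        inside.add(n)
        volume += degree[n]
        cut += to_outside - to_inside  # edges to earlier members stop being cut edges
        denom = min(volume, total_volume - volume)
        if denom <= 0:
            continue  # the prefix covers the whole graph; conductance is undefined
        phi = cut / denom
        if phi < best_phi:
            best_size, best_phi = size, phi
    return order[:best_size], best_phi


# Toy graph: two triangles (a, b, c) and (d, e, f) joined by the edge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e"), ("d", "f"), ("e", "f")]
graph: Graph = {}
for u, v in edges:
    graph.setdefault(u, {})[v] = 1.0
    graph.setdefault(v, {})[u] = 1.0

scores = personalized_pagerank(graph, {"a": 1.0})
cluster, phi = sweep_cut(graph, scores)
```

Seeding at `a` recovers the triangle containing it: the single bridging edge `c-d` gives that prefix the lowest conductance in the sweep.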
- generate_weighted_personalization(nodes: List[Any]) → Dict[Any, float][source]
Generates a weighted personalization vector for PageRank.
- Parameters:
nodes (List[Any]) – List of node identifiers to consider.
- Returns:
Personalization vector with weights for each node.
- Return type:
Dict[Any, float]
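The source does not say how the weights are derived, so the following is only a sketch of the shape of such a vector: a mapping from node to non-negative weight, normalized to sum to 1. The function name `weighted_personalization` and the `raw_weights` input are hypothetical. A dictionary of this form is what `networkx.pagerank` accepts as its `personalization` argument.

```python
from typing import Any, Dict, List


def weighted_personalization(nodes: List[Any], raw_weights: Dict[Any, float]) -> Dict[Any, float]:
    """Turn non-negative per-node weights into a probability distribution over `nodes`."""
    if not nodes:
        raise ValueError("nodes must not be empty")
    # Nodes missing from raw_weights get weight 0; negative weights are clipped to 0.
    clipped = {n: max(float(raw_weights.get(n, 0.0)), 0.0) for n in nodes}
    total = sum(clipped.values())
    if total == 0.0:
        # No usable signal: fall back to a uniform vector.
        return {n: 1.0 / len(nodes) for n in nodes}
    return {n: w / total for n, w in clipped.items()}


# Hypothetical raw weights for three nodes; g3 has no weight supplied.
vector = weighted_personalization(["g1", "g2", "g3"], {"g1": 3.0, "g2": 1.0})
```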
- get_quality() → float[source]
Returns the composite score (or correlation) from the latest clustering run.
- phen_omics_corr(nodes: List[Any]) → Tuple[float, str][source]
Calculates the Pearson correlation between the PCA of omics data and phenotype.
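As an illustration of the computation described, the sketch below projects samples onto a leading principal component and correlates the projection with a phenotype vector. It is an assumption-laden stand-in, not the method's code: it uses only the first component, finds it by power iteration in plain Python, and returns just the coefficient (the meaning of the `str` in the method's return type is not documented here). The helper names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence


def first_pc_scores(rows: Sequence[Sequence[float]], n_iter: int = 500, tol: float = 1e-12) -> List[float]:
    """Project samples (rows) onto the leading principal component of their features."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    x = [[r[j] - means[j] for j in range(p)] for r in rows]  # column-centered data
    # Power iteration on X^T X; assumes the start vector is not orthogonal to PC1.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        scores = [sum(a * b for a, b in zip(row, v)) for row in x]
        w = [sum(scores[i] * x[i][j] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break  # no variance in the data
        w = [c / norm for c in w]
        converged = sum(abs(a - b) for a, b in zip(w, v)) < tol
        v = w
        if converged:
            break
    return [sum(a * b for a, b in zip(row, v)) for row in x]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)


# Toy data: three features that are exact multiples of the phenotype.
phenotype = [1.0, 2.0, 3.0, 4.0, 5.0]
omics = [[t, 2.0 * t, -t] for t in phenotype]
r = pearson(first_pc_scores(omics), phenotype)
```

The sign of a principal component is arbitrary, so the magnitude of `r` is usually the quantity of interest.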
- run(seed_nodes: List[Any]) → Dict[str, Any][source]
Executes the correlated PageRank clustering pipeline.
Steps:
- Initializing Clustering:
Receives a list of seed nodes to personalize the PageRank algorithm.
Prepares the input graph and relevant parameters for clustering.
- PageRank Execution:
Applies the PageRank algorithm with personalization based on the seed nodes.
Computes node scores and determines cluster memberships.
- Result Compilation:
Compiles clustering results, including cluster sizes and node memberships, into a dictionary.
Logs the successful completion of the clustering process.
- Parameters:
seed_nodes (List[Any]) – A list of node identifiers used as seed nodes for personalized PageRank. These nodes influence the clustering process by biasing the algorithm.
- Returns:
A dictionary containing the clustering results. Keys may include:
clusters: Lists of nodes grouped into clusters.
scores: PageRank scores for each node.
metadata: Additional metrics or details about the clustering process.
- Return type:
Dict[str, Any]
- Raises:
ValueError – If the input graph is empty or seed nodes are invalid.
Exception – For any unexpected errors during clustering execution.
Notes:
Seed nodes strongly influence the clustering outcome; select them carefully based on prior knowledge or experimental goals.
The PageRank algorithm requires a well-defined and connected graph to produce meaningful results.
Results are sensitive to the alpha (damping factor) and other hyperparameters.
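Because `run` is documented to raise `ValueError` for an empty graph or invalid seed nodes, callers may want to check their seeds before starting a long run. The helper below is a hypothetical pre-check, not part of the library, and it assumes "invalid" means an empty seed list or seeds that are absent from the graph.

```python
from typing import Any, Collection, List


def validate_seed_nodes(graph_nodes: Collection[Any], seed_nodes: List[Any]) -> List[Any]:
    """Return de-duplicated seeds, or raise ValueError if they cannot be used."""
    if len(graph_nodes) == 0:
        raise ValueError("graph is empty")
    if not seed_nodes:
        raise ValueError("seed_nodes must not be empty")
    known = set(graph_nodes)
    missing = [n for n in seed_nodes if n not in known]
    if missing:
        raise ValueError(f"seed nodes not found in graph: {missing}")
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(seed_nodes))


seeds = validate_seed_nodes(["a", "b", "c"], ["b", "a", "b"])
```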