--- title: "Reproduction of manuscript's results" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Reproduction of manuscript's results} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = FALSE ) ``` ## Introduction This vignette serves the purpose to reproduce the results shown in the section "Real data example" of the manuscript describing *collaborative graphical lasso* ([Albanese, Kohlen and Behrouzi, 2024](#ref)). The computations carried out by the following code are heavy and could take a considerable amount of time. On our machine (RAM 48 Gb, CPU Intel(R) Xeon(R) W-2235 CPU @ 3.80GHz) it took almost 3 hours. Hence, we refer the user who wants to get started with the package to the introductory vignette: `vignette("coglasso")`. *Collaborative graphical lasso* is an algorithm to reconstruct multi-omics networks based on *graphical lasso* ([Friedman, Hastie and Tibshirani, 2008](#ref)) and *collaborative regression* ([Gross and Tibshirani, 2015](#ref)). The coglasso package implements and R interface to this algorithm. We will use it to reproduce the results obtained in the original manuscript. Let us first attach coglasso. ```{r setup} library(coglasso) ``` We will use the multi-omics data set provided by the package, in its largest version: `multi_omics_sd`. Let's inspect its description in the manual. ```{r} help(multi_omics_sd) ``` We notice that this data set has 162 transcripts and 76 metabolites. For further information on the data set sources and generation, we refer the user to the script [multi_omics_sd.R](https://github.com/DrQuestion/coglasso/blob/main/data-raw/multi_omics_sd.R) on the GitHub repository of the package. ## Reproducing the results For the multi-omics network reconstruction carried out in the manuscript we explored 30 automatically generated $λ_w$ and $λ_b$ values and 9 given $c$ values. ```{r} nlambda_w <- 30 nlambda_b <- 30 cs <- c(0.01, 0.05, 0.1, 0.2, 1, 5, 10, 20, 100) ``` We can now build the networks, one per combination of the chosen hyperparameters, with `coglasso()`. ```{r} cg <- coglasso(multi_omics_sd, p = 162, nlambda_w = nlambda_w, nlambda_b = nlambda_b, c = cs ) ``` We proceed now to select the most stable, yet sparse network, using *eXtended StARS* (*XStARS*, [Albanese, Kohlen and Behrouzi, 2024](#ref)), an adaptation to coglasso of *StARS* ([Liu, Roeder and Wasserman, 2010](#ref)). The function implementing *XStARS* is `select_coglasso()`, when setting the argument `method` to "xstars" (which is the default behaviour). ```{r} set.seed(42) sel_cg <- select_coglasso(cg, method = "xstars") ``` We display now the selected network using the package `igraph`, reproducing the one displayed in Figure 3 of the manuscript. The orientation of the network may vary. ```{r} plot(sel_cg) ``` We now plot the subnetwork of the gene *Cirbp* and of its neighborhood. This reproduces the network displayed in Figure 4A. ```{r} igraph::V(sel_graph)$label<-colnames(sel_cg$data) cirbp_index <- which(colnames(sel_cg$data) == "Cirbp") subnetwork <- igraph::subgraph( sel_graph, c(cirbp_index, igraph::neighbors(sel_graph, cirbp_index)) ) plot(subnetwork) ``` A useful way to extract information from a network is community discovery. This technique allows to simplify a network by breaking it in functional units where nodes share several connections, enough to be recognized as separate communities. In the manuscript we used the greedy algorithm described in [Clauset, Newman and Moore, 2004](#ref), part of the `igraph` toolkit. We focused and inspected the second largest community, shown in Figure 4B. Here is the code to reproduce it. ```{r} communities <- igraph::cluster_fast_greedy(sel_graph) community2 <- communities[[2]] community2_graph<-igraph::subgraph(sel_graph, community2) # Focusing on nodes of interest fosjun_erg_AA <- c("Fos", "Fosb", "Junb", "Fosl2", "Egr1", "Egr2", "Egr3", "Ala", "Arg", "Asn","His","Ile","Leu","Lys","Met","Orn","Phe","Ser","Thr","Tyr","Val") igraph::V(community2_igraph)[!(label %in% fosjun_erg_AA)]$color<-c(rep("#3bd8ff", 29), rep("#ffadad", 5)) igraph::V(community2_igraph)[!(label %in% fosjun_erg_AA)]$frame.color<-NA igraph::V(community2_igraph)[!(label %in% fosjun_erg_AA)]$label<-NA lo_community2 <- igraph::layout_with_fr(community2_graph) plot(community2_graph, layout=lo_community2) ``` ## References {#ref} Albanese, A., Kohlen, W., & Behrouzi, P. (2024). Collaborative graphical lasso (arXiv:2403.18602). *arXiv* Friedman, J., Hastie, T., & Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. *Biostatistics*, 9(3), 432--441. Gross, S. M., & Tibshirani, R. (2015). Collaborative regression. *Biostatistics*, 16(2), 326--338. Liu, H., Roeder, K., & Wasserman, L. (2010). Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models (arXiv:1006.3316). *arXiv.* Clauset, A., Newman, M. E. J., & Moore, C. (2004). Finding community structure in very large networks. *Physical Review E*, 70(6), 066111.