Rpy2 is a foreign function interface to R. It can be used in the following way:
import rpy2
import rpy2.robjects as robjects
Luckily, we’re not restricted to just calling R functions and creating R objects. The real power of this in-memory interoperability lies in converting Python objects to R objects, calling R functions on them, and converting the results back to Python objects.
Rpy2 requires specific conversion rules for different Python objects. It is straightforward to create R vectors from corresponding Python lists:
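As a minimal sketch of that idea (not necessarily the code originally shown here), the helper below picks one of the vector classes exposed by `rpy2.robjects` (`BoolVector`, `IntVector`, `FloatVector`, `StrVector`) based on the element types of a Python list. The function names `r_vector_class_name` and `to_r_vector` are hypothetical, introduced only for illustration.

```python
def r_vector_class_name(values):
    """Return the name of the rpy2 vector class matching a Python list."""
    if not values:
        raise ValueError("cannot infer a vector type from an empty list")
    # bool is a subclass of int in Python, so it has to be checked first.
    if all(isinstance(v, bool) for v in values):
        return "BoolVector"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "IntVector"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "FloatVector"
    if all(isinstance(v, str) for v in values):
        return "StrVector"
    raise TypeError("mixed or unsupported element types")


def to_r_vector(values):
    """Build an R vector from a homogeneous Python list (requires rpy2 and R)."""
    # Imported lazily so the helper above can be used without R installed.
    import rpy2.robjects as robjects

    vector_class = getattr(robjects, r_vector_class_name(values))
    return vector_class(values)
```

With rpy2 and R available, `to_r_vector([1, 2, 3])` should yield an R integer vector that can be passed straight to an R function such as `robjects.r["sum"]`.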
However, for single cell biology, the objects that are most interesting to convert are (count) matrices, arrays and dataframes. In order to do this, you need to import the corresponding rpy2 modules and specify the conversion context.
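A sketch of what such a conversion context can look like is given below, assuming `counts` is a dense numpy array and `obs` a pandas dataframe. The function name `column_sums_in_r` is hypothetical, and the base R functions `colSums` and `nrow` are stand-ins for whatever R function you actually want to call.

```python
def column_sums_in_r(counts, obs):
    """Call base R functions on a numpy array and a pandas dataframe."""
    # Imported lazily so that this sketch can be loaded without R installed.
    import rpy2.robjects as robjects
    from rpy2.robjects import default_converter, numpy2ri, pandas2ri
    from rpy2.robjects.conversion import localconverter

    # Combine the default rules with the numpy and pandas conversion rules.
    converter = default_converter + numpy2ri.converter + pandas2ri.converter

    # Inside the context, Python arguments are converted to R objects and
    # R results are converted back to Python objects.
    with localconverter(converter):
        col_sums = robjects.r["colSums"](counts)
        n_rows = robjects.r["nrow"](obs)
    return col_sums, n_rows
```

Because the conversion rules only apply inside the `with` block, code outside it keeps rpy2's default behaviour.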
One big limitation of rpy2 is its inability to convert sparse matrices: there is no built-in conversion module for scipy. Apart from converting SingleCellExperiment objects to anndata objects, the anndata2ri package provides functions to convert sparse matrices.
with anndata2ri.converter.context():
    sce = anndata2ri.py2rpy(adata_paul)
    ad2 = anndata2ri.rpy2py(sce)
4.2 Interactive sessions
One of the most useful ways to take advantage of in-memory interoperability is to use it in interactive sessions, where you’re exploring the data and want to try out some functions non-native to your language of choice.
Jupyter notebooks (and some other notebooks) make this possible from the Python side: using IPython line and cell magic and rpy2, you can easily run an R jupyter cell in your notebooks.
%load_ext rpy2.ipython  # line magic that loads the rpy2 ipython extension.
                        # this extension allows the use of the following cell magic

%%R -i input -o output  # this line allows to specify inputs
                        # (which will be converted to R objects) and outputs
                        # (which will be converted back to Python objects)
                        # this line is put at the start of a cell
                        # the rest of the cell will be run as R code
4.3 Usecase: run in Python
We will perform the Compute DE step in Python rather than in R. The pseudobulked data is read in:
import anndata as ad
pd_adata = ad.read_h5ad("../usecase/data/pseudobulk.h5ad")
Creating a DESeq dataset requires a bit more effort: we need to import the DESeq2 package and combine the default, numpy2ri and pandas2ri converters to convert the count matrix and the obs dataframe.
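One possible way to put these pieces together is sketched below. It assumes that `adata.X` is a dense numpy array of integer counts with samples in rows, and that the DESeq2 package is installed in R. The function name `make_deseq_dataset` and the default design `~ condition` are hypothetical placeholders: replace the formula with the columns of `obs` that describe your experiment.

```python
def make_deseq_dataset(adata, design="~ condition"):
    """Build a DESeqDataSet from an AnnData object (requires rpy2, R and DESeq2)."""
    # Imported lazily so that this sketch can be loaded without R installed.
    import rpy2.robjects as robjects
    from rpy2.robjects import default_converter, numpy2ri, pandas2ri
    from rpy2.robjects.conversion import localconverter
    from rpy2.robjects.packages import importr

    deseq2 = importr("DESeq2")

    # Combine the default, numpy2ri and pandas2ri conversion rules.
    converter = default_converter + numpy2ri.converter + pandas2ri.converter

    with localconverter(converter):
        dds = deseq2.DESeqDataSetFromMatrix(
            # DESeq2 expects genes in rows and samples in columns,
            # so the AnnData matrix is transposed (assumed dense here).
            countData=adata.X.T,
            colData=adata.obs,
            # "design" is a placeholder formula; its variables must be columns of obs.
            design=robjects.Formula(design),
        )
    return dds
```

With the data read in above, this would be called as `make_deseq_dataset(pd_adata, design=...)`, and the returned R object can then be handed to further DESeq2 functions.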