5  Reticulate

Reticulate is an R package that provides a foreign function interface to Python.

6 Reticulate: basic functionality

Data types are automatically converted from Python to R and vice versa. The reticulate documentation includes a table of these automatic type conversions.

You can import Python modules and call their functions as follows:

library(reticulate)

bi <- reticulate::import_builtins()
rd <- reticulate::import("random")

example <- c(1,2,3)
bi$max(example)
[1] 3
rd$choice(example)  # random pick, so your result may differ
[1] 3
bi$list(bi$reversed(example))
[1] 3 2 1
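
For comparison, here is a sketch of what the calls above do on the Python side. It assumes, following reticulate's documented conversion rules, that the R numeric vector c(1, 2, 3) arrives in Python as a list of floats. The variable names simply mirror the R code.

```python
import random

# Assumption: reticulate converts the R vector c(1, 2, 3) to a list of floats.
example = [1.0, 2.0, 3.0]

largest = max(example)                   # bi$max(example)
picked = random.choice(example)          # rd$choice(example); varies per call
reversed_list = list(reversed(example))  # bi$list(bi$reversed(example))

print(largest)
print(picked in example)
print(reversed_list)
```

The results are converted back to R values on return, which is why the R session prints ordinary R vectors.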

NumPy can be used in the same way:

np <- reticulate::import("numpy")

a <- np$asarray(tuple(list(1,2), list(3, 4)))
b <- np$asarray(list(5,6))
# Pass the shape positionally: the keyword name (newshape vs. shape) differs between NumPy versions
b <- np$reshape(b, tuple(1L, 2L))

np$concatenate(tuple(a, b), axis=0L)
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
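
The L suffixes in the code above matter: R numbers without the suffix are doubles and arrive in Python as floats, which cannot be used as shapes or axis indices. The following is a minimal sketch using only the standard library. Here operator.index stands in for the integer check such arguments go through; that is an assumption made for illustration, and is_valid_shape is a hypothetical helper.

```python
import operator

shape_from_integers = (1, 2)     # what tuple(1L, 2L) becomes in Python
shape_from_doubles = (1.0, 2.0)  # what tuple(1, 2) would become


def is_valid_shape(shape):
    """Return True if every entry can be used as an integer index."""
    try:
        for entry in shape:
            operator.index(entry)  # raises TypeError for a float
        return True
    except TypeError:
        return False


print(is_valid_shape(shape_from_integers))
print(is_valid_shape(shape_from_doubles))
```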

If you want more fine-grained control over conversion, pass convert = FALSE in the import statement. Functions from that module then return Python objects instead of converting their results to R data types.

np <- reticulate::import("numpy", convert = FALSE)

a <- np$asarray(tuple(list(1,2), list(3, 4)))
b <- np$asarray(list(5,6))
b <- np$reshape(b, tuple(1L, 2L))  # shape passed positionally

np$concatenate(tuple(a, b), axis=0L)
array([[1., 2.],
       [3., 4.],
       [5., 6.]])

You can then convert between Python and R data types explicitly with py_to_r() and r_to_py():

result <- np$concatenate(tuple(a, b), axis=0L)

py_to_r(result)
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
result_r <- py_to_r(result)
r_to_py(result_r)
array([[1., 2.],
       [3., 4.],
       [5., 6.]])

7 Interactivity

You can include Python chunks in R Markdown notebooks using the Python engine in knitr.
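
As a minimal sketch, the body of a Python chunk (one whose chunk header is {python}) might look like the following. The variable names are hypothetical. With reticulate's knitr engine, objects created in Python chunks can be read from later R chunks through the py object, for example py$mean_value.

```python
import statistics

# Hypothetical data defined inside the Python chunk.
values = [1.0, 2.0, 3.0]

# Plain Python computation; nothing here depends on R.
mean_value = statistics.mean(values)
print(mean_value)
```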

8 Use case

We will not use the DE analysis to showcase reticulate. It would involve loading pandas to create a Python data frame, adding row names and column names, and then grouping the data, all of which is easier to do natively in R.

A more interesting use of reticulate is interacting with anndata-based Python packages such as scanpy.

library(anndata)
library(reticulate)
sc <- import("scanpy")

adata_path <- "../usecase/data/sc_counts_subset.h5ad"
adata <- anndata::read_h5ad(adata_path)

We can preprocess the data and compute a UMAP embedding:

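# Thresholds are passed as integers (L suffix) so Python receives int rather than float
sc$pp$filter_cells(adata, min_genes = 200L)  # drop cells with fewer than 200 detected genes
sc$pp$filter_genes(adata, min_cells = 3L)    # drop genes detected in fewer than 3 cells
sc$pp$pca(adata)        # principal component analysis
sc$pp$neighbors(adata)  # neighborhood graph
sc$tl$umap(adata)       # UMAP embedding, stored in obsm as X_umap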

adata
AnnData object with n_obs × n_vars = 32727 × 20542
    obs: 'dose_uM', 'timepoint_hr', 'well', 'row', 'col', 'plate_name', 'cell_id', 'cell_type', 'split', 'donor_id', 'sm_name', 'control', 'SMILES', 'sm_lincs_id', 'library_id', 'leiden_res1', 'group', 'cell_type_orig', 'plate_well_celltype_reannotated', 'cell_count_by_well_celltype', 'cell_count_by_plate_well', 'n_genes'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'n_cells'
    uns: 'cell_type_colors', 'celltypist_celltype_colors', 'donor_id_colors', 'hvg', 'leiden_res1_colors', 'log1p', 'neighbors', 'over_clustering', 'rank_genes_groups', 'pca', 'umap'
    obsm: 'HTO_clr', 'X_pca', 'X_umap', 'protein_counts'
    varm: 'PCs'
    obsp: 'connectivities', 'distances'

The plot cannot easily be displayed directly in this Quarto notebook, but we can save it to a file and then include the image:

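path <- "umap.png"
# Note: scanpy appends the `save` string to its default file name and writes into
# its figure directory, so by default this call creates figures/umapumap.png.
sc$pl$umap(adata, color="leiden_res1", save=path)

Figure 8.1: UMAP plot of the adata object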