Using R

Downloading a Dataset in R

To access the Cirro Data Portal directly from R, you must first:

  1. Install the cirro client library, which is a Python package (with pip install cirro from the command line)

  2. Install the reticulate package in R (with install.packages("reticulate") from the R prompt)

  3. Log in to your Data Portal account (with cirro-cli configure from the command line)
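
Before moving on, it can help to confirm that reticulate sees the same Python environment that cirro was installed into. The following is a minimal sketch using reticulate's `py_module_available()` and `py_config()`; the variable name `cirro_available` is only illustrative, and the check assumes reticulate has been installed as in step 2.

```r
library(reticulate)

# py_module_available() reports whether the Python interpreter used by
# reticulate can import the named module
cirro_available <- py_module_available("cirro")

if (cirro_available) {
  message("cirro is importable from the Python environment used by reticulate")
} else {
  # py_config() shows which Python interpreter reticulate selected
  print(py_config())
  message("cirro was not found; install it into the interpreter shown above")
}
```

If the check fails, the usual cause is that pip installed cirro into a different Python interpreter than the one reticulate picked up.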

# Once your system is set up, you can use reticulate to import the client library
library(reticulate)
cirro <- import("cirro")
# As described in the Getting Started notebook, the `portal` object is used to access
# information available in the Data Portal
portal <- cirro$DataPortal()
# A common use of R with the Data Portal is reading data directly
# from the files that it hosts

# In the example below, we read in the table of read counts generated in
# the "Test of mageck-count" dataset within the "Test Project" project:
project <- portal$get_project_by_name("Test Project")
dataset <- project$get_dataset_by_name("Test of mageck-count")
counts <- dataset$list_files()$get_by_name("data/mageck/count/combined/counts.txt")$read_csv(sep="\t")
head(counts)
A data.frame: 6 × 6
  sgRNA       Gene   MO_Brunello_gDNA_2  MO_Brunello_1  MO_Brunello_2  MO_Brunello_gDNA_1
  <chr>       <chr>  <dbl>               <dbl>          <dbl>          <dbl>
1 A1BG_0      A1BG   0                   0              0              0
2 A1BG_1      A1BG   0                   0              0              2
3 A1BG_2      A1BG   0                   0              0              0
4 A1BG_3      A1BG   0                   0              2              0
5 A1CF_36946  A1CF   0                   0              0              0
6 A1CF_36947  A1CF   1                   0              0              0
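
Once the table is in R as a data.frame, ordinary base R functions apply to it. The sketch below is one possible follow-up rather than part of the Cirro workflow itself: it rebuilds the six rows printed by head(counts) as a small stand-in (so it runs without a Data Portal login), then totals the reads per sample and per gene. The names `sample_cols`, `reads_per_sample` and `reads_per_gene` are illustrative.

```r
# Stand-in for `counts`, transcribed from the head(counts) output
counts <- data.frame(
  sgRNA = c("A1BG_0", "A1BG_1", "A1BG_2", "A1BG_3", "A1CF_36946", "A1CF_36947"),
  Gene = c("A1BG", "A1BG", "A1BG", "A1BG", "A1CF", "A1CF"),
  MO_Brunello_gDNA_2 = c(0, 0, 0, 0, 0, 1),
  MO_Brunello_1 = c(0, 0, 0, 0, 0, 0),
  MO_Brunello_2 = c(0, 0, 0, 2, 0, 0),
  MO_Brunello_gDNA_1 = c(0, 2, 0, 0, 0, 0)
)

# Sample columns are everything except the sgRNA and Gene identifiers
sample_cols <- setdiff(colnames(counts), c("sgRNA", "Gene"))

# Total reads per sample (a named numeric vector)
reads_per_sample <- colSums(counts[, sample_cols])

# Total reads per gene, summed across that gene's guides
reads_per_gene <- aggregate(
  counts[, sample_cols],
  by = list(Gene = counts$Gene),
  FUN = sum
)

print(reads_per_sample)
print(reads_per_gene)
```

With the real dataset, the data.frame() call would be dropped and the same summaries applied to the `counts` object returned by read_csv.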