Getting Started with the Command Line and Python/R

Along with the Cirro web application, there is also an auxiliary interface that you can use to interact with your data. The cirro package can be used either from the command line (as a command-line interface, or CLI) or within a Python or R session. It can upload and download datasets, and read them into Jupyter Notebooks for additional analysis.

Common Tasks

The Cirro client library can be useful for:

Uploading or downloading large files (> 100 MB) that would be slow to transfer through the web app
Transferring files between Cirro and a remote computing cluster
Automating data ingest or scheduling data analysis

Installation and Set Up

You can install cirro via PyPI using:

pip install cirro

Upon first use, the Cirro client will ask if you would like to save your login information and give you a link to authenticate. Open the link in your web browser and then select your institution and enter your username and password.

If you ever need to change your credentials after this point, you can clear your saved login information by removing the ~/.cirro/token.dat file from your system or by running cirro configure and selecting "No" when it asks if you'd like to save your login information.
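The first option above (removing the token file) can also be scripted. The following is a minimal sketch using only the Python standard library; the function name clear_saved_login is a hypothetical helper and not part of the cirro package, and the default path is simply the ~/.cirro/token.dat location mentioned above.

```python
from pathlib import Path

# Default token location named in the text above (~/.cirro/token.dat).
DEFAULT_TOKEN_PATH = Path.home() / ".cirro" / "token.dat"


def clear_saved_login(token_path: Path = DEFAULT_TOKEN_PATH) -> bool:
    """Delete the saved token file (hypothetical helper, not a cirro API).

    Returns True if a file was removed, False if there was nothing to remove.
    """
    if token_path.is_file():
        token_path.unlink()
        return True
    return False
```

Calling clear_saved_login() with no arguments targets the default location; the next time the client is used, it should prompt for authentication again.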

Command-Line Interface

After installing the cirro package, you can interact with your data from the command prompt using our command-line interface (CLI). Check out some common use cases in our command-line examples.

Scripting Languages

In addition to the command-line interface, the Cirro client can be used from commonly used languages like Python and R. This allows you to (a) use Cirro as part of a more complex set of operations and (b) read data objects from Cirro directly into memory (e.g. as data frames) without having to download any files to disk.

To see more information on the API, visit our external API documentation.

Python Examples

The following Python Jupyter Notebooks contain examples on these topics:

Topic	Jupyter Notebook
Installing and authenticating	Getting Started with Cirro Client
Uploading data	Uploading a dataset
Downloading data	Downloading a dataset
Calling data and reading into tables	Interacting with files
Running an analysis pipeline	Analyzing a dataset
Managing reference data	Using references

R Examples

The following R Jupyter Notebook contains an example on this topic:

Topic	Jupyter Notebook
Downloading a dataset in R	Using R

Filetype Validation

When you upload a dataset, Cirro checks that the files being uploaded meet any requirements set by the selected dataset type. If you try to upload a file and get an error telling you that the files don't meet the dataset type requirements, read through the printout of the required files and make any adjustments. You can always include more files, but you must meet all requirements before uploading. Learn more about dataset type requirements in the documentation.
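To illustrate the idea of the check, here is a minimal local sketch, assuming each requirement can be written as a file-name pattern that at least one file must match. This is not how Cirro defines or enforces dataset type requirements; the function name missing_requirements and the patterns shown are hypothetical.

```python
from fnmatch import fnmatch


def missing_requirements(file_names, required_patterns):
    """Return the required patterns that no file name matches.

    Hypothetical pre-check: extra files are allowed, but every pattern
    must be matched by at least one file.
    """
    return [
        pattern
        for pattern in required_patterns
        if not any(fnmatch(name, pattern) for name in file_names)
    ]


# Illustrative file names and patterns (assumptions, not Cirro rules).
files = ["sample_R1.fastq.gz", "sample_R2.fastq.gz", "notes.txt"]
unmet = missing_requirements(files, ["*_R1.fastq.gz", "*_R2.fastq.gz"])
```

An empty result means every assumed requirement is satisfied; extra files such as notes.txt do not cause a failure, in line with the rule that more files can always be included.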

Data Integrity Validation

The integrity of all files uploaded or downloaded using the Cirro client library is ensured via MD5 checksum validation. Any difference in file content between Cirro and the local system (down to a single byte) results in an error being reported to the user immediately. While this accounts for any issues arising from network errors, users with additional security requirements can enable SHA-256 hashing by following the documentation for the Cirro-client software repository.
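For readers who want to see what checksum validation involves, the sketch below computes a file digest with Python's standard hashlib module and compares two files. It is an illustration of the general technique, not the Cirro client's own implementation; the function names file_digest and files_match are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "md5", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def files_match(first: Path, second: Path, algorithm: str = "md5") -> bool:
    """Return True if both files have the same digest under the given algorithm."""
    return file_digest(first, algorithm) == file_digest(second, algorithm)
```

Passing algorithm="sha256" switches the same comparison to SHA-256. A single differing byte changes the digest, which is why a mismatch can be reported as an error.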