Glossary

Below is a list of Cirro-specific terms and their definitions.

  • Dashboard: A page where you can create multiple tables and plots side by side from multiple datasets.
  • Dataset: A group of one or more files that lives inside a project. Each dataset has a "type" which defines what kinds of files must be included, though a dataset can have more files than just what is required.
  • Notebook: A Jupyter Notebook, which is a standalone place to write and run code (often Python, though other languages such as R are available). Notebooks can read in multiple datasets from a project and write out new files. Notebooks live inside a notebook environment.
  • Notebook Environment: A location where you can run and save notebooks in the browser. Notebooks and the files they create all live in a notebook environment. A project can have multiple notebook environments.
  • Organization: The highest-level container in Cirro, which contains multiple projects. Your organization could be a Cirro instance for your company or for a broader community, like the standalone "Independent Researcher" organization. An organization is also sometimes referred to as a "tenant".
  • Pipeline: Code, often written with Nextflow, that runs an analysis on datasets and writes out new datasets. Pipelines are also sometimes referred to as "processes" or "workflows".
  • Project: A container within an organization that holds datasets, samples, references, etc. Each project has its own users and billing codes.
  • Reference: Additional data files needed for a pipeline to run, like metadata or "normal" samples. All reference files uploaded in a project are available for use by all datasets in that project.
  • Sample: Metadata about the data files in a project, derived either from parsing the file names or from a samplesheet.csv file. All samples in a project are available for use by all datasets in that project.
  • Share: A collection of datasets that can be shared with other projects for them to view and analyze within their own projects. "Publisher" shares are ones where you publish the data for other projects to use, and "subscriber" shares are ones where you accept incoming datasets from other projects.
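
As a rough illustration of how these terms nest, the sketch below models an organization containing projects, which in turn contain datasets, and checks a dataset's files against the files its "type" requires. It is not Cirro's API: the class names, the REQUIRED_FILES mapping, the "example_type" dataset type, and the missing_files helper are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Dataset:
    """A group of files with a type (hypothetical model, not Cirro's API)."""
    name: str
    dataset_type: str
    files: List[str] = field(default_factory=list)


@dataclass
class Project:
    """Holds datasets and references that are shared by those datasets."""
    name: str
    datasets: List[Dataset] = field(default_factory=list)
    references: List[str] = field(default_factory=list)


@dataclass
class Organization:
    """The highest-level container; holds multiple projects."""
    name: str
    projects: List[Project] = field(default_factory=list)


# Hypothetical mapping from a dataset type to the files it must include.
REQUIRED_FILES: Dict[str, Set[str]] = {
    "example_type": {"samplesheet.csv"},
}


def missing_files(dataset: Dataset) -> Set[str]:
    """Return required files that are absent; extra files are allowed."""
    required = REQUIRED_FILES.get(dataset.dataset_type, set())
    return required - set(dataset.files)
```

A dataset that includes every required file plus extras yields an empty set from missing_files, which mirrors the rule that a dataset can have more files than its type requires.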