Glossary

Below is a list of Cirro-specific terms and their definitions.

Dashboard: A page where you can create multiple tables and plots side by side from multiple datasets.
Dataset: A group of one or more files that lives inside a project. Each dataset has a "type" which defines what kinds of files must be included, though a dataset can have more files than just what is required.
Notebook: A Jupyter Notebook, which is a standalone place to write and run code (often Python, though other languages such as R are available). Notebooks can read in multiple datasets from a project and write out new files. Notebooks live inside a notebook environment.
Notebook Environment: A location where you can run and save notebooks in the browser. Notebooks and the files they create all live in a notebook environment. A project can have multiple notebook environments.
Pipeline: Code, often written in Nextflow, that runs an analysis on datasets and writes out new datasets. Pipelines are also sometimes referred to as "processes" or "workflows".
Project: A top-level container that holds datasets, samples, references, etc. Each project has its own users and billing codes.
Reference: Additional data files needed for a pipeline to run, like metadata or "normal" samples. All reference files uploaded in a project are available for use by all datasets in that project.
Sample: Metadata about the data files in a project, derived either from parsing the file names or from a samplesheet.csv file. All samples in a project are available for use by all datasets in that project.