
Installing Cirro in your AWS Account

This page is for users who want to create and manage projects in Cirro that are linked to accounts in their own AWS organization. This is useful for users who have their own pricing agreements with AWS or requirements around data security and compliance.

Prerequisites

Cirro uses AWS accounts to store and process data used in projects. An AWS Account must be set up before you can use Cirro. If you don't have one yet, please contact your IT department or the Cirro team for more information.

Cirro Subscription

In order to create projects, you must have a valid Cirro subscription. Please contact the Cirro team to get started.

You can subscribe to Cirro through the AWS Marketplace.

Deployment Role

The account must contain the Cirro Deployment IAM role. This role is used by Cirro to create and manage resources within your AWS account. View the CloudFormation template.

The template expects the following parameters:

  • pDataPortalAccountId: The account ID of your Cirro tenant. Contact your Cirro representative for this value.
  • pAccountOwner: The Cirro username of the account owner. If this value is provided, only this user will be able to create projects in the account. If not provided, any Cirro user that has permission to create projects can use this account.

The other parameters can be left as default and only changed if you have specific requirements.
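As a rough illustration, the two parameters above could be assembled in the ParameterKey/ParameterValue shape that the CloudFormation API accepts. The function name is hypothetical, and the 12-digit check assumes pDataPortalAccountId is a standard AWS account ID; neither comes from the Cirro template itself.

```python
import re


def deployment_role_parameters(data_portal_account_id, account_owner=None):
    """Build the parameter list for the deployment role stack.

    Hypothetical helper: only the two parameter names come from the
    template description; everything else is illustrative.
    """
    # Assumption: the tenant account ID is a standard 12-digit AWS account ID.
    if not re.fullmatch(r"\d{12}", data_portal_account_id):
        raise ValueError("pDataPortalAccountId should be a 12-digit account ID")

    params = [
        {
            "ParameterKey": "pDataPortalAccountId",
            "ParameterValue": data_portal_account_id,
        }
    ]
    # pAccountOwner is optional. Leaving it out lets any Cirro user with
    # permission to create projects use the account.
    if account_owner:
        params.append(
            {"ParameterKey": "pAccountOwner", "ParameterValue": account_owner}
        )
    return params
```

The returned list could then be supplied as the Parameters argument when creating the stack, for example through an AWS SDK, with the remaining template parameters left at their defaults.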

If you use the Create Project button in Cirro, a direct link to the CloudFormation stack creation wizard will be provided with the parameters filled in.

For users deploying to a large number of AWS accounts, it is recommended to use CloudFormation StackSets or Terraform to make this process easier.

VPC

The EC2 instances launched by Cirro require a Virtual Private Cloud (VPC). Make note of the VPC ID you want to use.

You can also use the default VPC in the region, but it is recommended to create a new VPC for your projects. This will allow you to have more control over the networking and security of the resources.

For more information, see Get started with Amazon VPC.

Please see VPC Considerations for Cirro-specific requirements.

Subnets

The EC2 instances launched by Cirro will be placed in these subnets. They must be located in the VPC specified above. In order to maximize spot availability, it is recommended to use all availability zones in the region.

Make note of the Subnet IDs that you want to use.

Encryption Method

You can choose to encrypt the S3 buckets with a customer managed KMS key or with the default AWS managed key.

In both cases, the data in the bucket is encrypted at rest. The difference is that with a KMS key, you can control access to the key separately from the data. It is also recommended to use a KMS Key if you have specific compliance requirements around encryption and key management, such as FIPS 140-2.

Use the Cirro KMS CloudFormation Template to deploy the key. You can further customize the key policy to meet your requirements, but it must allow Cirro to use the key. Make note of the KMS Key ARN if you decide to use a customer managed key.

Please see KMS Considerations for Cirro-specific requirements.

Cost Allocation Tag

A user-defined cost allocation tag for projectid must be activated in the account for project cost tracking and budgets to work properly.

Project Creation

Enter the following values in the project creation popup:

  • Account Type: Select "I want to use my own AWS account", which will reveal the options below.
  • Account ID: The AWS account ID. This value should be a 12-digit number.
  • Account Name: The name used to describe the account. This is useful when the account hosts multiple projects.
  • Region Name: AWS Region Code. Currently, only us-west-2, us-east-1, us-east-2, and ca-central-1 are supported. Contact us if you need a different region.
  • VPC ID: The VPC ID you want to use for the project. This value should begin with "vpc-" followed by a string of hexadecimal characters.
  • KMS ARN: (Optional) The key ARN used to encrypt project buckets. If not provided, the default AWS managed key will be used.
  • Batch Subnet IDs: The subnet IDs that AWS Batch should use for the project. When providing multiple IDs, separate them with spaces and/or commas.
  • Sagemaker Subnet IDs: The subnet IDs that SageMaker should use for the project. When providing multiple IDs, separate them with spaces and/or commas.
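Before filling in the form, the values can be sanity-checked locally. The sketch below is not part of Cirro; the function names are hypothetical and the ID patterns are deliberately permissive assumptions based on the field descriptions above.

```python
import re

# Regions listed as supported in the field descriptions above.
SUPPORTED_REGIONS = {"us-west-2", "us-east-1", "us-east-2", "ca-central-1"}


def parse_subnet_ids(raw):
    """Split a string of subnet IDs separated by spaces and/or commas."""
    return [part for part in re.split(r"[\s,]+", raw.strip()) if part]


def validate_project_inputs(account_id, region, vpc_id, batch_subnets, kms_arn=None):
    """Return a list of problems; an empty list means nothing looked wrong."""
    problems = []
    if not re.fullmatch(r"\d{12}", account_id):
        problems.append("Account ID must be exactly 12 digits")
    if region not in SUPPORTED_REGIONS:
        problems.append(f"Region {region!r} is not in the supported list")
    if not re.fullmatch(r"vpc-[0-9a-f]+", vpc_id):
        problems.append("VPC ID should look like 'vpc-' plus hex characters")

    subnets = parse_subnet_ids(batch_subnets)
    if not subnets:
        problems.append("At least one subnet ID is required")
    for subnet in subnets:
        if not re.fullmatch(r"subnet-[0-9a-f]+", subnet):
            problems.append(f"Subnet ID {subnet!r} looks malformed")

    # The KMS ARN is optional; only check it when one is supplied.
    if kms_arn is not None and not re.match(r"arn:aws[a-z-]*:kms:", kms_arn):
        problems.append("KMS ARN should start with 'arn:aws:kms:'")
    return problems
```

The same subnet parser works for the Sagemaker Subnet IDs field, since both fields accept the same separators.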

Once the project is created, it will take around 5-10 minutes to fully deploy the resources.

Cirro will automatically submit service quota increase requests for the following quotas:

Quota Code | Description
L-1216C47A | Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances
L-34B43A08 | All Standard (A, C, D, H, I, M, R, T, Z) Spot Instance Requests
L-3819A6DF | All G and VT Spot Instance Requests
L-88CF9481 | All F Spot Instance Requests
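To follow up on these requests, the quota codes can be turned into lookup requests for the Service Quotas API. This is only a sketch: the helper name is hypothetical, and it assumes all four codes belong to the ec2 service code, which matches their descriptions.

```python
# Quota codes and descriptions copied from the table above.
CIRRO_QUOTAS = {
    "L-1216C47A": "Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances",
    "L-34B43A08": "All Standard (A, C, D, H, I, M, R, T, Z) Spot Instance Requests",
    "L-3819A6DF": "All G and VT Spot Instance Requests",
    "L-88CF9481": "All F Spot Instance Requests",
}


def quota_lookup_requests(service_code="ec2"):
    """Return one keyword-argument dict per quota code.

    Each dict carries the ServiceCode and QuotaCode fields that the
    Service Quotas GetServiceQuota operation takes.
    """
    return [
        {"ServiceCode": service_code, "QuotaCode": code}
        for code in CIRRO_QUOTAS
    ]
```

With an AWS SDK such as boto3 installed, each dict could be passed to the service-quotas client's get_service_quota call to read the current value in the project's region.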

Other notes

Securing your AWS account

Please use the AWS Prescriptive Guidance document. It is your responsibility to secure your AWS Account.

For projects that hold sensitive data, we recommend setting up CloudTrail at the organization level to capture S3 data events. Learn more about logging data events with CloudTrail.

The data events will contain the username of the Cirro user that is accessing the files in the S3 bucket, the action they performed, and the file they accessed. We do not support S3 Access Logging at the moment.
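A trail captures S3 data events through event selectors. The sketch below builds one advanced event selector in the shape the CloudTrail API expects. The helper name and selector name are hypothetical, the bucket names are whatever buckets you want covered, and the ARN prefix assumes the standard "aws" partition.

```python
def s3_data_event_selector(bucket_names, name="S3 data events for project buckets"):
    """Build a CloudTrail advanced event selector for S3 object-level events.

    With an empty bucket list, the selector matches data events for all
    S3 objects that the trail can see.
    """
    selector = {
        "Name": name,
        "FieldSelectors": [
            {"Field": "eventCategory", "Equals": ["Data"]},
            {"Field": "resources.type", "Equals": ["AWS::S3::Object"]},
        ],
    }
    if bucket_names:
        # Restrict the selector to objects inside the listed buckets.
        prefixes = [f"arn:aws:s3:::{bucket}/" for bucket in bucket_names]
        selector["FieldSelectors"].append(
            {"Field": "resources.ARN", "StartsWith": prefixes}
        )
    return selector
```

A list of such selectors could be supplied as the AdvancedEventSelectors of an organization trail.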

Limitations

  • A maximum of 10 compute-enabled projects can be created within a single AWS Account.

  • Some AWS organizations will have service control policies (SCPs) that limit the creation of certain resources (such as S3 Buckets). In this case, it is advised that you make an exception to this policy that includes the Cirro deployment role.

  • Currently, we do not support tag policies or the customization of tags. We tag the resources with the following tags.

    Key          | Value
    projectid    | UUID of the project
    organization | Organization of the project
    owner        | Cirro
    application  | Cirro
    createdby    | Cirro Project Automation
    environment  | Production

VPC Considerations

If you have multiple AWS accounts, the VPC can be in a centralized account and then shared with the project accounts. This will enable you to share networking resources among the different projects if you have specific networking requirements (such as traffic inspection or stateful firewall).

It is recommended to have a VPC endpoint for S3 in the VPC. This will allow the EC2 instances to access S3 directly without going through the internet gateway. The other endpoints that are helpful to reduce internet traffic are Systems Manager (ssm), CloudWatch (logs), ECR Registry (ecr.dkr), and EC2 (ec2).
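The endpoint short names above map to regional service names of the form com.amazonaws.<region>.<service>. A small sketch for generating them follows. The helper is hypothetical, and treating S3 as a gateway endpoint while the others are interface endpoints is a common arrangement rather than a Cirro requirement.

```python
# Short names from the paragraph above, with an assumed endpoint type each.
ENDPOINT_SERVICES = {
    "s3": "Gateway",
    "ssm": "Interface",
    "logs": "Interface",
    "ecr.dkr": "Interface",
    "ec2": "Interface",
}


def endpoint_service_names(region):
    """Return (service_name, endpoint_type) pairs for the given region."""
    return [
        (f"com.amazonaws.{region}.{short_name}", endpoint_type)
        for short_name, endpoint_type in ENDPOINT_SERVICES.items()
    ]
```

For example, endpoint_service_names("us-west-2") starts with ("com.amazonaws.us-west-2.s3", "Gateway").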

Resources in your VPC must also be able to reach the following external hosts:

  • github.com
  • dockerhub.com
  • quay.io
  • mirror.centos.org
  • Other services that you may use in pipelines
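One way to confirm outbound access is to attempt a TCP connection to each host from an instance inside the VPC. The sketch below uses only the Python standard library. The function name is hypothetical, and probing port 443 is an assumption (HTTPS), not a documented Cirro requirement.

```python
import socket

# Hosts from the list above; add any others your pipelines depend on.
REQUIRED_HOSTS = ["github.com", "dockerhub.com", "quay.io", "mirror.centos.org"]


def check_outbound(hosts=REQUIRED_HOSTS, port=443, timeout=5.0,
                   connect=socket.create_connection):
    """Return {host: True/False} depending on whether a TCP connection opens.

    The connect argument exists so the check can be exercised without
    real network access; by default it is socket.create_connection.
    """
    results = {}
    for host in hosts:
        try:
            conn = connect((host, port), timeout)
        except OSError:
            # DNS failures, refused connections, and timeouts all land here.
            results[host] = False
        else:
            conn.close()
            results[host] = True
    return results
```

Running check_outbound() on an instance in one of the project subnets gives a quick view of which hosts are blocked by routing, security groups, or a firewall.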

KMS Considerations

Depending on your security requirements, it is usually best practice to keep the key in a separate account. This provides separation of duties, because access to the key is managed outside the project account.

For example, if a user has access to view the S3 bucket in the project account, they will not be able to read the data unless they are also given access to the key, and they cannot give themselves access to the key as it is in a different account.

It is not easy to change the encryption method after the project is created, so it is important to decide on the encryption method before creating the project. If you need to change the encryption method, you must re-encrypt all the objects in the S3 bucket.

AWS Resources

Cirro creates and manages resources in your AWS account. Updating or deleting these resources may cause issues with the Cirro platform.

Each project has its own set of resources and access roles, and no resources are shared between projects. This is true even if the projects are in the same AWS account.

The resources created by Cirro are described below:

Resource Group | AWS Resources | Purpose
Project Data | S3 Buckets | Project data storage & scratch space
Compute Environment | AWS Batch: Compute Environment, Job Queue, Job Definition, EC2 Launch Template, EC2 Security Group, Instance Profile & Role | Compute resources for running pipelines
Budget Notifications | AWS Budgets, SNS Topic, Lambda | Cost tracking and notifications
Dataset Events | Lambda, SQS, EventBridge Rule | Data upload tracking
Batch Events | Lambda, EventBridge Rule | Analysis job tracking
Logs | CloudWatch Log Group | Log storage for analysis jobs (/Cirro/<project>/* & /aws/lambda/Cirro*)
Notebook | SageMaker: Notebook Instance, Lifecycle Config, EC2 Security Group | Jupyter notebook environment
IAM Roles | IAM Roles and Policies | Roles used for Cirro services

For each resource group, Cirro deploys a CloudFormation stack that is named in the format Cirro<resourceGroup>-<projectId>. You can inspect the contents of these stacks to see more detail on the resources that were created.
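When auditing an account, it can help to build or recognize these stack names programmatically. The sketch below follows the stated format and assumes the project ID is a UUID, as the projectid tag suggests. The helper names and the "ProjectData" token in the example are hypothetical, since the exact resource group tokens are not listed here.

```python
import re
import uuid

_UUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
_STACK_NAME = re.compile(rf"^Cirro(?P<group>.+)-(?P<project>{_UUID})$")


def stack_name(resource_group, project_id):
    """Format a stack name as Cirro<resourceGroup>-<projectId>."""
    # uuid.UUID raises ValueError if project_id is not a valid UUID.
    return f"Cirro{resource_group}-{uuid.UUID(project_id)}"


def parse_stack_name(name):
    """Return (resource_group, project_id), or None if the name does not match."""
    match = _STACK_NAME.match(name)
    if match is None:
        return None
    return match["group"], match["project"]
```

For instance, parse_stack_name applied to the name produced by stack_name("ProjectData", some_project_id) recovers the pair it was built from, which makes it easy to group stacks by project.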

Consulting

For additional support on AWS best practices, please contact us at support@cirro.bio.