Name Description
GATK-Complete-Workflow-v1.0

A complete GATK workflow written in CWL v1.0.
The tools in the workflow run in Docker containers.

BOSC 2016
FoG Boston 2016 Work Project
Old Ancestry Mapper Runs
Other scripts
bcbio test runs

bcbio CWL test runs: https://github.com/chapmanb/bcbio-nextgen/tree/master/cwl

test parent project
bcbio CWL
GATK bcbio style
Mason Lab - Methylkit

MethylKit is an R package for DNA methylation analysis from high-throughput bisulfite sequencing. Its features include coverage and methylation statistics, differential methylation analysis, feature annotation, and reading of methylation calls.

Public Bioinformatics tools

Binaries of some Bioinformatics tools

lobSTR v.3 (Public)

lobSTR is a tool for profiling Short Tandem Repeats (STRs) from high-throughput sequencing data.

UMC Public Pipeline (BOSC 2015)

A BWA-GATK pipeline by the UMC Utrecht community. Used in the poster "Developing an Arvados BWA-GATK pipeline" at BOSC 2015.

GATK2 Unified Genotyper (Public)

Run GATK2 on paired-end reads and call variants using Unified Genotyper. To run this pipeline, click "Run a pipeline" and select "Demo GATK2 Pipeline". Feel free to use "PGP HU34D5B9 “FASTQ” exome" as the input data set; it consists of 2 sets of paired-end FASTQ files.

PGP hu826751

Complete Genomics whole genome sequencing raw data for Harvard Personal Genome Project participant hu826751 (2014-10-17).

Outputs of PATHOMAP_P00553.vcf

Mason Lab – Pathomap Output data for PATHOMAP_P00553.vcf

Docker Images

Mason Lab – Pathomap Docker Images

Output Demo Data

Mason Lab – Pathomap Output Data

GATK3 Haplotype Caller (Public)

Run the GATK3 Best Practices pipeline on paired-end reads and call variants using both Haplotype Caller and Unified Genotyper. To run this pipeline, click "Run a pipeline" and select "Demo GATK3 Haplotype Caller Pipeline" from the GATK3 Haplotype Caller project. Feel free to use "PGP HU34D5B9 “FASTQ” exome" as the input data set; it consists of 2 sets of paired-end FASTQ files.

Bcbio-nextgen (Public)

The bcbio-nextgen project was created by Brad Chapman from the Harvard School of Public Health.

Mason Lab - Pathomap / Ancestry Mapper (Public)

Part of the Pathomap Project developed by the Mason Lab at Weill Cornell Medical College.

Public Datasets / Collections
Platypus (Public)

Input FASTQ files and call variants using Platypus!
In order to run this pipeline, you can create a free account on the Curoverse home page. Then follow the instructions in the tutorial to use the test data or input your own data!

Sample Public Pipelines

A list of all public pipelines currently running on Arvados.
Want your pipeline here? Email support@curoverse.com for help to get started!

RNA-seq/Tuxedo (Public)

An RNA-Seq pipeline consisting of Bowtie 2, TopHat 2, and Cufflinks.

Public GA4GH Collection

Readme

PCA of 174 whole genomes from the Personal Genome Project

Principal component analysis of 174 whole genome sequences (chromosomes 13 and 17) from the Personal Genome Project. From this project, you can explore the inputs (numpy files, path lengths, and human population data), the environment (Docker image), the code (under pipeline templates), the tests run (under pipelines), and their output. To rerun any analysis or alter inputs, create an account, sign in, and create a copy of this project.

Arvados Tutorial

Running a pipeline tutorial