uap – Robust, Consistent, and Reproducible Data Analysis
Authors:
Christoph Kämpf, Michael Specht, Sven-Holger Puppel, Alexander Scholz, Gero Doose, Kristin Reiche, Jana Hertel, Jörg Hackermüller
Description:
uap executes, controls, and keeps track of the analysis of large data sets. It enables users to perform robust, consistent, and reproducible data analysis. uap encapsulates the usage of (bioinformatic) tools and handles data flow and processing during an analysis. Users can combine predefined or self-made analysis steps to create custom analyses. Analysis steps encapsulate best-practice usage of bioinformatic software tools. uap focuses on the analysis of high-throughput sequencing (HTS) data, but its plugin architecture allows users to add functionality so that it can be used to analyze any kind of large data set.
Usage:
uap is a command-line tool, implemented in Python. It requires a user-defined configuration file, which describes the analysis, as input.
Supported Platforms:
- Unix-like operating systems.
- High-performance computing (HPC) cluster systems such as UGE, OGE/SGE, and SLURM.
- See Tested Platforms for detailed information.
Important Information
uap does NOT include all tools necessary for the data analysis. It expects that the required tools are already installed.
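Because the required tools must already be installed, it can help to confirm that their executables are reachable before starting an analysis. The following is a minimal sketch and not part of uap: the helper `find_missing_tools` and the executable names in `required` are assumptions chosen for illustration, and the check only tests whether each name resolves on the current PATH.

```python
import shutil


def find_missing_tools(executables):
    """Return the names from `executables` that cannot be found on PATH."""
    return [name for name in executables if shutil.which(name) is None]


if __name__ == "__main__":
    # Hypothetical selection; list the executables your own analysis needs.
    required = ["samtools", "bowtie2", "cutadapt", "fastqc"]
    missing = find_missing_tools(required)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

A check like this only shows that an executable exists; it says nothing about the installed version or whether the tool runs correctly.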
Table of contents
- Introducing uap
- Software Design
- Recommended uap Workflow
- Quick Start uap
- Installation of uap
- Analysis Configuration File
- Cluster Configuration File
- Command-Line Usage of uap
- Add New Functionality
- Tested Platforms
- Annotation Files
- Available steps
  - Source steps
  - Processing steps
    - bam_to_bedgraph_and_bigwig
    - bam_to_genome_browser
    - bowtie2
    - bowtie2_generate_index
    - bwa_backtrack
    - bwa_generate_index
    - bwa_mem
    - chromhmm_binarizebam
    - chromhmm_learnmodel
    - cuffcompare
    - cufflinks
    - cuffmerge
    - cutadapt
    - deepTools_bamCompare
    - deepTools_bamPEFragmentSize
    - deepTools_multiBamSummary
    - deepTools_plotFingerprint
    - discardLargeSplitsAndPairs
    - fastqc
    - fastx_quality_stats
    - fix_cutadapt
    - hisat2
    - htseq_count
    - macs2
    - merge_assembly
    - merge_fasta_files
    - merge_fastq_files
    - merge_numpy_zip_arrays
    - pear
    - pepr
    - pepr_postprocess
    - picard_add_replace_read_groups
    - picard_markduplicates
    - picard_merge_sam_bam_files
    - post_cufflinksSuite
    - preseq_complexity_curve
    - preseq_future_genome_coverage
    - preseq_future_yield
    - reformatCigar
    - remove_duplicate_reads_runs
    - rgt_thor
    - rseqc
    - s2c
    - sam_to_sorted_bam
    - samtools
    - samtools_faidx
    - samtools_index
    - samtools_merge
    - samtools_sort
    - samtools_stats
    - segemehl
    - segemehl_2017
    - segemehl_generate_index
    - segemehl_generate_index_bisulfite
    - sra_fastq_dump
    - stringtie
    - stringtie_merge
    - stringtie_prepDE
    - subsetMappedReads
    - tophat2
    - trim_galore
    - trimmomatic
- API documentation
- Information for uap Developers
Remarks
This documentation was created using Sphinx and reStructuredText.