CLIPflexR

The CLIPflexR package provides a set of functions for processing and analysing different forms of CLIP data within the R framework.

For more information on processing data with CLIPflexR, see our CTK and CLIPflexR vignettes.

Installation

Installing CLIPflexR

We can install the CLIPflexR package from GitHub using the devtools package:

install.packages("devtools")
devtools::install_github("kathrynrozengagnon/CLIPflexR")

Installing Rfastp

We can also install the Rfastp package, which may be used to pre-process FastQ/FastA files and de-duplicate BAM files as an alternative to the standard CTK/FASTX workflow.

If you are running R version 4.0 or later, you can install the Bioconductor development version of the Rfastp package:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

# The following initializes usage of Bioc devel
BiocManager::install(version='devel')

BiocManager::install("Rfastp")

If you are running an R version earlier than 4.0, use the following line to install the Rfastp package from GitHub:

devtools::install_github("RockefellerUniversity/rfastp")
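The choice between the two Rfastp installation routes above comes down to a version check. The sketch below is illustrative only: `rfastp_install_route()` is a hypothetical helper, not part of CLIPflexR or Rfastp. It only reports which route applies to an R version and does not install anything.

```r
# Hypothetical helper (not part of CLIPflexR or Rfastp): report which of the
# two installation routes described above applies to a given R version.
rfastp_install_route <- function(r_version = getRversion()) {
  # Coerce to character first so both strings and version objects are accepted
  r_version <- numeric_version(as.character(r_version))
  if (r_version >= "4.0.0") "bioconductor" else "github"
}

route <- rfastp_install_route()
message("Rfastp installation route for this R session: ", route)
```

A setup script could branch on the returned value and run the matching installation lines shown above.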

Installing CTK software requirements

The CTK toolkit requires many external software tools to be available on the system path for use within its pipeline scripts.

One solution to meet these system requirements is to create a Conda environment containing all the required software.

The Herper package provides a simple workflow to install self-contained Conda environments associated with packages, using the Reticulate library.

This approach also allows us to capture the current Conda environment using the same tools we use to capture R library versions, by means of the Renv package.

First we need to install the Herper package.

Installing Herper

If you are running R version 4.0 or later:

library(devtools)
install_github("RockefellerUniversity/Herper")

If you are running an R version earlier than 4.0:

library(devtools)
install_github("RockefellerUniversity/Herper@3.5")

We can now use the Herper package to create the Conda environment and install the software requirements for the CLIPflexR package.

The code below installs the required external software into a Conda environment in the default location (the same path as the Reticulate package’s Conda environments).

The returned object holds the path to the Conda executable and the name of the Conda environment:

library(Herper)
CondaInfo <- install_CondaSysReqs("CLIPflexR")
CondaInfo$pathToConda
CondaInfo$environment

The installed Conda environment may be accessed outside of R just as with a standard Conda environment.

Executables can be found in the environment’s bin directory. Here we check that the directory contains the fastx and homer executables.
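One way to sketch such a check in R is shown below. `find_env_executables()` is a hypothetical helper written for this illustration, not a CLIPflexR or Herper function. It matches tool name fragments rather than exact executable names, and the demonstration runs on a temporary directory with made-up file names so that it works without a Conda installation.

```r
# Hypothetical helper: list files in a Conda environment's bin directory whose
# names contain any of the given tool name fragments (case-insensitive).
find_env_executables <- function(bin_dir, fragments = c("fastx", "homer")) {
  stopifnot(dir.exists(bin_dir))
  files <- list.files(bin_dir)
  # For each fragment, keep the file names that contain it
  hits <- lapply(fragments, function(fragment) {
    files[grepl(fragment, files, ignore.case = TRUE)]
  })
  names(hits) <- fragments
  hits
}

# Demonstration on a temporary directory with mock file names (assumed names,
# used only so the sketch runs without a Conda installation)
demo_bin <- file.path(tempdir(), "demo_env_bin")
dir.create(demo_bin, showWarnings = FALSE)
invisible(file.create(file.path(demo_bin, c("fastx_clipper", "homer2", "samtools"))))
found <- find_env_executables(demo_bin)
print(found)
```

For a real environment, pass the path to its bin directory instead of `demo_bin`. Once CLIPflexR is loaded, that path could be built as `file.path(getOption("CLIPflexR.condaEnv"), "bin")`, assuming the option has been set.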

Installing CTK software

The CTK toolkit is installed using the CLIPflexR install_ctk() function.

CLIPflexR::install_ctk()

This function installs CTK and its Perl library dependencies either to a user-specified directory or, if a Conda environment has been created with Herper, into the Conda environment associated with the package.

Using the CTK pipeline in CLIPflexR

Once we have installed the requirements for CTK we can make use of the tools within CLIPflexR very easily.

First we load the CLIPflexR package and review the loading messages informing us of any established conda paths.

library(CLIPflexR)
## CLIPflexR_0.1.20 conda env found at /Users/runner/Library/r-miniconda/envs/CLIPflexR_0.1.20
## ctk found  at /Users/runner/Library/r-miniconda/envs/CLIPflexR_0.1.20/bin/ctk
## czplib found  at /Users/runner/Library/r-miniconda/envs/CLIPflexR_0.1.20/lib/czplib

We can also list the important paths from the options set by CLIPflexR.

getOption("CLIPflexR.condaEnv")
## [1] "/Users/runner/Library/r-miniconda/envs/CLIPflexR_0.1.20"
getOption("CLIPflexR.ctk")
## [1] "/Users/runner/Library/r-miniconda/envs/CLIPflexR_0.1.20/bin/ctk"
getOption("CLIPflexR.czplib")
## [1] "/Users/runner/Library/r-miniconda/envs/CLIPflexR_0.1.20/lib/czplib"
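The three options above can also be gathered into a single table with a check that each path exists. This is a minimal sketch: `clipflexr_paths()` is a hypothetical helper, not a CLIPflexR function, and it relies only on the option names shown above and base R.

```r
# Hypothetical helper: collect the paths CLIPflexR stores in R options and
# report whether each one exists on disk.
clipflexr_paths <- function(option_names = c("CLIPflexR.condaEnv",
                                             "CLIPflexR.ctk",
                                             "CLIPflexR.czplib")) {
  paths <- vapply(option_names, function(opt) {
    value <- getOption(opt)
    # An unset option is reported as NA rather than raising an error
    if (is.null(value)) NA_character_ else as.character(value)
  }, character(1), USE.NAMES = FALSE)
  data.frame(option = option_names,
             path = paths,
             exists = !is.na(paths) & file.exists(paths),
             stringsAsFactors = FALSE)
}

print(clipflexr_paths())
```

If CLIPflexR has not been loaded, or no Conda environment was found, the options are unset and the sketch reports NA paths with `exists` set to FALSE.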

Now that these are installed, all functions within the CLIPflexR package will use them by default, without further configuration of paths or environment variables.

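library(CLIPflexR)
# Example FASTQ file shipped with the package
Fox3_Std <- system.file("extdata/Fox3_Std_small.fq.gz", package = "CLIPflexR")
# Filter reads by quality score with CTK's fastq_filter.pl
Fox3_Std_filtered <- ctk_fastqFilter(Fox3_Std,
                                     outFile = "SRR1107535_Test.fastq",
                                     qsFilter = "mean:0-29:20", verbose = TRUE)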
## fastq_filter.pl command is /Users/runner/Library/r-miniconda/envs/CLIPflexR_0.1.20/bin/ctk/fastq_filter.pl
## fastq_filter.pl arguments are /Users/runner/Library/r-miniconda/envs/CLIPflexR_0.1.20/bin/ctk/fastq_filter.pl  -if sanger  -of fastq  -f mean:0-29:20  /Users/runner/work/_temp/Library/CLIPflexR/extdata/Fox3_Std_small.fq.gz  SRR1107535_Test.fastq
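The filter string `mean:0-29:20` passed above can be read as "mean quality score over positions 0 to 29 of at least 20". The base R sketch below illustrates that reading on Sanger-encoded (Phred+33) quality strings. It is not CTK's implementation: `parse_qs_filter()` and `passes_mean_filter()` are hypothetical helpers, and the inclusive threshold, the 0-based inclusive range, and the handling of reads shorter than the range are assumptions.

```r
# Hypothetical helper: parse a filter string of the form "mean:start-end:min"
parse_qs_filter <- function(qs_filter) {
  parts <- strsplit(qs_filter, ":", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 3)
  range <- as.integer(strsplit(parts[2], "-", fixed = TRUE)[[1]])
  stopifnot(length(range) == 2)
  list(method = parts[1], start = range[1], end = range[2],
       min_score = as.numeric(parts[3]))
}

# Hypothetical helper: TRUE if a Sanger-encoded (Phred+33) quality string
# passes a "mean" filter. Assumes 0-based inclusive positions and an
# inclusive threshold.
passes_mean_filter <- function(quality, qs_filter) {
  f <- parse_qs_filter(qs_filter)
  stopifnot(f$method == "mean")
  scores <- utf8ToInt(quality) - 33L
  stopifnot(length(scores) > f$start)
  # Convert the 0-based range to 1-based R indices, clipped to the read length
  idx <- seq(f$start + 1L, min(f$end + 1L, length(scores)))
  mean(scores[idx]) >= f$min_score
}

high_quality <- strrep("I", 30)  # "I" encodes Phred 40
low_quality  <- strrep("#", 30)  # "#" encodes Phred 2
passes_mean_filter(high_quality, "mean:0-29:20")  # TRUE
passes_mean_filter(low_quality, "mean:0-29:20")   # FALSE
```

The sketch averages over whatever positions exist when a read is shorter than the requested range; CTK may treat such reads differently.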