Peptide-TCR co-culture screens support the development of personalized immunotherapy by revealing the specific reactivity of patient-derived T cell receptors to patient-specific (neo-)antigens. Here, we provide a software tool to assist in creating patient-specific antigen libraries and in analyzing the resulting co-culture screening data.
This R package is used to:
- Extract peptides with flanking region around mutations
- Run quality control on sequencing of these libraries
- Perform differential abundance testing of co-culture screens
Installation
We require R>=4.5.0, as this includes important fixes on Bioconductor.
The package is currently only available on GitHub. Use the remotes package to install it:
# run this in R
if (!requireNamespace("remotes", quietly=TRUE))
    install.packages("remotes")
remotes::install_github("mschubert/pepitope", dependencies=TRUE, timeout=300)
In addition, we need the Rust programming language and its cargo package manager, which we use to install fqtk and guide-counter:
We can check if the installation works by running:
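The original check command is not reproduced in this text. As a minimal sketch, assuming the binaries are installed under the names `cargo`, `fqtk` and `guide-counter` and are expected on the `PATH` (both assumptions; adjust to your setup), base R can report where each tool was found:

```r
# look up the external tools on the PATH; an empty string means "not found"
tools = c("cargo", "fqtk", "guide-counter")
found = Sys.which(tools)
print(found)

# names of tools that could not be located (character(0) if all are present)
missing_tools = names(found)[found == ""]
if (length(missing_tools) > 0)
    message("Not found on PATH: ", paste(missing_tools, collapse=", "))
```

This only confirms that the executables can be located from within R; it does not verify their versions.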
Usage
Generating peptide constructs
Here we have sequenced the DNA (and optionally RNA) of a patient and identified the variants in a .vcf file. We now want to extract the reference and mutated alternative sequences including their flanking regions into a summary report. The steps are:
- Load a genome and annotation, usually GRCh38 and Ensembl
- Load a VCF variants file as a VRanges object and annotate the protein-coding mutations
- Optionally, load a fusion VCF and annotate those
- Subset the peptide context around each mutation
- Make a report of variants, coding changes, and tiled peptides
More information can be found in the Variant calling vignette 🔗.
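To illustrate what "peptide context around a mutation" means, here is a minimal base R sketch on a toy protein sequence. It is not the package's implementation: the sequence, the mutation, and the assumption that the context is counted in residues on each side of the change are all made up for illustration.

```r
# toy protein sequence and a hypothetical missense mutation (not real data)
ref_protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQVKVKALPDAQFEVV"
mut_pos = 30   # 1-based residue position of the change
alt_aa = "W"   # substituted amino acid
flank = 15     # residues kept on each side of the mutation (assumed unit)

# build the mutated protein by replacing one residue
alt_protein = ref_protein
substr(alt_protein, mut_pos, mut_pos) = alt_aa

# clamp the window to the protein boundaries
ctx_start = max(1, mut_pos - flank)
ctx_end = min(nchar(ref_protein), mut_pos + flank)

# reference and mutated context share the same coordinates
ref_context = substr(ref_protein, ctx_start, ctx_end)
alt_context = substr(alt_protein, ctx_start, ctx_end)
print(c(ref=ref_context, alt=alt_context))
```

Keeping the reference and mutated windows on identical coordinates is what later allows each mutated peptide to be compared against its own reference.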
Code example
library(pepitope)
# genome and annotation
ens106 = AnnotationHub::AnnotationHub()[["AH100643"]]
asm = BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38
seqlevelsStyle(ens106) = "UCSC"
# read variants from VCF file, apply filters and annotate
variant_vcf_file = system.file("my_variants.vcf", package="pepitope")
vr = readVcfAsVRanges(variant_vcf_file) |>
    filter_variants(min_cov=2, min_af=0.05, pass=TRUE)
ann = annotate_coding(vr, ens106, asm)
subs = ann |>
#    filter_expressed(rna_sample, min_reads=1, min_tpm=0) |>
    subset_context(15)
# read fusion variants, apply filters and annotate
fusion_vcf_file = system.file("my_fusions.vcf", package="pepitope")
vr2 = readVcfAsVRanges(fusion_vcf_file) |>
    filter_fusions(min_reads=2, min_split_reads=1, min_tools=1)
seqlevelsStyle(vr2) = "UCSC"
fus = annotate_fusions(vr2, ens106, asm) |>
    subset_context_fusion(15)
# create construct tables and make a report
tiled = make_peptides(subs, fus) |>
    pep_tile() |>
    remove_cutsite(BbsI="GAAGAC")
report = make_report(ann, subs, fus, tiled)
writexl::write_xlsx(report, "my_variants.xlsx")
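The tiling and cut site steps above can be pictured with a small base R sketch. It is only an illustration under stated assumptions: the helper functions `tile_seq`, `rev_comp` and `has_cutsite`, the tile size, and the step are hypothetical and do not reflect how `pep_tile()` or `remove_cutsite()` work internally. The sketch only flags tiles that contain the BbsI recognition site on either strand; it does not remove the site.

```r
# split a sequence into overlapping windows (hypothetical tile size and step)
tile_seq = function(x, tile_size, step) {
    n = nchar(x)
    if (n <= tile_size)
        return(x)
    starts = seq(1, n - tile_size + 1, by=step)
    # make sure the last tile reaches the end of the sequence
    if (tail(starts, 1) != n - tile_size + 1)
        starts = c(starts, n - tile_size + 1)
    substring(x, starts, starts + tile_size - 1)
}

# reverse complement of a DNA string, so both strands are checked
rev_comp = function(x) {
    paste(rev(strsplit(chartr("ACGT", "TGCA", x), "")[[1]]), collapse="")
}

# TRUE for every sequence that contains the site on either strand
has_cutsite = function(x, site) {
    grepl(site, x, fixed=TRUE) | grepl(rev_comp(site), x, fixed=TRUE)
}

minigene = "ATGGCCGAAGACTTTGGATCCAAGCTTGGGCCCTTTAAACGCGTACGT"  # toy sequence
tiles = tile_seq(minigene, tile_size=20, step=10)
print(data.frame(tile=tiles, cutsite=has_cutsite(tiles, "GAAGAC")))
```

Checking the reverse complement matters because a restriction enzyme recognizes its site regardless of which strand carries it.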
Creating construct library (wetlab)
We want to express the sequences (minigenes) including their flanking regions (context) in target cells that will be used in a co-culture screen with T-cells. To do this, we first add a barcode to each construct, then order the constructs as gene blocks and transduce them into the target cells. The steps are:
- Add barcodes in the annotation sheets as barcode or barcode_{1,2} columns
- Order these constructs as gene blocks and clone them into expression vectors
- Transduce target cells with this peptide construct library
Code example (runnable without external data)
# creating barcoded constructs
lib = "https://raw.githubusercontent.com/hawkjo/freebarcodes/master/barcodes/barcodes12-1.txt"
valid_barcodes = readr::read_tsv(lib, col_names=FALSE)$X1
all_constructs = example_peptides(valid_barcodes)
plot_barcode_overlap(all_constructs, valid_barcodes)
Code example (loading from .xlsx)
# this file is manually created from the output of step 1
fname = "my_combined_barcoded_file.xlsx"
sheets = readxl::excel_sheets(fname)
all_constructs = sapply(sheets, readxl::read_xlsx, path=fname, simplify=FALSE)
plot_barcode_overlap(all_constructs, valid_barcodes)
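Before plotting, it can help to check the barcode assignments by hand. The following base R sketch uses toy data and assumes that each sheet is a data frame with a `barcode` column, as described above; the sheet names, the `pep_id` column, and the barcodes themselves are made up. It lists barcodes that are not in the validated set and barcodes that occur in more than one sheet.

```r
# toy construct sheets with a `barcode` column (hypothetical content)
toy_constructs = list(
    patient1 = data.frame(pep_id=c("p1", "p2", "p3"),
                          barcode=c("AAACCCGGGTTT", "ACGTACGTACGT", "TTTGGGCCCAAA")),
    patient2 = data.frame(pep_id=c("p4", "p5"),
                          barcode=c("ACGTACGTACGT", "GATTACAGATTA"))
)
toy_valid = c("AAACCCGGGTTT", "ACGTACGTACGT", "TTTGGGCCCAAA", "CCCCAAAATTTT")

# one row per construct, remembering which sheet it came from
combined = do.call(rbind, Map(function(df, sheet) data.frame(sheet=sheet, df),
                              toy_constructs, names(toy_constructs)))

# barcodes that are not part of the validated set
invalid = setdiff(combined$barcode, toy_valid)

# barcodes shared between different sheets
n_sheets = tapply(combined$sheet, combined$barcode, function(s) length(unique(s)))
shared = names(n_sheets)[n_sheets > 1]

print(list(invalid=invalid, shared=shared))
```

Whether a shared barcode is a problem depends on which libraries are pooled in the same experiment, so this check only reports the overlap and leaves the decision to the user.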
Performing quality control on construct library sequencing
At each step of generating the target cells that express the reference and mutated versions of each peptide, we want to make sure our library is well represented. To do this, we check that all expected constructs are present and that their abundances are sufficiently similar. The steps are:
- Check the quality of the construct libraries
- Check the quality of the transduced target cells
- Check the quality of the co-culture screens
More information can be found in the Quality Control vignette 🔗.
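As a rough illustration of what "well represented" can mean, the sketch below computes two simple summaries on toy barcode counts: the fraction of expected constructs that were detected, and the ratio between the 90th and 10th percentile of the detected counts as a measure of skew. These metrics and the toy values are assumptions chosen for illustration, not the statistics the package reports.

```r
# toy barcode counts for one sample; names are hypothetical construct identifiers
counts = c(c1=120, c2=95, c3=0, c4=310, c5=80, c6=150, c7=60, c8=105)

# fraction of expected constructs seen at least once
detected = mean(counts > 0)

# skew: ratio between the 90th and 10th percentile of detected constructs
q = quantile(counts[counts > 0], c(0.1, 0.9))
skew_ratio = unname(q[2] / q[1])

# constructs that dropped out entirely
dropouts = names(counts)[counts == 0]

print(list(detected=detected, skew_ratio=skew_ratio, dropouts=dropouts))
```

A detected fraction close to 1 and a small skew ratio suggest an evenly represented library; suitable thresholds depend on sequencing depth and library size.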
Code example
# demultiplexing and counting example data
sample_sheet = system.file("my_samples.tsv", package="pepitope")
fastq_file = example_fastq(sample_sheet, all_constructs)
temp_dir = demux_fq(fastq_file, sample_sheet, read_structures="7B+T")
dset = count_bc(temp_dir, all_constructs, valid_barcodes)
# quality control plots
plot_reads(dset)
plot_distr(dset)
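The `read_structures="7B+T"` argument above follows, as far as we understand it, the read structure notation used by fqtk: `7B` marks the first 7 bases as the sample barcode and `+T` marks all remaining bases as template. A minimal base R sketch of that split on toy reads (the reads are made up, and actual demultiplexing also involves matching barcodes to samples, which is not shown):

```r
# toy reads: a 7-base sample barcode followed by the template
reads = c("ACGTACGTTTGACCATGCA", "TTGCAACCCGGAAATTTGC")
sample_bc_len = 7  # the "7B" part of the read structure

# first 7 bases identify the sample
sample_barcode = substr(reads, 1, sample_bc_len)

# everything after the barcode is the template, the "+T" part
template = substr(reads, sample_bc_len + 1, nchar(reads))

print(data.frame(sample_barcode, template))
```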
Differential abundance testing of co-culture screens
Finally, we co-culture target cells expressing our peptide libraries with T-cells expressing a variety of TCRs to find the reactive ones. Reactive peptides become visible because their barcodes decrease in abundance more than those of the reference peptides, relative to a mock-transduced population that was cultured the same way. The steps are:
- Calculate the differential abundance of peptide barcodes
- Plot the results to identify peptides recognized by T-cells
More information can be found in the Co-culture screen vignette 🔗.
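The idea of "decreasing in abundance relative to mock" can be illustrated with a plain log2 fold change on depth-normalized counts. This is a deliberately simple sketch on made-up numbers without replicates; it is not the method used by `screen_calc()`, whose internals are not described here.

```r
# toy counts: rows are peptide barcodes, columns are samples (hypothetical values)
counts = cbind(Mock   = c(500, 480, 510, 495),
               Sample = c(495, 110, 505, 490))
rownames(counts) = c("pep_ref1", "pep_mut1", "pep_ref2", "pep_mut2")

# normalize to counts per million so samples of different depth are comparable
cpm = sweep(counts, 2, colSums(counts), "/") * 1e6

# log2 fold change with a pseudocount to avoid taking the log of zero
lfc = log2((cpm[, "Sample"] + 1) / (cpm[, "Mock"] + 1))

# most depleted barcode first
print(sort(lfc))
```

In this toy example `pep_mut1` is the most depleted barcode, which is the pattern expected for a peptide recognized by the co-cultured T-cells. A real analysis would use replicates and a statistical test rather than a single ratio.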
Code example
# perform abundance testing and plot results
res = screen_calc(dset, list(c("Sample", "Mock")))
plot_screen(res$`Sample vs Mock`)