cellassign automatically assigns single-cell RNA-seq data to known cell types across thousands of cells, accounting for patient- and batch-specific effects. Information about a priori known marker genes for each cell type is provided as input to the model in the form of a (binary) marker gene by cell-type matrix. cellassign then probabilistically assigns each cell to a cell type, removing subjective biases from typical unsupervised clustering workflows.
cellassign is built using Google’s TensorFlow, and as such requires installation of the R package tensorflow:
install.packages("tensorflow")
tensorflow::install_tensorflow(extra_packages='tensorflow-probability')
Please ensure this installs version 2 of TensorFlow. You can check this by calling tensorflow::tf_config(), which should print something like:
TensorFlow v2.0.0 (/usr/local/lib/python3.7/site-packages/tensorflow)
cellassign can then be installed from GitHub. Alternatively, the current release version of cellassign can be installed with conda.
cellassign requires the following inputs:
- exprs_obj: Cell-by-gene matrix of raw counts (or SingleCellExperiment with counts assay)
- marker_gene_info: Binary gene-by-celltype marker gene matrix or list relating cell types to marker genes
- s: Size factors
- X: Design matrix for any patient/batch specific effects

The model can be run as follows:
cas <- cellassign(exprs_obj = gene_expression_data,
                  marker_gene_info = marker_gene_info,
                  s = s,
                  X = X)
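The four inputs passed in the call above can be prepared in base R roughly as follows. This is only a sketch on made-up data: the gene names, cell-type names, and patient labels are hypothetical, the size factors use a simple library-size scaling (an assumption, not necessarily what the package authors recommend), and whether X should carry an intercept column is not stated here, so check the vignette before relying on it.

```r
# Toy cell-by-gene matrix of raw counts: 6 cells x 4 genes (hypothetical data)
set.seed(1)
gene_expression_data <- matrix(
  rpois(24, lambda = 5), nrow = 6,
  dimnames = list(paste0("cell", 1:6), c("geneA", "geneB", "geneC", "geneD"))
)

# Hypothetical marker list relating cell types to their marker genes
marker_list <- list(
  type1 = c("geneA", "geneB"),
  type2 = c("geneC", "geneD")
)

# Binary gene-by-celltype matrix: 1 if the gene is a marker for that type
genes <- unique(unlist(marker_list))
marker_gene_info <- sapply(marker_list, function(m) as.integer(genes %in% m))
rownames(marker_gene_info) <- genes

# Every marker gene should be present in the expression matrix
stopifnot(all(genes %in% colnames(gene_expression_data)))

# Size factors: per-cell library size scaled to mean 1 (simple assumption)
lib_size <- rowSums(gene_expression_data)
s <- lib_size / mean(lib_size)

# Design matrix for a hypothetical two-patient layout.
# model.matrix(~ patient) includes an intercept column plus one indicator.
patient <- factor(c("p1", "p1", "p1", "p2", "p2", "p2"))
X <- model.matrix(~ patient)
```

The block stops short of calling cellassign itself so that it runs without the package or TensorFlow installed; the resulting objects have the shapes described in the input list above.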
An example set of markers for the human tumour microenvironment is provided with the package.
Please see the package vignette for details and caveats.