greatR requires as input a data frame that contains gene expression time-course data with all replicates. The diagram below shows the required structure of the input.

This data frame must contain the reference and query expression data that users wish to compare, in the following five columns:

  • gene_id: locus name or unique ID of each gene.
  • accession: accession or name of the reference and query data to compare.
  • timepoint: time points of the gene expression data.
  • expression_value: expression values, or another measure of gene or transcript abundance, that users wish to compare. These values can be RPM, RPKM, FPKM, TPM, TMM, DESeq, SCnorm, GeTMM, ComBat-Seq, or raw read counts.
  • replicate: biological replicate ID for an expression value at a particular time point.
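
The column requirements above can be checked before any further analysis. The following is a minimal sketch in base R, not part of the greatR API: the helper name `check_input_columns` is hypothetical, and it only verifies that the five column names are present, not that their contents are sensible.

```r
# Hypothetical helper (not a greatR function): verifies that a data frame
# has the five columns described above.
required_cols <- c("gene_id", "accession", "timepoint",
                   "expression_value", "replicate")

check_input_columns <- function(df) {
  # Names that are required but absent from the data frame
  missing_cols <- setdiff(required_cols, colnames(df))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ",
         paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}
```

Calling `check_input_columns(df)` on a data frame that lacks, say, the `replicate` column stops with an error naming the missing column. Extra columns are ignored by this sketch.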

Below is a real example of what the input data should look like:

gene_id           accession  timepoint  expression_value  replicate
BRAA02G018970.3C  Ro18       11         0.3968734         Ro18-11-a
BRAA02G018970.3C  Ro18       11         1.4147711         Ro18-11-b
BRAA02G018970.3C  Col0       7          0.4667855         Col0-07-a
BRAA02G018970.3C  Col0       7          0.0741901         Col0-07-b
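
For illustration, the four example rows can be built by hand as a base R data frame. This is only a sketch: the variable name `input_df` is arbitrary, and storing `timepoint` as a numeric column is an assumption based on how the values appear in the example. In practice the data would normally be read from a file.

```r
# Rebuild the four example rows shown above as a data frame.
# `input_df` is an arbitrary name chosen for this sketch.
input_df <- data.frame(
  gene_id          = rep("BRAA02G018970.3C", 4),
  accession        = c("Ro18", "Ro18", "Col0", "Col0"),
  timepoint        = c(11, 11, 7, 7),  # assumed numeric
  expression_value = c(0.3968734, 1.4147711, 0.4667855, 0.0741901),
  replicate        = c("Ro18-11-a", "Ro18-11-b", "Col0-07-a", "Col0-07-b"),
  stringsAsFactors = FALSE
)

# Inspect the column types
str(input_df)

# Each replicate is one row, so a gene has several rows per
# accession and time point; here, average them for a quick look.
aggregate(expression_value ~ accession + timepoint,
          data = input_df, FUN = mean)
```

Keeping one row per replicate, rather than pre-averaging, keeps the data in the form described at the start of this section, with all replicates present.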