
Register data with provided parameters
Source:vignettes/register-data-manually.Rmd
register-data-manually.Rmd
This article will show users how to register data using some pre-defined shift and stretch parameters. This demo will use one of the genes from the sample data provided by the package. Users can check whether their sample data can be registered using the manually specified shift and stretch values.
Load sample data
greatR
package provides an example of data frame
containing two different species A. thaliana and B.
rapa with two and three different replicates, respectively. This
data frame can be read as follows:
# Load the package
library(greatR)
library(data.table)
# Load a data frame from the sample data
b_rapa_data <- system.file("extdata/brapa_arabidopsis_all_replicates.csv", package = "greatR") |>
data.table::fread()
Registering single gene
Here, we will only use a single gene with
gene_id = "BRAA03G023790.3C"
from the sample data.
gene_BRAA03G023790.3C_data <- b_rapa_data[gene_id == "BRAA03G023790.3C"]
Before registering, we can use the helper function
get_approximate_stretch()
to approximate the stretch factor
between our sample datasets as shown in the figure below.
get_approximate_stretch(
gene_BRAA03G023790.3C_data,
reference = "Ro18",
query = "Col0"
)
#> [1] 2.666667
We can now use the estimated stretch calculated above in the
registration process below. Users need to set
optimise_registration_parameters = FALSE
to disable the
automated optimisation process.
registration_results <- register(
gene_BRAA03G023790.3C_data,
reference = "Ro18",
query = "Col0",
scaling_method = "z-score",
stretches = 2.6,
shifts = 4,
optimise_registration_parameters = FALSE
)
#> ── Validating input data ───────────────────────────────────────────────────────
#> ℹ Will process 1 gene.
#> ℹ Using estimated standard deviation, as no `exp_sd` was provided.
#>
#> ── Starting manual registration ────────────────────────────────────────────────
#> ✔ Applying registration for genes (1/1) [34ms]
To check whether the gene is registered or not, we can get the
summary results by accessing the model_comparison
table
from the registration result.
registration_results$model_comparison |>
knitr::kable()
gene_id | stretch | shift | BIC_diff | registered |
---|---|---|---|---|
BRAA03G023790.3C | 2.6 | 4 | -5.034115 | TRUE |
As we can see, using the given stretch and shift parameter above, the B. rapa gene BRAA03G023790.3C can be registered.
Registering multiple gene with different pre-defined registration parameters
Users can also register multiple different genes using their
pre-defined parameters. Similar to registration process above, users
need to set optimise_registration_parameters = FALSE
to
disable the automated optimisation process.
registration_results <- register(
b_rapa_data,
reference = "Ro18",
query = "Col0",
scaling_method = "z-score",
stretches = seq(1, 3, 0.1),
shifts = seq(0, 4, 0.1),
optimise_registration_parameters = FALSE
)
#> ── Validating input data ───────────────────────────────────────────────────────
#> ℹ Will process 10 genes.
#> ℹ Using estimated standard deviation, as no `exp_sd` was provided.
#>
#> ── Starting manual registration ────────────────────────────────────────────────
#> ✔ Applying registration for genes (10/10) [15.5s]
Similar to the previous registration process, users can analyse further the registration result by following the visualising results vignette.