
Registering data with parameter optimisation
Source:vignettes/optimise-parameters.Rmd
This package also lets users optimise the registration parameter values. Users may optionally initialise the stretch and shift factors. If they are supplied, the optimisation runs within the boundary defined by those values. If they are not supplied, the optimisation boundary is calculated automatically.
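The choice between a user-defined and a computed boundary can be pictured with a minimal base R sketch. The helper `resolve_boundary()` and the fallback values are hypothetical and are not part of the package. The sketch also assumes that the boundary is the range of the supplied values, which this vignette does not state explicitly.

```r
# Hypothetical helper, not part of the package: derive a (lower, upper) search
# boundary from user-supplied values, or fall back to a default boundary.
resolve_boundary <- function(values = NULL, fallback) {
  if (is.null(values)) {
    # No initialisation given: use the automatically computed boundary
    return(fallback)
  }
  # User initialisation: search only between the smallest and largest value
  range(values)
}

# Values taken from the "With boundary initialisation" example;
# the fallback values are placeholders chosen for illustration only
stretch_boundary <- resolve_boundary(c(1.5, 2), fallback = c(1, 3))
shift_boundary <- resolve_boundary(seq(1.5, 3, by = 0.5), fallback = c(-4, 4))

# No user values: the placeholder fallback is returned unchanged
default_boundary <- resolve_boundary(NULL, fallback = c(1, 3))
```

In the real function the computed boundary depends on the input data, so the placeholder fallbacks above should not be read as the package defaults.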
Without boundary initialisation
This example reuses the data from the register data section. To run the optimisation, set optimise_registration_parameters = TRUE when calling the main function scale_and_register_data(), as shown below.
As an example, we will use only one gene, BRAA03G023790.3C. The data can be filtered as follows:
gene_BRAA03G023790.3C_data <- all_data_df %>%
  dplyr::filter(locus_name == "BRAA03G023790.3C")

gene_BRAA03G023790.3C_data %>%
  head(5) %>%
  knitr::kable()
locus_name | accession | tissue | timepoint | expression_value | group |
---|---|---|---|---|---|
BRAA03G023790.3C | Ro18 | apex | 11 | 1.984367 | Ro18-11-a |
BRAA03G023790.3C | Ro18 | apex | 11 | 1.474974 | Ro18-11-b |
BRAA03G023790.3C | Ro18 | apex | 11 | 2.194917 | Ro18-11-c |
BRAA03G023790.3C | Ro18 | apex | 29 | 113.797721 | Ro18-29-a |
BRAA03G023790.3C | Ro18 | apex | 29 | 94.650207 | Ro18-29-b |
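Because the registration call names both a reference accession and an accession to transform, it can be useful to confirm that the filtered data contain both before starting a long optimisation run. The sketch below uses base R only. The helper `check_accessions()` is hypothetical. The Ro18 rows of the toy data frame are copied from the table above, while the Col0 rows are invented for illustration.

```r
# Toy stand-in for the filtered gene data. The Ro18 rows are copied from the
# table above; the Col0 rows are invented purely for illustration.
toy_gene_data <- data.frame(
  locus_name = "BRAA03G023790.3C",
  accession = c("Ro18", "Ro18", "Ro18", "Col0", "Col0"),
  timepoint = c(11, 11, 29, 7, 8),
  expression_value = c(1.984367, 1.474974, 113.797721, 0.5, 0.7)
)

# Hypothetical helper: stop early if an accession needed for registration
# is missing, otherwise return the number of samples per group
check_accessions <- function(df, ref, to_transform) {
  missing_acc <- setdiff(c(ref, to_transform), unique(df$accession))
  if (length(missing_acc) > 0) {
    stop("Missing accession(s): ", paste(missing_acc, collapse = ", "))
  }
  # Rows are accessions, columns are timepoints
  table(df$accession, df$timepoint)
}

sample_counts <- check_accessions(
  toy_gene_data,
  ref = "Ro18",
  to_transform = "Col0"
)
```

With the real data, the same call would take the filtered data frame in place of `toy_gene_data`.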
# Running the registration
registration_results_without_boundary <- scale_and_register_data(
  input_df = gene_BRAA03G023790.3C_data,
  min_num_overlapping_points = 4,
  initial_rescale = FALSE,
  do_rescale = TRUE,
  accession_data_to_transform = "Col0",
  accession_data_ref = "Ro18",
  start_timepoint = "reference",
  maintain_min_num_overlapping_points = FALSE,
  optimise_registration_parameters = TRUE
)
# ── Starting optimisation ────────────────────────────────────────────────────────────────────────────
# ℹ Using computed stretch boundary
# ℹ Using computed shift boundary
# ✓ Optimising registration parameters for genes (1/1) [11m 30s]
# ✓ Finished optimisation
#
# ── Model comparison results ─────────────────────────────────────────────────────────────────────────
# ℹ BIC finds registration better than non-registration for: 1/1
#
# ── Applying the best-shifts and stretches to gene expression ────────────────────────────────────────
# ✓ Normalising expression by mean and sd of compared values (1/1) [20ms]
# ✓ Applying best shift (1/1) [26ms]
# ℹ Max value of expression_value: 1.36
# ✓ Imputing transformed expression values (1/1) [29ms]
After running the registration, we can then visualise the results as follows:
registration_results_without_boundary$imputed_mean_df %>%
  greatR::plot_registration_results() +
  ggplot2::labs(title = "Registration results without boundary initialisation")
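The model comparison line in the output above reports that BIC prefers registration over non-registration. This vignette does not show how the package fits its models, so the following is only an illustrative sketch of the general idea. It compares one shared curve for both accessions with separate curves per accession, using polynomial fits from base R as a stand-in. The data are generated for the example and are not the vignette data.

```r
# Toy data: two accessions measured at the same (already registered) timepoints.
# Values are generated for illustration, not taken from the vignette data.
timepoint <- rep(1:10, times = 2)
accession <- rep(c("Ro18", "Col0"), each = 10)
base_curve <- sin(timepoint / 2)

# Small deterministic offset so the Col0 curve is close to, but not identical
# with, the Ro18 curve
offset <- ifelse(accession == "Col0", 0.01 * rep(c(1, -1), times = 10), 0)
toy <- data.frame(timepoint, accession, expression_value = base_curve + offset)

# "Registered" hypothesis: one shared curve describes both accessions
fit_shared <- lm(expression_value ~ poly(timepoint, 2), data = toy)

# "Non-registered" hypothesis: each accession gets its own curve
fit_separate <- lm(expression_value ~ poly(timepoint, 2) * accession, data = toy)

bic_shared <- BIC(fit_shared)
bic_separate <- BIC(fit_separate)

# The model with the lower BIC is preferred
registration_preferred <- bic_shared < bic_separate
```

When the two curves nearly coincide, the extra parameters of the separate model barely improve the fit, so the BIC penalty for model size favours the shared curve.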
With boundary initialisation
Users can also restrict the optimisation to a boundary of their choice. The call is the same as above, with the addition of the stretches and shifts arguments:
# Running the registration
registration_results_with_boundary <- scale_and_register_data(
  input_df = gene_BRAA03G023790.3C_data,
  stretches = c(1.5, 2),
  shifts = seq(1.5, 3, by = 0.5),
  min_num_overlapping_points = 4,
  initial_rescale = FALSE,
  do_rescale = TRUE,
  accession_data_to_transform = "Col0",
  accession_data_ref = "Ro18",
  start_timepoint = "reference",
  maintain_min_num_overlapping_points = FALSE,
  optimise_registration_parameters = TRUE
)
# ── Starting optimisation ────────────────────────────────────────────────────────────────────────────
# ℹ Using user-defined stretches as stretch boundary
# ℹ Using user-defined shifts as shift boundary
# ✓ Optimising registration parameters for genes (1/1) [8m 3.9s]
# ✓ Finished optimisation
#
# ── Model comparison results ─────────────────────────────────────────────────────────────────────────
# ℹ BIC finds registration better than non-registration for: 1/1
#
# ── Applying the best-shifts and stretches to gene expression ────────────────────────────────────────
# ✓ Normalising expression by mean and sd of compared values (1/1) [13ms]
# ✓ Applying best shift (1/1) [23ms]
# ℹ Max value of expression_value: 1.36
# ✓ Imputing transformed expression values (1/1) [20ms]
After running the registration, we can then visualise the results as follows:
registration_results_with_boundary$imputed_mean_df %>%
  greatR::plot_registration_results() +
  ggplot2::labs(title = "Registration results with boundary initialisation")