Changelog
Source: NEWS.md
greatR 2.0.0
CRAN release: 2024-04-09
- Added {patchwork} as a dependency.
- Updated {greatR} logo.
- Updated default `optimisation_method` in `register()` to be “lbfgsb” (L-BFGS-B) instead of “nm” (Nelder-Mead).
- Added sample SOC1 `arabidopsis_SOC1_data.csv` and `brapa_SOC1_data.csv` extdata.
- Refactored `optimise_registration_parameters` argument in `register()` to `use_optimisation`.
- Updated `register()` to return object of S3 class `res_greatR`.
- Updated `calculate_distance()` to return object of S3 class `dist_greatR`.
- Refactored `summarise_registration()` as `summary.res_greatR()` S3 method.
Improvements
- Deprecated use of `time_delta` variable in registration process.
- Added `fun_args` (a list of arguments used when calling the function) in `register()` results.
- Updated `summary.res_greatR()` to return `NA` instead of `[NA, NA]` when all genes are non-registered.
- Added `reg_params` (table containing distribution of registration parameters) to results list in `summary.res_greatR()` method.
- Simplified `calc_overlapping_percent()` calculation.
- Take into consideration `overlapping_percent` when applying manual registration.
- Updated logic of `calc_variance()` for data with no replicates to consider `expression_value`.
- Updated `get_stretch_search_space_limits()` and `get_shift_search_space_limits()` to exclude unexplorable regions in search space.
- Improved `calculate_distance()` and aux `get_timepoint_comb_*_data()` functions to eliminate column selection and renaming inside `lapply()` calls, reducing execution time by up to 25%.
- Added `type` (“registered” or “all”) and `genes_list` arguments to `calculate_distance()` to filter genes.
- Added new unit tests.
- Updated unit tests, and added S3 class checks where appropriate.
- Updated vignettes and README diagrams and figures.
- Updated vignettes with additional examples, comments on arguments, and full coverage of all `plot()` methods.
Bug fixes
- Fixed `get_shift_search_space_limits()` to adjust shift space limits according to removal of `time_delta` variable (see 48c943cd).
- Fixed default `overlapping_percent = 0.5` (instead of 50) in `register_manually()`.
- Fixed `get_stretch_search_space_limits()` to correctly determine lower and upper limits when single stretch value is provided.
- Fixed issue in `get_shift_search_space_limits()` where range variables were not available when `calc_mode == "bound"`.
New functions
- `bind_results()` auxiliary function to merge results from `register()`.
- `theme_greatR()` function and `greatR_palettes` list.
- `transform_input()` S3 generic to accept different types of input in `register()`.
- `plot.res_greatR()` S3 method to replace `plot_registration_results()`.
- `plot.dist_greatR()` S3 method to replace `plot_heatmap()`.
- `plot.summary.res_greatR()` S3 method inspired by `WVPlots::ScatterHistC()`.
greatR 1.1.0
CRAN release: 2024-01-09
- Added {furrr} and {future} as dependencies.
- Added `num_cores` parameter to `register()` to allow users to run registration in parallel.
- Added `exp_sd` parameter to `register()` to allow users to manually set up experimental gene expression variance.
- Updated `scaling_method` parameter in `register()` and `scale_data()` to allow no scaling (“none”, default), Z-score scaling (“z-score”), and min-max scaling (“min-max”), and updated unit tests accordingly.
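For illustration, the three `scaling_method` options listed above can be sketched in base R. This is a minimal sketch under the usual textbook definitions, not the package's `scale_data()`: the helper `scale_sketch()` and the sample vector are hypothetical, and how {greatR} groups values before scaling is not shown here.

```r
# Hypothetical helper (not part of {greatR}): applies one of the three
# scaling options to a plain numeric vector.
scale_sketch <- function(x, scaling_method = c("none", "z-score", "min-max")) {
  scaling_method <- match.arg(scaling_method)
  switch(scaling_method,
    "none" = x,                                    # leave values untouched
    "z-score" = (x - mean(x)) / sd(x),             # centre, then divide by sample SD
    "min-max" = (x - min(x)) / (max(x) - min(x))   # rescale to the [0, 1] range
  )
}

# Made-up expression values for a single gene.
expr <- c(2, 4, 6, 8)

scaled_none <- scale_sketch(expr)
scaled_z <- scale_sketch(expr, "z-score")
scaled_mm <- scale_sketch(expr, "min-max")
```

With “none” the values pass through unchanged, which matches the default named in the entry above.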
Improvements
- Updated `register()` to perform 3 sequential registrations when using Nelder-Mead; this improves the results of optimal stretch and shift parameters.
- Updated `calc_loglik()` to use `sigma_squared` in every time point in the sum.
- Updated `scaled_data()` and `preprocess_data()` to return `all_data` object only, instead of a `list()` containing `all_data`.
- Updated `compare_H1_and_H2()` to return `BIC_diff` column (`BIC_combined - BIC_separate`), instead of `BIC_combined` and `BIC_separate` on their own.
- Updated `explore_manual_search_space()` to use `BIC_diff` instead of `BIC_combined` to calculate `best_params` from `model_comparison` table.
- Updated `register()` to perform 3 sequential registrations when using Nelder-Mead; this improves the results of optimal stretch and shift parameters. This may be reverted by tweaking `neldermead()` parameters to ensure correct convergence.
- Added optional `stretch_init` and `shift_init` to `get_search_space_limits()`, and updated `optimise()` to allow for different `space_lims` calculation settings: automatic, given boundary box, and given initial coords (new).
- Removed unused `mean_data` calculation from `preprocess_data()` and argument from `scale_data()`.
- Moved “Will process N genes” message from `register()` to `preprocess_data()` after running `filter_*()` functions.
- Ensure `results_list$data` is arranged/ordered correctly in `register()`.
- Updated `get_H*_model_curves()` functions to ensure model curves are smooth.
- Updated `parse_gene_facets()` to display `BIC_diff` in facet strips.
- Added `plot_mean_data` parameter to `plot_registration_results()`.
- Updated `overlapping_percent` parameter in `register()` so it goes from 0 to 100 (it’s later normalised in the function to avoid breakages down the line).
- Added `scaling_method` as an attribute in `data` results from `register()`; this is used in `plot_registration_results()` to build the y-axis label according to the scaling method used.
- Updated `brapa_arabidopsis_registration.rds` file with new pipeline results.
- Split `get_search_space_limits()` into separate aux functions for stretch and shift, which allows more stretch and shift input combinations.
- Updated `validate_params(..., registration_type = "optimisation")` to allow more stretch and shift input combinations.
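As a worked illustration of the `BIC_diff` column described above, the sketch below computes `BIC_combined - BIC_separate` for a small table and picks one row from it. All numbers are made up, the table only mimics the column names mentioned in this changelog, and choosing the row with the smallest `BIC_diff` is an assumption based on the usual convention that a lower BIC indicates the preferred model; it is not taken from the package source.

```r
# Made-up BIC values for three candidate (stretch, shift) pairs.
model_comparison <- data.frame(
  stretch = c(1.5, 2.0, 2.5),
  shift = c(-1, 0, 1),
  BIC_combined = c(120.4, 98.7, 110.2),
  BIC_separate = c(105.0, 104.1, 104.9)
)

# Difference as written in the changelog entry.
model_comparison$BIC_diff <-
  model_comparison$BIC_combined - model_comparison$BIC_separate

# Assumption: the best candidate is the row with the smallest BIC_diff.
best_row <- which.min(model_comparison$BIC_diff)
best_params <- model_comparison[best_row, c("stretch", "shift", "BIC_diff")]
```

Under that convention, a negative `BIC_diff` means the combined model scores better than the separate models for that candidate.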
Bug fixes
- Improved `get_timepoint_comb_original_data()` and `get_timepoint_comb_registered_data()` to perform `cross_join()` on a single `gene_id` at a time using `lapply()`; this fixes the “Error: vector memory exhausted (limit reached?)” error.
- Updated `match_names()` to do double `setdiff()` to ensure name matching is done two ways, and updated corresponding unit test.
New functions
- `filter_incomplete_accession_pairs()` to filter out genes that are missing one accession.
- `calc_variance()` to preprocess data variance inside `preprocess_data()` instead of `calc_loglik()`.
- Aux `register_single_gene_*()` functions inside `register()` to simplify and generalise the pipeline for parallel registration.
greatR 1.0.0
CRAN release: 2023-07-19
- Rewrote registration pipeline from scratch, deprecating unnecessary and redundant auxiliary functions.
- Added L-BFGS-B and Nelder-Mead (now default) optimisation methods to {greatR}.
- Switched to manual calculation of log likelihood via `calc_loglik()` instead of `stats::logLik()`.
- Reduced computation time up to 1000 times (x30 speed-up from package rewrite, and x35 speed-up from switching default optimisation method).
- Removed {dplyr}, {magrittr}, {purrr}, {rlang}, and {stringr} as package dependencies.
- Added {neldermead} as a package dependency.
- Updated list of exported functions:
  - `register()`
  - `summarise_registration()`
  - `get_approximate_stretch()`
  - `plot_registration_results()`
  - `plot_heatmap()`
  - `calculate_distance()`
Improvements
- Simplified parameters of main `register()` function, and added `scaling_method`.
- Simplified structure of output object of `register()`.
- Simplified parameters of `summarise_registration()`, `plot_registration_results()`, `plot_heatmap()`, `calculate_distance()` to simply require `results` object from `register()`, vastly simplifying usage.
- Improved messages, errors, and progress indicators with {cli}.
- Added correct pluralisation in {cli} messages.
- Rewrote unit tests to use {data.table} exclusively for data manipulation.
- Added unit tests for `calc_loglik_H1()`, `calc_loglik_H2()`, `calc_overlapping_percent()`, `calculate_distance()`, `cross_join()`, `get_search_space_limits_from_params()`, `get_search_space_limits()`, `objective_fun()`, `optimise()`, `plot_heatmap()`, `plot_registration_results()`, `preprocess_data()`, `register_manually()`, `register()`, `summary_registration()`, `validate_params()`.
Bug fixes
- Fixed `match_names()` call when validating accession names in `register()`.
- Fixed use of deprecated `aes_string()` by parsing `timepoint_var` using `!!ggplot2::sym()` call.
- Fixed `preds` left join in `plot_registration_results()`.
- Fixed issue in `plot_registration_results()` not working when all genes are unregistered with `type = "registered"`.
- Fixed calculation of `time_delta` in `preprocess_data()` to ensure it’s grouped by `gene_id` and `accession` (not just `accession`).
greatR 0.2.0
CRAN release: 2022-06-08
- Added Alex Calderwood as package co-author.
- Added vignette for optimisation process.
- Refactored `num_shifts` and `shift_extreme` parameters into a simplified `shifts` parameter.
Improvements
- Improved default parameter values in exported functions.
- Added {optimization}, {purrr} as package dependencies.
- Removed {cowplot}, {ggpubr}, {ggrepel}, {Rtsne}, and {viridis} as package dependencies.
- Cleaned up {cli} messages.
- Removed legacy AIC references, as it is no longer used.
- Updated `calculate_between_sample_distance()` to use `registration_results` as primary parameter instead of `mean_df`, `mean_df_sc`, and `imputed_mean_df`.
- Added warning if no comparable time points are found using users’ pre-defined parameters.
- Refactored `optimise_shift_extreme` as `maintain_min_num_overlapping_points`, and properly defined and corrected the boundary box depending on whether the number of overlapping points needs to be maintained.
Bug fixes
- Check that input accessions exist in the input data in `get_approximate_stretch()`.
- Manually create time point sorting levels for `x_sample` and `y_sample` columns accordingly in `plot_heatmap()`.
- Properly handle `-` character in accession names in `plot_heatmap()` so that time points are parsed correctly.
New features
- Added optional parameter optimisation process using Simulated Annealing through `optimise_registration_params()`.
New functions
- `preprocess_data()` to simplify `scale_and_register_data()` code and reuse logic elsewhere.
- `get_best_stretch_and_shift_simplified()`.
- `get_BIC_from_registering_data()`.
- `get_boundary_box()`.
- `optimise_registration_params_single_gene()`.
- `optimise_registration_params()` as wrapper of `optimise_registration_params_single_gene()` for multiple genes.
- `get_best_stretch_and_shift_after_optimisation()`.