Title: | What the Package Does (One Line, Title Case) |
---|---|
Description: | What the package does (one paragraph). |
Authors: | Sam El-Kamand [aut, cre] |
Maintainer: | Sam El-Kamand <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.0.0.9000 |
Built: | 2024-11-09 03:22:15 UTC |
Source: | https://github.com/selkamand/mutaliskRutils |
Sample Names From Mutalisk Output
extract_sample_names_from_mutalisk_filenames(mutalisk_filenames)
mutalisk_filenames |
names of mutalisk output files (character) |
sample name (string)
Sample Names From Mutalisk File Contents
extract_sample_names_from_mutalisk_files(mutalisk_filenames)
mutalisk_filenames |
names of mutalisk output files (character) |
sample name (string)
Mutalisk directory to dataframe
mutalisk_best_signature_directory_to_dataframe(directory, metadata = NA)
directory |
path to mutalisk_best_fit folder. To obtain it, run your VCFs through mutalisk, select Mutational Signature (Best only), and click 'Get the selected result for all samples at once'. Then unzip the downloaded file and you're ready to go |
metadata |
Either a path to a CSV file or a dataframe. Must have a header line that includes a SampleID column whose values match those of mutalisk_dataframe (string) |
tibble
Add sample metadata from dataframe to mutalisk data.frame
mutalisk_dataframe_add_metadata(mutalisk_dataframe, sample_metadata)
mutalisk_dataframe |
a dataframe that can be produced using mutalisk_best_signature_directory_to_dataframe. You can also make it yourself if you want to visualise non-mutalisk data. The dataframe just needs 3 columns: SampleID, Signatures, and Contributions. |
sample_metadata |
a dataframe containing a SampleID column and additional columns for each property you want to add as metadata (data.frame) |
mutalisk dataframe with additional metadata columns (data.frame)
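The idea of attaching metadata by SampleID can be illustrated with a small base R sketch. This is not the package's implementation; it only assumes that the operation behaves like a left join on SampleID. The sample IDs, signature names, and the Tissue column are all made up for illustration.

```r
# Hand-built stand-in for a mutalisk dataframe (all values are hypothetical).
mutalisk_df <- data.frame(
  SampleID      = c("S1", "S1", "S2"),
  Signatures    = c("SBS1", "SBS5", "SBS1"),
  Contributions = c(0.6, 0.4, 1.0),
  stringsAsFactors = FALSE
)

# Hypothetical metadata: one row per sample, keyed by SampleID.
sample_metadata <- data.frame(
  SampleID = c("S1", "S2"),
  Tissue   = c("Lung", "Colon"),
  stringsAsFactors = FALSE
)

# Left join: keep every signature row and attach the matching metadata columns.
with_metadata <- merge(mutalisk_df, sample_metadata, by = "SampleID", all.x = TRUE)
print(with_metadata)
```

Using all.x = TRUE keeps samples that have no metadata row; their metadata columns are filled with NA rather than the rows being dropped.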
In a normal mutalisk dataframe, each sample has data for ONLY the 1-7 signatures that comprise the 'best fit'. Basically, we end up with a dataframe where not all signatures have entries for all samples. We call these 'implicit' missing values. As a result, a signature-level jitterplot won't show the samples for which a signature was excluded from the 'best fit' set, when we'd actually want to know that it contributed 0% to those samples. This function fixes the issue by adding entries for ALL signature-sample pairs, with Contributions set to 0% where relevant.
mutalisk_dataframe_expand(mutalisk_dataframe)
mutalisk_dataframe |
a dataframe that can be produced using mutalisk_best_signature_directory_to_dataframe. You can also make it yourself if you want to visualise non-mutalisk data. The dataframe just needs 3 columns: SampleID, Signatures, and Contributions. |
dataframe containing all combinations of Sample ID and Signatures. For cases where a signature was not included in the 'best fit' subset, Contribution is set to 0%.
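To make the 'implicit missing values' idea concrete, here is a minimal base R sketch of the same kind of expansion. It is an illustration under assumptions, not the package's own code: the data are made up, and the approach (expand.grid plus merge) is just one way to fill in the missing signature-sample pairs.

```r
# Hypothetical best-fit data: S2 has no row for SBS5 (an implicit missing value).
df <- data.frame(
  SampleID      = c("S1", "S1", "S2"),
  Signatures    = c("SBS1", "SBS5", "SBS1"),
  Contributions = c(0.6, 0.4, 1.0),
  stringsAsFactors = FALSE
)

# Every combination of sample and signature seen anywhere in the data.
all_pairs <- expand.grid(
  SampleID   = unique(df$SampleID),
  Signatures = unique(df$Signatures),
  stringsAsFactors = FALSE
)

# Left join the observed rows onto the full grid; unobserved pairs become NA.
expanded <- merge(all_pairs, df, by = c("SampleID", "Signatures"), all.x = TRUE)

# Make the implicit zeros explicit.
expanded$Contributions[is.na(expanded$Contributions)] <- 0
print(expanded)
```

After this step every sample has a row for every signature, so a per-signature plot can show the zero contributions too.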
Adds metadata from a file to the mutalisk dataframe
mutalisk_dataframe_inform_user_of_metadata(mutalisk_dataframe, metadata)
mutalisk_dataframe |
a dataframe that can be produced using mutalisk_best_signature_directory_to_dataframe. You can also make it yourself if you want to visualise non-mutalisk data. The dataframe just needs 3 columns: SampleID, Signatures, and Contributions. |
metadata |
Either a path to a CSV file or a dataframe. Must have a header line that includes a SampleID column whose values match those of mutalisk_dataframe (string) |
mutalisk dataframe with metadata columns (data.frame)
Get a vector of metadata columns from a mutalisk_dataframe.
mutalisk_dataframe_metadata_column_names(mutalisk_dataframe)
mutalisk_dataframe |
a dataframe that can be produced using mutalisk_best_signature_directory_to_dataframe. You can also make it yourself if you want to visualise non-mutalisk data. The dataframe just needs 3 columns: SampleID, Signatures, and Contributions. |
a character vector containing names of metadata columns. If no metadata columns have been added, returns a zero length character vector. (character)
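One plausible way to think about this return value is sketched below in base R. It assumes (and this is only an assumption, not confirmed package behaviour) that a metadata column is any column other than the three core ones: SampleID, Signatures, and Contributions. The Tissue column is made up.

```r
# Assumption: these three columns are "core"; everything else is metadata.
core_columns <- c("SampleID", "Signatures", "Contributions")

# Hypothetical dataframe with one added metadata column.
df <- data.frame(
  SampleID      = c("S1", "S2"),
  Signatures    = c("SBS1", "SBS1"),
  Contributions = c(1.0, 1.0),
  Tissue        = c("Lung", "Colon"),
  stringsAsFactors = FALSE
)

# Columns that are not part of the core set.
metadata_cols <- setdiff(names(df), core_columns)
print(metadata_cols)

# With no metadata columns present, the result is a zero-length character vector.
no_metadata_cols <- setdiff(names(df[core_columns]), core_columns)
print(length(no_metadata_cols))
```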
Mutalisk files to dataframe
mutalisk_to_dataframe(mutalisk_files, sample_names_from_file_contents = FALSE)
mutalisk_files |
a vector of filepaths, each leading to the report.txt files output when downloading best_signature results for all vcfs in cohort |
sample_names_from_file_contents |
guess sample names from file contents instead of filenames (flag) |
a dataframe containing three columns:
SampleID: a sample identifier.
Signatures: an identifier for a particular signature.
Contributions: the contribution of the signature to the patient's genetic profile, expressed as a proportion (0.1 = 10 percent).
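For readers who want to build a compatible dataframe by hand (for example, to visualise non-mutalisk data), the three-column layout described above might look like the following base R sketch. All sample IDs, signature names, and values are made up.

```r
# A hand-built dataframe in the three-column layout (hypothetical values).
mutalisk_df <- data.frame(
  SampleID      = c("S1", "S1", "S2"),
  Signatures    = c("SBS1", "SBS5", "SBS1"),
  Contributions = c(0.6, 0.4, 1.0),  # proportions: 0.1 means 10 percent
  stringsAsFactors = FALSE
)

# Quick sanity check: total contribution per sample in this made-up example.
totals <- tapply(mutalisk_df$Contributions, mutalisk_df$SampleID, sum)
print(totals)
```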
You probably want to use mutalisk_to_dataframe instead. See ?mutalisk_to_dataframe
mutalisk_to_dataframe_single_sample( mutalisk_file, sample_names_from_file_contents = FALSE )
mutalisk_file |
a vector of filepaths, each leading to the report.txt files output when downloading best_signature results for all vcfs in cohort |
sample_names_from_file_contents |
guess sample names from file contents instead of filenames (flag) |
tibble
Custom Palette
pallette_cols23_customised(n)
n |
number of colors in palette to return |
character vector of colors
pallette_cols23_customised(2)
Plot Signature-Level Dotplot
Plots a signature-level dotplot.
plot_signature_contribution_jitterplot(mutalisk_dataframe)
mutalisk_dataframe |
a dataframe that can be produced using mutalisk_best_signature_directory_to_dataframe. You can also make it yourself if you want to visualise non-mutalisk data. The dataframe just needs 3 columns: SampleID, Signatures, and Contributions. |
a ggplot object
A note of warning: across different mutalisk runs, this function does not enforce consistent colours for a given mutational signature. It is better to load all data at once and add a facet_wrap call.
plot_stacked_bar( mutalisk_dataframe, lump_type = "min_prop", lump_min = 0.1, topn = 5, legend = "right", legend_direction = NA, pal = pals::kovesi.diverging_rainbow_bgymr_45_85_c67, color_of_other = "grey60", facet_column = NA, fontsize_strip = 18, fontsize_axis_title = 18 )
mutalisk_dataframe |
a dataframe that can be produced using mutalisk_best_signature_directory_to_dataframe. You can also make it yourself if you want to visualise non-mutalisk data. The dataframe just needs 3 columns: SampleID, Signatures, and Contributions. |
lump_type |
one of "min_prop", "topn", or "none". min_prop will lump together all signatures that contribute less than lump_min. topn will keep the topn contributing signatures distinct and lump the rest together (string) |
lump_min |
see lump_type |
topn |
see lump_type |
legend |
Where should the legend be placed? ("top", "left", "bottom", "right") |
legend_direction |
How should the legend be oriented? By default, the orientation is guessed from the position of the legend ("vertical", "horizontal") |
pal |
Palette to use for generating colours. |
color_of_other |
colour of signatures lumped into 'other' (string) |
facet_column |
name of column to use for faceting (string) |
fontsize_strip |
fontsize of facet titles (number) |
fontsize_axis_title |
fontsize of axis titles (number) |
a ggplot (gg)
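The two lumping modes described for lump_type can be sketched in base R as follows. This is an interpretation, not the package's implementation: it assumes "min_prop" is applied per row (each sample-signature contribution is compared with lump_min) and that "topn" ranks signatures by their total contribution across all samples. The data and the "Other" label are made up for illustration.

```r
# Hypothetical contributions for two samples.
df <- data.frame(
  SampleID      = c("S1", "S1", "S1", "S2", "S2"),
  Signatures    = c("SBS1", "SBS5", "SBS13", "SBS1", "SBS2"),
  Contributions = c(0.55, 0.40, 0.05, 0.92, 0.08),
  stringsAsFactors = FALSE
)

# --- "min_prop" style: relabel contributions below lump_min as "Other" ---
lump_min <- 0.1
by_min_prop <- df
by_min_prop$Signatures[by_min_prop$Contributions < lump_min] <- "Other"
# Sum within each sample so that lumped signatures form a single bar segment.
by_min_prop <- aggregate(Contributions ~ SampleID + Signatures,
                         data = by_min_prop, FUN = sum)

# --- "topn" style: keep the topn signatures by total contribution ---
topn <- 2
totals <- sapply(split(df$Contributions, df$Signatures), sum)
keep <- names(totals)[order(totals, decreasing = TRUE)][seq_len(min(topn, length(totals)))]
by_topn <- df
by_topn$Signatures[!by_topn$Signatures %in% keep] <- "Other"
by_topn <- aggregate(Contributions ~ SampleID + Signatures,
                     data = by_topn, FUN = sum)

print(by_min_prop)
print(by_topn)
```

In both cases the per-sample totals are unchanged; only the labelling of minor signatures differs, which is what keeps the stacked bars the same height.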