API documentation
SetLibrary
The data structure for a microbe-set library.
-
class msea.SetLibrary(d_gmt=None, rank_means=None, rank_stds=None)
SetLibrary is a convenience class for a reference microbe-set library.
-
enrich(input_set, adjust=False, universe=1000)
Perform MSEA given an input_set.
Parameters:
- input_set – a set of microbes to analyze against this reference set
- adjust – if True, adjust for the expected distributions of ranks
- universe – number of microbes used as the universe size for Fisher’s exact test
Returns: a pandas.DataFrame containing the MSEA result table
-
get_empirical_ranks(n=1000, universe=1200, fix_size=None)
Calculate the empirical ranks for each reference set.
Parameters:
- n – number of permutations
- universe – number of microbes used as the universe size for Fisher’s exact test
- fix_size – if None, random sets of variable size are generated; if an int, random sets of that fixed size are used to evaluate the null distribution of the ranks
Returns: nothing
-
classmethod load(gmt_file=None, rank_means_file=None, rank_stds_file=None)
Load a reference set into a SetLibrary instance from files.
Parameters:
- gmt_file – a file, or the URL of a file, containing the reference set in GMT format
- rank_means_file – a .npy file for the array of mean ranks
- rank_stds_file – a .npy file for the array of rank standard deviations
Returns: a SetLibrary object
-
save(dirname)
Save the SetLibrary instance in a directory, optionally with computed rank_means and rank_stds.
Parameters:
- dirname – name of the directory in which the object will be stored
Returns: nothing
-
Utility functions
Utils for performing microbe-set enrichment analysis.
-
msea.utils.enrich(microbes, d_gmt, rank_means=None, rank_stds=None, universe=1000)
Perform enrichment analysis for a set of microbes against a microbe-set library using Fisher’s exact test and z-score.
Parameters:
- microbes – a set of microbes to analyze against the reference library
- d_gmt – a dictionary of microbe-sets representing the reference microbe-set library
- rank_means – (optional) the array of mean ranks from the null distribution
- rank_stds – (optional) the array of standard deviations of ranks from the null distribution
- universe – number of microbes used as the universe size for Fisher’s exact test
Returns: a pandas.DataFrame containing the MSEA result table
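The enrichment procedure described above can be illustrated with a minimal standard-library sketch. It is not the msea implementation: the names overlap_pvalue and enrich_sketch are hypothetical, the test is assumed to be one-sided (over-representation), the z-score is assumed to compare each term's observed rank with its null mean and standard deviation, and rows are returned as plain dictionaries rather than a pandas.DataFrame.

```python
from math import comb


def overlap_pvalue(s1, s2, universe):
    """One-sided hypergeometric p-value for the overlap of two sets (assumed)."""
    a, n1, n2 = len(s1 & s2), len(s1), len(s2)
    tail = sum(
        comb(n2, k) * comb(universe - n2, n1 - k)
        for k in range(a, min(n1, n2) + 1)
    )
    return tail / comb(universe, n1)


def enrich_sketch(microbes, d_gmt, rank_means=None, rank_stds=None, universe=1000):
    """Return one row (dict) per term, sorted by ascending p-value."""
    terms = list(d_gmt)  # rank_means / rank_stds are assumed to follow this order
    rows = [
        {
            "term": t,
            "overlap": len(microbes & d_gmt[t]),
            "pvalue": overlap_pvalue(microbes, d_gmt[t], universe),
        }
        for t in terms
    ]
    ranked = sorted(rows, key=lambda r: r["pvalue"])
    for rank, row in enumerate(ranked, start=1):
        row["rank"] = rank
        if rank_means is not None and rank_stds is not None:
            i = terms.index(row["term"])
            # Assumed z-score: observed rank relative to the null rank distribution.
            row["zscore"] = (rank - rank_means[i]) / rank_stds[i]
    return ranked
```

A term whose observed rank is well below its null mean gets a negative z-score under this convention, which would mark it as ranking better than expected by chance.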
-
msea.utils.fisher_test(s1, s2, universe)
Perform Fisher’s exact test for two sets.
Parameters:
- s1 – a set of items
- s2 – a set of items
- universe – int, universe size
Returns: the odds ratio and p-value
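To make the role of the universe size concrete, here is a minimal sketch of Fisher’s exact test on two sets using only the standard library. The function name fisher_test_sketch is hypothetical, and the one-sided (over-representation) alternative is an assumption; the documentation above does not state which alternative the library uses.

```python
from math import comb


def fisher_test_sketch(s1, s2, universe):
    """Return (odds_ratio, p_value) for the overlap of two sets.

    Assumption: one-sided test for over-representation.
    """
    a = len(s1 & s2)           # items in both sets
    b = len(s1) - a            # items only in s1
    c = len(s2) - a            # items only in s2
    d = universe - a - b - c   # items in neither set
    if d < 0:
        raise ValueError("universe is smaller than the union of s1 and s2")

    if b * c == 0:
        odds_ratio = float("inf") if a * d > 0 else float("nan")
    else:
        odds_ratio = (a * d) / (b * c)

    # Hypergeometric upper tail: P(overlap >= a) when drawing len(s1) items
    # from a universe that contains len(s2) "marked" items.
    n_draws, n_marked = len(s1), len(s2)
    tail = sum(
        comb(n_marked, k) * comb(universe - n_marked, n_draws - k)
        for k in range(a, min(n_draws, n_marked) + 1)
    )
    return odds_ratio, tail / comb(universe, n_draws)
```

Because d depends directly on universe, a larger universe makes the same overlap look more surprising and so lowers the p-value.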
-
msea.utils.get_empirical_ranks(d_gmt, n=1000, universe=1200, fix_size=None)
Generate random microbe sets to get empirical ranks for each term.
Parameters:
- d_gmt – a dictionary of microbe-sets representing the reference microbe-set library
- n – number of permutations
- universe – number of microbes used as the universe size for Fisher’s exact test
- fix_size – if None, random sets of variable size are generated; if an int, random sets of that fixed size are used to evaluate the null distribution of the ranks
Returns: the means and standard deviations of the null ranks
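The idea of a null rank distribution can be sketched as follows, again with the standard library only. This is an illustration under several assumptions rather than the library's procedure: random sets are drawn from the union of the library's members, variable sizes are drawn uniformly, ties in p-value are broken by term order, and the names empirical_ranks_sketch, overlap_pvalue, and seed are hypothetical.

```python
import random
from math import comb
from statistics import mean, pstdev


def overlap_pvalue(s1, s2, universe):
    """One-sided hypergeometric p-value for the overlap of two sets (assumed)."""
    a, n1, n2 = len(s1 & s2), len(s1), len(s2)
    tail = sum(
        comb(n2, k) * comb(universe - n2, n1 - k)
        for k in range(a, min(n1, n2) + 1)
    )
    return tail / comb(universe, n1)


def empirical_ranks_sketch(d_gmt, n=1000, universe=1200, fix_size=None, seed=0):
    """Return (means, stds) of each term's rank over n random input sets."""
    rng = random.Random(seed)
    terms = list(d_gmt)
    # Assumption: random microbes come from the library's own members,
    # and universe is at least as large as this pool.
    pool = sorted(set().union(*d_gmt.values()))
    ranks = {t: [] for t in terms}
    for _ in range(n):
        size = fix_size if fix_size is not None else rng.randint(1, len(pool))
        random_set = set(rng.sample(pool, size))
        pvals = {t: overlap_pvalue(random_set, d_gmt[t], universe) for t in terms}
        # sorted() is stable, so tied p-values keep the library's term order.
        for rank, t in enumerate(sorted(terms, key=pvals.get), start=1):
            ranks[t].append(rank)
    means = [mean(ranks[t]) for t in terms]
    stds = [pstdev(ranks[t]) for t in terms]
    return means, stds
```

Terms with many members tend to overlap random sets more often, so their null mean rank differs from that of small terms; this is the bias that a rank-based adjustment would correct for.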
-
msea.utils.multipletests_fdr_bh(pvals, is_sorted=False)
Benjamini-Hochberg FDR correction for p-values, adapted from statsmodels.
Parameters:
- pvals – an array of nominal p-values
- is_sorted – bool, whether the p-values are sorted
Returns: an array of corrected p-values (FDRs, also called q-values)
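For reference, the Benjamini-Hochberg adjustment itself is short enough to sketch in plain Python. The function name fdr_bh_sketch is hypothetical and the sketch takes a plain list instead of an array; it illustrates the standard procedure, not the adapted statsmodels code.

```python
def fdr_bh_sketch(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

For example, the p-values 0.01, 0.04, 0.03, 0.005 are adjusted to approximately 0.02, 0.04, 0.04, 0.02.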
-
msea.utils.read_gmt(file_or_url)
Read a GMT file into a dictionary of sets.
Parameters:
- file_or_url – a GMT file or the URL of a GMT file
Returns: a dictionary of sets
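As an illustration of the dictionary-of-sets structure, the sketch below parses GMT-formatted lines. It assumes the common GMT layout of one term per line with tab-separated fields (term name, description, then members); the name parse_gmt_lines is hypothetical, and file or URL handling is left out.

```python
def parse_gmt_lines(lines):
    """Parse GMT-formatted lines into {term: set_of_members}."""
    d_gmt = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank or malformed lines
        term, _description, *members = fields
        d_gmt[term] = {m for m in members if m}  # drop empty trailing fields
    return d_gmt
```

Passing an open file handle works as well as a list of strings, since both yield one line at a time.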
-
msea.utils.write_gmt(d_gmt, filename)
Write a dictionary of sets to a GMT file.
Parameters:
- d_gmt – a dictionary of sets
- filename – filename for the GMT file
Returns: nothing
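The reverse direction can be sketched in the same way. This assumes the common tab-separated GMT layout with an empty description column; the name write_gmt_sketch is hypothetical, and sorting the members is a choice made here only to keep the output stable.

```python
def write_gmt_sketch(d_gmt, filename):
    """Write {term: set_of_members} as tab-separated GMT lines."""
    with open(filename, "w", encoding="utf-8") as fh:
        for term, members in d_gmt.items():
            # Columns: term name, empty description, then the sorted members.
            fh.write("\t".join([term, "", *sorted(members)]) + "\n")
```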