Spectra class
- class Spectra(data: Dict[str, Iterable])
Bases: object
Class for holding and manipulating site-frequency spectra of multiple types.
- __init__(data: Dict[str, Iterable])
Initialize spectra.
- Parameters:
data (Dict[str, Iterable]) – Dictionary of SFS counts keyed by type
- property n: int
The sample size.
- Returns:
Sample size
- property k: int
The number of types.
- Returns:
Number of types
- property n_monomorphic: Series
The number of monomorphic sites.
- Returns:
Number of monomorphic sites
- property polymorphic: ndarray
The polymorphic counts.
- Returns:
Polymorphic counts
- property n_polymorphic: ndarray
The total number of polymorphic counts.
- Returns:
Total number of polymorphic counts for each type
- static from_list(data: Sequence, types: List)
Create from a 2-dimensional array of spectra, i.e. data.ndim must be 2.
- Parameters:
data (Sequence) – Array of spectra
types (List) – Types
- Return type:
Spectra
- Returns:
Spectra
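As a rough illustration of the input that from_list expects, the sketch below pairs each row of a 2-dimensional list with a type name to obtain the kind of dictionary accepted by the constructor. The helper spectra_dict_from_list and the example counts are hypothetical, and the assumption that each row (rather than each column) holds one spectrum is not confirmed by the text above.

```python
from typing import Dict, List, Sequence


def spectra_dict_from_list(
    data: Sequence[Sequence[float]], types: List[str]
) -> Dict[str, List[float]]:
    """Pair each row of a 2-dimensional array with a type name (hypothetical helper)."""
    rows = [list(row) for row in data]
    # All rows must have the same length, i.e. the input is truly 2-dimensional.
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all spectra must have the same length")
    if len(rows) != len(types):
        raise ValueError("need exactly one type name per spectrum")
    # Assumption: row i of `data` is the spectrum of types[i].
    return dict(zip(types, rows))


counts = spectra_dict_from_list(
    [[1000, 20, 10, 5, 3], [2000, 12, 4, 2, 1]],
    types=["neutral", "selected"],
)
print(counts["selected"])
```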
- property types: List[str]
The types.
- Returns:
Types
- property n_sites: Series
The number of mutational target sites, which is the sum of all SFS entries.
- Returns:
Number of mutational target sites for each type
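The summary properties documented so far (n, k, polymorphic, n_polymorphic, n_monomorphic, n_sites) can be sketched on a plain dictionary of counts. This is a minimal illustration and not the class's implementation: the example counts are made up, and it assumes that each spectrum has n + 1 entries, that the first and last entries are the monomorphic classes, and that the entries in between are the polymorphic ones.

```python
from typing import Dict, List

# Hypothetical counts for two types with sample size n = 4 (n + 1 = 5 entries each).
# Assumed layout: entry 0 = no derived allele, entry n = fixed derived allele.
data: Dict[str, List[float]] = {
    "neutral": [1000, 20, 10, 5, 3],
    "selected": [2000, 12, 4, 2, 1],
}

n = len(next(iter(data.values()))) - 1  # sample size
k = len(data)  # number of types

# Polymorphic counts are the interior entries of each spectrum.
polymorphic = {t: sfs[1:-1] for t, sfs in data.items()}
n_polymorphic = {t: sum(p) for t, p in polymorphic.items()}

# Assumption: monomorphic sites are the first plus the last entry.
n_monomorphic = {t: sfs[0] + sfs[-1] for t, sfs in data.items()}

# Mutational target sites are the sum over all entries.
n_sites = {t: sum(sfs) for t, sfs in data.items()}

print(n, k)
print(n_polymorphic, n_monomorphic, n_sites)
```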
- property n_div: Series
The number of divergence counts.
- Returns:
Number of divergence counts for each type
- property has_div: Series
Whether n_div was specified.
- Returns:
Whether n_div was specified for each type
- property theta: Series
Calculate the site-wise population mutation rate using Watterson’s estimator. Note that theta is given per site, i.e. Watterson’s estimator is divided by the total number of sites (n_sites).
- property Theta: Series
Calculate the genome-wide population mutation rate using Watterson’s estimator.
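To make the relation between Theta and theta concrete, here is a small worked sketch of Watterson's estimator on a single spectrum. It uses the textbook form S / a_n, where S is the number of segregating (polymorphic) sites and a_n is the sum of 1/i for i from 1 to n - 1. The function name watterson and the example counts are hypothetical, and the class itself may compute the estimator differently.

```python
from typing import List, Tuple


def watterson(sfs: List[float]) -> Tuple[float, float]:
    """Return (Theta, theta) for one spectrum with n + 1 entries (hypothetical helper)."""
    n = len(sfs) - 1
    segregating = sum(sfs[1:-1])  # number of polymorphic sites S
    a_n = sum(1 / i for i in range(1, n))  # harmonic number of order n - 1
    Theta = segregating / a_n  # genome-wide estimate
    theta = Theta / sum(sfs)  # per-site estimate: divide by n_sites
    return Theta, theta


Theta, theta = watterson([1000, 20, 10, 5, 3])
print(Theta, theta)
```

For this example S = 35 and a_4 = 1 + 1/2 + 1/3 = 11/6, so Theta = 210/11, and theta is that value divided by the 1038 sites.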
- normalize()
Normalize spectra by sum of all entries.
- Return type:
Spectra
- Returns:
Normalized spectra
- to_spectra()
Convert to dictionary of spectrum objects.
- Return type:
Dict[str, Spectrum]
- Returns:
Dictionary of spectrum objects
- select(keys: str | List[str] | ndarray | tuple, use_regex: bool = True)
Select types. Alias for __getitem__.
- Parameters:
keys (Union[str, List[str], ndarray, tuple]) – String or list of strings, possibly regex, to match type names
use_regex (bool) – Whether to use regex to match type names
- Return type:
Union[Spectrum, Spectra]
- Returns:
Spectrum or Spectra depending on the number of matches
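The selection behaviour described for select can be sketched on a plain dictionary of counts. The helper select_types is hypothetical, and the sketch assumes that a regular expression has to match the whole type name, which the text above does not state. It mirrors the documented return behaviour by returning a single spectrum for exactly one match and a dictionary otherwise.

```python
import re
from typing import Dict, List, Union

Counts = Dict[str, List[float]]


def select_types(
    data: Counts, keys: Union[str, List[str]], use_regex: bool = True
) -> Union[List[float], Counts]:
    """Select types by name or by regular expression (hypothetical helper)."""
    patterns = [keys] if isinstance(keys, str) else list(keys)
    if use_regex:
        # Assumption: a pattern must match the entire type name.
        matches = [t for t in data if any(re.fullmatch(p, t) for p in patterns)]
    else:
        matches = [t for t in data if t in patterns]
    # One match yields a single spectrum, anything else a dictionary of spectra.
    if len(matches) == 1:
        return data[matches[0]]
    return {t: data[t] for t in matches}


data = {
    "pop1.neutral": [10, 2, 1, 0],
    "pop1.selected": [20, 1, 0, 0],
    "pop2.neutral": [5, 3, 2, 1],
}
print(select_types(data, r"pop1\..*"))
print(select_types(data, "pop2.neutral", use_regex=False))
```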
- get_empty()
Get a Spectra object with zero counts but with the same shape and types as self.
- Return type:
Spectra
- Returns:
Spectra object with zero counts
- merge_groups(level: List[int] | int = 0)
Group over the given levels and sum the spectra within each group, i.e. the spectra are summed over the levels that were not specified.
- Parameters:
level (Union[List[int], int]) – Level(s) to group over
- Return type:
Spectra
- Returns:
Spectra object with merged groups
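The grouping performed by merge_groups can be illustrated with plain dictionaries. The sketch assumes that the levels of a type name are separated by dots (as suggested by prefix and has_dots elsewhere on this page); the function merge_by_level and the example counts are hypothetical.

```python
from typing import Dict, List, Union

Counts = Dict[str, List[float]]


def merge_by_level(data: Counts, level: Union[int, List[int]] = 0) -> Counts:
    """Sum spectra that share the same name parts at the given levels (hypothetical helper)."""
    levels = [level] if isinstance(level, int) else list(level)
    merged: Counts = {}
    for name, sfs in data.items():
        parts = name.split(".")  # assumption: dots separate the levels
        key = ".".join(parts[i] for i in levels)
        if key in merged:
            # Entry-wise sum over the levels that were not kept.
            merged[key] = [a + b for a, b in zip(merged[key], sfs)]
        else:
            merged[key] = list(sfs)
    return merged


data = {
    "pop1.neutral": [10, 2, 1, 0],
    "pop1.selected": [20, 1, 0, 0],
    "pop2.neutral": [5, 3, 2, 1],
}
print(merge_by_level(data, level=0))
print(merge_by_level(data, level=1))
```

Grouping over level 0 keeps the population and sums over the second level, whereas grouping over level 1 keeps the site class and sums over populations.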
- has_dots()
Check whether column names contain dots.
- Return type:
bool
- Returns:
True if column names contain dots, False otherwise
- replace_dots(replacement: str = '_')
Replace dots in column names with a given string.
- Parameters:
replacement (str) – Replacement string
- Return type:
Spectra
- Returns:
Spectra object with replaced dots
- static from_dict(data: dict)
Load from nested dictionary, first indexed by types and then by samples.
- Parameters:
data (dict) – Dictionary of lists indexed by types
- Return type:
Spectra
- Returns:
Spectra object
- static from_dataframe(data: DataFrame)
Load Spectra object from dataframe.
- Parameters:
data (DataFrame) – Dataframe
- Return type:
Spectra
- Returns:
Spectra object
- classmethod from_file(file: str)
Load object from file.
- Parameters:
file (str) – Path to file, possibly URL
- Return type:
Spectra
- Returns:
Spectra object
- static from_spectra(spectra: Dict[str, Spectrum])
Create from dict of spectrum objects indexed by type.
- static from_spectrum(sfs: Spectrum)
Create from single spectrum object. The type of the spectrum is set to ‘all’.
- to_spectrum()
Convert to Spectrum object by summing over all types.
- Return type:
Spectrum
- Returns:
Spectrum object
- plot(show: bool = True, file: str = None, title: str = None, log_scale: bool = False, use_subplots: bool = False, show_monomorphic: bool = False, kwargs_legend: dict = {'prop': {'size': 8}}, ax: plt.Axes = None)
Visualize spectra.
- Parameters:
show (bool) – Whether to show the plot.
file (str) – File name to save the plot to.
title (str) – Plot title.
log_scale (bool) – Whether to use log scale on y-axis.
use_subplots (bool) – Whether to use subplots. Only for Python visualization backend.
show_monomorphic (bool) – Whether to show monomorphic sites.
kwargs_legend (dict) – Keyword arguments passed to plt.legend(). Only for Python visualization backend.
ax (plt.Axes) – Axes to plot on. Only for Python visualization backend and if use_subplots is False.
- Return type:
plt.Axes
- Returns:
Axes
- drop_empty()
Remove types whose spectra have no counts.
- Return type:
Spectra
- Returns:
Spectra with non-empty types
- drop_zero_entries()
Remove types whose spectra contain zero entries. Zero counts in the last entry, i.e. fixed derived alleles, are ignored.
- Return type:
Spectra
- Returns:
Spectra with non-zero entries
- drop_sparse(n_polymorphic: int)
Remove types whose spectra have fewer than or equal to n_polymorphic polymorphic sites.
- Return type:
Spectra
- Returns:
Spectra
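The three filters differ only in the condition a type has to satisfy to be kept. The following sketch expresses each condition on a plain dictionary of counts. The function names reuse the method names for readability but are stand-alone, hypothetical helpers, and the sketch assumes that the polymorphic sites are the interior entries of each spectrum.

```python
from typing import Dict, List

Counts = Dict[str, List[float]]


def drop_empty(data: Counts) -> Counts:
    # Keep types with at least one non-zero count.
    return {t: s for t, s in data.items() if any(c != 0 for c in s)}


def drop_zero_entries(data: Counts) -> Counts:
    # Keep types without zero entries, ignoring the last entry (fixed derived alleles).
    return {t: s for t, s in data.items() if all(c != 0 for c in s[:-1])}


def drop_sparse(data: Counts, n_polymorphic: int) -> Counts:
    # Keep types with strictly more than n_polymorphic polymorphic sites.
    # Assumption: polymorphic sites are the interior entries s[1:-1].
    return {t: s for t, s in data.items() if sum(s[1:-1]) > n_polymorphic}


data = {
    "a": [100, 5, 3, 0],  # zero only in the last entry
    "b": [0, 0, 0, 0],  # no counts at all
    "c": [50, 0, 2, 1],  # zero among the polymorphic entries
}
print(list(drop_empty(data)))
print(list(drop_zero_entries(data)))
print(list(drop_sparse(data, n_polymorphic=2)))
```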
- rename(names: List[str])
Rename types.
- Parameters:
names (List[str]) – New names
- Return type:
Spectra
- Returns:
Spectra with renamed types
- prefix(prefix: str)
Prefix types, i.e. ‘type’ -> ‘prefix.type’ for all types.
- Parameters:
prefix (str) – Prefix
- Return type:
Spectra
- Returns:
Spectra with prefixed types
- reorder_levels(levels: List[int])
Reorder levels.
- Parameters:
levels (List[int]) – New order of levels
- Return type:
Spectra
- Returns:
Spectra with reordered levels
- subsample(n: int, mode: Literal['random', 'probabilistic'] = 'probabilistic', seed: int | Generator = None)
Subsample spectra to a given sample size.
Warning
In ‘random’ mode, the SFS counts are cast to integers before subsampling, so the results are only sensible if the counts are integers or large enough to be approximated by integers. The ‘probabilistic’ mode does not have this limitation.
- Parameters:
n (int) – Sample size
mode (Literal['random', 'probabilistic']) – Subsampling mode. Either ‘random’ or ‘probabilistic’.
seed (int | Generator) – Random state or seed. Only for ‘random’ mode.
- Return type:
Spectra
- Returns:
Subsampled spectra
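A common way to reduce an SFS to a smaller sample size without random draws is hypergeometric projection, in which every count is spread over the smaller spectrum according to the probability of drawing j derived alleles in a subsample of size m. The sketch below shows this technique for one spectrum. Whether the ‘probabilistic’ mode works exactly like this is an assumption, and project_sfs is a hypothetical helper rather than part of the class.

```python
from math import comb
from typing import List


def project_sfs(sfs: List[float], m: int) -> List[float]:
    """Project a spectrum of sample size n down to sample size m (hypothetical helper)."""
    n = len(sfs) - 1
    if not 0 < m <= n:
        raise ValueError("m must satisfy 0 < m <= n")
    total = comb(n, m)
    projected = [0.0] * (m + 1)
    for i, count in enumerate(sfs):
        for j in range(m + 1):
            # Hypergeometric probability of j derived alleles among m sampled
            # when i of the n original alleles are derived; comb() returns 0
            # for impossible combinations.
            projected[j] += count * comb(i, j) * comb(n - i, m - j) / total
    return projected


sfs = [1000, 20, 10, 5, 3]
smaller = project_sfs(sfs, 2)
print(smaller, sum(smaller))
```

Because the hypergeometric probabilities sum to one for every original class, the total number of sites is preserved and non-integer counts pose no problem.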
- resample(seed: int | Generator = None)
Resample SFS assuming independent Poisson counts.
- Parameters:
seed (int | Generator) – Random state or seed
- Return type:
Spectra
- Returns:
Resampled spectra.
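Resampling under independent Poisson counts means that every entry of a spectrum is replaced by a Poisson draw whose mean is the observed count. The sketch below illustrates the idea with the standard library only, using Knuth's multiplication method, which is suitable for small means. The helpers poisson_draw and resample_sfs and the example counts are hypothetical; the class presumably relies on a NumPy random generator instead.

```python
import math
import random
from typing import List, Optional


def poisson_draw(mean: float, rng: random.Random) -> int:
    """Draw one Poisson variate with Knuth's method (small means only)."""
    if mean <= 0:
        return 0
    if mean > 500:
        # exp(-mean) underflows for large means, so this simple sampler refuses them.
        raise ValueError("this simple sampler is only meant for small means")
    threshold = math.exp(-mean)
    k, product = 0, rng.random()
    # Multiply uniform draws until the product falls to the threshold or below.
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def resample_sfs(sfs: List[float], seed: Optional[int] = None) -> List[int]:
    """Replace every entry by an independent Poisson draw with that entry as mean."""
    rng = random.Random(seed)
    return [poisson_draw(count, rng) for count in sfs]


sfs = [40, 12, 6, 3, 0]
print(resample_sfs(sfs, seed=1))
```

Entries with a count of zero stay zero, and passing the same seed reproduces the same resampled spectrum.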