Spectra class#

class Spectra(data: Dict[str, Iterable])[source]#

Bases: object

Class for holding and manipulating site-frequency spectra of multiple types.

__init__(data: Dict[str, Iterable])[source]#

Initialize spectra.

Parameters:

data (Dict[str, Iterable]) – Dictionary of SFS counts keyed by type

property n: int#

The sample size.

Returns:

Sample size

property k: int#

The number of types.

Returns:

Number of types

property n_monomorphic: Series#

The number of monomorphic sites.

Returns:

Number of monomorphic sites

property polymorphic: ndarray#

The polymorphic counts.

Returns:

Polymorphic counts

property n_polymorphic: ndarray#

The total number of polymorphic counts.

Returns:

Total number of polymorphic counts for each type
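
As a rough illustration of how the summary properties above relate to the raw counts, the following plain-Python sketch derives them from a dictionary of SFS counts without using the library. It assumes each spectrum has n + 1 entries, that entry 0 holds ancestral monomorphic sites and entry n holds fixed derived sites, and that both count as monomorphic; the helper name `summarize` and the example counts are hypothetical.

```python
from typing import Dict, List


def summarize(data: Dict[str, List[float]]) -> dict:
    """Derive sample size, number of types and per-type site counts."""
    lengths = {len(counts) for counts in data.values()}
    if len(lengths) != 1:
        raise ValueError("all spectra must have the same length")

    # A spectrum for sample size n has n + 1 entries (0, 1, ..., n derived alleles).
    n = lengths.pop() - 1

    return {
        "n": n,
        "k": len(data),
        # Assumption: first and last entries are both treated as monomorphic.
        "n_monomorphic": {t: c[0] + c[-1] for t, c in data.items()},
        # Interior entries are the polymorphic classes.
        "polymorphic": {t: c[1:-1] for t, c in data.items()},
        "n_polymorphic": {t: sum(c[1:-1]) for t, c in data.items()},
    }


example = {"neutral": [1000, 40, 20, 10, 5], "selected": [3000, 30, 10, 5, 15]}
summary = summarize(example)
```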

static from_list(data: Sequence, types: List)[source]#

Create from array of spectra. Note that data.ndim needs to be 2.

Parameters:
  • data (Sequence) – Array of spectra

  • types (List) – Types

Return type:

Spectra

Returns:

Spectra
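
The two-dimensional requirement can be pictured with a small standalone sketch. It assumes that each row of data is the spectrum of one type, in the same order as types; this row-per-type layout and the helper name `rows_to_dict` are assumptions for illustration, not the library's code.

```python
from typing import Dict, List, Sequence


def rows_to_dict(data: Sequence[Sequence[float]], types: List[str]) -> Dict[str, List[float]]:
    """Pair each row of a 2-D array-like with its type name."""
    # Every element of data must itself be a sequence (i.e. data is 2-D).
    if not all(hasattr(row, "__len__") for row in data):
        raise ValueError("data must be two-dimensional")
    if len(data) != len(types):
        raise ValueError("need exactly one type name per row")
    if len({len(row) for row in data}) > 1:
        raise ValueError("all rows must have the same length")

    return {t: list(row) for t, row in zip(types, data)}


by_type = rows_to_dict([[100, 5, 2, 1], [200, 3, 1, 0]], ["neutral", "selected"])
```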

property types: List[str]#

The types.

Returns:

Types

property n_sites: Series#

The number of mutational target sites, which is the sum of all SFS entries.

Returns:

Number of mutational target sites for each type

property n_div: Series#

The number of divergence counts.

Returns:

Number of divergence counts for each type

property has_div: Series#

Whether n_div was specified.

Returns:

Whether n_div was specified for each type

property theta: Series#

Calculate site-wise population mutation rate using Watterson’s estimator. Note that theta is given per site, i.e. Watterson’s estimator is divided by the total number of sites (n_sites).

property Theta: Series#

Calculate genome-wide population mutation rate using Watterson’s estimator.

Note

Property Theta is not normalized by the total number of sites, unlike theta.
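
The relationship between Theta and theta can be worked through by hand. The sketch below applies the textbook form of Watterson's estimator, S / a_n with a_n = 1 + 1/2 + ... + 1/(n - 1), to a single spectrum and then divides by the sum of all entries. It is a standalone illustration rather than the library's implementation; the function name `watterson` and the example counts are hypothetical.

```python
from typing import Sequence, Tuple


def watterson(sfs: Sequence[float]) -> Tuple[float, float]:
    """Return (Theta, theta) for one spectrum with n + 1 entries."""
    n = len(sfs) - 1

    # S: number of segregating (polymorphic) sites, i.e. the interior entries.
    segregating = sum(sfs[1:-1])

    # Harmonic number a_n = sum_{i=1}^{n-1} 1/i.
    a_n = sum(1.0 / i for i in range(1, n))

    big_theta = segregating / a_n        # not normalized by the number of sites
    small_theta = big_theta / sum(sfs)   # per-site value, divided by n_sites
    return big_theta, small_theta


big_theta, small_theta = watterson([1000, 40, 20, 10, 5])
```

For this example n = 4, S = 70 and a_n = 11/6, so Theta = 420/11 and theta is that value divided by the 1075 sites in the spectrum.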

normalize()[source]#

Normalize spectra by sum of all entries.

Return type:

Spectra

Returns:

Normalized spectra

to_file(file: str)[source]#

Save object to file.

Parameters:

file (str) – File name

to_spectra()[source]#

Convert to dictionary of spectrum objects.

Return type:

Dict[str, Spectrum]

Returns:

Dictionary of spectrum objects

to_dataframe()[source]#

Get representation as dataframe.

Return type:

DataFrame

Returns:

Dataframe

to_numpy()[source]#

Convert to numpy array.

Return type:

ndarray

Returns:

Numpy array

to_list()[source]#

Convert to nested list.

Return type:

list

Returns:

Nested list

to_dict()[source]#

Convert to dictionary.

Return type:

dict

Returns:

Dictionary of lists

select(keys: str | List[str] | ndarray | tuple, use_regex: bool = True)[source]#

Select types. Alias for __getitem__.

Parameters:
  • keys (Union[str, List[str], ndarray, tuple]) – String or list of strings, possibly regex to match type names

  • use_regex (bool) – Whether to use regex to match type names

Return type:

Spectra

Returns:

Spectrum or Spectra depending on the number of matches
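
A minimal sketch of regex-based type selection follows. It only filters a list of type names and does not touch the counts. Matching with `re.fullmatch` is an assumption made here for illustration, since this page does not state how the pattern is anchored; the helper name `select_types` is hypothetical.

```python
import re
from typing import List, Union


def select_types(types: List[str], keys: Union[str, List[str]], use_regex: bool = True) -> List[str]:
    """Return the type names matched by any of the given keys."""
    if isinstance(keys, str):
        keys = [keys]

    selected = []
    for name in types:
        for key in keys:
            if use_regex:
                # Assumption: the whole type name must match the pattern.
                matched = re.fullmatch(key, name) is not None
            else:
                matched = key == name
            if matched:
                selected.append(name)
                break
    return selected


all_types = ["neutral.A", "neutral.B", "selected.A"]
by_pattern = select_types(all_types, r"neutral\..*")
by_name = select_types(all_types, "neutral.A", use_regex=False)
```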

copy()[source]#

Copy object.

Return type:

Spectra

Returns:

Copy of object

get_empty()[source]#

Get a Spectra object with zero counts but having the same shape and types as self.

Return type:

Spectra

Returns:

Spectra object with zero counts

merge_groups(level: List[int] | int = 0)[source]#

Group types by the given level(s) and sum the spectra within each group, i.e. the spectra are summed over the levels that were not specified.

Parameters:

level (Union[List[int], int]) – Level(s) to group over

Return type:

Spectra

Returns:

Spectra object with merged groups
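
To make the grouping concrete, here is a plain-Python sketch that treats the dot-separated parts of a type name as levels, keeps the requested level(s) as the new type name, and sums the spectra that share it. The dotted-name interpretation follows the prefix convention documented on this page, but the function below is a hypothetical stand-in, not the library's code.

```python
from typing import Dict, List, Union


def merge_by_level(data: Dict[str, List[float]], level: Union[int, List[int]] = 0) -> Dict[str, List[float]]:
    """Sum spectra whose type names agree on the given level(s)."""
    levels = [level] if isinstance(level, int) else list(level)

    merged: Dict[str, List[float]] = {}
    for name, counts in data.items():
        parts = name.split(".")
        # New type name consists only of the levels we group over.
        key = ".".join(parts[i] for i in levels)
        if key in merged:
            merged[key] = [a + b for a, b in zip(merged[key], counts)]
        else:
            merged[key] = list(counts)
    return merged


counts = {
    "neutral.A": [10, 2, 1, 0],
    "neutral.B": [20, 3, 1, 1],
    "selected.A": [30, 1, 0, 2],
}
by_first = merge_by_level(counts, level=0)
by_second = merge_by_level(counts, level=1)
```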

has_dots()[source]#

Check whether column names contain dots.

Return type:

bool

Returns:

True if column names contain dots, False otherwise

replace_dots(replacement: str = '_')[source]#

Replace dots in column names with a given string.

Parameters:

replacement (str) – Replacement string

Return type:

Spectra

Returns:

Spectra object with replaced dots

property all: Spectrum#

The ‘all’ type equals the sum of all spectra.

Returns:

Spectrum object

combine(s: Spectra)[source]#

Merge types of two Spectra objects.

Parameters:

s (Spectra) – Other Spectra object

Return type:

Spectra

Returns:

Merged Spectra object

static from_dict(data: dict)[source]#

Load from nested dictionary first indexed by types and then by samples.

Parameters:

data (dict) – Dictionary of lists indexed by types

Return type:

Spectra

Returns:

Spectra object

static from_dataframe(data: DataFrame)[source]#

Load Spectra object from dataframe.

Parameters:

data (DataFrame) – Dataframe

Return type:

Spectra

Returns:

Spectra object

classmethod from_file(file: str)[source]#

Load object from file.

Parameters:

file (str) – Path to file, possibly URL

Return type:

Spectra

Returns:

Spectra object

static from_spectra(spectra: Dict[str, Spectrum])[source]#

Create from dict of spectrum objects indexed by type.

Parameters:

spectra (Dict[str, Spectrum]) – Dictionary of spectrum objects indexed by type

Return type:

Spectra

Returns:

Spectra object

static from_spectrum(sfs: Spectrum)[source]#

Create from single spectrum object. The type of the spectrum is set to ‘all’.

Parameters:

sfs (Spectrum) – Spectrum

Return type:

Spectra

Returns:

Spectra object

to_spectrum()[source]#

Convert to Spectrum object by summing over all types.

Return type:

Spectrum

Returns:

Spectrum object

plot(show: bool = True, file: str = None, title: str = None, log_scale: bool = False, use_subplots: bool = False, show_monomorphic: bool = False, kwargs_legend: dict = {'prop': {'size': 8}}, ax: plt.Axes = None)[source]#

Visualize spectra.

Parameters:
  • show (bool) – Whether to show the plot.

  • file (str) – File name to save the plot to.

  • title (str) – Plot title.

  • log_scale (bool) – Whether to use log scale on y-axis.

  • use_subplots (bool) – Whether to use subplots. Only for Python visualization backend.

  • show_monomorphic (bool) – Whether to show monomorphic sites.

  • kwargs_legend (dict) – Keyword arguments passed to plt.legend(). Only for Python visualization backend.

  • ax (plt.Axes) – Axes to plot on. Only for Python visualization backend and if use_subplots is False.

Return type:

plt.Axes

Returns:

Axes

drop_empty()[source]#

Remove types whose spectra have no counts.

Return type:

Spectra

Returns:

Spectra with non-empty types

drop_zero_entries()[source]#

Remove types whose spectra have some zero entries. Note that zero counts in the last entry, i.e. fixed derived alleles, are ignored.

Return type:

Spectra

Returns:

Spectra with non-zero entries

drop_sparse(n_polymorphic: int)[source]#

Remove types whose spectra have fewer than or equal to n_polymorphic polymorphic sites.

Return type:

Spectra

Returns:

Spectra
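
The three filters drop_empty, drop_zero_entries and drop_sparse differ only in their keep condition, which the following standalone sketch spells out on a plain dictionary. The function names mirror the methods for readability, but the bodies are an interpretation of the descriptions on this page and may differ from the library in detail.

```python
from typing import Dict, List

Counts = Dict[str, List[float]]


def drop_empty(data: Counts) -> Counts:
    """Keep types that have at least one non-zero entry."""
    return {t: c for t, c in data.items() if any(x != 0 for x in c)}


def drop_zero_entries(data: Counts) -> Counts:
    """Keep types with no zero entries, ignoring the last entry."""
    return {t: c for t, c in data.items() if all(x != 0 for x in c[:-1])}


def drop_sparse(data: Counts, n_polymorphic: int) -> Counts:
    """Keep types with strictly more than n_polymorphic polymorphic sites."""
    return {t: c for t, c in data.items() if sum(c[1:-1]) > n_polymorphic}


spectra = {
    "a": [100, 5, 3, 0],
    "b": [0, 0, 0, 0],
    "c": [50, 0, 2, 4],
    "d": [80, 1, 1, 0],
}
non_empty = drop_empty(spectra)
no_zeros = drop_zero_entries(spectra)
dense = drop_sparse(spectra, n_polymorphic=2)
```

Here type "c" has exactly 2 polymorphic sites, so it is removed by drop_sparse with a threshold of 2.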

rename(names: List[str])[source]#

Rename types.

Parameters:

names (List[str]) – New names

Return type:

Spectra

Returns:

Spectra with renamed types

prefix(prefix: str)[source]#

Prefix types, i.e. ‘type’ -> ‘prefix.type’ for all types.

Parameters:

prefix (str) – Prefix

Return type:

Spectra

Returns:

Spectra with prefixed types

reorder_levels(levels: List[int])[source]#

Reorder levels.

Parameters:

levels (List[int]) – New order of levels

Return type:

Spectra

Returns:

Spectra with reordered levels
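
Assuming, as in the prefix example, that levels are the dot-separated parts of a type name, prefixing and reordering can be sketched on the names alone. The helpers `prefix_types` and `reorder_type_levels` are hypothetical and operate on lists of strings, not on Spectra objects.

```python
from typing import List


def prefix_types(types: List[str], prefix: str) -> List[str]:
    """Turn 'type' into 'prefix.type' for every type name."""
    return [f"{prefix}.{name}" for name in types]


def reorder_type_levels(types: List[str], levels: List[int]) -> List[str]:
    """Rearrange the dot-separated levels of every type name."""
    reordered = []
    for name in types:
        parts = name.split(".")
        if sorted(levels) != list(range(len(parts))):
            raise ValueError("levels must be a permutation of all level indices")
        reordered.append(".".join(parts[i] for i in levels))
    return reordered


prefixed = prefix_types(["A", "B"], "neutral")
swapped = reorder_type_levels(prefixed, [1, 0])
```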

print()[source]#

Print spectra.

fold()[source]#

Fold spectra.

Returns:

Folded spectra
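
Folding can be illustrated on a single spectrum: the count for i derived alleles is added to the count for n - i, so that each class is indexed by the minor allele count. The sketch below keeps the original length and leaves zeros in the upper half; that layout is an assumption made for illustration, and both function names are hypothetical.

```python
from typing import List, Sequence


def fold_sfs(sfs: Sequence[int]) -> List[int]:
    """Fold an unfolded spectrum with n + 1 entries."""
    n = len(sfs) - 1
    folded = [0] * (n + 1)
    for i, count in enumerate(sfs):
        # Classes i and n - i are indistinguishable without polarization.
        folded[min(i, n - i)] += count
    return folded


def looks_folded(sfs: Sequence[int]) -> bool:
    """Check whether all entries above the midpoint are zero."""
    n = len(sfs) - 1
    return all(x == 0 for x in sfs[n // 2 + 1:])


unfolded = [1000, 40, 20, 10, 5]
folded = fold_sfs(unfolded)
```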

subsample(n: int, mode: Literal['random', 'probabilistic'] = 'probabilistic', seed: int | Generator = None)[source]#

Subsample spectra to a given sample size.

Warning

If using the ‘random’ mode, the SFS counts are cast to integers before subsampling, so this will only provide sensible results if the SFS counts are integers or are large enough to be approximated by integers. The ‘probabilistic’ mode does not have this limitation.

Parameters:
  • n (int) – Sample size

  • mode (Literal['random', 'probabilistic']) – Subsampling mode. Either ‘random’ or ‘probabilistic’.

  • seed (int | Generator) – Random state or seed. Only for ‘random’ mode.

Return type:

Spectra

Returns:

Subsampled spectra
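
One common way to subsample without casting to integers is hypergeometric projection: a site with i derived alleles among n samples contributes to class j among m samples with probability C(i, j) C(n - i, m - j) / C(n, m). The sketch below implements that expectation for one spectrum. Whether the ‘probabilistic’ mode uses exactly this formula is an assumption here, and `project_sfs` is a hypothetical name.

```python
from math import comb
from typing import List, Sequence


def project_sfs(sfs: Sequence[float], m: int) -> List[float]:
    """Project a spectrum of sample size n down to sample size m."""
    n = len(sfs) - 1
    if not 0 < m <= n:
        raise ValueError("m must satisfy 0 < m <= n")

    total = comb(n, m)
    projected = [0.0] * (m + 1)
    for i, count in enumerate(sfs):
        for j in range(m + 1):
            # math.comb returns 0 when k > n, so impossible classes drop out.
            weight = comb(i, j) * comb(n - i, m - j) / total
            projected[j] += count * weight
    return projected


projected = project_sfs([1000, 40, 20, 10, 5], m=2)
```

The projection works with fractional counts and preserves the total number of sites, since the weights for each i sum to one.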

resample(seed: int | Generator = None)[source]#

Resample SFS assuming independent Poisson counts.

Parameters:

seed (int | Generator) – Random state or seed

Return type:

Spectra

Returns:

Resampled spectra.
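
Resampling under independent Poisson counts means replacing each entry by a Poisson draw whose mean is the observed count. The standalone sketch below uses Knuth's multiplication method from the standard library's `random` module, which is only practical for small means; it accepts an integer seed only, and both function names are hypothetical rather than part of the library.

```python
import math
import random
from typing import Dict, List, Optional


def poisson_draw(mean: float, rng: random.Random) -> int:
    """Draw one Poisson variate (Knuth's method, suitable for small means)."""
    if mean <= 0:
        return 0
    threshold = math.exp(-mean)
    k = 0
    p = rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def resample_counts(data: Dict[str, List[float]], seed: Optional[int] = None) -> Dict[str, List[int]]:
    """Replace every SFS entry by an independent Poisson draw."""
    rng = random.Random(seed)
    return {t: [poisson_draw(c, rng) for c in counts] for t, counts in data.items()}


observed = {"neutral": [12, 5, 3, 0], "selected": [20, 2, 1, 1]}
first = resample_counts(observed, seed=42)
second = resample_counts(observed, seed=42)
```

Passing the same seed reproduces the same draw, and entries with an observed count of zero stay at zero.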

is_folded()[source]#

Check whether spectra are folded.

Return type:

Dict[str, bool]

Returns:

Dictionary of types and whether they are folded

sort_types()[source]#

Sort types alphabetically.

Return type:

Spectra

Returns:

Sorted spectra object