Spectrum class

class Spectrum(data: Sequence[float])

Bases: Iterable

Class for holding and manipulating a site-frequency spectrum.

__init__(data: Sequence[float])

Initialize spectrum.

Parameters:

data (Sequence[float]) – SFS counts

property n: int

The sample size.

Returns:

Sample size

property n_sites: float

The total number of sites.

Returns:

Total number of sites

property n_div: float

Number of divergence counts.

Returns:

Number of divergence counts

property has_div: bool

Whether n_div was specified.

Returns:

Whether n_div was specified

property n_monomorphic: float

Number of monomorphic sites.

Returns:

Number of monomorphic sites

property polymorphic: ndarray

Get the polymorphic counts.

Returns:

Polymorphic counts

property n_polymorphic: ndarray

The total number of polymorphic sites, i.e. the sum of the polymorphic counts.

Returns:

Number of polymorphic sites
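The properties above can be read as simple views on the underlying count array. The following is a minimal sketch in plain Python, not fastdfe code: `summarize` is a hypothetical helper, and it assumes the usual SFS layout in which entry i counts the sites with i derived alleles among n samples, so that the first and last entries hold the monomorphic classes and the last entry is the divergence count.

```python
# Hypothetical helper (not part of fastdfe): derive the summary quantities
# from an SFS stored as n + 1 counts, where index i is the number of
# derived alleles. The layout is an assumption stated in the text above.
def summarize(sfs):
    polymorphic = list(sfs[1:-1])            # frequency classes 1 .. n - 1
    return {
        "n": len(sfs) - 1,                   # sample size
        "n_sites": sum(sfs),                 # all sites, monomorphic included
        "n_div": sfs[-1],                    # assumed: fixed derived sites
        "n_monomorphic": sfs[0] + sfs[-1],   # assumed: both end classes
        "polymorphic": polymorphic,
        "n_polymorphic": sum(polymorphic),
    }


summary = summarize([1000, 12, 6, 4, 3, 2])
print(summary)
```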

to_list()

Convert to list.

Return type:

list

Returns:

SFS counts

to_spectra()

Convert to Spectra object.

Return type:

Spectra

Returns:

Spectra object

to_file(file: str)

Save object to file.

Parameters:

file (str) – File name

static from_file(file: str)

Load object from file.

Parameters:

file (str) – File name

Return type:

Spectrum

Returns:

Spectrum object

to_numpy()

Convert to array.

Return type:

ndarray

Returns:

SFS counts

property theta: float

Calculate site-wise theta using Watterson’s estimator.

Returns:

Site-wise theta
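For orientation, Watterson's estimator divides the number of segregating sites by the harmonic number a_n = 1 + 1/2 + ... + 1/(n - 1); dividing once more by the total number of sites gives a per-site value. The sketch below is a plain-Python illustration under that textbook definition. The function name `watterson_theta` is hypothetical, and the exact normalization used by the library is an assumption here.

```python
# Hypothetical helper (not part of fastdfe): per-site Watterson's theta
# for an SFS with n + 1 entries (index i = number of derived alleles).
def watterson_theta(sfs):
    n = len(sfs) - 1
    segregating = sum(sfs[1:-1])                  # number of polymorphic sites
    harmonic = sum(1 / i for i in range(1, n))    # a_n = sum_{i=1}^{n-1} 1/i
    # Assumption: "site-wise" means dividing by the total number of sites.
    return segregating / harmonic / sum(sfs)


theta = watterson_theta([1000, 12, 6, 4, 3, 2])
print(theta)
```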

fold()

Fold the site-frequency spectrum.

Return type:

Spectrum

Returns:

Folded spectrum
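Folding merges each derived-allele class i with its mirror class n - i, which is the usual step when the ancestral state is unknown. The sketch below illustrates this in plain Python. `fold_counts` is a hypothetical name, and keeping the output at length n + 1 with zeros in the upper half is an assumption about the representation.

```python
# Hypothetical helper (not part of fastdfe): fold an SFS by adding the
# count of class i to class min(i, n - i). The result keeps n + 1 entries
# with zeros in the upper half (an assumed convention).
def fold_counts(sfs):
    n = len(sfs) - 1
    folded = [0.0] * (n + 1)
    for i, count in enumerate(sfs):
        folded[min(i, n - i)] += count
    return folded


folded = fold_counts([1000, 12, 6, 4, 3, 2])
print(folded)
```

Folding leaves the total number of sites unchanged; only the assignment of counts to classes differs.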

subsample(n: int, mode: Literal['random', 'probabilistic'] = 'probabilistic', seed: int | None = None)

Subsample spectrum to a given sample size.

Warning

In ‘random’ mode, the SFS counts are cast to integers before subsampling, so the results are only sensible if the counts are integers or large enough to be well approximated by integers. The ‘probabilistic’ mode does not have this limitation.

Parameters:
  • n (int) – Sample size

  • mode (Literal['random', 'probabilistic']) – Subsampling mode. Either ‘random’ or ‘probabilistic’.

  • seed (int | None) – Seed for the random number generator. Only used in ‘random’ mode.

Return type:

Spectrum

Returns:

Subsampled spectrum
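A common way to subsample an SFS without random draws is hypergeometric projection: a site with i derived alleles among n samples contributes to class j of the smaller sample with probability C(i, j) C(n - i, m - j) / C(n, m). The sketch below shows this technique in plain Python. That the ‘probabilistic’ mode works exactly this way is an assumption, and `subsample_probabilistic` is a hypothetical name.

```python
from math import comb


# Hypothetical helper (not part of fastdfe): project an SFS from sample
# size n down to m using hypergeometric sampling probabilities. Works on
# fractional counts because nothing is cast to integers.
def subsample_probabilistic(sfs, m):
    n = len(sfs) - 1
    if not 0 < m <= n:
        raise ValueError("m must satisfy 0 < m <= n")
    projected = [0.0] * (m + 1)
    total = comb(n, m)
    for i, count in enumerate(sfs):
        # j derived alleles in the subsample is only possible in this range
        for j in range(max(0, m - (n - i)), min(i, m) + 1):
            projected[j] += count * comb(i, j) * comb(n - i, m - j) / total
    return projected


full = [1000, 12, 6, 4, 3, 2]
sub = subsample_probabilistic(full, 3)
print(sub)
```

The projection preserves the total number of sites, since the probabilities for each original class sum to one.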

resample(seed: int = None)

Resample SFS assuming independent Poisson counts.

Parameters:

seed (int) – Seed for random number generator.

Return type:

Spectrum

Returns:

Resampled spectrum.
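Resampling under independent Poisson counts means drawing each entry from a Poisson distribution whose mean is the observed count, which is a simple form of parametric bootstrap. The sketch below uses only the standard library. The names `poisson_draw` and `resample_counts` are hypothetical, and the sampler (Knuth's multiplication method) is only suitable for modest means, roughly below 700, because it relies on exp(-mean) not underflowing.

```python
import math
import random


# Hypothetical helper: draw one Poisson variate with Knuth's
# multiplication method. Only suitable for modest means.
def poisson_draw(mean, rng):
    if mean <= 0:
        return 0
    threshold = math.exp(-mean)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


# Hypothetical helper (not part of fastdfe): replace every SFS entry by
# an independent Poisson draw centred on the observed count.
def resample_counts(sfs, seed=None):
    rng = random.Random(seed)
    return [poisson_draw(count, rng) for count in sfs]


print(resample_counts([50, 12, 6, 4, 3, 2], seed=1))
```

Passing the same seed reproduces the same draw, which is convenient when bootstrap replicates need to be repeatable.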

is_folded()

Check if the site-frequency spectrum is folded.

Return type:

bool

Returns:

True if folded, False otherwise
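One plausible way to perform such a check is to test whether every class above n // 2 is empty. The sketch below assumes this criterion and a spectrum of n + 1 entries; `looks_folded` is a hypothetical name and the library's actual test may differ.

```python
# Hypothetical helper (not part of fastdfe): treat an SFS as folded when
# all classes above n // 2 are zero. This criterion is an assumption.
def looks_folded(sfs):
    n = len(sfs) - 1
    return all(count == 0 for count in sfs[n // 2 + 1:])


print(looks_folded([1002, 15, 10, 0, 0, 0]))
print(looks_folded([1000, 12, 6, 4, 3, 2]))
```

Under this criterion an unfolded spectrum that happens to have no high-frequency derived alleles cannot be told apart from a folded one.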

normalize()

Normalize SFS so that all non-monomorphic counts add up to 1.

Return type:

Spectrum

Returns:

Normalized spectrum
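As an illustration, normalization divides each polymorphic count by the sum of all polymorphic counts. The sketch below is plain Python with a hypothetical `normalize_counts` function. Leaving the two monomorphic end entries unchanged is an assumption, since the description only constrains the non-monomorphic counts.

```python
# Hypothetical helper (not part of fastdfe): rescale the polymorphic
# classes so they sum to 1. The monomorphic end entries are left as they
# are, which is an assumption.
def normalize_counts(sfs):
    total = sum(sfs[1:-1])
    if total == 0:
        raise ValueError("no polymorphic sites to normalize")
    return [sfs[0]] + [count / total for count in sfs[1:-1]] + [sfs[-1]]


normalized = normalize_counts([1000, 12, 6, 4, 3, 2])
print(normalized)
```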

copy()

Copy the spectrum.

Return type:

Spectrum

Returns:

Copy of the spectrum

static from_polymorphic(data: Sequence)

Create Spectrum from polymorphic counts only.

Parameters:

data (Sequence) – Polymorphic counts

Return type:

Spectrum

Returns:

Spectrum

static from_list(data: Sequence)

Create Spectrum from list.

Parameters:

data (Sequence) – SFS counts

Return type:

Spectrum

Returns:

Spectrum

static from_polydfe(polymorphic: Sequence, n_sites: float, n_div: float)

Create Spectrum from a polyDFE specification, which treats the number of mutational target sites and the divergence counts separately.

Parameters:
  • polymorphic (Sequence) – Polymorphic counts

  • n_sites (float) – Total number of sites

  • n_div (float) – Number of divergence counts

Return type:

Spectrum

Returns:

Spectrum
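To make the relationship between the three arguments concrete, the sketch below assembles a full count array from them in plain Python. It assumes that `n_sites` counts every site, so the ancestral monomorphic class is whatever remains after subtracting the polymorphic and divergence counts. That bookkeeping and the name `counts_from_polydfe` are illustrative assumptions, not the library's confirmed behaviour.

```python
# Hypothetical helper (not part of fastdfe): build an SFS of the form
# [ancestral monomorphic, polymorphic..., divergence] from polyDFE-style
# inputs. Assumes n_sites includes polymorphic and divergent sites.
def counts_from_polydfe(polymorphic, n_sites, n_div):
    n_ancestral = n_sites - sum(polymorphic) - n_div
    if n_ancestral < 0:
        raise ValueError("n_sites is smaller than the sites accounted for")
    return [n_ancestral] + list(polymorphic) + [n_div]


counts = counts_from_polydfe([12, 6, 4, 3], n_sites=1027, n_div=2)
print(counts)
```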

plot(show: bool = True, file: str = None, title: str = None, log_scale: bool = False, show_monomorphic: bool = False, kwargs_legend: dict = {'prop': {'size': 8}}, ax: plt.Axes = None)

Plot spectrum.

Parameters:
  • show (bool) – Whether to show plot.

  • file (str) – File to save plot to.

  • title (str) – Title of plot.

  • log_scale (bool) – Whether to use log scale on y-axis.

  • show_monomorphic (bool) – Whether to show monomorphic counts.

  • kwargs_legend (dict) – Keyword arguments passed to plt.legend(). Only for Python visualization backend.

  • ax (plt.Axes) – Axes to plot on. Only for Python visualization backend.

Return type:

plt.Axes

Returns:

Axes

static standard_kingman(n: int, n_monomorphic: int = 0)

Get standard Kingman SFS.

Parameters:
  • n (int) – Sample size

  • n_monomorphic (int) – Number of monomorphic sites.

Return type:

Spectrum

Returns:

Standard Kingman SFS
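Under the standard Kingman coalescent with constant population size and neutral mutations, the expected number of sites with i derived alleles is proportional to 1/i. The sketch below builds such a spectrum in plain Python. The scaling (a proportionality constant of 1), the zero divergence entry and the name `kingman_counts` are assumptions made for illustration.

```python
# Hypothetical helper (not part of fastdfe): expected neutral SFS under
# the standard Kingman coalescent, with class i proportional to 1 / i.
# A scaling constant of 1 and a zero last entry are assumptions.
def kingman_counts(n, n_monomorphic=0):
    if n < 2:
        raise ValueError("n must be at least 2")
    return [n_monomorphic] + [1 / i for i in range(1, n)] + [0]


kingman = kingman_counts(5)
print(kingman)
```

With this scaling the polymorphic counts sum to the harmonic number a_n, so Watterson's estimator recovers a total (not per-site) theta of 1.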