Working with SFS
Subtypes
fastDFE offers utilities for handling and manipulating site-frequency spectra (SFS), especially when there are multiple types. The Spectra class holds spectra of multiple types and provides methods and properties for creating, accessing, and manipulating them. We usually obtain already stratified spectra from Parser, but we can also create them manually.
import fastdfe as fd
import matplotlib.pyplot as plt
# create spectra for two subtypes: subtype1 has two types, subtype2 has one
spectra = fd.Spectra.from_spectra({
    "subtype1.type1": fd.Spectrum.standard_kingman(10) * 1,
    "subtype1.type2": fd.Spectrum.standard_kingman(10) * 2,
    "subtype2.type1": fd.Spectrum.standard_kingman(10) * 3,
})
# plot spectra
spectra.plot();
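The spectra above are standard Kingman spectra scaled by a constant. As a minimal sketch of what such a spectrum contains, the snippet below builds the usual neutral expectation, in which the number of sites with derived allele count i is proportional to 1/i. The function name `kingman_sfs` and the `theta` parameter are hypothetical and not part of fastDFE, and the assumption that the two monomorphic bins are left at zero is ours.

```python
def kingman_sfs(n, theta=1.0):
    """Neutral expected SFS for sample size n (hypothetical helper, not fastDFE API).

    Bins 0 and n are the monomorphic classes and are left at zero here
    (an assumption); bin i for 0 < i < n holds theta / i.
    """
    return [0.0] + [theta / i for i in range(1, n)] + [0.0]


# three spectra that differ only in their scaling, as in the example above
sfs_1 = kingman_sfs(10, theta=1.0)
sfs_2 = kingman_sfs(10, theta=2.0)
sfs_3 = kingman_sfs(10, theta=3.0)

print(len(sfs_1))   # n + 1 bins
print(sfs_1[1:4])   # singletons, doubletons, tripletons
```

Scaling the whole spectrum by a constant, as done with `* 2` and `* 3` above, changes the number of segregating sites but not the shape of the spectrum.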
Indexing a Spectra object by the full name of a type returns the corresponding Spectrum object.
sfs: fd.Spectrum = spectra["subtype1.type1"]
sfs.plot();
We can also use wildcards to access multiple types at once.
spectra["subtype1.*"].plot();
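To make the wildcard lookup more concrete, here is a rough sketch of shell-style pattern matching over dot-separated type names, using a plain dictionary and the standard-library `fnmatch` module. It only illustrates the idea; `select` and the toy data are hypothetical, and fastDFE's own matching may be implemented differently.

```python
from fnmatch import fnmatchcase

# toy stand-in for a Spectra object: type name -> list of counts
toy_spectra = {
    "subtype1.type1": [0, 1, 2, 0],
    "subtype1.type2": [0, 2, 4, 0],
    "subtype2.type1": [0, 3, 6, 0],
}


def select(spectra, pattern):
    """Return all entries whose name matches the shell-style pattern."""
    return {name: counts for name, counts in spectra.items()
            if fnmatchcase(name, pattern)}


# all types belonging to subtype1
subtype1 = select(toy_spectra, "subtype1.*")
# type1 across all subtypes
type1 = select(toy_spectra, "*.type1")

print(sorted(subtype1))
print(sorted(type1))
```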
Grouping
To get rid of the subtypes, we can merge the spectra by grouping them on one level of their dot-separated names; here we group on level 1, the type.
spectra.merge_groups(1).plot();
All subtypes for each type are merged into a single spectrum by adding them up.
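The merging described above can be sketched with plain dictionaries: spectra that share the same name component at the chosen level are summed bin by bin. This is only an illustration under the assumption that levels are counted from zero along the dot-separated name; `merge_by_level` and the toy data are hypothetical and not fastDFE code.

```python
def merge_by_level(spectra, level):
    """Sum all spectra that share the same name component at `level`."""
    merged = {}
    for name, counts in spectra.items():
        key = name.split(".")[level]
        if key in merged:
            # add this spectrum to the running sum, bin by bin
            merged[key] = [a + b for a, b in zip(merged[key], counts)]
        else:
            merged[key] = list(counts)
    return merged


toy_spectra = {
    "subtype1.type1": [0, 1, 2, 0],
    "subtype1.type2": [0, 2, 4, 0],
    "subtype2.type1": [0, 3, 6, 0],
}

# group on level 1 (the type), summing over subtypes
by_type = merge_by_level(toy_spectra, 1)
print(by_type)
```

Here the two `type1` spectra from `subtype1` and `subtype2` are added together, while `type2` is carried over unchanged because it occurs in only one subtype.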
Serialization
We can also save the spectra to a file and restore them again.
spectra.to_file("out/spectra.csv")
spectra2 = fd.Spectra.from_file("out/spectra.csv")
Prefixing
Here we prefix the type names of each Spectra object with a string to tell them apart, and then combine both into a single Spectra object.
spectra.prefix('original').combine(spectra2.prefix('restored')).plot();
For a complete reference of the available methods and properties, see Spectra and Spectrum.
Folded spectra
Spectrum and Spectra objects can also be folded, which collapses the bins for high derived allele counts onto the bins for the complementary low counts, so that only minor allele counts remain. DFE inference based on folded spectra has less information to work with, which is particularly noticeable when beneficial mutations are included. However, folded spectra are easier to obtain and are robust to misspecification of the ancestral state, which is often unknown. DFE inference is based on folded spectra whenever spectrum.is_folded is True.
# create subplots
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(7, 3))
# fold spectra object
spectra.fold().plot(ax=ax1)
# fold spectrum object
sfs.fold().plot(ax=ax2);
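As a minimal sketch of what folding does to the counts, the snippet below maps each bin i of an unfolded spectrum with sample size n onto bin min(i, n - i). Keeping the original length and leaving the upper bins at zero is an assumption made for illustration; `fold_counts` is a hypothetical helper and not the fastDFE implementation.

```python
def fold_counts(sfs):
    """Fold an unfolded SFS given as a list of n + 1 counts."""
    n = len(sfs) - 1  # sample size
    folded = [0.0] * len(sfs)
    for i, count in enumerate(sfs):
        # derived allele count i and n - i share the same minor allele count
        folded[min(i, n - i)] += count
    return folded


unfolded = [5, 4, 3, 2, 1]  # toy counts for n = 4
folded = fold_counts(unfolded)

print(folded)
```

The total number of sites is unchanged by folding; only the distinction between a derived allele at count i and at count n - i is lost, which is why the ancestral state no longer needs to be known.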