Configuration class
- class Config(polydfe_spectra_config: str = None, polydfe_init_file: str = None, polydfe_init_file_id: int = 1, sfs_neut: Spectra | Spectrum = None, sfs_sel: Spectra | Spectrum = None, intervals_del: Tuple[float, float, int] = (-100000000.0, -1e-05, 1000), intervals_ben: Tuple[float, float, int] = (1e-05, 10000.0, 1000), intervals_h: Tuple[float, float, int] = (0.0, 1.0, 21), h_callback: Callable[[ndarray], ndarray] = None, integration_mode: Literal['midpoint', 'quad'] = 'midpoint', linearized: bool = True, model: Parametrization | str = 'GammaExpParametrization', seed: int = 0, x0: Dict[str, Dict[str, float]] = {}, bounds: Dict[str, Tuple[float, float]] = {}, scales: Dict[str, Literal['lin', 'log', 'symlog']] = {}, loss_type: Literal['likelihood', 'L2'] = 'likelihood', opts_mle: dict = {}, method_mle: str = 'L-BFGS-B', n_runs: int = 10, fixed_params: Dict[str, Dict[str, float]] = None, shared_params: List[SharedParams] = [], covariates: List[Covariate] = [], do_bootstrap: bool = True, n_bootstraps: int = 100, n_bootstrap_retries: int = 2, parallelize: bool = True, **kwargs)
Bases: object
Configuration class to be used for BaseInference and JointInference.
- __init__(polydfe_spectra_config: str = None, polydfe_init_file: str = None, polydfe_init_file_id: int = 1, sfs_neut: Spectra | Spectrum = None, sfs_sel: Spectra | Spectrum = None, intervals_del: Tuple[float, float, int] = (-100000000.0, -1e-05, 1000), intervals_ben: Tuple[float, float, int] = (1e-05, 10000.0, 1000), intervals_h: Tuple[float, float, int] = (0.0, 1.0, 21), h_callback: Callable[[ndarray], ndarray] = None, integration_mode: Literal['midpoint', 'quad'] = 'midpoint', linearized: bool = True, model: Parametrization | str = 'GammaExpParametrization', seed: int = 0, x0: Dict[str, Dict[str, float]] = {}, bounds: Dict[str, Tuple[float, float]] = {}, scales: Dict[str, Literal['lin', 'log', 'symlog']] = {}, loss_type: Literal['likelihood', 'L2'] = 'likelihood', opts_mle: dict = {}, method_mle: str = 'L-BFGS-B', n_runs: int = 10, fixed_params: Dict[str, Dict[str, float]] = None, shared_params: List[SharedParams] = [], covariates: List[Covariate] = [], do_bootstrap: bool = True, n_bootstraps: int = 100, n_bootstrap_retries: int = 2, parallelize: bool = True, **kwargs)
Create config object.
- Parameters:
polydfe_spectra_config (str) – Path to polyDFE SFS config file.
polydfe_init_file (str) – Path to polyDFE init file.
polydfe_init_file_id (int) – ID of polyDFE init file.
sfs_neut (Spectra | Spectrum) – Neutral SFS. Note that we require monomorphic counts to be specified in order to infer the mutation rate.
sfs_sel (Spectra | Spectrum) – Selected SFS. Note that we require monomorphic counts to be specified in order to infer the mutation rate.
intervals_del (Tuple[float, float, int]) – (start, stop, n_interval) for deleterious population-scaled selection coefficients. The intervals will be log10-spaced. Decreasing the number of intervals to 100 provides nearly identical results while increasing speed, especially when precomputing across dominance coefficients.
intervals_ben (Tuple[float, float, int]) – Same as intervals_del but for positive selection coefficients. Decreasing the number of intervals to 100 provides nearly identical results while increasing speed, especially when precomputing across dominance coefficients.
intervals_h (Tuple[float, float, int]) – (start, stop, n_interval) for dominance coefficients, which are linearly spaced. This is only used when inferring dominance coefficients. Values of h between the edges will be interpolated linearly.
h_callback (Callable[[ndarray], ndarray]) – A function mapping the scalar parameter h and the array of selection coefficients S to dominance coefficients of the same shape, allowing models where h depends on S. The default is lambda h, S: np.full_like(S, h), keeping h constant. Expected allele counts for a given dominance value are obtained by linear interpolation between precomputed values in intervals_h. The inferred parameter is still named h, even if transformed by h_callback, and its bounds, scales, and initial values can be set via bounds, scales, and x0. The fitness of heterozygotes and mutant homozygotes is defined as 1 + 2hs and 1 + 2s, respectively.
integration_mode (Literal['midpoint', 'quad']) – Integration mode when computing expected SFS under semidominance. quad is not recommended.
linearized (bool) – Whether to discretize and cache the linearized integral mapping DFE to SFS or use scipy.integrate.quad in each call. False is not recommended.
model (Parametrization | str) – Parametrization of the DFE.
seed (int) – Seed for the random number generator. Use None for no seed.
x0 (Dict[str, Dict[str, float]]) – Dictionary of initial values in the form {type: {param: value}}.
bounds (Dict[str, Tuple[float, float]]) – Bounds for the optimization in the form {param: (lower, upper)}.
scales (Dict[str, Literal['lin', 'log', 'symlog']]) – Scales for the optimization in the form {param: scale}.
loss_type (Literal['likelihood', 'L2']) – Loss function to use.
opts_mle (dict) – Options for the optimization.
method_mle (str) – Method to use for optimization. See scipy.optimize.minimize for available methods.
n_runs (int) – Number of independent optimization runs out of which the best one is chosen. The first run will use the initial values if specified. Consider increasing this number if the optimization does not produce good results.
fixed_params (Dict[str, Dict[str, float]]) – Fixed parameters for the optimization.
shared_params (List[SharedParams]) – Shared parameters for the optimization.
covariates (List[Covariate]) – Covariates for the optimization.
do_bootstrap (bool) – Whether to do bootstrapping automatically.
n_bootstraps (int) – Number of bootstraps.
n_bootstrap_retries (int) – Number of optimization runs for each bootstrap sample. This parameter previously defined the number of retries per bootstrap sample when subsequent runs failed, but now it defines the total number of runs per bootstrap sample, taking the most likely one.
parallelize (bool) – Whether to parallelize the optimization.
kwargs – Additional keyword arguments which are ignored.
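The interval tuples and h_callback described above can be pictured with a small standard-library sketch. It is an illustration under stated assumptions, not fastDFE's implementation: the helper names (log_grid, lin_grid, constant_h, decaying_h, interpolation_weights) are hypothetical, the count n is treated as a number of grid points (the library may count intervals instead), and plain lists stand in for NumPy arrays.

```python
import bisect
import math


def log_grid(start, stop, n):
    """n points evenly spaced in log10(|S|) between start and stop (same sign)."""
    sign = -1.0 if start < 0 else 1.0
    lo, hi = math.log10(abs(start)), math.log10(abs(stop))
    return [sign * 10 ** (lo + (hi - lo) * i / (n - 1)) for i in range(n)]


def lin_grid(start, stop, n):
    """n points evenly spaced between start and stop."""
    return [start + (stop - start) * i / (n - 1) for i in range(n)]


def constant_h(h, S):
    """Pure-Python analogue of the documented default: the same h for every S."""
    return [h for _ in S]


def decaying_h(h, S, scale=100.0):
    """Hypothetical callback: dominance shrinks as |S| grows."""
    return [h / (1.0 + abs(s) / scale) for s in S]


def interpolation_weights(h, grid):
    """Left grid index and (left, right) linear weights for h inside the grid."""
    if not grid[0] <= h <= grid[-1]:
        raise ValueError("h lies outside the precomputed grid")
    # Left neighbour of h, clamped so that i + 1 is always a valid index.
    i = min(max(bisect.bisect_right(grid, h) - 1, 0), len(grid) - 2)
    w_right = (h - grid[i]) / (grid[i + 1] - grid[i])
    return i, 1.0 - w_right, w_right


s_del = log_grid(-1e8, -1e-5, 100)  # 100 points instead of the default 1000
s_ben = log_grid(1e-5, 1e4, 100)
h_grid = lin_grid(0.0, 1.0, 21)     # mirrors the default intervals_h

h_per_s = decaying_h(0.5, s_del)    # one dominance value per S
weights = [interpolation_weights(h, h_grid) for h in h_per_s]
i, w_left, w_right = interpolation_weights(0.33, h_grid)
```

In this sketch a callback only has to return one dominance value per selection coefficient, each lying inside the range covered by the h grid. The interpolation step then blends the two neighbouring precomputed grid values with the returned weights.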
- update(**kwargs)
Update config with given data.
- Parameters:
kwargs – Data to update.
- Return type:
Config
- Returns:
Updated config.
- parse_polydfe_init_file(file: str, id: int = 1, type='all')
Parse polyDFE init file. This will define the initial parameters and which ones will be held fixed during the optimization.
- Parameters:
type – Type of parameters to parse for.
id (int) – ID of the init file.
file (str) – Path to the init file.
- create_polydfe_init_file(file: str, n: int, type: str = 'all')
Create an init file for polyDFE.
- Parameters:
type (str) – Type to use for the init file.
n (int) – SFS sample size.
file (str) – Path to the init file to be created.
- parse_polydfe_sfs_config(file: str)
Parse frequency spectra and mutational target site from polyDFE configuration file.
- Parameters:
file (str) – Path to the polyDFE config file.
- create_polydfe_sfs_config(file: str)
Create an SFS config file for polyDFE.
- Parameters:
file (str) – Path to the SFS config file to be created.
- to_dict()
Represent config as dictionary.
- Return type:
dict
- Returns:
Dictionary representation of config.
- static from_dict(data: dict)
Load config from dictionary.
- Return type:
Config
- Returns:
Config object.
- static from_json(data: str)
Load config from JSON str.
- Parameters:
data (str) – JSON string.
- Return type:
Config
- Returns:
Config object.
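The dictionary and JSON loaders follow a common round-trip pattern, which the sketch below illustrates with a hypothetical stand-in class. ToyConfig is not fastDFE's Config; it only borrows the method names to_dict, from_dict, and from_json and three of the documented field names, and the parameter name used in bounds is made up for the example.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a config object; not fastDFE's Config."""
    n_runs: int = 10
    do_bootstrap: bool = True
    bounds: dict = field(default_factory=dict)

    def to_dict(self) -> dict:
        # Only built-in types are returned, so the result is JSON-serializable.
        return asdict(self)

    @staticmethod
    def from_dict(data: dict) -> "ToyConfig":
        return ToyConfig(**data)

    @staticmethod
    def from_json(data: str) -> "ToyConfig":
        # JSON has no tuple type, so tuples would come back as lists.
        return ToyConfig.from_dict(json.loads(data))


# "S_d" is a hypothetical parameter name chosen for illustration.
original = ToyConfig(n_runs=20, bounds={"S_d": [-1e5, -1e-2]})
restored = ToyConfig.from_json(json.dumps(original.to_dict()))
```

Bounds are stored as lists here because a JSON round trip turns tuples into lists. Whether the real class converts them back is not stated in this reference.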
- static from_yaml(data: str)
Load config from YAML str.
- Parameters:
data (str) – YAML string.
- Return type:
Config
- Returns:
Config object.