DenseAdditiveDominanceLinearGenomicModel#
- class pybrops.model.gmod.DenseAdditiveDominanceLinearGenomicModel.DenseAdditiveDominanceLinearGenomicModel(beta, u_misc, u_a, u_d, trait=None, model_name=None, hyperparams=None, **kwargs)[source]#
Bases: DenseAdditiveLinearGenomicModel, AdditiveDominanceLinearGenomicModel
The DenseAdditiveDominanceLinearGenomicModel class represents a Multivariate Multiple Linear Regression model.
A Multivariate Multiple Linear Regression model is defined as:
\[\mathbf{Y} = \mathbf{XB} + \mathbf{ZU} + \mathbf{E}\]
Where:
\(\mathbf{Y}\) is a matrix of response variables of shape (n,t).
\(\mathbf{X}\) is a matrix of fixed effect predictors of shape (n,q).
\(\mathbf{B}\) is a matrix of fixed effect regression coefficients of shape (q,t).
\(\mathbf{Z}\) is a matrix of random effect predictors of shape (n,p).
\(\mathbf{U}\) is a matrix of random effect regression coefficients of shape (p,t).
\(\mathbf{E}\) is a matrix of error terms of shape (n,t).
Block matrix modifications to \(\mathbf{Z}\) and \(\mathbf{U}\):
\(\mathbf{Z}\) and \(\mathbf{U}\) can be decomposed into block matrices pertaining to different sets of effects:
\[\mathbf{Z} = \begin{bmatrix} \mathbf{Z_{misc}} & \mathbf{Z_{a}} & \mathbf{Z_{d}} \end{bmatrix}\]
Where:
\(\mathbf{Z_{misc}}\) is a matrix of miscellaneous random effect predictors of shape (n,p_misc).
\(\mathbf{Z_{a}}\) is a matrix of additive genomic marker predictors of shape (n,p_a).
\(\mathbf{Z_{d}}\) is a matrix of dominance genomic marker predictors of shape (n,p_d).
\[\begin{split}\mathbf{U} = \begin{bmatrix} \mathbf{U_{misc}} \\ \mathbf{U_{a}} \\ \mathbf{U_{d}} \end{bmatrix}\end{split}\]
Where:
\(\mathbf{U_{misc}}\) is a matrix of miscellaneous random effects of shape (p_misc,t).
\(\mathbf{U_{a}}\) is a matrix of additive genomic marker effects of shape (p_a,t).
\(\mathbf{U_{d}}\) is a matrix of dominance genomic marker effects of shape (p_d,t).
Shape definitions:
n is the number of individuals.
q is the number of fixed effect predictors (e.g. environments).
p is the number of random effect predictors.
p_misc is the number of miscellaneous random effect predictors.
p_a is the number of additive genomic marker predictors.
p_d is the number of dominance genomic marker predictors.
The sum of p_misc, p_a, and p_d equals p.
t is the number of traits.
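As a rough illustration of the block structure above, the following NumPy sketch builds small matrices with the documented shapes and checks that the stacked product ZU equals the sum of the per-block products. The sizes, the random data, and the heterozygote-indicator coding used for Z_d are assumptions made for this example; nothing here calls PyBrOpS.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n, q, t = 6, 2, 3
p_misc, p_a, p_d = 1, 4, 4

X = rng.normal(size=(n, q))                    # fixed effect predictors
B = rng.normal(size=(q, t))                    # fixed effect coefficients
Z_misc = rng.normal(size=(n, p_misc))          # miscellaneous predictors
Z_a = rng.integers(0, 3, size=(n, p_a)).astype(float)  # additive coding {0,1,2}
Z_d = (Z_a == 1.0).astype(float)               # assumed dominance coding {0,1}
U_misc = rng.normal(size=(p_misc, t))
U_a = rng.normal(size=(p_a, t))
U_d = rng.normal(size=(p_d, t))
E = rng.normal(scale=0.1, size=(n, t))         # error terms

# Stack the blocks: Z is (n,p) and U is (p,t) with p = p_misc + p_a + p_d.
Z = np.hstack([Z_misc, Z_a, Z_d])
U = np.vstack([U_misc, U_a, U_d])

Y = X @ B + Z @ U + E
blockwise = Z_misc @ U_misc + Z_a @ U_a + Z_d @ U_d
```

Here `Z @ U` and `blockwise` agree up to floating point error, which is why the random effects can be stored and queried one block at a time.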
Constructor for DenseAdditiveDominanceLinearGenomicModel class.
- Parameters:
beta (numpy.ndarray) – A float64 fixed effect regression coefficient matrix of shape (q,t).
Where:
q is the number of fixed effect predictors (e.g. environments).
t is the number of traits.
u_misc (numpy.ndarray, None) – A float64 random effect regression coefficient matrix of shape (p_misc,t) containing miscellaneous effects.
Where:
p_misc is the number of miscellaneous random effect predictors.
t is the number of traits.
If None, then set to an empty array of shape (0,t).
u_a (numpy.ndarray, None) – A float64 random effect regression coefficient matrix of shape (p_a,t) containing additive marker effects.
Where:
p_a is the number of additive marker effect predictors.
t is the number of traits.
If None, then set to an empty array of shape (0,t).
u_d (numpy.ndarray, None) – A float64 random effect regression coefficient matrix of shape (p_d,t) containing dominance marker effects.
Where:
p_d is the number of dominance marker effect predictors. Must be equal to p_a.
t is the number of traits.
If None, then set to a zero array of shape (p_a,t).
trait (numpy.ndarray, None) – An object_ array of trait names of shape (t,).
Where:
t is the number of traits.
model_name (str, None) – Name of the model.
hyperparams (dict, None) – Model parameters.
kwargs (dict) – Used for cooperative inheritance. Dictionary passing unused arguments to the parent class constructor.
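The None defaults described above can be sketched in plain NumPy. The helper name `fill_effect_defaults` is hypothetical and is not part of PyBrOpS; it only mirrors the documented behaviour (empty arrays for u_misc and u_a, a zero array matching u_a for u_d) under the assumption that t is taken from beta.

```python
import numpy as np

def fill_effect_defaults(beta, u_misc=None, u_a=None, u_d=None):
    """Hypothetical helper mirroring the documented None defaults."""
    beta = np.asarray(beta, dtype=np.float64)
    t = beta.shape[1]  # number of traits, taken from beta (assumption)
    # None -> empty array of shape (0,t)
    u_misc = np.empty((0, t)) if u_misc is None else np.asarray(u_misc, dtype=np.float64)
    u_a = np.empty((0, t)) if u_a is None else np.asarray(u_a, dtype=np.float64)
    # None -> zero array of shape (p_a,t)
    u_d = np.zeros((u_a.shape[0], t)) if u_d is None else np.asarray(u_d, dtype=np.float64)
    if u_d.shape != u_a.shape:
        raise ValueError("u_d must have the same shape as u_a (p_d == p_a)")
    return beta, u_misc, u_a, u_d

beta = np.array([[1.0, 2.0]])                            # q = 1, t = 2
u_a = np.array([[0.5, -0.1], [0.0, 0.3], [-0.2, 0.4]])   # p_a = 3
beta, u_misc, u_a, u_d = fill_effect_defaults(beta, u_a=u_a)
```

With these inputs, `u_misc` has shape (0,2) and `u_d` is a (3,2) array of zeros, so a model built this way carries no dominance signal.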
Methods
bulmer – Calculate the Bulmer effect.
bulmer_numpy – Calculate the Bulmer effect.
copy – Make a shallow copy of the DenseAdditiveDominanceLinearGenomicModel.
daavail – Determine whether a deleterious allele is available in the present taxa.
dacount – Calculate the deleterious allele count across all taxa.
dafixed – Determine whether a deleterious allele is fixed across all taxa.
dafreq – Calculate the deleterious allele frequency across all taxa.
dapoly – Determine whether a deleterious allele is polymorphic across all taxa.
deepcopy – Make a deep copy of the DenseAdditiveDominanceLinearGenomicModel.
faavail – Determine whether a favorable allele is polymorphic or fixed across all taxa.
facount – Calculate the favorable allele count across all taxa.
fafixed – Determine whether a favorable allele is fixed across all taxa.
fafreq – Calculate the favorable allele frequency across all taxa.
fapoly – Determine whether a favorable allele is polymorphic across all taxa.
fit – Fit a dense, additive + dominance linear genomic model.
fit_numpy – Fit a dense, additive + dominance linear genomic model.
from_csv_dict – Read a DenseAdditiveDominanceLinearGenomicModel from a set of CSV files specified by values in a dict.
from_hdf5 – Read a DenseAdditiveDominanceLinearGenomicModel from an HDF5 file.
from_pandas_dict – Read an object from a dict of pandas.DataFrame.
gebv – Calculate genomic estimated breeding values.
gebv_numpy – Calculate genomic estimated breeding values.
gegv – Calculate genomic estimated genotypic values.
gegv_numpy – Calculate genomic estimated genotypic values.
lsl – Calculate the lower selection limit for a population.
lsl_numpy – Calculate the lower selection limit for a population.
nafixed – Determine whether a neutral allele is fixed across all taxa.
napoly – Determine whether a neutral allele is polymorphic across all taxa.
predict – Predict breeding values.
predict_numpy – Predict breeding values.
score – Return the coefficient of determination R**2 of the prediction.
score_numpy – Return the coefficient of determination R**2 of the prediction.
to_csv_dict – Export a DenseAdditiveDominanceLinearGenomicModel to a set of CSV files specified by values in a dict.
to_hdf5 – Write a DenseAdditiveDominanceLinearGenomicModel to an HDF5 file.
to_pandas_dict – Export a DenseAdditiveDominanceLinearGenomicModel to a dict of pandas.DataFrame.
usl – Calculate the upper selection limit for a population.
usl_numpy – Calculate the upper selection limit for a population.
var_A – Calculate the population additive genetic variance.
var_A_numpy – Calculate the population additive genetic variance.
var_G – Calculate the population genetic variance.
var_G_numpy – Calculate the population genetic variance.
var_a – Calculate the population additive genic variance.
var_a_numpy – Calculate the population additive genic variance.
Attributes
beta – Fixed effect regression coefficients.
hyperparams – Model hyperparameters.
model_name – Name of the model.
nexplan – Number of explanatory variables required by the model.
nexplan_beta – Number of fixed effect explanatory variables required by the model.
nexplan_u – Number of random effect explanatory variables required by the model.
nexplan_u_a – Number of additive genomic marker explanatory variables required by the model.
nexplan_u_d – Number of dominance genomic marker explanatory variables required by the model.
nexplan_u_misc – Number of miscellaneous random effect explanatory variables required by the model.
nparam – Number of model parameters.
nparam_beta – Number of fixed effect parameters.
nparam_u – Number of random effect parameters.
nparam_u_a – Number of additive genomic marker parameters.
nparam_u_d – Number of dominance genomic marker parameters.
nparam_u_misc – Number of miscellaneous random effect parameters.
ntrait – Number of traits predicted by the model.
trait – Names of the traits predicted by the model.
u – Random effect regression coefficients.
u_a – Additive genomic marker effects.
u_d – Dominance genomic marker effects.
u_misc – Miscellaneous random effect regression coefficients.
- property beta: ndarray#
Fixed effect regression coefficients.
- bulmer(gtobj, ploidy=None, **kwargs)#
Calculate the Bulmer effect.
- Parameters:
gtobj (GenotypeMatrix, numpy.ndarray) – An object containing genotype data. Must be a matrix of genotype values.
ploidy (int) –
Ploidy of the species. If ploidy is None:
If gtobj is a GenotypeMatrix, then get ploidy from GenotypeMatrix.
If gtobj is a numpy.ndarray, then assumed to be 2 (diploid).
kwargs (dict) – Additional keyword arguments.
- Returns:
out – An array of shape (t,) containing population Bulmer effect statistics. If the additive genic variance is zero, NaNs are produced.
Where:
t is the number of traits.
- Return type:
numpy.ndarray
- bulmer_numpy(Z, p, ploidy=2, **kwargs)#
Calculate the Bulmer effect.
- Parameters:
Z (numpy.ndarray) – A matrix of genotypes.
p (numpy.ndarray) – A vector of genotype allele frequencies of shape (p,).
ploidy (int) – Ploidy of the species.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – An array of shape (t,) containing population Bulmer effect statistics. If the additive genic variance is zero, NaNs are produced.
Where:
t is the number of traits.
- Return type:
numpy.ndarray
- copy()[source]#
Make a shallow copy of the DenseAdditiveDominanceLinearGenomicModel.
- Returns:
out – A shallow copy of the original DenseAdditiveDominanceLinearGenomicModel
- Return type:
DenseAdditiveDominanceLinearGenomicModel
- daavail(gmat, dtype=None, **kwargs)#
Determine whether a deleterious allele is available in the present taxa.
An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.
- Parameters:
gmat (GenotypeMatrix) – Genotype matrix for which to determine deleterious allele frequencies.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native boolean type.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A numpy.ndarray of shape (p,t) containing whether a deleterious allele is available.
Where:
p is the number of alleles.
t is the number of traits.
- Return type:
numpy.ndarray
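The sign convention shared by the deleterious, favorable, and neutral allele methods can be sketched as three boolean masks over an effect matrix. The array `u_a` below is made-up data of shape (p,t); this shows the definition only and is not the PyBrOpS implementation.

```python
import numpy as np

# Hypothetical additive marker effects of shape (p,t): 4 markers, 2 traits.
u_a = np.array([[ 0.5, -0.2],
                [ 0.0,  0.1],
                [-0.3,  0.0],
                [ 0.2,  0.4]])

deleterious = u_a < 0.0   # effect less than zero
favorable = u_a > 0.0     # effect greater than zero
neutral = u_a == 0.0      # effect exactly zero
```

Every (marker, trait) entry falls into exactly one of the three classes, and a marker can be favorable for one trait while deleterious for another.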
- dacount(gmat, dtype=None, **kwargs)#
Calculate the deleterious allele count across all taxa.
An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.
- Parameters:
gmat (GenotypeMatrix) – Genotype matrix for which to count deleterious alleles.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A numpy.ndarray of shape (p,t) containing allele counts of the deleterious allele.
Where:
p is the number of alleles.
t is the number of traits.
- Return type:
numpy.ndarray
- dafixed(gmat, dtype=None, **kwargs)#
Determine whether a deleterious allele is fixed across all taxa.
An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.
- Parameters:
gmat (GenotypeMatrix) – Genotype matrix for which to determine deleterious allele frequencies.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A numpy.ndarray of shape (p,t) containing whether a deleterious allele is fixed.
Where:
p is the number of alleles.
t is the number of traits.
- Return type:
numpy.ndarray
- dafreq(gmat, dtype=None, **kwargs)#
Calculate the deleterious allele frequency across all taxa.
An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.
- Parameters:
gmat (GenotypeMatrix) – Genotype matrix for which to determine deleterious allele frequencies.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A numpy.ndarray of shape (p,t) containing allele frequencies of the deleterious allele.
Where:
p is the number of alleles.
t is the number of traits.
- Return type:
numpy.ndarray
- dapoly(gmat, dtype=None, **kwargs)#
Determine whether a deleterious allele is polymorphic across all taxa.
An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.
- Parameters:
gmat (GenotypeMatrix) – Genotype matrix for which to determine deleterious allele frequencies.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A numpy.ndarray of shape (p,t) containing whether a deleterious allele is polymorphic.
Where:
p is the number of alleles.
t is the number of traits.
- Return type:
numpy.ndarray
- deepcopy(memo=None)[source]#
Make a deep copy of the DenseAdditiveDominanceLinearGenomicModel.
- Parameters:
memo (dict) – Dictionary of memo metadata.
- Returns:
out – A deep copy of the original DenseAdditiveDominanceLinearGenomicModel
- Return type:
DenseAdditiveDominanceLinearGenomicModel
- faavail(gmat, dtype=None, **kwargs)#
Determine whether a favorable allele is polymorphic or fixed across all taxa.
An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.
- Parameters:
gmat (GenotypeMatrix) – Genotype matrix for which to determine favorable allele frequencies.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A numpy.ndarray of shape (p,t) containing whether a favorable allele is available.
Where:
p is the number of alleles.
t is the number of traits.
- Return type:
numpy.ndarray
- facount(gmat, dtype=None, **kwargs)#
Calculate the favorable allele count across all taxa.
An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.
- Parameters:
gmat (GenotypeMatrix) – Genotype matrix for which to count favorable alleles.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A numpy.ndarray of shape (p,t) containing allele counts of the favorable allele.
Where:
p is the number of alleles.
t is the number of traits.
- Return type:
numpy.ndarray
- fafixed(gmat, dtype=None, **kwargs)#
Determine whether a favorable allele is fixed across all taxa.
An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.
- Parameters:
gmat (GenotypeMatrix) – Genotype matrix for which to determine favorable allele frequencies.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A numpy.ndarray of shape (p,t) containing whether a favorable allele is fixed.
Where:
p is the number of alleles.
t is the number of traits.
- Return type:
numpy.ndarray
- fafreq(gmat, dtype=None, **kwargs)#
Calculate the favorable allele frequency across all taxa.
An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.
- Parameters:
gmat (GenotypeMatrix) – Genotype matrix for which to determine favorable allele frequencies.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A numpy.ndarray of shape (p,t) containing allele frequencies of the favorable allele.
Where:
p is the number of alleles.
t is the number of traits.
- Return type:
numpy.ndarray
- fapoly(gmat, dtype=None, **kwargs)#
Determine whether a favorable allele is polymorphic across all taxa.
An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.
- Parameters:
gmat (GenotypeMatrix) – Genotype matrix for which to determine favorable allele frequencies.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A numpy.ndarray of shape (p,t) containing whether a favorable allele is polymorphic.
Where:
p is the number of alleles.
t is the number of traits.
- Return type:
numpy.ndarray
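One plausible reading of the favorable allele statistics is sketched below; it is not taken from the PyBrOpS source. It assumes `afreq` holds the frequency of the coded allele per marker, that the favorable allele is the coded allele when its effect is positive and the alternative allele when its effect is negative, and that neutral markers have no favorable allele (NaN).

```python
import numpy as np

# Assumed inputs: coded-allele frequency per marker, shape (p,),
# and additive effects of the coded allele, shape (p,t) with t = 1.
afreq = np.array([0.0, 0.25, 1.0, 1.0])
u_a = np.array([[0.5], [-0.2], [0.3], [-0.1]])

p = afreq[:, None]  # reshape to (p,1) so it broadcasts across traits

# Frequency of whichever allele is favorable for each (marker, trait).
fafreq = np.where(u_a > 0.0, p, np.where(u_a < 0.0, 1.0 - p, np.nan))

faavail = fafreq > 0.0                    # present at any frequency
fafixed = fafreq == 1.0                   # present in every chromosome copy
fapoly = (fafreq > 0.0) & (fafreq < 1.0)  # segregating
```

Under these assumptions, "available" is the union of "polymorphic" and "fixed", which matches the wording of the faavail description.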
- classmethod fit(ptobj, cvobj, gtobj, **kwargs)[source]#
Fit a dense, additive + dominance linear genomic model.
- Parameters:
ptobj (BreedingValueMatrix, pandas.DataFrame, numpy.ndarray) – An object containing phenotype data. Must be a matrix of breeding values or a phenotype data frame.
cvobj (numpy.ndarray) – An object containing covariate data.
gtobj (GenotypeMatrix, numpy.ndarray) – An object containing genotype data. Must be a matrix of genotype values.
trait (numpy.ndarray, None) – A trait name array of shape (t,).
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- classmethod fit_numpy(Y, X, Z, **kwargs)[source]#
Fit a dense, additive + dominance linear genomic model.
- Parameters:
Y (numpy.ndarray) – A phenotype matrix of shape (n,t).
X (numpy.ndarray) – A covariate matrix of shape (n,q).
Z (numpy.ndarray) – A genotypes matrix of shape (n,p).
trait (numpy.ndarray) – A trait name array of shape (t,).
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- classmethod from_csv_dict(filenames, sep=',', header=0, trait_cols='infer', model_name=None, hyperparams=None, **kwargs)[source]#
Read a DenseAdditiveDominanceLinearGenomicModel from a set of CSV files specified by values in a dict.
- Parameters:
filenames (dict of str) – Dictionary of CSV file names from which to read. Must have the following fields:
"beta" is a str of CSV file path containing fixed effects.
"u_misc" is None or a str of CSV file path containing miscellaneous random effects.
"u_a" is None or a str of CSV file path containing additive genetic marker random effects.
"u_d" is None or a str of CSV file path containing dominance genetic marker random effects.
sep (str, default = ',') – CSV delimiter to use.
header (int, list of int, default=0) – Row number(s) to use as the column names, and the start of the data.
trait_cols (Sequence, str, None, default = "infer") – Names of the trait columns from which to read regression coefficients. If Sequence, column names are given by the strings or integers in the trait_cols Sequence. If str, must be equal to "infer"; the trait columns are then taken from the columns of the "beta" input dataframe. If None, do not load any trait regression coefficients.
model_name (str, None) – Name of the model.
hyperparams (dict, None) – Model parameters.
kwargs (dict) – Additional keyword arguments to use for dictating importing from a CSV.
- Returns:
out – A DenseAdditiveDominanceLinearGenomicModel read from a set of CSV files.
- Return type:
DenseAdditiveDominanceLinearGenomicModel
- classmethod from_hdf5(filename, groupname=None)[source]#
Read a DenseAdditiveDominanceLinearGenomicModel from an HDF5 file.
- Parameters:
filename (str, Path, h5py.File) – If str or Path, an HDF5 file name from which to read. File is closed after reading. If h5py.File, an opened HDF5 file from which to read. File is not closed after reading.
groupname (str, None) – If str, an HDF5 group name under which DenseAdditiveDominanceLinearGenomicModel data is stored. If None, DenseAdditiveDominanceLinearGenomicModel is read from the base HDF5 group.
- Returns:
out – A genomic model read from file.
- Return type:
DenseAdditiveDominanceLinearGenomicModel
- classmethod from_pandas_dict(dic, trait_cols='infer', model_name=None, hyperparams=None, **kwargs)[source]#
Read an object from a dict of pandas.DataFrame.
- Parameters:
dic (dict) – Python dictionary containing pandas.DataFrame from which to read. Must have the following fields:
"beta" is a pandas.DataFrame containing fixed effects.
"u_misc" is None or a pandas.DataFrame containing miscellaneous random effects.
"u_a" is None or a pandas.DataFrame containing additive genetic marker random effects.
trait_cols (Sequence, str, None, default = "infer") – Names of the trait columns from which to read regression coefficients. If Sequence, column names are given by the strings or integers in the trait_cols Sequence. If str, must be equal to "infer"; the trait columns are then taken from the columns of the "beta" input dataframe. If None, do not load any trait regression coefficients.
model_name (str, None) – Name of the model.
hyperparams (dict, None) – Model parameters.
kwargs (dict) – Additional keyword arguments to use for dictating importing from a dict of pandas.DataFrame.
- Returns:
out – A DenseAdditiveDominanceLinearGenomicModel read from a dict of pandas.DataFrame.
- Return type:
DenseAdditiveDominanceLinearGenomicModel
- gebv(gtobj, **kwargs)#
Calculate genomic estimated breeding values.
Remark: The difference between ‘predict’ and ‘gebv’ is that ‘predict’ can incorporate other factors (e.g., fixed effects) to provide prediction estimates.
- Parameters:
gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – Genomic estimated breeding values matrix.
- Return type:
- gebv_numpy(Z, **kwargs)#
Calculate genomic estimated breeding values.
Remark: The difference between ‘predict_numpy’ and ‘gebv_numpy’ is that ‘predict_numpy’ can incorporate other factors (e.g., fixed effects) to provide prediction estimates.
- Parameters:
Z (numpy.ndarray) – A matrix of genotype values.
kwargs (dict) – Additional keyword arguments.
- Returns:
gebv_hat – A matrix of genomic estimated breeding values.
- Return type:
numpy.ndarray
- gegv(gtobj, **kwargs)[source]#
Calculate genomic estimated genotypic values.
- Parameters:
gtobj (GenotypeMatrix, numpy.ndarray) – A matrix of genotypic markers.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A matrix of genomic estimated genotypic values.
- Return type:
numpy.ndarray
- gegv_numpy(Z, **kwargs)[source]#
Calculate genomic estimated genotypic values.
- Parameters:
Z (numpy.ndarray) – A matrix of genotypic markers.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A matrix of genomic estimated genotypic values.
- Return type:
numpy.ndarray
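The difference between breeding values and genotypic values can be sketched with a small example. It assumes that breeding values use only the additive effects, that genotypic values add a dominance term, and that the dominance predictor is a heterozygote indicator derived from the {0,1,2} coding. All three are assumptions for illustration and are not confirmed by this page.

```python
import numpy as np

# Hypothetical genotypes coded {0,1,2}: 3 individuals, 3 markers.
Z = np.array([[0, 1, 2],
              [1, 1, 0],
              [2, 0, 1]], dtype=float)
u_a = np.array([[1.0], [0.5], [-1.0]])  # additive effects, shape (p_a,t)
u_d = np.array([[0.2], [0.0], [0.4]])   # dominance effects, shape (p_d,t)

# Assumed dominance coding: 1 for heterozygotes, 0 otherwise.
Z_d = (Z == 1.0).astype(float)

gebv_hat = Z @ u_a               # additive part only
gegv_hat = Z @ u_a + Z_d @ u_d   # additive plus dominance
```

The two estimates coincide for an individual with no heterozygous markers carrying a nonzero dominance effect, as for the first individual here.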
- property hyperparams: dict#
Model hyperparameters.
- lsl(gtobj, ploidy=None, unscale=False, **kwargs)#
Calculate the lower selection limit for a population.
- Parameters:
gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.
ploidy (int) – Ploidy of the species.
unscale (bool) – If True, then apply the mean of the fixed effects to the output.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – An array of shape (t,) containing population lower selection limit statistics.
Where:
t is the number of traits.
- Return type:
numpy.ndarray
- lsl_numpy(p, ploidy, unscale=False, **kwargs)#
Calculate the lower selection limit for a population.
- Parameters:
p (numpy.ndarray) – A vector of genotype allele frequencies of shape (p,).
ploidy (int) – Ploidy of the species.
unscale (bool) – If True, then apply the mean of the fixed effects to the output.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – An array of shape (t,) containing population lower selection limit statistics.
Where:
t is the number of traits.
- Return type:
numpy.ndarray
- property model_name: str#
Name of the model.
- nafixed(gmat, dtype=None, **kwargs)#
Determine whether a neutral allele is fixed across all taxa.
An allele is considered neutral if its effect is equal to zero.
- Parameters:
gmat (GenotypeMatrix) – Genotype matrix for which to determine neutral allele frequencies.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A numpy.ndarray of shape (p,t) containing whether a neutral allele is fixed.
Where:
p is the number of alleles.
t is the number of traits.
- Return type:
numpy.ndarray
- napoly(gmat, dtype=None, **kwargs)#
Determine whether a neutral allele is polymorphic across all taxa.
An allele is considered neutral if its effect is equal to zero.
- Parameters:
gmat (GenotypeMatrix) – Genotype matrix for which to determine neutral allele frequencies.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A numpy.ndarray of shape (p,t) containing whether a neutral allele is polymorphic.
Where:
p is the number of alleles.
t is the number of traits.
- Return type:
numpy.ndarray
- property nexplan: Integral#
Number of explanatory variables required by the model.
- property nexplan_beta: Integral#
Number of fixed effect explanatory variables required by the model.
- property nexplan_u: Integral#
Number of random effect explanatory variables required by the model.
- property nexplan_u_a: Integral#
Number of additive genomic marker explanatory variables required by the model.
- property nexplan_u_d: Integral#
Number of dominance genomic marker explanatory variables required by the model.
- property nexplan_u_misc: Integral#
Number of miscellaneous random effect explanatory variables required by the model.
- property nparam: Integral#
Number of model parameters.
- property nparam_beta: Integral#
Number of fixed effect parameters.
- property nparam_u: Integral#
Number of random effect parameters.
- property nparam_u_a: Integral#
Number of additive genomic marker parameters.
- property nparam_u_d: Integral#
Number of dominance genomic marker parameters.
- property nparam_u_misc: Integral#
Number of miscellaneous random effect parameters.
- property ntrait: int#
Number of traits predicted by the model.
- predict(cvobj, gtobj, **kwargs)[source]#
Predict breeding values.
Remark: The difference between ‘predict’ and ‘gebv’ is that ‘predict’ can incorporate other factors (e.g., fixed effects) to provide prediction estimates.
- Parameters:
cvobj (numpy.ndarray) – An object containing covariate data.
gtobj (GenotypeMatrix, numpy.ndarray) – An object containing genotype data. Must be a matrix of genotype values. If numpy.ndarray, must be coded as {0,1,2}.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – Estimated breeding values matrix.
- Return type:
- predict_numpy(X, Z, **kwargs)[source]#
Predict breeding values.
Remark: The difference between ‘predict_numpy’ and ‘gebv_numpy’ is that ‘predict_numpy’ can incorporate other factors (e.g., fixed effects) to provide prediction estimates.
- Parameters:
X (numpy.ndarray) – A matrix of covariates.
Z (numpy.ndarray) – A matrix of genotype values coded as {0,1,2} for additive predictors and {0,1} for dominance predictors.
kwargs (dict) – Additional keyword arguments.
- Returns:
Y_hat – A matrix of estimated breeding values.
- Return type:
numpy.ndarray
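The remark above, that prediction can incorporate fixed effects, can be sketched as follows. The example assumes that Z is laid out as additive columns coded {0,1,2} followed by dominance columns coded {0,1}, and that X holds a single intercept column. The layout and the data are assumptions for illustration, not a statement of how PyBrOpS arranges Z internally.

```python
import numpy as np

X = np.ones((3, 1))                      # intercept-only covariates, shape (n,q)
beta = np.array([[10.0]])                # fixed effects, shape (q,t)

Z_a = np.array([[0, 1, 2],
                [1, 1, 0],
                [2, 0, 1]], dtype=float) # additive predictors {0,1,2}
Z_d = (Z_a == 1.0).astype(float)         # dominance predictors {0,1}
u_a = np.array([[1.0], [0.5], [-1.0]])
u_d = np.array([[0.2], [0.0], [0.4]])

# Assumed column layout: additive block first, dominance block second.
Z = np.hstack([Z_a, Z_d])
u = np.vstack([u_a, u_d])

Y_hat = X @ beta + Z @ u                 # fixed part plus random part
random_part = Z @ u                      # what remains without fixed effects
```

Here `Y_hat` differs from `random_part` only by the intercept of 10, which is the part that a genomic estimated value omits.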
- score(ptobj, cvobj, gtobj, **kwargs)[source]#
Return the coefficient of determination R**2 of the prediction.
- Parameters:
ptobj (BreedingValueMatrix, pandas.DataFrame, numpy.ndarray) – An object containing phenotype data. Must be a matrix of breeding values or a phenotype data frame.
cvobj (numpy.ndarray) – An object containing covariate data.
gtobj (GenotypeMatrix, numpy.ndarray) – An object containing genotype data. Must be a matrix of genotype values.
kwargs (dict) – Additional keyword arguments.
- Returns:
Rsq – A coefficient of determination array of shape (t,).
Where:
t is the number of traits.
- Return type:
numpy.ndarray
- score_numpy(Y, X, Z, **kwargs)[source]#
Return the coefficient of determination R**2 of the prediction.
- Parameters:
Y (numpy.ndarray) – A matrix of phenotypes.
X (numpy.ndarray) – A matrix of covariates.
Z (numpy.ndarray) – A matrix of genotypes.
kwargs (dict) – Additional keyword arguments.
- Returns:
Rsq – A coefficient of determination array of shape (t,).
Where:
t is the number of traits.
- Return type:
numpy.ndarray
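A per-trait coefficient of determination can be computed with the usual definition, one minus the ratio of residual to total sum of squares. The function name `r_squared` and the data below are hypothetical; whether PyBrOpS uses exactly this definition is an assumption.

```python
import numpy as np

def r_squared(Y, Y_hat):
    """Column-wise R**2 = 1 - SS_res / SS_tot, returning shape (t,)."""
    ss_res = ((Y - Y_hat) ** 2).sum(axis=0)
    ss_tot = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

# Two traits: the first is predicted exactly, the second poorly.
Y = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
Y_hat = np.array([[1.0, 2.0],
                  [2.0, 4.0],
                  [3.0, 9.0]])

Rsq = r_squared(Y, Y_hat)
```

With this definition the score can be negative, as for the second trait, when the prediction is worse than simply using the trait mean.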
- to_csv_dict(filenames, trait_cols='trait', sep=',', header=True, index=False, **kwargs)[source]#
Export a DenseAdditiveDominanceLinearGenomicModel to a set of CSV files specified by values in a dict.
- Parameters:
filenames (dict of str) – CSV file names to which to write. Must have the keys: "beta", "u_misc", "u_a", and "u_d" (case sensitive).
trait_cols (Sequence, str, None, default = "trait") – Names of the trait columns to which to write regression coefficients. If Sequence, column names are given by the strings in the trait_cols Sequence. If str, must be equal to "trait"; use trait names given in the trait property. If None, use numeric trait column names.
sep (str, default = ",") – Separator to use in the exported CSV files.
header (bool, default = True) – Whether to save header names.
index (bool, default = False) – Whether to save a row index in the exported CSV files.
kwargs (dict) – Additional keyword arguments to use for dictating export to a CSV.
- Return type:
None
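A minimal sketch of the filenames argument is shown below. The file paths are hypothetical placeholders, and the key check is an illustrative helper rather than anything PyBrOpS provides. The only details taken from the description above are the four required keys and their case sensitivity.

```python
# Keys documented as required by to_csv_dict (case sensitive).
REQUIRED_KEYS = ("beta", "u_misc", "u_a", "u_d")

# Hypothetical output paths, one CSV file per block of coefficients.
filenames = {
    "beta": "model_beta.csv",
    "u_misc": "model_u_misc.csv",
    "u_a": "model_u_a.csv",
    "u_d": "model_u_d.csv",
}

# Illustrative pre-flight check: report any key that is absent.
missing = [key for key in REQUIRED_KEYS if key not in filenames]
if missing:
    raise KeyError(f"filenames is missing required keys: {missing}")

# With an existing model object (not constructed here), the export would be:
# model.to_csv_dict(filenames)
```

Because the keys are case sensitive, a dictionary keyed by "Beta" or "U_A" would fail this check.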
- to_hdf5(filename, groupname=None, overwrite=True)[source]#
Write a DenseAdditiveDominanceLinearGenomicModel to an HDF5 file.
- Parameters:
filename (str, Path, h5py.File) – If str or Path, an HDF5 file name to which to write. File is closed after writing. If h5py.File, an opened HDF5 file to which to write. File is not closed after writing.
groupname (str, None) – If str, an HDF5 group name under which DenseAdditiveDominanceLinearGenomicModel data is stored. If None, DenseAdditiveDominanceLinearGenomicModel is written to the base HDF5 group.
overwrite (bool) – Whether to overwrite data fields if they are present in the HDF5 file.
- Return type:
None
- to_pandas_dict(trait_cols='trait', **kwargs)[source]#
Export a DenseAdditiveDominanceLinearGenomicModel to a dict of pandas.DataFrame.
- Parameters:
trait_cols (Sequence, str, None, default = "trait") – Names of the trait columns to which to write regression coefficients. If Sequence, column names are given by the strings in the trait_cols Sequence. If str, must be equal to "trait"; use trait names given in the trait property. If None, use numeric trait column names.
kwargs (dict) – Additional keyword arguments to use for dictating export to a dict of pandas.DataFrame.
- Returns:
out – A dict of output dataframes.
- Return type:
dict
- property trait: ndarray#
Names of the traits predicted by the model.
- property u: ndarray#
Random effect regression coefficients.
- property u_a: ndarray#
Additive genomic marker effects.
- property u_d: ndarray#
Dominance genomic marker effects.
- property u_misc: ndarray#
Miscellaneous random effect regression coefficients.
- usl(gtobj, ploidy=None, unscale=False, **kwargs)#
Calculate the upper selection limit for a population.
- Parameters:
gtobj (GenotypeMatrix, numpy.ndarray) – An object containing genotype data. Must be a matrix of genotype values.
ploidy (int) – Ploidy of the species.
unscale (bool) – If True, then apply the mean of the fixed effects to the output.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – An array of shape (t,) containing population upper selection limit statistics.
Where:
t is the number of traits.
- Return type:
numpy.ndarray
- usl_numpy(p, ploidy, unscale=False, **kwargs)#
Calculate the upper selection limit for a population.
- Parameters:
p (numpy.ndarray) – A vector of genotype allele frequencies of shape (p,).
ploidy (int) – Ploidy of the species.
unscale (bool) – If True, then apply the mean of the fixed effects to the output.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – An array of shape (t,) containing population upper selection limit statistics.
Where:
t is the number of traits.
- Return type:
numpy.ndarray
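One way to think about selection limits is as the best and worst additive totals reachable using only alleles already present in the population. The sketch below follows that idea under several assumptions: genotype value is allele count times effect, a marker with frequency 0 or 1 is stuck at its fixed state, and the unscale option is ignored. The function `selection_limits` is hypothetical and may not match the PyBrOpS computation.

```python
import numpy as np

def selection_limits(p, u_a, ploidy=2):
    """Sketch of lower and upper selection limits, each of shape (t,)."""
    p = np.asarray(p, dtype=float)[:, None]       # (p,1), broadcasts over traits
    full = ploidy * np.asarray(u_a, dtype=float)  # value if coded allele is fixed
    coded_present = p > 0.0                       # coded allele exists
    alt_present = p < 1.0                         # alternative allele exists
    both = coded_present & alt_present            # marker is segregating
    # Non-segregating markers contribute their fixed value (0 or full).
    fixed_value = np.where(coded_present, full, 0.0)
    upper = np.where(both, np.maximum(full, 0.0), fixed_value)
    lower = np.where(both, np.minimum(full, 0.0), fixed_value)
    return lower.sum(axis=0), upper.sum(axis=0)

p = np.array([0.0, 0.5, 1.0])            # coded-allele frequencies
u_a = np.array([[1.0], [-0.5], [-2.0]])  # additive effects, one trait
lsl, usl = selection_limits(p, u_a, ploidy=2)
```

In this example the favorable allele at the first marker is absent and the deleterious allele at the third marker is fixed, so neither limit can improve on those markers. Only the segregating second marker separates the lower limit from the upper limit.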
- var_A(gtobj, **kwargs)#
Calculate the population additive genetic variance
- Parameters:
gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – An array of shape (t,) containing population additive genetic variances.
Where:
t is the number of traits.
- Return type:
numpy.ndarray
- var_A_numpy(Z, **kwargs)#
Calculate the population additive genetic variance
- Parameters:
Z (numpy.ndarray) – A matrix of genotypes.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – An array of shape (t,) containing population additive genetic variances.
Where:
t is the number of traits.
- Return type:
numpy.ndarray
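A common way to obtain an additive genetic variance per trait is to take the variance of the additive breeding values across individuals. The sketch below does that with a population variance (ddof = 0); both the approach and the choice of ddof are assumptions, since this page does not state how PyBrOpS computes the statistic.

```python
import numpy as np

# Hypothetical genotypes coded {0,1,2}: 3 individuals, 3 markers.
Z = np.array([[0, 1, 2],
              [1, 1, 0],
              [2, 0, 1]], dtype=float)
u_a = np.array([[1.0], [0.5], [-1.0]])  # additive effects, shape (p_a,t)

gebv = Z @ u_a               # breeding values, shape (n,t)
var_A = gebv.var(axis=0)     # variance across individuals, shape (t,)
```

The result has one entry per trait, matching the documented (t,) output shape.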
- var_G(gtobj, **kwargs)[source]#
Calculate the population genetic variance.
- Parameters:
gtobj (GenotypeMatrix, numpy.ndarray) – An object containing genotype data. Must be a matrix of genotype values.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – An array of shape (t,) containing population genetic variances.
Where:
t is the number of traits.
- Return type:
numpy.ndarray
- var_G_numpy(Z, **kwargs)[source]#
Calculate the population genetic variance.
- Parameters:
Z (numpy.ndarray) – A matrix of genotypes.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – An array of shape (t,) containing population genetic variances.
Where:
t is the number of traits.
- Return type:
numpy.ndarray
- var_a(gtobj, ploidy=None, **kwargs)#
Calculate the population additive genic variance
- Parameters:
gtobj (GenotypeMatrix, numpy.ndarray) – An object containing genotype data. Must be a matrix of genotype values.
ploidy (Integral, None) –
Ploidy of the species.
If ploidy is None:
If gtobj is a GenotypeMatrix, then get ploidy from GenotypeMatrix.
If gtobj is a numpy.ndarray, then assumed to be 2 (diploid).
kwargs (dict) – Additional keyword arguments.
- Returns:
out – An array of shape (t,) containing population additive genic variances.
Where:
t is the number of traits.
- Return type:
numpy.ndarray
- var_a_numpy(p, ploidy=2, **kwargs)#
Calculate the population additive genic variance
- Parameters:
p (numpy.ndarray) – A vector of genotype allele frequencies of shape (p,).
ploidy (Integral) – Ploidy of the species.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – An array of shape (t,) containing population additive genic variances.
Where:
t is the number of traits.
- Return type:
numpy.ndarray
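Additive genic variance depends only on allele frequencies and marker effects, not on how alleles are combined within individuals. The sketch below uses the textbook Hardy-Weinberg form, summing ploidy * p * (1 - p) * u**2 over markers. That scaling is an assumption and may differ from the one PyBrOpS applies; the function name `additive_genic_variance` is hypothetical.

```python
import numpy as np

def additive_genic_variance(p, u_a, ploidy=2):
    """Sketch: sum over markers of ploidy * p * (1 - p) * u**2, shape (t,)."""
    p = np.asarray(p, dtype=float)          # allele frequencies, shape (p,)
    u_a = np.asarray(u_a, dtype=float)      # additive effects, shape (p,t)
    per_marker = ploidy * p * (1.0 - p)     # binomial variance of allele count
    return per_marker @ (u_a ** 2)          # weight squared effects and sum

p = np.array([0.5, 0.0, 0.25])
u_a = np.array([[1.0,  0.0],
                [3.0,  1.0],
                [2.0, -2.0]])

var_a = additive_genic_variance(p, u_a)
```

The second marker is fixed (p = 0), so it contributes nothing despite its large effect. If every marker were fixed, the genic variance would be zero, which is the case in which the Bulmer statistic is documented to produce NaN.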