NonlinearGenomicModel#

class pybrops.model.gmod.NonlinearGenomicModel.NonlinearGenomicModel[source]#

Bases: GenomicModel

An abstract class for non-linear genomic models.

This abstract interface provides a common entry point through which non-linear models (e.g., neural networks) may be incorporated into PyBrOpS.
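As a rough illustration of the abstract-interface pattern, the sketch below uses Python's standard `abc` module. The names `ToyGenomicModel`, `ToyNonlinearModel`, and `predict_row` are hypothetical stand-ins and are not PyBrOpS classes or methods; a real subclass would need to implement every abstract member listed on this page.

```python
from abc import ABC, abstractmethod
import math


class ToyGenomicModel(ABC):
    """Hypothetical stand-in for an abstract genomic model interface."""

    @abstractmethod
    def predict_row(self, genotype):
        """Map one genotype vector to a predicted value."""


class ToyNonlinearModel(ToyGenomicModel):
    """Hypothetical concrete model: tanh of a weighted sum of genotypes."""

    def __init__(self, weights):
        self.weights = list(weights)

    def predict_row(self, genotype):
        total = sum(w * g for w, g in zip(self.weights, genotype))
        return math.tanh(total)


# The abstract class cannot be instantiated directly.
try:
    ToyGenomicModel()
    abstract_instantiable = True
except TypeError:
    abstract_instantiable = False

# The concrete subclass can, because it implements the abstract method.
model = ToyNonlinearModel([0.5, -0.25])
value = model.predict_row([2, 0])
```

Instantiating the abstract base raises `TypeError`, which is how an abstract interface forces subclasses to supply their own implementations.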

Methods

`bulmer`	Calculate the Bulmer effect.
`bulmer_numpy`	Calculate the Bulmer effect.
`copy`	Make a shallow copy of the GenomicModel.
`daavail`	Determine whether a deleterious allele is available in the present taxa.
`dacount`	Calculate the deleterious allele count across all taxa.
`dafixed`	Determine whether a deleterious allele is fixed across all taxa.
`dafreq`	Calculate the deleterious allele frequency across all taxa.
`dapoly`	Determine whether a deleterious allele is polymorphic across all taxa.
`deepcopy`	Make a deep copy of the GenomicModel.
`faavail`	Determine whether a favorable allele is polymorphic or fixed across all taxa.
`facount`	Calculate the favorable allele count across all taxa.
`fafixed`	Determine whether a favorable allele is fixed across all taxa.
`fafreq`	Calculate the favorable allele frequency across all taxa.
`fapoly`	Determine whether a favorable allele is polymorphic across all taxa.
`fit`	Fit the model.
`fit_numpy`	Fit the model.
`from_hdf5`	Read an object from an HDF5 file.
`gebv`	Calculate genomic estimated breeding values.
`gebv_numpy`	Calculate genomic estimated breeding values.
`gegv`	Calculate genomic estimated genotypic values.
`gegv_numpy`	Calculate genomic estimated genotypic values.
`lsl`	Calculate the lower selection limit for a population.
`lsl_numpy`	Calculate the lower selection limit for a population.
`nafixed`	Determine whether a neutral allele is fixed across all taxa.
`napoly`	Determine whether a neutral allele is polymorphic across all taxa.
`predict`	Predict breeding values.
`predict_numpy`	Predict breeding values.
`score`	Return the coefficient of determination R**2 of the prediction.
`score_numpy`	Return the coefficient of determination R**2 of the prediction.
`to_hdf5`	Write an object to an HDF5 file.
`usl`	Calculate the upper selection limit for a population.
`usl_numpy`	Calculate the upper selection limit for a population.
`var_A`	Calculate the population additive genetic variance.
`var_A_numpy`	Calculate the population additive genetic variance.
`var_G`	Calculate the population genetic variance.
`var_G_numpy`	Calculate the population genetic variance.
`var_a`	Calculate the population additive genic variance.
`var_a_numpy`	Calculate the population additive genic variance.

Attributes

`hyperparams`	Model parameters.
`model_name`	Name of the model.
`nexplan`	Number of explanatory variables required by the model.
`nparam`	Number of model parameters.
`ntrait`	Number of traits predicted by the model.
`trait`	Names of the traits predicted by the model.

abstract bulmer(gtobj, ploidy, **kwargs)#

Calculate the Bulmer effect.

Parameters:

gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.
ploidy (int) – Ploidy of the species.
kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing Bulmer effects for each trait. If the additive genic variance is zero, NaNs are produced.

Return type:

numpy.ndarray

abstract bulmer_numpy(Z, p, ploidy, **kwargs)#

Calculate the Bulmer effect.

Parameters:

Z (numpy.ndarray) – A matrix of genotypes.
p (numpy.ndarray) – A vector of genotype allele frequencies of shape (p,).
ploidy (int) – Ploidy of the species.
kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing Bulmer effects for each trait. If the additive genic variance is zero, NaNs are produced.

Return type:

numpy.ndarray
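The note about NaNs suggests a ratio with the additive genic variance in the denominator. The following is a minimal sketch under that assumption, taking the Bulmer effect to be the additive genetic variance divided by the additive genic variance for each trait. This page does not state the formula, and the helper name `bulmer_ratio` and its inputs are hypothetical.

```python
import numpy as np


def bulmer_ratio(var_A, var_a):
    """Hypothetical helper: per-trait ratio var_A / var_a.

    Both inputs are arrays of shape (t,). Where both variances are zero,
    the result is NaN (0/0 under IEEE floating point).
    """
    var_A = np.asarray(var_A, dtype=float)
    var_a = np.asarray(var_a, dtype=float)
    # Silence divide-by-zero / invalid warnings; 0/0 still yields NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        return var_A / var_a


# Trait 1 has segregating variation; trait 2 has none.
out = bulmer_ratio([0.8, 0.0], [1.0, 0.0])
```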

abstract copy()#

Make a shallow copy of the GenomicModel.

Returns:

out – A shallow copy of the original GenomicModel.

Return type:

GenomicModel

abstract daavail(gmat, dtype=None, **kwargs)#

Determine whether a deleterious allele is available in the present taxa.

An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.

Parameters:

gmat (GenotypeMatrix) – Genotype matrix for which to determine deleterious allele availability.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native boolean type.
kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing whether a deleterious allele is available.

Where:

p is the number of alleles.
t is the number of traits.

Return type:

numpy.ndarray
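The sign rule above can be sketched with NumPy. The effect matrix `u` of shape (p,t) and the frequency vector `freq` are hypothetical, and the sketch assumes a model that exposes one effect per allele and trait, which a non-linear model may not. For simplicity it looks only at the coded allele at each locus.

```python
import numpy as np

# Hypothetical allele effects: 4 alleles (rows) by 2 traits (columns).
u = np.array([
    [-0.3,  0.0],
    [ 0.0,  0.2],
    [ 0.5, -0.1],
    [-0.2,  0.4],
])

deleterious = u < 0.0   # effect less than zero
neutral = u == 0.0      # effect exactly zero
favorable = u > 0.0     # effect greater than zero

# Hypothetical frequencies of the coded allele, shape (p,).
freq = np.array([0.0, 0.5, 1.0, 0.25])

# A deleterious allele counts as "available" when it is present at all.
da_available = deleterious & (freq[:, None] > 0.0)
```

Broadcasting `freq[:, None]` against `u` yields a boolean array of shape (p,t), matching the documented return shape.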

abstract dacount(gmat, dtype, **kwargs)#

Calculate the deleterious allele count across all taxa.

An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.

Parameters:

gmat (GenotypeMatrix) – Genotype matrix for which to count deleterious alleles.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing allele counts of the deleterious allele.

Where:

p is the number of alleles.
t is the number of traits.

Return type:

numpy.ndarray

abstract dafixed(gmat, dtype, **kwargs)#

Determine whether a deleterious allele is fixed across all taxa.

An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.

Parameters:

gmat (GenotypeMatrix) – Genotype matrix for which to determine whether deleterious alleles are fixed.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing whether a deleterious allele is fixed.

Where:

p is the number of alleles.
t is the number of traits.

Return type:

numpy.ndarray

abstract dafreq(gmat, dtype, **kwargs)#

Calculate the deleterious allele frequency across all taxa.

An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.

Parameters:

gmat (GenotypeMatrix) – Genotype matrix for which to determine deleterious allele frequencies.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing allele frequencies of the deleterious allele.

Where:

p is the number of alleles.
t is the number of traits.

Return type:

numpy.ndarray

abstract dapoly(gmat, dtype, **kwargs)#

Determine whether a deleterious allele is polymorphic across all taxa.

An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.

Parameters:

gmat (GenotypeMatrix) – Genotype matrix for which to determine whether deleterious alleles are polymorphic.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing whether a deleterious allele is polymorphic.

Where:

p is the number of alleles.
t is the number of traits.

Return type:

numpy.ndarray

abstract deepcopy(memo)#

Make a deep copy of the GenomicModel.

Parameters:

memo (dict) – Dictionary of memo metadata.

Returns:

out – A deep copy of the original GenomicModel.

Return type:

GenomicModel

abstract faavail(gmat, dtype=None, **kwargs)#

Determine whether a favorable allele is polymorphic or fixed across all taxa.

An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.

Parameters:

gmat (GenotypeMatrix) – Genotype matrix for which to determine favorable allele availability.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing whether a favorable allele is available.

Where:

p is the number of alleles.
t is the number of traits.

Return type:

numpy.ndarray

abstract facount(gmat, dtype, **kwargs)#

Calculate the favorable allele count across all taxa.

An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.

Parameters:

gmat (GenotypeMatrix) – Genotype matrix for which to count favorable alleles.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing allele counts of the favorable allele.

Where:

p is the number of alleles.
t is the number of traits.

Return type:

numpy.ndarray
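A minimal sketch of how a favorable allele count of shape (p,t) could be derived, assuming biallelic loci, a genotype matrix `Z` coded as counts of one allele, and a hypothetical effect matrix `u` for that coded allele. Under these assumptions the alternative allele is treated as favorable wherever the coded allele's effect is negative. None of these names or conventions are confirmed by this page.

```python
import numpy as np

ploidy = 2
# Hypothetical genotypes: 3 taxa by 3 loci, coded as counts of one allele.
Z = np.array([
    [0, 1, 2],
    [1, 1, 2],
    [2, 0, 2],
])
# Hypothetical effects of the coded allele: 3 loci by 2 traits.
u = np.array([
    [ 0.4, -0.1],
    [-0.2,  0.0],
    [ 0.0,  0.3],
])

acount = Z.sum(axis=0)            # coded-allele count per locus, shape (p,)
maxcount = ploidy * Z.shape[0]    # total allele copies per locus
alt_count = maxcount - acount     # alternative-allele count per locus

# Coded allele is favorable where u > 0; alternative allele where u < 0;
# neutral entries (u == 0) contribute a count of zero.
facount = (np.where(u > 0.0, acount[:, None], 0)
           + np.where(u < 0.0, alt_count[:, None], 0))

# Dividing by the total number of allele copies gives a frequency.
fafreq = facount / maxcount
```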

abstract fafixed(gmat, dtype, **kwargs)#

Determine whether a favorable allele is fixed across all taxa.

An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.

Parameters:

gmat (GenotypeMatrix) – Genotype matrix for which to determine whether favorable alleles are fixed.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing whether a favorable allele is fixed.

Where:

p is the number of alleles.
t is the number of traits.

Return type:

numpy.ndarray

abstract fafreq(gmat, dtype, **kwargs)#

Calculate the favorable allele frequency across all taxa.

An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.

Parameters:

gmat (GenotypeMatrix) – Genotype matrix for which to determine favorable allele frequencies.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing allele frequencies of the favorable allele.

Where:

p is the number of alleles.
t is the number of traits.

Return type:

numpy.ndarray

abstract fapoly(gmat, dtype, **kwargs)#

Determine whether a favorable allele is polymorphic across all taxa.

An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.

Parameters:

gmat (GenotypeMatrix) – Genotype matrix for which to determine whether favorable alleles are polymorphic.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing whether a favorable allele is polymorphic.

Where:

p is the number of alleles.
t is the number of traits.

Return type:

numpy.ndarray
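The fixed, polymorphic, and available states can be related through allele frequency. The sketch below assumes the usual population-genetic definitions (fixed means frequency 1; polymorphic means strictly between 0 and 1) and uses a hypothetical frequency matrix `fafreq`; the exact tests used by an implementation may differ.

```python
import numpy as np

# Hypothetical favorable-allele frequencies, shape (p,t) = (3,2).
fafreq = np.array([
    [1.0, 0.0],
    [0.5, 1.0],
    [0.0, 0.25],
])

fixed = fafreq == 1.0                            # present in every allele copy
polymorphic = (fafreq > 0.0) & (fafreq < 1.0)    # present, but not in every copy
available = fafreq > 0.0                         # fixed or polymorphic
```

With these definitions, `available` equals `fixed | polymorphic`, which mirrors the description of `faavail` as "polymorphic or fixed".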

abstract classmethod fit(ptobj, cvobj, gtobj, **kwargs)#

Fit the model.

Parameters:

ptobj (BreedingValueMatrix, pandas.DataFrame, numpy.ndarray) – An object containing phenotype data. Must be a matrix of breeding values or a phenotype data frame.
cvobj (numpy.ndarray) – An object containing covariate data.
gtobj (GenotypeMatrix, numpy.ndarray) – An object containing genotype data. Must be a matrix of genotype values.
kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract classmethod fit_numpy(Y, X, Z, **kwargs)#

Fit the model.

Parameters:

Y (numpy.ndarray) – A phenotype matrix of shape (n,t).
X (numpy.ndarray) – A covariate matrix of shape (n,q).
Z (numpy.ndarray) – A genotype matrix of shape (n,p).
kwargs (dict) – Additional keyword arguments.

Return type:

None
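The three inputs share the row dimension n. As an illustration only, the sketch below builds arrays with the documented shapes and checks them with a hypothetical helper, `check_fit_shapes`, which is not part of PyBrOpS. The 0/1/2 genotype coding is also an assumption.

```python
import numpy as np


def check_fit_shapes(Y, X, Z):
    """Hypothetical helper: verify Y, X, Z are 2-D and share n rows.

    Returns the tuple (n, t, q, p).
    """
    for name, arr in (("Y", Y), ("X", X), ("Z", Z)):
        if arr.ndim != 2:
            raise ValueError(f"{name} must be 2-dimensional, got ndim={arr.ndim}")
    n = Y.shape[0]
    if X.shape[0] != n or Z.shape[0] != n:
        raise ValueError("Y, X and Z must have the same number of rows (n)")
    return n, Y.shape[1], X.shape[1], Z.shape[1]


rng = np.random.default_rng(0)
Y = rng.normal(size=(10, 2))             # n=10 individuals, t=2 traits
X = np.ones((10, 1))                     # q=1 covariate (an intercept)
Z = rng.integers(0, 3, size=(10, 50))    # p=50 loci, genotypes coded 0/1/2
dims = check_fit_shapes(Y, X, Z)
```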

abstract classmethod from_hdf5(filename, groupname)#

Read an object from an HDF5 file.

Parameters:

filename (str, Path, h5py.File) – If str, an HDF5 file name from which to read. If Path, an HDF5 file path from which to read. If h5py.File, an opened HDF5 file from which to read.
groupname (str, None) – If str, an HDF5 group name under which object data is stored. If None, object is read from base HDF5 group.

Returns:

out – An object read from an HDF5 file.

Return type:

HDF5InputOutput

abstract gebv(gtobj, **kwargs)#

Calculate genomic estimated breeding values.

Remark: The difference between ‘predict’ and ‘gebv’ is that ‘predict’ can incorporate other factors (e.g., fixed effects) to provide prediction estimates.

Parameters:

gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.
kwargs (dict) – Additional keyword arguments.

Returns:

gebvmat_hat – Genomic estimated breeding values.

Return type:

BreedingValueMatrix

abstract gebv_numpy(Z, **kwargs)#

Calculate genomic estimated breeding values.

Remark: The difference between ‘predict_numpy’ and ‘gebv_numpy’ is that ‘predict_numpy’ can incorporate other factors (e.g., fixed effects) to provide prediction estimates.

Parameters:

Z (numpy.ndarray) – A matrix of genotype values.
kwargs (dict) – Additional keyword arguments.

Returns:

gebv_hat – A matrix of genomic estimated breeding values.

Return type:

numpy.ndarray

abstract gegv(gtobj, **kwargs)#

Calculate genomic estimated genotypic values.

Parameters:

gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.
kwargs (dict) – Additional keyword arguments.

Returns:

out – A matrix of genomic estimated genotypic values.

Return type:

numpy.ndarray

abstract gegv_numpy(Z, **kwargs)#

Calculate genomic estimated genotypic values.

Parameters:

Z (numpy.ndarray) – A matrix of genotypic markers.
kwargs (dict) – Additional keyword arguments.

Returns:

out – A matrix of genomic estimated genotypic values.

Return type:

numpy.ndarray

abstract property hyperparams: dict#

Model parameters.

abstract lsl(gtobj, ploidy, unscale, **kwargs)#

Calculate the lower selection limit for a population.

Parameters:

gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.
ploidy (int) – Ploidy of the species.
unscale (bool) – If True, then apply the mean of the fixed effects to the output.
kwargs (dict) – Additional keyword arguments.

Returns:

out – An array of shape (t,) containing lower selection limits for each of t traits.

Return type:

numpy.ndarray

abstract lsl_numpy(p, ploidy, unscale, **kwargs)#

Calculate the lower selection limit for a population.

Parameters:

p (numpy.ndarray) – A vector of genotype allele frequencies of shape (p,).
ploidy (int) – Ploidy of the species.
unscale (bool) – If True, then apply the mean of the fixed effects to the output.
kwargs (dict) – Additional keyword arguments.

Returns:

out – An array of shape (t,) containing lower selection limits for each of t traits.

Return type:

numpy.ndarray

abstract property model_name: str#

Name of the model.

abstract nafixed(gmat, dtype, **kwargs)#

Determine whether a neutral allele is fixed across all taxa.

An allele is considered neutral if its effect is equal to zero.

Parameters:

gmat (GenotypeMatrix) – Genotype matrix for which to determine whether neutral alleles are fixed.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing whether a neutral allele is fixed.

Where:

p is the number of alleles.
t is the number of traits.

Return type:

numpy.ndarray

abstract napoly(gmat, dtype, **kwargs)#

Determine whether a neutral allele is polymorphic across all taxa.

An allele is considered neutral if its effect is equal to zero.

Parameters:

gmat (GenotypeMatrix) – Genotype matrix for which to determine whether neutral alleles are polymorphic.
dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.
kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing whether a neutral allele is polymorphic.

Where:

p is the number of alleles.
t is the number of traits.

Return type:

numpy.ndarray

abstract property nexplan: Integral#

Number of explanatory variables required by the model.

abstract property nparam: Integral#

Number of model parameters.

abstract property ntrait: int#

Number of traits predicted by the model.

abstract predict(cvobj, gtobj, **kwargs)#

Predict breeding values.

Remark: The difference between ‘predict’ and ‘gebv’ is that ‘predict’ can incorporate other factors (e.g., fixed effects) to provide prediction estimates.

Parameters:

cvobj (numpy.ndarray) – An object containing covariate data.
gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.
kwargs (dict) – Additional keyword arguments.

Returns:

bvmat_hat – Estimated breeding values.

Return type:

BreedingValueMatrix

abstract predict_numpy(X, Z, **kwargs)#

Predict breeding values.

Remark: The difference between ‘predict_numpy’ and ‘gebv_numpy’ is that ‘predict_numpy’ can incorporate other factors (e.g., fixed effects) to provide prediction estimates.

Parameters:

X (numpy.ndarray) – A matrix of covariates.
Z (numpy.ndarray) – A matrix of genotype values.
kwargs (dict) – Additional keyword arguments.

Returns:

Y_hat – A matrix of predicted breeding values.

Return type:

numpy.ndarray
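The remark about prediction versus breeding values can be made concrete with a toy model. In this sketch the fixed effects `beta`, the weights `W`, and the tanh genetic function are all hypothetical and do not come from PyBrOpS. The point is only that a `predict_numpy`-style output adds a covariate term to the genetic term that a `gebv_numpy`-style output returns alone.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q, p, t = 5, 1, 8, 2

X = np.ones((n, q))                       # covariates: intercept only
Z = rng.integers(0, 3, size=(n, p))       # genotypes coded 0/1/2 (assumed coding)
beta = np.array([[10.0, 20.0]])           # hypothetical fixed effects, shape (q,t)
W = rng.normal(scale=0.1, size=(p, t))    # hypothetical weights, shape (p,t)


def genetic_term(Z):
    """Hypothetical non-linear genetic function of the genotypes."""
    return np.tanh(Z @ W)


gebv_like = genetic_term(Z)                   # genetic term only, shape (n,t)
predict_like = X @ beta + genetic_term(Z)     # covariate term plus genetic term
```

The difference between the two outputs is exactly the covariate contribution `X @ beta`.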

abstract score(ptobj, cvobj, gtobj, **kwargs)#

Return the coefficient of determination R**2 of the prediction.

Parameters:

ptobj (BreedingValueMatrix or pandas.DataFrame) – An object containing phenotype data. Must be a matrix of breeding values or a phenotype data frame.
cvobj (object) – An object containing covariate data.
gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.
kwargs (dict) – Additional keyword arguments.

Returns:

Rsq – A coefficient of determination array of shape (t,).

Where:

t is the number of traits.

Return type:

numpy.ndarray

abstract score_numpy(Y, X, Z, **kwargs)#

Return the coefficient of determination R**2 of the prediction.

Parameters:

Y (numpy.ndarray) – A matrix of phenotypes.
X (numpy.ndarray) – A matrix of covariates.
Z (numpy.ndarray) – A matrix of genotypes.
kwargs (dict) – Additional keyword arguments.

Returns:

Rsq – A coefficient of determination array of shape (t,).

Where:

t is the number of traits.

Return type:

numpy.ndarray
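A minimal sketch of the usual coefficient of determination computed per trait, which gives an array of shape (t,). Whether a given implementation uses exactly this definition is an assumption, and `r_squared` is a hypothetical helper that takes predictions directly rather than X and Z.

```python
import numpy as np


def r_squared(Y, Y_hat):
    """Per-trait R**2 = 1 - SS_res / SS_tot, computed column-wise.

    Y and Y_hat have shape (n,t); the result has shape (t,).
    """
    ss_res = ((Y - Y_hat) ** 2).sum(axis=0)
    ss_tot = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot


Y = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])

# Exact predictions leave no residual variation.
perfect = r_squared(Y, Y)

# Predicting the column mean for every row explains none of the variation.
mean_only = r_squared(Y, np.tile(Y.mean(axis=0), (3, 1)))
```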

abstract to_hdf5(filename, groupname, overwrite)#

Write an object to an HDF5 file.

Parameters:

filename (str, Path, h5py.File) – If str, an HDF5 file name to which to write. If Path, an HDF5 file path to which to write. If h5py.File, an opened HDF5 file to which to write.
groupname (str, None) – If str, an HDF5 group name under which object data is stored. If None, object is written to the base HDF5 group.
overwrite (bool) – Whether to overwrite values in an HDF5 file if a field already exists.

Return type:

None

abstract property trait: ndarray#

Names of the traits predicted by the model.

abstract usl(gtobj, ploidy, unscale, **kwargs)#

Calculate the upper selection limit for a population.

Parameters:

gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.
ploidy (int) – Ploidy of the species.
unscale (bool) – If True, then apply the mean of the fixed effects to the output.
kwargs (dict) – Additional keyword arguments.

Returns:

out – An array of shape (t,) containing upper selection limits for each of t traits.

Return type:

numpy.ndarray

abstract usl_numpy(p, ploidy, unscale, **kwargs)#

Calculate the upper selection limit for a population.

Parameters:

p (numpy.ndarray) – A vector of genotype allele frequencies of shape (p,).
ploidy (int) – Ploidy of the species.
unscale (bool) – If True, then apply the mean of the fixed effects to the output.
kwargs (dict) – Additional keyword arguments.

Returns:

out – An array of shape (t,) containing upper selection limits for each of t traits.

Return type:

numpy.ndarray
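This page does not give a formula for the selection limit, and a non-linear model may define it differently. As a sketch under a purely additive assumption, the upper limit for each trait can be taken as the value of the best genotype that can be assembled from alleles present in the population. The helper `usl_additive` and the effect matrix `u` are hypothetical, and the `unscale` adjustment is omitted.

```python
import numpy as np


def usl_additive(p, u, ploidy):
    """Hypothetical upper selection limit under a purely additive model.

    p : frequencies of the coded allele, shape (p,)
    u : effects of the coded allele, shape (p,t)
    Returns an array of shape (t,).
    """
    freq = np.asarray(p, dtype=float)[:, None]
    best = np.where(
        freq == 0.0, 0.0,                 # coded allele absent: contributes nothing
        np.where(freq == 1.0, u,          # coded allele fixed: effect is unavoidable
                 np.maximum(u, 0.0)),     # segregating: keep it only if favorable
    )
    # Each locus carries `ploidy` copies of the chosen allele.
    return ploidy * best.sum(axis=0)


p = np.array([0.0, 0.5, 1.0])
u = np.array([[1.0, -1.0], [0.5, -0.5], [-0.2, 0.3]])
usl = usl_additive(p, u, ploidy=2)
```

A lower selection limit could be sketched the same way by replacing `np.maximum` with `np.minimum`.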

abstract var_A(gtobj, **kwargs)#

Calculate the population additive genetic variance.

Parameters:

gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.
kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing additive genetic variances for each trait.

Return type:

numpy.ndarray

abstract var_A_numpy(Z, **kwargs)#

Calculate the population additive genetic variance.

Parameters:

Z (numpy.ndarray) – A matrix of genotypes.
kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing additive genetic variances for each trait.

Return type:

numpy.ndarray
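One common way to obtain a per-trait variance of shape (t,) is to take the variance of estimated breeding values across individuals. The sketch below assumes that definition and uses a hypothetical matrix `gebv` in place of model output; whether an implementation uses the population (ddof=0) or sample (ddof=1) variance is not stated on this page.

```python
import numpy as np

# Hypothetical breeding values for 4 individuals and 2 traits, shape (n,t).
gebv = np.array([
    [1.0, 10.0],
    [2.0, 10.0],
    [3.0, 10.0],
    [4.0, 10.0],
])

# Population variance (ddof=0) down the rows gives one value per trait.
var_A = gebv.var(axis=0)
```

The second trait has identical values for every individual, so its variance is zero.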

abstract var_G(gtobj, **kwargs)#

Calculate the population genetic variance.

Parameters:

gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.
kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing genetic variances for each trait.

Return type:

numpy.ndarray

abstract var_G_numpy(Z, **kwargs)#

Calculate the population genetic variance.

Parameters:

Z (numpy.ndarray) – A matrix of genotypes.
kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing genetic variances for each trait.

Return type:

numpy.ndarray

abstract var_a(gtobj, ploidy, **kwargs)#

Calculate the population additive genic variance.

Parameters:

gtobj (GenotypeMatrix, numpy.ndarray) – An object containing genotype data. Must be a matrix of genotype values.
ploidy (int) – Ploidy of the species.
kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing additive genic variances for each trait.

Return type:

numpy.ndarray

abstract var_a_numpy(p, ploidy, **kwargs)#

Calculate the population additive genic variance.

Parameters:

p (numpy.ndarray) – A vector of genotype allele frequencies of shape (p,).
ploidy (int) – Ploidy of the species.
kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing additive genic variances for each trait.

Return type:

numpy.ndarray
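Genic variance depends only on allele frequencies and allele effects, which is why this method takes `p` rather than a genotype matrix. A minimal sketch, assuming the textbook form under Hardy-Weinberg and linkage equilibrium, ploidy × Σ p(1 − p)u². The effect matrix `u`, the helper name, and the ploidy scaling are assumptions; this page does not state the formula.

```python
import numpy as np


def genic_variance(p, u, ploidy):
    """Hypothetical additive genic variance: ploidy * sum_j p_j (1 - p_j) u_j**2.

    p : allele frequencies, shape (p,)
    u : additive allele effects, shape (p,t)
    Returns an array of shape (t,).
    """
    p = np.asarray(p, dtype=float)
    # Loci with p == 0 or p == 1 contribute nothing, whatever their effect.
    return ploidy * ((p * (1.0 - p))[:, None] * u ** 2).sum(axis=0)


p = np.array([0.5, 0.0, 1.0])
u = np.array([[1.0, 2.0], [3.0, 3.0], [4.0, 4.0]])
var_a = genic_variance(p, u, ploidy=2)
```

Only the first locus is segregating in this example, so it alone determines the result.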