NonlinearGenomicModel

class pybrops.model.gmod.NonlinearGenomicModel.NonlinearGenomicModel

Bases: GenomicModel

An abstract class for non-linear genomic models.

The purpose of this abstract interface is to provide a means through which non-linear models (e.g., neural networks) may be incorporated into PyBrOpS.
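As a rough illustration of the pattern, the sketch below defines a small stand-in abstract base class and a toy non-linear subclass. The names ToyGenomicModelBase, ToyTanhModel, and the weight matrix W are hypothetical and invented for this example. The real interface declares many more abstract members and is not imported here.

```python
from abc import ABC, abstractmethod

import numpy as np


class ToyGenomicModelBase(ABC):
    """Hypothetical stand-in for an abstract genomic model interface."""

    @abstractmethod
    def gegv_numpy(self, Z, **kwargs):
        """Return an (n,t) array of genotypic value estimates for (n,p) genotypes."""


class ToyTanhModel(ToyGenomicModelBase):
    """Toy non-linear model: a single tanh layer applied to the genotypes."""

    def __init__(self, W):
        # (p,t) weights, assumed to have been fitted elsewhere
        self.W = np.asarray(W, dtype=float)

    def gegv_numpy(self, Z, **kwargs):
        # non-linear transform of a linear combination of marker dosages
        return np.tanh(np.asarray(Z, dtype=float) @ self.W)


Z = np.array([[0, 1, 2], [2, 1, 0]])          # (n=2, p=3) genotype dosages
model = ToyTanhModel(W=np.full((3, 2), 0.1))  # (p=3, t=2) weights
values = model.gegv_numpy(Z)                  # (n=2, t=2) estimates
```

Instantiating ToyGenomicModelBase directly raises a TypeError because its abstract method is not implemented, which is the mechanism that forces concrete models to supply every required member.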

Methods

bulmer

Calculate the Bulmer effect.

bulmer_numpy

Calculate the Bulmer effect.

copy

Make a shallow copy of the GenomicModel.

daavail

Determine whether a deleterious allele is available in the present taxa.

dacount

Calculate the deleterious allele count across all taxa.

dafixed

Determine whether a deleterious allele is fixed across all taxa.

dafreq

Calculate the deleterious allele frequency across all taxa.

dapoly

Determine whether a deleterious allele is polymorphic across all taxa.

deepcopy

Make a deep copy of the GenomicModel.

faavail

Determine whether a favorable allele is polymorphic or fixed across all taxa.

facount

Calculate the favorable allele count across all taxa.

fafixed

Determine whether a favorable allele is fixed across all taxa.

fafreq

Calculate the favorable allele frequency across all taxa.

fapoly

Determine whether a favorable allele is polymorphic across all taxa.

fit

Fit the model.

fit_numpy

Fit the model.

from_hdf5

Read an object from an HDF5 file.

gebv

Calculate genomic estimated breeding values.

gebv_numpy

Calculate genomic estimated breeding values.

gegv

Calculate genomic estimated genotypic values.

gegv_numpy

Calculate genomic estimated genotypic values.

lsl

Calculate the lower selection limit for a population.

lsl_numpy

Calculate the lower selection limit for a population.

nafixed

Determine whether a neutral allele is fixed across all taxa.

napoly

Determine whether a neutral allele is polymorphic across all taxa.

predict

Predict breeding values.

predict_numpy

Predict breeding values.

score

Return the coefficient of determination R**2 of the prediction.

score_numpy

Return the coefficient of determination R**2 of the prediction.

to_hdf5

Write an object to an HDF5 file.

usl

Calculate the upper selection limit for a population.

usl_numpy

Calculate the upper selection limit for a population.

var_A

Calculate the population additive genetic variance.

var_A_numpy

Calculate the population additive genetic variance.

var_G

Calculate the population genetic variance.

var_G_numpy

Calculate the population genetic variance.

var_a

Calculate the population additive genic variance.

var_a_numpy

Calculate the population additive genic variance.

Attributes

hyperparams

Model parameters.

model_name

Name of the model.

nexplan

Number of explanatory variables required by the model.

nparam

Number of model parameters.

ntrait

Number of traits predicted by the model.

trait

Names of the traits predicted by the model.

abstract bulmer(gtobj, ploidy, **kwargs)

Calculate the Bulmer effect.

Parameters:
  • gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.

  • ploidy (int) – Ploidy of the species.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing Bulmer effects for each trait. If the additive genic variance is zero, NaN values are produced.

Return type:

numpy.ndarray

abstract bulmer_numpy(Z, p, ploidy, **kwargs)

Calculate the Bulmer effect.

Parameters:
  • Z (numpy.ndarray) – A matrix of genotypes.

  • p (numpy.ndarray) – A vector of genotype allele frequencies of shape (p,).

  • ploidy (int) – Ploidy of the species.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing Bulmer effects for each trait. If the additive genic variance is zero, NaN values are produced.

Return type:

numpy.ndarray
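This page does not state the formula for the Bulmer effect. The sketch below assumes it is reported as the ratio of additive genetic variance to additive genic variance, which is consistent with the note about NaN values when the genic variance is zero. The helper name bulmer_ratio and its inputs are hypothetical.

```python
import numpy as np


def bulmer_ratio(var_A, var_a):
    """Ratio of additive genetic to additive genic variance, per trait.

    Assumption: this ratio is what is meant by the Bulmer effect here.
    Traits with zero genic variance yield NaN.
    """
    var_A = np.asarray(var_A, dtype=float)
    var_a = np.asarray(var_a, dtype=float)
    out = np.full(var_A.shape, np.nan)
    # divide only where the denominator is non-zero; NaN is kept elsewhere
    np.divide(var_A, var_a, out=out, where=(var_a != 0.0))
    return out


ratio = bulmer_ratio([0.8, 0.0], [1.0, 0.0])  # second trait has no genic variance
```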

abstract copy()

Make a shallow copy of the GenomicModel.

Returns:

out – A shallow copy of the original GenomicModel.

Return type:

GenomicModel

abstract daavail(gmat, dtype=None, **kwargs)

Determine whether a deleterious allele is available in the present taxa.

An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.

Parameters:
  • gmat (GenotypeMatrix) – Genotype matrix for which to determine deleterious allele availability.

  • dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native boolean type.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) indicating whether a deleterious allele is available.

Where:

  • p is the number of alleles.

  • t is the number of traits.

Return type:

numpy.ndarray
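The sign convention used throughout this page (deleterious below zero, neutral at zero, favorable above zero) can be sketched with boolean masks. The example assumes that a (p,t) matrix of allele effects, here called u, is available. That is an assumption made for illustration, since a non-linear model need not expose such a matrix.

```python
import numpy as np

# hypothetical (p=3, t=2) allele effects; not taken from any real model
u = np.array([[-0.5,  0.2],
              [ 0.0,  0.0],
              [ 0.3, -0.1]])

deleterious = u < 0.0   # effect strictly below zero
neutral = u == 0.0      # zero effect is neutral, not deleterious or favorable
favorable = u > 0.0     # effect strictly above zero
```

Every entry falls into exactly one of the three masks, so the classes partition the (p,t) grid of allele and trait combinations.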

abstract dacount(gmat, dtype, **kwargs)

Calculate the deleterious allele count across all taxa.

An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.

Parameters:
  • gmat (GenotypeMatrix) – Genotype matrix for which to count deleterious alleles.

  • dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing allele counts of the deleterious allele.

Where:

  • p is the number of alleles.

  • t is the number of traits.

Return type:

numpy.ndarray

abstract dafixed(gmat, dtype, **kwargs)

Determine whether a deleterious allele is fixed across all taxa.

An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.

Parameters:
  • gmat (GenotypeMatrix) – Genotype matrix for which to determine deleterious allele fixation.

  • dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) indicating whether a deleterious allele is fixed.

Where:

  • p is the number of alleles.

  • t is the number of traits.

Return type:

numpy.ndarray

abstract dafreq(gmat, dtype, **kwargs)

Calculate the deleterious allele frequency across all taxa.

An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.

Parameters:
  • gmat (GenotypeMatrix) – Genotype matrix for which to determine deleterious allele frequencies.

  • dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing allele frequencies of the deleterious allele.

Where:

  • p is the number of alleles.

  • t is the number of traits.

Return type:

numpy.ndarray

abstract dapoly(gmat, dtype, **kwargs)

Determine whether a deleterious allele is polymorphic across all taxa.

An allele is considered deleterious if its effect is less than zero. Alleles with zero effect are not considered deleterious; they are considered neutral.

Parameters:
  • gmat (GenotypeMatrix) – Genotype matrix for which to determine deleterious allele polymorphism.

  • dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) indicating whether a deleterious allele is polymorphic.

Where:

  • p is the number of alleles.

  • t is the number of traits.

Return type:

numpy.ndarray

abstract deepcopy(memo)

Make a deep copy of the GenomicModel.

Parameters:

memo (dict) – Dictionary of memo metadata.

Returns:

out – A deep copy of the original GenomicModel.

Return type:

GenomicModel

abstract faavail(gmat, dtype=None, **kwargs)

Determine whether a favorable allele is polymorphic or fixed across all taxa.

An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.

Parameters:
  • gmat (GenotypeMatrix) – Genotype matrix for which to determine favorable allele availability.

  • dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) indicating whether a favorable allele is available.

Where:

  • p is the number of alleles.

  • t is the number of traits.

Return type:

numpy.ndarray
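Availability is described above as "polymorphic or fixed". The sketch below shows how the three states relate, assuming a (p,t) array of favorable allele frequencies is already in hand. The array fafreq and its values are hypothetical.

```python
import numpy as np

# hypothetical (p=2, t=2) favorable allele frequencies
fafreq = np.array([[0.00, 1.0],
                   [0.25, 0.5]])

fixed = fafreq >= 1.0                    # every allele copy is the favorable one
poly = (fafreq > 0.0) & (fafreq < 1.0)   # both alleles still segregate
avail = poly | fixed                     # present at any non-zero frequency
```

Under these definitions, an allele is available exactly when its frequency is greater than zero.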

abstract facount(gmat, dtype, **kwargs)

Calculate the favorable allele count across all taxa.

An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.

Parameters:
  • gmat (GenotypeMatrix) – Genotype matrix for which to count favorable alleles.

  • dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing allele counts of the favorable allele.

Where:

  • p is the number of alleles.

  • t is the number of traits.

Return type:

numpy.ndarray

abstract fafixed(gmat, dtype, **kwargs)

Determine whether a favorable allele is fixed across all taxa.

An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.

Parameters:
  • gmat (GenotypeMatrix) – Genotype matrix for which to determine favorable allele fixation.

  • dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) indicating whether a favorable allele is fixed.

Where:

  • p is the number of alleles.

  • t is the number of traits.

Return type:

numpy.ndarray

abstract fafreq(gmat, dtype, **kwargs)

Calculate the favorable allele frequency across all taxa.

An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.

Parameters:
  • gmat (GenotypeMatrix) – Genotype matrix for which to determine favorable allele frequencies.

  • dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) containing allele frequencies of the favorable allele.

Where:

  • p is the number of alleles.

  • t is the number of traits.

Return type:

numpy.ndarray

abstract fapoly(gmat, dtype, **kwargs)

Determine whether a favorable allele is polymorphic across all taxa.

An allele is considered favorable if its effect is greater than zero. Alleles with zero effect are not considered favorable; they are considered neutral.

Parameters:
  • gmat (GenotypeMatrix) – Genotype matrix for which to determine favorable allele polymorphism.

  • dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) indicating whether a favorable allele is polymorphic.

Where:

  • p is the number of alleles.

  • t is the number of traits.

Return type:

numpy.ndarray

abstract classmethod fit(ptobj, cvobj, gtobj, **kwargs)

Fit the model.

Parameters:
  • ptobj (BreedingValueMatrix, pandas.DataFrame, numpy.ndarray) – An object containing phenotype data. Must be a matrix of breeding values or a phenotype data frame.

  • cvobj (numpy.ndarray) – An object containing covariate data.

  • gtobj (GenotypeMatrix, numpy.ndarray) – An object containing genotype data. Must be a matrix of genotype values.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract classmethod fit_numpy(Y, X, Z, **kwargs)

Fit the model.

Parameters:
  • Y (numpy.ndarray) – A phenotype matrix of shape (n,t).

  • X (numpy.ndarray) – A covariate matrix of shape (n,q).

  • Z (numpy.ndarray) – A genotype matrix of shape (n,p).

  • kwargs (dict) – Additional keyword arguments.

Return type:

None
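The three inputs share their first dimension: n observations, with t traits in Y, q covariates in X, and p markers in Z. A concrete implementation might validate this before fitting. The helper below, check_fit_shapes, is a hypothetical sketch and not part of the documented interface.

```python
import numpy as np


def check_fit_shapes(Y, X, Z):
    """Validate the (n,t), (n,q), (n,p) shape contract and return (n, t, q, p)."""
    Y, X, Z = (np.asarray(a) for a in (Y, X, Z))
    for name, a in (("Y", Y), ("X", X), ("Z", Z)):
        if a.ndim != 2:
            raise ValueError(f"{name} must be 2-dimensional, got ndim={a.ndim}")
    n = Y.shape[0]
    if X.shape[0] != n or Z.shape[0] != n:
        raise ValueError("Y, X, and Z must share the same number of rows (n)")
    return n, Y.shape[1], X.shape[1], Z.shape[1]


# 5 observations, 2 traits, 1 covariate (e.g. an intercept), 10 markers
n, t, q, p = check_fit_shapes(np.zeros((5, 2)), np.ones((5, 1)), np.zeros((5, 10)))
```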

abstract classmethod from_hdf5(filename, groupname)

Read an object from an HDF5 file.

Parameters:
  • filename (str, Path, h5py.File) – If str, an HDF5 file name from which to read. If Path, an HDF5 file path from which to read. If h5py.File, an opened HDF5 file from which to read.

  • groupname (str, None) – If str, an HDF5 group name under which object data is stored. If None, object is read from base HDF5 group.

Returns:

out – An object read from an HDF5 file.

Return type:

HDF5InputOutput

abstract gebv(gtobj, **kwargs)

Calculate genomic estimated breeding values.

Remark: The difference between ‘predict’ and ‘gebv’ is that ‘predict’ can incorporate other factors (e.g., fixed effects) to provide prediction estimates.

Parameters:
  • gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.

  • kwargs (dict) – Additional keyword arguments.

Returns:

gebvmat_hat – Genomic estimated breeding values.

Return type:

BreedingValueMatrix

abstract gebv_numpy(Z, **kwargs)

Calculate genomic estimated breeding values.

Remark: The difference between ‘predict_numpy’ and ‘gebv_numpy’ is that ‘predict_numpy’ can incorporate other factors (e.g., fixed effects) to provide prediction estimates.

Parameters:
  • Z (numpy.ndarray) – A matrix of genotype values.

  • kwargs (dict) – Additional keyword arguments.

Returns:

gebv_hat – A matrix of genomic estimated breeding values.

Return type:

numpy.ndarray

abstract gegv(gtobj, **kwargs)

Calculate genomic estimated genotypic values.

Parameters:
  • gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A matrix of genomic estimated genotypic values.

Return type:

numpy.ndarray

abstract gegv_numpy(Z, **kwargs)

Calculate genomic estimated genotypic values.

Parameters:
  • Z (numpy.ndarray) – A matrix of genotypic markers.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A matrix of genomic estimated genotypic values.

Return type:

numpy.ndarray

abstract property hyperparams: dict

Model parameters.

abstract lsl(gtobj, ploidy, unscale, **kwargs)

Calculate the lower selection limit for a population.

Parameters:
  • gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.

  • ploidy (int) – Ploidy of the species.

  • unscale (bool) – If True, then apply the mean of the fixed effects to the output.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – An array of shape (t,) containing lower selection limits for each of t traits.

Return type:

numpy.ndarray

abstract lsl_numpy(p, ploidy, unscale, **kwargs)

Calculate the lower selection limit for a population.

Parameters:
  • p (numpy.ndarray) – A vector of genotype allele frequencies of shape (p,).

  • ploidy (int) – Ploidy of the species.

  • unscale (bool) – If True, then apply the mean of the fixed effects to the output.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – An array of shape (t,) containing lower selection limits for each of t traits.

Return type:

numpy.ndarray

abstract property model_name: str

Name of the model.

abstract nafixed(gmat, dtype, **kwargs)

Determine whether a neutral allele is fixed across all taxa.

An allele is considered neutral if its effect is equal to zero.

Parameters:
  • gmat (GenotypeMatrix) – Genotype matrix for which to determine neutral allele fixation.

  • dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) indicating whether a neutral allele is fixed.

Where:

  • p is the number of alleles.

  • t is the number of traits.

Return type:

numpy.ndarray

abstract napoly(gmat, dtype, **kwargs)

Determine whether a neutral allele is polymorphic across all taxa.

An allele is considered neutral if its effect is equal to zero.

Parameters:
  • gmat (GenotypeMatrix) – Genotype matrix for which to determine neutral allele polymorphism.

  • dtype (numpy.dtype, None) – Datatype of the returned array. If None, use the native type.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A numpy.ndarray of shape (p,t) indicating whether a neutral allele is polymorphic.

Where:

  • p is the number of alleles.

  • t is the number of traits.

Return type:

numpy.ndarray

abstract property nexplan: Integral

Number of explanatory variables required by the model.

abstract property nparam: Integral

Number of model parameters.

abstract property ntrait: int

Number of traits predicted by the model.

abstract predict(cvobj, gtobj, **kwargs)

Predict breeding values.

Remark: The difference between ‘predict’ and ‘gebv’ is that ‘predict’ can incorporate other factors (e.g., fixed effects) to provide prediction estimates.

Parameters:
  • cvobj (numpy.ndarray) – An object containing covariate data.

  • gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.

  • kwargs (dict) – Additional keyword arguments.

Returns:

bvmat_hat – Estimated breeding values.

Return type:

BreedingValueMatrix

abstract predict_numpy(X, Z, **kwargs)

Predict breeding values.

Remark: The difference between ‘predict_numpy’ and ‘gebv_numpy’ is that ‘predict_numpy’ can incorporate other factors (e.g., fixed effects) to provide prediction estimates.

Parameters:
  • X (numpy.ndarray) – A matrix of covariates.

  • Z (numpy.ndarray) – A matrix of genotype values.

  • kwargs (dict) – Additional keyword arguments.

Returns:

Y_hat – A matrix of predicted breeding values.

Return type:

numpy.ndarray
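To make the remark about fixed effects concrete, the sketch below contrasts the two calls using a deliberately simple linear predictor. This is only for illustration: a non-linear model would replace both expressions with its own function. The coefficient arrays beta and u and the two toy functions are hypothetical.

```python
import numpy as np

beta = np.array([[10.0]])        # hypothetical (q=1, t=1) fixed-effect coefficients
u = np.array([[0.5], [-0.25]])   # hypothetical (p=2, t=1) marker effects


def toy_gebv_numpy(Z):
    """Genetic component only: no covariates enter the estimate."""
    return Z @ u


def toy_predict_numpy(X, Z):
    """Full prediction: covariate (fixed-effect) part plus genetic part."""
    return X @ beta + Z @ u


X = np.ones((3, 1))                                   # intercept-only covariates
Z = np.array([[0.0, 2.0], [1.0, 1.0], [2.0, 0.0]])    # (n=3, p=2) genotypes

gebv_hat = toy_gebv_numpy(Z)
Y_hat = toy_predict_numpy(X, Z)
```

In this toy setup the two results differ by exactly the covariate contribution X @ beta.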

abstract score(ptobj, cvobj, gtobj, **kwargs)

Return the coefficient of determination R**2 of the prediction.

Parameters:
  • ptobj (BreedingValueMatrix or pandas.DataFrame) – An object containing phenotype data. Must be a matrix of breeding values or a phenotype data frame.

  • cvobj (object) – An object containing covariate data.

  • gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.

  • kwargs (dict) – Additional keyword arguments.

Returns:

Rsq – A coefficient of determination array of shape (t,).

Where:

  • t is the number of traits.

Return type:

numpy.ndarray

abstract score_numpy(Y, X, Z, **kwargs)

Return the coefficient of determination R**2 of the prediction.

Parameters:
  • Y (numpy.ndarray) – A matrix of phenotypes.

  • X (numpy.ndarray) – A matrix of covariates.

  • Z (numpy.ndarray) – A matrix of genotypes.

  • kwargs (dict) – Additional keyword arguments.

Returns:

Rsq – A coefficient of determination array of shape (t,).

Where:

  • t is the number of traits.

Return type:

numpy.ndarray
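A per-trait coefficient of determination can be sketched with the standard definition, one minus the residual sum of squares over the total sum of squares, computed column by column. The helper r_squared is hypothetical and takes predictions directly rather than calling a model.

```python
import numpy as np


def r_squared(Y, Y_hat):
    """Coefficient of determination for each of t traits; returns shape (t,)."""
    Y = np.asarray(Y, dtype=float)
    Y_hat = np.asarray(Y_hat, dtype=float)
    ss_res = ((Y - Y_hat) ** 2).sum(axis=0)           # residual sum of squares
    ss_tot = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)  # total sum of squares
    return 1.0 - ss_res / ss_tot


Y = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])   # (n=3, t=2) observations
Rsq_perfect = r_squared(Y, Y)                          # predictions equal observations
Rsq_mean = r_squared(Y, np.tile(Y.mean(axis=0), (3, 1)))  # predicting the trait mean
```

A perfect prediction scores 1 for every trait, while always predicting the trait mean scores 0.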

abstract to_hdf5(filename, groupname, overwrite)

Write an object to an HDF5 file.

Parameters:
  • filename (str, Path, h5py.File) – If str, an HDF5 file name to which to write. If Path, an HDF5 file path to which to write. If h5py.File, an opened HDF5 file to which to write.

  • groupname (str, None) – If str, an HDF5 group name under which object data is stored. If None, object is written to the base HDF5 group.

  • overwrite (bool) – Whether to overwrite values in an HDF5 file if a field already exists.

Return type:

None

abstract property trait: ndarray

Names of the traits predicted by the model.

abstract usl(gtobj, ploidy, unscale, **kwargs)

Calculate the upper selection limit for a population.

Parameters:
  • gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.

  • ploidy (int) – Ploidy of the species.

  • unscale (bool) – If True, then apply the mean of the fixed effects to the output.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – An array of shape (t,) containing upper selection limits for each of t traits.

Return type:

numpy.ndarray

abstract usl_numpy(p, ploidy, unscale, **kwargs)

Calculate the upper selection limit for a population.

Parameters:
  • p (numpy.ndarray) – A vector of genotype allele frequencies of shape (p,).

  • ploidy (int) – Ploidy of the species.

  • unscale (bool) – If True, then apply the mean of the fixed effects to the output.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – An array of shape (t,) containing upper selection limits for each of t traits.

Return type:

numpy.ndarray
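This page does not say how a selection limit is derived. One common reading, sketched below under a purely additive assumption, is the value of the best (or worst) genotype that can still be assembled from alleles present in the population. The function selection_limits, the effect matrix u, and the omission of the unscale step are all assumptions made for illustration. A non-linear model would likely need a different approach.

```python
import numpy as np


def selection_limits(u, p, ploidy):
    """Lower and upper selection limits per trait under an additive assumption.

    u : (p,t) effects of the counted allele (hypothetical).
    p : (p,) frequencies of the counted allele.
    """
    u = np.asarray(u, dtype=float)
    p = np.asarray(p, dtype=float)[:, None]   # (p,1) for broadcasting
    present = p > 0.0     # the counted allele exists, so it could be fixed
    stuck = p >= 1.0      # the counted allele is fixed and cannot be removed
    # best genotype: carry favorable alleles when present, others only if stuck
    best = np.where(u > 0.0, present, stuck)
    # worst genotype: carry deleterious alleles when present, others only if stuck
    worst = np.where(u < 0.0, present, stuck)
    usl = ploidy * (best * u).sum(axis=0)
    lsl = ploidy * (worst * u).sum(axis=0)
    return lsl, usl


lsl, usl = selection_limits(u=[[0.5], [-0.2], [0.3]], p=[0.5, 1.0, 0.0], ploidy=2)
```

In the example, the third locus contributes nothing to either limit because its counted allele is absent, and the second locus lowers both limits because its deleterious allele is already fixed.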

abstract var_A(gtobj, **kwargs)

Calculate the population additive genetic variance.

Parameters:
  • gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing additive genetic variances for each trait.

Return type:

numpy.ndarray

abstract var_A_numpy(Z, **kwargs)

Calculate the population additive genetic variance.

Parameters:
  • Z (numpy.ndarray) – A matrix of genotypes.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing additive genetic variances for each trait.

Return type:

numpy.ndarray
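One plausible way to obtain a per-trait additive genetic variance is to take the variance of estimated breeding values across individuals. The sketch below assumes that approach and a population variance (ddof=0). Neither choice is confirmed by this page, and the gebv array is hypothetical.

```python
import numpy as np

# hypothetical (n=3, t=2) genomic estimated breeding values
gebv = np.array([[1.0, 0.0],
                 [3.0, 0.0],
                 [5.0, 0.0]])

# variance across individuals (rows), one value per trait; ddof=0 is assumed
var_A = gebv.var(axis=0)
```

The second trait has identical breeding values for every individual, so its variance is zero.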

abstract var_G(gtobj, **kwargs)

Calculate the population genetic variance.

Parameters:
  • gtobj (GenotypeMatrix) – An object containing genotype data. Must be a matrix of genotype values.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing genetic variances for each trait.

Return type:

numpy.ndarray

abstract var_G_numpy(Z, **kwargs)

Calculate the population genetic variance.

Parameters:
  • Z (numpy.ndarray) – A matrix of genotypes.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing genetic variances for each trait.

Return type:

numpy.ndarray

abstract var_a(gtobj, ploidy, **kwargs)

Calculate the population additive genic variance.

Parameters:
  • gtobj (GenotypeMatrix, numpy.ndarray) – An object containing genotype data. Must be a matrix of genotype values.

  • ploidy (int) – Ploidy of the species.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing additive genic variances for each trait.

Return type:

numpy.ndarray

abstract var_a_numpy(p, ploidy, **kwargs)

Calculate the population additive genic variance.

Parameters:
  • p (numpy.ndarray) – A vector of genotype allele frequencies of shape (p,).

  • ploidy (int) – Ploidy of the species.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – Array of shape (t,) containing additive genic variances for each trait.

Return type:

numpy.ndarray
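For reference, the textbook additive genic variance sums ploidy × p × (1 − p) × u² over loci, where u is the effect of the counted allele. It assumes additive effects with Hardy-Weinberg and linkage equilibrium. The sketch below uses that expression. Whether an implementation of this interface follows it is not stated here, and the effect matrix u is hypothetical.

```python
import numpy as np


def additive_genic_variance(u, p, ploidy):
    """Textbook additive genic variance per trait; returns shape (t,).

    u : (p,t) additive effects of the counted allele (hypothetical).
    p : (p,) frequencies of the counted allele.
    """
    u = np.asarray(u, dtype=float)
    p = np.asarray(p, dtype=float)[:, None]   # (p,1) for broadcasting
    # binomial dosage variance at each locus, scaled by the squared effect
    return ploidy * (p * (1.0 - p) * u ** 2).sum(axis=0)


# the second locus is monomorphic (p = 0) and so contributes no variance
var_a = additive_genic_variance(u=[[1.0], [2.0]], p=[0.5, 0.0], ploidy=2)
```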