DenseGenotypeMatrix#
- class pybrops.popgen.gmat.DenseGenotypeMatrix.DenseGenotypeMatrix(mat, taxa=None, taxa_grp=None, vrnt_chrgrp=None, vrnt_phypos=None, vrnt_name=None, vrnt_genpos=None, vrnt_xoprob=None, vrnt_hapgrp=None, vrnt_hapalt=None, vrnt_hapref=None, vrnt_mask=None, ploidy=2, **kwargs)[source]#
Bases: DenseTaxaVariantMatrix, DenseGeneticMappableMatrix, GenotypeMatrix
A concrete class for unphased genotype matrix objects.
- The purpose of this concrete class is to implement functionality for:
Genotype matrix ploidy and phase metadata.
Genotype matrix format conversion.
Genotype matrix allele counting routines.
Genotype matrix genotype counting routines.
Loading genotype matrices from VCF and HDF5.
- Parameters:
mat (numpy.ndarray) – An int8 haplotype matrix. Must be {0,1,2} format.
taxa (numpy.ndarray, None) – A numpy.ndarray of shape (n,) containing taxa names. If None, do not store any taxa name information. Where n is the number of taxa.
taxa_grp (numpy.ndarray, None) – A numpy.ndarray of shape (n,) containing taxa groupings. If None, do not store any taxa group information. Where n is the number of taxa.
vrnt_chrgrp (numpy.ndarray, None) – A numpy.ndarray of shape (p,) containing variant chromosome group labels. If None, do not store any variant chromosome group label information. Where p is the number of variants.
vrnt_phypos (numpy.ndarray, None) – A numpy.ndarray of shape (p,) containing variant chromosome physical positions. If None, do not store any variant chromosome physical position information. Where p is the number of variants.
vrnt_name (numpy.ndarray, None) – A numpy.ndarray of shape (p,) containing variant names. If None, do not store any variant names. Where p is the number of variants.
vrnt_genpos (numpy.ndarray, None) – A numpy.ndarray of shape (p,) containing variant chromosome genetic positions. If None, do not store any variant chromosome genetic position information. Where p is the number of variants.
vrnt_xoprob (numpy.ndarray, None) – A numpy.ndarray of shape (p,) containing variant crossover probabilities. If None, do not store any variant crossover probabilities. Where p is the number of variants.
vrnt_hapgrp (numpy.ndarray, None) – A numpy.ndarray of shape (p,) containing variant haplotype group labels. If None, do not store any variant haplotype group label information. Where p is the number of variants.
vrnt_hapalt (numpy.ndarray, None) – A numpy.ndarray of shape (p,) containing variant alternative alleles. If None, do not store any variant alternative allele information. Where p is the number of variants.
vrnt_hapref (numpy.ndarray, None) – A numpy.ndarray of shape (p,) containing variant reference alleles. If None, do not store any variant reference allele information. Where p is the number of variants.
vrnt_mask (numpy.ndarray, None) – A numpy.ndarray of shape (p,) containing a variant mask. If None, do not store any variant mask information. Where p is the number of variants.
ploidy (int) – The ploidy represented by the genotype matrix. This only represents ploidy of the reproductive habit. If the organism represented is an allopolyploid (e.g. hexaploid wheat), the ploidy is 2 since it reproduces in a diploid manner.
kwargs (dict) – Additional keyword arguments.
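As a rough illustration of the constructor inputs described above, the following sketch builds arrays with the documented shapes using NumPy alone. The (n, p) layout of mat (taxa by variants), the example values, and the metadata dtypes are assumptions made for illustration. The constructor call appears only as a comment because the exact dtype requirements are not stated in this section.

```python
import numpy as np

n, p = 3, 4  # hypothetical sizes: 3 taxa, 4 variants

# Unphased genotype codes in {0, 1, 2}; assumed layout is (n taxa, p variants).
mat = np.array([[0, 1, 2, 0],
                [2, 2, 1, 0],
                [1, 0, 0, 0]], dtype=np.int8)

# Per-taxon metadata has shape (n,); per-variant metadata has shape (p,).
taxa = np.array(["line1", "line2", "line3"], dtype=object)
taxa_grp = np.array([1, 1, 2])
vrnt_chrgrp = np.array([1, 1, 2, 2])
vrnt_phypos = np.array([100, 250, 40, 900])
vrnt_name = np.array(["snp1", "snp2", "snp3", "snp4"], dtype=object)

# Basic consistency checks on shapes and genotype codes.
assert mat.shape == (n, p)
assert all(a.shape == (n,) for a in (taxa, taxa_grp))
assert all(a.shape == (p,) for a in (vrnt_chrgrp, vrnt_phypos, vrnt_name))
assert np.isin(mat, (0, 1, 2)).all()

# With pybrops installed, these arrays would be passed by keyword, e.g.
# DenseGenotypeMatrix(mat=mat, taxa=taxa, taxa_grp=taxa_grp,
#                     vrnt_chrgrp=vrnt_chrgrp, vrnt_phypos=vrnt_phypos,
#                     vrnt_name=vrnt_name, ploidy=2)
```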
Methods
Allele count of the non-zero allele across all taxa.
Add additional elements to the end of the Matrix along an axis.
Add additional elements to the end of the Matrix along the taxa axis.
Add additional elements to the end of the Matrix along the variant axis.
Determine allele fixation for loci across all taxa.
Allele frequency of the non-zero allele across all taxa.
Allele polymorphism presence or absence across all loci.
Append values to the matrix.
Append values to the Matrix along the taxa axis.
Append values to the Matrix along the variant axis.
Concatenate matrices together along an axis.
Concatenate list of Matrix together along the taxa axis.
Concatenate list of Matrix together along the variant axis.
Make a shallow copy of the DenseGenotypeMatrix.
Make a deep copy of the DenseGenotypeMatrix.
Delete sub-arrays along an axis.
Delete sub-arrays along the taxa axis.
Delete sub-arrays along the variant axis.
Read GenotypeMatrix from an HDF5 file.
Create a DenseGenotypeMatrix from a VCF file.
Sort the DenseTaxaVariantMatrix along an axis, then populate grouping indices.
Sort the Matrix along the taxa axis, then populate grouping indices for the taxa axis.
Sort the Matrix along the variant axis, then populate grouping indices for the variant axis.
Gather genotype counts for homozygous major, heterozygous, homozygous minor for all individuals.
Gather genotype frequencies for homozygous major, heterozygous, homozygous minor across all individuals.
Incorporate values along the given axis before the given indices.
Incorporate values along the taxa axis before the given indices.
Incorporate values along the variant axis before the given indices.
Insert values along the given axis before the given indices.
Insert values along the taxa axis before the given indices.
Insert values along the variant axis before the given indices.
Interpolate genetic map positions for variants using a GeneticMap.
Interpolate genetic map positions AND crossover probabilities between sequential markers using a GeneticMap and a GeneticMapFunction.
Determine whether the Matrix has been sorted and grouped.
Determine whether the Matrix has been sorted and grouped along the taxa axis.
Determine whether the Matrix has been sorted and grouped along the variant axis.
Perform an indirect stable sort using a tuple of keys.
Perform an indirect stable sort using a sequence of keys along the taxa axis.
Perform an indirect stable sort using a sequence of keys along the variant axis.
Minor allele frequency across all taxa.
Get mat in a specific format type.
Mean expected heterozygosity across all taxa.
Remove sub-arrays along an axis.
Remove sub-arrays along the taxa axis.
Remove sub-arrays along the variant axis.
Reorder the VariantMatrix.
Reorder elements of the Matrix along the taxa axis using an array of indices.
Reorder elements of the Matrix along the variant axis using an array of indices.
Select certain values from the matrix.
Select certain values from the Matrix along the taxa axis.
Select certain values from the Matrix along the variant axis.
Reset metadata for corresponding axis: name, stix, spix, len.
Sort elements of the Matrix along the taxa axis using a sequence of keys.
Sort elements of the Matrix along the variant axis using a sequence of keys.
Allele count of the non-zero allele within each taxon.
Allele frequency of the non-zero allele within each taxon.
Write GenotypeMatrix to an HDF5 file.
Ungroup the DenseTaxaVariantMatrix along an axis by removing grouping metadata.
Ungroup the DenseTaxaMatrix along the taxa axis by removing taxa group metadata.
Ungroup the DenseVariantMatrix along the variant axis by removing variant group metadata.
Attributes
Pointer to raw numpy.ndarray object.
Matrix representation format.
Number of dimensions of the raw numpy.ndarray.
Shape of the raw numpy.ndarray.
Number of chromosome phases represented by the matrix.
Number of taxa
Number of variants.
Genome ploidy number represented by matrix.
Taxa label array
Get taxa axis number
Taxa group label.
Taxa group length.
Taxa group name.
Taxa group stop index.
Taxa group start index.
Get variant axis
Variant chromosome group label.
Variant chromosome group length.
Variant chromosome group names.
Variant chromosome group stop indices.
Variant chromosome group start indices.
Variant genetic position.
Variant haplotype sequence.
Variant haplotype group label.
Variant reference haplotype sequence.
Variant mask.
Variant name.
Variant physical position.
Variant crossover sequential probability.
- __add__(value)#
Elementwise add matrices
- Parameters:
value (object) – Object which to add.
- Returns:
out – An object resulting from the addition.
- Return type:
object
- __mul__(value)#
Elementwise multiply matrices
- Parameters:
value (object) – Object which to multiply.
- Returns:
out – An object resulting from the multiplication.
- Return type:
object
- acount(dtype=None)[source]#
Allele count of the non-zero allele across all taxa.
- Parameters:
dtype (dtype, None) – The data type of the accumulator and returned array. If None, use the native integer type.
- Returns:
out – A numpy.ndarray of shape (p,) containing allele counts of the allele coded as 1 for all p loci.
- Return type:
numpy.ndarray
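The allele count can be sketched with plain NumPy as a column sum. This is an illustration of the computation rather than the library's code. It assumes the genotype matrix is laid out as (taxa, loci) and that each entry counts copies of the allele coded as 1. The function name allele_count is hypothetical.

```python
import numpy as np

# Hypothetical genotype matrix: rows are taxa, columns are loci (assumed layout).
mat = np.array([[0, 1, 2],
                [2, 1, 0],
                [1, 1, 0]], dtype=np.int8)

def allele_count(mat, dtype=None):
    """Sum genotype codes over taxa to count the allele coded as 1 per locus."""
    # With dtype=None, NumPy accumulates int8 input in its default integer type.
    return mat.sum(axis=0, dtype=dtype)

counts = allele_count(mat)
print(counts)  # [3 3 2]
```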
- adjoin(values, axis=-1, taxa=None, taxa_grp=None, vrnt_chrgrp=None, vrnt_phypos=None, vrnt_name=None, vrnt_genpos=None, vrnt_xoprob=None, vrnt_hapgrp=None, vrnt_hapalt=None, vrnt_hapref=None, vrnt_mask=None, **kwargs)#
Add additional elements to the end of the Matrix along an axis.
- Parameters:
values (DensePhasedGenotypeMatrix, numpy.ndarray) – Values to adjoin to the Matrix.
axis (int) – The axis along which values are adjoined.
taxa (numpy.ndarray) – Taxa names to adjoin to the Matrix. If values is a DenseHaplotypeMatrix that has a non-None taxa field, providing this argument overwrites the field.
taxa_grp (numpy.ndarray) – Taxa groups to adjoin to the Matrix. If values is a DenseHaplotypeMatrix that has a non-None taxa_grp field, providing this argument overwrites the field.
vrnt_chrgrp (numpy.ndarray) – Variant chromosome groups to adjoin to the Matrix. If values is a DenseHaplotypeMatrix that has a non-None vrnt_chrgrp field, providing this argument overwrites the field.
vrnt_phypos (numpy.ndarray) – Variant chromosome physical positions to adjoin to the Matrix. If values is a DenseHaplotypeMatrix that has a non-None vrnt_phypos field, providing this argument overwrites the field.
vrnt_name (numpy.ndarray) – Variant names to adjoin to the Matrix. If values is a DenseHaplotypeMatrix that has a non-None vrnt_name field, providing this argument overwrites the field.
vrnt_genpos (numpy.ndarray) – Variant chromosome genetic positions to adjoin to the Matrix. If values is a DenseHaplotypeMatrix that has a non-None vrnt_genpos field, providing this argument overwrites the field.
vrnt_xoprob (numpy.ndarray) – Sequential variant crossover probabilities to adjoin to the Matrix. If values is a DenseHaplotypeMatrix that has a non-None vrnt_xoprob field, providing this argument overwrites the field.
vrnt_hapgrp (numpy.ndarray) – Variant haplotype labels to adjoin to the Matrix. If values is a DenseHaplotypeMatrix that has a non-None vrnt_hapgrp field, providing this argument overwrites the field.
vrnt_hapalt (numpy.ndarray) – Variant alternative haplotype labels to adjoin to the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_hapalt field, providing this argument overwrites the field.
vrnt_hapref (numpy.ndarray) – Variant reference haplotype labels to adjoin to the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_hapref field, providing this argument overwrites the field.
vrnt_mask (numpy.ndarray) – Variant mask to adjoin to the Matrix. If values is a DenseHaplotypeMatrix that has a non-None vrnt_mask field, providing this argument overwrites the field.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A copy of DenseTaxaVariantMatrix with values appended to axis. Note that adjoin does not occur in-place: a new DenseTaxaVariantMatrix is allocated and filled.
- Return type:
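The out-of-place behavior described for adjoin resembles numpy.concatenate, which allocates a new array and leaves its inputs unchanged. The sketch below shows that idea for the taxa axis using NumPy only. The (taxa, variants) layout and all array values are assumptions.

```python
import numpy as np

mat = np.array([[0, 1], [2, 1]], dtype=np.int8)   # 2 taxa x 2 variants
taxa = np.array(["a", "b"], dtype=object)

new_rows = np.array([[1, 1]], dtype=np.int8)      # 1 taxon x 2 variants
new_taxa = np.array(["c"], dtype=object)

# Adjoin along the taxa axis (axis 0 in this assumed layout).
# The matrix and its per-taxon metadata are extended together.
out_mat = np.concatenate([mat, new_rows], axis=0)
out_taxa = np.concatenate([taxa, new_taxa])

# The originals are untouched: new arrays were allocated and filled.
print(mat.shape, out_mat.shape)  # (2, 2) (3, 2)
```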
- adjoin_taxa(values, taxa=None, taxa_grp=None, **kwargs)[source]#
Add additional elements to the end of the Matrix along the taxa axis.
- Parameters:
values (Matrix, numpy.ndarray) – Values to adjoin to the Matrix.
taxa (numpy.ndarray) – Taxa names to adjoin to the Matrix. If values is a DenseHaplotypeMatrix that has a non-None taxa field, providing this argument overwrites the field.
taxa_grp (numpy.ndarray) – Taxa groups to adjoin to the Matrix. If values is a DenseHaplotypeMatrix that has a non-None taxa_grp field, providing this argument overwrites the field.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A copy of mat with values appended to axis. Note that adjoin does not occur in-place: a new Matrix is allocated and filled.
- Return type:
- adjoin_vrnt(values, vrnt_chrgrp=None, vrnt_phypos=None, vrnt_name=None, vrnt_genpos=None, vrnt_xoprob=None, vrnt_hapgrp=None, vrnt_hapalt=None, vrnt_hapref=None, vrnt_mask=None, **kwargs)[source]#
Add additional elements to the end of the Matrix along the variant axis.
- Parameters:
values (Matrix, numpy.ndarray) – Values to adjoin to the Matrix.
vrnt_chrgrp (numpy.ndarray) – Variant chromosome groups to adjoin to the Matrix.
vrnt_phypos (numpy.ndarray) – Variant chromosome physical positions to adjoin to the Matrix.
vrnt_name (numpy.ndarray) – Variant names to adjoin to the Matrix.
vrnt_genpos (numpy.ndarray) – Variant chromosome genetic positions to adjoin to the Matrix.
vrnt_xoprob (numpy.ndarray) – Sequential variant crossover probabilities to adjoin to the Matrix.
vrnt_hapgrp (numpy.ndarray) – Variant haplotype labels to adjoin to the Matrix.
vrnt_hapalt (numpy.ndarray) – Variant haplotype sequence.
vrnt_hapref (numpy.ndarray) – Variant haplotype reference sequence.
vrnt_mask (numpy.ndarray) – Variant mask to adjoin to the Matrix.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A copy of mat with values appended to axis. Note that adjoin does not occur in-place: a new Matrix is allocated and filled.
- Return type:
- afixed(dtype=None)[source]#
Determine allele fixation for loci across all taxa.
- Parameters:
dtype (dtype, None) – The data type of the returned array. If None, use the native type.
- Returns:
out – A numpy.ndarray of shape (p,) containing indicator variables for whether each locus is fixed.
- Return type:
numpy.ndarray
- afreq(dtype=None)[source]#
Allele frequency of the non-zero allele across all taxa.
- Parameters:
dtype (dtype, None) – The data type of the returned array. If None, use the native type.
- Returns:
out – A numpy.ndarray of shape (p,) containing allele frequencies of the allele coded as 1 for all p loci.
- Return type:
numpy.ndarray
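A minimal sketch of the allele frequency computation follows. It assumes a (taxa, loci) layout and that the frequency is the allele count divided by the total number of allele copies, ploidy times the number of taxa. The function name allele_freq is hypothetical, and the library's own implementation may differ.

```python
import numpy as np

mat = np.array([[0, 1, 2],
                [2, 1, 0],
                [1, 1, 0]], dtype=np.int8)
ploidy = 2

def allele_freq(mat, ploidy, dtype=None):
    """Frequency of the allele coded as 1 at each locus."""
    n_taxa = mat.shape[0]
    # Allele count per locus divided by the number of allele copies sampled.
    freq = mat.sum(axis=0) / (ploidy * n_taxa)
    return freq if dtype is None else freq.astype(dtype)

freq = allele_freq(mat, ploidy)  # 3/6, 3/6, 2/6
```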
- apoly(dtype=None)[source]#
Allele polymorphism presence or absence across all loci.
- Parameters:
dtype (dtype, None) – The data type of the returned array. If None, use the native type.
- Returns:
out – A numpy.ndarray of shape (p,) containing indicator variables for whether the locus is polymorphic.
- Return type:
numpy.ndarray
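Fixation and polymorphism indicators can be sketched as complementary masks over loci. The definition used here is an assumption: a locus counts as fixed when the allele coded as 1 is either absent or present in every allele copy, and as polymorphic otherwise. The library may apply a different rule, for example for loci where every taxon is heterozygous.

```python
import numpy as np

mat = np.array([[0, 2, 1, 1],
                [0, 2, 2, 1],
                [0, 2, 0, 1]], dtype=np.int8)
ploidy = 2

# Integer allele counts avoid floating point comparisons.
acount = mat.sum(axis=0)
total = ploidy * mat.shape[0]

fixed = (acount == 0) | (acount == total)  # all copies are one allele
poly = ~fixed                              # both alleles are present

print(fixed)  # [ True  True False False]
```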
- append(values, axis=-1, taxa=None, taxa_grp=None, vrnt_chrgrp=None, vrnt_phypos=None, vrnt_name=None, vrnt_genpos=None, vrnt_xoprob=None, vrnt_hapgrp=None, vrnt_hapalt=None, vrnt_hapref=None, vrnt_mask=None, **kwargs)#
Append values to the matrix.
- Parameters:
values (DenseHaplotypeMatrix, numpy.ndarray) – Values to append to the matrix. Must be of type int8. Must be of shape (m, n, p).
axis (int) – The axis along which values are appended.
taxa (numpy.ndarray) – Taxa names to append to the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None taxa field, providing this argument overwrites the field.
taxa_grp (numpy.ndarray) – Taxa groups to append to the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None taxa_grp field, providing this argument overwrites the field.
vrnt_chrgrp (numpy.ndarray) – Variant chromosome groups to append to the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_chrgrp field, providing this argument overwrites the field.
vrnt_phypos (numpy.ndarray) – Variant chromosome physical positions to append to the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_phypos field, providing this argument overwrites the field.
vrnt_name (numpy.ndarray) – Variant names to append to the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_name field, providing this argument overwrites the field.
vrnt_genpos (numpy.ndarray) – Variant chromosome genetic positions to append to the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_genpos field, providing this argument overwrites the field.
vrnt_xoprob (numpy.ndarray) – Sequential variant crossover probabilities to append to the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_xoprob field, providing this argument overwrites the field.
vrnt_hapgrp (numpy.ndarray) – Variant haplotype labels to append to the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_hapgrp field, providing this argument overwrites the field.
vrnt_hapalt (numpy.ndarray) – Variant alternative haplotype labels to append to the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_hapalt field, providing this argument overwrites the field.
vrnt_hapref (numpy.ndarray) – Variant reference haplotype labels to append to the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_hapref field, providing this argument overwrites the field.
vrnt_mask (numpy.ndarray) – Variant mask to append to the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_mask field, providing this argument overwrites the field.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- append_taxa(values, taxa=None, taxa_grp=None, **kwargs)#
Append values to the Matrix along the taxa axis.
- Parameters:
values (Matrix, numpy.ndarray) – Values to append to the matrix.
taxa (numpy.ndarray) – Taxa names to append to the Matrix.
taxa_grp (numpy.ndarray) – Taxa groups to append to the Matrix.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- append_vrnt(values, vrnt_chrgrp=None, vrnt_phypos=None, vrnt_name=None, vrnt_genpos=None, vrnt_xoprob=None, vrnt_hapgrp=None, vrnt_hapalt=None, vrnt_hapref=None, vrnt_mask=None, **kwargs)#
Append values to the Matrix along the variant axis.
- Parameters:
values (Matrix, numpy.ndarray) – Values to append to the matrix.
vrnt_chrgrp (numpy.ndarray) – Variant chromosome groups to append to the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_chrgrp field, providing this argument overwrites the field.
vrnt_phypos (numpy.ndarray) – Variant chromosome physical positions to append to the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_phypos field, providing this argument overwrites the field.
vrnt_name (numpy.ndarray) – Variant names to append to the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_name field, providing this argument overwrites the field.
vrnt_genpos (numpy.ndarray) – Variant chromosome genetic positions to append to the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_genpos field, providing this argument overwrites the field.
vrnt_xoprob (numpy.ndarray) – Sequential variant crossover probabilities to append to the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_xoprob field, providing this argument overwrites the field.
vrnt_hapgrp (numpy.ndarray) – Variant haplotype labels to append to the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_hapgrp field, providing this argument overwrites the field.
vrnt_hapalt (numpy.ndarray) – Variant alternative haplotype labels to append to the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_hapalt field, providing this argument overwrites the field.
vrnt_hapref (numpy.ndarray) – Variant reference haplotype labels to append to the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_hapref field, providing this argument overwrites the field.
vrnt_mask (numpy.ndarray) – Variant mask to append to the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_mask field, providing this argument overwrites the field.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- classmethod concat(mats, axis=-1, **kwargs)#
Concatenate matrices together along an axis.
- Parameters:
mats (Sequence of matrices) – List of Matrix to concatenate. The matrices must have the same shape, except in the dimension corresponding to axis.
axis (int) – The axis along which the arrays will be joined.
kwargs (dict) – Additional keyword arguments
- Returns:
out – The concatenated DenseTaxaVariantMatrix. Note that concat does not occur in-place: a new DenseTaxaVariantMatrix is allocated and filled.
- Return type:
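The shape requirement for concat (matching shapes everywhere except along the join axis) can be illustrated with a small NumPy helper. The helper name concat_checked and the example arrays are hypothetical; this is a sketch of the constraint, not the class method itself.

```python
import numpy as np

def concat_checked(mats, axis=-1):
    """Concatenate arrays after checking that shapes agree off the join axis."""
    ndim = mats[0].ndim
    ax = axis % ndim  # normalize a negative axis such as -1
    ref = mats[0].shape
    for m in mats[1:]:
        if m.ndim != ndim:
            raise ValueError("all matrices must have the same number of axes")
        for d in range(ndim):
            if d != ax and m.shape[d] != ref[d]:
                raise ValueError(f"shape mismatch on axis {d}")
    # A new array is allocated; the inputs are not modified.
    return np.concatenate(mats, axis=ax)

a = np.zeros((3, 2), dtype=np.int8)  # 3 taxa x 2 variants
b = np.ones((3, 5), dtype=np.int8)   # 3 taxa x 5 variants
out = concat_checked([a, b], axis=-1)
print(out.shape)  # (3, 7)
```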
- classmethod concat_taxa(mats, **kwargs)[source]#
Concatenate list of Matrix together along the taxa axis.
- Parameters:
mats (Sequence of Matrix) – List of Matrix to concatenate. The matrices must have the same shape, except in the dimension corresponding to axis.
kwargs (dict) – Additional keyword arguments
- Returns:
out – The concatenated DenseGenotypeMatrix. Note that concat does not occur in-place: a new DenseGenotypeMatrix is allocated and filled.
- Return type:
- classmethod concat_vrnt(mats, **kwargs)[source]#
Concatenate list of Matrix together along the variant axis.
- Parameters:
mats (Sequence of Matrix) – List of Matrix to concatenate. The matrices must have the same shape, except in the dimension corresponding to axis.
kwargs (dict) – Additional keyword arguments
- Returns:
out – The concatenated matrix. Note that concat does not occur in-place: a new Matrix is allocated and filled.
- Return type:
- copy()[source]#
Make a shallow copy of the DenseGenotypeMatrix.
- Returns:
out – A shallow copy of the original DenseGenotypeMatrix.
- Return type:
- deepcopy(memo=None)[source]#
Make a deep copy of the DenseGenotypeMatrix.
- Parameters:
memo (dict) – Dictionary of memo metadata.
- Returns:
out – A deep copy of the original DenseGenotypeMatrix.
- Return type:
- delete(obj, axis=-1, **kwargs)#
Delete sub-arrays along an axis.
- Parameters:
obj (int, slice, or Sequence of ints) – Indicate indices of sub-arrays to remove along the specified axis.
axis (int) – The axis along which to delete the subarray defined by obj.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A DenseTaxaVariantMatrix with deleted elements. Note that delete does not occur in-place: a new DenseTaxaVariantMatrix is allocated and filled.
- Return type:
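The obj argument accepts an int, a slice, or a sequence of ints, much like numpy.delete. The sketch below removes two variants and their names using NumPy only, assuming a (taxa, variants) layout. It illustrates that the result is a new array and that metadata must be trimmed in step with the matrix.

```python
import numpy as np

mat = np.array([[0, 1, 2, 0],
                [2, 2, 1, 0]], dtype=np.int8)
vrnt_name = np.array(["snp1", "snp2", "snp3", "snp4"], dtype=object)

obj = [1, 3]  # indices of the variants to drop

# Delete the same positions from the matrix and from its variant metadata.
out_mat = np.delete(mat, obj, axis=1)
out_name = np.delete(vrnt_name, obj)

print(out_mat.shape, mat.shape)  # (2, 2) (2, 4)
```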
- delete_taxa(obj, **kwargs)[source]#
Delete sub-arrays along the taxa axis.
- Parameters:
obj (int, slice, or Sequence of ints) – Indicate indices of sub-arrays to remove along the specified axis.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A DenseGenotypeMatrix with deleted elements. Note that delete does not occur in-place: a new DenseGenotypeMatrix is allocated and filled.
- Return type:
- delete_vrnt(obj, **kwargs)[source]#
Delete sub-arrays along the variant axis.
- Parameters:
obj (int, slice, or Sequence of ints) – Indicate indices of sub-arrays to remove along the specified axis.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A DenseGenotypeMatrix with deleted elements. Note that delete does not occur in-place: a new DenseGenotypeMatrix is allocated and filled.
- Return type:
- classmethod from_hdf5(filename, groupname=None)[source]#
Read GenotypeMatrix from an HDF5 file.
- Parameters:
filename (str, Path, h5py.File) – If str or Path, an HDF5 file name from which to read. File is closed after reading. If h5py.File, an opened HDF5 file from which to read. File is not closed after reading.
groupname (str, None) – If str, an HDF5 group name under which GenotypeMatrix data is stored. If None, GenotypeMatrix is read from base HDF5 group.
- Returns:
gmat – A genotype matrix read from file.
- Return type:
- classmethod from_vcf(filename, auto_group_vrnt=True)[source]#
Create a DenseGenotypeMatrix from a VCF file.
- Parameters:
filename (str) – Path to VCF file.
auto_group_vrnt (bool) – Whether to group variants in returned genotype matrix.
- Returns:
out – An unphased genotype matrix with associated metadata from VCF file.
- Return type:
- group(axis=-1, **kwargs)#
Sort the DenseTaxaVariantMatrix along an axis, then populate grouping indices.
- Parameters:
axis (int) – The axis along which values are grouped.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
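Grouping, as described, sorts along an axis and then records where each group starts and stops. The sketch below shows one way to do this for the variant axis with NumPy. It assumes sorting by chromosome group label only, which is a simplification; the sort keys the library actually uses are not stated in this section. The names name, stix, spix, and length mirror the group name, start index, stop index, and length metadata listed among the attributes.

```python
import numpy as np

mat = np.array([[0, 1, 2, 0, 1],
                [2, 2, 1, 0, 0]], dtype=np.int8)
vrnt_chrgrp = np.array([2, 1, 2, 1, 3])  # hypothetical chromosome labels

# Step 1: stable sort of the variant axis by chromosome group.
order = np.argsort(vrnt_chrgrp, kind="stable")
mat_sorted = mat[:, order]
chrgrp_sorted = vrnt_chrgrp[order]

# Step 2: populate grouping indices from the sorted labels.
name, stix, length = np.unique(
    chrgrp_sorted, return_index=True, return_counts=True
)
spix = stix + length  # exclusive stop index of each group

# Variants of group name[i] occupy columns stix[i]:spix[i] of mat_sorted.
print(name, stix, spix, length)  # [1 2 3] [0 2 4] [2 4 5] [2 2 1]
```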
- group_taxa(**kwargs)#
Sort the Matrix along the taxa axis, then populate grouping indices for the taxa axis.
- Parameters:
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- group_vrnt(**kwargs)#
Sort the Matrix along the variant axis, then populate grouping indices for the variant axis.
- Parameters:
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- gtcount(dtype=None)[source]#
Gather genotype counts for homozygous major, heterozygous, homozygous minor for all individuals.
- Parameters:
dtype (dtype, None) – The data type of the returned array. If None, use the native type.
- Returns:
out – A numpy.ndarray of shape (g,p) containing genotype counts across all p loci for each of g genotype combinations. Where:
out[0] is the count of the 0 genotype across all loci
out[1] is the count of the 1 genotype across all loci
out[2] is the count of the 2 genotype across all loci
...
out[g-1] is the count of the g-1 genotype across all loci
- Return type:
numpy.ndarray
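The (g, p) output can be sketched by comparing the matrix against each genotype code. The sketch assumes a (taxa, loci) layout and that the number of genotype classes is g = ploidy + 1 (codes 0 through ploidy); both are assumptions, and the function name genotype_count is hypothetical.

```python
import numpy as np

mat = np.array([[0, 1, 2],
                [2, 1, 0],
                [1, 1, 0]], dtype=np.int8)
ploidy = 2

def genotype_count(mat, ploidy, dtype=None):
    """Count taxa carrying each genotype code at each locus."""
    codes = np.arange(ploidy + 1)                   # shape (g,)
    hits = mat[None, :, :] == codes[:, None, None]  # shape (g, n, p)
    return hits.sum(axis=1, dtype=dtype)            # shape (g, p)

gtc = genotype_count(mat, ploidy)
# gtc[0], gtc[1], gtc[2] hold the counts of genotypes 0, 1, 2 per locus.
print(gtc.tolist())  # [[1, 0, 2], [1, 3, 0], [1, 0, 1]]
```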
- gtfreq(dtype=None)[source]#
Gather genotype frequencies for homozygous major, heterozygous, homozygous minor across all individuals.
- Parameters:
dtype (dtype, None) – The data type of the returned array. If None, use the native type.
- Returns:
out – A numpy.ndarray of shape (g,p) containing genotype frequencies across all p loci for all g genotype combinations. Where:
out[0] is the frequency of the 0 genotype across all loci
out[1] is the frequency of the 1 genotype across all loci
out[2] is the frequency of the 2 genotype across all loci
...
out[g-1] is the frequency of the g-1 genotype across all loci
- Return type:
numpy.ndarray
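Genotype frequencies follow from genotype counts by dividing by the number of taxa, so each column of the result sums to one. The sketch below assumes a (taxa, loci) layout and g = ploidy + 1 genotype classes; it illustrates the arithmetic only.

```python
import numpy as np

mat = np.array([[0, 1, 2],
                [2, 1, 0],
                [1, 1, 0]], dtype=np.int8)
ploidy = 2
n_taxa = mat.shape[0]

# Count each genotype code per locus, then normalize by the number of taxa.
counts = np.stack([(mat == g).sum(axis=0) for g in range(ploidy + 1)])
gtf = counts / n_taxa  # shape (g, p)

# Every locus distributes its taxa across the g genotype classes.
assert np.allclose(gtf.sum(axis=0), 1.0)
```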
- incorp(obj, values, axis=-1, taxa=None, taxa_grp=None, vrnt_chrgrp=None, vrnt_phypos=None, vrnt_name=None, vrnt_genpos=None, vrnt_xoprob=None, vrnt_hapgrp=None, vrnt_hapalt=None, vrnt_hapref=None, vrnt_mask=None, **kwargs)#
Incorporate values along the given axis before the given indices.
- Parameters:
obj (int, slice, or Sequence of ints) – Object that defines the index or indices before which values is incorporated.
values (array_like) – Values to incorporate into the matrix.
axis (int) – The axis along which values are incorporated.
taxa (numpy.ndarray) – Taxa names to incorporate into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None taxa field, providing this argument overwrites the field.
taxa_grp (numpy.ndarray) – Taxa groups to incorporate into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None taxa_grp field, providing this argument overwrites the field.
vrnt_chrgrp (numpy.ndarray) – Variant chromosome groups to incorporate into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_chrgrp field, providing this argument overwrites the field.
vrnt_phypos (numpy.ndarray) – Variant chromosome physical positions to incorporate into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_phypos field, providing this argument overwrites the field.
vrnt_name (numpy.ndarray) – Variant names to incorporate into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_name field, providing this argument overwrites the field.
vrnt_genpos (numpy.ndarray) – Variant chromosome genetic positions to incorporate into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_genpos field, providing this argument overwrites the field.
vrnt_xoprob (numpy.ndarray) – Sequential variant crossover probabilities to incorporate into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_xoprob field, providing this argument overwrites the field.
vrnt_hapgrp (numpy.ndarray) – Variant haplotype labels to incorporate into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_hapgrp field, providing this argument overwrites the field.
vrnt_hapalt (numpy.ndarray) – Variant alternative haplotype labels to incorporate into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_hapalt field, providing this argument overwrites the field.
vrnt_hapref (numpy.ndarray) – Variant reference haplotype labels to incorporate into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_hapref field, providing this argument overwrites the field.
vrnt_mask (numpy.ndarray) – Variant mask to incorporate into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_mask field, providing this argument overwrites the field.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
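Incorporating values before a given index is similar in spirit to numpy.insert applied to the matrix and its metadata together. The sketch below inserts one variant column before index 1 using NumPy only. The (taxa, variants) layout and all names and values are assumptions. Rebinding the variable names stands in for the in-place update that the method performs on the matrix object.

```python
import numpy as np

mat = np.array([[0, 1, 2],
                [2, 1, 0]], dtype=np.int8)
vrnt_name = np.array(["snp1", "snp2", "snp3"], dtype=object)

values = np.array([[1], [1]], dtype=np.int8)  # one new variant column
new_name = np.array(["snpX"], dtype=object)

obj = 1  # incorporate before this variant index

# Passing the index as a list inserts the (2, 1) block as a column.
mat = np.insert(mat, [obj], values, axis=1)
vrnt_name = np.insert(vrnt_name, [obj], new_name)

print(mat.tolist())  # [[0, 1, 1, 2], [2, 1, 1, 0]]
```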
- incorp_taxa(obj, values, taxa=None, taxa_grp=None, **kwargs)#
Incorporate values along the taxa axis before the given indices.
- Parameters:
obj (int, slice, or Sequence of ints) – Object that defines the index or indices before which values is incorporated.
values (Matrix, numpy.ndarray) – Values to incorporate into the matrix.
taxa (numpy.ndarray) – Taxa names to incorporate into the Matrix.
taxa_grp (numpy.ndarray) – Taxa groups to incorporate into the Matrix.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- incorp_vrnt(obj, values, vrnt_chrgrp=None, vrnt_phypos=None, vrnt_name=None, vrnt_genpos=None, vrnt_xoprob=None, vrnt_hapgrp=None, vrnt_hapalt=None, vrnt_hapref=None, vrnt_mask=None, **kwargs)#
Incorporate values along the variant axis before the given indices.
- Parameters:
obj (int, slice, or Sequence of ints) – Object that defines the index or indices before which values is incorporated.
values (Matrix, numpy.ndarray) – Values to incorporate into the matrix.
vrnt_chrgrp (numpy.ndarray) – Variant chromosome groups to incorporate into the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_chrgrp field, providing this argument overwrites the field.
vrnt_phypos (numpy.ndarray) – Variant chromosome physical positions to incorporate into the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_phypos field, providing this argument overwrites the field.
vrnt_name (numpy.ndarray) – Variant names to incorporate into the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_name field, providing this argument overwrites the field.
vrnt_genpos (numpy.ndarray) – Variant chromosome genetic positions to incorporate into the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_genpos field, providing this argument overwrites the field.
vrnt_xoprob (numpy.ndarray) – Sequential variant crossover probabilities to incorporate into the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_xoprob field, providing this argument overwrites the field.
vrnt_hapgrp (numpy.ndarray) – Variant haplotype labels to incorporate into the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_hapgrp field, providing this argument overwrites the field.
vrnt_hapalt (numpy.ndarray) – Variant alternative haplotype labels to incorporate into the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_hapalt field, providing this argument overwrites the field.
vrnt_hapref (numpy.ndarray) – Variant reference haplotype labels to incorporate into the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_hapref field, providing this argument overwrites the field.
vrnt_mask (numpy.ndarray) – Variant mask to incorporate into the Matrix. If values is a DenseVariantMatrix that has a non-None vrnt_mask field, providing this argument overwrites the field.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- insert(obj, values, axis=-1, taxa=None, taxa_grp=None, vrnt_chrgrp=None, vrnt_phypos=None, vrnt_name=None, vrnt_genpos=None, vrnt_xoprob=None, vrnt_hapgrp=None, vrnt_hapalt=None, vrnt_hapref=None, vrnt_mask=None, **kwargs)#
Insert values along the given axis before the given indices.
- Parameters:
obj (int, slice, or Sequence of ints) – Object that defines the index or indices before which values is inserted.
values (Matrix, numpy.ndarray) – Values to insert into the matrix.
axis (int) – The axis along which values are inserted.
taxa (numpy.ndarray) – Taxa names to insert into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None taxa field, providing this argument overwrites the field.
taxa_grp (numpy.ndarray) – Taxa groups to insert into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None taxa_grp field, providing this argument overwrites the field.
vrnt_chrgrp (numpy.ndarray) – Variant chromosome groups to insert into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_chrgrp field, providing this argument overwrites the field.
vrnt_phypos (numpy.ndarray) – Variant chromosome physical positions to insert into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_phypos field, providing this argument overwrites the field.
vrnt_name (numpy.ndarray) – Variant names to insert into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_name field, providing this argument overwrites the field.
vrnt_genpos (numpy.ndarray) – Variant chromosome genetic positions to insert into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_genpos field, providing this argument overwrites the field.
vrnt_xoprob (numpy.ndarray) – Sequential variant crossover probabilities to insert into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_xoprob field, providing this argument overwrites the field.
vrnt_hapgrp (numpy.ndarray) – Variant haplotype labels to insert into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_hapgrp field, providing this argument overwrites the field.
vrnt_hapalt (numpy.ndarray) – Variant alternative haplotype labels to insert into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_hapalt field, providing this argument overwrites the field.
vrnt_hapref (numpy.ndarray) – Variant reference haplotype labels to insert into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_hapref field, providing this argument overwrites the field.
vrnt_mask (numpy.ndarray) – Variant mask to insert into the Matrix. If values is a DenseTaxaVariantMatrix that has a non-None vrnt_mask field, providing this argument overwrites the field.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A DenseTaxaVariantMatrix with values inserted. Note that insert does not occur in-place: a new DenseTaxaVariantMatrix is allocated and filled.
- Return type:
DenseTaxaVariantMatrix
- insert_taxa(obj, values, taxa=None, taxa_grp=None, **kwargs)[source]#
Insert values along the taxa axis before the given indices.
- Parameters:
obj (int, slice, or Sequence of ints) – Object that defines the index or indices before which values is inserted.
values (Matrix, numpy.ndarray) – Values to insert into the matrix.
taxa (numpy.ndarray) – Taxa names to insert into the Matrix.
taxa_grp (numpy.ndarray) – Taxa groups to insert into the Matrix.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A DenseGenotypeMatrix with values inserted. Note that insert does not occur in-place: a new DenseGenotypeMatrix is allocated and filled.
- Return type:
DenseGenotypeMatrix
- insert_vrnt(obj, values, vrnt_chrgrp=None, vrnt_phypos=None, vrnt_name=None, vrnt_genpos=None, vrnt_xoprob=None, vrnt_hapgrp=None, vrnt_hapalt=None, vrnt_hapref=None, vrnt_mask=None, **kwargs)[source]#
Insert values along the variant axis before the given indices.
- Parameters:
obj (int, slice, or Sequence of ints) – Object that defines the index or indices before which values is inserted.
values (array_like) – Values to insert into the matrix.
vrnt_chrgrp (numpy.ndarray) – Variant chromosome groups to insert into the Matrix.
vrnt_phypos (numpy.ndarray) – Variant chromosome physical positions to insert into the Matrix.
vrnt_name (numpy.ndarray) – Variant names to insert into the Matrix.
vrnt_genpos (numpy.ndarray) – Variant chromosome genetic positions to insert into the Matrix.
vrnt_xoprob (numpy.ndarray) – Sequential variant crossover probabilities to insert into the Matrix.
vrnt_hapgrp (numpy.ndarray) – Variant haplotype labels to insert into the Matrix.
vrnt_hapalt (numpy.ndarray) – Variant alternative haplotype labels to insert into the Matrix.
vrnt_hapref (numpy.ndarray) – Variant reference haplotype labels to insert into the Matrix.
vrnt_mask (numpy.ndarray) – Variant mask to insert into the Matrix.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – A DenseGenotypeMatrix with values inserted. Note that insert does not occur in-place: a new DenseGenotypeMatrix is allocated and filled.
- Return type:
DenseGenotypeMatrix
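The non-in-place behaviour described for the insert methods mirrors numpy.insert. The following is a minimal NumPy-only sketch of inserting one variant column together with its name; it does not call PyBrOpS, and the toy data and the names new_col and new_name are illustrative assumptions.

```python
import numpy as np

# Toy genotype matrix: 3 taxa (rows) x 4 variants (columns), {0,1,2} coding.
mat = np.array([[0, 1, 2, 0],
                [1, 1, 0, 2],
                [2, 0, 1, 1]], dtype="int8")
vrnt_name = np.array(["snp1", "snp2", "snp3", "snp4"], dtype=object)

# Hypothetical new variant, to be placed before column index 2.
new_col = np.array([2, 2, 0], dtype="int8")
new_name = "snpX"

# numpy.insert allocates new arrays; the inputs are left untouched.
mat_out = np.insert(mat, 2, new_col, axis=1)
name_out = np.insert(vrnt_name, 2, new_name)

print(mat_out.shape)   # (3, 5)
print(list(name_out))  # ['snp1', 'snp2', 'snpX', 'snp3', 'snp4']
```

The matrix and every per-variant metadata array have to receive the same insertion index, otherwise columns and labels drift out of alignment.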
- interp_genpos(gmap, **kwargs)#
Interpolate genetic map positions for variants using a GeneticMap.
- Parameters:
gmap (GeneticMap) – A genetic map from which to interpolate genetic map positions for loci within the VariantMatrix.
- Return type:
None
- interp_xoprob(gmap, gmapfn, **kwargs)#
Interpolate genetic map positions AND crossover probabilities between sequential markers using a GeneticMap and a GeneticMapFunction.
- Parameters:
gmap (GeneticMap) – A genetic map from which to interpolate genetic map positions for loci within the VariantMatrix.
gmapfn (GeneticMapFunction) – A genetic map function from which to interpolate crossover probabilities for loci within the VariantMatrix.
- Return type:
None
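As a rough illustration of what the two interpolation steps compute, the sketch below uses plain NumPy rather than GeneticMap or GeneticMapFunction objects. It assumes linear interpolation between map markers, genetic positions in Morgans, the Haldane map function, and a crossover probability of 0.5 for the first marker on a chromosome; all of these are assumptions, not documented behaviour of the library.

```python
import numpy as np

# Hypothetical genetic map for one chromosome:
# physical position (bp) -> genetic position (Morgans).
map_phypos = np.array([0.0, 1_000_000.0, 2_000_000.0])
map_genpos = np.array([0.0, 0.10, 0.30])

# Physical positions of the variants (sorted, same chromosome).
vrnt_phypos = np.array([250_000.0, 500_000.0, 1_500_000.0])

# Step 1: linear interpolation of genetic positions.
vrnt_genpos = np.interp(vrnt_phypos, map_phypos, map_genpos)

# Step 2: Haldane map function, r = 0.5 * (1 - exp(-2 d)),
# applied to the genetic distance d between sequential markers.
d = np.diff(vrnt_genpos)
vrnt_xoprob = np.empty_like(vrnt_genpos)
vrnt_xoprob[0] = 0.5  # assumed: first marker segregates independently
vrnt_xoprob[1:] = 0.5 * (1.0 - np.exp(-2.0 * d))

print(vrnt_genpos)  # [0.025 0.05  0.2  ]
```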
- is_grouped(axis=-1, **kwargs)#
Determine whether the Matrix has been sorted and grouped.
- Parameters:
axis (int) – Axis along which to determine if is grouped.
kwargs (dict) – Additional keyword arguments.
- Returns:
grouped – True or False indicating whether the Matrix has been sorted and grouped.
- Return type:
bool
- is_grouped_taxa(**kwargs)#
Determine whether the Matrix has been sorted and grouped along the taxa axis.
- Parameters:
kwargs (dict) – Additional keyword arguments.
- Returns:
grouped – True or False indicating whether the Matrix has been sorted and grouped.
- Return type:
bool
- is_grouped_vrnt(**kwargs)#
Determine whether the Matrix has been sorted and grouped along the variant axis.
- Parameters:
kwargs (dict) – Additional keyword arguments.
- Returns:
grouped – True or False indicating whether the Matrix has been sorted and grouped.
- Return type:
bool
- lexsort(keys=None, axis=-1, **kwargs)#
Perform an indirect stable sort using a tuple of keys.
- Parameters:
keys (tuple, None) – A tuple of columns to be sorted. The last column is the primary sort key. If None, sort using vrnt_chrgrp as primary key, and vrnt_phypos as secondary key.
axis (int) – The axis of the Matrix over which to sort values.
kwargs (dict) – Additional keyword arguments.
- Returns:
indices – Array of indices that sort the keys.
- Return type:
numpy.ndarray
- lexsort_taxa(keys=None, **kwargs)#
Perform an indirect stable sort using a sequence of keys along the taxa axis.
- Parameters:
keys (A (k, N) array or tuple containing k (N,)-shaped sequences) – The k different columns to be sorted. The last column (or row if keys is a 2D array) is the primary sort key.
kwargs (dict) – Additional keyword arguments.
- Returns:
indices – Array of indices that sort the keys along the specified axis.
- Return type:
A (N,) ndarray of ints
- lexsort_vrnt(keys=None, **kwargs)#
Perform an indirect stable sort using a sequence of keys along the variant axis.
- Parameters:
keys (A (k, N) array or tuple containing k (N,)-shaped sequences) – The k different columns to be sorted. The last column (or row if keys is a 2D array) is the primary sort key.
kwargs (dict) – Additional keyword arguments.
- Returns:
indices – Array of indices that sort the keys along the specified axis.
- Return type:
A (N,) ndarray of ints
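The key ordering ("the last column is the primary sort key") follows numpy.lexsort. A small NumPy sketch of the default variant ordering described above, chromosome group first and physical position second, on made-up data:

```python
import numpy as np

vrnt_chrgrp = np.array([2, 1, 2, 1, 1])
vrnt_phypos = np.array([300, 500, 100, 200, 900])

# numpy.lexsort treats the LAST key in the tuple as the primary key,
# so the chromosome group goes last and the position goes first.
indices = np.lexsort((vrnt_phypos, vrnt_chrgrp))

print(indices)               # [3 1 4 2 0]
print(vrnt_chrgrp[indices])  # [1 1 1 2 2]
print(vrnt_phypos[indices])  # [200 500 900 100 300]
```

The result is an index array only; nothing is moved until those indices are applied to the matrix and its metadata.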
- maf(dtype=None)[source]#
Minor allele frequency across all taxa.
- Parameters:
dtype (dtype, None) – The data type of the returned array. If None, use the native type.
- Returns:
out – A numpy.ndarray of shape (p,) containing allele frequencies for the minor allele.
- Return type:
numpy.ndarray
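For orientation, the conventional minor allele frequency calculation on a {0,1,2} matrix can be sketched in NumPy as follows. This is an assumption about the computation, not a copy of the library's implementation: the frequency of the allele coded as 1 is taken per locus and then folded so that it never exceeds 0.5.

```python
import numpy as np

ploidy = 2
# 4 taxa x 4 variants, {0,1,2} coding (count of the allele coded as 1).
mat = np.array([[0, 1, 2, 2],
                [0, 1, 2, 1],
                [0, 2, 2, 0],
                [1, 0, 2, 2]], dtype="int8")

ntaxa = mat.shape[0]

# Frequency of the allele coded as 1 at each locus.
afreq = mat.sum(axis=0) / (ploidy * ntaxa)

# Fold the frequency so the rarer allele is always reported.
maf = np.minimum(afreq, 1.0 - afreq)

print(afreq)  # [0.125 0.5   1.    0.625]
print(maf)    # [0.125 0.5   0.    0.375]
```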
- property mat: ndarray#
Pointer to raw numpy.ndarray object.
- mat_asformat(format)[source]#
Get mat in a specific format type.
- Parameters:
format (str) – Desired output format. Options are “{0,1,2}”, “{-1,0,1}”, “{-1,m,1}”.
- Returns:
out – Matrix in the desired output format.
- Return type:
numpy.ndarray
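As an illustration of one of these conversions, a diploid {0,1,2} matrix is commonly mapped to {-1,0,1} by subtracting 1. The sketch below assumes that convention and ploidy 2; it does not cover the "{-1,m,1}" format, whose definition is not given here.

```python
import numpy as np

# Diploid genotypes in {0,1,2} coding.
mat = np.array([[0, 1, 2],
                [2, 1, 0]], dtype="int8")

# Assumed convention for ploidy 2: shift so the heterozygote is 0.
mat_centered = mat - 1

print(mat_centered.tolist())  # [[-1, 0, 1], [1, 0, -1]]
```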
- property mat_format: str#
Matrix representation format.
- property mat_ndim: int#
Number of dimensions of the raw numpy.ndarray.
- property mat_shape: tuple#
Shape of the raw numpy.ndarray.
- meh(dtype=None)[source]#
Mean expected heterozygosity across all taxa.
- Parameters:
dtype (dtype, None) – The data type of the returned array. If None, use the native type.
- Returns:
out – A number representing the mean expected heterozygosity. If dtype is None, then a native 64-bit floating point is returned. Otherwise, the value is of the type specified by dtype.
- Return type:
Real
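A minimal sketch of the textbook quantity, assuming biallelic diploid loci where the expected heterozygosity at a locus with allele frequency p is 2p(1 - p), averaged over loci. Whether the library scales or averages in exactly this way is an assumption.

```python
import numpy as np

ploidy = 2
mat = np.array([[0, 1, 2, 2],
                [0, 1, 2, 1],
                [0, 2, 2, 0],
                [1, 0, 2, 2]], dtype="int8")

# Frequency of the allele coded as 1 at each locus.
p = mat.sum(axis=0) / (ploidy * mat.shape[0])

# Expected heterozygosity per locus under Hardy-Weinberg proportions.
he = 2.0 * p * (1.0 - p)

# Mean over all loci.
meh = he.mean()

print(he)   # [0.21875 0.5     0.      0.46875]
print(meh)  # 0.296875
```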
- property nphase: int#
Number of chromosome phases represented by the matrix.
- property ntaxa: int#
Number of taxa.
- property nvrnt: int#
Number of variants.
- property ploidy: int#
Genome ploidy number represented by matrix.
- remove(obj, axis=-1, **kwargs)#
Remove sub-arrays along an axis.
- Parameters:
obj (int, slice, or Sequence of ints) – Indicate indices of sub-arrays to remove along the specified axis.
axis (int) – The axis along which to remove the subarray defined by obj.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- remove_taxa(obj, **kwargs)#
Remove sub-arrays along the taxa axis.
- Parameters:
obj (int, slice, or Sequence of ints) – Indicate indices of sub-arrays to remove along the specified axis.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- remove_vrnt(obj, **kwargs)#
Remove sub-arrays along the variant axis.
- Parameters:
obj (int, slice, or Sequence of ints) – Indicate indices of sub-arrays to remove along the specified axis.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
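The remove methods return None, which suggests the object's internal arrays are replaced rather than a new matrix being returned. NumPy cannot shrink an array in place, so the underlying step presumably resembles numpy.delete followed by reassignment. A NumPy-only sketch on made-up data:

```python
import numpy as np

mat = np.array([[0, 1, 2, 0],
                [1, 1, 0, 2],
                [2, 0, 1, 1]], dtype="int8")
vrnt_name = np.array(["snp1", "snp2", "snp3", "snp4"], dtype=object)

# Indices of the variants to drop.
obj = [1, 3]

# numpy.delete returns new arrays; rebinding the names stands in for
# an object updating its own attributes.
mat = np.delete(mat, obj, axis=1)
vrnt_name = np.delete(vrnt_name, obj)

print(mat.shape)        # (3, 2)
print(list(vrnt_name))  # ['snp1', 'snp3']
```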
- reorder(indices, axis=-1, **kwargs)#
Reorder the VariantMatrix.
- Parameters:
indices (numpy.ndarray) – Indices of where to place elements.
axis (int) – The axis over which to reorder values.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- reorder_taxa(indices, **kwargs)#
Reorder elements of the Matrix along the taxa axis using an array of indices. Note this modifies the Matrix in-place.
- Parameters:
indices (A (N,) ndarray of ints) – Array of indices that reorder the matrix along the specified axis.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- reorder_vrnt(indices, **kwargs)#
Reorder elements of the Matrix along the variant axis using an array of indices. Note this modifies the Matrix in-place.
- Parameters:
indices (A (N,) ndarray of ints) – Array of indices that reorder the matrix along the specified axis.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
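Reordering amounts to applying one index array to the matrix and to every metadata array on the same axis. A NumPy sketch, where the index array is obtained from a stable argsort on made-up physical positions:

```python
import numpy as np

mat = np.array([[0, 1, 2, 0],
                [1, 1, 0, 2],
                [2, 0, 1, 1]], dtype="int8")
vrnt_phypos = np.array([400, 100, 300, 200])

# Index array that puts the variants in ascending position order.
indices = np.argsort(vrnt_phypos, kind="stable")

# Apply the same indices to the variant axis of the matrix and to the
# metadata so that columns and positions stay aligned.
mat = mat[:, indices]
vrnt_phypos = vrnt_phypos[indices]

print(indices)       # [1 3 2 0]
print(mat.tolist())  # [[1, 0, 2, 0], [1, 2, 0, 1], [0, 1, 1, 2]]
```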
- select(indices, axis=-1, **kwargs)#
Select certain values from the matrix.
- Parameters:
indices (array_like (Nj, ...)) – The indices of the values to select.
axis (int) – The axis along which values are selected.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – The output DenseTaxaVariantMatrix with values selected. Note that select does not occur in-place: a new DenseTaxaVariantMatrix is allocated and filled.
- Return type:
DenseTaxaVariantMatrix
- select_taxa(indices, **kwargs)[source]#
Select certain values from the Matrix along the taxa axis.
- Parameters:
indices (array_like (Nj, ...)) – The indices of the values to select.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – The output DenseGenotypeMatrix with values selected. Note that select does not occur in-place: a new DenseGenotypeMatrix is allocated and filled.
- Return type:
DenseGenotypeMatrix
- select_vrnt(indices, **kwargs)[source]#
Select certain values from the Matrix along the variant axis.
- Parameters:
indices (array_like (Nj, ...)) – The indices of the values to select.
kwargs (dict) – Additional keyword arguments.
- Returns:
out – The output DenseGenotypeMatrix with values selected. Note that select does not occur in-place: a new DenseGenotypeMatrix is allocated and filled.
- Return type:
DenseGenotypeMatrix
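Selection behaves like numpy.take: the result is a freshly allocated array, and indices may repeat or appear in any order. A NumPy-only sketch on made-up data, not a call into PyBrOpS:

```python
import numpy as np

mat = np.array([[0, 1, 2, 0],
                [1, 1, 0, 2],
                [2, 0, 1, 1]], dtype="int8")
vrnt_name = np.array(["snp1", "snp2", "snp3", "snp4"], dtype=object)

# Pick variant 3 once and variant 0 twice.
indices = np.array([3, 0, 0])

mat_sel = np.take(mat, indices, axis=1)
name_sel = np.take(vrnt_name, indices)

print(mat_sel.tolist())  # [[0, 0, 0], [2, 1, 1], [1, 2, 2]]
print(list(name_sel))    # ['snp4', 'snp1', 'snp1']
```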
- sort(keys=None, axis=-1, **kwargs)#
Sort the VariantMatrix using a tuple of keys. Grouping metadata for the corresponding axis (name, stix, spix, len) is reset.
- Parameters:
keys (tuple, None) – A tuple of columns to be sorted. The last column is the primary sort key. If None, sort using vrnt_chrgrp as primary key, and vrnt_phypos as secondary key.
axis (int) – The axis over which to sort values.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- sort_taxa(keys=None, **kwargs)#
Sort elements of the Matrix along the taxa axis using a sequence of keys. Note this modifies the Matrix in-place.
- Parameters:
keys (A (k, N) array or tuple containing k (N,)-shaped sequences) – The k different columns to be sorted. The last column (or row if keys is a 2D array) is the primary sort key.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- sort_vrnt(keys=None, **kwargs)#
Sort elements of the Matrix along the variant axis using a sequence of keys. Note this modifies the Matrix in-place.
- Parameters:
keys (A (k, N) array or tuple containing k (N,)-shaped sequences) – The k different columns to be sorted. The last column (or row if keys is a 2D array) is the primary sort key.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- tacount(dtype=None)[source]#
Allele count of the non-zero allele within each taxon.
- Parameters:
dtype (dtype, None) – The data type of the accumulator and returned array. If None, use the native accumulator type (int or float).
- Returns:
out – A numpy.ndarray of shape (n,p) containing allele counts of the allele coded as 1 for all n individuals, for all p loci.
- Return type:
numpy.ndarray
- tafreq(dtype=None)[source]#
Allele frequency of the non-zero allele within each taxon.
- Parameters:
dtype (dtype, None) – The data type of the returned array. If None, use the native type.
- Returns:
out – A numpy.ndarray of shape (n,p) containing allele frequencies of the allele coded as 1 for all n individuals, for all p loci.
- Return type:
numpy.ndarray
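Because an unphased {0,1,2} matrix already stores the per-taxon count of the allele coded as 1, the two within-taxon statistics can be sketched in NumPy as a type conversion and a division by ploidy. This is an assumption about the computation for illustration only.

```python
import numpy as np

ploidy = 2
# 2 taxa x 3 variants, {0,1,2} coding.
mat = np.array([[0, 1, 2],
                [2, 1, 0]], dtype="int8")

# Within-taxon allele counts: the stored values, in a wider integer type.
tacount = mat.astype("int64")

# Within-taxon allele frequencies: counts divided by the ploidy.
tafreq = tacount / ploidy

print(tafreq.tolist())  # [[0.0, 0.5, 1.0], [1.0, 0.5, 0.0]]
```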
- property taxa: ndarray | None#
Taxa label array.
- property taxa_axis: int#
Taxa axis number.
- property taxa_grp: ndarray | None#
Taxa group label.
- property taxa_grp_len: ndarray | None#
Taxa group length.
- property taxa_grp_name: ndarray | None#
Taxa group name.
- property taxa_grp_spix: ndarray | None#
Taxa group stop index.
- property taxa_grp_stix: ndarray | None#
Taxa group start index.
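The four grouping fields (name, start index, stop index, length) can be derived from a sorted label array with numpy.unique. The sketch below assumes the labels are already sorted and that the stop index is exclusive; both are assumptions about the library's conventions.

```python
import numpy as np

# Taxa group labels, already sorted so each group is contiguous.
taxa_grp = np.array([1, 1, 2, 2, 2, 5])

# Unique labels, index of each label's first occurrence, and group sizes.
grp_name, grp_stix, grp_len = np.unique(
    taxa_grp, return_index=True, return_counts=True
)

# Assumed exclusive stop index: start index plus group length.
grp_spix = grp_stix + grp_len

print(grp_name)  # [1 2 5]
print(grp_stix)  # [0 2 5]
print(grp_spix)  # [2 5 6]
print(grp_len)   # [2 3 1]
```

With these arrays, the taxa of group i occupy the slice grp_stix[i]:grp_spix[i] along the taxa axis.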
- to_hdf5(filename, groupname=None, overwrite=True)[source]#
Write GenotypeMatrix to an HDF5 file.
- Parameters:
filename (str, Path, h5py.File) – If str, an HDF5 file name to which to write. File is closed after writing. If h5py.File, an opened HDF5 file to which to write. File is not closed after writing.
groupname (str, None) – If str, an HDF5 group name under which GenotypeMatrix data is stored. If None, GenotypeMatrix is written to the base HDF5 group.
overwrite (bool) – Whether to overwrite values in an HDF5 file if a field already exists.
- Return type:
None
- ungroup(axis=-1, **kwargs)#
Ungroup the DenseTaxaVariantMatrix along an axis by removing grouping metadata.
- Parameters:
axis (int) – The axis along which values should be ungrouped.
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- ungroup_taxa(**kwargs)#
Ungroup the DenseTaxaMatrix along the taxa axis by removing taxa group metadata.
- Parameters:
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- ungroup_vrnt(**kwargs)#
Ungroup the DenseVariantMatrix along the variant axis by removing variant group metadata.
- Parameters:
kwargs (dict) – Additional keyword arguments.
- Return type:
None
- property vrnt_axis: int#
Variant axis number.
- property vrnt_chrgrp: ndarray | None#
Variant chromosome group label.
- property vrnt_chrgrp_len: ndarray | None#
Variant chromosome group length.
- property vrnt_chrgrp_name: ndarray | None#
Variant chromosome group names.
- property vrnt_chrgrp_spix: ndarray | None#
Variant chromosome group stop indices.
- property vrnt_chrgrp_stix: ndarray | None#
Variant chromosome group start indices.
- property vrnt_genpos: ndarray | None#
Variant genetic position.
- property vrnt_hapalt: ndarray | None#
Variant alternative haplotype sequence.
- property vrnt_hapgrp: ndarray | None#
Variant haplotype group label.
- property vrnt_hapref: ndarray | None#
Variant reference haplotype sequence.
- property vrnt_mask: ndarray | None#
Variant mask.
- property vrnt_name: ndarray | None#
Variant name.
- property vrnt_phypos: ndarray | None#
Variant physical position.
- property vrnt_xoprob: ndarray | None#
Variant crossover sequential probability.