GenotypeMatrix#

class pybrops.popgen.gmat.GenotypeMatrix.GenotypeMatrix[source]#

Bases: TaxaVariantMatrix, GeneticMappableMatrix, HDF5InputOutput

An abstract class for genotype matrix objects.

The purpose of this abstract class is to define base functionality for:
  1. Genotype matrix ploidy and phase metadata.

  2. Genotype matrix format conversion.

  3. Genotype matrix allele counting routines.

  4. Genotype matrix genotype counting routines.
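
Since the class is abstract, it is presumably meant to be subclassed rather than instantiated directly. The sketch below illustrates that pattern with Python's standard abc module. ToyGenotypeInterface and ToyDiploid are hypothetical names invented for this illustration and are not part of PyBrOpS; the real interface has far more members.

```python
from abc import ABC, abstractmethod


class ToyGenotypeInterface(ABC):
    """Hypothetical, much-reduced stand-in for an abstract genotype interface."""

    @property
    @abstractmethod
    def ploidy(self) -> int:
        """Ploidy level represented by the matrix."""

    @abstractmethod
    def acount(self, dtype=None):
        """Allele counts per locus."""


class ToyDiploid(ToyGenotypeInterface):
    """Concrete subclass: it implements every abstract member."""

    def __init__(self, rows):
        # rows: one list of per-locus allele counts for each taxon
        self._rows = rows

    @property
    def ploidy(self) -> int:
        return 2

    def acount(self, dtype=None):
        # Sum each locus (column) over all taxa (rows).
        totals = [sum(col) for col in zip(*self._rows)]
        return totals if dtype is None else [dtype(t) for t in totals]


# The abstract class itself cannot be instantiated.
try:
    ToyGenotypeInterface()
    abstract_is_instantiable = True
except TypeError:
    abstract_is_instantiable = False

toy = ToyDiploid([[0, 1, 2], [2, 1, 0]])
```

Python raises TypeError for any subclass that leaves an abstract member unimplemented, so a concrete genotype matrix class has to provide every method and property listed on this page.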

Methods

acount

Allele count of the non-zero allele across all taxa.

adjoin

Add additional elements to the end of the Matrix along an axis.

adjoin_taxa

Add additional elements to the end of the Matrix along the taxa axis.

adjoin_vrnt

Add additional elements to the end of the VariantMatrix along the variant axis.

afixed

Determine allele fixation for loci across all taxa.

afreq

Allele frequency of the non-zero allele across all taxa.

apoly

Allele polymorphism presence or absence across all loci.

append

Append values to the Matrix.

append_taxa

Append values to the Matrix along the taxa axis.

append_vrnt

Append values to the VariantMatrix along the variant axis.

concat

Concatenate matrices together along an axis.

concat_taxa

Concatenate list of Matrix together along the taxa axis.

concat_vrnt

Concatenate list of VariantMatrix together along the variant axis.

copy

Make a shallow copy of the Matrix.

deepcopy

Make a deep copy of the Matrix.

delete

Delete sub-arrays along an axis.

delete_taxa

Delete sub-arrays along the taxa axis.

delete_vrnt

Delete sub-arrays along the variant axis.

from_hdf5

Read an object from an HDF5 file.

group

Sort the GroupableMatrix along an axis, then populate grouping indices.

group_taxa

Sort the Matrix along the taxa axis, then populate grouping indices for the taxa axis.

group_vrnt

Sort the VariantMatrix along the variant axis, then populate grouping indices for the variant axis.

gtcount

Gather genotype counts for homozygous major, heterozygous, homozygous minor for all individuals.

gtfreq

Gather genotype frequencies for homozygous major, heterozygous, homozygous minor across all individuals.

incorp

Incorporate values along the given axis before the given indices.

incorp_taxa

Incorporate values along the taxa axis before the given indices.

incorp_vrnt

Incorporate values along the variant axis before the given indices.

insert

Insert values along the given axis before the given indices.

insert_taxa

Insert values along the taxa axis before the given indices.

insert_vrnt

Insert values along the variant axis before the given indices.

interp_genpos

Interpolate genetic map positions for variants using a GeneticMap.

interp_xoprob

Interpolate genetic map positions AND crossover probabilities between sequential markers using a GeneticMap and a GeneticMapFunction.

is_grouped

Determine whether the Matrix has been sorted and grouped.

is_grouped_taxa

Determine whether the Matrix has been sorted and grouped along the taxa axis.

is_grouped_vrnt

Determine whether the Matrix has been sorted and grouped along the variant axis.

lexsort

Perform an indirect stable sort using a sequence of keys.

lexsort_taxa

Perform an indirect stable sort using a sequence of keys along the taxa axis.

lexsort_vrnt

Perform an indirect stable sort using a sequence of keys along the variant axis.

maf

Minor allele frequency across all taxa.

mat_asformat

Get the genotype matrix in a specific format type.

meh

Mean expected heterozygosity across all taxa.

remove

Remove sub-arrays along an axis.

remove_taxa

Remove sub-arrays along the taxa axis.

remove_vrnt

Remove sub-arrays along the variant axis.

reorder

Reorder elements of the Matrix using an array of indices.

reorder_taxa

Reorder elements of the Matrix along the taxa axis using an array of indices.

reorder_vrnt

Reorder elements of the Matrix along the variant axis using an array of indices.

select

Select certain values from the matrix.

select_taxa

Select certain values from the Matrix along the taxa axis.

select_vrnt

Select certain values from the VariantMatrix along the variant axis.

sort

Sort elements of the Matrix using a sequence of keys.

sort_taxa

Sort elements of the Matrix along the taxa axis using a sequence of keys.

sort_vrnt

Sort elements of the Matrix along the variant axis using a sequence of keys.

tacount

Allele count of the non-zero allele within each taxon.

tafreq

Allele frequency of the non-zero allele within each taxon.

to_hdf5

Write an object to an HDF5 file.

ungroup

Ungroup the GroupableMatrix along an axis by removing grouping metadata.

ungroup_taxa

Ungroup the TaxaMatrix along the taxa axis by removing taxa group metadata.

ungroup_vrnt

Ungroup the VariantMatrix along the variant axis by removing variant group metadata.

Attributes

mat

Pointer to raw matrix object.

mat_format

Matrix representation format.

mat_ndim

Number of dimensions of the raw matrix.

mat_shape

Shape of the raw matrix.

nphase

The number of phases represented by the genotype matrix.

ntaxa

Number of taxa.

nvrnt

Number of variants.

ploidy

The ploidy level represented by the genotype matrix.

taxa

Taxa label.

taxa_axis

Axis along which taxa are stored.

taxa_grp

Taxa group label.

taxa_grp_len

Taxa group length.

taxa_grp_name

Taxa group name.

taxa_grp_spix

Taxa group stop index.

taxa_grp_stix

Taxa group start index.

vrnt_axis

Axis along which variants are stored.

vrnt_chrgrp

Variant chromosome group label.

vrnt_chrgrp_len

Variant chromosome group length.

vrnt_chrgrp_name

Variant chromosome group names.

vrnt_chrgrp_spix

Variant chromosome group stop indices.

vrnt_chrgrp_stix

Variant chromosome group start indices.

vrnt_genpos

Variant genetic position.

vrnt_hapalt

Variant haplotype sequence.

vrnt_hapgrp

Variant haplotype group label.

vrnt_hapref

Variant reference haplotype sequence.

vrnt_mask

Variant mask.

vrnt_name

Variant name.

vrnt_phypos

Variant physical position.

vrnt_xoprob

Variant crossover sequential probability.

abstract __add__(value)#

Elementwise add matrices.

Parameters:

value (object) – Object which to add.

Returns:

out – An object resulting from the addition.

Return type:

object

abstract __mul__(value)#

Elementwise multiply matrices.

Parameters:

value (object) – Object which to multiply.

Returns:

out – An object resulting from the multiplication.

Return type:

object

abstract acount(dtype)[source]#

Allele count of the non-zero allele across all taxa.

Parameters:

dtype (dtype, None) – The data type of the returned array. If None, use the native type.

Returns:

out – A numpy.ndarray of shape (p,) containing allele counts of the allele coded as 1 for all p loci.

Return type:

numpy.ndarray
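
As an illustration of what such a count looks like, the sketch below computes it with plain NumPy. It assumes a phased 0/1 array of shape (nphase, ntaxa, nvrnt); that layout, the sample data, and the helper name allele_count are assumptions made for this example, and a concrete subclass may store and compute things differently.

```python
import numpy as np

# Hypothetical phased genotypes, shape (nphase, ntaxa, nvrnt) = (2, 3, 4).
# Each entry is 0 or 1; 1 marks the counted ("non-zero") allele.
geno = np.array([
    [[0, 1, 1, 0],
     [1, 1, 0, 0],
     [0, 1, 1, 0]],
    [[0, 1, 0, 0],
     [1, 1, 0, 1],
     [0, 1, 1, 0]],
], dtype="int8")


def allele_count(mat, dtype=None):
    """Sum the 1-coded allele over phases and taxa, giving shape (p,)."""
    out = mat.sum(axis=(0, 1))
    return out if dtype is None else out.astype(dtype)


counts = allele_count(geno, dtype="int64")
print(counts)  # [2 6 3 1]
```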

abstract adjoin(values, axis, **kwargs)#

Add additional elements to the end of the Matrix along an axis.

Parameters:
  • values (Matrix or numpy.ndarray) – Values to adjoin to the Matrix.

  • axis (int) – The axis along which values are adjoined.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A copy of mat with values appended to axis. Note that adjoin does not occur in-place: a new Matrix is allocated and filled.

Return type:

Matrix

abstract adjoin_taxa(values, taxa, taxa_grp, **kwargs)#

Add additional elements to the end of the Matrix along the taxa axis.

Parameters:
  • values (Matrix, numpy.ndarray) – Values to adjoin to the Matrix.

  • taxa (numpy.ndarray) – Taxa names to adjoin to the Matrix.

  • taxa_grp (numpy.ndarray) – Taxa groups to adjoin to the Matrix.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A copy of TaxaMatrix with values appended to axis. Note that adjoin does not occur in-place: a new TaxaMatrix is allocated and filled.

Return type:

TaxaMatrix

abstract adjoin_vrnt(values, vrnt_chrgrp, vrnt_phypos, vrnt_name, vrnt_genpos, vrnt_xoprob, vrnt_hapgrp, vrnt_mask, **kwargs)#

Add additional elements to the end of the VariantMatrix along the variant axis.

Parameters:
  • values (Matrix, numpy.ndarray) – Values to adjoin to the Matrix.

  • vrnt_chrgrp (numpy.ndarray) – Variant chromosome groups to adjoin to the Matrix.

  • vrnt_phypos (numpy.ndarray) – Variant chromosome physical positions to adjoin to the Matrix.

  • vrnt_name (numpy.ndarray) – Variant names to adjoin to the Matrix.

  • vrnt_genpos (numpy.ndarray) – Variant chromosome genetic positions to adjoin to the Matrix.

  • vrnt_xoprob (numpy.ndarray) – Sequential variant crossover probabilities to adjoin to the Matrix.

  • vrnt_hapgrp (numpy.ndarray) – Variant haplotype labels to adjoin to the Matrix.

  • vrnt_mask (numpy.ndarray) – Variant mask to adjoin to the Matrix.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A copy of the VariantMatrix with values appended to axis. Note that adjoin does not occur in-place: a new VariantMatrix is allocated and filled.

Return type:

VariantMatrix

abstract afixed(dtype)[source]#

Determine allele fixation for loci across all taxa.

Parameters:

dtype (dtype, None) – The data type of the returned array. If None, use the native type.

Returns:

out – A numpy.ndarray of shape (p,) containing indicator variables for whether each locus is fixed for a single allele.

Return type:

numpy.ndarray

abstract afreq(dtype)[source]#

Allele frequency of the non-zero allele across all taxa.

Parameters:

dtype (dtype, None) – The data type of the returned array. If None, use the native type.

Returns:

out – A numpy.ndarray of shape (p,) containing allele frequencies of the allele coded as 1 for all p loci.

Return type:

numpy.ndarray
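
A minimal NumPy sketch of the frequency calculation follows. It assumes a phased 0/1 array of shape (nphase, ntaxa, nvrnt) in which the number of phases equals the ploidy, so the denominator is the number of sampled chromosomes. The layout and the helper name allele_freq are assumptions for illustration only.

```python
import numpy as np

# Hypothetical phased genotypes, shape (nphase, ntaxa, nvrnt) = (2, 3, 4).
geno = np.array([
    [[0, 1, 1, 0],
     [1, 1, 0, 0],
     [0, 1, 1, 0]],
    [[0, 1, 0, 0],
     [1, 1, 0, 1],
     [0, 1, 1, 0]],
], dtype="int8")


def allele_freq(mat, dtype=None):
    """Frequency of the 1-coded allele at each locus, shape (p,)."""
    nphase, ntaxa, _ = mat.shape
    # allele count divided by the number of chromosomes sampled per locus
    out = mat.sum(axis=(0, 1)) / (nphase * ntaxa)
    return out if dtype is None else out.astype(dtype)


freq = allele_freq(geno)  # counts [2, 6, 3, 1] divided by 6
```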

abstract apoly(dtype)[source]#

Allele polymorphism presence or absence across all loci.

Parameters:

dtype (dtype, None) – The data type of the returned array. If None, use the native type.

Returns:

out – A numpy.ndarray of shape (p,) containing indicator variables for whether the locus is polymorphic.

Return type:

numpy.ndarray
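
The afixed and apoly indicators are complementary views of the same allele frequencies. The sketch below derives both from a hypothetical frequency vector, assuming a locus counts as fixed when the frequency of the 1-coded allele is exactly 0 or 1. That definition is an assumption of this example, not a statement about any particular subclass.

```python
import numpy as np

# Hypothetical allele frequencies of the 1-coded allele at p = 4 loci.
freq = np.array([1 / 3, 1.0, 0.5, 0.0])

# Fixed: every sampled chromosome carries the same allele.
fixed = (freq <= 0.0) | (freq >= 1.0)

# Polymorphic: both alleles are present.
poly = (freq > 0.0) & (freq < 1.0)

print(fixed)  # [False  True False  True]
print(poly)   # [ True False  True False]
```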

abstract append(values, axis, **kwargs)#

Append values to the Matrix.

Parameters:
  • values (Matrix, numpy.ndarray) – Values to append to the matrix.

  • axis (int) – The axis along which values are appended.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None
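
The return types suggest a naming convention: adjoin returns a new Matrix, while append returns None and so presumably modifies the object in place. The toy class below sketches that contrast. ToyMatrix is a hypothetical wrapper written for this example and is not a PyBrOpS class.

```python
import numpy as np


class ToyMatrix:
    """Hypothetical wrapper around a NumPy array (illustration only)."""

    def __init__(self, mat):
        self.mat = mat

    def adjoin(self, values, axis):
        # Out-of-place: build and return a new object, leave self untouched.
        return ToyMatrix(np.append(self.mat, values, axis=axis))

    def append(self, values, axis):
        # In-place from the caller's view: rebind the stored array, return None.
        self.mat = np.append(self.mat, values, axis=axis)


m = ToyMatrix(np.zeros((2, 3), dtype="int8"))
extra = np.ones((1, 3), dtype="int8")

bigger = m.adjoin(extra, axis=0)
shape_after_adjoin = m.mat.shape   # still (2, 3)

result = m.append(extra, axis=0)   # None; m now holds 3 rows
```

The same pairing appears to hold for delete and remove, and for insert and incorp, judging by their documented return types.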

abstract append_taxa(values, taxa, taxa_grp, **kwargs)#

Append values to the Matrix along the taxa axis.

Parameters:
  • values (Matrix, numpy.ndarray) – Values to append to the matrix.

  • taxa (numpy.ndarray) – Taxa names to append to the Matrix.

  • taxa_grp (numpy.ndarray) – Taxa groups to append to the Matrix.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract append_vrnt(values, vrnt_chrgrp, vrnt_phypos, vrnt_name, vrnt_genpos, vrnt_xoprob, vrnt_hapgrp, vrnt_mask, **kwargs)#

Append values to the VariantMatrix along the variant axis.

Parameters:
  • values (Matrix, numpy.ndarray) – Values to append to the VariantMatrix.

  • vrnt_chrgrp (numpy.ndarray) – Variant chromosome groups to append to the VariantMatrix.

  • vrnt_phypos (numpy.ndarray) – Variant chromosome physical positions to append to the VariantMatrix.

  • vrnt_name (numpy.ndarray) – Variant names to append to the VariantMatrix.

  • vrnt_genpos (numpy.ndarray) – Variant chromosome genetic positions to append to the VariantMatrix.

  • vrnt_xoprob (numpy.ndarray) – Sequential variant crossover probabilities to append to the VariantMatrix.

  • vrnt_hapgrp (numpy.ndarray) – Variant haplotype labels to append to the VariantMatrix.

  • vrnt_mask (numpy.ndarray) – Variant mask to append to the VariantMatrix.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract classmethod concat(mats, axis, **kwargs)#

Concatenate matrices together along an axis.

Parameters:
  • mats (Sequence of Matrix) – List of Matrix to concatenate. The matrices must have the same shape, except in the dimension corresponding to axis.

  • axis (int) – The axis along which the arrays will be joined.

  • kwargs (dict) – Additional keyword arguments

Returns:

out – The concatenated matrix. Note that concat does not occur in-place: a new Matrix is allocated and filled.

Return type:

Matrix
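
The shape requirement mirrors that of numpy.concatenate. As a rough sketch on raw arrays (not on Matrix objects, which would also have to merge their taxa and variant metadata):

```python
import numpy as np

a = np.zeros((2, 3), dtype="int8")  # 2 taxa x 3 variants
b = np.ones((4, 3), dtype="int8")   # 4 taxa x 3 variants

# Shapes agree everywhere except along axis 0, so joining along axis 0 works.
out = np.concatenate([a, b], axis=0)
print(out.shape)  # (6, 3)

# The result is a freshly allocated array; the inputs are unchanged.
shares = np.shares_memory(out, a)

# Joining along axis 1 fails because the row counts differ (2 vs 4).
try:
    np.concatenate([a, b], axis=1)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```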

abstract classmethod concat_taxa(mats, **kwargs)#

Concatenate list of Matrix together along the taxa axis.

Parameters:
  • mats (Sequence of Matrix) – List of Matrix to concatenate. The matrices must have the same shape, except in the dimension corresponding to the taxa axis.

  • kwargs (dict) – Additional keyword arguments

Returns:

out – The concatenated TaxaMatrix. Note that concat does not occur in-place: a new TaxaMatrix is allocated and filled.

Return type:

TaxaMatrix

abstract classmethod concat_vrnt(mats, **kwargs)#

Concatenate list of VariantMatrix together along the variant axis.

Parameters:
  • mats (Sequence of VariantMatrix) – List of VariantMatrix to concatenate. The matrices must have the same shape, except in the dimension corresponding to the variant axis.

  • kwargs (dict) – Additional keyword arguments

Returns:

out – The concatenated matrix. Note that concat does not occur in-place: a new VariantMatrix is allocated and filled.

Return type:

VariantMatrix

abstract copy()#

Make a shallow copy of the Matrix.

Returns:

out – A shallow copy of the original Matrix.

Return type:

Matrix

abstract deepcopy(memo)#

Make a deep copy of the Matrix.

Parameters:

memo (dict) – Dictionary of memo metadata.

Returns:

out – A deep copy of the original Matrix.

Return type:

Matrix

abstract delete(obj, axis, **kwargs)#

Delete sub-arrays along an axis.

Parameters:
  • obj (int, slice, or Sequence of ints) – Indicate indices of sub-arrays to remove along the specified axis.

  • axis (int) – The axis along which to delete the subarray defined by obj.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A Matrix with deleted elements. Note that delete does not occur in-place: a new Matrix is allocated and filled.

Return type:

Matrix
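
The obj and axis parameters follow the conventions of numpy.delete. A small sketch on a raw array shows the accepted index forms; a Matrix implementation would additionally have to drop the matching metadata entries.

```python
import numpy as np

mat = np.arange(12).reshape(3, 4)  # 3 taxa x 4 variants

# Sequence of ints: drop columns 0 and 2.
out_cols = np.delete(mat, [0, 2], axis=1)
print(out_cols)
# [[ 1  3]
#  [ 5  7]
#  [ 9 11]]

# Slice: drop the first two rows.
out_rows = np.delete(mat, slice(0, 2), axis=0)
print(out_rows)  # [[ 8  9 10 11]]

# The original array keeps its shape: deletion is out-of-place.
print(mat.shape)  # (3, 4)
```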

abstract delete_taxa(obj, **kwargs)#

Delete sub-arrays along the taxa axis.

Parameters:
  • obj (int, slice, or Sequence of ints) – Indicate indices of sub-arrays to remove along the taxa axis.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A TaxaMatrix with deleted elements. Note that delete does not occur in-place: a new TaxaMatrix is allocated and filled.

Return type:

TaxaMatrix

abstract delete_vrnt(obj, **kwargs)#

Delete sub-arrays along the variant axis.

Parameters:
  • obj (int, slice, or Sequence of ints) – Indicate indices of sub-arrays to remove along the variant axis.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A VariantMatrix with deleted elements. Note that delete does not occur in-place: a new VariantMatrix is allocated and filled.

Return type:

VariantMatrix

abstract classmethod from_hdf5(filename, groupname)#

Read an object from an HDF5 file.

Parameters:
  • filename (str, Path, h5py.File) – If str, an HDF5 file name from which to read. If Path, an HDF5 file name from which to read. If h5py.File, an opened HDF5 file from which to read.

  • groupname (str, None) – If str, an HDF5 group name under which object data is stored. If None, object is read from base HDF5 group.

Returns:

out – An object read from an HDF5 file.

Return type:

HDF5InputOutput

abstract group(axis, **kwargs)#

Sort the GroupableMatrix along an axis, then populate grouping indices.

Parameters:
  • axis (int) – The axis along which values are grouped.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None
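
One way to picture "sort, then populate grouping indices" is the NumPy sketch below. The variable names echo the taxa_grp_name, taxa_grp_stix, taxa_grp_spix, and taxa_grp_len attributes listed on this page, but the code is an illustration only, and treating the stop index as exclusive is an assumption.

```python
import numpy as np

# Hypothetical taxa labels and their group labels, initially unsorted.
taxa = np.array(["c", "a", "d", "b", "e"])
taxa_grp = np.array([2, 1, 2, 1, 3])

# Step 1: stable sort by group label, applied to every per-taxon array.
order = np.argsort(taxa_grp, kind="stable")
taxa = taxa[order]
taxa_grp = taxa_grp[order]

# Step 2: derive group names, start indices, and lengths from the sorted labels.
grp_name, grp_stix, grp_len = np.unique(
    taxa_grp, return_index=True, return_counts=True
)
grp_spix = grp_stix + grp_len  # exclusive stop index (assumed convention)

print(grp_name)  # [1 2 3]
print(grp_stix)  # [0 2 4]
print(grp_spix)  # [2 4 5]
print(grp_len)   # [2 2 1]
```

With these indices, the members of group k are simply taxa[grp_stix[k]:grp_spix[k]].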

abstract group_taxa(**kwargs)#

Sort the Matrix along the taxa axis, then populate grouping indices for the taxa axis.

Parameters:

kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract group_vrnt(**kwargs)#

Sort the VariantMatrix along the variant axis, then populate grouping indices for the variant axis.

Parameters:

kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract gtcount(dtype)[source]#

Gather genotype counts for homozygous major, heterozygous, homozygous minor for all individuals.

Parameters:

dtype (dtype, None) – The data type of the returned array. If None, use the native type.

Returns:

out – A numpy.ndarray of shape (g,p) containing genotype counts across all p loci for each of g genotype combinations.

Where:

  • out[0] is the count of 0 genotype across all loci

  • out[1] is the count of 1 genotype across all loci

  • out[2] is the count of 2 genotype across all loci

  • ...

  • out[g-1] is the count of g-1 genotype across all loci

Return type:

numpy.ndarray
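
To make the (g,p) layout concrete, the sketch below counts genotypes in a small unphased matrix. It assumes diploid genotypes coded {0,1,2} in an array of shape (ntaxa, nvrnt), so g = ploidy + 1 = 3; the data and layout are assumptions for illustration.

```python
import numpy as np

# Hypothetical unphased genotypes coded {0,1,2}, shape (ntaxa, nvrnt) = (4, 3).
gt = np.array([
    [0, 1, 2],
    [0, 2, 2],
    [1, 1, 2],
    [0, 0, 1],
])
ploidy = 2
g = ploidy + 1  # number of distinct genotype codes

# Row k holds, for every locus, how many taxa carry genotype code k.
counts = np.stack([(gt == k).sum(axis=0) for k in range(g)])
print(counts)
# [[3 1 0]
#  [1 2 1]
#  [0 1 3]]
```

Each column sums to the number of taxa, which is a convenient sanity check.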

abstract gtfreq(dtype)[source]#

Gather genotype frequencies for homozygous major, heterozygous, homozygous minor across all individuals.

Parameters:

dtype (DTypeLike, None) – The data type of the returned array. If None, use the native type.

Returns:

out – A numpy.ndarray of shape (g,p) containing genotype frequencies across all p loci for all g genotype combinations.

Where:

  • out[0] is the frequency of 0 genotype across all loci

  • out[1] is the frequency of 1 genotype across all loci

  • out[2] is the frequency of 2 genotype across all loci

  • ...

  • out[g-1] is the frequency of g-1 genotype across all loci

Return type:

numpy.ndarray
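
Genotype frequencies are genotype counts divided by the number of taxa. A short sketch, starting from a hypothetical (g,p) count array for four diploid taxa:

```python
import numpy as np

# Hypothetical genotype counts, shape (g, p) = (3, 3), for 4 taxa.
counts = np.array([
    [3, 1, 0],  # genotype 0
    [1, 2, 1],  # genotype 1
    [0, 1, 3],  # genotype 2
])

# Every column of counts sums to the number of taxa.
ntaxa = counts.sum(axis=0)

# Divide each column by its total so that frequencies sum to 1 per locus.
freq = counts / ntaxa
print(freq[:, 0])  # [0.75 0.25 0.  ]
```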

abstract incorp(obj, values, axis, **kwargs)#

Incorporate values along the given axis before the given indices.

Parameters:
  • obj (int, slice, or Sequence of ints) – Object that defines the index or indices before which values is incorporated.

  • values (numpy.ndarray) – Values to incorporate into the matrix.

  • axis (int) – The axis along which values are incorporated.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract incorp_taxa(obj, values, taxa, taxa_grp, **kwargs)#

Incorporate values along the taxa axis before the given indices.

Parameters:
  • obj (int, slice, or Sequence of ints) – Object that defines the index or indices before which values is incorporated.

  • values (Matrix, numpy.ndarray) – Values to incorporate into the matrix.

  • taxa (numpy.ndarray) – Taxa names to incorporate into the Matrix.

  • taxa_grp (numpy.ndarray) – Taxa groups to incorporate into the Matrix.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract incorp_vrnt(obj, values, vrnt_chrgrp, vrnt_phypos, vrnt_name, vrnt_genpos, vrnt_xoprob, vrnt_hapgrp, vrnt_mask, **kwargs)#

Incorporate values along the variant axis before the given indices.

Parameters:
  • obj (int, slice, or Sequence of ints) – Object that defines the index or indices before which values is incorporated.

  • values (Matrix, numpy.ndarray) – Values to incorporate into the VariantMatrix.

  • vrnt_chrgrp (numpy.ndarray) – Variant chromosome groups to incorporate into the VariantMatrix.

  • vrnt_phypos (numpy.ndarray) – Variant chromosome physical positions to incorporate into the VariantMatrix.

  • vrnt_name (numpy.ndarray) – Variant names to incorporate into the VariantMatrix.

  • vrnt_genpos (numpy.ndarray) – Variant chromosome genetic positions to incorporate into the VariantMatrix.

  • vrnt_xoprob (numpy.ndarray) – Sequential variant crossover probabilities to incorporate into the VariantMatrix.

  • vrnt_hapgrp (numpy.ndarray) – Variant haplotype labels to incorporate into the VariantMatrix.

  • vrnt_mask (numpy.ndarray) – Variant mask to incorporate into the VariantMatrix.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract insert(obj, values, axis, **kwargs)#

Insert values along the given axis before the given indices.

Parameters:
  • obj (int, slice, or Sequence of ints) – Object that defines the index or indices before which values is inserted.

  • values (ArrayLike) – Values to insert into the matrix.

  • axis (int) – The axis along which values are inserted.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A Matrix with values inserted. Note that insert does not occur in-place: a new Matrix is allocated and filled.

Return type:

Matrix
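
The "before the given indices" wording matches numpy.insert. A brief sketch on a raw array:

```python
import numpy as np

mat = np.array([[0, 1],
                [2, 3],
                [4, 5]])      # 3 taxa x 2 variants
new_row = np.array([9, 9])    # values for one new taxon

# Insert before row index 1; the input array is left unchanged.
out = np.insert(mat, 1, new_row, axis=0)
print(out)
# [[0 1]
#  [9 9]
#  [2 3]
#  [4 5]]
```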

abstract insert_taxa(obj, values, taxa, taxa_grp, **kwargs)#

Insert values along the taxa axis before the given indices.

Parameters:
  • obj (int, slice, or Sequence of ints) – Object that defines the index or indices before which values is inserted.

  • values (Matrix, numpy.ndarray) – Values to insert into the matrix.

  • taxa (numpy.ndarray) – Taxa names to insert into the Matrix.

  • taxa_grp (numpy.ndarray) – Taxa groups to insert into the Matrix.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A TaxaMatrix with values inserted. Note that insert does not occur in-place: a new TaxaMatrix is allocated and filled.

Return type:

TaxaMatrix

abstract insert_vrnt(obj, values, vrnt_chrgrp, vrnt_phypos, vrnt_name, vrnt_genpos, vrnt_xoprob, vrnt_hapgrp, vrnt_mask, **kwargs)#

Insert values along the variant axis before the given indices.

Parameters:
  • obj (int, slice, or Sequence of ints) – Object that defines the index or indices before which values is inserted.

  • values (array_like) – Values to insert into the matrix.

  • vrnt_chrgrp (numpy.ndarray) – Variant chromosome groups to insert into the Matrix.

  • vrnt_phypos (numpy.ndarray) – Variant chromosome physical positions to insert into the Matrix.

  • vrnt_name (numpy.ndarray) – Variant names to insert into the Matrix.

  • vrnt_genpos (numpy.ndarray) – Variant chromosome genetic positions to insert into the Matrix.

  • vrnt_xoprob (numpy.ndarray) – Sequential variant crossover probabilities to insert into the Matrix.

  • vrnt_hapgrp (numpy.ndarray) – Variant haplotype labels to insert into the Matrix.

  • vrnt_mask (numpy.ndarray) – Variant mask to insert into the Matrix.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – A VariantMatrix with values inserted. Note that insert does not occur in-place: a new VariantMatrix is allocated and filled.

Return type:

VariantMatrix

abstract interp_genpos(gmap, **kwargs)#

Interpolate genetic map positions for variants using a GeneticMap.

Parameters:
  • gmap (GeneticMap) – A genetic map from which to interpolate genetic map positions for loci within the VariantMatrix.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract interp_xoprob(gmap, gmapfn, **kwargs)#

Interpolate genetic map positions AND crossover probabilities between sequential markers using a GeneticMap and a GeneticMapFunction.

Parameters:
  • gmap (GeneticMap) – A genetic map from which to interpolate genetic map positions for loci within the VariantMatrix.

  • gmapfn (GeneticMapFunction) – A genetic map function from which to interpolate crossover probabilities for loci within the VariantMatrix.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None
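
The two interpolation steps can be pictured with plain NumPy. This sketch assumes linear interpolation between map markers on a single chromosome, genetic positions in Morgans, and the Haldane mapping function r = 0.5 * (1 - exp(-2d)). The map data, the choice of Haldane, and the value 0.5 for the first marker are all assumptions of this example; GeneticMap and GeneticMapFunction implementations may behave differently.

```python
import numpy as np

# Hypothetical genetic map for one chromosome:
# physical positions (bp) and genetic positions (Morgans).
map_phypos = np.array([0.0, 1_000_000.0, 2_000_000.0])
map_genpos = np.array([0.0, 0.5, 1.5])

# Physical positions of the variants to be placed on the map.
vrnt_phypos = np.array([250_000.0, 1_000_000.0, 1_500_000.0])

# Step 1: interpolate genetic positions linearly between map markers.
vrnt_genpos = np.interp(vrnt_phypos, map_phypos, map_genpos)
print(vrnt_genpos)  # [0.125 0.5   1.   ]


def haldane(d):
    """Haldane mapping function: map distance (Morgans) to crossover probability."""
    return 0.5 * (1.0 - np.exp(-2.0 * d))


# Step 2: crossover probability between each marker and the one before it.
vrnt_xoprob = np.empty_like(vrnt_genpos)
vrnt_xoprob[0] = 0.5  # assumed: first marker has no predecessor on the chromosome
vrnt_xoprob[1:] = haldane(np.diff(vrnt_genpos))
```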

abstract is_grouped(axis, **kwargs)#

Determine whether the Matrix has been sorted and grouped.

Parameters:
  • axis (int) – Axis along which to determine whether elements have been sorted and grouped.

  • kwargs (dict) – Additional keyword arguments.

Returns:

grouped – True or False indicating whether the Matrix has been sorted and grouped.

Return type:

bool

abstract is_grouped_taxa(**kwargs)#

Determine whether the Matrix has been sorted and grouped along the taxa axis.

Parameters:

kwargs (dict) – Additional keyword arguments.

Returns:

grouped – True or False indicating whether the Matrix has been sorted and grouped.

Return type:

bool

abstract is_grouped_vrnt(**kwargs)#

Determine whether the Matrix has been sorted and grouped along the variant axis.

Parameters:

kwargs (dict) – Additional keyword arguments.

Returns:

grouped – True or False indicating whether the Matrix has been sorted and grouped.

Return type:

bool

abstract lexsort(keys, axis, **kwargs)#

Perform an indirect stable sort using a sequence of keys.

Parameters:
  • keys (A (k, N) array or tuple containing k (N,)-shaped sequences) – The k different columns to be sorted. The last column (or row if keys is a 2D array) is the primary sort key.

  • axis (int) – Axis to be indirectly sorted.

  • kwargs (dict) – Additional keyword arguments.

Returns:

indices – Array of indices that sort the keys along the specified axis.

Return type:

A (N,) ndarray of ints
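
The key ordering follows numpy.lexsort, where the last key is the primary one. For example, to order variants by chromosome group and then by physical position within each chromosome (the arrays below are made up for illustration):

```python
import numpy as np

vrnt_chrgrp = np.array([2, 1, 2, 1])         # chromosome of each variant
vrnt_phypos = np.array([50, 300, 10, 100])   # physical position of each variant

# Keys are listed from least to most significant: position, then chromosome.
indices = np.lexsort((vrnt_phypos, vrnt_chrgrp))
print(indices)               # [3 1 2 0]
print(vrnt_chrgrp[indices])  # [1 1 2 2]
print(vrnt_phypos[indices])  # [100 300  10  50]
```

The call only returns indices; nothing is rearranged until those indices are applied, for instance through reorder.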

abstract lexsort_taxa(keys, **kwargs)#

Perform an indirect stable sort using a sequence of keys along the taxa axis.

Parameters:
  • keys (A (k, N) array or tuple containing k (N,)-shaped sequences) – The k different columns to be sorted. The last column (or row if keys is a 2D array) is the primary sort key.

  • kwargs (dict) – Additional keyword arguments.

Returns:

indices – Array of indices that sort the keys along the specified axis.

Return type:

A (N,) ndarray of ints

abstract lexsort_vrnt(keys, **kwargs)#

Perform an indirect stable sort using a sequence of keys along the variant axis.

Parameters:
  • keys (A (k, N) array or tuple containing k (N,)-shaped sequences) – The k different columns to be sorted. The last column (or row if keys is a 2D array) is the primary sort key.

  • kwargs (dict) – Additional keyword arguments.

Returns:

indices – Array of indices that sort the keys along the specified axis.

Return type:

A (N,) ndarray of ints

abstract maf(dtype)[source]#

Minor allele frequency across all taxa.

Parameters:

dtype (dtype, None) – The data type of the returned array. If None, use the native type.

Returns:

out – A numpy.ndarray of shape (p,) containing allele frequencies for the minor allele.

Return type:

numpy.ndarray
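
For biallelic loci, the minor allele frequency is the smaller of p and 1 - p, where p is the frequency of the 1-coded allele. A short sketch with hypothetical frequencies:

```python
import numpy as np

# Hypothetical frequencies of the 1-coded allele at four loci.
freq = np.array([0.1, 0.5, 0.8, 1.0])

# The minor allele is whichever of the two alleles is rarer at each locus.
maf = np.minimum(freq, 1.0 - freq)  # approximately [0.1, 0.5, 0.2, 0.0]
```

By construction the result never exceeds 0.5, and it is 0 at fixed loci.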

abstract property mat: object#

Pointer to raw matrix object.

abstract mat_asformat(format)[source]#

Get the genotype matrix in a specific format type.

Parameters:

format (str) – Desired output format. Options are "{0,1,2}", "{-1,0,1}", "{-1,m,1}".

Returns:

out – Matrix in the desired output format.

Return type:

numpy.ndarray
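
As a rough illustration of a format conversion, the sketch below recodes diploid genotypes from {0,1,2} to {-1,0,1} by subtracting 1, so that heterozygotes are centered on zero. This is an assumption about what the two format names mean; the "{-1,m,1}" format is not defined in this section and is left out.

```python
import numpy as np

# Hypothetical diploid genotypes coded {0,1,2}, shape (ntaxa, nvrnt) = (2, 3).
gt012 = np.array([[0, 1, 2],
                  [2, 2, 0]], dtype="int8")

# Assumed meaning of the {-1,0,1} coding: shift so that the heterozygote is 0.
gt_centered = gt012 - 1
print(gt_centered)
# [[-1  0  1]
#  [ 1  1 -1]]
```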

abstract property mat_format: str#

Matrix representation format.

abstract property mat_ndim: int#

Number of dimensions of the raw matrix.

abstract property mat_shape: tuple#

Shape of the raw matrix.

abstract meh(dtype)[source]#

Mean expected heterozygosity across all taxa.

Parameters:

dtype (dtype, None) – The data type of the returned array. If None, use the native type.

Returns:

out – A number representing the mean expected heterozygosity. If dtype is None, a native 64-bit floating point value is returned. Otherwise, the value has the type specified by dtype.

Return type:

Real
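
One common definition, used here as an assumption, takes the expected heterozygosity of a biallelic diploid locus under Hardy-Weinberg proportions to be 2p(1 - p) and averages it over loci. The sketch below follows that definition; a concrete subclass may use a different estimator.

```python
import numpy as np

# Hypothetical frequencies of the 1-coded allele at four loci.
freq = np.array([0.5, 0.25, 1.0, 0.0])
ploidy = 2

# Expected heterozygosity per locus: 2p(1 - p) for a diploid.
het = ploidy * freq * (1.0 - freq)  # [0.5, 0.375, 0.0, 0.0]

# Mean over loci, returned as a plain Python float.
meh = float(het.mean())
print(meh)  # 0.21875
```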

abstract property nphase: int#

The number of phases represented by the genotype matrix.

abstract property ntaxa: int#

Number of taxa.

abstract property nvrnt: int#

Number of variants.

abstract property ploidy: int#

The ploidy level represented by the genotype matrix.

abstract remove(obj, axis, **kwargs)#

Remove sub-arrays along an axis.

Parameters:
  • obj (int, slice, or Sequence of ints) – Indicate indices of sub-arrays to remove along the specified axis.

  • axis (int) – The axis along which to remove the subarray defined by obj.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract remove_taxa(obj, **kwargs)#

Remove sub-arrays along the taxa axis.

Parameters:
  • obj (int, slice, or Sequence of ints) – Indicate indices of sub-arrays to remove along the specified axis.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract remove_vrnt(obj, **kwargs)#

Remove sub-arrays along the variant axis.

Parameters:
  • obj (int, slice, or Sequence of ints) – Indicate indices of sub-arrays to remove along the specified axis.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract reorder(indices, axis, **kwargs)#

Reorder elements of the Matrix using an array of indices. Note this modifies the Matrix in-place.

Parameters:
  • indices (A (N,) ndarray of ints, Sequence of ints) – Array of indices that reorder the matrix along the specified axis.

  • axis (int) – Axis to be reordered.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None
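
A sketch of an in-place reorder on raw arrays follows. The second reference, alias, shows that the original buffer itself changes. In a real Matrix, every per-taxon or per-variant metadata array would need the same treatment, as done for the hypothetical taxa labels here.

```python
import numpy as np

mat = np.array([[10, 11],
                [20, 21],
                [30, 31]])              # 3 taxa x 2 variants
taxa = np.array(["t0", "t1", "t2"])     # hypothetical taxa labels
alias = mat                             # second reference to the same buffer

indices = np.array([2, 0, 1])

# Fancy indexing makes a reordered copy, which is then written back in place.
mat[:] = mat[indices]
taxa[:] = taxa[indices]

print(alias)
# [[30 31]
#  [10 11]
#  [20 21]]
print(taxa)  # ['t2' 't0' 't1']
```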

abstract reorder_taxa(indices, **kwargs)#

Reorder elements of the Matrix along the taxa axis using an array of indices. Note this modifies the Matrix in-place.

Parameters:
  • indices (A (N,) ndarray of ints) – Array of indices that reorder the matrix along the specified axis.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract reorder_vrnt(indices, **kwargs)#

Reorder elements of the Matrix along the variant axis using an array of indices. Note this modifies the Matrix in-place.

Parameters:
  • indices (A (N,) ndarray of ints, Sequence of ints) – Array of indices that reorder the matrix along the specified axis.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract select(indices, axis, **kwargs)#

Select certain values from the matrix.

Parameters:
  • indices (ArrayLike (Nj, ...)) – The indices of the values to select.

  • axis (int) – The axis along which values are selected.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – The output matrix with values selected. Note that select does not occur in-place: a new Matrix is allocated and filled.

Return type:

Matrix
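
Selection along an axis behaves much like numpy.take: indices may repeat or appear in any order, and the result is a new array. A short sketch on a raw array:

```python
import numpy as np

mat = np.arange(12).reshape(3, 4)  # 3 taxa x 4 variants

# Pick variants 3 and 0, in that order, for every taxon.
out = np.take(mat, [3, 0], axis=1)
print(out)
# [[ 3  0]
#  [ 7  4]
#  [11  8]]
```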

abstract select_taxa(indices, **kwargs)#

Select certain values from the Matrix along the taxa axis.

Parameters:
  • indices (array_like (Nj, ...)) – The indices of the values to select.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – The output TaxaMatrix with values selected. Note that select does not occur in-place: a new TaxaMatrix is allocated and filled.

Return type:

TaxaMatrix

abstract select_vrnt(indices, **kwargs)#

Select certain values from the VariantMatrix along the variant axis.

Parameters:
  • indices (ArrayLike (Nj, ...)) – The indices of the values to select.

  • kwargs (dict) – Additional keyword arguments.

Returns:

out – The output VariantMatrix with values selected. Note that select does not occur in-place: a new VariantMatrix is allocated and filled.

Return type:

VariantMatrix

abstract sort(keys, axis, **kwargs)#

Sort elements of the Matrix using a sequence of keys. Note this modifies the Matrix in-place.

Parameters:
  • keys (A (k, N) array or tuple containing k (N,)-shaped sequences) – The k different columns to be sorted. The last column (or row if keys is a 2D array) is the primary sort key.

  • axis (int) – Axis to be indirectly sorted.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract sort_taxa(keys, **kwargs)#

Sort elements of the Matrix along the taxa axis using a sequence of keys. Note this modifies the Matrix in-place.

Parameters:
  • keys (A (k, N) array or tuple containing k (N,)-shaped sequences) – The k different columns to be sorted. The last column (or row if keys is a 2D array) is the primary sort key.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract sort_vrnt(keys, **kwargs)#

Sort elements of the Matrix along the variant axis using a sequence of keys. Note this modifies the Matrix in-place.

Parameters:
  • keys (A (k, N) array or tuple containing k (N,)-shaped sequences) – The k different columns to be sorted. The last column (or row if keys is a 2D array) is the primary sort key.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract tacount(dtype)[source]#

Allele count of the non-zero allele within each taxon.

Parameters:

dtype (dtype, None) – The data type of the accumulator and returned array. If None, use the native accumulator type (int or float).

Returns:

out – A numpy.ndarray of shape (n,p) containing allele counts of the allele coded as 1 for all n individuals, for all p loci.

Where:

  • n is the number of taxa (individuals).

  • p is the number of variants (loci).

Return type:

numpy.ndarray
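
For a phased 0/1 array, the per-taxon count is the sum over the phase axis. The sketch assumes the layout (nphase, ntaxa, nvrnt), which is an assumption of this example.

```python
import numpy as np

# Hypothetical phased genotypes, shape (nphase, ntaxa, nvrnt) = (2, 3, 4).
geno = np.array([
    [[0, 1, 1, 0],
     [1, 1, 0, 0],
     [0, 1, 1, 0]],
    [[0, 1, 0, 0],
     [1, 1, 0, 1],
     [0, 1, 1, 0]],
], dtype="int8")

# Collapse the phase axis: the result has shape (n, p) = (3, 4).
tacount = geno.sum(axis=0)
print(tacount)
# [[0 2 1 0]
#  [2 2 0 1]
#  [0 2 2 0]]
```

Summing this result over the taxa axis gives the population-wide count of shape (p,).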

abstract tafreq(dtype)[source]#

Allele frequency of the non-zero allele within each taxon.

Parameters:

dtype (dtype, None) – The data type of the returned array. If None, use the native type.

Returns:

out – A numpy.ndarray of shape (n,p) containing allele frequencies of the allele coded as 1 for all n individuals, for all p loci.

Return type:

numpy.ndarray
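
Within a single taxon, the allele frequency is that taxon's allele count divided by its number of chromosome copies. A small sketch, again assuming a phased (nphase, ntaxa, nvrnt) layout with nphase equal to the ploidy:

```python
import numpy as np

# Hypothetical phased genotypes, shape (nphase, ntaxa, nvrnt) = (2, 2, 2).
geno = np.array([
    [[0, 1],
     [1, 1]],
    [[0, 0],
     [1, 0]],
], dtype="int8")

nphase = geno.shape[0]

# Per-taxon frequency: 0.0, 0.5, or 1.0 for a diploid.
tafreq = geno.sum(axis=0) / nphase
print(tafreq)
# [[0.  0.5]
#  [1.  0.5]]
```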

abstract property taxa: object#

Taxa label.

abstract property taxa_axis: int#

Axis along which taxa are stored.

abstract property taxa_grp: object#

Taxa group label.

abstract property taxa_grp_len: object#

Taxa group length.

abstract property taxa_grp_name: object#

Taxa group name.

abstract property taxa_grp_spix: object#

Taxa group stop index.

abstract property taxa_grp_stix: object#

Taxa group start index.

abstract to_hdf5(filename, groupname, overwrite)#

Write an object to an HDF5 file.

Parameters:
  • filename (str, Path, h5py.File) – If str, an HDF5 file name to which to write. If Path, an HDF5 file path to which to write. If h5py.File, an opened HDF5 file to which to write.

  • groupname (str, None) – If str, an HDF5 group name under which object data is stored. If None, object is written to the base HDF5 group.

  • overwrite (bool) – Whether to overwrite values in an HDF5 file if a field already exists.

Return type:

None

abstract ungroup(axis, **kwargs)#

Ungroup the GroupableMatrix along an axis by removing grouping metadata.

Parameters:
  • axis (int) – The axis along which values should be ungrouped.

  • kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract ungroup_taxa(**kwargs)#

Ungroup the TaxaMatrix along the taxa axis by removing taxa group metadata.

Parameters:

kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract ungroup_vrnt(**kwargs)#

Ungroup the VariantMatrix along the variant axis by removing variant group metadata.

Parameters:

kwargs (dict) – Additional keyword arguments.

Return type:

None

abstract property vrnt_axis: int#

Axis along which variants are stored.

abstract property vrnt_chrgrp: object#

Variant chromosome group label.

abstract property vrnt_chrgrp_len: object#

Variant chromosome group length.

abstract property vrnt_chrgrp_name: object#

Variant chromosome group names.

abstract property vrnt_chrgrp_spix: object#

Variant chromosome group stop indices.

abstract property vrnt_chrgrp_stix: object#

Variant chromosome group start indices.

abstract property vrnt_genpos: object#

Variant genetic position.

abstract property vrnt_hapalt: object#

Variant haplotype sequence.

abstract property vrnt_hapgrp: object#

Variant haplotype group label.

abstract property vrnt_hapref: object#

Variant reference haplotype sequence.

abstract property vrnt_mask: object#

Variant mask.

abstract property vrnt_name: object#

Variant name.

abstract property vrnt_phypos: object#

Variant physical position.

abstract property vrnt_xoprob: object#

Variant crossover sequential probability.