Package 'LDcorSV'

Title: Linkage Disequilibrium Corrected by the Structure and the Relatedness
Description: Four measures of linkage disequilibrium are provided: the usual r^2 measure, the r^2_S measure (r^2 corrected by the structure of the sample), the r^2_V measure (r^2 corrected by the relatedness of genotyped individuals), and the r^2_VS measure (r^2 corrected by both the relatedness of genotyped individuals and the structure of the sample).
Authors: David Desrousseaux, Florian Sandron, Aurélie Siberchicot, Christine Cierco-Ayrolles and Brigitte Mangin
Maintainer: Aurélie Siberchicot <[email protected]>
License: GPL (>= 2)
Version: 1.3.3
Built: 2024-11-06 02:39:25 UTC
Source: https://github.com/cran/LDcorSV


LDcorSV

Description

The package provides a set of functions whose aim is to provide four measures of linkage disequilibrium:

- Measure.R2: the usual r^2 measure.

- Measure.R2S: r^2 corrected by the structure of the sample (r^2_S).

- Measure.R2V: r^2 corrected by the relatedness of genotyped individuals (r^2_V).

- Measure.R2VS: r^2 corrected by both the relatedness of genotyped individuals and the structure of the sample (r^2_VS).

- LD.Measures: computes the four measures of linkage disequilibrium (r^2, r^2_V, r^2_S and r^2_VS) for a set of loci and gives extra information about them.

Author(s)

David Desrousseaux, Florian Sandron, Aurélie Siberchicot, Christine Cierco-Ayrolles and Brigitte Mangin

Maintainer: [email protected]

References

Mangin, B., Siberchicot, A., Nicolas, S., Doligez, A., This, P., Cierco-Ayrolles, C. (2012). Novel measures of linkage disequilibrium that correct the bias due to population structure and relatedness. Heredity, 108 (3), 285-291. DOI: 10.1038/hdy.2011.73


data.test

Description

data.test is a list of 3 elements:

- Geno: allelic doses of 20 markers on a chromosome of 91 Vitis vinifera plants.

- V.WAIS: kinship matrix of 91 plants of Vitis vinifera.

- S.2POP: population structure matrix of 91 plants of Vitis vinifera in two sub-populations.

Usage

data(data.test)

Format

A list containing the following components:

- Geno: matrix (91 x 20) of numerical values

- V.WAIS: matrix (91 x 91) of numerical values

- S.2POP: matrix (91 x 1) of numerical values

Examples

data(data.test)

# Allelic doses of 20 markers on a chromosome of 91 Vitis vinifera plants
Geno <- data.test[[1]]
Geno

# Kinship matrix of 91 plants of Vitis vinifera
V.WAIS <- data.test[[2]]
V.WAIS

# Population structure matrix of 91 plants of Vitis vinifera
# in two sub-populations
S.2POP <- data.test[[3]]
S.2POP

Information on loci

Description

For a locus, this function computes the minor allelic frequency, the frequency of heterozygous genotypes and the missing value frequency.

Usage

Info.Locus(locus,data="G")

Arguments

locus

Numeric vector of allelic doses.

data

Value equal to "G" or "H" depending on the type of data (Genotype or Haplotype). Default value is "G".

Value

The returned value is a numeric vector of three values which are respectively the minor allelic frequency, the frequency of heterozygous genotypes (NA if haplotype data) and the missing value frequency.
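The three statistics described above can be sketched in base R as follows. This is only an illustrative re-implementation, not the package's code: the name `locus_summary` is hypothetical, and the choice to compute the allele and heterozygote frequencies over non-missing individuals only is an assumption.

```r
# Illustrative sketch; `locus_summary` is a hypothetical helper, not part of LDcorSV.
locus_summary <- function(locus, data = "G") {
  observed <- locus[!is.na(locus)]
  # Maximum allelic dose per individual: 2 for genotypes, 1 for haplotypes
  max_dose <- if (data == "G") 2 else 1
  p <- mean(observed) / max_dose            # frequency of the counted allele
  maf <- min(p, 1 - p)                      # minor allele frequency
  # Heterozygotes have dose 1; not defined for haplotype data
  het <- if (data == "G") mean(observed == 1) else NA
  missing <- mean(is.na(locus))             # share of missing values
  c(MAF = maf, Het = het, Missing = missing)
}

locus <- c(0, 1, 2, 1, NA, 0, 0, 1)
res <- locus_summary(locus)
res
```

Applied column by column with `apply(Geno, 2, locus_summary)`, such a helper would give one column of three values per marker.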

Author(s)

David Desrousseaux, Florian Sandron, Aurélie Siberchicot, Christine Cierco-Ayrolles and Brigitte Mangin

Examples

data(data.test)
Geno <- data.test[[1]]
info <- apply(Geno, 2, Info.Locus)
info

Inv.proj.matrix.sdp

Description

This function computes the Moore-Penrose pseudo-inverse of a symmetric matrix. A singular value decomposition is performed, the non-positive eigenvalues are set to zero, then the pseudo-inverse is computed.

Usage

Inv.proj.matrix.sdp(matrix)

Arguments

matrix

symmetric matrix

Value

The returned value is the pseudo-inverse matrix.
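A minimal sketch of this kind of pseudo-inverse is given below. It uses `eigen()` rather than a singular value decomposition, and the helper name `pseudo_inverse_psd` and the tolerance `tol` are assumptions made for the illustration; the package's own implementation may differ.

```r
# Illustrative sketch; `pseudo_inverse_psd` and `tol` are hypothetical.
pseudo_inverse_psd <- function(m, tol = 1e-10) {
  e <- eigen(m, symmetric = TRUE)
  # Invert eigenvalues above the tolerance; the others are set to zero
  inv_values <- ifelse(e$values > tol, 1 / e$values, 0)
  e$vectors %*% diag(inv_values, nrow = length(inv_values)) %*% t(e$vectors)
}

# Rank-deficient symmetric matrix: the ordinary inverse does not exist
m <- matrix(c(2, 2, 2, 2), nrow = 2)
p <- pseudo_inverse_psd(m)

# One of the Moore-Penrose conditions: m %*% p %*% m recovers m
all.equal(m %*% p %*% m, m)
```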

Author(s)

David Desrousseaux, Florian Sandron, Aurélie Siberchicot, Christine Cierco-Ayrolles and Brigitte Mangin

Examples

data(data.test)
V.WAIS <- data.test[[2]]
Inv.V.WAIS <- Inv.proj.matrix.sdp(V.WAIS)
Inv.V.WAIS

LD Measures

Description

This function estimates for a set of loci:

- the usual measure of linkage disequilibrium (r^2).

- the measure of linkage disequilibrium corrected by the structure of the sample (r^2_S).

- the measure of linkage disequilibrium corrected by the relatedness of genotyped individuals (r^2_V).

- the measure of linkage disequilibrium corrected by both the relatedness of genotyped individuals and the structure of the sample (r^2_VS).

This function also gives extra information on the studied loci.

Usage

LD.Measures(donnees, V = NA, S = NA, data = "G", supinfo = FALSE, na.presence=TRUE)

Arguments

donnees

Numeric matrix (N x M), where N is the number of genotypes (or haplotypes) and M is the number of markers. Matrix values are the allelic doses:

- (0,1,2) for genotypes.

- (0,1) for haplotypes.

Row names correspond to the ID of individuals. Column names correspond to the ID of markers. Missing values are allowed.

V

Numeric matrix (N x N), where N is the number of genotypes (or haplotypes). Matrix values are coefficients of genetic variance-covariance between every pair of individuals. Row and column names must correspond to the ID of individuals. No missing value.

S

Numeric matrix (N x (P-1)), where N is the number of genotypes (or haplotypes) and P the number of sub-populations. Matrix values are the probabilities (between 0 and 1) for each genotype (or haplotype) to belong to each sub-population. Row names must correspond to the ID of individuals. Column names correspond to the ID of sub-populations. The matrix must be invertible: if the structure has P sub-populations, only P-1 columns are expected. No missing value.

data

Value equal to "G" or "H" depending on the type of data (Genotype or Haplotype). Default value is "G".

supinfo

Boolean indicating whether you wish to get information about the loci. If supinfo=TRUE, for each locus, the Minor Allelic Frequency (MAF), the frequency of heterozygous genotypes (only if the data are genotypes) and the missing value frequency are computed. By default, supinfo=FALSE.

na.presence

Boolean indicating the presence of missing values in data. If na.presence=FALSE (no missing data), computation of r^2_V and r^2_VS is largely optimized. By default, na.presence=TRUE.
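To illustrate the shape expected for S, the sketch below starts from a hypothetical membership matrix `Q` with one column per sub-population (P = 3) and keeps only P-1 columns. The data and the names `Q`, `ind1`..`ind4` and `pop1`..`pop3` are invented for the example. Since membership probabilities sum to 1 in each row, the dropped column carries no extra information.

```r
# Hypothetical membership probabilities: 4 individuals, P = 3 sub-populations
Q <- matrix(c(0.7, 0.2, 0.1,
              0.1, 0.8, 0.1,
              0.3, 0.3, 0.4,
              0.0, 0.1, 0.9),
            ncol = 3, byrow = TRUE,
            dimnames = list(paste0("ind", 1:4), paste0("pop", 1:3)))

# Each row sums to 1, so one column is redundant
rowSums(Q)

# Keep P - 1 columns; drop = FALSE preserves the matrix shape even when P = 2
S <- Q[, -ncol(Q), drop = FALSE]
dim(S)
```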

Value

The returned value is a data frame with M(M-1)/2 rows and C columns, where M is the number of markers and C is a number between 3 and 12 depending on the options chosen by the user. The first three columns contain, respectively, the name of the first marker, the name of the second marker and the estimated value of the usual measure of linkage disequilibrium (r^2) between these two markers.

If only V is different from NA, the fourth column contains the estimated value of the measure of linkage disequilibrium corrected by the relatedness of genotyped individuals (r^2_V).

If only S is different from NA, the fourth column contains the estimated value of the measure of linkage disequilibrium corrected by the structure of the sample (r^2_S).

If V and S are simultaneously different from NA, the fourth, fifth and sixth columns respectively contain the estimated values of r^2_V, r^2_S and r^2_VS (r^2 corrected by both the relatedness of genotyped individuals and the structure of the sample).

If supinfo=TRUE, then the last six columns contain information for both loci: the MAF, the frequency of heterozygous genotypes (NA if haplotype data) and the missing value frequency.
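The rules above for the size of the result can be summarised in a short helper. The function `expected_shape` and its arguments `has_V` and `has_S` are hypothetical and only restate the description; they are not part of the package.

```r
# Hypothetical helper restating the row and column rules described above
expected_shape <- function(M, has_V = FALSE, has_S = FALSE, supinfo = FALSE) {
  n_rows <- M * (M - 1) / 2        # one row per pair of markers
  n_cols <- 3                      # marker 1, marker 2, r^2
  if (has_V && has_S) {
    n_cols <- n_cols + 3           # r^2_V, r^2_S and r^2_VS
  } else if (has_V || has_S) {
    n_cols <- n_cols + 1           # r^2_V or r^2_S
  }
  if (supinfo) {
    n_cols <- n_cols + 6           # three statistics for each of the two loci
  }
  c(rows = n_rows, cols = n_cols)
}

# 20 markers with V, S and supinfo = TRUE: 190 rows and 12 columns
expected_shape(20, has_V = TRUE, has_S = TRUE, supinfo = TRUE)
```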

Author(s)

David Desrousseaux, Florian Sandron, Aurélie Siberchicot, Christine Cierco-Ayrolles and Brigitte Mangin

References

Mangin, B., Siberchicot, A., Nicolas, S., Doligez, A., This, P., Cierco-Ayrolles, C. (2012). Novel measures of linkage disequilibrium that correct the bias due to population structure and relatedness. Heredity, 108 (3), 285-291. DOI: 10.1038/hdy.2011.73

Examples

data(data.test)
Geno <- data.test[[1]]
V.WAIS <- data.test[[2]]
S.2POP <- data.test[[3]]

LD <- LD.Measures(Geno, V = V.WAIS, S = S.2POP, data = "G", supinfo = TRUE, na.presence = TRUE)
head(LD)

r^2 measure

Description

This function estimates the usual measure of linkage disequilibrium (r^2) between two loci.

Usage

Measure.R2(biloci, na.presence = TRUE)

Arguments

biloci

Numeric matrix (N x 2), where N is the number of genotypes (or haplotypes)

Matrix values are the allelic doses:

- (0,1,2) for genotypes.

- (0,1) for haplotypes.

Row names correspond to the ID of individuals.

Column names correspond to the ID of markers.

na.presence

Boolean indicating the presence of missing values in data.

If na.presence=FALSE (no missing data), computation of r^2_V and r^2_VS is largely optimized.

By default, na.presence=TRUE.

Value

The returned value is the estimated value of the usual measure of linkage disequilibrium (r^2), or NA if fewer than 5 individuals have non-missing data at both loci.
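As a rough illustration of this behaviour, the sketch below computes r^2 as the squared Pearson correlation between the allelic doses of the two loci, after dropping individuals with a missing value at either locus. The helper `r2_sketch` and its `min_n` argument are hypothetical, and treating r^2 as a squared correlation of doses is an assumption about the estimator, not a copy of the package code.

```r
# Illustrative sketch; `r2_sketch` and `min_n` are hypothetical.
r2_sketch <- function(biloci, min_n = 5) {
  # Keep individuals with non-missing data at both loci
  complete <- biloci[stats::complete.cases(biloci), , drop = FALSE]
  if (nrow(complete) < min_n) return(NA_real_)
  stats::cor(complete[, 1], complete[, 2])^2
}

biloci <- cbind(m1 = c(0, 1, 2, 2, 1, 0, NA, 2),
                m2 = c(0, 1, 2, 1, 1, 0, 2, NA))
r2_full <- r2_sketch(biloci)          # 6 complete individuals
r2_small <- r2_sketch(biloci[1:4, ])  # only 4 individuals: NA
```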

Author(s)

David Desrousseaux, Florian Sandron, Aurélie Siberchicot, Christine Cierco-Ayrolles and Brigitte Mangin

References

Hill, W.G., Robertson, A. (1968). Linkage disequilibrium in finite populations. Theoretical and Applied Genetics, 38, 226-231. DOI: 10.1007/BF01245622

Examples

data(data.test)
Geno <- data.test[[1]]
# biloci is documented as an N x 2 matrix: use the first two markers
Measure.R2(Geno[, 1:2])

r^2_S measure

Description

This function estimates the novel measure of linkage disequilibrium which is corrected by the structure of the sample.

Usage

Measure.R2S(biloci, struc, na.presence=TRUE)

Arguments

biloci

Numeric matrix (N x 2), where N is the number of genotypes (or haplotypes)

Matrix values are the allelic doses:

- (0,1,2) for genotypes.

- (0,1) for haplotypes.

Row names correspond to the ID of individuals.

Column names correspond to the ID of markers.

struc

Numeric matrix (N x (P-1)), where N is the number of genotypes (or haplotypes) and P the number of sub-populations.

Matrix values are the probabilities for each genotype (or haplotype) to belong to each sub-population.

Row names must correspond to the ID of individuals and must be in the same order as in the biloci matrix.

Column names correspond to the ID of sub-populations.

The matrix must be invertible: if the structure has P sub-populations, only P-1 columns are expected.

No missing value.

na.presence

Boolean indicating the presence of missing values in data.

If na.presence=FALSE (no missing data), computation of r^2_V and r^2_VS is largely optimized.

By default, na.presence=TRUE.

Value

The returned value is the estimated value of the measure of linkage disequilibrium corrected by the structure of the sample, or NA if fewer than 5 individuals have non-missing data at both loci.
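One way to picture a structure correction is as a squared partial correlation: each locus is regressed on the structure covariates and r^2 is computed on the residuals. The sketch below follows that idea on simulated data. The helper `r2s_sketch` is hypothetical, and the assumption that this matches the package's estimator should be checked against Mangin et al. (2012).

```r
# Illustrative sketch; `r2s_sketch` is hypothetical and assumes a partial-correlation reading.
r2s_sketch <- function(biloci, struc) {
  keep <- stats::complete.cases(biloci)
  x <- biloci[keep, , drop = FALSE]
  s <- struc[keep, , drop = FALSE]
  # Residuals of each locus after regression on the structure covariates
  res <- stats::resid(stats::lm(x ~ s))
  stats::cor(res[, 1], res[, 2])^2
}

# Simulated data: two sub-populations with very different allele frequencies,
# and loci drawn independently within each sub-population
set.seed(1)
n <- 200
pop <- rep(c(0, 1), each = n / 2)
freq <- ifelse(pop == 1, 0.9, 0.1)
biloci <- cbind(m1 = rbinom(n, 2, freq), m2 = rbinom(n, 2, freq))
struc <- matrix(pop, ncol = 1)

r2_raw <- cor(biloci[, 1], biloci[, 2])^2   # inflated by the structure
r2_s <- r2s_sketch(biloci, struc)           # much smaller once structure is removed
```

In this simulation the loci are unlinked, so the difference between `r2_raw` and `r2_s` is entirely due to population structure.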

Author(s)

David Desrousseaux, Florian Sandron, Aurélie Siberchicot, Christine Cierco-Ayrolles and Brigitte Mangin

References

Mangin, B., Siberchicot, A., Nicolas, S., Doligez, A., This, P., Cierco-Ayrolles, C. (2012). Novel measures of linkage disequilibrium that correct the bias due to population structure and relatedness. Heredity, 108 (3), 285-291. DOI: 10.1038/hdy.2011.73

Examples

data(data.test)
Geno <- data.test[[1]]
S.2POP <- data.test[[3]]
# biloci is documented as an N x 2 matrix: use the first two markers
Measure.R2S(Geno[, 1:2], S.2POP)

r^2_V measure

Description

This function estimates the novel measure of linkage disequilibrium which is corrected by the relatedness of genotyped individuals.

Usage

Measure.R2V(biloci, V, na.presence=TRUE, V_inv=NULL)

Arguments

biloci

Numeric matrix (N x 2), where N is the number of genotypes (or haplotypes).

Matrix values are the allelic doses:

- (0,1,2) for genotypes.

- (0,1) for haplotypes.

Row names correspond to the ID of individuals.

Column names correspond to the ID of markers.

V

Numeric matrix (N x N), where N is the number of genotypes (or haplotypes).

Matrix values are coefficients of genetic covariance for each pair of individuals.

Row and column names must correspond to the ID of individuals and must be in the same order as in the biloci matrix.

No missing value.

na.presence

Boolean indicating the presence of missing values in data.

If na.presence=FALSE (no missing data), computation of r^2_V and r^2_VS is largely optimized.

By default, na.presence=TRUE.

V_inv

Should stay NULL.

Value

The returned value is the estimated value of the measure of linkage disequilibrium corrected by the relatedness of genotyped individuals, or NA if fewer than 5 individuals have non-missing data at both loci.
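A relatedness correction can be pictured as a correlation in which individuals are weighted by the inverse of V. The sketch below is written under that assumption, with a generalised least squares mean for each locus and no missing data. The helper `r2v_sketch` is hypothetical and is not claimed to reproduce the package's estimator exactly; see Mangin et al. (2012) for the definition.

```r
# Illustrative sketch; `r2v_sketch` is hypothetical and assumes complete data.
r2v_sketch <- function(biloci, V) {
  V_inv <- solve(V)
  ones <- rep(1, nrow(biloci))
  # Generalised least squares estimate of each locus mean
  mu <- as.vector(crossprod(ones, V_inv %*% biloci)) / sum(V_inv)
  centred <- sweep(biloci, 2, mu)
  # V^{-1}-weighted cross-products between the two centred loci
  w <- crossprod(centred, V_inv %*% centred)
  w[1, 2]^2 / (w[1, 1] * w[2, 2])
}

set.seed(2)
n <- 30
biloci <- cbind(m1 = rbinom(n, 2, 0.4), m2 = rbinom(n, 2, 0.5))

# Sanity check: with V equal to the identity matrix (unrelated individuals),
# the sketch reduces to the usual squared correlation
r2_identity <- r2v_sketch(biloci, diag(n))
all.equal(r2_identity, cor(biloci[, 1], biloci[, 2])^2)
```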

Author(s)

David Desrousseaux, Florian Sandron, Aurélie Siberchicot, Christine Cierco-Ayrolles and Brigitte Mangin

References

Mangin, B., Siberchicot, A., Nicolas, S., Doligez, A., This, P., Cierco-Ayrolles, C. (2012). Novel measures of linkage disequilibrium that correct the bias due to population structure and relatedness. Heredity, 108 (3), 285-291. DOI: 10.1038/hdy.2011.73

Examples

data(data.test)
Geno <- data.test[[1]]
V.WAIS <- data.test[[2]]
# biloci is documented as an N x 2 matrix: use the first two markers
Measure.R2V(Geno[, 1:2], V.WAIS)

r^2_VS measure

Description

This function estimates the novel measure of linkage disequilibrium which is corrected by both the relatedness of genotyped individuals and the structure of the sample.

Usage

Measure.R2VS(biloci, V, struc, na.presence = TRUE, V_inv = NULL)

Arguments

biloci

Numeric matrix (N x 2), where N is the number of genotypes (or haplotypes)

Matrix values are the allelic doses:

- (0,1,2) for genotypes.

- (0,1) for haplotypes.

Row names correspond to the ID of individuals.

Column names correspond to the ID of markers.

V

Numeric matrix (N x N), where N is the number of genotypes (or haplotypes).

Matrix values are coefficients of genetic variance-covariance for every pair of individuals. Row and column names must correspond to the ID of individuals and must be in the same order as in the biloci matrix.

No missing value.

struc

Numeric matrix (N x (P-1)), where N is the number of genotypes (or haplotypes) and P the number of sub-populations.

Matrix values are the probabilities (between 0 and 1) for each genotype (or haplotype) to belong to each sub-population.

Row names must correspond to the ID of individuals and must be in the same order as in the biloci matrix.

Column names correspond to the ID of sub-populations.

The matrix must be invertible: if the structure has P sub-populations, only P-1 columns are expected.

No missing value.

na.presence

Boolean indicating the presence of missing values in data.

If na.presence=FALSE (no missing data), computation of r^2_V and r^2_VS is largely optimized.

By default, na.presence=TRUE.

V_inv

Should stay NULL.

Value

The returned value is the estimated value of the linkage disequilibrium measure corrected by both the relatedness of genotyped individuals and the structure of the sample, or NA if fewer than 5 individuals have non-missing data at both loci.
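Combining both corrections can be pictured as a generalised least squares regression of each locus on an intercept and the structure covariates, with V as covariance, followed by a V^{-1}-weighted correlation of the residuals. The sketch below is written under that assumption for complete data. The helper `r2vs_sketch` is hypothetical and may differ from the package's estimator; see Mangin et al. (2012) for the definition.

```r
# Illustrative sketch; `r2vs_sketch` is hypothetical and assumes complete data.
r2vs_sketch <- function(biloci, V, struc) {
  V_inv <- solve(V)
  design <- cbind(1, struc)                       # intercept + structure
  # Generalised least squares coefficients for both loci at once
  beta <- solve(crossprod(design, V_inv %*% design),
                crossprod(design, V_inv %*% biloci))
  res <- biloci - design %*% beta
  # V^{-1}-weighted cross-products of the residuals
  w <- crossprod(res, V_inv %*% res)
  w[1, 2]^2 / (w[1, 1] * w[2, 2])
}

set.seed(3)
n <- 40
struc <- matrix(rep(c(0, 1), each = n / 2), ncol = 1)
biloci <- cbind(m1 = rbinom(n, 2, 0.3), m2 = rbinom(n, 2, 0.6))

# Sanity check: with V equal to the identity matrix, the sketch reduces to the
# squared correlation of ordinary least squares residuals
res_ols <- resid(lm(biloci ~ struc))
r2_vs_identity <- r2vs_sketch(biloci, diag(n), struc)
all.equal(r2_vs_identity, cor(res_ols[, 1], res_ols[, 2])^2)
```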

Author(s)

David Desrousseaux, Florian Sandron, Aurélie Siberchicot, Christine Cierco-Ayrolles and Brigitte Mangin

References

Mangin, B., Siberchicot, A., Nicolas, S., Doligez, A., This, P., Cierco-Ayrolles, C. (2012). Novel measures of linkage disequilibrium that correct the bias due to population structure and relatedness. Heredity, 108 (3), 285-291. DOI: 10.1038/hdy.2011.73

Examples

data(data.test)
Geno <- data.test[[1]]
V.WAIS <- data.test[[2]]
S.2POP <- data.test[[3]]
# biloci is documented as an N x 2 matrix: use the first two markers
Measure.R2VS(Geno[, 1:2], V.WAIS, S.2POP)