<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>LawlessGenomics</title>
    <description>A home for topics in human precision medicine, genomic analysis, and data visualisation.</description>
    <link>http://localhost:4000</link>
    <atom:link href="http://localhost:4000/feed.xml" rel="self" type="application/rss+xml" />
    
      <item>
        <title>Summary - inference of causal metabolite networks</title>
        <description>&lt;h1 id=&quot;summary---inference-of-causal-metabolite-networks&quot;&gt;Summary - inference of causal metabolite networks&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;13 Aug 2023&lt;/p&gt;

&lt;p&gt;A summary of: 
Chen, S., Lin, Z., Shen, X., Li, L., &amp;amp; Pan, W. (2023). Inference of causal metabolite networks in the presence of invalid instrumental variables with GWAS summary data. Genetic Epidemiology, 1–15,
&lt;a href=&quot;https://doi.org/10.1002/gepi.22535&quot;&gt;https://doi.org/10.1002/gepi.22535&lt;/a&gt;,
&lt;a class=&quot;citation&quot; href=&quot;#chen2023inference&quot;&gt;(Chen et al., 2023)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Briefly, this paper is about using instrumental variables (SNPs) in causal inference with applications to genome-wide association studies (GWAS).&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Exposure:&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;Metabolites: They serve as intermediate phenotypes connecting genetic variants to clinical outcomes and play a crucial role in biological processes.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Instrumental Variable (IV):&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;SNPs (Single Nucleotide Polymorphisms): These are used as instrumental variables to isolate the variability in metabolites that is independent of confounders.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Outcome:&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;Clinical outcomes: These are affected by metabolites, which in turn are influenced by SNPs. The causal effect of metabolites on clinical outcomes is what the study aims to determine.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a more gradual background, jump down to the “In context” section.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;abstract-summary&quot;&gt;Abstract summary&lt;/h2&gt;
&lt;p&gt;The paper uses structural equation models (SEMs) to infer causal networks among metabolites and other complex traits. The method:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;Performs causal analysis to discover relationships among multiple traits.&lt;/li&gt;
  &lt;li&gt;Accounts for potential invalid IVs.&lt;/li&gt;
  &lt;li&gt;Allows for data analysis using only GWAS summary statistics.&lt;/li&gt;
  &lt;li&gt;Considers bidirectional relationships between traits.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The approach uses stepwise selection to identify invalid IVs, and the authors demonstrate its superior performance on both real and simulated GWAS data.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;key-methods-summary&quot;&gt;Key methods summary&lt;/h2&gt;

&lt;h3 id=&quot;21-one-sample-data&quot;&gt;2.1 One-sample data&lt;/h3&gt;

&lt;p&gt;For one-sample GWAS individual-level data:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;For each of \(n\) individuals:
    &lt;ul&gt;
      &lt;li&gt;We have \(p\) SNPs as IVs and \(M\) traits.&lt;/li&gt;
      &lt;li&gt;Traits are denoted by an \(n \times M\) matrix \(Y = [y_1, y_2, \ldots, y_M]\).
        &lt;ul&gt;
          &lt;li&gt;Where \(y_i = (y_{1i}, y_{2i}, \ldots, y_{ni})^T\) is the vector of \(n\) observations for trait \(i = 1, 2, \ldots, M\).&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;IVs are denoted as \(Z = [z_1, \ldots, z_p]\).
        &lt;ul&gt;
          &lt;li&gt;Where \(z_j = (z_{1j}, \ldots, z_{nj})^T\) is the vector of the \(n\) observations for IV \(j = 1, 2, \ldots, p\).&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;22-sem-with-individual-level-data&quot;&gt;2.2 SEM with individual-level data&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;SEM is used as a tool for multivariate causal inference in this framework.&lt;/li&gt;
  &lt;li&gt;The analysis is based on one-sample GWAS individual-level data.&lt;/li&gt;
  &lt;li&gt;The system consists of linear equations for the \(n\) individuals in the sample.&lt;/li&gt;
  &lt;li&gt;Notation follows Wang et al. 2016.&lt;/li&gt;
  &lt;li&gt;The random errors are denoted by an \(n \times M\) matrix \(E\).&lt;/li&gt;
  &lt;li&gt;\(E\) consists of the vectors \(e_i\) of \(n\) random errors, one for each trait \(i = 1, \ldots, M\).&lt;/li&gt;
  &lt;li&gt;Each error \(e_i\) is assumed to have expected value 0.&lt;/li&gt;
  &lt;li&gt;For each trait, the error follows a normal distribution with mean 0 and variance \(\sigma_i^2\).&lt;/li&gt;
  &lt;li&gt;The trait coefficients are collected in an \(M \times M\) matrix \(\Gamma\).&lt;/li&gt;
  &lt;li&gt;The coefficients for trait \(i\) are \(\gamma_{1i}, \ldots, \gamma_{Mi}\), for \(i = 1, \ldots, M\).&lt;/li&gt;
  &lt;li&gt;The linear model for the \(i\)-th trait is an equation in which the traits enter linearly and the terms sum to 0.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;23-sem-with-some-invalid-ivs&quot;&gt;2.3 SEM with some invalid IVs&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Discusses the scenario when some IVs may be invalid.&lt;/li&gt;
  &lt;li&gt;An IV is invalid when it violates one or more of the three valid IV assumptions (A1)–(A3).&lt;/li&gt;
  &lt;li&gt;Proposes a method to account for invalid IVs in SEM.&lt;/li&gt;
  &lt;li&gt;For trait i, the first \({p}_{0i}\) IVs are considered invalid IVs.&lt;/li&gt;
  &lt;li&gt;Represents the matrix of invalid IVs with \({Z}_{ {\mathscr{A}}_{i}}=[{z}_{1},{\rm{\ldots }},{z}_{p_{0i}}]\).&lt;/li&gt;
  &lt;li&gt;\({z}_{j}={({z}_{1j},{\rm{\ldots }},{z}_{nj})}^{T}\) is the vector for the n observations of IV \(j=1,2,{\rm{\ldots }},{p}_{0i}\).&lt;/li&gt;
  &lt;li&gt;\(M\) vectors are present for the coefficients of invalid IVs: \({B}_{1},{\rm{\ldots }},{B}_{M}\).&lt;/li&gt;
  &lt;li&gt;\({B}_{i}={({\beta }_{1i},{\beta }_{2i},{\rm{\ldots }},{\beta }_{p_{0i}i})}^{T}\) represents the direct or (horizontal) pleiotropic effects of the invalid IVs on trait \(i=1,2,{\rm{\ldots }},M\).&lt;/li&gt;
  &lt;li&gt;The linear SEM for trait i is represented as: \({y}_{1}{\gamma }_{1i}+\cdots +{y}_{i}{\gamma }_{ii}+\cdots +{y}_{M}{\gamma }_{Mi}+{z}_{1}{\beta }_{1i}+\cdots +{z}_{p_{0i}}{\beta }_{p_{0i}i}+{e}_{i}=0\).&lt;/li&gt;
  &lt;li&gt;\(\gamma\)’s and \(\beta\)’s are unknown parameters in the model.&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;code&quot;&gt;Code&lt;/h2&gt;
&lt;h3 id=&quot;onesample-mvstepiv&quot;&gt;onesample mvstepIV&lt;/h3&gt;

&lt;p&gt;The original source code is here:
&lt;a href=&quot;https://github.com/chen-siyi7/one-sample-stepwise-IV-selection/blob/main/one-sample%20stepwise%20IV%20code.R&quot;&gt;https://github.com/chen-siyi7/one-sample-stepwise-IV-selection/blob/main/one-sample%20stepwise%20IV%20code.R&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The function &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;onesample_mvstepIV&lt;/code&gt; conducts one-sample stepwise IV.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input Parameters:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;p&lt;/code&gt;: Total number of predictors.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;R&lt;/code&gt;: Correlation matrix.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;betaZX&lt;/code&gt;: Regression coefficients for predictors.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;betaZY&lt;/code&gt;: Regression coefficients for outcomes.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;se_betaZY&lt;/code&gt;: Standard error of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;betaZY&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n&lt;/code&gt;: Sample size.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gamma_hat&lt;/code&gt;: Gamma hat values (prior information).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Main Computations:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;Initialize &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ZTZ&lt;/code&gt; using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;R&lt;/code&gt; as:
\(ZTZ = R\)&lt;/li&gt;
  &lt;li&gt;Compute &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ZTY&lt;/code&gt; as the element-wise product of the diagonal of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ZTZ&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;betaZY&lt;/code&gt;: \(ZTY = \text{diag}(ZTZ) \times \beta_{ZY}\)&lt;/li&gt;
  &lt;li&gt;Compute &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;YTY&lt;/code&gt; for each predictor &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SNP&lt;/code&gt; as: \(YTY[SNP] = (n-1) \times ZTZ[SNP,SNP] \times se_{\beta_{ZY}}^2[SNP] + ZTY[SNP] \times \beta_{ZY}[SNP]\), then take the median across predictors, excluding NA values.&lt;/li&gt;
  &lt;li&gt;Compute Bayesian Information Criterion (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BIC&lt;/code&gt;) for each predictor. For each predictor &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i&lt;/code&gt;:
    &lt;ul&gt;
      &lt;li&gt;a. Create a matrix &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test11&lt;/code&gt; with diagonal element &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i&lt;/code&gt; set to 1.&lt;/li&gt;
      &lt;li&gt;b. Create matrix &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;W1&lt;/code&gt; by combining &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test11&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gamma_hat&lt;/code&gt;.&lt;/li&gt;
      &lt;li&gt;c. Solve for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;W1&lt;/code&gt; using: \(\text{solve.W1} = W1^T \times ZTZ \times W1\)&lt;/li&gt;
      &lt;li&gt;d. Compute beta estimates &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;beta1&lt;/code&gt; as: \(\beta1 = (solve.W1^{-1} \times W1^T \times ZTY)\)&lt;/li&gt;
      &lt;li&gt;e. Calculate &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BIC&lt;/code&gt; as: \(testbic[i] = n \times \log(YTY - \beta1^T \times W1^T \times ZTY) + \log(n) \times \sum \text{diag}(test11)\)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Determine the optimal instrumental variables (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IVs&lt;/code&gt;) based on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BIC&lt;/code&gt;:
    &lt;ul&gt;
      &lt;li&gt;a. For each iteration &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;j&lt;/code&gt;, select the predictor &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i&lt;/code&gt; with the smallest &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BIC&lt;/code&gt;.&lt;/li&gt;
      &lt;li&gt;b. Repeat the process by adding one predictor at a time.&lt;/li&gt;
      &lt;li&gt;c. Stop if the current and previous &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IV&lt;/code&gt; are the same.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Compute the final beta estimates using the invalid IVs:
    &lt;ul&gt;
      &lt;li&gt;a. Extract the invalid IVs from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;whichIV&lt;/code&gt; and set their diagonal elements in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test11&lt;/code&gt; to 1.&lt;/li&gt;
      &lt;li&gt;b. Compute beta estimates &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;beta1&lt;/code&gt; as: \(\beta1 = (solve.W1^{-1} \times W1^T \times ZTY)\)&lt;/li&gt;
      &lt;li&gt;c. Calculate variance of beta &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Varbeta&lt;/code&gt; as: \(\text{Varbeta} = \text{diag}(solve.W1 \times n) \times \sigma_u2\) where \(\sigma_u2 = YTY - \beta1^T \times W1^T \times ZTY\).&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;invalidIV&lt;/code&gt;: Indices of invalid IVs.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;beta_est&lt;/code&gt;: Estimated beta values.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;beta_se&lt;/code&gt;: Standard error of beta estimates.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;K&lt;/code&gt;: Number of invalid IVs.&lt;/li&gt;
&lt;/ul&gt;
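
&lt;p&gt;The reconstruction of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;YTY&lt;/code&gt; in step 3 rests on a standard least-squares identity, and the R sketch below checks it on simulated data for a single SNP. It assumes the trait and genotype are centred and that the marginal regression is fitted without an intercept, which is what makes the \(n-1\) factor exact. Those assumptions, the simulated values, and the variable names are mine for illustration and are not taken from the original source code.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Illustrative check: recover y'y from marginal summary statistics.
set.seed(3)
n &amp;lt;- 500
z &amp;lt;- as.vector(scale(rbinom(n, 2, 0.3), scale = FALSE))   # centred genotype
y &amp;lt;- as.vector(scale(0.2 * z + rnorm(n), scale = FALSE))  # centred trait

# Marginal regression without intercept gives betaZY and se_betaZY.
fit &amp;lt;- summary(lm(y ~ 0 + z))$coefficients
beta &amp;lt;- fit[1, &quot;Estimate&quot;]
se &amp;lt;- fit[1, &quot;Std. Error&quot;]

ztz &amp;lt;- sum(z^2)                              # z'z for this SNP
zty &amp;lt;- ztz * beta                            # z'y = z'z * beta
yty &amp;lt;- (n - 1) * ztz * se^2 + zty * beta     # residual SS + explained SS

# The reconstructed value should match the directly computed y'y.
stopifnot(isTRUE(all.equal(yty, sum(y^2))))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With real GWAS summary statistics the identity only holds approximately for each SNP, which is presumably why the median across SNPs is used.&lt;/p&gt;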

&lt;h3 id=&quot;onesample-mvstepiv-ind&quot;&gt;onesample mvstepIV ind&lt;/h3&gt;
&lt;p&gt;The original source code is here:
&lt;a href=&quot;https://github.com/chen-siyi7/one-sample-stepwise-IV-selection/blob/main/onesample_mvstepIV_ind%20code.R&quot;&gt;https://github.com/chen-siyi7/one-sample-stepwise-IV-selection/blob/main/onesample_mvstepIV_ind%20code.R&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The function &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;onesample_mvstepIV_ind&lt;/code&gt; performs one-sample stepwise IV for independent SNPs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input Parameters:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Y&lt;/code&gt;: Response variable.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Z&lt;/code&gt;: Predictor matrix.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n&lt;/code&gt;: Sample size.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gamma_hat&lt;/code&gt;: Gamma hat values (prior information).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Main Computations:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;Initialize &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;testbic&lt;/code&gt; for Bayesian Information Criterion.&lt;/li&gt;
  &lt;li&gt;For each predictor &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i&lt;/code&gt;:
    &lt;ul&gt;
      &lt;li&gt;a. Initialize a zero vector &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;l&lt;/code&gt; with length &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dim(Z)[2]&lt;/code&gt; and set the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i&lt;/code&gt;th element to 1.&lt;/li&gt;
      &lt;li&gt;b. Modify matrix &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Z22&lt;/code&gt; such that for each row &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;j&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Z22[j,]&lt;/code&gt; is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Z[j,]*l&lt;/code&gt;.&lt;/li&gt;
      &lt;li&gt;c. Perform a linear regression (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lm_stage2&lt;/code&gt;) of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Y&lt;/code&gt; on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Z22&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Z*gamma_hat&lt;/code&gt;.&lt;/li&gt;
      &lt;li&gt;d. Calculate &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BIC&lt;/code&gt; for this predictor using: \(testbic[i] = n \times \log\left(\frac{\sum(lm\_stage2\text{residuals}^2)}{n}\right) + \log(n) \times \sum(l)\)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Determine the optimal instrumental variables (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IVs&lt;/code&gt;) based on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BIC&lt;/code&gt;:
    &lt;ul&gt;
      &lt;li&gt;a. For each iteration &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;j&lt;/code&gt;, select the predictor &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i&lt;/code&gt; with the smallest &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BIC&lt;/code&gt;.&lt;/li&gt;
      &lt;li&gt;b. Modify matrix &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Z22&lt;/code&gt; for the selected predictors and add one predictor at a time.&lt;/li&gt;
      &lt;li&gt;c. Repeat the linear regression and calculate &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BIC&lt;/code&gt; as in step 2.&lt;/li&gt;
      &lt;li&gt;d. Stop if the current and previous &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IV&lt;/code&gt; are the same.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Extract the invalid IVs, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;which.invalid&lt;/code&gt;, from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;whichIV&lt;/code&gt; and sort them to obtain &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;K&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Compute the final beta estimates using the invalid IVs:
    &lt;ul&gt;
      &lt;li&gt;a. Extract columns &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;K&lt;/code&gt; from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Z&lt;/code&gt; to form &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Z22&lt;/code&gt;.&lt;/li&gt;
      &lt;li&gt;b. Perform linear regression (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lm_stage2&lt;/code&gt;) of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Y&lt;/code&gt; on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Z22&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Z*gamma_hat&lt;/code&gt;.&lt;/li&gt;
      &lt;li&gt;c. Calculate beta estimates &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;betaest&lt;/code&gt; as: \(\beta{est} = \text{summary}(lm\_stage2)\text{coef[,1]}\)&lt;/li&gt;
      &lt;li&gt;d. Calculate variance of residuals &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sigma_u2&lt;/code&gt; as: \(\sigma_u2 = \frac{\sum(lm\_stage2\text{residuals}^2)}{n}\)&lt;/li&gt;
      &lt;li&gt;e. Calculate variance of beta &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Varbeta&lt;/code&gt; using: \(\text{Varbeta} = \text{diag}(ginv(X^TX)) \times \sigma_u2\) where \(X = \text{cbind}(Z22, Dhat)\) and \(Dhat = Z*gamma\_hat\).&lt;/li&gt;
      &lt;li&gt;f. Compute the standard error &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;betase&lt;/code&gt; as: \(\beta{se} = \sqrt{\text{Varbeta}}\)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;beta_est&lt;/code&gt;: Estimated beta values.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;beta_se&lt;/code&gt;: Standard error of beta estimates.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;invalid IVs&lt;/code&gt;: Indices of invalid IVs.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;no. of invalid IV&lt;/code&gt;: Number of invalid IVs.&lt;/li&gt;
&lt;/ul&gt;
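
&lt;p&gt;To make the steps above concrete, here is a minimal R sketch of forward stepwise selection of invalid IVs by BIC on individual-level data. It is not the authors’ function: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stepwise_invalid_iv&lt;/code&gt; is a hypothetical helper, the simulated data and effect sizes are made up, and the stopping rule (stop when no remaining candidate lowers the BIC) is my simplification of the rule described above.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper: forward selection of invalid IVs by BIC.
stepwise_invalid_iv &amp;lt;- function(Y, Z, gamma_hat) {
  n &amp;lt;- nrow(Z)
  Dhat &amp;lt;- as.vector(Z %*% gamma_hat)   # predicted exposure from all IVs
  bic &amp;lt;- function(idx) {
    # Stage-2 regression of Y on Dhat plus the candidate invalid IVs.
    X &amp;lt;- cbind(Dhat, Z[, idx, drop = FALSE])
    rss &amp;lt;- sum(resid(lm(Y ~ X))^2)
    n * log(rss / n) + log(n) * length(idx)
  }
  selected &amp;lt;- integer(0)
  best &amp;lt;- bic(selected)
  repeat {
    candidates &amp;lt;- setdiff(seq_len(ncol(Z)), selected)
    if (length(candidates) == 0) break
    scores &amp;lt;- vapply(candidates, function(j) bic(c(selected, j)), numeric(1))
    if (min(scores) &amp;gt;= best) break      # no candidate lowers the BIC
    best &amp;lt;- min(scores)
    selected &amp;lt;- c(selected, candidates[which.min(scores)])
  }
  sort(selected)
}

# Simulated example (all values are assumptions): IVs 1 and 2 have direct effects on y.
set.seed(2)
n &amp;lt;- 2000
p &amp;lt;- 10
Z &amp;lt;- matrix(rnorm(n * p), n, p)
u &amp;lt;- rnorm(n)                                   # unmeasured confounder
x &amp;lt;- as.vector(Z %*% rep(0.3, p)) + u + rnorm(n)
y &amp;lt;- 0.5 * x + 0.4 * Z[, 1] + 0.4 * Z[, 2] + u + rnorm(n)
gamma_hat &amp;lt;- coef(lm(x ~ Z))[-1]                # first-stage estimates
invalid &amp;lt;- stepwise_invalid_iv(y, Z, gamma_hat)
invalid
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In this simulation the two IVs with direct effects should be among those selected, although an occasional false positive is possible because the BIC penalty is only \(\log(n)\) per added IV.&lt;/p&gt;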

&lt;hr /&gt;

&lt;h2 id=&quot;in-context&quot;&gt;In context&lt;/h2&gt;
&lt;p&gt;Recap:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Exposure:&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;Metabolites: They serve as intermediate phenotypes connecting genetic variants to clinical outcomes and play a crucial role in biological processes.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Instrumental Variable (IV):&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;SNPs (Single Nucleotide Polymorphisms): These are used as instrumental variables to isolate the variability in metabolites that is independent of confounders.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Outcome:&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;Clinical outcomes: These are affected by metabolites, which in turn are influenced by SNPs. The causal effect of metabolites on clinical outcomes is what the study aims to determine.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;background&quot;&gt;Background&lt;/h3&gt;
&lt;p&gt;In GWAS, associations are generally sought between single nucleotide polymorphisms (SNPs) and a single trait. 
But GWAS data can also be used to analyze multiple related traits, leading to improved power and new biological insights. 
Specifically, network analysis of multiple traits is gaining interest, especially when it comes to causal network analysis. 
This is pivotal for elucidating relationships among multiple traits, such as in gene network and protein network analyses. 
Metabolite network analysis, the focal point of this research, posits that metabolites are integral parts of many biological processes, often interacting with each other in regulatory networks. 
By inferring these networks, we can gain insight into relationships among metabolites in biological processes.&lt;/p&gt;

&lt;p&gt;In causal networks, traits, including metabolites, proteins, and genes, serve as the nodes. 
Their causal relationships are represented by directed edges connecting them. 
SNPs are utilized as instrumental variables (IVs). 
To model these intricate biological networks, structural equation models (SEMs) have been adopted.&lt;/p&gt;

&lt;h3 id=&quot;what-is-an-instrumental-variable-iv&quot;&gt;What is an Instrumental Variable (IV)?&lt;/h3&gt;
&lt;p&gt;An instrumental variable is associated with the exposure but does not have a direct association with the outcome, except through its relationship with the exposure. 
Its role is to isolate the variability in the exposure that is independent of the confounders.&lt;/p&gt;

&lt;h3 id=&quot;key-assumptions&quot;&gt;Key Assumptions&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Relevance:&lt;/strong&gt; The IV is correlated with the exposure.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Exclusion:&lt;/strong&gt; The IV only affects the outcome through its effect on the exposure.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Exchangeability:&lt;/strong&gt; The IV is not associated with unobserved confounders.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;how-does-it-work&quot;&gt;How Does It Work?&lt;/h3&gt;
&lt;p&gt;IV analysis uses the variation in the exposure explained by the instrument to estimate the causal effect of the exposure on the outcome.&lt;/p&gt;
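
&lt;p&gt;As a rough illustration of that idea, the R sketch below runs a two-stage least squares (2SLS) estimate on simulated data with one SNP as the instrument. This is a generic textbook example rather than the method from the paper, and every variable name and effect size in it is an assumption chosen for the demonstration.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Minimal 2SLS sketch on simulated data (illustrative values only).
set.seed(1)
n &amp;lt;- 5000
z &amp;lt;- rbinom(n, 2, 0.3)          # instrument: SNP genotype coded 0/1/2
u &amp;lt;- rnorm(n)                   # unmeasured confounder
x &amp;lt;- 0.5 * z + u + rnorm(n)     # exposure, e.g. a metabolite level
y &amp;lt;- 0.3 * x + u + rnorm(n)     # outcome; the true causal effect is 0.3

# Naive regression of outcome on exposure is biased by u.
naive &amp;lt;- coef(lm(y ~ x))[[&quot;x&quot;]]

# Stage 1: keep only the variation in x that is explained by z.
x_hat &amp;lt;- fitted(lm(x ~ z))
# Stage 2: regress the outcome on that predicted exposure.
iv &amp;lt;- coef(lm(y ~ x_hat))[[&quot;x_hat&quot;]]

c(naive = naive, iv = iv)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The IV estimate should land much closer to 0.3 than the naive one. Note that the standard errors reported by the second &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lm&lt;/code&gt; call are not valid for 2SLS, so this manual version is only suitable for showing the point estimate.&lt;/p&gt;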

&lt;h3 id=&quot;why-is-it-necessary-for-causal-inference&quot;&gt;Why is it Necessary for Causal Inference?&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Control for Unmeasured Confounding:&lt;/strong&gt; IVs can provide unbiased estimates of causal effects when unmeasured or unobserved confounding is present.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Endogeneity:&lt;/strong&gt; IVs can solve the problem of endogeneity.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Natural Experiments:&lt;/strong&gt; IVs can be employed in “natural experiments” where random assignment of treatments is not feasible.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;usage&quot;&gt;Usage&lt;/h3&gt;
&lt;p&gt;In the context of GWAS and metabolite network analysis, IV methods are crucial. They help determine causal relationships in complex biological processes, especially when metabolites, which do not function in isolation, interact within metabolite regulatory networks.&lt;/p&gt;

&lt;h3 id=&quot;limitations&quot;&gt;Limitations&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Weak Instruments:&lt;/strong&gt; Weak correlation between the IV and exposure can lead to biased IV estimates.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Violations of Assumptions:&lt;/strong&gt; IV estimates can be biased if any core assumptions are violated.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Interpretability:&lt;/strong&gt; The causal effect estimated through IV is often specific to a particular population, reducing generalizability.&lt;/li&gt;
&lt;/ul&gt;
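
&lt;p&gt;Weak instruments are usually screened with the first-stage F-statistic, with F above roughly 10 as a common rule of thumb. The R sketch below compares a strong and a weak simulated SNP; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;first_stage_f&lt;/code&gt; is a hypothetical helper and the effect sizes are assumptions for illustration only.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# First-stage F-statistic for a single instrument (illustrative sketch).
first_stage_f &amp;lt;- function(x, z) {
  summary(lm(x ~ z))$fstatistic[[&quot;value&quot;]]
}

set.seed(4)
n &amp;lt;- 1000
z_strong &amp;lt;- rbinom(n, 2, 0.3)   # SNP with a sizeable effect on the exposure
z_weak &amp;lt;- rbinom(n, 2, 0.3)     # SNP with a near-zero effect
u &amp;lt;- rnorm(n)                   # unmeasured confounder
x &amp;lt;- 0.5 * z_strong + 0.02 * z_weak + u + rnorm(n)

f_stats &amp;lt;- c(strong = first_stage_f(x, z_strong),
             weak = first_stage_f(x, z_weak))
f_stats
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;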

&lt;h3 id=&quot;summary&quot;&gt;Summary&lt;/h3&gt;
&lt;p&gt;Instrumental Variables are a pivotal tool in causal inference, especially in genome-wide association studies (GWAS). 
When utilized properly, they can provide valuable insights into causal relationships in settings laden with confounding and endogeneity. 
However, they come with their own assumptions and potential limitations.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ol class=&quot;bibliography&quot;&gt;&lt;li&gt;&lt;span id=&quot;chen2023inference&quot;&gt;Chen, S., Lin, Z., Shen, X., Li, L., &amp;amp; Pan, W. (2023). Inference of causal metabolite networks in the presence of invalid instrumental variables with GWAS summary data. &lt;i&gt;Genetic Epidemiology&lt;/i&gt;, 1–15. https://doi.org/10.1002/gepi.22535&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;
</description>
        <pubDate>Sun, 13 Aug 2023 00:00:00 +0200</pubDate>
        <link>http://localhost:4000//2023/08/13/inf_causal_metab_sem.html</link>
        <guid isPermaLink="true">http://localhost:4000//2023/08/13/inf_causal_metab_sem.html</guid>
      </item>
    
      <item>
        <title>What I want from genomic data centers</title>
        <description>&lt;h1 id=&quot;what-i-want-from-genomic-data-centers&quot;&gt;What I want from genomic data centers&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;09 May 2023&lt;/p&gt;

&lt;p&gt;No one asked, but here are some things that would make me love a genomic data center.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Sensible file naming&lt;/strong&gt;: Require a certain ID format before submission. Ensure that the sequencing center and customers use consistent file naming conventions. 
Sequencing typically requires output like: SAMPLEID_MACHINE_RUNID.
However, if a customer provides a sample ID like “ABC-001”, the sequencing center should NOT be returning “ABC_001_XYZ0001_001234”.
Avoid discrepancies between the customer’s original ID and the sequencing center’s modified ID.
If modifications are required, ensure that the customer makes them first on their side by submitting “ABC_001”.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Protocol file&lt;/strong&gt;: Provide a summary file that lists the number of samples, batch IDs, sample handling steps (e.g. library preparation kit used, sequencing lane, sequencer machine used), and any alignment or reference genome information. The customer should be responsible for recording what they have ordered, but we must assume that they will lose it if it is not included in the data pack returned to them.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Alignment&lt;/strong&gt;: If you can provide it as an option, align the data to the most commonly accepted reference genome, such as GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz. For more, refer to &lt;a href=&quot;https://lh3.github.io/2017/11/13/which-human-reference-genome-to-use&quot;&gt;Heng Li&lt;/a&gt; and &lt;a href=&quot;https://www.illumina.com/science/genomics-research/articles/dragen-demystifying-reference-genomes.html&quot;&gt;Illumina&lt;/a&gt;. Assume that the customer has no idea which to use unless they specify it. Include a note in the protocol file with advice on alternative reference genomes, such as:
    &lt;blockquote&gt;
      &lt;p&gt;If you map reads to GRCh37 or hg19, use hs37-1kg:
ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/technical/reference/human_g1k_v37.fasta.gz.
If you map to GRCh37 and believe decoy sequences help with better variant calling, use hs37d5:
ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz.
If you map reads to GRCh38 or hg38, use the following:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz.&lt;/p&gt;
    &lt;/blockquote&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Variant calling&lt;/strong&gt;: If providing variant calling services, such as using GATK best-practices, include a summary of the protocol.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Reporting prioritization&lt;/strong&gt;: Focus on implementing the most important reporting methods first, such as:
    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/25741868/&quot;&gt;ACMG recommendations for SNV and short INDEL variants&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/31690835/&quot;&gt;ACMG recommendations for CNV&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.ncbi.nlm.nih.gov/clinvar/docs/acmg/&quot;&gt;ACMG recommendations for secondary findings&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Producing a simple report will be valuable for the majority of customers. 
Avoid reporting overly complex information.&lt;/p&gt;
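
&lt;p&gt;On the file naming point, the check can be automated at submission time. Here is a minimal R sketch, assuming a hypothetical rule that sample IDs may contain only letters, digits, and underscores; the function name and the rule itself are my own illustration, not an existing standard.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical rule: only letters, digits and underscores are allowed,
# so the sequencing center never needs to rewrite a submitted ID.
is_valid_sample_id &amp;lt;- function(id) {
  grepl(&quot;^[A-Za-z0-9_]+$&quot;, id)
}

ids &amp;lt;- c(&quot;ABC-001&quot;, &quot;ABC_001&quot;)
valid &amp;lt;- is_valid_sample_id(ids)
ids[!valid]   # IDs the customer should fix before submission
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here “ABC-001” is flagged because of the hyphen, while “ABC_001” passes.&lt;/p&gt;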
</description>
        <pubDate>Tue, 09 May 2023 00:00:00 +0200</pubDate>
        <link>http://localhost:4000//2023/05/09/what_i_want_from_genome_center.html</link>
        <guid isPermaLink="true">http://localhost:4000//2023/05/09/what_i_want_from_genome_center.html</guid>
      </item>
    
      <item>
        <title>AI - From Complex Requests to Simple Fixes</title>
        <description>&lt;h1 id=&quot;ai---from-complex-requests-to-simple-fixes&quot;&gt;AI - From Complex Requests to Simple Fixes&lt;/h1&gt;
&lt;p class=&quot;meta&quot;&gt;18 Jan 2023&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Edit 2026: This post was written two months after the public launch of OpenAI’s ChatGPT (Nov 2022). It seems rather basic now but it is a small reminder of the turning point and shows how blown away we were with this new tech.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I see a lot of people giving their opinions about chat AI without many examples of real-world application. 
So here is my concrete experience. 
In two weeks I have rewritten, from scratch, one of my genomics pipelines which can analyse 1000 clinical exomes in a few hours.&lt;/p&gt;

&lt;p&gt;It included:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;42 scripts/programs.&lt;/li&gt;
  &lt;li&gt;3694 lines of code.&lt;/li&gt;
  &lt;li&gt;Several languages.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It works to:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Process raw fastq up to germline variant calling.&lt;/li&gt;
  &lt;li&gt;Run on a high-performance computing cluster.&lt;/li&gt;
  &lt;li&gt;Process vcf into clinical variant interpretation.&lt;/li&gt;
  &lt;li&gt;Options for custom filtering strategies.&lt;/li&gt;
  &lt;li&gt;Statistical analysis and logs.&lt;/li&gt;
  &lt;li&gt;Replaced several mainstream tools that are difficult to manage.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Final results:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;My work was ~6-10x faster.&lt;/li&gt;
  &lt;li&gt;I probably would not have rewritten this better version as it would have taken too long.&lt;/li&gt;
  &lt;li&gt;Instead of getting stuck at any difficult impasse I can get context-specific alternatives and testable example code.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I use ChatGPT Plus as a paid member for unrestricted access. 
Instead of googling for keywords that match, I either post a code example and ask for specific fixes or simply describe the request. 
Here is a very simple example, but I have made some very complex requests with great success.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;ul&gt;
    &lt;li&gt;I have an R dataframe “df” with columns “genotype” “variant” “gene” “sample” “inheritance_model”.&lt;/li&gt;
    &lt;li&gt;Add a new column “flag”.&lt;/li&gt;
    &lt;li&gt;Group by “gene” and “sample”,&lt;/li&gt;
    &lt;li&gt;then add the value 1 to “flag” for rows where “genotype” &amp;gt; 1 and “inheritance_model” == AR.&lt;/li&gt;
  &lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dplyr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_by&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gene&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sample&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mutate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;flag&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ifelse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;genotype&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;inheritance_model&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;AR&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;My real-world use is usually far more complex than this and uses relatively large sets of code.
However, my queries are broken up or pre-empted with an introductory discussion such that, in context, the final request is as simple as the one above.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Final thought&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Whatever sales reps or armchair philosophers say about the rapidly approaching era of AI, 
if you start using these tools now you will rapidly understand how powerful they will become in many aspects of life.&lt;/p&gt;

&lt;p&gt;(ChatGPT also picked this nice blogpost title for me when I fed it this blogpost and asked for a suggestion).&lt;/p&gt;
</description>
        <pubDate>Wed, 18 Jan 2023 01:00:01 +0100</pubDate>
        <link>http://localhost:4000//2023/01/18/ai_from_complex_requests_to_simple_fixes.html</link>
        <guid isPermaLink="true">http://localhost:4000//2023/01/18/ai_from_complex_requests_to_simple_fixes.html</guid>
      </item>
    
      <item>
        <title>AI, tell me about precision medicine genomics</title>
        <description>&lt;h1 id=&quot;ai-tell-me-about-precision-medicine-genomics&quot;&gt;AI, tell me about precision medicine genomics&lt;/h1&gt;
&lt;p class=&quot;meta&quot;&gt;18 Jan 2023&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Edit 2026: This post was written two months after the public launch of OpenAI’s ChatGPT (Nov 2022). It seems rather basic now but it is a small reminder of the turning point and shows how blown away we were with this new tech.&lt;/em&gt;&lt;/p&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#tell-me-about-precision-medicine-genomics&quot; id=&quot;markdown-toc-tell-me-about-precision-medicine-genomics&quot;&gt;Tell me about precision medicine genomics.&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#tell-me-how-ai-can-be-used-in-this-context&quot; id=&quot;markdown-toc-tell-me-how-ai-can-be-used-in-this-context&quot;&gt;Tell me how AI can be used in this context.&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#in-this-context-is-ai-better-than-classical-statistical-analysis-and-machine-learning&quot; id=&quot;markdown-toc-in-this-context-is-ai-better-than-classical-statistical-analysis-and-machine-learning&quot;&gt;In this context, is AI better than classical statistical analysis and machine learning?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#summarize-all-of-the-previous-answers-into-one-paragraph&quot; id=&quot;markdown-toc-summarize-all-of-the-previous-answers-into-one-paragraph&quot;&gt;Summarize all of the previous answers into one paragraph.&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#thank-you&quot; id=&quot;markdown-toc-thank-you&quot;&gt;Thank you.&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The following is a conversation with ChatGPT.&lt;/p&gt;

&lt;h2 id=&quot;tell-me-about-precision-medicine-genomics&quot;&gt;Tell me about precision medicine genomics.&lt;/h2&gt;

&lt;p&gt;“Precision medicine genomics is a medical approach that uses genetic information to tailor medical treatment to the individual patient. This approach is based on the understanding that genetic variations can affect a person’s response to certain drugs and susceptibility to certain diseases. By identifying these genetic variations, doctors can select the most effective treatment options for each patient and potentially avoid harmful side effects.&lt;/p&gt;

&lt;p&gt;“One important aspect of precision medicine genomics is the use of genetic testing to identify specific mutations or variations in a patient’s DNA. This information can be used to guide treatment decisions, such as selecting the most appropriate drug or dosage. For example, if a patient has a specific genetic mutation that is known to cause resistance to a certain drug, the doctor may choose to prescribe a different drug instead.&lt;/p&gt;

&lt;p&gt;“Another important aspect of precision medicine genomics is the use of large-scale genomic data to identify new targets for drug development and to develop new diagnostic tests. By analyzing the genomic data of large numbers of patients, researchers can identify genetic variations that are associated with specific diseases or drug responses. This information can be used to develop new drugs or diagnostic tests that are tailored to the specific genetic makeup of individual patients.&lt;/p&gt;

&lt;p&gt;“Precision medicine genomics is still in its early stages and there are still many challenges to be overcome, such as the high cost of genetic testing and the need for more research to fully understand the implications of genetic variations on disease and drug response. However, as the field continues to evolve, it has the potential to revolutionize the way medicine is practiced, leading to more effective and personalized treatments for patients.”&lt;/p&gt;

&lt;h2 id=&quot;tell-me-how-ai-can-be-used-in-this-context&quot;&gt;Tell me how AI can be used in this context.&lt;/h2&gt;

&lt;p&gt;“Artificial intelligence (AI) can be used in various ways to support precision medicine genomics. Some examples include:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Genetic variant analysis: AI algorithms can be used to analyze large amounts of genomic data to identify genetic variations that are associated with specific diseases or drug responses. This can help researchers identify new targets for drug development and develop new diagnostic tests.&lt;/li&gt;
  &lt;li&gt;Drug discovery: AI can be used to analyze large amounts of data on drug interactions and side effects to identify new drug candidates that have a high likelihood of being effective and safe.&lt;/li&gt;
  &lt;li&gt;Predictive modeling: AI algorithms can be used to create predictive models that can identify patients who are most likely to respond to specific treatments based on their genetic profile. This can help doctors make more informed treatment decisions.&lt;/li&gt;
  &lt;li&gt;Personalized medicine: AI can be used to analyze a patient’s genetic data, medical history, and other relevant information to create a personalized treatment plan.&lt;/li&gt;
  &lt;li&gt;Natural Language Processing (NLP) : AI-based NLP can help in extracting insights from unstructured data sources such as electronic medical records, clinical notes, and scientific literature.&lt;/li&gt;
  &lt;li&gt;Imaging analysis : AI can be used to analyze medical images such as CT scans, MRI’s and X-rays to identify and classify disease patterns.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;“AI is still in early stages of development in the field of precision medicine genomics, but it has the potential to revolutionize the field by making it possible to analyze and interpret large amounts of data quickly and accurately. This can help to improve the accuracy of diagnoses and treatment decisions, leading to better outcomes for patients.”&lt;/p&gt;

&lt;h2 id=&quot;in-this-context-is-ai-better-than-classical-statistical-analysis-and-machine-learning&quot;&gt;In this context, is AI better than classical statistical analysis and machine learning?&lt;/h2&gt;

&lt;p&gt;“In the context of precision medicine genomics, AI can be considered as an extension of classical statistical analysis and machine learning. It can be used to analyze and interpret large amounts of data quickly and accurately, which can make it a useful tool for precision medicine genomics. However, it is important to note that AI is not always better than classical statistical analysis and machine learning. The choice of method depends on the specific use case and the data being analyzed.&lt;/p&gt;

&lt;p&gt;“AI can be particularly useful for analyzing large amounts of complex data, such as genomic data, which can be difficult to analyze using traditional methods. For example, AI can be used to identify patterns in the data that might not be apparent using classical statistical analysis. Additionally, AI can be used to create predictive models that can identify patients who are most likely to respond to specific treatments based on their genetic profile, which can be difficult to do using traditional methods.&lt;/p&gt;

&lt;p&gt;“On the other hand, classical statistical analysis and machine learning still have their own advantages in specific scenarios. For example, traditional statistical analysis methods are well established and have been widely used in the field of medical research for decades. They are also relatively easy to understand and interpret. Additionally, classical statistical analysis is often more computationally efficient than AI, which can make it a better choice in certain cases.&lt;/p&gt;

&lt;p&gt;“In summary, AI can be a valuable tool in precision medicine genomics, but it is not always better than classical statistical analysis and machine learning. The choice of method depends on the specific use case and the data being analyzed.”&lt;/p&gt;

&lt;h2 id=&quot;summarize-all-of-the-previous-answers-into-one-paragraph&quot;&gt;Summarize all of the previous answers into one paragraph.&lt;/h2&gt;

&lt;p&gt;“Precision medicine genomics is a medical approach that uses genetic information to tailor medical treatment to the individual patient. Genetic testing can be used to identify specific mutations or variations in a patient’s DNA. Artificial intelligence (AI) can be used to analyze large amounts of genomic data to identify genetic variations that are associated with specific diseases or drug responses. AI can be used to analyze large amounts of data on drug interactions and side effects to identify new drug candidates that have a high likelihood of being effective and safe. AI can be used to create predictive models that can identify patients who are most likely to respond to specific treatments based on their genetic profile. However, it is important to note that AI is not always better than classical statistical analysis and machine learning, the choice of method depends on the specific use case and the data being analyzed.”&lt;/p&gt;

&lt;h2 id=&quot;thank-you&quot;&gt;Thank you.&lt;/h2&gt;

&lt;p&gt;“You’re welcome! I’m glad I could help. Let me know if you have any other questions.”&lt;/p&gt;

&lt;!-- &lt;ol class=&quot;bibliography&quot;&gt;&lt;/ol&gt; --&gt;
</description>
        <pubDate>Wed, 18 Jan 2023 01:00:01 +0100</pubDate>
        <link>http://localhost:4000//2023/01/18/ai_tell_me_about_precision_medicine_genomics.html</link>
        <guid isPermaLink="true">http://localhost:4000//2023/01/18/ai_tell_me_about_precision_medicine_genomics.html</guid>
      </item>
    
      <item>
        <title>We need a nice rare disease database</title>
        <description>&lt;h1 id=&quot;we-need-a-nice-rare-disease-database&quot;&gt;We need a nice rare disease database&lt;/h1&gt;
&lt;p class=&quot;meta&quot;&gt;30 June 2022&lt;/p&gt;

&lt;!-- * TOC --&gt;
&lt;!-- {:toc} --&gt;

&lt;h2 id=&quot;what-is-this&quot;&gt;What is this?&lt;/h2&gt;

&lt;p&gt;This short post is to demonstrate a problem and a solution as an example of why I add these things to my public page. 
It is just one of the hundreds of small, modular datasets that I build and cross-link for all my public and private LawlessGenomics work.
For precision medicine we need countless small, efficient, connected tools in a consistent format. 
This is one example.&lt;/p&gt;

&lt;h2 id=&quot;problem&quot;&gt;Problem&lt;/h2&gt;

&lt;p&gt;Diagnosing rare immune disorders is very challenging.
A fast checklist of candidate genes is required.&lt;/p&gt;

&lt;h2 id=&quot;one-step-to-the-solution&quot;&gt;One step to the solution&lt;/h2&gt;

&lt;p&gt;One of the most important resources for clinical immunology genetics is the yearly report of the
Inborn Errors of Immunity (IEI) Committee by the International Union of Immunological Societies (IUIS)
(&lt;a href=&quot;https://iuis.org/committees/iei/&quot;&gt;https://iuis.org&lt;/a&gt;),
&lt;a class=&quot;citation&quot; href=&quot;#tangye2022iei&quot;&gt;(Tangye et al., 2022)&lt;/a&gt;. 
Read the paper here:
&lt;a href=&quot;https://link.springer.com/article/10.1007/s10875-022-01289-3&quot;&gt;doi.org/10.1007/s10875-022-01289-3&lt;/a&gt;
and directly download their original xlsx table from Springer here:
&lt;a href=&quot;https://static-content.springer.com/esm/art%3A10.1007%2Fs10875-022-01289-3/MediaObjects/10875_2022_1289_MOESM2_ESM.xlsx&quot;&gt;Supplementary file2 (XLSX 93 kb)&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;however&quot;&gt;However&lt;/h2&gt;

&lt;p&gt;Because of traditional publishing protocols, there is a major problem. 
To use this dataset you must either:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;Download and save an XLSX table (&lt;a href=&quot;https://static-content.springer.com/esm/art%3A10.1007%2Fs10875-022-01289-3/MediaObjects/10875_2022_1289_MOESM2_ESM.xlsx&quot;&gt;download&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Bookmark the webpage and &lt;a href=&quot;https://link.springer.com/article/10.1007/s10875-022-01289-3/tables/1&quot;&gt;view a png screen-shot of each table&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Download the pdf version and view an embedded &lt;a href=&quot;https://link.springer.com/content/pdf/10.1007/s10875-022-01289-3.pdf&quot;&gt;pdf&lt;/a&gt; of each table.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;better-solution&quot;&gt;Better solution&lt;/h2&gt;

&lt;p&gt;For years, I have made personal databases of gene lists which I query very often. 
Every bioinformatician also uses large sets of databases for annotation.&lt;/p&gt;

&lt;p&gt;I notice a huge gap for clinicians, etc. who also need this info but have to rely on looking up the paper each time.
One good example of the correct method is the Genomics England PanelApp
&lt;a href=&quot;https://panelapp.genomicsengland.co.uk&quot;&gt;https://panelapp.genomicsengland.co.uk&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I want a fast, searchable version of the IUIS IEI gene list.
So I made it here:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/topic/precision_med/iuis_iei_table_page/&quot;&gt;Human Inborn Errors of Immunity&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;why-is-this-better&quot;&gt;Why is this better?&lt;/h2&gt;

&lt;p&gt;I don’t just want to confirm that my query gene is an &lt;em&gt;immune disorder-related gene&lt;/em&gt;, I want to quickly get a lot of facts about it.
Bioinformatically, I would add all this data in one step using my variant effect pipeline. 
However, I usually look this stuff up while on the go, in a meeting, etc.
For a clinician, it is the same. 
They do not have the time to call up and request a bioinformatic query.
We all need a fast, searchable table.&lt;/p&gt;

&lt;p&gt;I have added additional hyperlinks to all of the external websites that are most pertinent.
I also did a lot of cleaning and standardisation for filtering “Inheritance” terms, etc. so that it can work seamlessly.&lt;/p&gt;

&lt;h2 id=&quot;thanks&quot;&gt;Thanks&lt;/h2&gt;

&lt;p&gt;Many thanks to the committee for releasing their invaluable updates every year.&lt;/p&gt;

&lt;p&gt;Please do not contact me to complain!
I hope I have modified and cited the original information enough to deter a take-down request. 
I also hope someone adds a version to the IUIS website.&lt;/p&gt;

&lt;h2 id=&quot;anecdote&quot;&gt;Anecdote&lt;/h2&gt;

&lt;p&gt;I was lucky enough to have a paper which accounts for one of these newly added genes, since this list is rapidly approaching its asymptote:
&lt;a href=&quot;/2021/07/08/credited-iuis.html&quot;&gt;Credited for a newly added immune gene in IUIS&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&quot;references&quot;&gt;References&lt;/h1&gt;

&lt;ol class=&quot;bibliography&quot;&gt;&lt;li&gt;&lt;span id=&quot;tangye2022iei&quot;&gt;Tangye, S. G., Al-Herz, W., Bousfiha, A., Cunningham-Rundles, C., Franco, J. L., Holland, S. M., Klein, C., Morio, T., Oksenhendler, E., Picard, C., Puel, A., Puck, J., Seppänen, M. R. J., Somech, R., Su, H. C., Sullivan, K. E., Torgerson, T. R., &amp;amp; Meyts, I. (2022). Human Inborn Errors of Immunity: 2022 Update on the Classification from the International Union of Immunological Societies Expert Committee. &lt;i&gt;Journal of Clinical Immunology&lt;/i&gt;. https://doi.org/10.1007/s10875-022-01289-3&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;
</description>
        <pubDate>Thu, 30 Jun 2022 02:00:01 +0200</pubDate>
        <link>http://localhost:4000//2022/06/30/we-need-a-nice-rare-disease-database.html</link>
        <guid isPermaLink="true">http://localhost:4000//2022/06/30/we-need-a-nice-rare-disease-database.html</guid>
      </item>
    
      <item>
        <title>KZNF protein pathway regulation</title>
        <description>&lt;h1 id=&quot;kznf-protein-pathway-regulation&quot;&gt;KZNF protein pathway regulation&lt;/h1&gt;
&lt;p class=&quot;meta&quot;&gt;24 June 2022&lt;/p&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#exec-summary&quot; id=&quot;markdown-toc-exec-summary&quot;&gt;Exec summary&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#background&quot; id=&quot;markdown-toc-background&quot;&gt;Background&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#typical-approach&quot; id=&quot;markdown-toc-typical-approach&quot;&gt;Typical approach&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#new-approach&quot; id=&quot;markdown-toc-new-approach&quot;&gt;New approach&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#case-study&quot; id=&quot;markdown-toc-case-study&quot;&gt;Case study&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#protocol&quot; id=&quot;markdown-toc-protocol&quot;&gt;Protocol&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#0-select-a-pli-znf-pilot-study&quot; id=&quot;markdown-toc-0-select-a-pli-znf-pilot-study&quot;&gt;0. Select a pLI ZNF (pilot study)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#1-make-a-znf-in-vitro-assay&quot; id=&quot;markdown-toc-1-make-a-znf-in-vitro-assay&quot;&gt;1. Make a ZNF in vitro assay&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#2-chip-seq&quot; id=&quot;markdown-toc-2-chip-seq&quot;&gt;2. ChIP-seq&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#3-find-genes-with-de&quot; id=&quot;markdown-toc-3-find-genes-with-de&quot;&gt;3. Find genes with DE&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#4-cluster-all-target-genes-into-their-protein-pathways&quot; id=&quot;markdown-toc-4-cluster-all-target-genes-into-their-protein-pathways&quot;&gt;4. Cluster all target genes into their protein pathways&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#5-identify-pathways-with-pli&quot; id=&quot;markdown-toc-5-identify-pathways-with-pli&quot;&gt;5. Identify pathways with pLI&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#6-further-development&quot; id=&quot;markdown-toc-6-further-development&quot;&gt;6. Further development&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#result&quot; id=&quot;markdown-toc-result&quot;&gt;Result&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#summary&quot; id=&quot;markdown-toc-summary&quot;&gt;Summary&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#code-used&quot; id=&quot;markdown-toc-code-used&quot;&gt;Code used&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#1unzipsh&quot; id=&quot;markdown-toc-1unzipsh&quot;&gt;1.unzip.sh&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#2scansh&quot; id=&quot;markdown-toc-2scansh&quot;&gt;2.scan.sh&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#3analyser&quot; id=&quot;markdown-toc-3analyser&quot;&gt;3.analyse.R&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#4chipsh&quot; id=&quot;markdown-toc-4chipsh&quot;&gt;4.chip.sh&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#5genelistsh&quot; id=&quot;markdown-toc-5genelistsh&quot;&gt;5.genelist.sh&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#6mclplofsh&quot; id=&quot;markdown-toc-6mclplofsh&quot;&gt;6.mclplof.sh&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#references&quot; id=&quot;markdown-toc-references&quot;&gt;References&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;exec-summary&quot;&gt;Exec summary&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;We use existing data to show an improved method for identifying ZNF-dependent gene regulation at the:
    &lt;ol&gt;
      &lt;li&gt;gene level.&lt;/li&gt;
      &lt;li&gt;protein pathway level.&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;The method is statistically unbiased and might be used to produce more successful functional validation studies.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I will go through the steps to get to this final summary image:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/KZNF_protein_pathway_regulation/img3_main_plot.jpg&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;background&quot;&gt;Background&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Control of gene expression by KZNFs occurs throughout the genome.&lt;/li&gt;
  &lt;li&gt;Clusters of several genes within a single region are affected by one KZNF binding site.&lt;/li&gt;
  &lt;li&gt;Research often focuses on one KZNF at a time:
    &lt;ul&gt;
      &lt;li&gt;functional experiments require a ZNF in vitro system and ChIP-seq&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Research often focuses on the resulting best candidate:
    &lt;ul&gt;
      &lt;li&gt;a single genomic site that had the largest change in expression&lt;/li&gt;
      &lt;li&gt;the top gene in that region (for whatever reason)&lt;/li&gt;
      &lt;li&gt;the nearest gene to the regulatory region&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
  &lt;li&gt;ZNF genes that are intolerant to LoF (pLI = 1) are very interesting:
    &lt;ul&gt;
      &lt;li&gt;since their function is regulation of expression for target genes&lt;/li&gt;
      &lt;li&gt;the targets themselves must also have an intolerance to LoF&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;ZNF targets can be many genes, either (i) spread genome-wide or (ii) clustered together.&lt;/li&gt;
  &lt;li&gt;Genome-wide ZNF targets might form a protein pathway.&lt;/li&gt;
  &lt;li&gt;pLI ZNF targets may also be (i) individually pLI or (ii) cumulatively pLI for a pathway.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;typical-approach&quot;&gt;Typical approach&lt;/h3&gt;

&lt;ol&gt;
  &lt;li&gt;Make a ZNF in vitro assay&lt;/li&gt;
  &lt;li&gt;ChIP-seq&lt;/li&gt;
  &lt;li&gt;Find genes with DE&lt;/li&gt;
  &lt;li&gt;Explain the top hit&lt;/li&gt;
  &lt;li&gt;Or pass all hits into GO for a pathway report.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Opinion:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Result #4 is interesting and requires a lot of functional work.&lt;/li&gt;
  &lt;li&gt;Result #5 is generally not very insightful and often focuses on a cluster of genes, rather than genome-wide.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;new-approach&quot;&gt;New approach&lt;/h3&gt;

&lt;ol&gt;
  &lt;li&gt;Select a pLI ZNF (pilot study)&lt;/li&gt;
  &lt;li&gt;Make a ZNF in vitro assay&lt;/li&gt;
  &lt;li&gt;ChIP-seq&lt;/li&gt;
  &lt;li&gt;Find genes with DE&lt;/li&gt;
  &lt;li&gt;Cluster all target genes into their protein pathways&lt;/li&gt;
  &lt;li&gt;Identify pathways where all genes are (i) individually pLI or (ii) cumulatively pLI for the pathway.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Opinion:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;This result is not biased by picking what you determine to be the “top hit”.&lt;/li&gt;
  &lt;li&gt;The mechanisms of pLI (and therefore the ZNF downstream effect) can be interpreted from existing evidence
    &lt;ul&gt;
      &lt;li&gt;for all genes within the pathway, a functional explanation is likely available (Uniprot + GTEx)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;A successful pilot in the most damaging pLI will provide a route to more complex and subtle regulatory mechanisms
    &lt;ul&gt;
      &lt;li&gt;i.e. genes/pathways that are tolerant to haploinsufficiency, which would otherwise be at the mercy of blind functional analysis.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;case-study&quot;&gt;Case study&lt;/h2&gt;
&lt;h3 id=&quot;protocol&quot;&gt;Protocol&lt;/h3&gt;

&lt;h4 id=&quot;0-select-a-pli-znf-pilot-study&quot;&gt;0. Select a pLI ZNF (pilot study)&lt;/h4&gt;
&lt;p&gt;The closer pLI is to one, the more intolerant of protein-truncating variants the transcript appears to be. 
We consider pLI ≥ 0.9 as an extremely intolerant set of transcripts. 
&lt;a class=&quot;citation&quot; href=&quot;#lek2016analysis&quot;&gt;(Lek et al., 2016)&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Methods in &lt;a href=&quot;#1unzipsh&quot;&gt;Code block 1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To find ZNF genes with pLI &amp;gt;0.9 I used&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://gnomad.broadinstitute.org/downloads&quot;&gt;https://gnomad.broadinstitute.org/downloads&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Section “Constraint”&lt;/li&gt;
  &lt;li&gt;File “pLoF Metrics by Gene TSV”&lt;/li&gt;
  &lt;li&gt;Ref &lt;a href=&quot;https://www.nature.com/articles/s41586-020-2308-7&quot;&gt;https://www.nature.com/articles/s41586-020-2308-7&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a class=&quot;citation&quot; href=&quot;#lek2016analysis&quot;&gt;(Lek et al., 2016)&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Handling requires installation of samtools &lt;a href=&quot;https://www.htslib.org&quot;&gt;https://www.htslib.org&lt;/a&gt; (Samtools, BCFtools, HTSlib).
Uncompress the data with bgzip.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Methods in &lt;a href=&quot;#2scansh&quot;&gt;Code block 2&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Query the genome-wide pLI scores for all ZNF genes.
Clean the dataset to extract the data of interest:
gene name, numbers of observed/expected LoF, pLI score, etc.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Methods in &lt;a href=&quot;#3analyser&quot;&gt;Code block 3&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The R code performs a lot of formatting and cleaning to interpret the dataset.
We find all genes with pLI &amp;gt; 0.9.&lt;/p&gt;
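
&lt;p&gt;As a minimal sketch of this filtering step (it is not the R code from Code block 3), the Python below assumes the uncompressed constraint table is tab-separated with columns named &lt;code&gt;gene&lt;/code&gt; and &lt;code&gt;pLI&lt;/code&gt;, and that ZNF genes can be picked out by a &lt;code&gt;ZNF&lt;/code&gt; symbol prefix. The function name and the placeholder rows are hypothetical; only the ZNF777 value comes from this post.&lt;/p&gt;

```python
import csv
import io


def constrained_znf(tsv_handle, threshold=0.9):
    """Return (gene, pLI) pairs for ZNF genes with pLI above the threshold."""
    hits = []
    for row in csv.DictReader(tsv_handle, delimiter="\t"):
        # Assumed column names: "gene" and "pLI".
        # Rows with a missing or non-numeric score (e.g. "NA") are skipped.
        try:
            pli = float(row["pLI"])
        except (KeyError, TypeError, ValueError):
            continue
        if row["gene"].startswith("ZNF") and pli > threshold:
            hits.append((row["gene"], pli))
    # Most constrained genes first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Placeholder rows to show the expected input shape; only ZNF777 is from the post.
toy = io.StringIO(
    "gene\tpLI\n"
    "ZNF777\t0.99\n"
    "ZNF_PLACEHOLDER_1\t0.12\n"
    "GENE_PLACEHOLDER\t0.95\n"
    "ZNF_PLACEHOLDER_2\tNA\n"
)
print(constrained_znf(toy))
```

&lt;p&gt;Matching on a symbol prefix is a simplification; a curated KZNF gene list would be a safer filter if one is available.&lt;/p&gt;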

&lt;p&gt;&lt;img src=&quot;/images/posts/KZNF_protein_pathway_regulation/img3_main_plot_A.jpg&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;p&gt;I have pre-emptively shown ZNF777 in red. I will show in the next section why I chose this as a pilot study.&lt;/p&gt;

&lt;h4 id=&quot;1-make-a-znf-in-vitro-assay&quot;&gt;1. Make a ZNF in vitro assay&lt;/h4&gt;
&lt;p&gt;This was found from existing data, as shown in the next section.&lt;/p&gt;

&lt;h4 id=&quot;2-chip-seq&quot;&gt;2. ChIP-seq&lt;/h4&gt;
&lt;p&gt;This was found from existing data, as shown in the next section.&lt;/p&gt;

&lt;h4 id=&quot;3-find-genes-with-de&quot;&gt;3. Find genes with DE&lt;/h4&gt;
&lt;p&gt;I used existing ChIP-seq data.
This will pair one ZNF to a list of all genes that it ostensibly regulates.
I believe this dataset to be a useful example:
&lt;a class=&quot;citation&quot; href=&quot;#imbeault2017krab&quot;&gt;(Imbeault et al., 2017)&lt;/a&gt;,
&lt;a href=&quot;https://www.nature.com/articles/nature21683&quot;&gt;https://www.nature.com/articles/nature21683&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;“We then determined the genomic targets of human KZFPs by performing chromatin immunoprecipitation with exonuclease digestion (ChIP–exo)[13] on 257 HEK293T cell lines, each engineered to express one haemagglutinin (HA)-tagged family member. We obtained single base-pair resolution binding sites for 222 human KZFPs (Supplementary Table 4), 12 of them in duplicate (Extended Data Fig. 2), with the number of high-quality peaks per protein ranging from more than 10,000 to only around 15 (Extended Data Fig. 3a).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I compared my list of all ZNF for pLI &amp;gt;0.9 with the ZNF genes available in this study.
The first match was ZNF777, which has pLI 0.99 (as shown in Figure 1).&lt;/p&gt;
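
&lt;p&gt;This comparison amounts to an ordered intersection of two gene lists. A minimal Python sketch, assuming both lists hold plain gene symbols and that “first” means first in the order of the pLI list (that ordering is an assumption). The function name and the placeholder symbols are hypothetical; only ZNF777 comes from the text.&lt;/p&gt;

```python
def constrained_with_chip_data(constrained, profiled):
    """Keep constrained ZNFs, in their original order, that were also profiled."""
    profiled_set = set(profiled)
    return [gene for gene in constrained if gene in profiled_set]


# Placeholder symbols, except ZNF777 which is the match reported in the post.
constrained = ["ZNF_PLACEHOLDER_1", "ZNF777", "ZNF_PLACEHOLDER_2"]
profiled = ["ZNF777", "ZNF_PLACEHOLDER_2", "ZNF_PLACEHOLDER_3"]

matches = constrained_with_chip_data(constrained, profiled)
print(matches[0])  # first match, following the order of the constrained list
```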

&lt;p&gt;I downloaded the data:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;GSM2466654 (ZNF777)&lt;/p&gt;
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE78099&quot;&gt;https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE78099&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM2466654&quot;&gt;https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM2466654&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Methods in &lt;a href=&quot;#3analyser&quot;&gt;Code block 3&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I cleaned and visualised the MACS (Model-based Analysis of ChIP-Seq) peak data.
&lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3120977/&quot;&gt;What is MACS? - ref 1&lt;/a&gt;,
&lt;a class=&quot;citation&quot; href=&quot;#feng2011using&quot;&gt;(Feng et al., 2011)&lt;/a&gt;,
&lt;a href=&quot;https://hbctraining.github.io/Intro-to-ChIPseq/lessons/05_peak_calling_macs.html&quot;&gt;What is MACS? - ref 2&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/KZNF_protein_pathway_regulation/img3_main_plot_B.jpg&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;h4 id=&quot;4-cluster-all-target-genes-into-their-protein-pathways&quot;&gt;4. Cluster all target genes into their protein pathways&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Methods in &lt;a href=&quot;#4chipsh&quot;&gt;Code block 4&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Methods in &lt;a href=&quot;#5genelistsh&quot;&gt;Code block 5&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Methods in &lt;a href=&quot;#6mclplofsh&quot;&gt;Code block 6&lt;/a&gt;
(The data contained genomic coordinates, so I needed to map them to genes and their protein-coding regions for further steps.)&lt;/li&gt;
  &lt;li&gt;Methods in &lt;a href=&quot;#3analyser&quot;&gt;Code block 3&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
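
&lt;p&gt;Because the peak file holds coordinates and not gene names, the regions have to be turned into a form that a coordinate filter accepts (the BioMart filter example noted in Code block 3 has the shape 1:100:10000:-1). The sketch below shows one way to do that conversion. It is an illustration under assumptions, not the contents of 4.chip.sh: the function name bed_to_regions, the file name, and the demo coordinates are made up.&lt;/p&gt;

```shell
#!/bin/bash
# Sketch only: not the contents of 4.chip.sh. Demo coordinates are made up.

# Convert BED-style lines (chrN, start, end, ...) to chromosome:start:end strings.
bed_to_regions() {
  awk -F'\t' '{ sub(/^chr/, "", $1); print $1 ":" $2 ":" $3 }' "$1"
}

tmp=$(mktemp -d)
printf 'chr1\t100\t200\tpeak_1\t50\n'    > "$tmp/peaks.bed"
printf 'chrX\t5000\t5100\tpeak_2\t12\n' >> "$tmp/peaks.bed"

regions=$(bed_to_regions "$tmp/peaks.bed")
echo "$regions"

rm -rf "$tmp"
```

&lt;p&gt;BED start coordinates are 0-based, so add 1 to the start if exact 1-based positions matter for the query.&lt;/p&gt;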

&lt;p&gt;The genes “affected” by ZNF777 were clustered into their best-fit protein pathways using &lt;a href=&quot;https://string-db.org&quot;&gt;STRING-db&lt;/a&gt;.
STRING draws on experimental and curated data from Biocarta, BioCyc, GO, KEGG, Reactome, BIND, DIP, GRID, HPRD, IntAct, MINT, and PID.&lt;/p&gt;

&lt;p&gt;I accessed the database using &lt;a href=&quot;https://cytoscape.org&quot;&gt;Cytoscape&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Pathway clustering was performed using the
Markov Cluster Algorithm (MCL), an unsupervised clustering algorithm for networks based on stochastic flow
&lt;a class=&quot;citation&quot; href=&quot;#van2000graph&quot;&gt;(Van Dongen, 2000)&lt;/a&gt;,
&lt;a href=&quot;https://micans.org/mcl/&quot;&gt;https://micans.org/mcl/&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In Figure A, I show all of the gene targets of ZNF777.&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Each node illustrates one protein.&lt;/li&gt;
  &lt;li&gt;Each edge illustrates the weight of evidence connecting proteins.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Since we cannot define the protein pathways clearly here 
(something that you also find when reading hierarchical &lt;a href=&quot;http://geneontology.org&quot;&gt;GO enrichment analysis&lt;/a&gt; results), 
I instead clustered these proteins into their best-fitting pathways.&lt;/p&gt;

&lt;p&gt;The result of MCL clustering is shown in Figure B.&lt;/p&gt;

&lt;p&gt;Now we have a set of distinct protein pathways.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/KZNF_protein_pathway_regulation/img1-2_merged.jpg&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;h4 id=&quot;5-identify-pathways-with-pli&quot;&gt;5. Identify pathways with pLI&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Methods in &lt;a href=&quot;#3analyser&quot;&gt;Code block 3&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next, since we started with a ZNF that is intolerant to LoF, we expect that its target genes are also LoF-intolerant (high pLI), either&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;(i) individually,&lt;/li&gt;
  &lt;li&gt;or (ii) cumulatively across the pathway.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I cleaned the data further to group genes according to their&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;protein pathways,&lt;/li&gt;
  &lt;li&gt;genomic distribution, and&lt;/li&gt;
  &lt;li&gt;pLI score.&lt;/li&gt;
&lt;/ol&gt;
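
&lt;p&gt;The grouping described above can be sketched as a per-pathway summary: the number of genes in each cluster and the share of them above the pLI threshold. This is a minimal sketch under assumptions and not the analysis in 3.analyse.R. The input layout (gene, cluster, pLI), the function name summarise_clusters, and all demo values are hypothetical, and a simple share above 0.9 is only one possible reading of a cumulative pathway score.&lt;/p&gt;

```shell
#!/bin/bash
# Sketch only: the input layout (gene, cluster, pLI) and all demo values are assumptions.

# For each cluster print: name, number of genes, number above the pLI threshold, and that share.
# $1 = tab-separated table with a header row, $2 = optional threshold (default 0.9)
# Genes with a missing (NA) pLI count towards the cluster size but not towards the high-pLI count.
summarise_clusters() {
  awk -F'\t' -v thr="${2:-0.9}" '
    NR == 1 { next }
    { n[$2]++; if ($3 + 0 > thr) hi[$2]++ }
    END { for (c in n) printf "%s\t%d\t%d\t%.2f\n", c, n[c], hi[c], hi[c] / n[c] }
  ' "$1" | sort -t "$(printf '\t')" -k2,2nr -k1,1
}

tmp=$(mktemp -d)
printf 'gene\tcluster\tpLI\n'  > "$tmp/clusters.tsv"
printf 'G1\tc1\t0.99\n'       >> "$tmp/clusters.tsv"
printf 'G2\tc1\t0.95\n'       >> "$tmp/clusters.tsv"
printf 'G3\tc1\t0.10\n'       >> "$tmp/clusters.tsv"
printf 'G4\tc2\t0.05\n'       >> "$tmp/clusters.tsv"
printf 'G5\tc2\t0.20\n'       >> "$tmp/clusters.tsv"

# Largest clusters first; keep at most five.
top=$(summarise_clusters "$tmp/clusters.tsv" | head -n 5)
echo "$top"

rm -rf "$tmp"
```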

&lt;p&gt;Figure 3 shows the five largest pathways.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/KZNF_protein_pathway_regulation/img3_main_plot_C.jpg&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Key:&lt;/strong&gt; We can see that rather than focusing on a cluster of genes within one genomic region, we interpret genes according to their protein pathway.&lt;/li&gt;
  &lt;li&gt;Some protein pathways will have high pLI scores and may be considered the major ZNF targets.&lt;/li&gt;
  &lt;li&gt;Some protein pathways can tolerate LoF and may be considered by-standers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These by-stander genes are still surely sensitive to regulation by ZNF777.
However, we can infer the evolutionary hierarchy of the major and minor protein pathways under the control of this ZNF.&lt;/p&gt;

&lt;h4 id=&quot;6-further-development&quot;&gt;6. Further development&lt;/h4&gt;
&lt;p&gt;There are two main ways to take this further.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Complete the analysis for all available ChIP-seq datasets for ZNFs with pLI &amp;gt;0.9. Expand to pLI below 0.9 if possible.&lt;/li&gt;
  &lt;li&gt;Find which genes in these protein pathways are not regulated by this ZNF and see why (and see if there are sister ZNFs working in tandem to control pathway dosages).&lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;result&quot;&gt;Result&lt;/h3&gt;
&lt;p&gt;From all candidate ZNF gene targets, we find those which match its intolerance to LoF.
We therefore define these pathways as the major targets of this ZNF.&lt;/p&gt;

&lt;p&gt;I have purposely refrained from completing the final step: quantifying the top pLI pathway/gene list. 
This is to avoid drawing a conclusion without testing the theory and statistics.&lt;/p&gt;

&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;h2 id=&quot;code-used&quot;&gt;Code used&lt;/h2&gt;
&lt;h3 id=&quot;1unzipsh&quot;&gt;1.unzip.sh&lt;/h3&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#!/bin/bash

# dataset
# &amp;lt;https://gnomad.broadinstitute.org/downloads&amp;gt;
# Section &quot;Constraint&quot;
# File &quot;pLoF Metrics by Gene TSV&quot;
# Ref &amp;lt;https://www.nature.com/articles/s41586-020-2308-7&amp;gt;

# requires bgzip (part of HTSlib; e.g. installed alongside samtools)

bgzip -d gnomad.v2.1.1.lof_metrics.by_gene.txt.bgz

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;2scansh&quot;&gt;2.scan.sh&lt;/h3&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#!/bin/bash
set -e
cd ../data/

output=ZNF_summary.tsv

# query gene and get summary column
head -1 gnomad.v2.1.1.lof_metrics.by_gene.txt | cut -f 1,17,18,20,21,42,43,46,75,76,77 &amp;gt; $output
# obs_lof	mu_lof	exp_lof	pLI	obs_het_lof	obs_hom_lof	exp_hom_lof

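# anchor the pattern: the gene symbol is column 1, and an unanchored &quot;ZNF*&quot; also matches e.g. KAZN
grep &quot;^ZNF&quot; gnomad.v2.1.1.lof_metrics.by_gene.txt | cut -f 1,17,18,20,21,42,43,46,75,76,77 &amp;gt;&amp;gt; $output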

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
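
&lt;p&gt;A comment in 3.analyse.R mentions that a KAZN gene ends up in the ZNF table. The short sketch below shows why an unanchored pattern such as ZNF* lets it through, and how anchoring the pattern to the start of the line excludes it. The gene list here is made up for illustration only.&lt;/p&gt;

```shell
#!/bin/bash
# Made-up gene column for illustration only.
genes=$(printf 'ZNF777\nKAZN\nZNF2\nTP53\n')

# "ZNF*" means "ZN followed by zero or more F", anywhere in the line, so KAZN matches too.
loose=$(printf '%s\n' "$genes" | grep 'ZNF*')

# Anchoring to the start of the line keeps only names that begin with ZNF.
strict=$(printf '%s\n' "$genes" | grep '^ZNF')

printf 'loose:  %s\n' "$(echo $loose)"
printf 'strict: %s\n' "$(echo $strict)"
```

&lt;p&gt;The anchor works on the gnomAD table because the gene symbol is the first column of each line.&lt;/p&gt;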
&lt;h3 id=&quot;3analyser&quot;&gt;3.analyse.R&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Major note:&lt;/strong&gt; some of the data used in the second half of this script requires the scripts from 4,5,6.
I have named them in the order of writing. 
It will be obvious when a dataset is missing, that the next bash script should be run.
If continuing this project, 3.analyse.R should be split into sequential order, or call the bash script within the R code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Minor note:&lt;/strong&gt; This script is needlessly long and contains rough exploration work. If continuing this project, it can be heavily simplified.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dplyr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tidyr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# gnomAD pLoF constraint metrics for the ZNF genes (output of 2.scan.sh)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read.table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;../data/ZNF_summary.tsv&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;header&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;\t&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stringsAsFactors&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Note: one KAZN gene is included if 2.scan.sh uses the unanchored pattern &quot;ZNF*&quot;; anchoring it as &quot;^ZNF&quot; avoids this.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# The closer pLI is to one, the more intolerant of protein-truncating variants the transcript appears to be. We consider pLI ≥ 0.9 as an extremely intolerant set of transcripts. _Lek et al Nature 2016._&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# get the data in chr order&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arrange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as.numeric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_position&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as.numeric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# replace NAs with X chromosome &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;is.na&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;X&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# create the subset for example: ZNF777&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;subset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gene&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;ZNF777&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pZNF777&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#	ggplot(aes(x=start_position, y=pLI))+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stringr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;str_sort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;numeric&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pLI&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#	geom_point() +&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_jitter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colour&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;red&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# this adds a red point&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;ZNF777&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;.95&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colour&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;red&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_hline&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;linetype&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;dotted&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;yintercept&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.9&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#facet_grid(~chromosome) +&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#	facet_grid(.~factor(chromosome, stringr::str_sort(unique(chromosome), numeric = TRUE))) + &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#	theme(legend.position=&quot;none&quot;, panel.background = element_rect(&quot;#F7F7F7&quot;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#			#axis.text.x=element_text(angle=45,hjust=1,vjust=0.5),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#			axis.text.x=element_blank()) +&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xlab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;ZNF genes genomic position&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ylab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;pLI (≥ 0.9 extremely intolerant to LoF)&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pZNF777&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Next I want to simulate some chip-seq data, matching one ZNF to a list of all genes that are regulated&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# I believe this dataset is useful&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### https://www.nature.com/articles/nature21683&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# We then determined the genomic targets of human KZFPs by performing chromatin immunoprecipitation with exonuclease digestion (ChIP–exo)13 on 257 HEK293T cell lines, each engineered to express one haemagglutinin (HA)-tagged family member. We obtained single base-pair resolution binding sites for 222 human KZFPs (Supplementary Table 4), 12 of them in duplicate (Extended Data Fig. 2), with the number of high-quality peaks per protein ranging from more than 10,000 to only around 15 (Extended Data Fig. 3a). &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# ZNF777 has pLI 0.99&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Download&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# GSM2466654	ZNF777&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE78099&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM2466654&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read.table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;../data/GSM2466654_ZNF777_peaks_processed_score_signal_exo.bed.txt&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;header&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;\t&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stringsAsFactors&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;separate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;V1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;into&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;text&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;chromosome&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;chr&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;V2&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;start_position&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;V3&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;end_position&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;V5&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;score&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#	geom_hline(linetype=&quot;dotted&quot;, yintercept=0.9) +&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#facet_grid(~chromosome) +&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;facet_grid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;~&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stringr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;str_sort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;numeric&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;theme&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;none&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;panel.background&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element_rect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;#F7F7F7&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#axis.text.x=element_text(angle=45,hjust=1,vjust=0.5),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;axis.text.x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element_blank&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xlab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Genes genomic position&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ylab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Score MACS peak&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# aligned to GRCh37g1k &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# get gene symbols via Ensembl BioMart&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# http://grch37.ensembl.org/biomart/martview/35883cadc275e8738b07668ab1f8d58f&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# run 4.chip.sh&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Query for these coordinates and output attributes:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#Dataset : Human genes (GRCh37.p13)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#Filters: coordinates, e.g. 1:100:10000:-1, 1:100000:200000:1: [ID-list specified]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#Attributes:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#Gene stable ID&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#Gene start (bp)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#Gene end (bp)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#Chromosome/scaffold name&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#Transcription start site (TSS)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#Gene name&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Output: GSM2466654_ZNF777_peaks_processed_score_signal_exo.bed.biomartformat_annotated.txt&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df3&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read.table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;../data/GSM2466654_ZNF777_peaks_processed_score_signal_exo.bed.biomartformat_annotated.txt&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;header&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;\t&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stringsAsFactors&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# run 5.genelist.sh&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# cytoscape &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# string gene query, print list&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# MCL cluster, default 2.5 inflation&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read.table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;../output/img2_string_network_ZNF777_peaks_MCL2.5_table.csv&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;header&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;,&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stringsAsFactors&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X__mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;display.name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# colnames(df)[colnames(df) == &apos;oldName&apos;] &amp;lt;- &apos;newName&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;display.name&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;Gene.name&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;X__mclCluster&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;mclCluster&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# merge MCL clusters with the MACS peaks (genes named via biomart)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;merge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# keep only the first row per gene (instead of one row per TSS)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_by&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Gene.name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;row_number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;Chromosome.scaffold.name&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;chromosome&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;Gene.start..bp.&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;start_position&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;Gene.end..bp.&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;end_position&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as.character&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#	geom_hline(linetype=&quot;dotted&quot;, yintercept=0.9) +&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#facet_grid(~chromosome) +&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;facet_grid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;~&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stringr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;str_sort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;numeric&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;theme&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;none&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;panel.background&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element_rect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;#F7F7F7&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#axis.text.x=element_text(angle=45,hjust=1,vjust=0.5),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;axis.text.x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element_blank&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xlab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Genes genomic position&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ylab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;MCL cluster ID&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 


&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stringr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;str_sort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;numeric&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as.character&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#geom_point() +&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_jitter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;alpha&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;theme&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;none&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;panel.background&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element_rect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;#F7F7F7&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#axis.text.x=element_text(angle=45,hjust=1,vjust=0.5),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#axis.text.x=element_blank()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xlab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Chromosome&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ylab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Protein pathway ID&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# merge protein pathway data with MACS peaks&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tmp&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;merge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end_position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Gene.name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end_position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;all&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end_position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tmp&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;merge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;all&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# get pLI scores&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# now get the plof for mcl genes&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gene_names&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tmp&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Gene.name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;write.table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gene_names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;../output/gene_names_mcl_top5.csv&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sep&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;,&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;quote&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;row.names&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;col.names&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# run 6.mclplof.sh&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Read the gnomAD pLoF metrics written by 6.mclplof.sh&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df6&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read.table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;../output/gene_names_mcl_top5_plof.txt&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;header&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;\t&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stringsAsFactors&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tmp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# colnames(df6)[colnames(df6) == &apos;oldName&apos;] &amp;lt;- &apos;newName&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;gene&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;Gene.name&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tmp&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;merge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tmp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;all.x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tmp&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stringr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;str_sort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;numeric&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as.character&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#	geom_point() +&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_jitter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;alpha&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;theme&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;none&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;panel.background&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element_rect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;#F7F7F7&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#axis.text.x=element_text(angle=45,hjust=1,vjust=0.5),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#axis.text.x=element_blank()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xlab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Genes genomic position&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ylab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Protein pathway ID&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tmp&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#	filter(chromosome ==1 ) %&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as.character&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#	geom_point() +&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_jitter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;alpha&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;theme&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;none&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;panel.background&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element_rect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;#F7F7F7&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#axis.text.x=element_text(angle=45,hjust=1,vjust=0.5),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#axis.text.x=element_blank()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xlab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Genes genomic position&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ylab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Protein pathway ID&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# genes with pLI &amp;gt;= 0.9, used as text labels in p1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;subset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tmp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pLI&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.9&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Gene.name&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tmp2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stringr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;str_sort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;numeric&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
				  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as.character&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mclCluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#geom_point() +&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_jitter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pLI&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;alpha&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;theme&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;none&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;panel.background&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element_rect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;#F7F7F7&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#axis.text.x=element_text(angle=45,hjust=1,vjust=0.5),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#axis.text.x=element_blank()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xlab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Genes genomic position&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ylab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Protein pathway ID&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Gene.name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;black&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tmp&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#	filter(chromosome ==1 ) %&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_segment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xend&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;yend&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;facet_grid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;~&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stringr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;str_sort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chromosome&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;numeric&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;theme&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;none&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;panel.background&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element_rect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;#F7F7F7&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#axis.text.x=element_text(angle=45,hjust=1,vjust=0.5),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;axis.text.x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element_blank&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xlab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Genes genomic position&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ylab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Score MACS peak&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Plot main figure ----&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gridExtra&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;grid.arrange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pZNF777&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ncol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				 &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bottom&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
				 &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;4chipsh&quot;&gt;4.chip.sh&lt;/h3&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#!/bin/bash

cd ../data/

# Convert the first three BED columns to the BioMart region format
# (Chr:Start:End:Strand; the strand field is omitted here) [Max 500 advised]
# e.g. 1:100:10000:-1, 1:100000:200000:1

cut -f 1-3 GSM2466654_ZNF777_peaks_processed_score_signal_exo.bed.txt |\
	sed &apos;s/	/:/g&apos; |\
	sed &apos;s/chr//g&apos; \
	&amp;gt; GSM2466654_ZNF777_peaks_processed_score_signal_exo.bed.biomartformat.txt

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;5genelistsh&quot;&gt;5.genelist.sh&lt;/h3&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#!/bin/bash

# generate a simple gene list to cluster with STRING
cut -f 6 GSM2466654_ZNF777_peaks_processed_score_signal_exo.bed.biomartformat_annotated.txt |\
	sort |\
	uniq \
	&amp;gt; GSM2466654_ZNF777_peaks_processed_score_signal_exo.bed.biomartformat_annotated_genes.txt

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;6mclplofsh&quot;&gt;6.mclplof.sh&lt;/h3&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#!/bin/bash
cd ../output/

# for every line in the top 5 protein pathways, get the pLI data
head -1 ../data/gnomad.v2.1.1.lof_metrics.by_gene.txt | cut -f 1,17,18,20,21,42,43,46,75,76,77 &amp;gt; gene_names_mcl_top5_plof.txt

while read -r line; 
do 
grep -w -F -- &quot;$line&quot; ../data/gnomad.v2.1.1.lof_metrics.by_gene.txt;
done &amp;lt; gene_names_mcl_top5.csv  |\
	cut -f 1,17,18,20,21,42,43,46,75,76,77 &amp;gt;&amp;gt; gene_names_mcl_top5_plof.txt

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ol class=&quot;bibliography&quot;&gt;&lt;li&gt;&lt;span id=&quot;lek2016analysis&quot;&gt;Lek, M., Karczewski, K. J., Minikel, E. V., Samocha, K. E., Banks, E., Fennell, T., O’Donnell-Luria, A. H., Ware, J. S., Hill, A. J., Cummings, B. B., &amp;amp; others. (2016). Analysis of protein-coding genetic variation in 60,706 humans. &lt;i&gt;Nature&lt;/i&gt;, &lt;i&gt;536&lt;/i&gt;(7616), 285.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;imbeault2017krab&quot;&gt;Imbeault, M., Helleboid, P.-Y., &amp;amp; Trono, D. (2017). KRAB zinc-finger proteins contribute to the evolution of gene regulatory networks. &lt;i&gt;Nature&lt;/i&gt;, &lt;i&gt;543&lt;/i&gt;(7646), 550–554.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;feng2011using&quot;&gt;Feng, J., Liu, T., &amp;amp; Zhang, Y. (2011). Using MACS to identify peaks from ChIP-Seq data. &lt;i&gt;Current Protocols in Bioinformatics&lt;/i&gt;, &lt;i&gt;34&lt;/i&gt;(1), 2–14.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;van2000graph&quot;&gt;Van Dongen, S. M. (2000). &lt;i&gt;Graph clustering by flow simulation&lt;/i&gt; [PhD thesis].&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;

</description>
        <pubDate>Fri, 24 Jun 2022 02:00:01 +0200</pubDate>
        <link>http://localhost:4000//2022/06/24/KZNF_protein_pathway_regulation.html</link>
        <guid isPermaLink="true">http://localhost:4000//2022/06/24/KZNF_protein_pathway_regulation.html</guid>
      </item>
    
      <item>
        <title>Reference genome choice</title>
        <description>&lt;h1 id=&quot;reference-genome-choice&quot;&gt;Reference genome choice&lt;/h1&gt;
&lt;p&gt;Reference genome choice is discussed succinctly in many different places.
Therefore, I will not write a long piece on it but instead add links to what I think are useful sources.&lt;/p&gt;

&lt;h2 id=&quot;heng-li---which-human-reference-genome-to-use&quot;&gt;Heng Li - Which human reference genome to use?&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://lh3.github.io/2017/11/13/which-human-reference-genome-to-use&quot;&gt;https://lh3.github.io/2017/11/13/which-human-reference-genome-to-use&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;illumina-review&quot;&gt;Illumina review&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://www.illumina.com/science/genomics-research/articles/dragen-demystifying-reference-genomes.html&quot;&gt;https://www.illumina.com/science/genomics-research/articles/dragen-demystifying-reference-genomes.html&lt;/a&gt;&lt;/p&gt;

</description>
        <pubDate>Mon, 31 Jan 2022 01:00:01 +0100</pubDate>
        <link>http://localhost:4000//2022/01/31/reference-genome-choice.html</link>
        <guid isPermaLink="true">http://localhost:4000//2022/01/31/reference-genome-choice.html</guid>
      </item>
    
      <item>
        <title>Who&apos;s afraid of the big bad geneticist?</title>
        <description>&lt;h1 id=&quot;whos-afraid-of-the-big-bad-geneticist&quot;&gt;Who’s afraid of the big bad geneticist?&lt;/h1&gt;
&lt;p&gt;Unfinished article&lt;/p&gt;

&lt;p&gt;Understanding public perception of genomics is crucial for ethically establishing systems of translational medicine.
Despite efforts towards effective altruism,
the use and strategies of genomic intervention ought to remain in the control of the public;
the greatest task lies in providing adequate technical education for a topic with a high barrier to entry.
Human genomics may be one field where oversimplified summaries are not sufficient to demonstrate the grand value for human health, agriculture, and other unexpected benefits.
I shall continue to add high-quality resources to this post over time.&lt;/p&gt;

&lt;h2 id=&quot;public-perception&quot;&gt;Public perception&lt;/h2&gt;
&lt;p&gt;In “Global Public Perceptions of Genomic Data Sharing: What Shapes the Willingness to Donate DNA and Health Data?”,
&lt;a class=&quot;citation&quot; href=&quot;#middleton2020global&quot;&gt;(Middleton et al., 2020)&lt;/a&gt;
analyse
people’s familiarity with the concepts of DNA and genomics,
and how it is associated with the willingness to donate DNA and medical information.
Publisher URL: &lt;a href=&quot;https://www.cell.com/ajhg/fulltext/S0002-9297(20)30292-5&quot;&gt;https://www.cell.com/ajhg/fulltext/S0002-9297(20)30292-5&lt;/a&gt;
and &lt;a href=&quot;https://github.com/DylanLawless/genomics_tools/blob/master/global_genomics_docs/2020_Middleton_AMHG_Global%20Public%20Perceptions%20of%20Genomic%20Data%20Sharing-%20What%20Shapes%20the%20Willingness%20to%20Donate%20DNA%20and%20Health%20Data.pdf&quot;&gt;link to pdf&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The analyses were adjusted for sociodemographic factors, among others.
Not all demographic groups were equally familiar with genetics, but the meta-analysis showed associations between understanding and willingness to participate.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;There was evidence for between-country heterogeneity (\(I^2\) = 48% and \(I^2\) = 61% for familiarity with genetics and personal experience, respectively).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Compared to participants who were unfamiliar with genetics, 
those who were familiar or had personal experience had greater odds of being willing to donate 
(odds ratio [OR] = 1.85, 95% CI = 1.11–2.00;
OR = 2.70, 95% CI = 2.37–3.09, respectively).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Similar results were seen when considering single groups of recipients of donated DNA and medical information, i.e. doctors, non-profit researchers, for-profit researchers.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The association between genetic exceptionalism (the perception that DNA is different from other medical data) and willingness to donate was OR = 1.60 (95% CI = 1.47–1.75).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There was a strong association between trust in multiple actors (doctors, non-profit researchers, commercial researchers, governments, etc.) and willingness to donate DNA and medical information. The pooled OR was 3.85 (95% CI = 3.34–4.44).&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To re-illustrate the authors’ work, I have downloaded the data and reproduced the characteristics plots below.
These data were used in the meta-analysis that reports the ORs listed above.
The R code that I used for this process is included.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# Data source: DOI:https://doi.org/10.1016/j.ajhg.2020.08.023&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Download Supplemental data (mmc3.xlsx) Table S2. Unadjusted Results for Full Dataset, Stratified by Country&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Convert data to tsv format.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read.table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;mmc3.tsv&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
             &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;header&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
             &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;\t&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;require&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tidyr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_l&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gather&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cohort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
       &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Total&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;United.States&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;na.rm&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;convert&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor_key&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_l_sep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;separate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_l&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                       &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;\\(|\\)&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;into&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;value&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;samples&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Get variable names for filtering&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_l_sep&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Variable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The main figures are then plotted as follows.&lt;/p&gt;

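&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;require&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;require&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dplyr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# provides %&amp;gt;%, filter() and group_by() used below&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;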
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Figure 1. Willingness to Donate Anonymous DNA and Medical Information to Different Recipient Groups, Stratified by Country&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_l_sep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Variable&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Willingness to donate to at least two recipient groups&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
           &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Variable&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Willingness to donate to non-profit researchers&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
           &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Variable&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Willingness to donate to for-profit researchers&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Categories&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Yes&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_by&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Variable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cohort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as.numeric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fill&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Variable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_bar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;identity&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;dodge&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;theme&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;axis.text.x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element_text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;angle&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;45&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hjust&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;bottom&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.direction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;vertical&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;labs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Cohort&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot; Percentage willingness &quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
       &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Figure 1. Willingness to Donate Anonymous DNA and\nMedical Information to Different Recipient Groups&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2021-09-21-global_perception/fig1.png&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# Figure 2 Familiarity with Genetics (Including Familiarity Gained through Personal Experience), Stratified by Country&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_l_sep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Variable&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Familiarity with genetics&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Categories&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Personal&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Categories&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Familiar&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Categories&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Unfamiliar&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_by&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Variable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cohort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as.numeric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fill&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Categories&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_bar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;identity&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;dodge&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;theme&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;axis.text.x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element_text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;angle&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;45&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hjust&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;bottom&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.direction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;vertical&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;labs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Cohort&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot; Percentage&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
       &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Figure 2 Familiarity with Genetics \nIncluding Familiarity Gained through Personal Experience&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src=&quot;/images/posts/2021-09-21-global_perception/fig2.png&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# Figure 3 Perception of DNA as Being Different from Other Medical Information (Genetic Exceptionalism) versus DNA as Being the Same (or Unsure), Stratified by Country&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df3&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_l_sep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Variable&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Perception of DNA being different to other medical information versus DNA being the same&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Categories&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Same/Unsure&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Categories&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Different&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_by&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Variable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df3&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cohort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as.numeric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fill&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Categories&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_bar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;identity&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;dodge&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;theme&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;axis.text.x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element_text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;angle&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;45&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hjust&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;bottom&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.direction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;horizontal&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;labs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Cohort&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot; Percentage&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
       &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Figure 3 Perception of DNA as Being Different from\nOther Medical Information (Genetic Exceptionalism)\nversus DNA as Being the Same (or Unsure)&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src=&quot;/images/posts/2021-09-21-global_perception/fig3.png&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# Figure 4 Trust in Donating DNA and Medical Information to More than One User (Including Doctor, Researcher, Company, Government, etc.), Stratified by Country&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_l_sep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Variable&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Trust in donating DNA and medical information to more than one user&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_by&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Variable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&amp;gt;%&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cohort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as.numeric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fill&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Categories&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_bar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;identity&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;dodge&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;theme&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;axis.text.x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element_text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;angle&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;45&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hjust&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;bottom&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend.direction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;horizontal&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;labs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Cohort&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot; Percentage&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
       &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Figure 4 Trust in Donating DNA and Medical Information to \nMore than One User (incl. Doctor, Researcher, Government, etc.)&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src=&quot;/images/posts/2021-09-21-global_perception/fig4.png&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ol class=&quot;bibliography&quot;&gt;&lt;li&gt;&lt;span id=&quot;middleton2020global&quot;&gt;Middleton, A., Milne, R., Almarri, M. A., Anwer, S., Atutornu, J., Baranova, E. E., Bevan, P., Cerezo, M., Cong, Y., Critchley, C., &amp;amp; others. (2020). Global public perceptions of genomic data sharing: what shapes the willingness to donate DNA and health data? &lt;i&gt;The American Journal of Human Genetics&lt;/i&gt;, &lt;i&gt;107&lt;/i&gt;(4), 743–752.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;
</description>
        <pubDate>Tue, 21 Sep 2021 00:00:00 +0200</pubDate>
        <link>http://localhost:4000//2021/09/21/whos_afraid.html</link>
        <guid isPermaLink="true">http://localhost:4000//2021/09/21/whos_afraid.html</guid>
      </item>
    
      <item>
        <title>Weeknotes</title>
        <description>&lt;h1 id=&quot;weeknotes&quot;&gt;Weeknotes&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;13 Sept 2021&lt;/p&gt;

&lt;h1 id=&quot;work&quot;&gt;Work&lt;/h1&gt;
&lt;ul&gt;
  &lt;li&gt;Writing paper on viral infections.&lt;/li&gt;
  &lt;li&gt;Creating genome and gene domain structure illustrations with features, using Illustrator and BED, MSA, and FASTA files.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;personal&quot;&gt;Personal&lt;/h1&gt;
&lt;ul&gt;
  &lt;li&gt;Working on astrobiology projects.&lt;/li&gt;
  &lt;li&gt;Learning a 2d/3d architecture system for design and rendering.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;reading&quot;&gt;Reading&lt;/h1&gt;
&lt;ol&gt;
  &lt;li&gt;A Short History of the Middle East: From Ancient Empires to Islamic State by Gordon Kerr &lt;a href=&quot;https://www.goodreads.com/book/show/29456677-a-short-history-of-the-middle-east&quot;&gt;link&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Killing Hope: U.S. Military and C.I.A. Interventions Since World War II by William Blum &lt;a href=&quot;https://www.goodreads.com/book/show/78130.Killing_Hope&quot;&gt;link&lt;/a&gt; - highly recommended.&lt;/li&gt;
&lt;/ol&gt;

&lt;h1 id=&quot;other-media&quot;&gt;Other media&lt;/h1&gt;
&lt;ul&gt;
  &lt;li&gt;Podcast - Afropop Worldwide &lt;a href=&quot;https://afropop.org/audio-programs/the-cameroon-cuba-connection&quot;&gt;The Cameroon Cuba connection&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Mon, 13 Sep 2021 00:00:00 +0200</pubDate>
        <link>http://localhost:4000//2021/09/13/weeknotes_16.html</link>
        <guid isPermaLink="true">http://localhost:4000//2021/09/13/weeknotes_16.html</guid>
      </item>
    
      <item>
        <title>Weeknotes</title>
        <description>&lt;h1 id=&quot;weeknotes&quot;&gt;Weeknotes&lt;/h1&gt;

&lt;p class=&quot;meta&quot;&gt;6 Sept 2021&lt;/p&gt;

&lt;h1 id=&quot;work&quot;&gt;Work&lt;/h1&gt;
&lt;ul&gt;
  &lt;li&gt;Writing paper on viral infections.&lt;/li&gt;
  &lt;li&gt;Review grant.&lt;/li&gt;
  &lt;li&gt;Prep for incoming student projects.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;personal&quot;&gt;Personal&lt;/h1&gt;
&lt;ul&gt;
  &lt;li&gt;Working on astrobiology projects.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;reading&quot;&gt;Reading&lt;/h1&gt;
&lt;ol&gt;
  &lt;li&gt;A Short History of the Middle East: From Ancient Empires to Islamic State by Gordon Kerr &lt;a href=&quot;https://www.goodreads.com/book/show/29456677-a-short-history-of-the-middle-east&quot;&gt;link&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Smiley’s People by John le Carré - &lt;a href=&quot;https://www.goodreads.com/book/show/18999.Smiley_s_People&quot;&gt;link&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h1 id=&quot;other-media&quot;&gt;Other media&lt;/h1&gt;
&lt;ul&gt;
  &lt;li&gt;Podcast - Afropop Worldwide &lt;a href=&quot;https://afropop.org/audio-programs/new-moves-in-afro-jazz&quot;&gt;New Moves in Afro-Jazz&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Podcast - Afropop Worldwide &lt;a href=&quot;https://afropop.org/audio-programs/hip-deep-angola-4-the-cuban-intervention-in-angola&quot;&gt;Hip Deep Angola 4: The Cuban Intervention in Angola&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Mon, 06 Sep 2021 00:00:00 +0200</pubDate>
        <link>http://localhost:4000//2021/09/06/weeknotes_15.html</link>
        <guid isPermaLink="true">http://localhost:4000//2021/09/06/weeknotes_15.html</guid>
      </item>
    
		
		
      <item>
        <title>Aggregated Cauchy Association Test (ACAT)</title>
        <description>&lt;h1 id=&quot;aggregated-cauchy-association-test-acat&quot;&gt;Aggregated Cauchy Association Test (ACAT)&lt;/h1&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#abbreviations&quot; id=&quot;markdown-toc-abbreviations&quot;&gt;Abbreviations&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#intro-to-this-topic&quot; id=&quot;markdown-toc-intro-to-this-topic&quot;&gt;Intro to this topic&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#papers&quot; id=&quot;markdown-toc-papers&quot;&gt;Papers&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#step-by-step-explanation-of-acatr&quot; id=&quot;markdown-toc-step-by-step-explanation-of-acatr&quot;&gt;Step-by-step explanation of ACAT.R&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#applying-staar-o-for-multiple-annotation-weights&quot; id=&quot;markdown-toc-applying-staar-o-for-multiple-annotation-weights&quot;&gt;Applying STAAR-O for multiple annotation weights&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#non-gene-centric-analysis-using-dynamic-windows-with-scang-staar&quot; id=&quot;markdown-toc-non-gene-centric-analysis-using-dynamic-windows-with-scang-staar&quot;&gt;Non-gene-centric analysis using dynamic windows with SCANG-STAAR&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#multi-weight-annotation-analysis&quot; id=&quot;markdown-toc-multi-weight-annotation-analysis&quot;&gt;Multi-weight annotation analysis&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#main-equations-for-acat&quot; id=&quot;markdown-toc-main-equations-for-acat&quot;&gt;Main equations for ACAT&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#tan-and-pi&quot; id=&quot;markdown-toc-tan-and-pi&quot;&gt;tan and \(\pi\)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#original-r-code-from-yaowuliu&quot; id=&quot;markdown-toc-original-r-code-from-yaowuliu&quot;&gt;Original R code from yaowuliu&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p class=&quot;meta&quot;&gt;25 Apr 2023 - last update&lt;/p&gt;

&lt;h2 id=&quot;abbreviations&quot;&gt;Abbreviations&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;ACAT: Aggregated Cauchy Association Test&lt;/li&gt;
  &lt;li&gt;ACAT-V: Aggregated Cauchy Association Test - Variant level&lt;/li&gt;
  &lt;li&gt;ACAT-O: Aggregated Cauchy Association Test - Omnibus&lt;/li&gt;
  &lt;li&gt;SKAT: Sequence Kernel Association Test&lt;/li&gt;
  &lt;li&gt;ARIC: Atherosclerosis Risk in Communities&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;intro-to-this-topic&quot;&gt;Intro to this topic&lt;/h2&gt;

&lt;p&gt;The Aggregated Cauchy Association Test (ACAT) is a statistical method used for rare-variant association tests (RVATs) in genetic studies. ACAT is designed to aggregate the association signals of multiple rare genetic variants within a genomic region or a gene, while accounting for the directions of the effects of these variants on the phenotype of interest. The ACAT method utilizes a Cauchy distribution, which allows for improved performance in identifying true associations, especially when the directions and magnitudes of variant effects are heterogeneous.&lt;/p&gt;

&lt;p&gt;First, here is a great talk by the author of SKAT and other methods:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=URGJIAdRTi0&amp;amp;pp=ygUbU2VxdWVuY2Uga2VybmFsIGFzc29jaWF0aW9u&quot;&gt;Watch on YouTube&lt;/a&gt; 
&lt;strong&gt;Dr. Xihong Lin&lt;/strong&gt;:
&lt;em&gt;Overview of Rare Variant Analysis of Whole Genome Sequencing Association Studies&lt;/em&gt;.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The major part starts at time: &lt;a href=&quot;https://youtu.be/URGJIAdRTi0?t=1344&quot;&gt;22:30&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;and the best is at: &lt;a href=&quot;https://youtu.be/URGJIAdRTi0?t=1850&quot;&gt;30:50&lt;/a&gt; where she describes the aggregated Cauchy association test (ACAT) method for combining multiple annotations (like CADD score, MAF, etc.) to calculate the final P-value.&lt;/li&gt;
  &lt;li&gt;This is their annotation database discussed: &lt;a href=&quot;https://favor.genohub.org&quot;&gt;https://favor.genohub.org&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;papers&quot;&gt;Papers&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S0002929719300023?via%3Dihub&quot;&gt;ACAT paper&lt;/a&gt;, Yaowu Liu, et al. AJHG 2019.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.nature.com/articles/s41588-020-0676-4&quot;&gt;Application of STAAR protocol to TOPMed&lt;/a&gt;, Xihao Li, et al. NatGen 2020.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://doi.org/10.1038/s41592-022-01640-x&quot;&gt;STAAR pipeline methods paper&lt;/a&gt;, Zilin Li, et al. NatMethods 2022.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/xihaoli/STAARpipeline&quot;&gt;STAAR pipeline github&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Controlling SKAT function: &lt;a href=&quot;https://lawlessgenomics.com/topic/skat#skat-r-package-by-leelabsg&quot;&gt;Here’s a summary of the SKAT package functions&lt;/a&gt; - which are easier to understand than the notation in the SKAT papers. Reading the code shows how each new feature is added sequentially and how the weights work. Note, however, that the ACAT git repo is independent of the SKAT package.&lt;/li&gt;
  &lt;li&gt;ACAT git repo: &lt;a href=&quot;https://github.com/yaowuliu/ACAT/blob/master/R/ACAT.R&quot;&gt;https://github.com/yaowuliu/ACAT/blob/master/R/ACAT.R&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;step-by-step-explanation-of-acatr&quot;&gt;Step-by-step explanation of ACAT.R&lt;/h2&gt;

&lt;p&gt;This discussion refers to code in the main ACAT function found at &lt;a href=&quot;https://github.com/yaowuliu/ACAT&quot;&gt;https://github.com/yaowuliu/ACAT&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/acat_fig_1.jpg&quot; width=&quot;80%&quot; style=&quot;display: block; margin: auto;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Figure 1.&lt;/strong&gt; Summary of the Proposed Methods ACAT, ACAT-V, and ACAT-O and the Relationship Among Them. From the ACAT paper.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The R code defines several functions to perform the Aggregated Cauchy Association Test (ACAT) and the ACAT-V test.
    &lt;ul&gt;
      &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ACAT&lt;/code&gt;: This function combines p-values using the Cauchy distribution.&lt;/li&gt;
      &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ACAT_V&lt;/code&gt;: A set-based test that uses ACAT to combine the variant-level p-values.&lt;/li&gt;
      &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NULL_Model&lt;/code&gt;: Computes model parameters and residuals for ACAT-V.&lt;/li&gt;
      &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Get.marginal.pval&lt;/code&gt;: A helper function to calculate the marginal p-values for ACAT-V.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ACAT&lt;/code&gt; function
    &lt;ul&gt;
      &lt;li&gt;a. It accepts &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Pvals&lt;/code&gt; (p-values), &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;weights&lt;/code&gt;, and an optional &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;is.check&lt;/code&gt; parameter to validate the input.&lt;/li&gt;
      &lt;li&gt;b. Checks for NA, p-value range (0 to 1), and existence of both 0 and 1 p-values in the same column.&lt;/li&gt;
      &lt;li&gt;c. If weights are not provided, equal weights are used. Otherwise, user-supplied weights are validated and standardized.&lt;/li&gt;
      &lt;li&gt;d. The function calculates the Cauchy statistics and returns the ACAT p-value(s).&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ACAT_V&lt;/code&gt; function
    &lt;ul&gt;
      &lt;li&gt;a. It accepts &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;G&lt;/code&gt; (genotype matrix), &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;obj&lt;/code&gt; (output object of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NULL_Model&lt;/code&gt;), &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;weights.beta&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;weights&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mac.thresh&lt;/code&gt;.&lt;/li&gt;
      &lt;li&gt;b. It checks for the validity of input weights.&lt;/li&gt;
      &lt;li&gt;c. Based on the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mac.thresh&lt;/code&gt; value, it decides to use the Burden test, the Cauchy method, or a combination of both.&lt;/li&gt;
      &lt;li&gt;d. It calculates the final p-value and returns it.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NULL_Model&lt;/code&gt; function
    &lt;ul&gt;
      &lt;li&gt;a. It accepts &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Y&lt;/code&gt; (outcome phenotypes) and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Z&lt;/code&gt; (covariates).&lt;/li&gt;
      &lt;li&gt;b. It determines if &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Y&lt;/code&gt; is continuous or binary.&lt;/li&gt;
      &lt;li&gt;c. It fits a linear regression model if &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Y&lt;/code&gt; is continuous and a logistic model if &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Y&lt;/code&gt; is binary.&lt;/li&gt;
      &lt;li&gt;d. It returns an object with model parameters and residuals.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Get.marginal.pval&lt;/code&gt; function
    &lt;ul&gt;
      &lt;li&gt;a. It accepts &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;G&lt;/code&gt; (genotype matrix) and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;obj&lt;/code&gt; (output object of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NULL_Model&lt;/code&gt;).&lt;/li&gt;
      &lt;li&gt;b. It checks the validity of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;obj&lt;/code&gt; input.&lt;/li&gt;
      &lt;li&gt;c. It calculates the marginal p-values and returns them.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;
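
&lt;p&gt;The steps listed for the ACAT function can be sketched in a few lines of base R. This is a minimal illustration, not the package code: the name acat_combine is hypothetical, the handling of p-values equal to exactly 0 or 1 is an assumption of this sketch, and it does not treat extremely small p-values specially.&lt;/p&gt;

```r
# Hypothetical helper (not the package's ACAT() function): combine a vector of
# p-values with the Cauchy transform, following steps a-d above.
acat_combine = function(pvals, weights = NULL) {
  # Step b: input checks.
  if (anyNA(pvals)) stop("p-values must not contain NA")
  if (any(pvals > 1) || any(0 > pvals)) stop("p-values must lie between 0 and 1")
  has_zero = any(pvals == 0)
  has_one = any(pvals == 1)
  if (has_zero) {
    if (has_one) stop("p-values cannot include both 0 and 1")
    return(0)  # assumption of this sketch: an exact 0 dominates
  }
  if (has_one) return(1)  # assumption of this sketch: an exact 1 dominates

  # Step c: equal weights by default; otherwise validate and standardise.
  if (is.null(weights)) {
    weights = rep(1, length(pvals))
  } else {
    if (length(weights) != length(pvals)) stop("weights and p-values must have the same length")
    if (anyNA(weights) || any(0 > weights)) stop("weights must be non-negative")
    if (sum(weights) == 0) stop("weights must not all be zero")
  }
  weights = weights / sum(weights)

  # Step d: Cauchy statistic, then the upper tail of the standard Cauchy.
  t_stat = sum(weights * tan((0.5 - pvals) * pi))
  0.5 - atan(t_stat) / pi
}

# Example with made-up p-values.
p_example = acat_combine(c(0.01, 0.20, 0.50))
```

&lt;p&gt;Because the weights are standardised to sum to one, combining several copies of the same p-value returns that p-value, which is a convenient sanity check.&lt;/p&gt;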

&lt;p&gt;The Aggregated Cauchy Association Test (ACAT) is a powerful and computationally efficient method designed to improve the analysis of rare and low-frequency genetic variants in sequencing studies. Traditional set-based tests can experience power loss when only a small proportion of variants are causal, and their power can be sensitive to factors such as the number, effect sizes, and effect directions of causal variants, as well as weight choices.&lt;/p&gt;

&lt;p&gt;ACAT addresses these issues by combining variant-level p-values to create a set-based test called ACAT-V. ACAT-V is particularly powerful when there are only a few causal variants in a set, making it a valuable tool for genetic analysis. Additionally, ACAT can be used to create an omnibus test called ACAT-O by combining different variant-set-level p-values. ACAT-O incorporates the strengths of multiple complementary set-based tests, such as the burden test, sequence kernel association test (SKAT), and ACAT-V.&lt;/p&gt;

&lt;p&gt;By analyzing extensive simulated data and real-world data from the Atherosclerosis Risk in Communities (ARIC) study, it has been demonstrated that ACAT-V complements other tests like SKAT and the burden test. Furthermore, ACAT-O consistently delivers more robust and higher power than alternative tests, making it a valuable addition to the toolkit of researchers working with sequencing studies.&lt;/p&gt;

&lt;p&gt;ACAT is designed to combine p-values from multiple variants or tests rather than combining annotation scores directly. If you have p-values associated with each of the 5 annotation columns (CADD_score, MAF, GnomAD_AF, REVEL_score, ClinVar_score) for a single variant, you could potentially use ACAT to combine these p-values to obtain a single combined p-value for that variant. However, it’s essential to ensure that each input p-value is valid. They do not need to be independent: according to the ACAT paper, the Cauchy approximation remains accurate under arbitrary correlation among the p-values.&lt;/p&gt;

&lt;p&gt;See the STAAR framework for a way to do this.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/acat_fig_2.png&quot; width=&quot;80%&quot; style=&quot;display: block; margin: auto;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Figure 2.&lt;/strong&gt; Slide from presentation of ACAT method.&lt;/p&gt;

&lt;h2 id=&quot;applying-staar-o-for-multiple-annotation-weights&quot;&gt;Applying STAAR-O for multiple annotation weights&lt;/h2&gt;
&lt;p&gt;In a separate page, I discuss the STAAR method. 
The following passages are included in both pages since they are related.&lt;/p&gt;

&lt;p&gt;In the STAAR Nature Methods paper, the section &lt;em&gt;Gene-centric analysis of the noncoding genome&lt;/em&gt; 
shows how the STAAR method can indeed be used to capitalize on the ACAT method to obtain a combined p-value from a set of annotations for a single variant. The STAAR framework incorporates multiple functional annotation scores into the RVATs (rare-variant association tests) to increase the power of association analysis. In this context, it uses the STAAR-O test, an omnibus test that aggregates annotation-weighted burden test, SKAT, and ACAT-V within the STAAR framework.&lt;/p&gt;

&lt;p&gt;By incorporating multiple functional annotation scores, such as CADD, LINSIGHT, FATHMM-XF, and annotation principal components (aPCs), the STAAR method enhances the ability to detect associations between variants and traits of interest. Therefore, the STAAR framework can be used to leverage the strengths of the ACAT method and obtain a combined p-value from a set of annotations for a single variant or a set of variants.&lt;/p&gt;
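
&lt;p&gt;One simple way to picture the omnibus idea is a grid of p-values, with one row per set-based test and one column per annotation weight, that is collapsed by the Cauchy combination. The sketch below uses made-up p-values and a hypothetical helper, cauchy_combine, with equal weights. It is only an illustration of the aggregation step and does not reproduce the STAAR package.&lt;/p&gt;

```r
# Hypothetical p-values: rows are set-based tests, columns are annotation weights.
pvals = matrix(
  c(0.040, 0.20, 0.01,
    0.300, 0.02, 0.15,
    0.008, 0.50, 0.06),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Burden", "SKAT", "ACAT-V"), c("CADD", "LINSIGHT", "aPC1"))
)

# Equal-weight Cauchy combination (assumes every p is strictly between 0 and 1).
cauchy_combine = function(p) {
  t_stat = mean(tan((0.5 - p) * pi))
  0.5 - atan(t_stat) / pi
}

# One p-value per test, combining across annotations.
per_test = apply(pvals, 1, cauchy_combine)

# A single omnibus p-value across all tests and annotations.
omnibus = cauchy_combine(as.vector(pvals))
```

&lt;p&gt;Here per_test keeps the row names of the grid, so each test can still be inspected separately alongside the single omnibus value.&lt;/p&gt;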

&lt;h2 id=&quot;non-gene-centric-analysis-using-dynamic-windows-with-scang-staar&quot;&gt;Non-gene-centric analysis using dynamic windows with SCANG-STAAR&lt;/h2&gt;

&lt;p&gt;The SCANG-STAAR method is an improvement over the fixed-size sliding window RVAT in the STAAR framework. It proposes a dynamic window-based approach called SCANG-STAAR, which extends the SCANG procedure by incorporating multidimensional functional annotations. This method allows for flexible detection of locations and sizes of signal windows across the genome, as the locations of regions associated with a disease or trait are often unknown in advance, and their sizes may vary across the genome. Using a prespecified fixed-size sliding window for RVAT can lead to power loss if the prespecified window sizes do not align with the true locations of the signals.&lt;/p&gt;

&lt;p&gt;The SCANG-STAAR method has two main procedures: SCANG-STAAR-S and SCANG-STAAR-B. SCANG-STAAR-S extends the SCANG-SKAT (SCANG-S) procedure by calculating the STAAR-SKAT (STAAR-S) p-value in each overlapping window by incorporating multiple variant functional annotations, instead of using just the MAF-weight-based SKAT p-value. SCANG-STAAR-B is based on the STAAR-Burden p-value. SCANG-STAAR-S has two advantages over SCANG-STAAR-B in detecting noncoding associations using dynamic windows: first, the effects of causal variants in a neighborhood in the noncoding genome tend to be in different directions, especially in intergenic regions; second, due to the different correlation structures of the two test statistics for overlapping windows, the genome-wide significance threshold of SCANG-STAAR-B is lower than that of SCANG-STAAR-S.&lt;/p&gt;

&lt;p&gt;SCANG-STAAR also provides the SCANG-STAAR-O procedure, based on an omnibus p-value of SCANG-STAAR-S and SCANG-STAAR-B calculated by the ACAT method. However, unlike STAAR-O, the ACAT-V test is not incorporated into the omnibus test because it is designed for sparse alternatives, and as a result, it tends to detect the region with the smallest size that contains the most significant variant in the dynamic window procedure.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/acat_fig_3.png&quot; width=&quot;80%&quot; style=&quot;display: block; margin: auto;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Figure 3.&lt;/strong&gt; Slide from presentation of ACAT application in STAAR.&lt;/p&gt;

&lt;h2 id=&quot;multi-weight-annotation-analysis&quot;&gt;Multi-weight annotation analysis&lt;/h2&gt;

&lt;p&gt;The STAAR framework can be used to combine the p-values associated with each of the 5 annotation columns (CADD_score, MAF, GnomAD_AF, REVEL_score, ClinVar_score) for a single variant. STAAR incorporates multiple functional annotation scores as weights when constructing its statistics, making it suitable for combining p-values from different annotation columns to obtain a single combined p-value for that variant.&lt;/p&gt;

&lt;h2 id=&quot;main-equations-for-acat&quot;&gt;Main equations for ACAT&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;ACAT test statistic:&lt;/li&gt;
&lt;/ol&gt;

\[T_{ACAT} = \sum_{i=1}^{k} w_i \tan{[(0.5 - p_i)\pi]}\]

&lt;p&gt;where \(p_i\) are the p-values, and \(w_i\) are non-negative weights.&lt;/p&gt;

&lt;ol start=&quot;2&quot;&gt;
  &lt;li&gt;P-value calculation for the ACAT test statistic:&lt;/li&gt;
&lt;/ol&gt;

\[p \text{-value} \approx \frac{1}{2} - \frac{\arctan{(T_{ACAT} / w)}}{\pi}\]

&lt;p&gt;where \(w = \sum_{i=1}^{k} w_i\).&lt;/p&gt;
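&lt;p&gt;The two equations above can be sketched in a few lines of R. This is a minimal illustration rather than the package implementation: &lt;code&gt;acat_sketch&lt;/code&gt; is a hypothetical helper name, it handles a single vector of p-values, and it omits the input checks and the special handling of very small p-values. The example p-values are those from the examples line of the ACAT documentation, and the second set of weights is purely illustrative.&lt;/p&gt;

```r
# Minimal sketch of the Cauchy combination for one vector of p-values.
# acat_sketch is a hypothetical helper, not part of the ACAT package.
acat_sketch = function(p, w = rep(1, length(p))) {
  stopifnot(length(p) == length(w), all(w >= 0), sum(w) > 0)
  # Test statistic: weighted sum of tangent-transformed p-values
  t_acat = sum(w * tan((0.5 - p) * pi))
  # Upper-tail probability of a standard Cauchy, after scaling by w = sum(w_i)
  0.5 - atan(t_acat / sum(w)) / pi
}

# P-values taken from the @examples line of the original ACAT documentation
p_values = c(2e-02, 4e-04, 0.2, 0.1, 0.8)
p_equal = acat_sketch(p_values)

# Up-weighting the smallest p-value (illustrative weights only)
p_weighted = acat_sketch(p_values, w = c(1, 4, 1, 1, 1))
```

&lt;p&gt;With equal weights the statistic is the mean of the transformed p-values, so the result should agree with &lt;code&gt;pcauchy(mean(tan((0.5 - p_values) * pi)), lower.tail = FALSE)&lt;/code&gt;.&lt;/p&gt;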

&lt;ol&gt;
  &lt;li&gt;ACAT is a general and flexible method of combining p-values, which can represent the statistical significance of different kinds of genetic variations in sequencing studies.&lt;/li&gt;
  &lt;li&gt;ACAT only aggregates p-values, so one can automatically control cryptic relatedness and/or population stratification by fitting appropriate models from which p-values are calculated through methods such as principal-component analysis or mixed models.&lt;/li&gt;
  &lt;li&gt;The null distribution of the test statistic \(T_{ACAT}\) can be well approximated by a Cauchy distribution without the need for estimating and accounting for the correlation among p-values.&lt;/li&gt;
  &lt;li&gt;Calculating the p-value of ACAT requires almost negligible computation and is extremely fast.&lt;/li&gt;
  &lt;li&gt;The approximation is particularly accurate when ACAT has a very small p-value, which is useful in sequencing studies because only very small p-values can pass the stringent genome-wide significance threshold and are of particular interest.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;tan-and-pi&quot;&gt;tan and \(\pi\)&lt;/h2&gt;

&lt;p&gt;In the ACAT method, the tangent function (“tan”) and the constant \(\pi\) are used to transform each p-value so that it follows a standard Cauchy distribution under the null hypothesis. 
This transformation is essential to the ACAT method because it allows for an efficient and accurate combination of p-values, even when they are correlated.&lt;/p&gt;

&lt;p&gt;The tangent function is used because of its connection to the Cauchy distribution: \(\tan((0.5 - p)\pi)\) is the upper-tail quantile function of the standard Cauchy distribution. 
The Cauchy distribution has heavy tails, which is why the tail of the combined statistic is only weakly affected by correlation among the p-values. 
If the p-value \(p_i\) is uniformly distributed on (0, 1), as it is under the null hypothesis, the transformed value \(\tan((0.5 - p_i) \pi)\) follows a standard Cauchy distribution.&lt;/p&gt;

&lt;p&gt;The constant \(\pi\) scales the input of the tangent function. 
Multiplying \(0.5 - p_i\) by \(\pi\) maps the range of p-values (0 to 1) onto the interval \((-\pi/2, \pi/2)\), which is one full period of the tangent function, so p-values near 0 map to large positive values and p-values near 1 map to large negative values. 
This ensures that the transformed values follow the desired Cauchy distribution.&lt;/p&gt;

&lt;p&gt;In short, the tangent function and \(\pi\) together transform p-values into standard Cauchy variables under the null hypothesis, which is what makes the combination of correlated p-values both efficient and accurate.&lt;/p&gt;
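&lt;p&gt;As a quick numerical check of this argument, the following R sketch compares the tangent transform with the upper-tail quantile function of the standard Cauchy distribution from base R. The p-values are arbitrary illustrative numbers. The last lines also look at the approximation \(1 / (p\pi)\), which the original R code in the next section substitutes for the transform when a p-value is below 1e-15.&lt;/p&gt;

```r
# For p in (0, 1), tan((0.5 - p) * pi) equals the upper-tail quantile of the
# standard Cauchy distribution, so a Uniform(0, 1) p-value maps to a
# standard Cauchy variable.
p = c(0.001, 0.05, 0.25, 0.5, 0.75, 0.95)   # arbitrary illustrative p-values
transformed = tan((0.5 - p) * pi)
cauchy_quantile = qcauchy(p, lower.tail = FALSE)
max_abs_diff = max(abs(transformed - cauchy_quantile))

# For very small p the transform is close to 1 / (p * pi)
p_small = 1e-10
exact_small = tan((0.5 - p_small) * pi)
approx_small = 1 / (p_small * pi)
rel_diff = abs(exact_small - approx_small) / approx_small
```

&lt;p&gt;Both &lt;code&gt;max_abs_diff&lt;/code&gt; and &lt;code&gt;rel_diff&lt;/code&gt; should be close to zero, up to floating-point error.&lt;/p&gt;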

&lt;h2 id=&quot;original-r-code-from-yaowuliu&quot;&gt;Original R code from yaowuliu&lt;/h2&gt;

&lt;p&gt;This code is the main ACAT function found at
&lt;a href=&quot;https://github.com/yaowuliu/ACAT&quot;&gt;https://github.com/yaowuliu/ACAT&lt;/a&gt;&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; Aggregated Cauchy Association Test&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; A p-value combination method using the Cauchy distribution.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @param weights a numeric vector/matrix of non-negative weights for the combined p-values. When it is NULL, the equal weights are used.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @param Pvals a numeric vector/matrix of p-values. When it is a matrix, each column of p-values is combined by ACAT.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @param is.check logical. Should the validity of \emph{Pvals} (and \emph{weights}) be checked? When the size of \emph{Pvals} is large and one knows \emph{Pvals} is valid, then the checking part can be skipped to save memory.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @return The p-value(s) of ACAT.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @author Yaowu Liu&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @examples p.values&amp;lt;-c(2e-02,4e-04,0.2,0.1,0.8);ACAT(Pvals=p.values)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @examples ACAT(matrix(runif(1000),ncol=10))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @references Liu, Y., &amp;amp; Xie, J. (2019). Cauchy combination test: a powerful test with analytic p-value calculation&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; under arbitrary dependency structures. \emph{Journal of American Statistical Association},115(529), 393-402. (\href{https://amstat.tandfonline.com/doi/abs/10.1080/01621459.2018.1554485}{pub})&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @export&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ACAT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.check&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as.matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.check&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### check if there is NA&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;is.na&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Cannot have NAs in the p-values!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### check if Pvals are between 0 and 1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;P-values must be between 0 and 1!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### check if there are pvals that are either exactly 0 or 1.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.zero&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colSums&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.one&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colSums&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.zero&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.one&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Cannot have both 0 and 1 p-values in the same column!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

        &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.zero&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;warning&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;There are p-values that are exactly 0!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.one&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;warning&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;There are p-values that are exactly 1!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### Default: equal weights. If not, check the validity of the user supplied weights and standardize them.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;is.null&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.weights.null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.weights.null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as.matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;dim&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;dim&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;The dimensions of weights and Pvals must be the same!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.check&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;All the weights must be nonnegative!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w.sum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colSums&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w.sum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;At least one weight should be positive in each column!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ncol&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w.sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### check if there are very small non-zero p values and calculate the Cauchy statistics&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.small&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1e-15&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.weights.null&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.small&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;tan&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.small&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;pi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.small&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.small&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;pi&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cct.stat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colMeans&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.small&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.small&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;tan&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.small&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;pi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.small&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.small&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.small&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;pi&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cct.stat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colSums&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### return the ACAT p value(s).&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pval&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pcauchy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cct.stat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lower.tail&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;F&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; A set-based test that uses ACAT to combine the variant-level p-values.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @param G a numeric matrix or dgCMatrix with each row as a different individual and each column as a separate gene/snp. Each genotype should be coded as 0, 1, 2.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @param obj an output object of the \code{\link{NULL_Model}} function.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @param weights.beta a numeric vector of parameters for the beta weights for the weighted kernels. If you want to use your own weights, please use the “weights” parameter. It will be ignored if “weights” parameter is not null.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @param weights a numeric vector of weights for the SNP p-values. When it is NULL, the beta weight with the “weights.beta” parameter is used.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @param mac.thresh a threshold of the minor allele count (MAC). The Burden test will be used to aggregate the SNPs with MAC less than this threshold.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @return The p-value of ACAT-V.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @details The Burden test is first used to aggregate very rare variants with Minor Allele Count (MAC) &amp;lt; \emph{mac.thresh} (e.g., 10), and a Burden p-value is obtained. For each of the variants with MAC &amp;gt;= \emph{mac.thresh}, a variant-level p-value is calculated. Then, ACAT is used to combine the variant-level p-values and the Burden test p-value of very rare variants.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; If \emph{weights.beta} is used, then the weight for the Burden test p-value is determined by the average Minor Allele Frequency (MAF) of the variants with MAC &amp;lt; \emph{mac.thresh}; if the user-specified \emph{weights} is used, then the weight for the Burden test p-value is the average of \emph{weights} of the variants with MAC &amp;lt; \emph{mac.thresh}.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; Note that the \emph{weights} here are for the SNP p-values. In SKAT, the weights are for the SNP score test statistics. To transform the SKAT weights to the \emph{weights} here, one can use the formula \emph{weights} = (skat_weights)^2*MAF*(1-MAF).&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @author Yaowu Liu&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @references Liu, Y., et al. (2019). ACAT: A fast and powerful p value combination&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; method for rare-variant analysis in sequencing studies.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; \emph{American Journal of Human Genetics 104}(3), 410-421.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; (\href{https://www.sciencedirect.com/science/article/pii/S0002929719300023}{pub})&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @examples  library(Matrix)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @examples  data(Geno)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @examples  G&amp;lt;-Geno[,1:100] # Geno is a dgCMatrix of genotypes&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @examples  Y&amp;lt;-rnorm(nrow(G)); Z&amp;lt;-matrix(rnorm(nrow(G)*4),ncol=4)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @examples  obj&amp;lt;-NULL_Model(Y,Z)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @examples  ACAT_V(G,obj)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @export&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ACAT_V&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.beta&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;25&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac.thresh&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;### check weights&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;is.null&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ncol&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;The length of weights must equal the number of variants!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Matrix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colSums&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;### remove SNPs with mac=0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;drop&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;The genotype matrix does not have any non-zero elements!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;### p and n&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nrow&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;###&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac.thresh&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## only Burden&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pval&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Burden&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.beta&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.beta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac.thresh&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## only cauchy method&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;is.null&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MAF&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbeta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MAF&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.beta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.beta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbeta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MAF&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Mpvals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Get.marginal.pval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pval&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ACAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Mpvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## Burden + Cauchy method&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.very.rare&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac.thresh&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.sparse&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.very.rare&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.dense&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.very.rare&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pval.dense&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Burden&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.very.rare&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;drop&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.beta&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.beta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.sparse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## despite the name, this is the Burden p-value of the very rare variants&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Mpvals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Get.marginal.pval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[,&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.very.rare&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;drop&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Mpvals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Mpvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pval.dense&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;is.null&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MAF&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mac&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mafs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MAF&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.very.rare&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MAF&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.very.rare&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## maf for p-values&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbeta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mafs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.beta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.beta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbeta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mafs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.dense&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.sparse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;


        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.keep&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;rep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Mpvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.keep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;which&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Mpvals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## remove p-values of 1.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pval&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ACAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Mpvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.keep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is.keep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; Get parameters and residuals from the NULL model&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; Compute model parameters and residuals for ACAT-V&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @param Y a numeric vector of outcome phenotypes.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @param Z a numeric matrix of covariates. Z must be full-rank. Do not include intercept in Z. The intercept will be added automatically.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @return This function returns an object that has model parameters and residuals of the NULL model of no association between genetic variables and outcome phenotypes. After obtaining it, please use \code{\link{ACAT_V}} function to conduct the association test.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @details \emph{Y} must be either continuous or binary. If \emph{Y} is continuous, a linear regression model is fitted. If \emph{Y} is binary, it must be coded as 0,1 and a logistic model is fitted.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @author Yaowu Liu&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @examples  Y&amp;lt;-rnorm(10000)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @examples  Z&amp;lt;-matrix(rnorm(10000*4),ncol=4)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @examples  obj&amp;lt;-NULL_Model(Y,Z)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;cd&quot;&gt;#&apos; @export&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NULL_Model&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### check the type of Y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;D&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;C&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### Add intercept&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cbind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;rep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;C&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### estimate of sigma square&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.med&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;solve&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chol&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;   &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## Z.med%*%t(Z.med) is the projection matrix of Z.tilde&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as.vector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.med&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.med&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ncol&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.med&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### output&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;out_type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Z.med&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.med&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Y.res&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;sigma2&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;D&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### fit null model&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;glm&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;~&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;family&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;binomial&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;prob.est&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;fitted.values&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### unstandardized residuals&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;prob.est&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;### Sigma when rho=0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma2.Y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;prob.est&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;prob.est&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;### variance of each Y_i&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;### output&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;out_type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Z.tilde&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Y.res&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;sigma2.Y&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma2.Y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;




&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Get.marginal.pval&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;### check obj&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;out_type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;obj is not calculated from NULL_Model!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;out_type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;C&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;identical&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)],&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Z.med&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Y.res&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;sigma2&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;obj is not calculated from NULL_Model!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.med&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Z.med&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Y.res&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SST&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;sigma2&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ncol&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.med&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;D&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;identical&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)],&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Z.tilde&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Y.res&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;sigma2.Y&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;obj is not calculated from NULL_Model!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Z.tilde&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Y.res&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma2.Y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;sigma2.Y&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;is.matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;inherits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;dgCMatrix&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;The class of G must be matrix or dgCMatrix!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;C&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G_tX.med&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as.matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Matrix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;crossprod&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.med&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;### Sigma^2 of G&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Sigma2.G&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Matrix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colSums&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Matrix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colSums&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G_tX.med&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SSR&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as.vector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Sigma2.G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SSR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Sigma2.G&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df.2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ncol&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.med&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t.stat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;suppressWarnings&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SSR&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SST&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SSR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df.2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;marginal.pvals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t.stat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ncol&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.med&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lower.tail&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;D&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.stat0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as.vector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;### Sigma when rho=0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tG_X.tilde_sigma2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as.matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Matrix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;crossprod&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma2.Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Sigma2.G&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Matrix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colSums&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma2.Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;diag&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tG_X.tilde_sigma2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;solve&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma2.Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tG_X.tilde_sigma2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;marginal.pvals&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pnorm&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;abs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.stat0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Sigma2.G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lower.tail&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

    &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;marginal.pvals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;


&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Burden&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kernel&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;linear.weighted&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.beta&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;25&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;### check obj&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;out_type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;obj is not calculated from NULL_MODEL!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;out_type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;C&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;isTRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;all.equal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)],&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Z.med&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Y.res&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;sigma2&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;obj is not calculated from NULL_MODEL!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.med&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Z.med&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Y.res&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;sigma2&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]])&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## rescaled residuals&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;D&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;isTRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;all.equal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)],&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Z.tilde&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Y.res&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;sigma2.Y&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;obj is not calculated from NULL_MODEL!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Z.tilde&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Y.res&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma2.Y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;sigma2.Y&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;### MAF&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MAF&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Matrix&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colSums&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;dim&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MAF&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#### weights&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kernel&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;linear.weighted&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;is.null&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbeta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MAF&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.beta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights.beta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;weights&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;The length of weights must equal the number of variants!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kernel&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;linear&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;rep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;The kernel name is not valid!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;###### if G is sparse or not&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;inherits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;matrix&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;||&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;inherits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;dgCMatrix&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;C&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.stat.sum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as.vector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Gw&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma.z&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Gw&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.med&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Gw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out_type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;D&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.stat.sum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as.vector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y.res&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Gw&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as.vector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;W&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma.z&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Gw&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma2.Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Gw&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma2.Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;solve&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma2.Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Gw&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma2.Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%*%&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;Z.tilde&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma.z&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as.vector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma.z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;The class of G must be matrix or dgCMatrix!&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Z.stat.sum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sigma.z&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Q&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;   &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;## Q test statistic&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pval&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pchisq&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Q&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
</description>
        <pubDate>Tue, 25 Apr 2023 00:00:00 +0000</pubDate>
        <link>http://localhost:4000//topic/statistics/acat/</link>
        <guid isPermaLink="true">http://localhost:4000//topic/statistics/acat/</guid>
      </item>
    
      <item>
        <title>Analysis of methods</title>
        <description>&lt;h1 id=&quot;analysis-of-methods&quot;&gt;Analysis of methods&lt;/h1&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#the-fianl-comparison-results&quot; id=&quot;markdown-toc-the-fianl-comparison-results&quot;&gt;The final comparison results&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#scores-per-sample-and-bland-altman-plot&quot; id=&quot;markdown-toc-scores-per-sample-and-bland-altman-plot&quot;&gt;Scores per sample and Bland-Altman plot&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#correlation-test-result-and-repeatability-coefficients&quot; id=&quot;markdown-toc-correlation-test-result-and-repeatability-coefficients&quot;&gt;Correlation test result and repeatability coefficients&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#code&quot; id=&quot;markdown-toc-code&quot;&gt;Code&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#summary-of-the-analysis-of-method-comparison-studies&quot; id=&quot;markdown-toc-summary-of-the-analysis-of-method-comparison-studies&quot;&gt;Summary of the analysis of method comparison studies&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Altman, D. G., and J. M. Bland. “Measurement in Medicine: The Analysis of Method Comparison Studies.” Journal of the Royal Statistical Society. Series D (The Statistician), vol. 32, no. 3, 1983, pp. 307–17. JSTOR, &lt;a href=&quot;https://doi.org/10.2307/2987937&quot;&gt;https://doi.org/10.2307/2987937&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The paper is a pivotal guide discussing the analysis of method comparison studies, particularly in the field of medicine.
It proposes a pragmatic approach to analyze such studies, stressing the importance of simplicity especially when the results need to be explained to non-statisticians.
I work through it here to assess whether a new method is as good as, or better than, an existing one, using clinical sepsis scores as example data.&lt;/p&gt;

&lt;p&gt;For similar papers, see the series of BMJ statistical notes by Altman &amp;amp; Bland
(&lt;a href=&quot;https://github.com/DylanLawless/notes/blob/main/202106291417-lit-altman_bland.md&quot;&gt;lit-altman_bland.md&lt;/a&gt;).&lt;/p&gt;

&lt;h2 id=&quot;the-fianl-comparison-results&quot;&gt;The final comparison results&lt;/h2&gt;
&lt;h3 id=&quot;scores-per-sample-and-bland-altman-plot&quot;&gt;Scores per sample and Bland-Altman plot&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;/images/altman_bland_analysis_of_methods.png&quot; width=&quot;100%&quot; /&gt;
&lt;strong&gt;Figure 1.&lt;/strong&gt; Scores per sample and Bland-Altman plot as produced by the provided R code.&lt;/p&gt;

&lt;h3 id=&quot;correlation-test-result-and-repeatability-coefficients&quot;&gt;Correlation test result and repeatability coefficients&lt;/h3&gt;
&lt;p&gt;Results of the analysis as produced by the provided R code:&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Repeatability_Score1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.674812&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Repeatability_Score2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.267645&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Correlation_Test&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

	&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pearson&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;product&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;moment&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;correlation&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;and&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;21.693&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;197&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2.2e-16&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;alternative&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hypothesis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;correlation&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;not&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;equal&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;95&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;percent&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;confidence&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
 &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.7931229&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.8763442&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sample&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;estimates&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cor&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.8395928&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Repeatability Scores:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Repeatability Score 1:&lt;/strong&gt; 4.674812&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Repeatability Score 2:&lt;/strong&gt; 4.267645&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The repeatability scores are intended as a measure of how consistent each method is. As computed in the code below, each one is the standard deviation of that score across the example samples, so it reflects the spread of scores between samples rather than the variation between repeated measurements on the same patient. The two values are similar, which indicates that the methods produce a comparable spread of scores. Without a benchmark to compare against, it is difficult to say whether these values are good or not.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Correlation Test:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Pearson’s Correlation Coefficient (r):&lt;/strong&gt; 0.8395928&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;95% Confidence Interval for r:&lt;/strong&gt; [0.7931229, 0.8763442]&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;P-value:&lt;/strong&gt; \(&amp;lt; 2.2e-16\)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pearson’s correlation coefficient is high, indicating a strong positive linear relationship between the scores from the two methods. The very small p-value (less than 2.2e-16) means that a correlation this strong would be highly unlikely if the two scores were truly uncorrelated.&lt;/p&gt;
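
&lt;p&gt;As a quick check on the output above, the reported t statistic can be recovered from r and the degrees of freedom using the standard test for a Pearson correlation. This is a sketch that uses only the printed values; \(n\) denotes the number of complete score pairs, which is implied by the output rather than stated in it:&lt;/p&gt;

&lt;p&gt;\[ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}, \qquad n - 2 = 197, \qquad t \approx \frac{0.8396 \times \sqrt{197}}{\sqrt{1 - 0.8396^{2}}} \approx 21.69 \]&lt;/p&gt;

&lt;p&gt;The degrees of freedom imply 199 complete pairs, one fewer than the 200 rows selected, which suggests that one row has a missing score.&lt;/p&gt;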

&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Considering the high correlation coefficient and the comparable repeatability scores, the new method appears to behave similarly to the old one. Note, however, that a high correlation shows a strong association between the two scores rather than agreement, which is a central point of Altman and Bland’s paper. Agreement is better judged from the bias and limits of agreement in the Bland-Altman plot (Figure 1).&lt;/p&gt;
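
&lt;p&gt;For reference, the bias and limits of agreement drawn in a Bland-Altman plot, together with the approximate confidence intervals computed in the code below, can be written as follows. This is a sketch that mirrors the variable names in the code; the symbols \(d_i\), \(\bar{d}\), \(s_d\), and \(n\) are notation introduced here for illustration:&lt;/p&gt;

&lt;p&gt;\[ d_i = \text{score1}_i - \text{score2}_i, \qquad \bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad \text{LoA} = \bar{d} \pm 1.96\, s_d \]&lt;/p&gt;

&lt;p&gt;\[ \text{SE}(\bar{d}) = \sqrt{\frac{s_d^{2}}{n}}, \qquad \text{SE}(\text{LoA}) \approx \sqrt{\frac{3\, s_d^{2}}{n}}, \qquad \text{CI} = \text{estimate} \pm t_{0.975,\, n-1} \times \text{SE} \]&lt;/p&gt;

&lt;p&gt;Here \(s_d\) is the sample standard deviation of the differences. If the differences are roughly normally distributed, about 95% of them are expected to fall between the two limits of agreement.&lt;/p&gt;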

&lt;h2 id=&quot;code&quot;&gt;Code&lt;/h2&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# Here is a really cool set of notes in BMJ about all kinds of clinical data analysis.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# https://www-users.york.ac.uk/~mb55/pubs/pbstnote.htm&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# For example, if you would ever like to go further with the score that you are working on, this paper is famous for showing how to compare two such clinical score methods to see whether the new one performs better.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# https://sci-hub.hkvisa.net/10.2307/2987937&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# (Measurement in Medicine: The Analysis of Method Comparison Studies Author(s): D. G. Altman and J. M. Bland 1983)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Example ----&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# psofa.score - Total number of organ failures according to 2017 pSOFA definitions (Matics et al. (2017) (PMID 28783810)). The classification of organ failures is based on the worst vital signs and the worst lab values during the first 7 days from blood culture sampling.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# pelod.score - Total number of organ failures according to PELOD-2 definitions (Leteurtre et al. (2013) (PMID 23685639)). The classification of organ failures is based on the worst vital signs and the worst lab values during the first 7 days from blood culture sampling.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dplyr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tidyr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;patchwork&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Read data&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read.csv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;../data/example_sepsis_scores.tsv&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sep&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot; &quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Take 200 rows as an example&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;200&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Renaming columns and adding index column&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;psofa.score&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pelod.score&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rownames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Adding a small amount of random noise to each value&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# NOTE: THIS WOULD BE USED FOR ADDING SOME NOISE TO MAKE THE DATA MORE ANONYMOUS FOR PUBLISHING AN EXAMPLE FIGURE&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set.seed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;123&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Setting a seed to ensure reproducibility&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rnorm&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nrow&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sd&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rnorm&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nrow&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sd&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Calculate necessary statistics: the average and the difference of score1 and score2.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;means&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;diffs&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Compute a repeatability score for each method separately (here, the standard deviation of each score across samples)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;repeatability_score1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;na.rm&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;repeatability_score2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;na.rm&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Pearson correlation test between the two scores (measures linear association, not agreement)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cor_test&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cor.test&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;method&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;pearson&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Average difference (aka the bias)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bias&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;diffs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;na.rm&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Sample standard deviation of the differences&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sd&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;diffs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;na.rm&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Limits of agreement&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;upper_loa&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bias&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1.96&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sd&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lower_loa&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bias&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1.96&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sd&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Additional statistics for confidence intervals&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
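&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Count only complete pairs, since the statistics above drop missing values (na.rm = TRUE)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;is.na&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;diffs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;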
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;conf_int&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.95&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;qt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;conf_int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;qt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;conf_int&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;se_bias&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;se_loas&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;upper_loa_ci_lower&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;upper_loa&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;se_loas&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;upper_loa_ci_upper&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;upper_loa&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;se_loas&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bias_ci_lower&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bias&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;se_bias&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bias_ci_upper&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bias&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;se_bias&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lower_loa_ci_lower&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lower_loa&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;se_loas&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lower_loa_ci_upper&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lower_loa&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;se_loas&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Create Bland-Altman plot&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;means&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;diffs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;red&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;alpha&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggtitle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Bland-Altman plot for PELOD and pSOFA scores&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ylab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Difference between\ntwo scores&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xlab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Average of two scores&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;theme_bw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_hline&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;yintercept&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lower_loa&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bias&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;upper_loa&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;linetype&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;solid&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_hline&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;yintercept&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lower_loa_ci_lower&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lower_loa_ci_upper&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;upper_loa_ci_lower&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;upper_loa_ci_upper&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;linetype&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;dotted&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
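  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_hline&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;yintercept&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bias_ci_lower&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bias_ci_upper&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;linetype&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;dotted&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;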

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Create scatter plot with a line of equality&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;score2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;red&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;alpha&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_abline&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;intercept&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;slope&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;theme_bw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggtitle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;PELOD and pSOFA scores per sample&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ylab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;pelod.score&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xlab&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;psofa.score&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Combine plots&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;patch1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plot_annotation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tag_levels&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;A&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;patch1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Output the correlation test result and repeatability coefficients&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Repeatability_Score1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;repeatability_score1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Repeatability_Score2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;repeatability_score2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Correlation_Test&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cor_test&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;summary-of-the-analysis-of-method-comparison-studies&quot;&gt;Summary of the analysis of method comparison studies&lt;/h2&gt;

&lt;p&gt;In their landmark paper, D.G. Altman and J.M. Bland outline a structured approach to evaluating whether a new method of medical measurement is as good as or better than an existing one. 
The approach encapsulates several critical components, emphasizing not only statistical analyses but also the importance of effective communication, especially to non-expert audiences. 
Key points from the paper:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Bland-Altman Plot&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;Introduced as a graphical method to assess the agreement between two different measurement techniques. This method involves plotting the difference between two methods against their mean, which assists in identifying any biases and analyzing the limits of agreement between the two methods.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Bias and Limits of Agreement&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;The authors recommend calculating the bias (the mean difference between two methods) and limits of agreement (bias ± 1.96 times the standard deviation of the differences) to quantify the agreement between the two methods. A smaller bias and narrower limits of agreement generally indicate that a new method might be comparable or superior to the existing one.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Investigating Relationship with Measurement Magnitude&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;The authors encourage checking whether the differences between the methods depend on the magnitude of the measurement. If they do, a transformation or a regression approach may be needed to account for the association.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Repeatability&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;&lt;strong&gt;Assessment&lt;/strong&gt;: It’s crucial to assess repeatability for each method separately using replicated measurements on a sample of subjects. This analysis derives from the within-subject standard deviation of the replicates.&lt;/li&gt;
      &lt;li&gt;&lt;strong&gt;Graphical Methods and Correlation Tests&lt;/strong&gt;: Apart from Bland-Altman plots, graphical methods (like plotting standard deviation against the mean) and correlation coefficient tests are suggested for examining the independence of repeatability from the size of measurements.&lt;/li&gt;
      &lt;li&gt;&lt;strong&gt;Potential Influences&lt;/strong&gt;: The paper notes possible influences on measurements, such as observer variability, time of day, or the position of the subject, and distinguishes repeatability from reproducibility (agreement of results under different conditions).&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Comparison of Methods&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;The core emphasis is on directly comparing results obtained by different methods to determine if one can replace another without compromising accuracy for the intended purpose of the measurement. Initial data plotting is encouraged, ideally plotting the difference between methods against the average of the methods, providing insight into disagreement, outliers, and potential trends. Testing for independence between method differences and the size of measurements is necessary, as it can influence the analysis and interpretation of bias and error.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Addressing Alternative Analyses&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;The paper discusses alternative approaches like least squares regression, principal component analysis, and regression models with errors in both variables, but finds them to generally add complexity without necessarily improving the simple comparison intended.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Effective Communication&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;The authors emphasize the importance of communicating results effectively to non-experts, such as clinicians, to facilitate practical application of the findings.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Challenges in Method Comparison Studies&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;The paper highlights the challenges faced in method comparison studies, primarily due to the lack of professional statistical expertise and reliance on incorrect methods replicated from existing literature. It calls for improved awareness among statisticians about this issue and encourages journals to foster the use of appropriate techniques through peer review.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Thus, we can objectively evaluate whether a new measurement method is as good as, or potentially better than, an existing one by assessing agreement, bias, and repeatability, among other factors.&lt;/p&gt;
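
&lt;p&gt;As a minimal sketch of the bias and limits of agreement described above, the following uses simulated paired measurements. The data and the names method_a and method_b are hypothetical and are not taken from the paper or from the PELOD and pSOFA analysis.&lt;/p&gt;

```r
# Hypothetical paired measurements from two methods (simulated, not real data)
set.seed(1)
method_a = rnorm(50, mean = 10, sd = 2)
method_b = method_a + rnorm(50, mean = 0.5, sd = 1)

diffs = method_a - method_b          # per-subject difference between methods
means = (method_a + method_b) / 2    # per-subject average of the two methods

bias = mean(diffs)                   # mean difference between methods
sd_diffs = sd(diffs)                 # standard deviation of the differences
lower_loa = bias - 1.96 * sd_diffs   # lower limit of agreement
upper_loa = bias + 1.96 * sd_diffs   # upper limit of agreement

# Check whether the differences are related to the measurement magnitude
magnitude_test = cor.test(diffs, means)

round(c(bias = bias, lower_loa = lower_loa, upper_loa = upper_loa), 2)
```

&lt;p&gt;If magnitude_test suggested an association, a transformation of the data would be one option to consider before interpreting the limits of agreement.&lt;/p&gt;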

</description>
        <pubDate>Sat, 16 Sep 2023 00:00:00 +0200</pubDate>
        <link>http://localhost:4000//topic/statistics/altman_bland_analysis_of_methods/</link>
        <guid isPermaLink="true">http://localhost:4000//topic/statistics/altman_bland_analysis_of_methods/</guid>
      </item>
    
      <item>
        <title>Correlation, regression and repeated data</title>
        <description>&lt;h1 id=&quot;correlation-regression-and-repeated-data&quot;&gt;Correlation, regression and repeated data&lt;/h1&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#introduction&quot; id=&quot;markdown-toc-introduction&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#correlation-within-subjects-part-1&quot; id=&quot;markdown-toc-correlation-within-subjects-part-1&quot;&gt;Correlation within subjects, part 1&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#correlation-within-subjects-part-2&quot; id=&quot;markdown-toc-correlation-within-subjects-part-2&quot;&gt;Correlation within subjects, part 2&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#references&quot; id=&quot;markdown-toc-references&quot;&gt;References&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;

&lt;p&gt;This topic is introduced as the first paper 
&lt;a class=&quot;citation&quot; href=&quot;#bland1994correlation&quot;&gt;(Bland &amp;amp; Altman, 1994)&lt;/a&gt;
in a series of BMJ statistical notes by Altman &amp;amp; Bland
(&lt;a href=&quot;https://github.com/DylanLawless/notes/blob/main/202106291417-lit-altman_bland.md&quot;&gt; lit-altman_bland.md &lt;/a&gt;):
1. Bland JM, Altman DG. (1994) &lt;a href=&quot;http://www.bmj.com/cgi/content/full/308/6933/896&quot;&gt; Correlation, regression and repeated data.&lt;/a&gt; &lt;b&gt;308&lt;/b&gt;, 896.
&lt;sup id=&quot;a1&quot;&gt;&lt;a href=&quot;#f1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;It concerns the analysis of paired data where there is more than one observation per subject.
They point out that it could be highly misleading to analyse such data by combining repeated observations from several subjects and then calculating the correlation coefficient as if the data were a simple sample.&lt;/p&gt;

&lt;p&gt;Many researchers would assume that it is acceptable to gather repeated measurements for individuals and put all the data together.&lt;/p&gt;

&lt;p&gt;They use simulated data showing five pairs of measurements of two uncorrelated variables X and Y for each of subjects 1, 2, 3, 4, and 5.
Using each subject’s mean values, they obtain a correlation coefficient of r=-0.67, df=3, P=0.22.
However, when they put all 25 observations together they get r=-0.47, df=23, P=0.02.
When the calculation is performed as if there were 25 subjects, the number of degrees of freedom for the significance test is increased incorrectly and a spurious significant result is produced.
This demonstrates that one should not mix observations from different subjects indiscriminately, whether using correlation or the closely related regression analysis.&lt;/p&gt;
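
&lt;p&gt;The difference in degrees of freedom can be sketched with simulated data. This is not the data from the note: the subject levels, noise sizes, and seed below are assumptions chosen for illustration, so the correlation values will differ from those quoted above.&lt;/p&gt;

```r
# Simulated data (assumption): 5 subjects, 5 paired observations each.
# X and Y are generated independently; subjects differ in their overall level.
set.seed(42)
n_subjects = 5
n_obs = 5
subject = factor(rep(seq_len(n_subjects), each = n_obs))
subject_x = rnorm(n_subjects, mean = 50, sd = 10)   # subject-level X
subject_y = rnorm(n_subjects, mean = 50, sd = 10)   # subject-level Y
x = rep(subject_x, each = n_obs) + rnorm(n_subjects * n_obs, sd = 2)
y = rep(subject_y, each = n_obs) + rnorm(n_subjects * n_obs, sd = 2)

# One mean per subject: the test has 5 - 2 = 3 degrees of freedom
mean_x = as.numeric(tapply(x, subject, mean))
mean_y = as.numeric(tapply(y, subject, mean))
test_means = cor.test(mean_x, mean_y)

# Naive pooling treats 25 observations as 25 subjects: 25 - 2 = 23 degrees of freedom
test_pooled = cor.test(x, y)

c(df_means = unname(test_means$parameter),
  df_pooled = unname(test_pooled$parameter))
```

&lt;p&gt;The pooled test reports far more degrees of freedom than the number of independent subjects supports, which is the source of the spurious significance.&lt;/p&gt;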

&lt;h2 id=&quot;correlation-within-subjects-part-1&quot;&gt;Correlation within subjects, part 1&lt;/h2&gt;

&lt;p&gt;The methods to use in these circumstances are later discussed in another note for the BMJ series
&lt;a class=&quot;citation&quot; href=&quot;#bland1995statistics&quot;&gt;(Bland &amp;amp; Altman, 1995)&lt;/a&gt;,
number 11 on the list &lt;a href=&quot;https://github.com/DylanLawless/notes/blob/main/202106291417-lit-altman_bland.md&quot;&gt; lit-altman_bland.md &lt;/a&gt;:
11. Bland JM, Altman DG. (1995) &lt;a href=&quot;http://www.bmj.com/cgi/content/full/310/6977/446&quot;&gt;Calculating correlation coefficients with repeated observations: Part 1, correlation within subjects.&lt;/a&gt; &lt;b&gt;310&lt;/b&gt;, 446.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: in my notes I replace their variable names as follows:&lt;/em&gt;&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;em&gt;X = Paco2&lt;/em&gt;&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Y = pHi&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this note they show an example table using 8 subjects with &lt;em&gt;4-8 observations&lt;/em&gt; for X and Y (&lt;strong&gt;table I&lt;/strong&gt;):&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;Subject&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Y&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;X&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;1&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;6.68&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3.97&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;1&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;6.53&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4.12&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;…&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;…&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;…&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;ul&gt;
  &lt;li&gt;If the question is whether a subject’s average Y is related to the subject’s average X:
    &lt;ul&gt;
      &lt;li&gt;We can use the correlation between the subject means, which they describe in a subsequent note.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;If the question is whether an increase in Y within the individual is associated with an increase in X:
    &lt;ul&gt;
      &lt;li&gt;We want to remove the differences between subjects and look only at changes within subjects.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To look at variation within the subject we can use multiple regression.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Make one variable, X or Y, the outcome, and use the other variable and the subject as predictors.&lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The subject is treated as a categorical factor using dummy variables and so has seven degrees of freedom.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;An analysis of variance table for the regression (table II) shows how the variability in Y can be partitioned into components due to different sources.
    &lt;ul&gt;
      &lt;li&gt;Also known as analysis of covariance&lt;/li&gt;
      &lt;li&gt;Equivalent to fitting parallel lines through each subject’s data (&lt;strong&gt;Figure 1.&lt;/strong&gt;)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;
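
&lt;p&gt;A minimal sketch of this model, assuming simulated data rather than the values in table I. The per-subject counts, effect sizes, and variable names are hypothetical; only the structure (8 subjects, 47 observations, subject as a categorical factor) follows the description above.&lt;/p&gt;

```r
# Simulated data (assumption): 8 subjects, 47 observations in total.
set.seed(7)
n_per_subject = c(4, 5, 6, 6, 6, 6, 6, 8)          # hypothetical counts
n_total = sum(n_per_subject)
subject = factor(rep(1:8, times = n_per_subject))

# Each subject has its own level of Y; within subjects Y falls as X rises
subject_level = rnorm(8, mean = 7.2, sd = 0.25)[as.integer(subject)]
x = rnorm(n_total, mean = 5, sd = 0.6)
y = subject_level - 0.1 * x + rnorm(n_total, sd = 0.1)
dat = data.frame(subject, x, y)

# Subject enters as a factor (7 dummy variables), giving parallel lines per subject
fit = lm(y ~ subject + x, data = dat)

# Sequential analysis of variance: subjects first, then X, then residual
aov_table = anova(fit)
aov_table
```

&lt;p&gt;Note that anova() on an lm fit reports sequential sums of squares, so the subject row here is not adjusted for X. The row for X and the residual row are the ones needed for the within-subject correlation.&lt;/p&gt;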

&lt;div class=&quot;table-wrapper&quot;&gt;

  &lt;table&gt;
    &lt;thead&gt;
      &lt;tr&gt;
        &lt;th style=&quot;text-align: left&quot;&gt;Source of variation&lt;/th&gt;
        &lt;th style=&quot;text-align: right&quot;&gt;Degrees of freedom&lt;/th&gt;
        &lt;th style=&quot;text-align: right&quot;&gt;Sum of squares&lt;/th&gt;
        &lt;th style=&quot;text-align: right&quot;&gt;Mean square&lt;/th&gt;
        &lt;th style=&quot;text-align: right&quot;&gt;Variance ratio (F)&lt;/th&gt;
        &lt;th style=&quot;text-align: right&quot;&gt;Probability&lt;/th&gt;
      &lt;/tr&gt;
    &lt;/thead&gt;
    &lt;tbody&gt;
      &lt;tr&gt;
        &lt;td style=&quot;text-align: left&quot;&gt;Subjects&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;7&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;2.9661&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;0.4237&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;48.3&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;\(&amp;lt;\)0.0001&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td style=&quot;text-align: left&quot;&gt;X&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;0.1153&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;0.1153&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;13.1&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;0.0008&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td style=&quot;text-align: left&quot;&gt;Residual&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;38&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;0.3337&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;0.0088&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt; &lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt; &lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td style=&quot;text-align: left&quot;&gt;Total&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;46&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;3.3139&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt;0.0720&lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt; &lt;/td&gt;
        &lt;td style=&quot;text-align: right&quot;&gt; &lt;/td&gt;
      &lt;/tr&gt;
    &lt;/tbody&gt;
  &lt;/table&gt;

&lt;/div&gt;

&lt;p&gt;Table II. Analysis of variance for the data in table I (as shown in Altman &amp;amp; Bland). At the end of this page, this table is reproduced based on the original data and new R code, as shown. (Note that there is a slight variation between the published version and my replicated version of table II and Figure 1 below, probably due to a minor data entry error by the publisher or authors.)&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The residual sum of squares in &lt;strong&gt;table II&lt;/strong&gt; represents the variation about regression lines.&lt;/li&gt;
  &lt;li&gt;We remove the variation due to subjects (and any other nuisance variables which might be present) and express the variation in Y due to X as a proportion of what’s left:
    &lt;ul&gt;
      &lt;li&gt;(Sum of squares for X)/(Sum of squares for X + residual sum of squares).&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;The magnitude of the correlation coefficient within subjects is the square root of this proportion.
    &lt;ul&gt;
      &lt;li&gt;For &lt;strong&gt;table II&lt;/strong&gt; this is: \(\sqrt{ \frac{0.1153}{0.1153+0.3337} } = 0.51\)&lt;/li&gt;
      &lt;li&gt;The sign of the correlation coefficient is given by the sign of the regression coefficient for X.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Regression slope is -0.108&lt;/li&gt;
  &lt;li&gt;So the correlation coefficient within subjects is -0.51.&lt;/li&gt;
  &lt;li&gt;The P value is found either from:
    &lt;ul&gt;
      &lt;li&gt;F test in the associated analysis of variance table,&lt;/li&gt;
      &lt;li&gt;t test for the regression slope.&lt;/li&gt;
      &lt;li&gt;It doesn’t matter which variable we regress on which; we get the same correlation coefficient and P value either way.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;
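
&lt;p&gt;As a quick check, the arithmetic above can be reproduced in R. This is a minimal sketch that only plugs in the sums of squares and the slope quoted in the bullets; the variable names are my own and it does not refit the model.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Values quoted from table II and the bullets above
ss_x &amp;lt;- 0.1153      # sum of squares for X
ss_resid &amp;lt;- 0.3337  # residual sum of squares
slope &amp;lt;- -0.108     # regression slope for X

# Magnitude is the square root of the proportion; the sign comes from the slope
r_within &amp;lt;- sign(slope) * sqrt(ss_x / (ss_x + ss_resid))
round(r_within, 2)  # -0.51
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;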

&lt;p&gt;Incorrectly calculating the correlation coefficient 
by ignoring the fact that we have 47 observations on only 8 subjects, would produce -0.07, P=0.7.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/altman_bland_repeated_observations.png&quot; width=&quot;100%&quot; /&gt;
&lt;strong&gt;Figure 1.&lt;/strong&gt; Recreation of “(Y) pH against (X) PaCO2 for eight subjects, with parallel lines fitted for each subject” as used in 
&lt;a class=&quot;citation&quot; href=&quot;#bland1995statistics&quot;&gt;(Bland &amp;amp; Altman, 1995)&lt;/a&gt;.
Interestingly, replotting this data shows that their figure was not fully accurate (forgivable before the days of RStudio in 1995, and not important for this example).&lt;/p&gt;

&lt;h2 id=&quot;correlation-within-subjects-part-2&quot;&gt;Correlation within subjects, part 2&lt;/h2&gt;

&lt;p&gt;The second part shows how to find the correlation between the subject means 
&lt;a class=&quot;citation&quot; href=&quot;#bland1995calculating&quot;&gt;(Bland &amp;amp; Altman, 1995)&lt;/a&gt;,
number 12 on the list &lt;a href=&quot;https://github.com/DylanLawless/notes/blob/main/202106291417-lit-altman_bland.md&quot;&gt; lit-altman_bland.md &lt;/a&gt;:
12. Bland JM, Altman DG. (1995) &lt;a href=&quot;http://www.bmj.com/cgi/content/full/310/6980/633&quot;&gt;Calculating correlation coefficients with repeated observations: Part 2, correlation between subjects.&lt;/a&gt; &lt;b&gt;310&lt;/b&gt;, 633.&lt;/p&gt;

&lt;p&gt;In this note they show the example table using the same 8 subjects with &lt;em&gt;one mean observation&lt;/em&gt; for X and Y:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Subject&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Y&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;X&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Number&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;6.49&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4.04&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;2&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;7.05&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5.37&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;7.36&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4.83&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;9&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;7.33&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5.31&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;7.31&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4.40&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;8&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;6&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;7.32&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4.92&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;6&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;7&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;6.91&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;6.60&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;8&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;7.12&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4.78&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;8&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;ul&gt;
  &lt;li&gt;If we want to know whether a subject’s average Y is related to the subject’s average X:
    &lt;ul&gt;
      &lt;li&gt;We can use the correlation between the subject means.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They calculate the usual correlation coefficient for the mean Y and mean X: r=0.09, P=0.8.
This does not take into account the different numbers of measurements on each subject.&lt;/p&gt;
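
&lt;p&gt;The unweighted calculation can be checked in R from the table of subject means. This is a minimal sketch that assumes the rounded means shown in the table; &lt;code&gt;means&lt;/code&gt; and &lt;code&gt;fit&lt;/code&gt; are names chosen here. The result should land close to the r=0.09, P=0.8 quoted above.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Subject means and numbers of observations, copied from the table above
means &amp;lt;- data.frame(
  Subject = 1:8,
  Y = c(6.49, 7.05, 7.36, 7.33, 7.31, 7.32, 6.91, 7.12),
  X = c(4.04, 5.37, 4.83, 5.31, 4.40, 4.92, 6.60, 4.78),
  Number = c(4, 4, 9, 5, 8, 6, 3, 8)
)

# Ordinary (unweighted) Pearson correlation on the 8 pairs of means
fit &amp;lt;- cor.test(means$X, means$Y)
fit$estimate  # correlation coefficient
fit$p.value   # P value on n - 2 = 6 degrees of freedom
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;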

&lt;ul&gt;
  &lt;li&gt;Does this matter? It depends on:
    &lt;ul&gt;
      &lt;li&gt;how different the numbers of observations are,&lt;/li&gt;
      &lt;li&gt;whether the measurements within subjects vary much compared with the means between subjects.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We can calculate a weighted correlation coefficient using the number of observations as weights. 
Many computer programs will calculate this, but it is not difficult to do by hand.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;They denote the mean X and Y for subject i by \(\bar{x}_i\) and \(\bar{y}_i\),&lt;/li&gt;
  &lt;li&gt;the number of observations for subject i by \(m_i\),&lt;/li&gt;
  &lt;li&gt;and the number of subjects by \(n\).&lt;/li&gt;
  &lt;li&gt;The weighted mean of the \(\bar{x}_i\) is 
\(\frac{ \sum{ m_i \bar{x}_i } }{ \sum{ m_i } }\)
&lt;!-- (summation)mixi/(summation)mi. --&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the usual case, where there is one observation per subject, the \(m_i\) 
are all one and this formula gives the usual mean 
\(\frac{ \sum{\bar{x}_i} }{n}\).
&lt;!-- (summation)xi/n. --&gt;&lt;/p&gt;
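
&lt;p&gt;A minimal R sketch of the weighted mean, using the X column and the numbers of observations from the table above (the vector names are my own). Base R’s &lt;code&gt;weighted.mean()&lt;/code&gt; should agree with the formula, and with unit weights it reduces to the ordinary mean.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Subject means of X and numbers of observations, copied from the table above
x_bar &amp;lt;- c(4.04, 5.37, 4.83, 5.31, 4.40, 4.92, 6.60, 4.78)
m &amp;lt;- c(4, 4, 9, 5, 8, 6, 3, 8)

# Weighted mean written out as in the formula
wm_formula &amp;lt;- sum(m * x_bar) / sum(m)

# The same quantity from base R
wm_base &amp;lt;- weighted.mean(x_bar, w = m)

# With one observation per subject the weights are all 1,
# and the weighted mean equals the ordinary mean
wm_unit &amp;lt;- weighted.mean(x_bar, w = rep(1, length(x_bar)))

c(formula = wm_formula, base = wm_base, unit = wm_unit, plain = mean(x_bar))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;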

&lt;p&gt;An easy way to calculate the weighted correlation coefficient is to replace each individual observation by its subject mean. 
Thus the table would yield 47 pairs of observations, 
the first four of which would each be Y (pH) = 6.49 and X (PaCO2) = 4.04, and so on.&lt;/p&gt;

&lt;p&gt;If we use the usual formula for the correlation coefficient on the expanded data we will get the weighted correlation coefficient. 
However, we must be careful when it comes to the P value. 
We have only 8 independent observations (n subjects in general), not 47. 
We should ignore any P value printed by our computer program, and use a statistical table instead.&lt;/p&gt;
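
&lt;p&gt;The expansion trick can be sketched in R with &lt;code&gt;rep()&lt;/code&gt;. The data are the rounded subject means from the table above and the names are my own. For the P value I use the usual t transformation of r on n - 2 degrees of freedom in place of a printed table; that is my assumption about what the table lookup amounts to, not a step taken from the paper.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Subject means and numbers of observations, copied from the table above
y_bar &amp;lt;- c(6.49, 7.05, 7.36, 7.33, 7.31, 7.32, 6.91, 7.12)
x_bar &amp;lt;- c(4.04, 5.37, 4.83, 5.31, 4.40, 4.92, 6.60, 4.78)
m &amp;lt;- c(4, 4, 9, 5, 8, 6, 3, 8)

# Repeat each subject mean once per observation: 47 pairs in total
x_long &amp;lt;- rep(x_bar, times = m)
y_long &amp;lt;- rep(y_bar, times = m)
length(x_long)

# The ordinary correlation on the expanded data is the weighted correlation
r_w &amp;lt;- cor(x_long, y_long)

# P value based on the number of subjects (n = 8), not the 47 expanded rows
n &amp;lt;- length(m)
t_stat &amp;lt;- r_w * sqrt((n - 2) / (1 - r_w^2))
p_value &amp;lt;- 2 * pt(-abs(t_stat), df = n - 2)

c(r = r_w, P = p_value)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Calling &lt;code&gt;cor.test(x_long, y_long)&lt;/code&gt; directly would give the same r but a P value based on 45 degrees of freedom, which is exactly the printed P value the text warns against.&lt;/p&gt;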

&lt;p&gt;The formula for a weighted correlation coefficient is:&lt;/p&gt;

&lt;div class=&quot;math-container&quot;&gt;
$$
\frac{
\sum{m_i \bar{x}_i \bar{y}_i} 
-
\sum{m_i \bar{x}_i} \sum{m_i \bar{y}_i} 
\mathbin{/}
\sum{m_i}
}{
\sqrt{
 ( 
 \sum{m_i \bar{x}_i^2} 
 -
 (\sum{m_i \bar{x}_i})^2 
 \mathbin{/}
 \sum{m_i}
 )
 ( 
 \sum{m_i \bar{y}_i^2} 
 -
 (\sum{m_i \bar{y}_i})^2 
 \mathbin{/}
 \sum{m_i}
 )}
 }
 $$
&lt;/div&gt;

&lt;p&gt;where all summations are from \(i=1\) to \(n\). 
When all the \(m_i\) are equal they cancel out, 
giving the usual formula for a correlation coefficient.&lt;/p&gt;
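
&lt;p&gt;The formula can also be coded directly. The sketch below is my own small helper (&lt;code&gt;weighted_cor&lt;/code&gt; is a hypothetical name, not a base R function), with the sums for X in one factor of the denominator and the sums for Y in the other. It uses the rounded subject means from the table above.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Weighted correlation of subject means x and y with weights m
weighted_cor &amp;lt;- function(x, y, m) {
  sxy &amp;lt;- sum(m * x * y) - sum(m * x) * sum(m * y) / sum(m)
  sxx &amp;lt;- sum(m * x^2) - sum(m * x)^2 / sum(m)
  syy &amp;lt;- sum(m * y^2) - sum(m * y)^2 / sum(m)
  sxy / sqrt(sxx * syy)
}

# Subject means and numbers of observations, copied from the table above
y_bar &amp;lt;- c(6.49, 7.05, 7.36, 7.33, 7.31, 7.32, 6.91, 7.12)
x_bar &amp;lt;- c(4.04, 5.37, 4.83, 5.31, 4.40, 4.92, 6.60, 4.78)
m &amp;lt;- c(4, 4, 9, 5, 8, 6, 3, 8)

r_weighted &amp;lt;- weighted_cor(x_bar, y_bar, m)

# With equal weights the helper reduces to the ordinary correlation
r_equal &amp;lt;- weighted_cor(x_bar, y_bar, rep(1, length(m)))

c(weighted = r_weighted, equal_weights = r_equal, cor = cor(x_bar, y_bar))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;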

&lt;p&gt;For the data in the table the weighted correlation coefficient is r=0.08, P=0.9. 
There is no evidence that subjects with a high Y also have a high X. 
However, as they have already shown in part 1, 
within subjects a rise in Y was associated with a fall in X.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;## Code and raw data for Table I, Analysis of variance table II, and Figure 1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data.frame&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Subject&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;2&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;2&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;2&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;2&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span 
class=&quot;s2&quot;&gt;&quot;3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;4&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;4&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;4&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;4&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;4&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;5&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;5&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span 
class=&quot;s2&quot;&gt;&quot;5&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;5&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;5&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;5&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;5&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;5&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;6&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;6&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;6&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;6&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;6&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;6&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;7&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span 
class=&quot;s2&quot;&gt;&quot;7&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;7&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;8&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;8&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;8&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;8&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;8&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;8&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;8&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;8&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6.68&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6.53&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6.43&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6.33&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6.85&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.06&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.13&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.17&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.42&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.41&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.37&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span 
class=&quot;m&quot;&gt;7.34&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.35&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.28&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.34&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.36&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.33&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.29&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.35&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.35&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.37&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span 
class=&quot;m&quot;&gt;7.27&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.28&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.38&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.29&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.33&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.31&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.33&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6.86&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6.94&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6.92&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.19&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span 
class=&quot;m&quot;&gt;7.29&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.21&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.25&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7.19&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6.77&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6.82&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3.97&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.09&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3.97&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.27&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.37&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.41&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.44&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.67&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3.64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.73&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span 
class=&quot;m&quot;&gt;4.96&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.04&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.22&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.82&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.07&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.67&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.53&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.75&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.51&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.28&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.44&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3.23&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span 
class=&quot;m&quot;&gt;4.46&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.72&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.75&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.99&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.78&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.73&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.93&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.03&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.93&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6.85&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6.44&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6.52&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.28&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span 
class=&quot;m&quot;&gt;4.56&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.34&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.41&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3.69&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6.09&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5.58&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Run the Analysis of Variance with multiple variables&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aov&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;~&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Subject&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#runs the ANOVA test&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#lists the items stored by the test.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;summary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#give the basic ANOVA output.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Output the column totals to match Altman &amp;amp; Bland table&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Stored as aov_totals so that df still holds the data used for Figure 1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aov_totals&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as.data.frame&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unlist&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;summary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aov_totals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,])&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Total Degrees of freedom&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aov_totals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,])&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Total Sum Sq&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aov_totals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;9&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,])&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Total Mean Sq&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;table-wrapper&quot;&gt;

  &lt;table&gt;
    &lt;thead&gt;
      &lt;tr&gt;
        &lt;th&gt; &lt;/th&gt;
        &lt;th&gt;Degrees of freedom&lt;/th&gt;
        &lt;th&gt;Sum of squares&lt;/th&gt;
        &lt;th&gt;Mean Square&lt;/th&gt;
        &lt;th&gt;Variance ratio (F)&lt;/th&gt;
        &lt;th&gt;Probability&lt;/th&gt;
      &lt;/tr&gt;
    &lt;/thead&gt;
    &lt;tbody&gt;
      &lt;tr&gt;
        &lt;td&gt;df$Subject&lt;/td&gt;
        &lt;td&gt;7&lt;/td&gt;
        &lt;td&gt;2.8648&lt;/td&gt;
        &lt;td&gt;0.4093&lt;/td&gt;
        &lt;td&gt;46.60&lt;/td&gt;
        &lt;td&gt;&amp;lt; 2e-16&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;df$X&lt;/td&gt;
        &lt;td&gt;1&lt;/td&gt;
        &lt;td&gt;0.1153&lt;/td&gt;
        &lt;td&gt;0.1153&lt;/td&gt;
        &lt;td&gt;13.13&lt;/td&gt;
        &lt;td&gt;0.000847&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;Residuals&lt;/td&gt;
        &lt;td&gt;38&lt;/td&gt;
        &lt;td&gt;0.3337&lt;/td&gt;
        &lt;td&gt;0.0088&lt;/td&gt;
        &lt;td&gt; &lt;/td&gt;
        &lt;td&gt; &lt;/td&gt;
      &lt;/tr&gt;
    &lt;/tbody&gt;
  &lt;/table&gt;

&lt;/div&gt;

&lt;p&gt;Replicated version of Table II. Analysis of variance for the data in table I.
Default R output headings modified:
Degrees of freedom (Df),
Sum of squares (Sum Sq),
Mean square (Mean Sq),
Variance ratio F (F value),
Probability (Pr(&amp;gt;F)).
(Repeated note: there is a slight variation between the published version and my replicated versions of Table II and Figure 1, probably due to a minor data entry error by the publisher or authors).&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# code used to produce Figure 1. &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;require&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Subject&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Subject&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_smooth&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Subject&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;method&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;lm&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;formula&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;~&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;se&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;FALSE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# The dataset is cited by Bland &amp;amp; Altman 1995 as: &quot;Boyd O, Mackay CJ, Lamb G, Bland JM, Grounds RM, Bennett ED. Comparison of clinical information gained from routine blood-gas analysis and from gastric tonometry for intramural pH. Lancet 1993;341:142–6.&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ol class=&quot;bibliography&quot;&gt;&lt;li&gt;&lt;span id=&quot;bland1994correlation&quot;&gt;Bland, J. M., &amp;amp; Altman, D. G. (1994). Correlation, regression, and repeated data. &lt;i&gt;BMJ: British Medical Journal&lt;/i&gt;, &lt;i&gt;308&lt;/i&gt;(6933), 896.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;bland1995statistics&quot;&gt;Bland, J. M., &amp;amp; Altman, D. G. (1995). Statistics notes: Calculating correlation coefficients with repeated observations: Part 1—correlation within subjects. &lt;i&gt;BMJ&lt;/i&gt;, &lt;i&gt;310&lt;/i&gt;(6977), 446.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;bland1995calculating&quot;&gt;Bland, J. M., &amp;amp; Altman, D. G. (1995). Calculating correlation coefficients with repeated observations: Part 2—Correlation between subjects. &lt;i&gt;BMJ&lt;/i&gt;, &lt;i&gt;310&lt;/i&gt;(6980), 633.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Footnote&lt;/strong&gt;
&lt;b id=&quot;f1&quot;&gt;1&lt;/b&gt; This article is almost identical to the original version in acknowledgment to Altman and Bland. It is adapted here as part of a set of curated, consistent, and minimal examples of statistics required for human genomic analysis.
&lt;a href=&quot;#a1&quot;&gt;↩&lt;/a&gt;&lt;/p&gt;
</description>
        <pubDate>Tue, 29 Jun 2021 00:00:00 +0200</pubDate>
        <link>http://localhost:4000//topic/statistics/altman_bland_correlation/</link>
        <guid isPermaLink="true">http://localhost:4000//topic/statistics/altman_bland_correlation/</guid>
      </item>
    
      <item>
        <title>Odds ratios, SE &amp; CI</title>
        <description>&lt;h1 id=&quot;odds-ratios-se--ci&quot;&gt;Odds ratios, SE &amp;amp; CI&lt;/h1&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#introduction&quot; id=&quot;markdown-toc-introduction&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#odds-ratio&quot; id=&quot;markdown-toc-odds-ratio&quot;&gt;Odds ratio&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#perspective-1&quot; id=&quot;markdown-toc-perspective-1&quot;&gt;Perspective 1&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#perspective-2&quot; id=&quot;markdown-toc-perspective-2&quot;&gt;Perspective 2&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#standard-error&quot; id=&quot;markdown-toc-standard-error&quot;&gt;Standard error&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#confidence-interval&quot; id=&quot;markdown-toc-confidence-interval&quot;&gt;Confidence interval&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#references&quot; id=&quot;markdown-toc-references&quot;&gt;References&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;

&lt;p&gt;Altman &amp;amp; Bland review the use of odds ratio (OR), standard error (SE), and confidence interval (CI) with some examples in 
&lt;a class=&quot;citation&quot; href=&quot;#bland2000odds&quot;&gt;(Bland &amp;amp; Altman, 2000)&lt;/a&gt;; 
the 42nd paper on the list of statistical notes in their BMJ series,
(&lt;a href=&quot;https://github.com/DylanLawless/notes/blob/main/202106291417-lit-altman_bland.md&quot;&gt; lit-altman_bland.md &lt;/a&gt;):
42. Bland JM, Altman DG. (2000) &lt;a href=&quot;http://www.bmj.com/cgi/content/full/320/7247/1468&quot;&gt;The odds ratio.&lt;/a&gt; &lt;b&gt;320&lt;/b&gt;, 1468.
&lt;sup id=&quot;a1&quot;&gt;&lt;a href=&quot;#f1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;In reproducing their examples I use \(X\) and \(Y\):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;em&gt;X = Eczema&lt;/em&gt;&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Y = Hay fever&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As an example dataset, they cite the following table: association between hay fever (Y) and eczema (X) in 11-year-old children.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt; &lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;Y Yes&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;Y No&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;Y Total&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;b&gt;X Yes&lt;/b&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;141&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;420&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;561&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;b&gt;X No&lt;/b&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;928&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;13 525&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;14 453&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;b&gt;X Total&lt;/b&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1069&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;13 945&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;15 014&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h2 id=&quot;odds-ratio&quot;&gt;Odds ratio&lt;/h2&gt;

&lt;h3 id=&quot;perspective-1&quot;&gt;Perspective 1&lt;/h3&gt;
&lt;p&gt;The &lt;em&gt;probability&lt;/em&gt; that a child with X will also have Y
is estimated by the proportion \(\dfrac{141}{561}\) (25.1%)
and the &lt;em&gt;odds&lt;/em&gt; is estimated by \(\dfrac{141}{420}\).&lt;/p&gt;

&lt;p&gt;Similarly, for children without X the &lt;em&gt;probability&lt;/em&gt; of having Y is estimated by \(\dfrac{928}{14 453}\) (6.4%) 
and the odds is \(\dfrac{928}{13 525}\).&lt;/p&gt;

&lt;p&gt;They compare the groups in several ways:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;By the difference between the proportions 
\(\dfrac{141}{561} - \dfrac{928}{14 453} = 0.187\) (or 18.7 percentage points).&lt;/li&gt;
  &lt;li&gt;The ratio of the proportions (also called the relative risk) 
\(\dfrac{ \left(\dfrac{141}{561}\right) }{ \left(\dfrac{928}{14 453}\right) } = 3.91\).&lt;/li&gt;
  &lt;li&gt;The OR
\(\dfrac{ \left(\dfrac{141}{420}\right) }{ \left(\dfrac{928}{13 525}\right) } = 4.89\).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;perspective-2&quot;&gt;Perspective 2&lt;/h3&gt;
&lt;p&gt;Looking at the table the other way round, 
what is the &lt;em&gt;probability&lt;/em&gt; that a child with Y will also have X?&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;proportion&lt;/em&gt; is \(\dfrac{141}{1069}\) (13.2%) 
and the &lt;em&gt;odds&lt;/em&gt; is \(\dfrac{141}{928}\).&lt;/p&gt;

&lt;p&gt;For a child without Y, the &lt;em&gt;proportion&lt;/em&gt; with X is 
\(\dfrac{420}{13 945}\) (3.0%) 
and the &lt;em&gt;odds&lt;/em&gt; is \(\dfrac{420}{13 525}\).&lt;/p&gt;

&lt;p&gt;Comparing the proportions this way,&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;The difference is 
\(\dfrac{141}{1069} - \dfrac{420}{13 945} = 0.102\) (or 10.2 percentage points);&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;The ratio (relative risk) 
\(\dfrac{ \left(\dfrac{141}{1069}\right) }{ \left(\dfrac{420}{13 945}\right) } = 4.38\);&lt;/li&gt;
  &lt;li&gt;The OR
\(\dfrac{ \left(\dfrac{141}{928}\right) }{ \left(\dfrac{420}{13 525}\right) } = 4.89\).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The OR is the same whichever way round we look at the table, 
but the difference and ratio of proportions are not.
This is because the two ORs are&lt;/p&gt;

&lt;p&gt;\(\dfrac{ 141 / \textbf{420} }{ \textbf{928} / 13 525 }\) and
\(\dfrac{ 141 / \textbf{928} }{ \textbf{420} / 13 525 }\) 
which can both be rearranged to give
\(\dfrac{ 141 \times 13 525 }{ 928 \times 420 }\).&lt;/p&gt;

&lt;p&gt;Swapping orders for rows and columns produces the same OR.&lt;/p&gt;

&lt;p&gt;Swapping orders for either only rows or only columns produces the reciprocal of the OR, \(1/4.89 = 0.204\).&lt;/p&gt;

&lt;p&gt;Thus, the OR can indicate the strength of the relationship.
The OR cannot be negative but is not limited in the positive direction, producing a skewed distribution. 
Reversing the order of categories for one variable simply reverses the sign of the log OR:&lt;/p&gt;

&lt;p&gt;\(\log_{e}(4.89) = 1.59\),&lt;/p&gt;

&lt;p&gt;\(\log_{e}(0.204) = - 1.59\).&lt;/p&gt;
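
&lt;p&gt;As a minimal sketch, the comparisons in this section can be reproduced in R from the four cell counts of the example table. The variable names are my own choice for illustration and do not come from Altman &amp;amp; Bland.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Cell counts from the example table (X = eczema, Y = hay fever)
n_xy      &amp;lt;- 141    # X yes, Y yes
n_x_noy   &amp;lt;- 420    # X yes, Y no
n_nox_y   &amp;lt;- 928    # X no,  Y yes
n_nox_noy &amp;lt;- 13525  # X no,  Y no

# Perspective 1: proportion with Y among children with and without X
p_y_x   &amp;lt;- n_xy / (n_xy + n_x_noy)
p_y_nox &amp;lt;- n_nox_y / (n_nox_y + n_nox_noy)
risk_difference &amp;lt;- p_y_x - p_y_nox   # about 0.187
relative_risk   &amp;lt;- p_y_x / p_y_nox   # about 3.91

# Odds ratio from each perspective
or_rows &amp;lt;- (n_xy / n_x_noy) / (n_nox_y / n_nox_noy)   # odds of Y, by X
or_cols &amp;lt;- (n_xy / n_nox_y) / (n_x_noy / n_nox_noy)   # odds of X, by Y

# Swapping the order of the rows only gives the reciprocal
or_swapped &amp;lt;- (n_nox_y / n_nox_noy) / (n_xy / n_x_noy)

round(c(or_rows, or_cols, or_swapped), 3)
round(log(c(or_rows, or_swapped)), 2)   # same size, opposite sign
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Both perspectives reduce to the same cross-product, which is why &lt;code&gt;or_rows&lt;/code&gt; and &lt;code&gt;or_cols&lt;/code&gt; agree.&lt;/p&gt;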

&lt;h2 id=&quot;standard-error&quot;&gt;Standard error&lt;/h2&gt;
&lt;p&gt;The standard error (SE) can be calculated for the log OR and hence a confidence interval (CI).&lt;/p&gt;

&lt;p&gt;The SE of log OR is simply estimated by the square root of the sum of the reciprocals of the four frequencies. 
For the example,&lt;/p&gt;

&lt;div class=&quot;table-wrapper&quot;&gt;

\[\text{SE}(\log \text{OR}) =
\sqrt{
\frac{1}{141} + 
\frac{1}{420} + 
\frac{1}{928} + 
\frac{1}{13 525}}
=
0.103\]

&lt;/div&gt;
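
&lt;p&gt;The same calculation as a short R sketch; &lt;code&gt;counts&lt;/code&gt; and &lt;code&gt;se_log_or&lt;/code&gt; are illustrative names of my own.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# The four cell frequencies of the example table
counts &amp;lt;- c(141, 420, 928, 13525)

# SE of the log OR: square root of the sum of the reciprocals
se_log_or &amp;lt;- sqrt(sum(1 / counts))
round(se_log_or, 3)   # 0.103
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;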

&lt;h2 id=&quot;confidence-interval&quot;&gt;Confidence interval&lt;/h2&gt;

&lt;p&gt;A 95% confidence interval (CI) for the log OR is obtained as 1.96 standard errors on either side of the estimate.&lt;/p&gt;

&lt;p&gt;For the example, 
the log OR is 
\(\log_{e} (4.89) = 1.588\) 
and the confidence interval is 
\(1.588 \pm 1.96 \times 0.103\), 
which gives \(1.386\) to \(1.790\).&lt;/p&gt;

&lt;p&gt;We take the antilog of these limits to give a 95% CI for the OR itself, 
from 
\(\exp(1.386) = 4.00\) to 
\(\exp(1.790) = 5.99\).&lt;/p&gt;
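
&lt;p&gt;A minimal R sketch of the interval, assuming the usual normal approximation on the log scale. It uses &lt;code&gt;qnorm(0.975)&lt;/code&gt; rather than the rounded 1.96, and the variable names are my own.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Cell frequencies in the order: X yes/Y yes, X yes/Y no, X no/Y yes, X no/Y no
counts &amp;lt;- c(141, 420, 928, 13525)

odds_ratio &amp;lt;- (counts[1] * counts[4]) / (counts[2] * counts[3])
se_log_or  &amp;lt;- sqrt(sum(1 / counts))

# 95% limits on the log scale, then back-transformed with exp()
z &amp;lt;- qnorm(0.975)   # about 1.96
ci_log_or &amp;lt;- log(odds_ratio) + c(-1, 1) * z * se_log_or
ci_or     &amp;lt;- exp(ci_log_or)

round(ci_log_or, 3)   # limits for the log OR
round(ci_or, 2)       # limits for the OR itself
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;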

&lt;p&gt;The observed OR, 4.89, 
is not in the centre of the confidence interval because of the asymmetrical nature of the OR scale. 
For this reason, in graphs ORs are often plotted using a logarithmic scale. 
The OR is 1 when there is no relationship. 
We can test the null hypothesis that the OR is 1 by the usual 
\({\chi}^2\) test for a two by two table.&lt;/p&gt;
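
&lt;p&gt;As a sketch, this test can be run in R with &lt;code&gt;chisq.test()&lt;/code&gt;. Altman &amp;amp; Bland do not say whether a continuity correction is applied, so switching it off here with &lt;code&gt;correct = FALSE&lt;/code&gt; is my assumption.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Example table: rows = X (yes, no), columns = Y (yes, no)
tab &amp;lt;- matrix(c(141, 420,
                928, 13525),
              nrow = 2, byrow = TRUE)

# Pearson chi-squared test of no association (equivalent to OR = 1);
# correct = FALSE switches off the Yates continuity correction
chi_test &amp;lt;- chisq.test(tab, correct = FALSE)
chi_test$statistic
chi_test$p.value
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;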

&lt;p&gt;Despite their usefulness, ORs can cause difficulties in interpretation. 
Altman &amp;amp; Bland state that they will review this debate, and discuss ORs in logistic regression and case-control studies, in future Statistics Notes.&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ol class=&quot;bibliography&quot;&gt;&lt;li&gt;&lt;span id=&quot;bland2000odds&quot;&gt;Bland, J. M., &amp;amp; Altman, D. G. (2000). The odds ratio. &lt;i&gt;BMJ&lt;/i&gt;, &lt;i&gt;320&lt;/i&gt;(7247), 1468.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Footnote&lt;/strong&gt;
&lt;b id=&quot;f1&quot;&gt;1&lt;/b&gt; This article is almost identical to the original version in acknowledgment to Altman and Bland. It is adapted here as part of a set of curated, consistent, and minimal examples of statistics required for human genomic analysis.
&lt;a href=&quot;#a1&quot;&gt;↩&lt;/a&gt;&lt;/p&gt;
</description>
        <pubDate>Sun, 04 Jul 2021 00:00:00 +0200</pubDate>
        <link>http://localhost:4000//topic/statistics/altman_bland_odds_ratios/</link>
        <guid isPermaLink="true">http://localhost:4000//topic/statistics/altman_bland_odds_ratios/</guid>
      </item>
    
      <item>
        <title>Receiver operating characteristic plots</title>
        <description>&lt;h1 id=&quot;receiver-operating-characteristic-plots&quot;&gt;Receiver operating characteristic plots&lt;/h1&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#receiver-operating-characteristic-plots-1&quot; id=&quot;markdown-toc-receiver-operating-characteristic-plots-1&quot;&gt;Receiver operating characteristic plots&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#example-roc-cuvre&quot; id=&quot;markdown-toc-example-roc-cuvre&quot;&gt;Example ROC curve&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#references&quot; id=&quot;markdown-toc-references&quot;&gt;References&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;receiver-operating-characteristic-plots-1&quot;&gt;Receiver operating characteristic plots&lt;/h2&gt;
&lt;p&gt;This article covers the fifth paper in the series of statistics notes &lt;a class=&quot;citation&quot; href=&quot;#altman1994diagnostic&quot;&gt;(Altman &amp;amp; Bland, 1994)&lt;/a&gt;
(&lt;a href=&quot;https://github.com/DylanLawless/notes/blob/main/202106291417-lit-altman_bland.md&quot;&gt; lit-altman_bland.md &lt;/a&gt;): 5. Altman DG, Bland JM. (1994) &lt;a href=&quot;http://www.bmj.com/cgi/content/full/309/6948/188&quot;&gt; Diagnostic tests 3: receiver operating characteristic plots.&lt;/a&gt; &lt;b&gt;309&lt;/b&gt;, 188,
and concerns &lt;em&gt;quantitative diagnostic tests&lt;/em&gt;.
&lt;sup id=&quot;a1&quot;&gt;&lt;a href=&quot;#f1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;
Diagnosis based on yes or no answers is covered in another note by Bland and Altman.
The same statistical methods for quantifying yes or no answers can be applied here when there is a cut-off threshold for defining &lt;em&gt;normal&lt;/em&gt; and &lt;em&gt;abnormal&lt;/em&gt; test results.
For simplicity, I will call someone who is diagnosed by a clinical test a “&lt;em&gt;case&lt;/em&gt;” and someone who is not diagnosed (healthy or normal) a “&lt;em&gt;control&lt;/em&gt;”.
These terms are not strictly correct, but they are much easier to read repeatedly than “people who are diagnosed by a test”.&lt;/p&gt;

&lt;p&gt;The receiver operating characteristic (ROC) plot can be used to measure how test results compare between cases and controls.
Altman and Bland mention that this method was developed in the 1950s for evaluating radar signal detection.
An aside for history buffs, from &lt;a href=&quot;https://en.wikipedia.org/wiki/Receiver_operating_characteristic&quot;&gt;Wikipedia&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
  &lt;p&gt;The ROC curve was first used during World War II for the analysis of radar signals before it was employed in signal detection theory. Following the attack on Pearl Harbor in 1941, the United States army began new research to increase the prediction of correctly detected Japanese aircraft from their radar signals. For these purposes they measured the ability of a radar receiver operator to make these important distinctions, which was called the Receiver Operating Characteristic.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;img src=&quot;/images/receiver_operating_characteristic_plots.png&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The example shown in Figure 1 uses graft versus host disease, with an index measurement whose definition is not important.
In our terminology, &lt;em&gt;Yes&lt;/em&gt; indicates &lt;em&gt;cases&lt;/em&gt; and &lt;em&gt;No&lt;/em&gt; indicates &lt;em&gt;controls&lt;/em&gt;. 
The usefulness of the test for predicting graft versus host disease will clearly relate to the degree of non-overlap between the two distributions.
A ROC plot is obtained by calculating the sensitivity and specificity of every observed data value and plotting, as in Figure 2:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Y axis = sensitivity,&lt;/li&gt;
  &lt;li&gt;X axis = 1 - specificity.&lt;/li&gt;
&lt;/ul&gt;
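
&lt;p&gt;To make the construction concrete, here is a minimal base R sketch. The measurements are simulated and hypothetical (they are not the graft versus host disease data), and I assume that a value at or above the cut-off counts as a positive test.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;set.seed(1)
# Hypothetical index measurements (simulated; not the data in Figure 1)
cases    &amp;lt;- rnorm(20, mean = 2, sd = 1)
controls &amp;lt;- rnorm(30, mean = 1, sd = 1)

# Use every observed value as a cut-off; Inf adds the point (0, 0)
cutoffs &amp;lt;- c(sort(unique(c(cases, controls))), Inf)

# Assumption: a measurement at or above the cut-off is a positive test
sensitivity &amp;lt;- sapply(cutoffs, function(k) mean(cases &amp;gt;= k))
specificity &amp;lt;- sapply(cutoffs, function(k) mean(controls &amp;lt; k))

# Order the points from bottom left to top right before drawing
roc &amp;lt;- data.frame(fpr = 1 - specificity, tpr = sensitivity)
roc &amp;lt;- roc[order(roc$fpr, roc$tpr), ]

plot(roc$fpr, roc$tpr, type = &quot;l&quot;,
     xlab = &quot;1 - specificity&quot;, ylab = &quot;Sensitivity&quot;)
abline(0, 1, lty = 2)   # reference line for a test that does not discriminate
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;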

&lt;p&gt;A test that perfectly separates cases and controls would have a curve that follows the Y axis and then the top of the plot.
A test that does not work would produce a straight line along the diagonal from bottom left to top right.
In practice, overlaps always occur, so the curve usually lies somewhere in between, as shown in Figure 2.&lt;/p&gt;

&lt;p&gt;The performance of the test (diagnostic accuracy) is reported as the &lt;em&gt;area under the ROC curve&lt;/em&gt;.
The area is equal to the probability that a randomly chosen case has a higher measurement than a randomly chosen control.
This probability is 0.5 for a test that does not work (e.g. a coin toss; the straight diagonal line).
A clinical test needs enough of this discriminatory power to separate cases from controls.&lt;/p&gt;
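
&lt;p&gt;This probability interpretation can be checked directly. The sketch below uses simulated, hypothetical measurements and compares every case with every control; the cross-check relies on the fact that the statistic reported by &lt;code&gt;wilcox.test()&lt;/code&gt; counts the pairs in which the first sample is larger.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;set.seed(1)
# Hypothetical index measurements (simulated for illustration only)
cases    &amp;lt;- rnorm(20, mean = 2, sd = 1)
controls &amp;lt;- rnorm(30, mean = 1, sd = 1)

# Compare every case with every control
pairs &amp;lt;- expand.grid(case = cases, control = controls)

# Proportion of pairs where the case is higher; ties count as one half
auc &amp;lt;- mean(pairs$case &amp;gt; pairs$control) +
  0.5 * mean(pairs$case == pairs$control)
auc

# Cross-check: Mann-Whitney statistic divided by the number of pairs
w &amp;lt;- wilcox.test(cases, controls)$statistic
unname(w) / (length(cases) * length(controls))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;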

&lt;p&gt;At this stage we have a global assessment of discriminatory power, showing that a test can separate cases and controls.
Choosing a cut-off for clinical use also requires a local assessment. 
As Altman and Bland note, the simple approach of minimising “errors” (equivalent to maximising the sum of the sensitivity and specificity) is not necessarily best. 
We must also consider:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;the costs of false negatives,&lt;/li&gt;
  &lt;li&gt;the costs of false positives,&lt;/li&gt;
  &lt;li&gt;and the prevalence of disease in the test cohort.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In their example:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;cancer in the general population
    &lt;ul&gt;
      &lt;li&gt;most cases should be detected (high sensitivity),&lt;/li&gt;
      &lt;li&gt;at the cost of many false positives (low specificity), who could then be eliminated by a further test.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The ROC plot is also useful for comparing two or more measures. 
A test whose curve lies wholly above another is clearly the better test. 
Altman and Bland cite a review of methods for comparing the areas under two curves for both paired and unpaired data.&lt;/p&gt;

&lt;p&gt;In my (recommended) pocket-sized copy of 
&lt;em&gt;Oxford handbook of medical statistics&lt;/em&gt;
&lt;a class=&quot;citation&quot; href=&quot;#peacock2011oxford&quot;&gt;(Peacock &amp;amp; Peacock, 2011)&lt;/a&gt;,
a clinical example uses a chosen cut-off with sensitivity \(&amp;gt;81\%\) and specificity \(28\%\). 
The area under the ROC curve was 0.65, thus a moderately high predictive power. 
The accuracy (proportion of the cohort correctly classified) was 
\(\frac{ 30 + 42 }{ 185 } = 39\%\), calculated as:&lt;/p&gt;

&lt;div class=&quot;table-wrapper&quot;&gt;

\[\frac{\text{No. cases above cutoff} + \text{No. controls below cutoff }}{ \text{cohort total} }\]

&lt;/div&gt;

&lt;h2 id=&quot;example-roc-cuvre&quot;&gt;Example ROC curve&lt;/h2&gt;
&lt;p&gt;To implement this method, I include here an example in R code.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/receiver_operating_characteristic_curve.png&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# Modified example from https://stackoverflow.com/questions/31138751/roc-curve-from-training-data-in-caret&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;caret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mlbench&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Dataset&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Sonar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# An example dataset for classification of sonar signals using a neural network. The task is to train a network to discriminate between sonar signals bounced off a metal cylinder and those bounced off a roughly cylindrical rock. Each pattern is a set of 60 numbers in the range 0.0 to 1.0. Each number represents the energy within a particular frequency band, integrated over a certain period of time. Labels: &quot;R&quot; if the object is a rock and &quot;M&quot; if it is a mine (metal cylinder). &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ctrl&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;trainControl&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;method&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;cv&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
                     &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;summaryFunction&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;twoClassSummary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
                     &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;classProbs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                     &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;savePredictions&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rfFit&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;train&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Class&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;~&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Sonar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
               &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;method&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;rf&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;preProc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;center&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;scale&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
               &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;trControl&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ctrl&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pROC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Select a parameter setting&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;selectedIndices&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rfFit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mtry&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Plot:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plot.roc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rfFit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;selectedIndices&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rfFit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;selectedIndices&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;print.auc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  
         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;print.thres&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#thresh .557 shown&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
         &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legacy.axes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# With ggplot&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plotROC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rfFit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;selectedIndices&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
       &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ifelse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rfFit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;selectedIndices&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;M&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
           &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
           &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_roc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hjust&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-0.4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vjust&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;coord_equal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;style_roc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Store the plot as g so that calc_auc() can read it, then annotate with the AUC&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;annotate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;text&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.75&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.25&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;paste&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;AUC =&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;round&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;calc_auc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AUC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ol class=&quot;bibliography&quot;&gt;&lt;li&gt;&lt;span id=&quot;altman1994diagnostic&quot;&gt;Altman, D. G., &amp;amp; Bland, J. M. (1994). Diagnostic tests 3: receiver operating characteristic plots. &lt;i&gt;BMJ: British Medical Journal&lt;/i&gt;, &lt;i&gt;309&lt;/i&gt;(6948), 188.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;peacock2011oxford&quot;&gt;Peacock, J., &amp;amp; Peacock, P. (2011). &lt;i&gt;Oxford handbook of medical statistics&lt;/i&gt;. Oxford University Press.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Footnote&lt;/strong&gt;
&lt;b id=&quot;f1&quot;&gt;1&lt;/b&gt; This article closely follows the original version, in acknowledgement of Altman and Bland. It is adapted here as part of a set of curated, consistent, and minimal examples of statistics required for human genomic analysis.
&lt;a href=&quot;#a1&quot;&gt;↩&lt;/a&gt;&lt;/p&gt;
</description>
        <pubDate>Fri, 16 Jul 2021 00:00:00 +0000</pubDate>
        <link>http://localhost:4000//topic/statistics/altman_bland_roc_curve/</link>
        <guid isPermaLink="true">http://localhost:4000//topic/statistics/altman_bland_roc_curve/</guid>
      </item>
    
      <item>
        <title>Sensitivity and specificity</title>
        <description>&lt;h1 id=&quot;sensitivity-and-specificity&quot;&gt;Sensitivity and specificity&lt;/h1&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#sensitivity-and-specificity-1&quot; id=&quot;markdown-toc-sensitivity-and-specificity-1&quot;&gt;Sensitivity and specificity&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#confidence-intervals&quot; id=&quot;markdown-toc-confidence-intervals&quot;&gt;Confidence intervals&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#quantifying-the-diagnostic-ability&quot; id=&quot;markdown-toc-quantifying-the-diagnostic-ability&quot;&gt;Quantifying the diagnostic ability&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#references&quot; id=&quot;markdown-toc-references&quot;&gt;References&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;sensitivity-and-specificity-1&quot;&gt;Sensitivity and specificity&lt;/h2&gt;
&lt;p&gt;This is the third paper in the list of BMJ statistics notes by Altman and Bland
(&lt;a href=&quot;https://github.com/DylanLawless/notes/blob/main/202106291417-lit-altman_bland.md&quot;&gt;lit-altman_bland.md&lt;/a&gt;):
&lt;a class=&quot;citation&quot; href=&quot;#altman1994diagnostic1&quot;&gt;(Altman &amp;amp; Bland, 1994)&lt;/a&gt;, Altman DG, Bland JM. (1994) &lt;a href=&quot;http://www.bmj.com/cgi/content/full/308/6943/1552&quot;&gt;Diagnostic tests 1: sensitivity and specificity.&lt;/a&gt; BMJ &lt;b&gt;308&lt;/b&gt;, 1552.
&lt;sup id=&quot;a1&quot;&gt;&lt;a href=&quot;#f1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;A simple diagnostic test, such as an x-ray, is used to classify patients into two groups according to the presence or absence of a symptom or sign:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Presence of a symptom or sign
    &lt;ul&gt;
      &lt;li&gt;Yes&lt;/li&gt;
      &lt;li&gt;No&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Altman and Bland use the following cited example:
the results of a liver scan (the &lt;strong&gt;test&lt;/strong&gt;) are compared with the correct diagnosis (the &lt;strong&gt;true status&lt;/strong&gt;), established by necropsy, biopsy, or surgical inspection.
In other words, how good is the scan at giving the correct diagnosis?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Table 1.&lt;/strong&gt; &lt;em&gt;Relation between results of liver scan and correct diagnosis.&lt;/em&gt;&lt;/p&gt;

&lt;div class=&quot;table-wrapper&quot;&gt;

  &lt;table&gt;
    &lt;thead&gt;
      &lt;tr&gt;
        &lt;th style=&quot;text-align: left&quot;&gt;Liver scan&lt;/th&gt;
        &lt;th style=&quot;text-align: center&quot;&gt;Pathology (diagnosis)&lt;/th&gt;
        &lt;th style=&quot;text-align: center&quot;&gt; &lt;/th&gt;
        &lt;th style=&quot;text-align: center&quot;&gt; &lt;/th&gt;
      &lt;/tr&gt;
    &lt;/thead&gt;
    &lt;tbody&gt;
      &lt;tr&gt;
        &lt;td style=&quot;text-align: left&quot;&gt; &lt;/td&gt;
        &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;Abnormal (+)&lt;/strong&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;Normal (-)&lt;/strong&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td style=&quot;text-align: left&quot;&gt;&lt;strong&gt;Abnormal (+)&lt;/strong&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align: center&quot;&gt;&lt;em&gt;231&lt;/em&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align: center&quot;&gt;32&lt;/td&gt;
        &lt;td style=&quot;text-align: center&quot;&gt;263&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td style=&quot;text-align: left&quot;&gt;&lt;strong&gt;Normal (-)&lt;/strong&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align: center&quot;&gt;27&lt;/td&gt;
        &lt;td style=&quot;text-align: center&quot;&gt;&lt;em&gt;54&lt;/em&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align: center&quot;&gt;81&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td style=&quot;text-align: left&quot;&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align: center&quot;&gt;258&lt;/td&gt;
        &lt;td style=&quot;text-align: center&quot;&gt;86&lt;/td&gt;
        &lt;td style=&quot;text-align: center&quot;&gt;344&lt;/td&gt;
      &lt;/tr&gt;
    &lt;/tbody&gt;
  &lt;/table&gt;

&lt;/div&gt;

&lt;p&gt;Here, positive and negative refer to the presence or absence of the condition according to pathology, regardless of the scan result. The patients therefore divide into:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Abnormal pathology (condition present)
    &lt;ul&gt;
      &lt;li&gt;258 true positives&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Normal pathology (condition absent)
    &lt;ul&gt;
      &lt;li&gt;86 true negatives&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The proportions of these two groups that were also correctly diagnosed by the scan were \(231/258=0.90\) and \(54/86=0.63\), respectively.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Sensitivity
    &lt;ul&gt;
      &lt;li&gt;Proportion of &lt;strong&gt;true positives&lt;/strong&gt; that are &lt;strong&gt;correctly identified&lt;/strong&gt; by the test.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Specificity
    &lt;ul&gt;
      &lt;li&gt;Proportion of &lt;strong&gt;true negatives&lt;/strong&gt; that are &lt;strong&gt;correctly identified&lt;/strong&gt; by the test.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Based on Altman and Bland’s example sample, 
we expect 90% of patients with abnormal pathology to have abnormal (positive) liver scans (&lt;em&gt;the sensitivity&lt;/em&gt;), 
and 63% of those with normal pathology to have normal (negative) liver scans (&lt;em&gt;the specificity&lt;/em&gt;).&lt;/p&gt;
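&lt;p&gt;As a minimal sketch, the two proportions can be reproduced in R from the counts in Table 1. The object names (&lt;code&gt;scan_table&lt;/code&gt;, &lt;code&gt;sensitivity&lt;/code&gt;, &lt;code&gt;specificity&lt;/code&gt;) are illustrative and do not come from the original note.&lt;/p&gt;

&lt;div class=&quot;language-r highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Counts from Table 1: rows = liver scan result, columns = pathology
scan_table &amp;lt;- matrix(c(231, 32,
                       27, 54),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(scan = c(&quot;abnormal&quot;, &quot;normal&quot;),
                                     pathology = c(&quot;abnormal&quot;, &quot;normal&quot;)))

# Sensitivity: abnormal scans among patients with abnormal pathology
sensitivity &amp;lt;- scan_table[&quot;abnormal&quot;, &quot;abnormal&quot;] / sum(scan_table[, &quot;abnormal&quot;])
# Specificity: normal scans among patients with normal pathology
specificity &amp;lt;- scan_table[&quot;normal&quot;, &quot;normal&quot;] / sum(scan_table[, &quot;normal&quot;])

round(c(sensitivity = sensitivity, specificity = specificity), 2)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The rounded values should agree with the 0.90 and 0.63 quoted in the text.&lt;/p&gt;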

&lt;h2 id=&quot;confidence-intervals&quot;&gt;Confidence intervals&lt;/h2&gt;
&lt;p&gt;The sensitivity and specificity are proportions, so confidence intervals can be calculated.
This uses standard methods for proportions
&lt;a class=&quot;citation&quot; href=&quot;#gardner1989calculating&quot;&gt;(Gardner &amp;amp; Altman, 1989)&lt;/a&gt;.&lt;/p&gt;
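&lt;p&gt;A minimal sketch of one such calculation in R, assuming the traditional normal-approximation interval \(p \pm 1.96\sqrt{p(1-p)/n}\). The helper &lt;code&gt;wald_ci&lt;/code&gt; is hypothetical and written only for this illustration; base R’s &lt;code&gt;binom.test()&lt;/code&gt; gives an exact interval for comparison.&lt;/p&gt;

&lt;div class=&quot;language-r highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Normal-approximation (Wald) confidence interval for a proportion r/n.
# wald_ci is an illustrative helper, not a function from any package.
wald_ci &amp;lt;- function(r, n, level = 0.95) {
  p &amp;lt;- r / n
  z &amp;lt;- qnorm(1 - (1 - level) / 2)
  se &amp;lt;- sqrt(p * (1 - p) / n)
  c(estimate = p, lower = p - z * se, upper = p + z * se)
}

round(wald_ci(231, 258), 3)  # sensitivity
round(wald_ci(54, 86), 3)    # specificity

# Exact (Clopper-Pearson) interval from base R, for comparison
binom.test(231, 258)$conf.int
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Under this approximation the interval for the sensitivity is roughly 0.86 to 0.93, and for the specificity roughly 0.53 to 0.73; the smaller group of 86 patients gives the wider interval.&lt;/p&gt;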

&lt;h2 id=&quot;quantifying-the-diagnostic-ability&quot;&gt;Quantifying the diagnostic ability&lt;/h2&gt;
&lt;p&gt;Sensitivity and specificity are one approach to quantifying the diagnostic ability of the test.
In this case, we already have the final results of &lt;strong&gt;tests&lt;/strong&gt; and &lt;strong&gt;diagnosis&lt;/strong&gt; from the sample set.
For an individual patient we only have the &lt;strong&gt;test&lt;/strong&gt; result.
We want to quantify how well the test can predict true positives.&lt;/p&gt;

&lt;p&gt;This is answered in the next statistical note; &lt;em&gt;predictive values&lt;/em&gt;.
It defines &lt;em&gt;positive&lt;/em&gt; and &lt;em&gt;negative predictive values&lt;/em&gt; and requires the use of &lt;em&gt;sensitivity,&lt;/em&gt; &lt;em&gt;specificity,&lt;/em&gt; and &lt;em&gt;prevalence&lt;/em&gt;.&lt;/p&gt;
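&lt;p&gt;As a brief, hedged preview, the sketch below computes both predictive values in R from sensitivity, specificity, and prevalence using the standard Bayes’ theorem form. It assumes the prevalence in the sample of Table 1 (258/344), which may not reflect the prevalence in another population; the variable names are illustrative.&lt;/p&gt;

&lt;div class=&quot;language-r highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;sens &amp;lt;- 231 / 258
spec &amp;lt;- 54 / 86
prev &amp;lt;- 258 / 344  # prevalence in this sample; an assumption for illustration

# Positive predictive value: Pr(condition present | positive test)
ppv &amp;lt;- sens * prev / (sens * prev + (1 - spec) * (1 - prev))
# Negative predictive value: Pr(condition absent | negative test)
npv &amp;lt;- spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)

round(c(ppv = ppv, npv = npv), 2)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With the sample prevalence, these reduce to the row proportions of Table 1, 231/263 and 54/81.&lt;/p&gt;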

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ol class=&quot;bibliography&quot;&gt;&lt;li&gt;&lt;span id=&quot;altman1994diagnostic1&quot;&gt;Altman, D. G., &amp;amp; Bland, J. M. (1994). Diagnostic tests. 1: Sensitivity and specificity. &lt;i&gt;BMJ: British Medical Journal&lt;/i&gt;, &lt;i&gt;308&lt;/i&gt;(6943), 1552.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;gardner1989calculating&quot;&gt;Gardner, M. J., &amp;amp; Altman, D. G. (1989). Calculating confidence intervals for proportions and their differences. &lt;i&gt;Statistics with Confidence. London: BMJ Publishing Group&lt;/i&gt;, 28–33.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Footnote&lt;/strong&gt;
&lt;b id=&quot;f1&quot;&gt;1&lt;/b&gt; This article closely follows the original version, in acknowledgement of Altman and Bland. It is adapted here as part of a set of curated, consistent, and minimal examples of statistics required for human genomic analysis.
&lt;a href=&quot;#a1&quot;&gt;↩&lt;/a&gt;&lt;/p&gt;
</description>
        <pubDate>Mon, 19 Jul 2021 00:00:00 +0200</pubDate>
        <link>http://localhost:4000//topic/statistics/altman_bland_sensitivity_specificity/</link>
        <guid isPermaLink="true">http://localhost:4000//topic/statistics/altman_bland_sensitivity_specificity/</guid>
      </item>
    
      <item>
        <title>Bayesian statistics introduction</title>
        <description>&lt;h1 id=&quot;bayesian-statistics-introduction&quot;&gt;Bayesian statistics introduction&lt;/h1&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#bayes-theorem-formula&quot; id=&quot;markdown-toc-bayes-theorem-formula&quot;&gt;Bayes’ Theorem Formula&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#bayesian-and-frequentist-methods-two-statistical-philosophies&quot; id=&quot;markdown-toc-bayesian-and-frequentist-methods-two-statistical-philosophies&quot;&gt;Bayesian and Frequentist Methods: Two Statistical Philosophies&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#bayesian-approach&quot; id=&quot;markdown-toc-bayesian-approach&quot;&gt;Bayesian Approach&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#frequentist-approach&quot; id=&quot;markdown-toc-frequentist-approach&quot;&gt;Frequentist Approach&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#bayesian-methods-an-overview-and-examples&quot; id=&quot;markdown-toc-bayesian-methods-an-overview-and-examples&quot;&gt;Bayesian Methods: An Overview and Examples&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#how-bayesian-methods-work&quot; id=&quot;markdown-toc-how-bayesian-methods-work&quot;&gt;How Bayesian Methods Work&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#example-1-estimating-prevalence&quot; id=&quot;markdown-toc-example-1-estimating-prevalence&quot;&gt;Example 1: Estimating Prevalence&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#example-2-using-bayes-theorem-to-update-estimates&quot; id=&quot;markdown-toc-example-2-using-bayes-theorem-to-update-estimates&quot;&gt;Example 2: Using Bayes’ Theorem to Update Estimates&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#bayesian-terminology&quot; id=&quot;markdown-toc-bayesian-terminology&quot;&gt;Bayesian Terminology&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#prior-distributions-in-bayesian-methods&quot; id=&quot;markdown-toc-prior-distributions-in-bayesian-methods&quot;&gt;Prior Distributions in Bayesian Methods&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#introduction-to-prior-distributions&quot; id=&quot;markdown-toc-introduction-to-prior-distributions&quot;&gt;Introduction to Prior Distributions&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#common-default-prior-distributions&quot; id=&quot;markdown-toc-common-default-prior-distributions&quot;&gt;Common ‘Default’ Prior Distributions&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#sensitivity-analyses&quot; id=&quot;markdown-toc-sensitivity-analyses&quot;&gt;Sensitivity Analyses&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#key-points-on-prior-distributions-5&quot; id=&quot;markdown-toc-key-points-on-prior-distributions-5&quot;&gt;Key Points on Prior Distributions [5]&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#likelihood-and-posterior-distributions-in-bayesian-methods&quot; id=&quot;markdown-toc-likelihood-and-posterior-distributions-in-bayesian-methods&quot;&gt;Likelihood and Posterior Distributions in Bayesian Methods&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#likelihood&quot; id=&quot;markdown-toc-likelihood&quot;&gt;Likelihood&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#posterior-distribution&quot; id=&quot;markdown-toc-posterior-distribution&quot;&gt;Posterior Distribution&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#example-combining-prior-distribution-and-data&quot; id=&quot;markdown-toc-example-combining-prior-distribution-and-data&quot;&gt;Example: Combining Prior Distribution and Data&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#conjugate-distributions&quot; id=&quot;markdown-toc-conjugate-distributions&quot;&gt;Conjugate Distributions&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#summarizing-and-presenting-bayesian-results&quot; id=&quot;markdown-toc-summarizing-and-presenting-bayesian-results&quot;&gt;Summarizing and Presenting Bayesian Results&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#estimates&quot; id=&quot;markdown-toc-estimates&quot;&gt;Estimates&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#posterior-probabilities&quot; id=&quot;markdown-toc-posterior-probabilities&quot;&gt;Posterior Probabilities&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#credible-intervals-posterior-interval&quot; id=&quot;markdown-toc-credible-intervals-posterior-interval&quot;&gt;Credible Intervals (Posterior Interval)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#significance-tests&quot; id=&quot;markdown-toc-significance-tests&quot;&gt;Significance Tests&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#using-bayesian-analyses-in-medicine&quot; id=&quot;markdown-toc-using-bayesian-analyses-in-medicine&quot;&gt;Using Bayesian Analyses in Medicine&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#example-great--a-bayesian-re-analysis&quot; id=&quot;markdown-toc-example-great--a-bayesian-re-analysis&quot;&gt;Example: GREAT – A Bayesian Re-analysis&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#using-bayesian-analyses&quot; id=&quot;markdown-toc-using-bayesian-analyses&quot;&gt;Using Bayesian Analyses&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#paroxetine-and-suicide-attempts-a-bayesian-analysis&quot; id=&quot;markdown-toc-paroxetine-and-suicide-attempts-a-bayesian-analysis&quot;&gt;Paroxetine and Suicide Attempts: A Bayesian Analysis&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#posterior-distributions&quot; id=&quot;markdown-toc-posterior-distributions&quot;&gt;Posterior Distributions&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#bayesian-checklist&quot; id=&quot;markdown-toc-bayesian-checklist&quot;&gt;Bayesian Checklist&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#robust-reporting-of-bayes-used-in-clinical-studies&quot; id=&quot;markdown-toc-robust-reporting-of-bayes-used-in-clinical-studies&quot;&gt;ROBUST (Reporting Of Bayes Used in Clinical STudies)&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#comparison-of-bayesian-and-frequentist-methods&quot; id=&quot;markdown-toc-comparison-of-bayesian-and-frequentist-methods&quot;&gt;Comparison of Bayesian and Frequentist Methods&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#references&quot; id=&quot;markdown-toc-references&quot;&gt;References&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This material is adapted from the examples in the
Oxford Handbook of Medical Statistics (Janet Peacock and Philip J. Peacock) [1].&lt;/p&gt;

&lt;h3 id=&quot;bayes-theorem-formula&quot;&gt;Bayes’ Theorem Formula&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;This note describes the Bayesian approach to statistical analysis in contrast to the frequentist approach&lt;/li&gt;
  &lt;li&gt;It covers prior and posterior distributions, their roles, and how they are chosen&lt;/li&gt;
  &lt;li&gt;It discusses examples, the pros and cons of Bayesian methods in medicine, and how to interpret them in the medical literature&lt;/li&gt;
  &lt;li&gt;Bayes’ theorem is attributed to Reverend Thomas Bayes; his essay was published posthumously in 1763&lt;/li&gt;
  &lt;li&gt;It is a statement about conditional probabilities, used in many areas of statistics&lt;/li&gt;
  &lt;li&gt;For two events A and B, \(Pr(A \mid B)\) is ‘the probability of A happening given that B has already happened’&lt;/li&gt;
  &lt;li&gt;Bayes’ theorem: \(Pr(A \mid B) = \frac{Pr(B \mid A) \times Pr(A)}{Pr(B)}\)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;bayesian-and-frequentist-methods-two-statistical-philosophies&quot;&gt;Bayesian and Frequentist Methods: Two Statistical Philosophies&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Bayes’ theorem is the foundation of Bayesian methods, one of two competing statistical philosophies; the other is the frequentist approach&lt;/li&gt;
  &lt;li&gt;They differ in their definitions of probability&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;bayesian-approach&quot;&gt;Bayesian Approach&lt;/h4&gt;
&lt;ul&gt;
  &lt;li&gt;Probability interpreted as degree of belief that an event will occur&lt;/li&gt;
  &lt;li&gt;Degree of belief comes from past data or experience&lt;/li&gt;
  &lt;li&gt;Unknown quantities (e.g., means, proportions) follow probability distributions that express uncertainty about the true value&lt;/li&gt;
  &lt;li&gt;Degree of belief can be updated with further information&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;frequentist-approach&quot;&gt;Frequentist Approach&lt;/h4&gt;
&lt;ul&gt;
  &lt;li&gt;Probability is the long-run relative frequency of an event: the proportion r/n of n trials in which the event occurs (r times)&lt;/li&gt;
  &lt;li&gt;Probabilities estimated directly from samples&lt;/li&gt;
  &lt;li&gt;Unknown quantities (e.g., means, proportions) considered fixed, estimated from data with confidence intervals&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;bayesian-methods-an-overview-and-examples&quot;&gt;Bayesian Methods: An Overview and Examples&lt;/h2&gt;

&lt;h3 id=&quot;how-bayesian-methods-work&quot;&gt;How Bayesian Methods Work&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Combines anticipated values or distribution of values (prior) with new data&lt;/li&gt;
  &lt;li&gt;Updates the distribution of the true values&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;example-1-estimating-prevalence&quot;&gt;Example 1: Estimating Prevalence&lt;/h4&gt;
&lt;ol&gt;
  &lt;li&gt;Specify anticipated value or distribution for prevalence (prior) using national data&lt;/li&gt;
  &lt;li&gt;Collect regional data and calculate prevalence&lt;/li&gt;
  &lt;li&gt;Combine the observed regional prevalence with the prior distribution to update the distribution of true prevalence in the region&lt;/li&gt;
&lt;/ol&gt;

\[\text{Prior} + \text{New data} = \text{Updated estimate}\]
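&lt;p&gt;The three steps can be sketched in R under some loud assumptions: the prior is taken to be a Beta distribution (the source does not specify a form), and every number below is hypothetical.&lt;/p&gt;

&lt;div class=&quot;language-r highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Step 1: prior from (hypothetical) national data, centred on 10% prevalence
prior_a &amp;lt;- 10
prior_b &amp;lt;- 90

# Step 2: (hypothetical) regional survey with 30 cases among 200 people
cases &amp;lt;- 30
n &amp;lt;- 200

# Step 3: a Beta prior combined with binomial data gives a Beta posterior
post_a &amp;lt;- prior_a + cases
post_b &amp;lt;- prior_b + (n - cases)

c(prior_mean = prior_a / (prior_a + prior_b),
  observed = cases / n,
  posterior_mean = post_a / (post_a + post_b))

# 95% credible interval for the regional prevalence
qbeta(c(0.025, 0.975), post_a, post_b)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The posterior mean falls between the prior mean and the observed regional prevalence, weighted by how much information each carries.&lt;/p&gt;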

&lt;h4 id=&quot;example-2-using-bayes-theorem-to-update-estimates&quot;&gt;Example 2: Using Bayes’ Theorem to Update Estimates&lt;/h4&gt;
&lt;ul&gt;
  &lt;li&gt;Study investigated new D-dimer test for diagnosing venous thromboembolism (VTE) in patients with suspicious symptoms [3]&lt;/li&gt;
  &lt;li&gt;Calculate updated probability that a patient has VTE given a positive D-dimer test result: \(Pr(VTE^{+} \mid D^{+})\)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Using Bayes’ theorem:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;\(Pr(VTE^{+})\) = anticipated prevalence of VTE = 14% (0.14)&lt;/li&gt;
  &lt;li&gt;\(Pr(D^{+})\) = proportion who test positive on D-dimer = 32% (0.32)&lt;/li&gt;
  &lt;li&gt;\(Pr(D^{+} \mid VTE^{+})\) = probability of positive D-dimer test if the patient truly has VTE = 79% (0.79, sensitivity)&lt;/li&gt;
&lt;/ul&gt;

\[Pr(VTE^{+} \mid D^{+}) = \frac{Pr(D^{+} \mid VTE^{+}) \times Pr(VTE^{+})}{Pr(D^{+})} = \frac{0.79 \times 0.14}{0.32} = 0.346 = 34.6\%\]

&lt;ul&gt;
  &lt;li&gt;Updated probability that a patient testing positive on D-dimer has VTE is approximately 35%&lt;/li&gt;
  &lt;li&gt;Bayesian approach updates the prior estimate (14%) in light of the test result, arguably providing a better estimate for the patient&lt;/li&gt;
&lt;/ul&gt;
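&lt;p&gt;The same arithmetic as a short R sketch, using the three probabilities given above; the variable names are illustrative.&lt;/p&gt;

&lt;div class=&quot;language-r highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;pr_vte &amp;lt;- 0.14             # Pr(VTE+): anticipated prevalence
pr_dpos &amp;lt;- 0.32            # Pr(D+): proportion testing positive
pr_dpos_given_vte &amp;lt;- 0.79  # Pr(D+ | VTE+): sensitivity

# Bayes' theorem: Pr(VTE+ | D+) = Pr(D+ | VTE+) * Pr(VTE+) / Pr(D+)
pr_vte_given_dpos &amp;lt;- pr_dpos_given_vte * pr_vte / pr_dpos
round(pr_vte_given_dpos, 3)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;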

&lt;h3 id=&quot;bayesian-terminology&quot;&gt;Bayesian Terminology&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Prior beliefs: Prior distribution&lt;/li&gt;
  &lt;li&gt;New data: Likelihood&lt;/li&gt;
  &lt;li&gt;Updated estimate: Posterior distribution&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;prior-distributions-in-bayesian-methods&quot;&gt;Prior Distributions in Bayesian Methods&lt;/h2&gt;

&lt;h3 id=&quot;introduction-to-prior-distributions&quot;&gt;Introduction to Prior Distributions&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Prior distribution: distribution of unknown quantity combined with new data to provide updated estimate&lt;/li&gt;
  &lt;li&gt;Three categories of prior distribution [4]:
    &lt;ol&gt;
      &lt;li&gt;Frequency distribution based on past data&lt;/li&gt;
      &lt;li&gt;Objective representation of reasonable beliefs about a quantity&lt;/li&gt;
      &lt;li&gt;Subjective measure of an individual’s beliefs&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;common-default-prior-distributions&quot;&gt;Common ‘Default’ Prior Distributions&lt;/h3&gt;
&lt;ol&gt;
  &lt;li&gt;Non-informative/reference priors: e.g., uniform distribution; used when range of values can be pre-specified but no clear opinion about which value is most likely&lt;/li&gt;
  &lt;li&gt;Informative-sceptical prior: used to express scepticism about the estimated quantity, reduces chances of spuriously large effects&lt;/li&gt;
  &lt;li&gt;Informative-enthusiastic prior: counterbalance of sceptical prior; used when positive effect is expected&lt;/li&gt;
  &lt;li&gt;Informative prior based on prior beliefs, which are formally elicited: shape of distribution varies by context, Normal distribution sometimes used&lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;sensitivity-analyses&quot;&gt;Sensitivity Analyses&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Choice of prior distribution can affect final estimate&lt;/li&gt;
  &lt;li&gt;Good practice to test sensitivity of assumptions by using different forms&lt;/li&gt;
  &lt;li&gt;If choice doesn’t affect updated estimate, results are more robust&lt;/li&gt;
  &lt;li&gt;If choice matters, present range of results to demonstrate sensitivity to the prior&lt;/li&gt;
&lt;/ul&gt;
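&lt;p&gt;A minimal sketch of such a sensitivity analysis in R, assuming a proportion with Beta priors and a hypothetical survey of 30 cases among 200 people. The three priors and their names are illustrative choices, not taken from the source.&lt;/p&gt;

&lt;div class=&quot;language-r highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical data
cases &amp;lt;- 30
n &amp;lt;- 200

# Three alternative Beta(a, b) priors for the same unknown proportion
priors &amp;lt;- list(
  flat            = c(a = 1,  b = 1),
  low_prevalence  = c(a = 5,  b = 95),
  high_prevalence = c(a = 20, b = 80)
)

# Posterior mean and 95% credible interval under each prior
sensitivity_check &amp;lt;- sapply(priors, function(p) {
  a &amp;lt;- p[[&quot;a&quot;]] + cases
  b &amp;lt;- p[[&quot;b&quot;]] + n - cases
  c(mean = a / (a + b),
    lower = qbeta(0.025, a, b),
    upper = qbeta(0.975, a, b))
})
round(sensitivity_check, 3)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If the columns are close to one another, the conclusion does not depend strongly on the prior; if they differ, the whole range should be reported.&lt;/p&gt;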

&lt;h3 id=&quot;key-points-on-prior-distributions-5&quot;&gt;Key Points on Prior Distributions [5]&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Choice is based on judgement, subjectivity is unavoidable&lt;/li&gt;
  &lt;li&gt;Use a range of options to test sensitivity of choice&lt;/li&gt;
  &lt;li&gt;Clearly justify choice(s) of prior to make results credible to external consumers&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;likelihood-and-posterior-distributions-in-bayesian-methods&quot;&gt;Likelihood and Posterior Distributions in Bayesian Methods&lt;/h2&gt;

&lt;h3 id=&quot;likelihood&quot;&gt;Likelihood&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Summary of evidence from new data&lt;/li&gt;
  &lt;li&gt;Combined with prior distribution to create updated posterior distribution&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;posterior-distribution&quot;&gt;Posterior Distribution&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Updated probability distribution for unknown quantity&lt;/li&gt;
  &lt;li&gt;Reflects range of possible values and degree of belief associated with each value&lt;/li&gt;
  &lt;li&gt;Less uncertainty than prior distribution; tends to be narrower than prior distribution and likelihood&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;example-combining-prior-distribution-and-data&quot;&gt;Example: Combining Prior Distribution and Data&lt;/h3&gt;
&lt;!-- ![Bayesian Analysis Example](https://www.bmj.com/content/bmj/319/7208/508/F1.large.jpg) --&gt;
&lt;ul&gt;
  &lt;li&gt;Prior distribution: evidence available before study&lt;/li&gt;
  &lt;li&gt;Likelihood: evidence from study itself&lt;/li&gt;
  &lt;li&gt;Posterior distribution: pools two sources of evidence by multiplying curves together [6]&lt;/li&gt;
  &lt;li&gt;Prior distribution pulls likelihood towards null value (0), making final result less extreme&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;img src=&quot;/images/bayes_intro_fig1.png&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 1. Illustration of how Bayesian analysis combines a prior distribution (top graph) with the data (‘likelihood’, middle graph) to give the posterior distribution (bottom graph). Reproduced from Spiegelhalter et al., BMJ 1999; 319:508 [6].&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;conjugate-distributions&quot;&gt;Conjugate Distributions&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Common for prior and posterior distributions to be related (same distribution or family of distributions)&lt;/li&gt;
  &lt;li&gt;Makes calculations more feasible&lt;/li&gt;
&lt;/ul&gt;
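
&lt;p&gt;As a minimal sketch of the prior-times-likelihood update described above: a Normal prior with a Normal likelihood is a conjugate pair, so the posterior is again Normal and has a closed form. The function name &lt;code&gt;normal_posterior&lt;/code&gt; and all numbers are illustrative assumptions, not values from any study cited here.&lt;/p&gt;

```python
# Conjugate Normal-Normal update (sampling variance treated as known).
# All names and numbers are hypothetical, for illustration only.

def normal_posterior(prior_mean, prior_sd, data_mean, data_se):
    # Precision is the reciprocal of variance; precisions add.
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / data_se ** 2
    post_precision = prior_precision + data_precision
    # Posterior mean is the precision-weighted average of prior and data.
    post_mean = (prior_precision * prior_mean
                 + data_precision * data_mean) / post_precision
    post_sd = post_precision ** -0.5
    return post_mean, post_sd


# Sceptical prior centred on the null (0), combined with a study estimate.
post_mean, post_sd = normal_posterior(0.0, 0.5, -0.7, 0.4)
print(round(post_mean, 3), round(post_sd, 3))
```

&lt;p&gt;In this sketch the posterior mean lies between the prior mean and the study estimate, and the posterior standard deviation is smaller than both the prior standard deviation and the standard error, which matches the pull towards the null described for Figure 1.&lt;/p&gt;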

&lt;h2 id=&quot;summarizing-and-presenting-bayesian-results&quot;&gt;Summarizing and Presenting Bayesian Results&lt;/h2&gt;

&lt;h3 id=&quot;estimates&quot;&gt;Estimates&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Summary measure (mean, median) often presented&lt;/li&gt;
  &lt;li&gt;Other estimates (standard deviation, interquartile range) depend on distribution shape and context&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;posterior-probabilities&quot;&gt;Posterior Probabilities&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Posterior distribution is a probability distribution&lt;/li&gt;
  &lt;li&gt;Can calculate probabilities for a specific range of values for estimated quantity&lt;/li&gt;
  &lt;li&gt;Example: Probability that relative risk in a trial is greater than 1 (showing efficacy)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;credible-intervals-posterior-interval&quot;&gt;Credible Intervals (Posterior Interval)&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Commonly present 95% credible intervals for posterior estimates&lt;/li&gt;
  &lt;li&gt;Represents range within which true value lies with 95% probability&lt;/li&gt;
  &lt;li&gt;Different from 95% confidence intervals, which are based on the sampling distribution of the estimate rather than on a probability distribution for the unknown quantity&lt;/li&gt;
  &lt;li&gt;Easy to calculate for unimodal and symmetrical posterior probability distributions; more complex otherwise&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;95% Credible Interval vs. 95% Confidence Interval&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;95% probability that true value lies within 95% credible interval&lt;/li&gt;
  &lt;li&gt;95% of the 95% confidence intervals calculated from repeated samples would contain the true value&lt;/li&gt;
  &lt;li&gt;Differences between the two are subtle and may not be practically significant for data interpretation&lt;/li&gt;
&lt;/ul&gt;
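
&lt;p&gt;A short sketch of how a posterior probability and an equal-tailed 95% credible interval could be read off a posterior distribution, assuming for simplicity that the posterior is Normal. The function name &lt;code&gt;summarize_posterior&lt;/code&gt; and the example mean and standard deviation are hypothetical.&lt;/p&gt;

```python
from statistics import NormalDist


def summarize_posterior(mean, sd, threshold=0.0, level=0.95):
    # Assumes a Normal posterior; other shapes need their own quantiles.
    posterior = NormalDist(mean, sd)
    tail = (1.0 - level) / 2.0
    # Equal-tailed credible interval from the posterior quantiles.
    interval = (posterior.inv_cdf(tail), posterior.inv_cdf(1.0 - tail))
    # Posterior probability that the quantity lies below the threshold.
    prob_below = posterior.cdf(threshold)
    return interval, prob_below


# Hypothetical posterior for a log relative risk.
interval, prob_below = summarize_posterior(-0.43, 0.31)
print(interval, prob_below)
```

&lt;p&gt;For a skewed or multimodal posterior the equal-tailed interval is only one possible choice, which is why the calculation is described above as more complex in those cases.&lt;/p&gt;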

&lt;h3 id=&quot;significance-tests&quot;&gt;Significance Tests&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;No formal place in Bayesian framework&lt;/li&gt;
  &lt;li&gt;Emphasis on distribution of estimates, not testing against a single value&lt;/li&gt;
  &lt;li&gt;Posterior probability distribution can be used to calculate probability that true value takes specific values&lt;/li&gt;
  &lt;li&gt;Bayesian approach provides information needed, rather than yes/no approach of significance testing&lt;/li&gt;
  &lt;li&gt;Both Bayesian and frequentist statisticians agree that single value or test against single value is of limited usefulness&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;using-bayesian-analyses-in-medicine&quot;&gt;Using Bayesian Analyses in Medicine&lt;/h2&gt;

&lt;p&gt;Bayesian methods are used in various medical research areas:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Observational studies&lt;/li&gt;
  &lt;li&gt;Design, monitoring, and analysis of trials&lt;/li&gt;
  &lt;li&gt;Meta-analyses&lt;/li&gt;
  &lt;li&gt;Missing data imputation&lt;/li&gt;
  &lt;li&gt;Decision making&lt;/li&gt;
  &lt;li&gt;Health economics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bayesian methods are applicable in the same situations as frequentist methods (single quantity estimation, regression analysis, multifactorial regression, multilevel models).&lt;/p&gt;

&lt;h3 id=&quot;example-great--a-bayesian-re-analysis&quot;&gt;Example: GREAT – A Bayesian Re-analysis&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Original GREAT trial examined effect of thrombolytic therapy on patients with suspected myocardial infarction&lt;/li&gt;
  &lt;li&gt;Frequentist analysis reported significant beneficial effect on mortality&lt;/li&gt;
  &lt;li&gt;Results challenged due to unexpected size, small trial, and more modest effect in unpublished bigger European trial&lt;/li&gt;
  &lt;li&gt;Pocock and Spiegelhalter conducted Bayesian re-analysis, constructing prior distribution to express plausible reduction in mortality&lt;/li&gt;
  &lt;li&gt;Bayesian analysis showed a reduction in mortality of 25% compared to 49% in frequentist analysis, suggesting over-optimistic trial results&lt;/li&gt;
  &lt;li&gt;Demonstrates how Bayesian analysis can be used to combine prior evidence and new data for more reasonable estimates in cases of unexpected results&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;using-bayesian-analyses&quot;&gt;Using Bayesian Analyses&lt;/h2&gt;

&lt;h3 id=&quot;paroxetine-and-suicide-attempts-a-bayesian-analysis&quot;&gt;Paroxetine and Suicide Attempts: A Bayesian Analysis&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Meta-analysis examined the potential link between antidepressant drugs and increased suicides in adults [10]&lt;/li&gt;
  &lt;li&gt;Included unpublished data, corrected for duration of medication and placebo treatment&lt;/li&gt;
  &lt;li&gt;7 suicide attempts in patients taking the drug, 1 in patients taking placebo&lt;/li&gt;
  &lt;li&gt;Prior distribution assumed to be gamma&lt;/li&gt;
  &lt;li&gt;Three different prior distributions tested: pessimistic, slightly pessimistic, and slightly optimistic (Fig. 2)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;img src=&quot;/images/bayes_intro_fig2.png&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 2. Three different prior distributions used in a meta-analysis of antidepressant drugs and suicide in adults. From Aursnes et al. [10]&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;posterior-distributions&quot;&gt;Posterior Distributions&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Posterior distributions correspond to the three prior distributions&lt;/li&gt;
  &lt;li&gt;Majority of each distribution is greater than 0, indicating evidence in favor of an adverse effect of paroxetine on suicide risk&lt;/li&gt;
  &lt;li&gt;Authors reported relative risks of 2.46 (pessimistic prior), 2.20 (slightly pessimistic prior), and 2.34 (slightly optimistic prior)&lt;/li&gt;
  &lt;li&gt;Bayesian approach supports recent meta-analyses results, suggesting increased risk of suicidal activity in adults taking certain antidepressant drugs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;img src=&quot;/images/bayes_intro_fig3.png&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 3. Three posterior distributions corresponding to the three priors used in a meta-analysis of antidepressant drugs and suicide in adults. From Aursnes et al. [10]&lt;/em&gt;&lt;/p&gt;
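
&lt;p&gt;The idea of re-running one analysis under several priors can be sketched with a gamma prior on an event rate and a Poisson count, which is a conjugate pair. This is not the model or the parameter values used by Aursnes et al.: the prior parameters and the exposure time below are hypothetical placeholders, and only the event count of 7 comes from the text above.&lt;/p&gt;

```python
# Gamma-Poisson conjugate update used for a prior sensitivity check.
# Prior parameters and exposure are hypothetical placeholders.

def gamma_poisson_posterior(prior_shape, prior_rate, events, exposure):
    # Gamma(shape, rate) prior on an event rate, Poisson count over exposure.
    return prior_shape + events, prior_rate + exposure


def gamma_mean(shape, rate):
    return shape / rate


hypothetical_priors = {
    'pessimistic': (2.0, 100.0),
    'slightly pessimistic': (1.0, 100.0),
    'slightly optimistic': (0.5, 100.0),
}
events, exposure_years = 7, 1000.0  # exposure is an assumed value

for label, (shape, rate) in hypothetical_priors.items():
    post_shape, post_rate = gamma_poisson_posterior(
        shape, rate, events, exposure_years)
    print(label, round(gamma_mean(post_shape, post_rate), 5))
```

&lt;p&gt;If the posterior summaries printed for the three priors are close to one another, the conclusion is not very sensitive to the choice of prior; if they diverge, the range should be reported.&lt;/p&gt;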

&lt;h2 id=&quot;bayesian-checklist&quot;&gt;Bayesian Checklist&lt;/h2&gt;

&lt;p&gt;Sung et al. generated a checklist of seven items (ROBUST) that should be included when a Bayesian analysis is reported [11]. These are helpful in interpreting a Bayesian analysis.&lt;/p&gt;

&lt;h3 id=&quot;robust-reporting-of-bayes-used-in-clinical-studies&quot;&gt;ROBUST (Reporting Of Bayes Used in Clinical STudies)&lt;/h3&gt;

&lt;p&gt;The box below lists the items included in ROBUST. The checklist can be scored to provide a measure of the quality of reporting, but here it is given as a guide to what points to check when reading an article where Bayesian methods have been used.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Prior distribution: specified&lt;/li&gt;
  &lt;li&gt;Prior distribution: justified&lt;/li&gt;
  &lt;li&gt;Prior distribution: sensitivity analysis&lt;/li&gt;
  &lt;li&gt;Analysis: statistical model&lt;/li&gt;
  &lt;li&gt;Analysis: analytical technique&lt;/li&gt;
  &lt;li&gt;Results: central tendency&lt;/li&gt;
  &lt;li&gt;Results: standard deviation or credible interval&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;comparison-of-bayesian-and-frequentist-methods&quot;&gt;Comparison of Bayesian and Frequentist Methods&lt;/h2&gt;

&lt;p&gt;This table is adapted from Spiegelhalter et al. and gives a helpful summary of the two approaches.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;&lt;strong&gt;Issue&lt;/strong&gt;&lt;/th&gt;
      &lt;th&gt;&lt;strong&gt;Frequentist methods&lt;/strong&gt;&lt;/th&gt;
      &lt;th&gt;&lt;strong&gt;Bayesian methods&lt;/strong&gt;&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Prior information other than that in the study being analysed&lt;/td&gt;
      &lt;td&gt;Informally used when choosing a model/form of analysis&lt;/td&gt;
      &lt;td&gt;Used formally by specifying a prior probability distribution&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Interpretation of the parameter of interest&lt;/td&gt;
      &lt;td&gt;A fixed unknown value&lt;/td&gt;
      &lt;td&gt;An unknown quantity which can have a probability distribution&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Basic statistical question&lt;/td&gt;
      &lt;td&gt;How likely are the data, given a particular value of the parameter?&lt;/td&gt;
      &lt;td&gt;How likely is the particular value of the parameter given the data?&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Presentation of results&lt;/td&gt;
      &lt;td&gt;P values, estimates, confidence intervals&lt;/td&gt;
      &lt;td&gt;Plots of posterior distribution of the parameter, calculation of specific posterior probabilities of interest, and use of the posterior distribution in formal decision analysis. Expected value and credible intervals&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Dealing with subsets in trials&lt;/td&gt;
      &lt;td&gt;Adjusted P values (e.g. Bonferroni)&lt;/td&gt;
      &lt;td&gt;Subset effects shrunk towards zero by a ‘sceptical’ prior&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;em&gt;Adapted from Spiegelhalter et al. 1999.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;Peacock J, Peacock PJ. Oxford handbook of medical statistics.&lt;/li&gt;
  &lt;li&gt;Bayes T. An essay towards solving a problem in the doctrine of chances. Philos Trans Roy Soc 1763; 53:370–418.&lt;/li&gt;
  &lt;li&gt;Kovacs MJ, Mackinnon KM, Anderson D, O’Rourke K, Keeney M, Kearon C et al. A comparison of three rapid D-dimer methods for the diagnosis of venous thromboembolism. Br J Haematol 2001; 115(1):140–4.&lt;/li&gt;
  &lt;li&gt;Ashby D. Bayesian statistics in medicine: a 25 year review. Stat Med 2006; 25(21):3589–631.&lt;/li&gt;
  &lt;li&gt;Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian approaches to clinical trials and health-care evaluation. Chichester, West Sussex: Wiley, 2004.&lt;/li&gt;
  &lt;li&gt;Spiegelhalter DJ, Myles JP, Jones DR, Abrams KR. Methods in health service research: An introduction to Bayesian methods in health technology assessment. BMJ 1999; 319(7208):508–12.&lt;/li&gt;
  &lt;li&gt;Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian data analysis. 2nd ed. Boca Raton, FL: Chapman &amp;amp; Hall/CRC, 2004.&lt;/li&gt;
  &lt;li&gt;GREAT Group. Feasibility, safety, and efficacy of domiciliary thrombolysis by general practitioners: Grampian Region Early Anistreplase Trial. BMJ 1992; 305(6853):548–53.&lt;/li&gt;
  &lt;li&gt;Pocock SJ, Spiegelhalter DJ. Domiciliary thrombolysis by general practitioners. BMJ 1992; 305(6860):1015.&lt;/li&gt;
  &lt;li&gt;Aursnes I, Tvete IF, Gaasemyr J, Natvig B. Suicide attempts in clinical trials with paroxetine randomised against placebo. BMC Med 2005; 3:14.&lt;/li&gt;
  &lt;li&gt;Sung L, Hayden J, Greenberg ML, Koren G, Feldman BM, Tomlinson GA. Seven items were identified for inclusion when reporting a Bayesian analysis of a clinical study. J Clin Epidemiol 2005; 58(3):261–8.&lt;/li&gt;
&lt;/ol&gt;
</description>
        <pubDate>Sat, 22 Apr 2023 00:00:00 +0000</pubDate>
        <link>http://localhost:4000//topic/statistics/bayesian_statistics_intro/</link>
        <guid isPermaLink="true">http://localhost:4000//topic/statistics/bayesian_statistics_intro/</guid>
      </item>
    
      <item>
        <title>Burden testing with variant collapse</title>
        <description>&lt;h1 id=&quot;burden-testing-with-variant-collapse&quot;&gt;Burden testing with variant collapse&lt;/h1&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#introduction&quot; id=&quot;markdown-toc-introduction&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#to-do&quot; id=&quot;markdown-toc-to-do&quot;&gt;To do&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#main-papers-in-order&quot; id=&quot;markdown-toc-main-papers-in-order&quot;&gt;Main papers in order&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#major-classes-of-tests&quot; id=&quot;markdown-toc-major-classes-of-tests&quot;&gt;Major Classes of Tests&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#burden-tests-so-far&quot; id=&quot;markdown-toc-burden-tests-so-far&quot;&gt;Burden Tests So Far&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#tests&quot; id=&quot;markdown-toc-tests&quot;&gt;Tests&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#power-of-burden-tests-depends-on&quot; id=&quot;markdown-toc-power-of-burden-tests-depends-on&quot;&gt;Power of burden tests depends on&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#li-and-leal-2008&quot; id=&quot;markdown-toc-li-and-leal-2008&quot;&gt;Li and Leal 2008&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#background&quot; id=&quot;markdown-toc-background&quot;&gt;Background&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#genetic-model&quot; id=&quot;markdown-toc-genetic-model&quot;&gt;Genetic model&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#single-marker-test&quot; id=&quot;markdown-toc-single-marker-test&quot;&gt;Single-marker test&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#multiple-marker-test&quot; id=&quot;markdown-toc-multiple-marker-test&quot;&gt;Multiple-marker test&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#basic-collapsing-method&quot; id=&quot;markdown-toc-basic-collapsing-method&quot;&gt;Basic collapsing method&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#cmc-method&quot; id=&quot;markdown-toc-cmc-method&quot;&gt;CMC method&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#power-of-cmc-method&quot; id=&quot;markdown-toc-power-of-cmc-method&quot;&gt;Power of CMC method&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#misclassification&quot; id=&quot;markdown-toc-misclassification&quot;&gt;Misclassification&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#effects-of-linkage-disequilibrium-ld&quot; id=&quot;markdown-toc-effects-of-linkage-disequilibrium-ld&quot;&gt;Effects of Linkage Disequilibrium (LD)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#evaluation-of-type-i-error-rate&quot; id=&quot;markdown-toc-evaluation-of-type-i-error-rate&quot;&gt;Evaluation of Type I Error Rate&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#problems-with-this-paper&quot; id=&quot;markdown-toc-problems-with-this-paper&quot;&gt;Problems with this paper&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#madsen-and-browning-2009&quot; id=&quot;markdown-toc-madsen-and-browning-2009&quot;&gt;Madsen and Browning 2009&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#background-1&quot; id=&quot;markdown-toc-background-1&quot;&gt;Background&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#main-theory&quot; id=&quot;markdown-toc-main-theory&quot;&gt;Main theory&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#methods&quot; id=&quot;markdown-toc-methods&quot;&gt;Methods&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#weighted-sum-method&quot; id=&quot;markdown-toc-weighted-sum-method&quot;&gt;Weighted-Sum Method&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#power-simulations&quot; id=&quot;markdown-toc-power-simulations&quot;&gt;Power simulations&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#encode-data&quot; id=&quot;markdown-toc-encode-data&quot;&gt;ENCODE data&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#results&quot; id=&quot;markdown-toc-results&quot;&gt;Results&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#proportion-of-variants-containing-mutations&quot; id=&quot;markdown-toc-proportion-of-variants-containing-mutations&quot;&gt;Proportion of Variants Containing Mutations&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#discussion&quot; id=&quot;markdown-toc-discussion&quot;&gt;Discussion&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#summary&quot; id=&quot;markdown-toc-summary&quot;&gt;Summary&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#price-2010&quot; id=&quot;markdown-toc-price-2010&quot;&gt;Price 2010&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#methods-1&quot; id=&quot;markdown-toc-methods-1&quot;&gt;Methods&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#simulation-framework&quot; id=&quot;markdown-toc-simulation-framework&quot;&gt;Simulation Framework&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#weighted-approaches-correspond-to-implicit-assumptions-about-log-odds-ratios&quot; id=&quot;markdown-toc-weighted-approaches-correspond-to-implicit-assumptions-about-log-odds-ratios&quot;&gt;Weighted Approaches Correspond to Implicit Assumptions about Log Odds Ratios&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#fixed-threshold-approach&quot; id=&quot;markdown-toc-fixed-threshold-approach&quot;&gt;Fixed-Threshold Approach&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#weighted-approach&quot; id=&quot;markdown-toc-weighted-approach&quot;&gt;Weighted Approach&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#variable-threshold-approach&quot; id=&quot;markdown-toc-variable-threshold-approach&quot;&gt;Variable-Threshold Approach&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#cheating-approach-to-incorporating-varphip-phip&quot; id=&quot;markdown-toc-cheating-approach-to-incorporating-varphip-phip&quot;&gt;Cheating Approach to Incorporating \(\varphi(p)\) (“phi(p)”)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#incorporation-of-computational-predictions-of-functional-effects&quot; id=&quot;markdown-toc-incorporation-of-computational-predictions-of-functional-effects&quot;&gt;Incorporation of Computational Predictions of Functional Effects&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#application-to-empirical-data-sets&quot; id=&quot;markdown-toc-application-to-empirical-data-sets&quot;&gt;Application to Empirical Data Sets&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#results-and-discussion&quot; id=&quot;markdown-toc-results-and-discussion&quot;&gt;Results and Discussion&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#neale-2011&quot; id=&quot;markdown-toc-neale-2011&quot;&gt;Neale 2011&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#c-alpha-tests&quot; id=&quot;markdown-toc-c-alpha-tests&quot;&gt;C-alpha tests&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#skat&quot; id=&quot;markdown-toc-skat&quot;&gt;SKAT&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#introduction-1&quot; id=&quot;markdown-toc-introduction-1&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#methods-2&quot; id=&quot;markdown-toc-methods-2&quot;&gt;Methods&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#sequencing-kernel-association-test&quot; id=&quot;markdown-toc-sequencing-kernel-association-test&quot;&gt;Sequencing Kernel Association Test&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#skat-model-and-test-for-linear-snp-effects&quot; id=&quot;markdown-toc-skat-model-and-test-for-linear-snp-effects&quot;&gt;SKAT Model and Test for Linear SNP Effects&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#relationship-between-linear-skat-and-individual-variant-test-statistics&quot; id=&quot;markdown-toc-relationship-between-linear-skat-and-individual-variant-test-statistics&quot;&gt;Relationship between Linear SKAT and Individual Variant Test Statistics&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#accommodating-epistatic-effects-and-prior-information-under-the-skat&quot; id=&quot;markdown-toc-accommodating-epistatic-effects-and-prior-information-under-the-skat&quot;&gt;Accommodating Epistatic Effects and Prior Information under the SKAT&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#skat-is-a-generalization-of-the-c-alpha-test&quot; id=&quot;markdown-toc-skat-is-a-generalization-of-the-c-alpha-test&quot;&gt;SKAT Is a Generalization of the C-Alpha Test&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#lee-2012&quot; id=&quot;markdown-toc-lee-2012&quot;&gt;Lee 2012&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#introduction-2&quot; id=&quot;markdown-toc-introduction-2&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#methods-3&quot; id=&quot;markdown-toc-methods-3&quot;&gt;Methods&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#protein-pathway-analysis&quot; id=&quot;markdown-toc-protein-pathway-analysis&quot;&gt;Protein pathway analysis&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#zhang-2021-ashg&quot; id=&quot;markdown-toc-zhang-2021-ashg&quot;&gt;Zhang 2021 ASHG&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#problems-with-this-paper-1&quot; id=&quot;markdown-toc-problems-with-this-paper-1&quot;&gt;Problems with this paper&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#cluster-method&quot; id=&quot;markdown-toc-cluster-method&quot;&gt;Cluster method&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#carrier-counts-instead-of-allele-dosage&quot; id=&quot;markdown-toc-carrier-counts-instead-of-allele-dosage&quot;&gt;Carrier counts instead of allele dosage&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#itan--casanova-contributions&quot; id=&quot;markdown-toc-itan--casanova-contributions&quot;&gt;Itan &amp;amp; Casanova contributions&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#historical-review&quot; id=&quot;markdown-toc-historical-review&quot;&gt;Historical review&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#burden-test-power-calculation&quot; id=&quot;markdown-toc-burden-test-power-calculation&quot;&gt;Burden test power calculation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#references&quot; id=&quot;markdown-toc-references&quot;&gt;References&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;

&lt;p&gt;In this series I will go over methods and theory for gene burden testing.
I am publishing this live as I work since there are several heavy papers to get through.
&lt;!--For the working writeup see: pri_net_manuscript/pre_print_md/detailed_methods/statistics.md--&gt;&lt;/p&gt;

&lt;h2 id=&quot;to-do&quot;&gt;To do&lt;/h2&gt;

&lt;p&gt;Running code on test data: Remake the gVCF as BP_RESOLUTION to list a site for every position. 
Filter: If variants are MAF &amp;gt;1.1 in gnomAD then run a filter to remove variants that are not sequenced in controls.
Justification: If we ran that filter for every singleton, it would remove very rare variants that would have been true positives had they been sequenced in the controls.&lt;/p&gt;

&lt;p&gt;Methods for variant collapse:
&lt;a href=&quot;https://www.nature.com/articles/s41576-019-0177-4/&quot;&gt;https://www.nature.com/articles/s41576-019-0177-4/&lt;/a&gt;
&lt;a class=&quot;citation&quot; href=&quot;#Povysil2019rare&quot;&gt;(Povysil et al., 2019)&lt;/a&gt;
One of the most important starting decisions in our analysis is how variants are collapsed.
Each project and biological system has unique features that determine how variants should be grouped for joint analysis.
Some more context is available in this blog post:
&lt;a href=&quot;https://lawlessgenomics.com/2021/05/28/pathway_analysis.html&quot;&gt;https://lawlessgenomics.com/2021/05/28/pathway_analysis.html&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;main-papers-in-order&quot;&gt;Main papers in order&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data
&lt;a class=&quot;citation&quot; href=&quot;#li2008methods&quot;&gt;(Li &amp;amp; Leal, 2008)&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;A groupwise association test for rare mutations using a weighted sum statistic.
(Madsen &amp;amp; Browning, 2009).&lt;/li&gt;
  &lt;li&gt;An evaluation of statistical approaches to rare variant analysis in genetic association studies.
&lt;a class=&quot;citation&quot; href=&quot;#morris2010evaluation&quot;&gt;(Morris &amp;amp; Zeggini, 2010)&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Pooled association tests for rare variants in exon-resequencing studies.
&lt;a class=&quot;citation&quot; href=&quot;#price2010pooled&quot;&gt;(Price et al., 2010)&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Testing for an unusual distribution of rare variants.
&lt;a class=&quot;citation&quot; href=&quot;#neale2011testing&quot;&gt;(Neale et al., 2011)&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Rare-Variant Association Testing for Sequencing Data with the Sequence Kernel Association Test.
(Wu et al., 2011).&lt;/li&gt;
  &lt;li&gt;Optimal tests for rare variant effects in sequencing association studies.
&lt;a class=&quot;citation&quot; href=&quot;#Lee2012Optimal&quot;&gt;(Lee et al., 2012)&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Optimal Unified Approach for Rare-Variant Association Testing with Application to Small-Sample Case-Control Whole-Exome Sequencing Studies.
&lt;a class=&quot;citation&quot; href=&quot;#Lee2012Optimalunified&quot;&gt;(Lee et al., 2012)&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Sequence Kernel Association Tests for the Combined Effect of Rare and Common Variants.
&lt;a class=&quot;citation&quot; href=&quot;#IonitaLaza2013Sequence&quot;&gt;(Ionita-Laza et al., 2013)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;major-classes-of-tests&quot;&gt;Major Classes of Tests&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Burden/Collapsing tests&lt;/li&gt;
  &lt;li&gt;Supervised/Adaptive Burden/Collapsing tests&lt;/li&gt;
  &lt;li&gt;Variance component (similarity) based tests&lt;/li&gt;
  &lt;li&gt;Omnibus tests: hedge against different scenarios&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;burden-tests-so-far&quot;&gt;Burden Tests So Far&lt;/h3&gt;

&lt;h4 id=&quot;tests&quot;&gt;Tests&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Binary Collapsing: CAST&lt;/li&gt;
  &lt;li&gt;CMC&lt;/li&gt;
  &lt;li&gt;Count Collapsing: MZ (GRANVIL)&lt;/li&gt;
  &lt;li&gt;Weighted Sum Test&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;power-of-burden-tests-depends-on&quot;&gt;Power of burden tests depends on&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Number of associated variants&lt;/li&gt;
  &lt;li&gt;Number of non-associated variants&lt;/li&gt;
  &lt;li&gt;Direction of the effects.&lt;/li&gt;
  &lt;li&gt;Powerful if most variants are causal and have effects in the same direction.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;li-and-leal-2008&quot;&gt;Li and Leal 2008&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data
&lt;a class=&quot;citation&quot; href=&quot;#li2008methods&quot;&gt;(Li &amp;amp; Leal, 2008)&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;The collapsing method, which involves collapsing genotypes across variants and applying a univariate test, is powerful for analyzing rare variants.&lt;/li&gt;
  &lt;li&gt;The multivariate analysis is robust against inclusion of non-causal variants.&lt;/li&gt;
  &lt;li&gt;Both methods are superior to analyzing each variant individually with univariate tests.&lt;/li&gt;
  &lt;li&gt;Combined Multivariate and Collapsing (CMC) method unifies the advantages of both collapsing and multiple-marker tests.&lt;/li&gt;
&lt;/ul&gt;
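
&lt;p&gt;The basic collapsing idea summarised above can be sketched as follows: each individual is reduced to a single carrier indicator across all rare variants in the region, and one univariate test is applied to the resulting 2x2 table. This is an illustrative sketch rather than the authors’ implementation; the function names and the toy genotype matrices are assumptions.&lt;/p&gt;

```python
# Basic collapsing sketch: one carrier indicator per individual,
# then a single Pearson chi-squared statistic on the 2x2 table.
# Function names and data are hypothetical.

def collapse_carriers(genotypes):
    # genotypes: one row per individual, one 0/1/2 entry per rare variant.
    # An individual is a carrier if any entry is non-zero.
    return [int(any(row)) for row in genotypes]


def carrier_table(case_genotypes, control_genotypes):
    case_carriers = sum(collapse_carriers(case_genotypes))
    control_carriers = sum(collapse_carriers(control_genotypes))
    return [
        [case_carriers, len(case_genotypes) - case_carriers],
        [control_carriers, len(control_genotypes) - control_carriers],
    ]


def pearson_chi2_2x2(table):
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


cases = [[1, 0, 0], [0, 0, 2], [0, 1, 0], [0, 0, 0]]
controls = [[0, 0, 0], [0, 0, 0], [1, 0, 0], [0, 0, 0]]
table = carrier_table(cases, controls)
print(table, pearson_chi2_2x2(table))
```

&lt;p&gt;Because three variants are tested as one unit, only a single test is performed for the region, which is where the gain in power over per-variant testing comes from when most collapsed variants are causal.&lt;/p&gt;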

&lt;h3 id=&quot;background&quot;&gt;Background&lt;/h3&gt;

&lt;p&gt;Genotype phenotype association may be tested by:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A single-marker test (i.e. GWAS), which may use standard univariate statistical tests
(e.g., chi-squared test, Fisher’s exact test, or Cochran-Armitage test for trend),
with the family-wise error rate (FWER) controlled by a multiple-comparison correction (e.g., Bonferroni, permutation).&lt;/li&gt;
  &lt;li&gt;A multiple-marker test,
    &lt;ul&gt;
      &lt;li&gt;which tests multiple variant sites simultaneously with the use of multivariate methods,&lt;/li&gt;
      &lt;li&gt;such as the Fisher product method,&lt;/li&gt;
      &lt;li&gt;Hotelling’s T-squared test,
or logistic regression.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Collapsing variants can increase power.&lt;/li&gt;
  &lt;li&gt;Classification of variants can be a major problem.&lt;/li&gt;
  &lt;li&gt;If variants are classified correctly, the power to detect a causal association is higher than with single- and multiple-marker testing.&lt;/li&gt;
  &lt;li&gt;If variants are misclassified, non-functional variants could introduce error.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;genetic-model&quot;&gt;Genetic model&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;At a locus, M variants can independently cause disease.&lt;/li&gt;
  &lt;li&gt;A = allele&lt;/li&gt;
  &lt;li&gt;1 to M = allele label&lt;/li&gt;
  &lt;li&gt;p = allele frequency&lt;/li&gt;
  &lt;li&gt;Gk = 0/1/2 genotype&lt;/li&gt;
  &lt;li&gt;Fki = penetrance of genotype Gk for the ith variant.&lt;/li&gt;
  &lt;li&gt;Relative risk of each variant i is defined under the additive, multiplicative, dominant, and recessive models.&lt;/li&gt;
  &lt;li&gt;Prevalence of disease caused by each individual variant is calculated.&lt;/li&gt;
  &lt;li&gt;Prevalence of disease caused by the entire locus can therefore be calculated.&lt;/li&gt;
  &lt;li&gt;Total prevalence can be approximated by the sum of the individual prevalences.&lt;/li&gt;
  &lt;li&gt;The same phenotype can be due to different causal variants.&lt;/li&gt;
  &lt;li&gt;The proportion of cases that result from variant i can be ascertained.&lt;/li&gt;
  &lt;li&gt;These cases are members of group i.&lt;/li&gt;
  &lt;li&gt;There is a total of M groups.&lt;/li&gt;
  &lt;li&gt;For group i, the frequency of each genotype (0, 1, 2) at variant i can also be derived.&lt;/li&gt;
  &lt;li&gt;Then, the expected frequency of genotype G at variant i across all M groups of cases can be found.&lt;/li&gt;
  &lt;li&gt;The expected genotype frequency at variant i in controls can also be found.&lt;/li&gt;
  &lt;li&gt;The expected frequencies are used to calculate the power to detect association in this study.&lt;/li&gt;
  &lt;li&gt;Their focus is the omnibus test, which provides an association test of the entire locus and is not focused on any specific variant within the locus.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;single-marker-test&quot;&gt;Single-marker test&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;One approach of association studies is to test each variant site individually with the use of a univariate test and assess the significance of the &lt;em&gt;omnibus test&lt;/em&gt; (chi-squared test or F test) after correction for multiple comparisons.&lt;/li&gt;
  &lt;li&gt;For univariate tests, a contingency table can be constructed to compare genotype frequencies at each variant site in cases and controls.&lt;/li&gt;
  &lt;li&gt;Because an observation of individuals that are homozygous for the high-risk rare allele is extremely rare, AA genotypes are collapsed with Aa genotypes, and a table is constructed.&lt;/li&gt;
  &lt;li&gt;For an equal number of cases and controls, the classical Pearson chi-squared statistic for testing equal genotype frequencies in cases and controls is used.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;multiple-marker-test&quot;&gt;Multiple-marker test&lt;/h3&gt;

&lt;p&gt;Another option is to test all variants simultaneously with a multivariate test,&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;e.g. Fisher product method, Hotelling’s T-squared test, or multiple logistic regression.&lt;/li&gt;
  &lt;li&gt;(This is not the same as one regression run for every single variant with multiple covariates, as done in GWAS sometimes).&lt;/li&gt;
  &lt;li&gt;The authors give an example using Xiong et al. (2002). Generalized T2 test for genome association studies. Am. J. Hum. Genet.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Note: Hotelling’s T-squared tests the differences between the (multivariate) means of different populations, where tests for univariate problems would make use of a t-test. The distribution is named for Harold Hotelling, who developed it as a generalization of Student’s t-distribution.&lt;/em&gt;&lt;/li&gt;
  &lt;li&gt;Rejecting the null hypothesis would indicate that at least one of the variants is associated with disease.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;basic-collapsing-method&quot;&gt;Basic collapsing method&lt;/h3&gt;

&lt;p&gt;Since variants are expected to be very rare, 
genotypes across all variants are collapsed.
An individual is coded as a carrier if they have at least one rare variant at the locus.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;1 = rare variants present.&lt;/li&gt;
  &lt;li&gt;0 = no rare variants.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The classic Pearson chi-squared statistic can be used to test whether the proportion of carriers differs between cases and controls. 
The power can be calculated as shown by the authors.&lt;/p&gt;
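
&lt;p&gt;A minimal sketch of the collapsing step followed by a Pearson chi-squared test on the resulting 2x2 carrier table. The function names and the toy genotypes are hypothetical and not taken from the paper; the p value uses the one-degree-of-freedom chi-squared tail.&lt;/p&gt;

```python
import math


def collapse(genotypes):
    """Code each individual as 1 if they carry any rare variant, else 0."""
    return [1 if any(g > 0 for g in person) else 0 for person in genotypes]


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0, 1.0  # degenerate table: no evidence either way
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of a chi-squared variable with 1 df
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy data (invented): rows are individuals, columns are rare variants (0/1/2)
cases = [[0, 1, 0], [1, 0, 0], [0, 0, 2], [0, 0, 0]]
controls = [[0, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0]]

case_carriers = sum(collapse(cases))
control_carriers = sum(collapse(controls))
stat, p_value = pearson_chi2_2x2(
    case_carriers, len(cases) - case_carriers,
    control_carriers, len(controls) - control_carriers,
)
print(stat, p_value)
```

&lt;p&gt;Note that the homozygous individual in the toy data still only counts as one carrier, which is the dosage limitation discussed further down.&lt;/p&gt;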

&lt;h3 id=&quot;cmc-method&quot;&gt;CMC method&lt;/h3&gt;

&lt;p&gt;Unified method that combines collapsing and multivariate testing.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Markers are divided into subgroups (e.g. allele frequency).&lt;/li&gt;
  &lt;li&gt;Within group, markers are collapsed.&lt;/li&gt;
  &lt;li&gt;A multivariate test is then used (Hotelling’s T-squared).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;e.g. M markers in a locus are split into K groups.
There are n markers in group g, and so on for each group.
The multivariate test is then applied where, for each group, individuals are coded as either 1 (a carrier of one or more variants) or 0 (wild-type).&lt;/p&gt;
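
&lt;p&gt;The CMC coding step might be sketched as follows. This assumes the simplest grouping, where all markers below one frequency threshold share a single group and each remaining marker forms its own group; &lt;code&gt;assign_groups&lt;/code&gt; and &lt;code&gt;cmc_code&lt;/code&gt; are hypothetical names, and the multivariate test itself (e.g. Hotelling’s T-squared) is not implemented here.&lt;/p&gt;

```python
def assign_groups(mafs, threshold=0.01):
    """Rare markers share one group; each other marker is its own group."""
    groups = []
    for index, maf in enumerate(mafs):
        groups.append("rare" if threshold > maf else f"common_{index}")
    return groups


def cmc_code(genotypes, groups):
    """Collapse the markers of each group into a 0/1 carrier indicator."""
    labels = sorted(set(groups))
    coded = []
    for person in genotypes:
        row = []
        for label in labels:
            carrier = any(
                g > 0 for g, grp in zip(person, groups) if grp == label
            )
            row.append(1 if carrier else 0)
        coded.append(row)
    return labels, coded


# Toy data (invented): four markers, the third one is common
mafs = [0.002, 0.005, 0.03, 0.008]
genotypes = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
]
groups = assign_groups(mafs)
labels, coded = cmc_code(genotypes, groups)
print(labels)
print(coded)
```

&lt;p&gt;The resulting matrix has one column per group rather than one per marker, and is what would be passed to the multivariate test. Passing a different &lt;code&gt;groups&lt;/code&gt; list allows more than two frequency bins.&lt;/p&gt;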

&lt;h3 id=&quot;power-of-cmc-method&quot;&gt;Power of CMC method&lt;/h3&gt;

&lt;p&gt;Results:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Freq below 0.01 collapsed.&lt;/li&gt;
  &lt;li&gt;Freq above 0.01 not collapsed.&lt;/li&gt;
  &lt;li&gt;With misclassification, power is much better with CMC method compared to collapsing method
    &lt;ul&gt;
      &lt;li&gt;particularly when the non-causal variant has a high frequency.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Slight (very slight I think) loss of power when &lt;em&gt;causal&lt;/em&gt; variants are high frequency compared to collapsing.
    &lt;ul&gt;
      &lt;li&gt;This scenario is unlikely; otherwise a single-marker test might already be able to detect the variant.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;misclassification&quot;&gt;Misclassification&lt;/h3&gt;

&lt;p&gt;Two types of misclassification are considered:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;inclusion of non-functional variants and&lt;/li&gt;
  &lt;li&gt;exclusion of functional variants.&lt;/li&gt;
  &lt;li&gt;There is a large section on results of &lt;em&gt;Excluding Functional Variants&lt;/em&gt; and &lt;em&gt;Inclusion of Nonfunctional Variants&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;This section can be read for detail but is probably familiar enough for us to skip.&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;effects-of-linkage-disequilibrium-ld&quot;&gt;Effects of Linkage Disequilibrium (LD)&lt;/h3&gt;

&lt;p&gt;A simulation is done in this section:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The locus has six variants, with a total allele frequency of 0.05.&lt;/li&gt;
  &lt;li&gt;Four of the variants have an allele frequency of 0.01 and are on different haplotypes.&lt;/li&gt;
  &lt;li&gt;Each of the remaining two variants, with allele frequencies of 0.005, is on one of the haplotypes where a variant with allele frequency of 0.01 resides;
    &lt;ul&gt;
      &lt;li&gt;there is complete LD between these variants (\(r^2 \approx 0.5\)).&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;For comparison purposes, a second simulation was carried out, in which all variants were on separate haplotypes.&lt;/li&gt;
  &lt;li&gt;One thousand replicates were generated, and the power was evaluated for an \(\alpha\) level of 0.001.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Results:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;In the presence of LD, the power for the single-marker test, Hotelling’s T-squared test, and the collapsing method is
    &lt;ul&gt;
      &lt;li&gt;0.075, 0.63, and 0.85, respectively.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;For data generated with each variant on a separate haplotype, the corresponding powers are
    &lt;ul&gt;
      &lt;li&gt;0.011, 0.451, and 0.737, respectively.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;evaluation-of-type-i-error-rate&quot;&gt;Evaluation of Type I Error Rate&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Simulation was used to generate data under the null hypothesis of no association between variants and disease status.&lt;/li&gt;
  &lt;li&gt;This process was repeated for 5000 replicates.&lt;/li&gt;
  &lt;li&gt;It was then evaluated whether or not each replicate had a p value ≤ 0.05.&lt;/li&gt;
  &lt;li&gt;The type I error rate was estimated by the proportion of replicates with a p value ≤ 0.05.&lt;/li&gt;
  &lt;li&gt;A type I error rate above 0.05 signifies an inflated false-positive rate, and conversely, a type I error rate below 0.05 indicates a conservative test.&lt;/li&gt;
&lt;/ul&gt;
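
&lt;p&gt;The procedure above can be illustrated with a small simulation. This is only a sketch under invented settings (sample sizes, carrier frequency, and 1000 rather than 5000 replicates) using the collapsing test with a one-degree-of-freedom chi-squared p value; it is not the authors’ simulation code.&lt;/p&gt;

```python
import math
import random


def chi2_p_2x2(a, b, c, d):
    """p value of the Pearson chi-squared test (1 df) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 1.0
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    return math.erfc(math.sqrt(stat / 2.0))


def type1_error_rate(n_reps=1000, n_cases=200, n_controls=200,
                     carrier_freq=0.05, alpha=0.05, seed=1):
    """Share of null replicates with a p value at or below alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _rep in range(n_reps):
        # Null hypothesis: same carrier frequency in cases and controls
        case_carriers = sum(
            1 for _i in range(n_cases) if carrier_freq > rng.random()
        )
        control_carriers = sum(
            1 for _i in range(n_controls) if carrier_freq > rng.random()
        )
        p_value = chi2_p_2x2(
            case_carriers, n_cases - case_carriers,
            control_carriers, n_controls - control_carriers,
        )
        if alpha >= p_value:
            rejections += 1
    return rejections / n_reps


rate = type1_error_rate()
print(rate)
```

&lt;p&gt;With a well-behaved test the printed rate should sit near 0.05, up to simulation noise.&lt;/p&gt;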

&lt;p&gt;Result:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Hotelling’s T-squared test and the collapsing method:
    &lt;ul&gt;
      &lt;li&gt;&lt;em&gt;well controlled and slightly conservative.&lt;/em&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Logistic regression for multiple-marker test and the likelihood-ratio test performed on the basis of an asymptotic chi-squared distribution:
    &lt;ul&gt;
      &lt;li&gt;&lt;em&gt;not well controlled.&lt;/em&gt;&lt;/li&gt;
      &lt;li&gt;&lt;em&gt;The inflation increases with decreasing allele frequencies.&lt;/em&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;CMC method, with multivariate Hotelling’s T-squared test or logistic regression:
    &lt;ul&gt;
      &lt;li&gt;&lt;em&gt;well controlled&lt;/em&gt;.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;problems-with-this-paper&quot;&gt;Problems with this paper&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;The main problem with the collapsing method is that an individual with more than one causal variant is only counted as “1”. This is more likely to occur for our protein-pathway collapse than for a single gene/locus collapse.&lt;/li&gt;
  &lt;li&gt;Reason 1: dosage is not counted. A homozygous carrier only counts for 1 rather than 2 (or some other weight).&lt;/li&gt;
  &lt;li&gt;Reason 2: someone with more than one variant is only counted as 1 (if it is in the same allele frequency group for collapse). This may be rare in a single gene but is more likely in a protein network collapse. Usually it would be OK particularly if the second variant is in LD with the first and therefore &lt;em&gt;should&lt;/em&gt; only be counted once.&lt;/li&gt;
  &lt;li&gt;CMC code - maf threshold: In the R code, those variants with minor allele frequency below the specified maf threshold are collapsed into a single super variant. Can we group data into several frequency groups or is one &lt;em&gt;above/below&lt;/em&gt; threshold only possible?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;madsen-and-browning-2009&quot;&gt;Madsen and Browning 2009&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;A groupwise association test for rare mutations using a weighted sum statistic
&lt;a class=&quot;citation&quot; href=&quot;#madsen2009groupwise&quot;&gt;(Madsen &amp;amp; Browning, 2009)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;background-1&quot;&gt;Background&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;CAST: Cohort allelic sums test - an existing grouping method in which the number of individuals with one or more mutations in a group (e.g. gene) is compared between affected and unaffected individuals.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;Cites three papers:
    &lt;ul&gt;
      &lt;li&gt;[Cohen JC, Kiss RS, Pertsemlidis A, Marcel YL, McPherson R, et al. (2004) Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Science 305: 869–872.]&lt;/li&gt;
      &lt;li&gt;[Li and Leal 2008]&lt;/li&gt;
      &lt;li&gt;[Morgenthaler S, Thilly WG (2007) A strategy to discover genes that carry multi- allelic or mono-allelic risk for common diseases: A cohort allelic sums test (CAST). Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis 615: 28–56.]&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;CMC: Combined Multivariate and Collapsing - the alternative method for grouping variants.
All rare variants are collapsed, 
as in the CAST method, 
and the collapsed variants are treated as a single common variant which is analysed together with the other common variants using multivariate analysis. 
In the CMC version used in [Li and Leal], 
rare variants are defined as those having a minor allele frequency (MAF) of at most 1%.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;In this study, they presume a group of candidate functional variants.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;Propose a weighted-sum method in which mutations are
    &lt;ul&gt;
      &lt;li&gt;grouped according to function (e.g. gene),&lt;/li&gt;
      &lt;li&gt;and each individual is scored by a weighted sum of the mutation counts.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;To test for an excess of mutations in affected individuals, they use permutation of disease status among affected and unaffected individuals.
    &lt;ul&gt;
      &lt;li&gt;Using permutation adjusts for the weighting of the mutations and the requirement that a mutation must be observed to be included in the study.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Note that permutation of disease status results in correct type I error even in the presence of LD [Cheverud 2001, Churchill 1994], 
although relatively low LD is expected between rare variants [Li and Leal 2008, Pritchard 2001,2001].&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;main-theory&quot;&gt;Main theory&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;The weighted-sum method differs from the CAST method [5,27] by
weighting the variants differently when determining the genetic load of an individual. 
By weighting the signals from each mutation, 
the weighted sum method accentuates mutations that
are rare in the unaffected individuals, 
so that the test is not completely dominated by common mutations.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;In the CAST method,
    &lt;ul&gt;
      &lt;li&gt;common variants have a high impact on the group signal,&lt;/li&gt;
      &lt;li&gt;and if many common mutations are present in a group, 
almost all individuals will have one or more mutations.&lt;/li&gt;
      &lt;li&gt;To avoid this effect a threshold on the mutation-frequencies may be used,
as suggested in the CMC method [Li and Leal 2008].&lt;/li&gt;
      &lt;li&gt;It may be difficult to select a biologically meaningful threshold, and the choice affects the test outcome.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;In the weighted-sum method,
    &lt;ul&gt;
      &lt;li&gt;mutations of all frequencies are included,&lt;/li&gt;
      &lt;li&gt;but mutations are weighted according to their frequency in the unaffected individuals.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;methods&quot;&gt;Methods&lt;/h3&gt;
&lt;h4 id=&quot;weighted-sum-method&quot;&gt;Weighted-Sum Method&lt;/h4&gt;
&lt;p&gt;Compares the number of mutations in a group of variants between cases and controls. 
Designed to identify an excess of mutations in cases, compared to controls. 
Each variant belongs to a group (gene, pathway, etc.) and, 
for a group with L variants, the method is comprised of the following steps:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;(A)
    &lt;ul&gt;
      &lt;li&gt;For each variant, choose ALT/minor allele&lt;/li&gt;
      &lt;li&gt;define number in cases/controls.&lt;/li&gt;
      &lt;li&gt;calculate the weight for a variant.&lt;/li&gt;
      &lt;li&gt;It is used to down-weight mutation counts in constructing the weighted-sum score; see (B) and (C) below&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Eqn 1: Weight
    &lt;ul&gt;
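      &lt;li&gt;\(q_i = \frac{m_i^U + 1}{2 n_i^U + 2}\), where \(m_i^U\) is the number of mutant alleles in controls and \(n_i^U\) is the number of controls.&lt;/li&gt;
      &lt;li&gt;\(w_i = \sqrt{n_i \, q_i (1 - q_i)}\), where \(n_i\) is the total number of samples.&lt;/li&gt;
      &lt;li&gt;For 10 cases (7 mut) and 10 controls (5 mut), reading the counts as mutant alleles,
\(q_i = 6/22 \approx 0.273\) and the weight for the variant is \(\sqrt{20 \times 0.273 \times 0.727} \approx 1.99\).&lt;/li&gt;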
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mutation freq of controls is used so that a true signal in cases is not deflated by using the total sample.&lt;/li&gt;
  &lt;li&gt;A drawback is the variance of the scaled mut-freq, hence a loss of power when the mutation freq is high.&lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Adding &lt;em&gt;one&lt;/em&gt; and &lt;em&gt;two&lt;/em&gt; to the numerator and denominator, respectively, avoids zero estimates, which matters in the next steps.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;(B)
    &lt;ul&gt;
      &lt;li&gt;Genetic score of each individual calculated.&lt;/li&gt;
      &lt;li&gt;\(\gamma_j = \sum_{i=1}^{L} I_{ij} / w_i\) - genetic score of individual j.&lt;/li&gt;
      &lt;li&gt;\(I_{ij}\) - number of mutations for variant i in individual j.&lt;/li&gt;
      &lt;li&gt;Generic model uses 0/1/2.&lt;/li&gt;
      &lt;li&gt;Recessive model only homozygous are assigned 1.&lt;/li&gt;
      &lt;li&gt;Dominant model both het and homo are assigned 1.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;(C)
    &lt;ul&gt;
      &lt;li&gt;All samples combined are ranked according to their genetic scores.&lt;/li&gt;
      &lt;li&gt;The sum of ranks for cases (x) is calculated.&lt;/li&gt;
      &lt;li&gt;The ranking procedure is equivalent to that in the Wilcoxon test; under the null hypothesis, the rank sum is approximately normally distributed.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;(D)
    &lt;ul&gt;
      &lt;li&gt;Case/control status is permuted&lt;/li&gt;
      &lt;li&gt;steps (A)-(C) are repeated k times to sample X1,…,Xk under the null hypothesis.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;(E)
    &lt;ul&gt;
      &lt;li&gt;The average \(\hat{\mu}\) and sample standard deviation \(\hat{\sigma}\) of X1,…,Xk are calculated and the standardized score-sum is found: \(z = (x - \hat{\mu}) / \hat{\sigma}\).&lt;/li&gt;
      &lt;li&gt;Under the null hypothesis, this has an approximately standard normal distribution.&lt;/li&gt;
      &lt;li&gt;Thus, a p-value for the association test can be obtained by comparing z to the quantiles of the standard normal distribution.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;An alternative p-value calculation is also provided using a standard permutation with a stopping rule (slightly slower for their power simulations). That also gives a uniform (0,1) distributed p-value, which might be useful for further analysis.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;Permutation of case/control labels maintains the LD structure.&lt;/li&gt;
&lt;/ul&gt;
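
&lt;p&gt;Steps (A) to (E) can be sketched as below. This is an illustrative reading of the steps listed above, not the authors’ implementation: the function names are hypothetical, ties are given average ranks (an assumption), the general 0/1/2 coding is used, and the p-value is one-sided for an excess of mutations in cases.&lt;/p&gt;

```python
import math
import random


def weights(genotypes, is_case):
    """Step (A): w_i = sqrt(n * q_i * (1 - q_i)), with q_i from controls."""
    n_total = len(genotypes)
    controls = [g for g, case in zip(genotypes, is_case) if not case]
    out = []
    for i in range(len(genotypes[0])):
        mutant_alleles = sum(person[i] for person in controls)
        q = (mutant_alleles + 1) / (2 * len(controls) + 2)
        out.append(math.sqrt(n_total * q * (1 - q)))
    return out


def rank_sum_cases(genotypes, is_case):
    """Steps (A)-(C): score individuals, rank them, sum the case ranks."""
    w = weights(genotypes, is_case)
    scores = [sum(g / wi for g, wi in zip(person, w)) for person in genotypes]
    order = sorted(range(len(scores)), key=lambda j: scores[j])
    ranks = [0.0] * len(scores)
    start = 0
    while len(order) > start:
        end = start
        while (len(order) > end + 1
               and scores[order[end + 1]] == scores[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1  # average rank shared by ties
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return sum(r for r, case in zip(ranks, is_case) if case)


def weighted_sum_test(genotypes, is_case, k=1000, seed=0):
    """Steps (D)-(E): permute labels k times and standardise the rank sum."""
    rng = random.Random(seed)
    x_obs = rank_sum_cases(genotypes, is_case)
    labels = list(is_case)
    null = []
    for _ in range(k):
        rng.shuffle(labels)
        null.append(rank_sum_cases(genotypes, labels))
    mean = sum(null) / k
    sd = math.sqrt(sum((x - mean) ** 2 for x in null) / (k - 1))
    z = (x_obs - mean) / sd
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper normal tail
    return z, p_value


# Toy data (invented): one variant, 10 cases (7 mut) and 10 controls (5 mut)
genotypes = [[1]] * 7 + [[0]] * 3 + [[1]] * 5 + [[0]] * 5
is_case = [True] * 10 + [False] * 10
w = weights(genotypes, is_case)
x = rank_sum_cases(genotypes, is_case)
z, p_value = weighted_sum_test(genotypes, is_case, k=200)
print(w, x, z, p_value)
```

&lt;p&gt;Because the weights are recomputed inside every permutation, the null distribution accounts for the weighting itself, as noted in the Background section.&lt;/p&gt;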

&lt;h4 id=&quot;power-simulations&quot;&gt;Power simulations&lt;/h4&gt;
&lt;p&gt;Comparison of:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Weighted-sum method,&lt;/li&gt;
  &lt;li&gt;CAST,&lt;/li&gt;
  &lt;li&gt;CMC,&lt;/li&gt;
  &lt;li&gt;and variant-by-variant methods,&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For each set of parameters,&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;100 datasets are simulated,&lt;/li&gt;
  &lt;li&gt;the four methods are applied,&lt;/li&gt;
  &lt;li&gt;and the proportions of significant outcomes used as the power estimates.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To mimic a genome wide study of about 20,000 fairly independent human genes, 
we calculate a p-value for each gene, 
and use a significance threshold of 0.05/20000 = \(2.5 \times 10^{-6}\) in all power simulations.&lt;/p&gt;

&lt;p&gt;Genetic Models:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Recessive, Additive and Dominant models:
    &lt;ul&gt;
      &lt;li&gt;the disease-related variants act independently,&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Recessive-Set model:
    &lt;ul&gt;
      &lt;li&gt;outcome of a mutation at one variant depends on the presence of a mutation at another variant.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Frequency spectra:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Wright’s formula used to sample the unaffected population frequency spectrum of the mutations at each variant
for the Recessive, Additive and Dominant models. 
e.g. &lt;em&gt;The mutation probability is calculated such that the probability (pM) that a haplotype contains at least one disease-risk mutation is fixed in unaffected individuals&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A method for sampling is provided and ignored in this summary.&lt;/p&gt;

&lt;p&gt;A method is provided to compare models with damaging variants and benign variants.&lt;/p&gt;

&lt;p&gt;NB. Tested Variants.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The mutation probabilities (p) can be very low for some of the sampled variants.
This means that some variants contain no mutations in any of the sampled individuals, and these variants are hence omitted in the tests.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;encode-data&quot;&gt;ENCODE data&lt;/h3&gt;

&lt;p&gt;The method was tested on rare variants with the frequency-spectrum of a naturally occurring population using the ENCODE resequencing data.&lt;/p&gt;

&lt;h3 id=&quot;results&quot;&gt;Results&lt;/h3&gt;

&lt;h4 id=&quot;proportion-of-variants-containing-mutations&quot;&gt;Proportion of Variants Containing Mutations&lt;/h4&gt;
&lt;p&gt;The mutation frequencies are sampled according to Wright’s formula, 
and hence mutations are very rare for some variants. 
Using 1000 affected and 1000 unaffected individuals, 
mutations are on average observed at only 49.4% of the variants (sd: 4.9%). 
This level is in concordance with the level from human resequencing studies.&lt;/p&gt;

&lt;p&gt;Skipping the section: Power under Varying Model Parameters.&lt;/p&gt;

&lt;h3 id=&quot;discussion&quot;&gt;Discussion&lt;/h3&gt;

&lt;p&gt;Analysis of pathways can be done in two different ways. 
One way is to use the pathway as a group, and run the test on the entire pathway. 
On the other hand, for large pathways, it may be beneficial to use a method that allows a gene with a strong signal to have a high impact on the combined pathway test-statistic (T). 
If a pathway contains G non-overlapping genes, a method to do this is to use the weighted-sum method on each gene, and combine the resulting p-values (p1,…,pG) with the Fisher product test statistic.&lt;/p&gt;
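
&lt;p&gt;A short sketch of the Fisher product combination described here, assuming independent per-gene p-values. The statistic \(T = -2 \sum_g \ln p_g\) is compared with a chi-squared distribution with 2G degrees of freedom, whose upper tail has a closed form for even degrees of freedom. The function name is hypothetical.&lt;/p&gt;

```python
import math


def fisher_combined_p(p_values):
    """Fisher product method for G independent p-values."""
    g = len(p_values)
    t = -2.0 * sum(math.log(p) for p in p_values)
    half = t / 2.0
    # Chi-squared upper tail with 2G df: exp(-t/2) * sum_{k=0}^{G-1} (t/2)^k / k!
    term, total = 1.0, 1.0
    for k in range(1, g):
        term *= half / k
        total += term
    return t, math.exp(-half) * total


# Invented per-gene p-values for a two-gene pathway
t_stat, p_pathway = fisher_combined_p([0.01, 0.20])
print(t_stat, p_pathway)
```

&lt;p&gt;A single gene with a very small p-value can drive the combined statistic, which is the property the paragraph above asks for in large pathways.&lt;/p&gt;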

&lt;h3 id=&quot;summary&quot;&gt;Summary&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Weight based on frequency of variant in the controls.&lt;/li&gt;
  &lt;li&gt;Sum of ranks in cases.&lt;/li&gt;
  &lt;li&gt;Case/control labels permuted n times, mean, SD, standardised score-sum found (z).&lt;/li&gt;
  &lt;li&gt;p-value for association test obtained by comparing z to the quantiles of the standard normal distribution.&lt;/li&gt;
  &lt;li&gt;Alternative p-value using a standard permutation with a stopping rule gives uniform (0,1) distributed p-value, which might be useful for further analysis.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;price-2010&quot;&gt;Price 2010&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Pooled association tests for rare variants in exon-resequencing studies.
&lt;a class=&quot;citation&quot; href=&quot;#price2010pooled&quot;&gt;(Price et al., 2010)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;The approach is based on the regression of
    &lt;ul&gt;
      &lt;li&gt;phenotypic values&lt;/li&gt;
      &lt;li&gt;on individuals’ genotype scores&lt;/li&gt;
      &lt;li&gt;subject to a variable allele-frequency threshold,&lt;/li&gt;
      &lt;li&gt;incorporating computational predictions of the functional effects of missense variants.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Statistical significance is assessed by permutation testing with variable thresholds.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;methods-1&quot;&gt;Methods&lt;/h3&gt;

&lt;h4 id=&quot;simulation-framework&quot;&gt;Simulation Framework&lt;/h4&gt;
&lt;p&gt;Simplified summary:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Model of European ancestry to recapitulate the site-frequency spectrum of non-synonymous human SNPs.&lt;/li&gt;
  &lt;li&gt;9kb for 10K individuals.&lt;/li&gt;
  &lt;li&gt;Missense only.&lt;/li&gt;
  &lt;li&gt;Phenotypes
    &lt;ul&gt;
      &lt;li&gt;quantitative normal distribution&lt;/li&gt;
      &lt;li&gt;carrier and non-carrier same variance&lt;/li&gt;
      &lt;li&gt;carriers shifted by \(\delta\) SD&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;weighted-approaches-correspond-to-implicit-assumptions-about-log-odds-ratios&quot;&gt;Weighted Approaches Correspond to Implicit Assumptions about Log Odds Ratios&lt;/h4&gt;
&lt;p&gt;(missing reference)&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Based on the frequency in controls.&lt;/li&gt;
  &lt;li&gt;Relationship between log odds ratio and allele frequency is the same as in Madsen 2009.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;fixed-threshold-approach&quot;&gt;Fixed-Threshold Approach&lt;/h4&gt;
&lt;p&gt;(missing reference)&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;i indexes SNPs,&lt;/li&gt;
  &lt;li&gt;Ci is the reference allele count of SNP i in cases,&lt;/li&gt;
  &lt;li&gt;and xi (\(\xi_i\)) is equal to 1 if the frequency of SNP i is below a specified threshold (1% or 5%) and is equal to 0 otherwise.&lt;/li&gt;
  &lt;li&gt;They generalise this for quantitative phenotypes.
    &lt;ul&gt;
      &lt;li&gt;add the term: pj (\(\pi_j\)) - the phenotype of sample j.&lt;/li&gt;
      &lt;li&gt;I need to compare these two papers here in more detail (genetic score section). (missing reference)&lt;/li&gt;
      &lt;li&gt;Generic model uses 0/1/2.&lt;/li&gt;
      &lt;li&gt;Recessive model only homozygous are assigned 1.&lt;/li&gt;
      &lt;li&gt;Dominant model both het and homo are assigned 1.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;weighted-approach&quot;&gt;Weighted Approach&lt;/h4&gt;
&lt;p&gt;(missing reference)&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Also generalise by adding \(\pi_j\) for a quantitative phenotype value.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;variable-threshold-approach&quot;&gt;Variable-Threshold Approach&lt;/h4&gt;
&lt;ul&gt;
  &lt;li&gt;There exists some (unknown) threshold T for which variants with a minor allele frequency (MAF) below T are substantially more likely to be functional than are variants with an MAF above T.&lt;/li&gt;
  &lt;li&gt;z-score of a regression across samples of phenotypes versus counts of mutations meeting the allele-frequency threshold T.&lt;/li&gt;
&lt;/ul&gt;
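
&lt;p&gt;A rough sketch of the variable-threshold idea with permutation testing. The score used here (a correlation-based z between per-sample mutation counts and phenotype) is a simplified stand-in and not the exact statistic from the paper; the function names, toy data, and the one-sided maximum over thresholds are assumptions.&lt;/p&gt;

```python
import math
import random


def burden(genotypes, mafs, threshold):
    """Per-sample mutation count over variants with MAF up to the threshold."""
    return [
        sum(g for g, maf in zip(person, mafs) if threshold >= maf)
        for person in genotypes
    ]


def z_score(counts, phenotypes):
    """Simplified regression score: sqrt(n - 1) times the correlation."""
    n = len(counts)
    mean_c = sum(counts) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((c - mean_c) ** 2 for c in counts)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((c - mean_c) * (y - mean_y) for c, y in zip(counts, phenotypes))
    return sxy / math.sqrt(sxx * syy / (n - 1))


def variable_threshold_test(genotypes, mafs, phenotypes, n_perm=1000, seed=0):
    """Maximise z over candidate thresholds; assess it by permutation."""
    thresholds = sorted(set(mafs))
    counts = [burden(genotypes, mafs, t) for t in thresholds]

    def z_max(values):
        return max(z_score(c, values) for c in counts)

    observed = z_max(phenotypes)
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if z_max(shuffled) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Toy data (invented): 8 samples, 3 variants, quantitative phenotype
genotypes = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0],
             [0, 0, 0], [1, 0, 1], [0, 0, 0], [0, 0, 0]]
mafs = [0.005, 0.01, 0.04]
phenotypes = [2.1, 1.8, 0.2, -0.3, 0.1, 2.5, -0.2, 0.0]
z_obs, p_value = variable_threshold_test(genotypes, mafs, phenotypes)
print(z_obs, p_value)
```

&lt;p&gt;Taking the maximum inside each permutation is what keeps the test valid even though the threshold is chosen from the data.&lt;/p&gt;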

&lt;h4 id=&quot;cheating-approach-to-incorporating-varphip-phip&quot;&gt;Cheating Approach to Incorporating \(\varphi(p)\) (“phi(p)”)&lt;/h4&gt;
&lt;ul&gt;
  &lt;li&gt;Weight variants according to the probability \(\varphi(p)\)
    &lt;ul&gt;
      &lt;li&gt;that an allele of frequency p is functional,&lt;/li&gt;
      &lt;li&gt;as inferred by using the same simulated data used to evaluate power.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;incorporation-of-computational-predictions-of-functional-effects&quot;&gt;Incorporation of Computational Predictions of Functional Effects&lt;/h4&gt;
&lt;ul&gt;
  &lt;li&gt;Tested if incorporation of PolyPhen-2 scores improves statistical test.&lt;/li&gt;
  &lt;li&gt;To assess this, they simulated PolyPhen-2 predictions of damaging and neutral mutations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;application-to-empirical-data-sets&quot;&gt;Application to Empirical Data Sets&lt;/h4&gt;
&lt;p&gt;Applied to some real data.&lt;/p&gt;

&lt;h3 id=&quot;results-and-discussion&quot;&gt;Results and Discussion&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Skipping this.&lt;/li&gt;
  &lt;li&gt;The methods are an improvement.
(missing reference)&lt;/li&gt;
  &lt;li&gt;and additionally add a method for the quantitative variable.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;neale-2011&quot;&gt;Neale 2011&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Testing for an unusual distribution of rare variants.
&lt;a class=&quot;citation&quot; href=&quot;#neale2011testing&quot;&gt;(Neale et al., 2011)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;c-alpha-tests&quot;&gt;C-alpha tests&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;They propose the C-alpha test statistic as a novel approach for testing for
    &lt;ul&gt;
      &lt;li&gt;the presence of a mixture of effects (risk and protective) across a set of rare variants.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Unlike existing burden tests, C-alpha, by testing the variance rather than the mean, maintains consistent power when the target set contains both risk and protective variants.&lt;/li&gt;
&lt;/ul&gt;
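
&lt;p&gt;As a hedged illustration of testing the variance rather than the mean, the sketch below computes an unnormalised C-alpha style statistic, \(T = \sum_i [(y_i - n_i p_0)^2 - n_i p_0 (1 - p_0)]\), where \(y_i\) is the number of copies of variant i seen in cases, \(n_i\) the total number of copies, and \(p_0\) the proportion of cases. This form is written from memory of the paper rather than from the notes above, and the variance normalisation and handling of singletons are left out.&lt;/p&gt;

```python
def c_alpha_statistic(case_counts, total_counts, p0):
    """Sum over variants of the excess binomial variance in case counts."""
    return sum(
        (y - n * p0) ** 2 - n * p0 * (1 - p0)
        for y, n in zip(case_counts, total_counts)
    )


# Toy data (invented): one risk-like variant, one protective-like, one neutral
case_counts = [4, 0, 1]
total_counts = [4, 4, 2]
t_stat = c_alpha_statistic(case_counts, total_counts, p0=0.5)
print(t_stat)
```

&lt;p&gt;In the toy data, cases carry 5 of the 10 copies overall, so a mean-based burden comparison sees no excess, while the statistic is clearly positive because the copies are unevenly spread across variants.&lt;/p&gt;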

&lt;h2 id=&quot;skat&quot;&gt;SKAT&lt;/h2&gt;

&lt;p&gt;Rare-Variant Association Testing for Sequencing Data with the Sequence Kernel Association Test
(missing reference)&lt;/p&gt;

&lt;h3 id=&quot;introduction-1&quot;&gt;Introduction&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;They propose the sequence kernel association test (SKAT),
    &lt;ul&gt;
      &lt;li&gt;regression method to test for&lt;/li&gt;
      &lt;li&gt;association between genetic variants (common and rare)&lt;/li&gt;
      &lt;li&gt;in a region&lt;/li&gt;
      &lt;li&gt;and a continuous or dichotomous trait&lt;/li&gt;
      &lt;li&gt;while easily adjusting for covariates.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;As a score-based variance-component test, SKAT can
    &lt;ul&gt;
      &lt;li&gt;quickly calculate p values analytically by&lt;/li&gt;
      &lt;li&gt;fitting the null model containing only the covariates,&lt;/li&gt;
      &lt;li&gt;and so can easily be applied to genome-wide data.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;A limitation for previous burden tests is that they implicitly assume that 
all rare variants influence the phenotype in the same direction 
and with the same magnitude of effect (after incorporating known weights).&lt;/li&gt;
  &lt;li&gt;However, one would expect most variants (common or rare) within a sequenced region 
to have little or no effect on phenotype,&lt;/li&gt;
  &lt;li&gt;whereas some variants are protective and others deleterious,&lt;/li&gt;
  &lt;li&gt;and the magnitude of each variant’s effect is likely to vary (e.g., rarer variants might have larger effects).&lt;/li&gt;
  &lt;li&gt;Collapsing across all variants is likely to introduce substantial noise.&lt;/li&gt;
  &lt;li&gt;Burden tests require either specification of thresholds for collapsing&lt;/li&gt;
  &lt;li&gt;or the use of permutation to estimate the threshold.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;methods-2&quot;&gt;Methods&lt;/h3&gt;

&lt;h4 id=&quot;sequencing-kernel-association-test&quot;&gt;Sequencing Kernel Association Test&lt;/h4&gt;
&lt;p&gt;SKAT is a supervised test for the joint effects of multiple variants in a region on a phenotype.&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;For each region, SKAT analytically calculates a p value for association while adjusting for covariates.&lt;/li&gt;
  &lt;li&gt;Adjustments for multiple comparisons are necessary for analyzing multiple regions,&lt;/li&gt;
  &lt;li&gt;for example with the Bonferroni correction or FDR control.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Notation:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;n subjects&lt;/li&gt;
  &lt;li&gt;p variants&lt;/li&gt;
  &lt;li&gt;covariates&lt;/li&gt;
  &lt;li&gt;\(y_i\) phenotype for sample i&lt;/li&gt;
  &lt;li&gt;\(X_i\) covariates&lt;/li&gt;
  &lt;li&gt;\(G_i\) genotypes (0,1,2 general)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;skat-model-and-test-for-linear-snp-effects&quot;&gt;SKAT Model and Test for Linear SNP Effects&lt;/h4&gt;
&lt;p&gt;Eqn 1 and 2:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;linear regression and&lt;/li&gt;
  &lt;li&gt;logistic regression&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Eqn 3:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;variance-component score statistic&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;The weight is adjusted based on the MAF
    &lt;ul&gt;
      &lt;li&gt;if rarer variants are expected to be more likely to have larger effects.&lt;/li&gt;
      &lt;li&gt;still putting decent nonzero weights for variants with MAF 1%–5%.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;A special case of SKAT arises when the outcome is dichotomous, no covariates are included, and all \(W_j\) = 1.&lt;/li&gt;
  &lt;li&gt;Under these conditions, we show in Appendix A that the SKAT test statistic Q is equivalent to the C-alpha test statistic T.&lt;/li&gt;
&lt;/ul&gt;
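
&lt;p&gt;For reference, the variance-component score statistic can be written out as below. This is recalled from the SKAT paper rather than taken from the notes above, so the symbols should be checked against the original: \(\hat{\mu}\) is the fitted mean under the null model with covariates only, \(G\) is the n by p genotype matrix, and \(W\) is a diagonal matrix of variant weights \(w_j\).&lt;/p&gt;

&lt;p&gt;\[ Q = (y - \hat{\mu})' K (y - \hat{\mu}), \qquad K = G W G' \]&lt;/p&gt;

&lt;p&gt;\[ Q = \sum_{j=1}^{p} w_j S_j^2, \qquad S_j = \sum_{i=1}^{n} G_{ij} (y_i - \hat{\mu}_i) \]&lt;/p&gt;

&lt;p&gt;The MAF-based weighting mentioned above is, as far as I recall, \(\sqrt{w_j} = \mathrm{Beta}(\mathrm{MAF}_j; a_1, a_2)\) with suggested values \(a_1 = 1\) and \(a_2 = 25\), which up-weights rare variants while keeping nonzero weight at MAF 1%–5%. Setting all \(w_j = 1\) gives the unweighted linear kernel of the special case above.&lt;/p&gt;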

&lt;h4 id=&quot;relationship-between-linear-skat-and-individual-variant-test-statistics&quot;&gt;Relationship between Linear SKAT and Individual Variant Test Statistics&lt;/h4&gt;
&lt;ul&gt;
  &lt;li&gt;One needs to fit the null model only a single time to be able to compute the Sj for all individual variants j as well as all regions to be tested.&lt;/li&gt;
  &lt;li&gt;Similarly, if multiple regions are under consideration, then the same \(\hat{\mu}_0\) can be used to compute the SKAT Q statistics for each region.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;accommodating-epistatic-effects-and-prior-information-under-the-skat&quot;&gt;Accommodating Epistatic Effects and Prior Information under the SKAT&lt;/h4&gt;
&lt;p&gt;Ability to model the epistatic effects of sequence variants on the phenotype within the flexible kernel machine regression framework.&lt;/p&gt;

&lt;p&gt;They replace \(G_i'\beta\) by a more flexible function \(f(G_i)\) in the linear and logistic models (1) and (2), where \(f(G_i)\) allows for rare-variant by rare-variant and common-variant by rare-variant interactions.&lt;/p&gt;

&lt;h3 id=&quot;skat-is-a-generalization-of-the-c-alpha-test&quot;&gt;SKAT Is a Generalization of the C-Alpha Test&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;The recently proposed C-alpha test has advantages over burden tests in that
    &lt;ul&gt;
      &lt;li&gt;it explicitly models the possibility that minor alleles can be deleterious or protective.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;However, it does not currently allow for the analysis of quantitative outcomes or the inclusion of covariates, and p value calculation requires permutation.&lt;/li&gt;
  &lt;li&gt;They demonstrate that for a dichotomous trait in the absence of covariates,
    &lt;ul&gt;
      &lt;li&gt;the C-alpha test statistic is equivalent to the SKAT statistic with unweighted linear kernel,&lt;/li&gt;
      &lt;li&gt;which is the same as the kernel machine test in Wu et al.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;lee-2012&quot;&gt;Lee 2012&lt;/h2&gt;
&lt;h3 id=&quot;introduction-2&quot;&gt;Introduction&lt;/h3&gt;
&lt;p&gt;SKAT-O&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Optimal tests for rare variant effects in sequencing association studies.
&lt;a class=&quot;citation&quot; href=&quot;#Lee2012Optimal&quot;&gt;(Lee et al., 2012)&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;They propose a class of tests that includes burden tests and SKAT as special cases, and derive an optimal test within this class that maximizes power.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;methods-3&quot;&gt;Methods&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;They show the SKAT method&lt;/li&gt;
  &lt;li&gt;They show the burden testing methods&lt;/li&gt;
  &lt;li&gt;They show a method that combines the two:
    &lt;ul&gt;
      &lt;li&gt;\(Q_\rho = (1 - \rho)\,Q_{SKAT} + \rho\,Q_{burden}\)&lt;/li&gt;
      &lt;li&gt;The resulting optimal test corresponds to a best linear combination of SKAT and burden tests that maximizes power.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;
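
&lt;p&gt;As a rough sketch of how the pieces fit together (my own summary rather than a quote from the paper), write the mixing weight as \(\rho\), the per-variant score as \(S_j = \sum_i G_{ij}(y_i - \hat{\mu}_i)\), and the variant weights as \(w_j\). As far as I can tell the weights enter squared in Lee et al., whereas Wu et al. write the same quantity with \(w_j\) in place of \(w_j^2\):&lt;/p&gt;

&lt;p&gt;\[ Q_{SKAT} = \sum_{j=1}^{p} w_j^2 S_j^2, \qquad Q_{burden} = \Big( \sum_{j=1}^{p} w_j S_j \Big)^2, \qquad Q_\rho = (1 - \rho)\,Q_{SKAT} + \rho\,Q_{burden}, \quad 0 \le \rho \le 1 \]&lt;/p&gt;

&lt;p&gt;Setting \(\rho = 0\) recovers SKAT and \(\rho = 1\) recovers the burden test. Since the best \(\rho\) is unknown in practice, SKAT-O evaluates \(Q_\rho\) over a grid of \(\rho\) values and takes the smallest p value as its test statistic, with the final p value accounting for that search.&lt;/p&gt;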

&lt;h2 id=&quot;protein-pathway-analysis&quot;&gt;Protein pathway analysis&lt;/h2&gt;
&lt;h3 id=&quot;zhang-2021-ashg&quot;&gt;Zhang 2021 AJHG&lt;/h3&gt;

&lt;p&gt;Zhang et al. provide an excellent example of the same problem that we are
assessing. 
&lt;a class=&quot;citation&quot; href=&quot;#zhang2021computational&quot;&gt;(Zhang et al., 2021)&lt;/a&gt;
&lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S0002929721001543?dgcid=author&quot;&gt;A computational approach for detecting physiological homogeneity in the midst of genetic heterogeneity&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I have never had a paper so similar to a project that we are currently working on.
Basically, the only differences are:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;the cluster size method (theirs might work better, using total edge weights instead of protein count),&lt;/li&gt;
  &lt;li&gt;cluster separation by NHC instead of MCL,&lt;/li&gt;
  &lt;li&gt;using KEGG+REACTOME to additionally define the description of the cluster (but this can be done at the end),&lt;/li&gt;
  &lt;li&gt;the method in the main analysis is carrier count instead of SKAT style (we have compared both).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Same cohort size. Really nice to see it in AJHG. We will have to do a comparison to per-gene SKAT-O to show the improvement, as they do.&lt;/p&gt;

&lt;h3 id=&quot;problems-with-this-paper-1&quot;&gt;Problems with this paper&lt;/h3&gt;

&lt;h4 id=&quot;cluster-method&quot;&gt;Cluster method&lt;/h4&gt;
&lt;p&gt;I was surprised to see them apply their own novel clustering method (NHC) without some formal methods paper first - so many already exist, including 10 that can be used with their STRING data. 
In theory it sounds good but there is no real evidence.&lt;/p&gt;

&lt;h4 id=&quot;carrier-counts-instead-of-allele-dosage&quot;&gt;Carrier counts instead of allele dosage&lt;/h4&gt;
&lt;p&gt;Their main analysis is:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;(a) with PC-adj = glm(Pheno ~ Carrier-status + PCs, family=binomial).
    &lt;ul&gt;
      &lt;li&gt;How carrier status is coded is not stated; presumably it is 0/1, resulting in a sort of burden test.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;(b) without PC-adj = Fisher’s exact test.
    &lt;ul&gt;
      &lt;li&gt;When doing this we got great results in our earlier tests but thought it might be too simplistic.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Using pathway SKAT-O (accounts for frequency/dosage) they got poorer results.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Main problems here:&lt;/p&gt;

&lt;p&gt;One count per gene (as data$CARRIER in analysis) presumes that any variant (homozygous or heterozygous, compound het, etc.) is deleterious in the individual.
If restricted to LoF then this is OK, but missense variants are included in the study (and will usually be the majority of candidates).
Filters are applied to get a high quality list, but there will still be a majority of missense VUS.&lt;/p&gt;
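
&lt;p&gt;To spell out the difference, here is a minimal sketch in my own notation (not taken from the paper), where \(G_{ij} \in \{0, 1, 2\}\) is the dosage of qualifying variant \(j\) in individual \(i\) within a gene, \(w_j\) is an optional variant weight, and \(I(\cdot)\) is the indicator function:&lt;/p&gt;

&lt;p&gt;\[ C_i = I\Big( \sum_{j} G_{ij} \ge 1 \Big), \qquad B_i = \sum_{j} w_j\, G_{ij} \]&lt;/p&gt;

&lt;p&gt;The carrier indicator \(C_i\) is what I assume enters the glm as data$CARRIER, while \(B_i\) is the dosage-aware score used by a weighted burden test. For a hypothetical individual with one homozygous and one heterozygous qualifying variant in the same gene (and all \(w_j = 1\)), \(B_i = 3\) but \(C_i = 1\), the same value as for someone carrying a single heterozygous variant.&lt;/p&gt;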

&lt;p&gt;The section “Application to an HSE cohort (V): 
Comparison with a burden test on the HSE cohort” refers to SKAT-O as a burden test, which it is not; 
their own method is actually closer to a burden test, since they label an individual as a carrier if they have a qualifying variant in a gene. 
This is just a terminology error, but it suggests that the key differences between pathway testing methods may not have been considered, as written up here in our review.&lt;/p&gt;

&lt;p&gt;One could argue that only counting once per gene could help prevent problems due to LD.
Although, the WES/WGS data would have the phasing information included to prune out LD if that was the argument.
Basically, I think this is just a simplification rather than an advantage. 
If SKAT had happened to work better they would have used those results.&lt;/p&gt;

&lt;p&gt;For example, in Fig 3, TLR3 variants = 7 cases, 1 Hom, 9 Het.
This seems to amount to a carrier count of 7 in their dataset for the glm and Fisher’s exact test.&lt;/p&gt;

&lt;p&gt;Their main analysis:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# call R command to run pc-adj enrichment&lt;/span&gt;
ro.r&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;data &amp;lt;- read.table(&apos;temp_pc.txt&apos;, header=T, sep=&apos;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\t&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&apos;)&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
ro.r&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;fit &amp;lt;- glm(data&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$PHENOTYPE&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; ~ data&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$CARRIER&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;+data&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$PC1&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;+data&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$PC2&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;+data&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$PC3&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;, family=&apos;binomial&apos;)&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
ro.r&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;adjusted.pval &amp;lt;- anova(fit, test=&apos;LRT&apos;)[2, 5]&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
r_pvalue &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; ro.r&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;adjusted.pval&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
pvalue &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; r_pvalue[0]
pvalue &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; float&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;%.3E&apos;&lt;/span&gt; % Decimal&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;pvalue&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt;
file_out.write&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;str&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;pvalue&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; + &lt;span class=&quot;s1&quot;&gt;&apos;\t&apos;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# pathway enrichment&lt;/span&gt;
this_case_pathway_hit &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; dict&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;each_pathway &lt;span class=&quot;k&quot;&gt;in &lt;/span&gt;pathway_gene_set_dict.keys&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;:
	pathway_gene_set &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; pathway_gene_set_dict[each_pathway]
	case_pathway_overlap &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; this_case_gene_set &amp;amp; pathway_gene_set
	case_in &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; len&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;case_pathway_overlap&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
	case_out &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; len&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;this_case_gene_set&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; - case_in
	pathway_in &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; len&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;pathway_gene_set&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
	pathway_out &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; len&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;pathway_genes&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; - pathway_in
	&lt;span class=&quot;k&quot;&gt;if &lt;/span&gt;case_in &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; 0:
		odd, pvalue &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; stats.fisher_exact&lt;span class=&quot;o&quot;&gt;([[&lt;/span&gt;case_in, case_out], 
		&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;pathway_in, pathway_out]], &lt;span class=&quot;nv&quot;&gt;alternative&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;two-sided&apos;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
		adj_pvalue &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; pvalue &lt;span class=&quot;k&quot;&gt;*&lt;/span&gt; pathway_size
		&lt;span class=&quot;k&quot;&gt;if &lt;/span&gt;adj_pvalue &amp;lt; 0.00001:
			adj_pvalue &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; float&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;%.3E&apos;&lt;/span&gt; % Decimal&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;adj_pvalue&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt;
			this_case_pathway_hit[each_pathway] &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; adj_pvalue

&lt;span class=&quot;c&quot;&gt;# They do not show any code with the SKAT parameters, presumably defaults using R.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;itan--casanova-contributions&quot;&gt;Itan &amp;amp; Casanova contributions&lt;/h2&gt;
&lt;h3 id=&quot;historical-review&quot;&gt;Historical review&lt;/h3&gt;

&lt;p&gt;We are tackling this problem of protein pathway analysis from the viewpoint of 
rare immune disease and infection.
Historically, several topics in bioinformatic and functional analysis have 
been required before we could achieve our current position of 
statistically-robust genetic discovery for rare disease:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Candidate variant selection for individual genomes&lt;/li&gt;
  &lt;li&gt;Compiling reliable cohorts of patients with shared phenotypes&lt;/li&gt;
  &lt;li&gt;Protein-protein interactions&lt;/li&gt;
  &lt;li&gt;Variant collapse&lt;/li&gt;
  &lt;li&gt;Protein pathway analysis&lt;/li&gt;
  &lt;li&gt;Functional validation&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Therefore, a historical review of the timeline is beneficial to illustrate the 
technical successes that allow us to reliably produce candidate variants by
genome sequencing and to validate statistically-driven results by &lt;em&gt;“traditional”&lt;/em&gt;
functional validation. 
These steps [1, 2 and 6 in our list] are exemplified by the following 
historical review. 
The complete list of steps 1-6 is touched on, 
but full validation of each step is the culmination of what we are currently 
working on and will be explicitly reviewed when we have completed our study.&lt;/p&gt;

&lt;p&gt;As one of the leaders in this field, the Casanova lab has provided a lot of insider
history to the story in a great Twitter thread
&lt;a href=&quot;https://twitter.com/casanova_lab/status/1397539593608695808&quot;&gt;https://twitter.com/casanova_lab/status/1397539593608695808&lt;/a&gt;.
The literature for discussion is first listed here to facilitate downloading 
but sources are referenced as usual throughout.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;2013 PNAS. The human gene connectome as a map of short cuts for morbid allele discovery.
&lt;a class=&quot;citation&quot; href=&quot;#itan2013human&quot;&gt;(Itan et al., 2013)&lt;/a&gt;
&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/23509278/&quot;&gt;https://pubmed.ncbi.nlm.nih.gov/23509278/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2014 BMC Gen. HGCS: an online tool for prioritizing disease-causing gene variants by biological distance.
&lt;a class=&quot;citation&quot; href=&quot;#itan2014hgcs&quot;&gt;(Itan et al., 2014)&lt;/a&gt;
&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/24694260/&quot;&gt;https://pubmed.ncbi.nlm.nih.gov/24694260/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2015 Front. Novel primary immunodeficiency candidate genes predicted by the human gene connectome.
&lt;a class=&quot;citation&quot; href=&quot;#itan2015novel&quot;&gt;(Itan &amp;amp; Casanova, 2015)&lt;/a&gt;
&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/25883595/&quot;&gt;https://pubmed.ncbi.nlm.nih.gov/25883595/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2015 PNAS. The human gene damage index as a gene-level approach to prioritizing exome variants.
&lt;a class=&quot;citation&quot; href=&quot;#itan2015human&quot;&gt;(Itan et al., 2015)&lt;/a&gt;
&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/26483451/&quot;&gt;https://pubmed.ncbi.nlm.nih.gov/26483451/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2016 NatMet. The mutation significance cutoff: gene-level thresholds for variant predictions.
&lt;a class=&quot;citation&quot; href=&quot;#itan2016mutation&quot;&gt;(Itan et al., 2016)&lt;/a&gt;
&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/26820543/&quot;&gt;https://pubmed.ncbi.nlm.nih.gov/26820543/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2015 PNAS. Can the impact of human genetic variations be predicted?
&lt;a class=&quot;citation&quot; href=&quot;#itan2015can&quot;&gt;(Itan &amp;amp; Casanova, 2015)&lt;/a&gt;
&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/26351682/&quot;&gt;https://pubmed.ncbi.nlm.nih.gov/26351682/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2018 Bioinf. PopViz: a webserver for visualizing minor allele frequencies and damage prediction scores of human genetic variations.
&lt;a class=&quot;citation&quot; href=&quot;#zhang2018popviz&quot;&gt;(Zhang et al., 2018)&lt;/a&gt;
&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/30535305/&quot;&gt;https://pubmed.ncbi.nlm.nih.gov/30535305/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2019 PNAS. Blacklisting variants common in private cohorts but not in public databases optimizes human exome analysis.
&lt;a class=&quot;citation&quot; href=&quot;#maffucci2019blacklisting&quot;&gt;(Maffucci et al., 2019)&lt;/a&gt;
&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/30591557/&quot;&gt;https://pubmed.ncbi.nlm.nih.gov/30591557/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2019 NAR. SeqTailor: a user-friendly webserver for the extraction of DNA or protein sequences from next-generation sequencing data.
&lt;a class=&quot;citation&quot; href=&quot;#zhang2019seqtailor&quot;&gt;(Zhang et al., 2019)&lt;/a&gt;
&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/31045209/&quot;&gt;https://pubmed.ncbi.nlm.nih.gov/31045209/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2020 Hum Gen. The human genetic determinism of life-threatening infectious diseases: genetic heterogeneity and physiological homogeneity?
&lt;a class=&quot;citation&quot; href=&quot;#casanova2020human&quot;&gt;(Casanova &amp;amp; Abel, 2020)&lt;/a&gt;
&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/32462426/&quot;&gt;https://pubmed.ncbi.nlm.nih.gov/32462426/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2021 JCI. Herpes simplex encephalitis in a patient with a distinctive form of inherited IFNAR1 deficiency.
&lt;a class=&quot;citation&quot; href=&quot;#bastard2021herpes&quot;&gt;(Bastard et al., 2021)&lt;/a&gt;
&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/32960813/&quot;&gt;https://pubmed.ncbi.nlm.nih.gov/32960813/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2021 JCI. TLR3 controls constitutive IFN-β antiviral immunity in human fibroblasts and cortical neurons.
&lt;a class=&quot;citation&quot; href=&quot;#gao2021tlr3&quot;&gt;(Gao et al., 2021)&lt;/a&gt;
&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/33393505/&quot;&gt;https://pubmed.ncbi.nlm.nih.gov/33393505/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2021 AJHG. A computational approach for detecting physiological homogeneity in the midst of genetic heterogeneity
&lt;a class=&quot;citation&quot; href=&quot;#zhang2021computational&quot;&gt;(Zhang et al., 2021)&lt;/a&gt;
&lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/34015270/&quot;&gt;https://pubmed.ncbi.nlm.nih.gov/34015270/&lt;/a&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;To date, the main paper that implements protein pathway analysis for rare 
immune disease is that by
Peng Zhang and 
Yuval Itan &lt;a class=&quot;citation&quot; href=&quot;#zhang2021computational&quot;&gt;(Zhang et al., 2021)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The history begins with determining methods for candidate variant selection - 
the main challenge in human genomics, 
especially for individual patients who can benefit from precision medicine. 
Over the last decade, we have reached a point where we can now reasonably 
discern individual candidate-causal variants from the background noise of 
genomic variability.&lt;/p&gt;

&lt;p&gt;From the authors’ perspective, the project began around 2011 and its first 
step was concluded in 2013 with 
Yuval Itan’s
first “Human Gene Connectome” paper 
while he was a post-doc with the Casanova lab
&lt;a class=&quot;citation&quot; href=&quot;#itan2013human&quot;&gt;(Itan et al., 2013)&lt;/a&gt;.
This software connected genes like streets in a map, 
based on their physiological relatedness.
It was soon followed by methodological development 
&lt;a class=&quot;citation&quot; href=&quot;#itan2014hgcs&quot;&gt;(Itan et al., 2014)&lt;/a&gt;
and application to inborn errors of immunity 
&lt;a class=&quot;citation&quot; href=&quot;#itan2015novel&quot;&gt;(Itan &amp;amp; Casanova, 2015)&lt;/a&gt;
or both 
&lt;a class=&quot;citation&quot; href=&quot;#itan2015human&quot;&gt;(Itan et al., 2015)&lt;/a&gt;,
and a couple of necessary detours 
&lt;a class=&quot;citation&quot; href=&quot;#itan2016mutation&quot;&gt;(Itan et al., 2016)&lt;/a&gt;
and 
&lt;a class=&quot;citation&quot; href=&quot;#maffucci2019blacklisting&quot;&gt;(Maffucci et al., 2019)&lt;/a&gt;.
A review was also written by two of the main authors during the same period
&lt;a class=&quot;citation&quot; href=&quot;#itan2015can&quot;&gt;(Itan &amp;amp; Casanova, 2015)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;When Peng Zhang joined the Casanova lab as a post-doc, 
Yuval Itan had started his own lab. 
However, the pair worked together to continue producing papers on variant
interpretation and data processing
&lt;a class=&quot;citation&quot; href=&quot;#zhang2018popviz&quot;&gt;(Zhang et al., 2018)&lt;/a&gt;
and
&lt;a class=&quot;citation&quot; href=&quot;#zhang2019seqtailor&quot;&gt;(Zhang et al., 2019)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;After completing this period of work, 
they renamed “Human Gene Connectome II” to 
“Network-based Heterogeneity Clustering” (NHC).
At this point, their aims were defined as being generally indistinguishable 
from ours. 
That is, &lt;em&gt;“the detection of physiological homogeneity in a cohort of patients 
sharing a clinical phenotype but with high genetic heterogeneity - 
a hallmark of severe infectious diseases”&lt;/em&gt; (Casanova via Twitter), 
as presented in their next paper on this topic
&lt;a class=&quot;citation&quot; href=&quot;#casanova2020human&quot;&gt;(Casanova &amp;amp; Abel, 2020)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Shen-Ying Zhang came on board as senior author on the next two papers.
With an excellent database of immune disorders and infections, 
the team could gradually build their software.
Exomes from patients with HSV-1 encephalitis were used for testing successive 
versions in
&lt;a class=&quot;citation&quot; href=&quot;#bastard2021herpes&quot;&gt;(Bastard et al., 2021)&lt;/a&gt; 
and 
&lt;a class=&quot;citation&quot; href=&quot;#gao2021tlr3&quot;&gt;(Gao et al., 2021)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Quoting Casanova &lt;em&gt;“When they were capable of detecting the known TLR3-IFN needles in the HSE stack, they installed camp 1, rested a bit and reported to me on the radio, while I was watching them from the basecamp with binoculars. I encouraged them to push for the final ascent and they did.”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;With the same goal as our own - 
producing unbiased methods for detection of biologically-connected causal 
genetic variation - 
they found new gene variants that interact via the TLR3-IFN protein pathway,
in individual patients. 
Shen-Ying Zhang found them to be biochemically deleterious, 
an important factor for validation of genetic-first approaches. 
In this case, Zhang &lt;em&gt;et al.&lt;/em&gt; get as close to the &lt;em&gt;“gold-standard”&lt;/em&gt; 
as anyone to date.&lt;/p&gt;

&lt;p&gt;The functional validation of candidate variants in disease then provided a proof-of-principle indication that they could detect physiological homogeneity in the midst of genetic heterogeneity 
&lt;a class=&quot;citation&quot; href=&quot;#zhang2021computational&quot;&gt;(Zhang et al., 2021)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Quoting Casanova 
&lt;em&gt;“A computational approach for detecting physiological homogeneity in the midst of genetic heterogeity.
That was terrific!”&lt;/em&gt;.&lt;/p&gt;

&lt;hr /&gt;

&lt;blockquote&gt;
  &lt;p&gt;An aside: what I call the &lt;em&gt;“gold-standard”&lt;/em&gt; for our field should be:&lt;/p&gt;
  &lt;ol&gt;
    &lt;li&gt;Unbiased statistical detection of a genetic phenomenon.&lt;/li&gt;
    &lt;li&gt;Validation by functional models under systematic control.&lt;/li&gt;
  &lt;/ol&gt;

  &lt;p&gt;Part [1] depends on patient cohorts that are large enough to detect the effect based on the phenotype strength - difficult for &lt;em&gt;rare disease&lt;/em&gt;.&lt;br /&gt;
Part [2] depends on independently testing biological mechanisms.&lt;/p&gt;

  &lt;p&gt;This second step generally faces two hurdles:&lt;/p&gt;
  &lt;ul&gt;
    &lt;li&gt;If the same researchers perform (1) genetic stats and (2) functional work, 
there is a bias that is difficult to avoid when trying to functionally 
validate statistically positive results.&lt;/li&gt;
    &lt;li&gt;If the statistical genetic associations happen to contain a false positive 
for something like severe rare immune disease, 
the sensitive functional models may detect a damaging response. 
One might find a truly damaging biological mechanism, 
but if the statistical genetic association is not correct then this 
biological mechanism should not be deemed causal; 
back-tracking at this stage would be very difficult due to self-imposed bias.&lt;/li&gt;
  &lt;/ul&gt;

  &lt;p&gt;Ideally, in the future we hope to see a separation of the two steps 
(stat genetics and wet-lab) such that each is performed independently. 
The wet-lab would also ideally focus their routines on a particular protein 
pathway/system with SOPs that improve accuracy and precision 
(e.g. clinical diagnostics labs, clinical trials) rather than setting up 
models for each new study.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;hr /&gt;

&lt;p&gt;Returning to our historical review, 
we have been producing our protocols similarly in parallel.
With patient cohorts of comparable sizes and phenotypes we will soon 
have a complementary validation of protocols.
However, great care is also being taken to test and select the most reliable 
statistical methods for association testing - 
an improvement to the fine work by
&lt;a class=&quot;citation&quot; href=&quot;#zhang2021computational&quot;&gt;(Zhang et al., 2021)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Best practices in candidate variant selection protocols are basically 
standardised as of 2021, 
so the main remaining task is standardisation of the protein-pathway 
annotation and association testing methods - 
steps which we will soon be ready to publish after peer-review.&lt;/p&gt;

&lt;h2 id=&quot;burden-test-power-calculation&quot;&gt;Burden test power calculation&lt;/h2&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# This code seems to work.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# This script tests for the minimum number of heterozygous mutations that can exist in a test set to reach significance.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Required inputs, can be modified to read data variable&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# SAMPLES (equal case and control)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# SNPs (the number of SNPs in the test pathway, can modify to loop every pathway)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# CMC settings may need adjustment to match your model&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AssotesteR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# Pathway size of 5 SNPs&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 100 case and 100 controls&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# for (SNPs in 1:3) {&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
 &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SNPs&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SAMPLES&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TOTAL&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SAMPLES&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SNPs&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TOTAL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# controls&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val_1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;replicate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TOTAL&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val_2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;replicate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# cases&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val_3&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;replicate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TOTAL&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val_4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;replicate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# combine&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mat1.data&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val_1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val_2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val_3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val_4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# controls then cases&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;genotype&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mat1.data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nrow&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SAMPLES&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ncol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SNPs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;byrow&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#print(genotype)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stat&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;genotype&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dimnames&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;outer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;colnames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;genotype&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rownames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;genotype&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FUN&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;paste&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;paste&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;for&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; 
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SAMPLES&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;cases and &quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SAMPLES&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;controls:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;paste&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;with&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SNPs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;candidate pathway SNPs&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;paste&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                 &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;mutations per&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TOTAL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;total nucleotides in cases&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;paste&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                 &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;0 mutation per&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TOTAL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;total nucleotides in controls.&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                 &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# number of cases&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cases&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SAMPLES&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# number of controls&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;controls&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SAMPLES&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# total (cases + controls)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;total&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cases&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;controls&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# phenotype vector&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;phenotype&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;rep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;controls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;rep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cases&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#print(phenotype)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#mycmc = CMC(phenotype, genotype, maf=0.0000001, perm=100)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#print(mycmc)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myskat&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SKAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;phenotype&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;genotype&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kernel&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;linear&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myskat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;asym.pval&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;However, this code fills up the SNP columns so that all cases get the variant mutation.
I believe SKAT will give a better score (a lower P-value) to 5 unique variants than to 5 shared variants.
Therefore, I am currently revising the code.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# SKAT: Sequence Kernel Association Test &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# [1] &quot;for 100 cases and  100 controls:&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# [1] &quot;with 5 candidate pathway SNPs&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# [1] &quot;18 mutations per 500 total nucleotides in cases&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# [1] &quot;0 mutation per 500 total nucleotides in controls.&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# $asym.pval&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# [1] 0.04799332&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# [1] &quot;for 100 cases and  100 controls:&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# [1] &quot;with 10 candidate pathway SNPs&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# [1] &quot;36 mutations per 1000 total nucleotides in cases&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# [1] &quot;0 mutation per 1000 total nucleotides in controls.&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# $asym.pval&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# [1] 0.04799332&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;You can view the genotype matrix construction by uncommenting:&lt;/p&gt;
&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#print(genotype)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Example output:&lt;/p&gt;
&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;198&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;199&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;200&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;for 100 cases and  100 controls:&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;with 5 candidate pathway SNPs&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;2 mutations per 500 total nucleotides in cases&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;0 mutation per 500 total nucleotides in controls.&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;asym.pval&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.3165047&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I will instead try to construct the test set with unique variants rather than shared ones. I only realised the problem at the end.
An example of what we want instead is to check the P-value for different combinations of variants.
Then we can find the real minimum variant count required to pass significance:&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# With a small test set&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 2 case, 2 control, 4 positions&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 1 control SNP, 3 case SNP&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cases&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;controls&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;phenotype&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;rep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;controls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;rep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cases&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;expand.grid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geno1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geno2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;9&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;13&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geno3&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;14&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geno4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;15&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geno1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SKAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;phenotype&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geno1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kernel&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;linear&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;asym.pval&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geno2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SKAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;phenotype&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geno2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kernel&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;linear&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;asym.pval&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geno3&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SKAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;phenotype&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geno3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kernel&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;linear&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;asym.pval&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geno4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SKAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;phenotype&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geno4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kernel&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;linear&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;asym.pval&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;


&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# &amp;gt; geno1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#   Var1 Var2 Var3 Var4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 1    0    0    0    0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 2    1    0    0    0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 3    0    1    0    0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 4    1    1    0    0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# $asym.pval [1] 0.0619688&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# &amp;gt; geno2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#    Var1 Var2 Var3 Var4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 1     0    0    0    0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 2     1    0    0    0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 9     0    0    0    1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 13    0    0    1    1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# $asym.pval [1] 0.0311075&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# &amp;gt; geno3&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#     Var1 Var2 Var3 Var4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 1      0    0    0    0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 2      1    0    0    0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 1.1    0    0    0    0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 14     1    0    1    1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# $asym.pval [1] 0.0311075&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# &lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# &amp;gt; geno4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;#     Var1 Var2 Var3 Var4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 1      0    0    0    0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 2      1    0    0    0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 1.1    0    0    0    0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# 15     0    1    1    1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;# $asym.pval [1] 0.0311075&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ol class=&quot;bibliography&quot;&gt;&lt;li&gt;&lt;span id=&quot;Povysil2019rare&quot;&gt;Povysil, G., Petrovski, S., Hostyk, J., Aggarwal, V., Allen, A. S., &amp;amp; Goldstein, D. B. (2019). Rare-variant collapsing analyses for complex traits: guidelines and applications. &lt;i&gt;Nature Reviews Genetics&lt;/i&gt;, &lt;i&gt;20&lt;/i&gt;(12), 747–759. https://doi.org/10.1038/s41576-019-0177-4&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;li2008methods&quot;&gt;Li, B., &amp;amp; Leal, S. M. (2008). Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. &lt;i&gt;The American Journal of Human Genetics&lt;/i&gt;, &lt;i&gt;83&lt;/i&gt;(3), 311–321.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;morris2010evaluation&quot;&gt;Morris, A. P., &amp;amp; Zeggini, E. (2010). An evaluation of statistical approaches to rare variant analysis in genetic association studies. &lt;i&gt;Genetic Epidemiology&lt;/i&gt;, &lt;i&gt;34&lt;/i&gt;(2), 188–193.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;price2010pooled&quot;&gt;Price, A. L., Kryukov, G. V., de Bakker, P. I. W., Purcell, S. M., Staples, J., Wei, L.-J., &amp;amp; Sunyaev, S. R. (2010). Pooled association tests for rare variants in exon-resequencing studies. &lt;i&gt;The American Journal of Human Genetics&lt;/i&gt;, &lt;i&gt;86&lt;/i&gt;(6), 832–838.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;neale2011testing&quot;&gt;Neale, B. M., Rivas, M. A., Voight, B. F., Altshuler, D., Devlin, B., Orho-Melander, M., Kathiresan, S., Purcell, S. M., Roeder, K., &amp;amp; Daly, M. J. (2011). Testing for an unusual distribution of rare variants. &lt;i&gt;PLoS Genetics&lt;/i&gt;, &lt;i&gt;7&lt;/i&gt;(3), e1001322.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;Lee2012Optimal&quot;&gt;Lee, S., Wu, M. C., &amp;amp; Lin, X. (2012). Optimal tests for rare variant effects in sequencing association studies. &lt;i&gt;Biostatistics&lt;/i&gt;, &lt;i&gt;13&lt;/i&gt;(4), 762–775. https://doi.org/10.1093/biostatistics/kxs014&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;Lee2012Optimalunified&quot;&gt;Lee, S., Emond, M. J., Bamshad, M. J., Barnes, K. C., Rieder, M. J., Nickerson, D. A., Christiani, D. C., Wurfel, M. M., &amp;amp; Lin, X. (2012). Optimal Unified Approach for Rare-Variant Association Testing with Application to Small-Sample Case-Control Whole-Exome Sequencing Studies. &lt;i&gt;The American Journal of Human Genetics&lt;/i&gt;, &lt;i&gt;91&lt;/i&gt;(2), 224–237. https://doi.org/10.1016/j.ajhg.2012.06.007&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;IonitaLaza2013Sequence&quot;&gt;Ionita-Laza, I., Lee, S., Makarov, V., Buxbaum, J. D., &amp;amp; Lin, X. (2013). Sequence Kernel Association Tests for the Combined Effect of Rare and Common Variants. &lt;i&gt;The American Journal of Human Genetics&lt;/i&gt;, &lt;i&gt;92&lt;/i&gt;(6), 841–853. https://doi.org/10.1016/j.ajhg.2013.04.015&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;madsen2009groupwise&quot;&gt;Madsen, B. E., &amp;amp; Browning, S. R. (2009). A groupwise association test for rare mutations using a weighted sum statistic. &lt;i&gt;PLoS Genetics&lt;/i&gt;, &lt;i&gt;5&lt;/i&gt;(2), e1000384.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;zhang2021computational&quot;&gt;Zhang, P., Cobat, A., Lee, Y.-S., Wu, Y., Bayrak, C. S., Boccon-Gibod, C., Matuozzo, D., Lorenzo, L., Jain, A., Boucherit, S., &amp;amp; others. (2021). A computational approach for detecting physiological homogeneity in the midst of genetic heterogeneity. &lt;i&gt;The American Journal of Human Genetics&lt;/i&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;itan2013human&quot;&gt;Itan, Y., Zhang, S.-Y., Vogt, G., Abhyankar, A., Herman, M., Nitschke, P., Fried, D., Quintana-Murci, L., Abel, L., &amp;amp; Casanova, J.-L. (2013). The human gene connectome as a map of short cuts for morbid allele discovery. &lt;i&gt;Proceedings of the National Academy of Sciences&lt;/i&gt;, &lt;i&gt;110&lt;/i&gt;(14), 5558–5563.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;itan2014hgcs&quot;&gt;Itan, Y., Mazel, M., Mazel, B., Abhyankar, A., Nitschke, P., Quintana-Murci, L., Boisson-Dupuis, S., Boisson, B., Abel, L., Zhang, S.-Y., &amp;amp; others. (2014). HGCS: an online tool for prioritizing disease-causing gene variants by biological distance. &lt;i&gt;BMC Genomics&lt;/i&gt;, &lt;i&gt;15&lt;/i&gt;(1), 1–8.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;itan2015novel&quot;&gt;Itan, Y., &amp;amp; Casanova, J.-L. (2015). Novel primary immunodeficiency candidate genes predicted by the human gene connectome. &lt;i&gt;Frontiers in Immunology&lt;/i&gt;, &lt;i&gt;6&lt;/i&gt;, 142.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;itan2015human&quot;&gt;Itan, Y., Shang, L., Boisson, B., Patin, E., Bolze, A., Moncada-Vélez, M., Scott, E., Ciancanelli, M. J., Lafaille, F. G., Markle, J. G., &amp;amp; others. (2015). The human gene damage index as a gene-level approach to prioritizing exome variants. &lt;i&gt;Proceedings of the National Academy of Sciences&lt;/i&gt;, &lt;i&gt;112&lt;/i&gt;(44), 13615–13620.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;itan2016mutation&quot;&gt;Itan, Y., Shang, L., Boisson, B., Ciancanelli, M. J., Markle, J. G., Martinez-Barricarte, R., Scott, E., Shah, I., Stenson, P. D., Gleeson, J., &amp;amp; others. (2016). The mutation significance cutoff: gene-level thresholds for variant predictions. &lt;i&gt;Nature Methods&lt;/i&gt;, &lt;i&gt;13&lt;/i&gt;(2), 109–110.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;itan2015can&quot;&gt;Itan, Y., &amp;amp; Casanova, J.-L. (2015). Can the impact of human genetic variations be predicted? &lt;i&gt;Proceedings of the National Academy of Sciences&lt;/i&gt;, &lt;i&gt;112&lt;/i&gt;(37), 11426–11427.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;zhang2018popviz&quot;&gt;Zhang, P., Bigio, B., Rapaport, F., Zhang, S.-Y., Casanova, J.-L., Abel, L., Boisson, B., &amp;amp; Itan, Y. (2018). PopViz: a webserver for visualizing minor allele frequencies and damage prediction scores of human genetic variations. &lt;i&gt;Bioinformatics&lt;/i&gt;, &lt;i&gt;34&lt;/i&gt;(24), 4307–4309.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;maffucci2019blacklisting&quot;&gt;Maffucci, P., Bigio, B., Rapaport, F., Cobat, A., Borghesi, A., Lopez, M., Patin, E., Bolze, A., Shang, L., Bendavid, M., &amp;amp; others. (2019). Blacklisting variants common in private cohorts but not in public databases optimizes human exome analysis. &lt;i&gt;Proceedings of the National Academy of Sciences&lt;/i&gt;, &lt;i&gt;116&lt;/i&gt;(3), 950–959.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;zhang2019seqtailor&quot;&gt;Zhang, P., Boisson, B., Stenson, P. D., Cooper, D. N., Casanova, J.-L., Abel, L., &amp;amp; Itan, Y. (2019). SeqTailor: a user-friendly webserver for the extraction of DNA or protein sequences from next-generation sequencing data. &lt;i&gt;Nucleic Acids Research&lt;/i&gt;, &lt;i&gt;47&lt;/i&gt;(W1), W623–W631.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;casanova2020human&quot;&gt;Casanova, J.-L., &amp;amp; Abel, L. (2020). The human genetic determinism of life-threatening infectious diseases: genetic heterogeneity and physiological homogeneity? &lt;i&gt;Human Genetics&lt;/i&gt;, &lt;i&gt;139&lt;/i&gt;, 681–694.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;bastard2021herpes&quot;&gt;Bastard, P., Manry, J., Chen, J., Rosain, J., Seeleuthner, Y., AbuZaitun, O., Lorenzo, L., Khan, T., Hasek, M., Hernandez, N., &amp;amp; others. (2021). Herpes simplex encephalitis in a patient with a distinctive form of inherited IFNAR1 deficiency. &lt;i&gt;The Journal of Clinical Investigation&lt;/i&gt;, &lt;i&gt;131&lt;/i&gt;(1).&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span id=&quot;gao2021tlr3&quot;&gt;Gao, D., Ciancanelli, M. J., Zhang, P., Harschnitz, O., Bondet, V., Hasek, M., Chen, J., Mu, X., Itan, Y., Cobat, A., &amp;amp; others. (2021). TLR3 controls constitutive IFN-β antiviral immunity in human fibroblasts and cortical neurons. &lt;i&gt;The Journal of Clinical Investigation&lt;/i&gt;, &lt;i&gt;131&lt;/i&gt;(1).&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;
</description>
        <pubDate>Mon, 26 Apr 2021 00:00:00 +0000</pubDate>
        <link>http://localhost:4000//topic/statistics/burden_test/</link>
        <guid isPermaLink="true">http://localhost:4000//topic/statistics/burden_test/</guid>
      </item>
    
      <item>
        <title>Fisher&apos;s exact test</title>
        <description>&lt;h1 id=&quot;fishers-exact-test&quot;&gt;Fisher’s exact test&lt;/h1&gt;

&lt;p&gt;Source: Statology page, https://www.statology.org/fishers-exact-test/&lt;/p&gt;

&lt;p&gt;Fisher’s Exact Test is used to determine whether or not there is a significant association between two categorical variables. It is typically used as an alternative to the Chi-Square Test of Independence when one or more of the cell counts in a 2x2 table is less than 5.&lt;/p&gt;

&lt;p&gt;Fisher’s Exact Test uses the following null and alternative hypotheses:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;H0: (null hypothesis) The two variables are independent.&lt;/li&gt;
  &lt;li&gt;H1: (alternative hypothesis) The two variables are not independent.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Suppose we have the following 2x2 table:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt; &lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Group 1&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Group 2&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Row Total&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Category 1&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;a&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;b&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;a+b&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Category 2&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;c&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;d&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;c+d&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Column Total&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;a+c&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;b+d&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;a+b+c+d = n&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

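&lt;p&gt;With the row and column totals held fixed, the probability of observing this exact table is:&lt;/p&gt;

\[p = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}\]

&lt;p&gt;The one-tailed p value for Fisher’s Exact Test is the sum of these probabilities over the observed table and every table that is more extreme in the same direction. This is the same as the CDF of the hypergeometric distribution, evaluated at a, with the following parameters:&lt;/p&gt;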

&lt;ul&gt;
  &lt;li&gt;population size = n&lt;/li&gt;
  &lt;li&gt;population “successes” = a+b&lt;/li&gt;
  &lt;li&gt;sample size = a + c&lt;/li&gt;
  &lt;li&gt;sample “successes” = a&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The two-tailed p value for Fisher’s Exact Test is less straightforward to calculate and can’t be found by simply multiplying the one-tailed p value by two. To find the two-tailed p value, we recommend using the Fisher’s Exact Test Calculator.&lt;/p&gt;
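
&lt;p&gt;As a rough sketch of how both p values can be obtained without a calculator, the R code below works directly from the hypergeometric distribution. The function name fisher_2x2 is hypothetical and introduced only for illustration. It assumes the one-tailed value is the lower tail (a as small as observed or smaller) and that the two-tailed value sums the probabilities of all tables no more probable than the observed one, which is the convention R’s fisher.test follows.&lt;/p&gt;

```r
# Hypothetical helper (not from the source page): Fisher's exact test
# p values for a 2x2 table with cells a, b (row 1) and c, d (row 2).
fisher_2x2 = function(a, b, c, d) {
  n = a + b + c + d
  # All values the top-left cell can take when the margins are fixed
  k = max(0, (a + b) + (a + c) - n):min(a + b, a + c)
  # dhyper(x, m, n, k): m = population successes (a+b),
  # n = population failures (c+d), k = sample size (a+c)
  probs = dhyper(k, a + b, c + d, a + c)
  p_obs = dhyper(a, a + b, c + d, a + c)
  list(
    p_table  = p_obs,                              # probability of the observed table
    one_tail = phyper(a, a + b, c + d, a + c),     # lower tail: cell value at most a
    # Two-tailed: every table no more probable than the observed one
    # (small relative tolerance guards against floating-point ties)
    two_tail = sum(probs[p_obs * (1 + 1e-7) >= probs])
  )
}

res = fisher_2x2(4, 9, 8, 4)
print(round(unlist(res), 6))
```

&lt;p&gt;The call at the end uses the counts 4, 9, 8, 4 purely as example input; any non-negative integer counts can be substituted.&lt;/p&gt;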

&lt;h2 id=&quot;fishers-exact-test-example&quot;&gt;Fisher’s Exact Test: Example&lt;/h2&gt;

&lt;p&gt;Suppose we want to know whether or not gender is associated with political party preference. We take a simple random sample of 25 voters and survey them on their political party preference. The following table shows the results of the survey:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;Sex&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;Democrat&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Republican&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Total&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Male&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;4&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;9&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;13&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Female&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;8&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;12&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Total&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;12&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;13&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;25&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Step 1: Define the hypotheses.&lt;/p&gt;

&lt;p&gt;We will perform Fisher’s Exact Test using the following hypotheses:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;H0: Gender and political party preference are independent.&lt;/li&gt;
  &lt;li&gt;H1: Gender and political party preference are not independent.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Step 2: Calculate the two-tailed p value.&lt;/p&gt;

&lt;p&gt;We can use the Fisher’s Exact Test Calculator with the following input:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Group1&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Group2&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Category 1&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;9&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Category 2&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;8&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;ul&gt;
  &lt;li&gt;One-tailed p value: 0.081178&lt;/li&gt;
  &lt;li&gt;Two-tailed p value: 0.115239&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The two-tailed p value is 0.115239. Since this value is not less than 0.05, we fail to reject the null hypothesis. We do not have sufficient evidence to say that there is a statistically significant association between gender and political party preference.&lt;/p&gt;
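
&lt;p&gt;The same p values can be checked in R with the built-in fisher.test function. This is a minimal sketch: the object names survey, two_sided and one_sided are chosen here for illustration, and the one-sided call assumes the direction of interest is an odds ratio below 1 (fewer male Democrats than expected under independence).&lt;/p&gt;

```r
# Counts from the survey table: rows = Male, Female; columns = Democrat, Republican
survey = matrix(c(4, 9,
                  8, 4),
                nrow = 2, byrow = TRUE,
                dimnames = list(Sex = c("Male", "Female"),
                                Party = c("Democrat", "Republican")))

# Default alternative is two-sided
two_sided = fisher.test(survey)
# One-sided test: odds ratio below 1 (assumed direction, see text above)
one_sided = fisher.test(survey, alternative = "less")

print(round(two_sided$p.value, 6))
print(round(one_sided$p.value, 6))
```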

&lt;h2 id=&quot;additional-resources&quot;&gt;Additional Resources&lt;/h2&gt;

&lt;p&gt;The following tutorials explain how to perform a Fisher’s Exact Test using different statistical programs:&lt;/p&gt;

&lt;p&gt;How to Perform Fisher’s Exact Test in R
https://www.statology.org/fishers-exact-test-in-r/&lt;/p&gt;

&lt;p&gt;Fisher’s Exact Test Calculator
https://www.statology.org/fishers-exact-test-calculator/&lt;/p&gt;

</description>
        <pubDate>Fri, 25 Sep 2020 00:00:00 +0200</pubDate>
        <link>http://localhost:4000//topic/statistics/fisher/</link>
        <guid isPermaLink="true">http://localhost:4000//topic/statistics/fisher/</guid>
      </item>
    
      <item>
        <title>Unsupervised learning K-Means clustering</title>
        <description>&lt;h1 id=&quot;upervised-learning-k-means-clustering&quot;&gt;Unsupervised learning K-Means clustering&lt;/h1&gt;
&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#major-requirements&quot; id=&quot;markdown-toc-major-requirements&quot;&gt;Major Requirements&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#k-means-algorithm&quot; id=&quot;markdown-toc-k-means-algorithm&quot;&gt;K-Means Algorithm&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#major-requirements-1&quot; id=&quot;markdown-toc-major-requirements-1&quot;&gt;Major Requirements&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#interpreting-the-clusters&quot; id=&quot;markdown-toc-interpreting-the-clusters&quot;&gt;Interpreting the Clusters&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#major-requirements-2&quot; id=&quot;markdown-toc-major-requirements-2&quot;&gt;Major Requirements&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#selecting-the-number-of-clusters&quot; id=&quot;markdown-toc-selecting-the-number-of-clusters&quot;&gt;Selecting the Number of Clusters&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#major-requirements-3&quot; id=&quot;markdown-toc-major-requirements-3&quot;&gt;Major Requirements&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#evaluating-the-number-of-clusters&quot; id=&quot;markdown-toc-evaluating-the-number-of-clusters&quot;&gt;Evaluating the Number of Clusters&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#major-requirements-4&quot; id=&quot;markdown-toc-major-requirements-4&quot;&gt;Major Requirements&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#hierarchical-clustering&quot; id=&quot;markdown-toc-hierarchical-clustering&quot;&gt;Hierarchical Clustering&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#major-advantages&quot; id=&quot;markdown-toc-major-advantages&quot;&gt;Major Advantages&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This post is based on reading and figures from: Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python, by Peter Bruce, Andrew Bruce, and Peter Gedeck.&lt;/p&gt;

&lt;p&gt;Clustering is a technique to divide data into different groups, where the records in each group are similar to one another. A goal of clustering is to identify significant and meaningful groups of data. The groups can be used directly, analyzed in more depth, or passed as a feature or an outcome to a predictive regression or classification model.&lt;/p&gt;

&lt;p&gt;K-means was the first clustering method to be developed; it is still widely used, owing its popularity to the relative simplicity of the algorithm and its ability to scale to large data sets. K-means divides the data into K clusters by minimizing the sum of the squared distances of each record to the mean of its assigned cluster. This is referred to as the within-cluster sum of squares or within-cluster SS. K-means does not ensure the clusters will have the same size but finds the clusters that are the best separated.&lt;/p&gt;

&lt;h3 id=&quot;major-requirements&quot;&gt;Major Requirements&lt;/h3&gt;

&lt;ol&gt;
  &lt;li&gt;Divide data into different groups&lt;/li&gt;
  &lt;li&gt;Identify significant and meaningful groups of data&lt;/li&gt;
  &lt;li&gt;Minimize the sum of the squared distances of each record to the mean of its assigned cluster&lt;/li&gt;
  &lt;li&gt;Use within-cluster sum of squares or within-cluster SS&lt;/li&gt;
  &lt;li&gt;Find the clusters that are the best separated&lt;/li&gt;
&lt;/ol&gt;

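&lt;p&gt;For records with two variables, x and y, the sum of squares within cluster k is given by the following, where \(\bar{x}_k\) and \(\bar{y}_k\) are the means of x and y over the records assigned to cluster k:&lt;/p&gt;

\[SS_k = \sum_{i \in \text{Cluster}_k} \left[ (x_i - \bar{x}_k)^2 + (y_i - \bar{y}_k)^2 \right]\]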

&lt;p&gt;In clustering records with multiple variables (the typical case), the term cluster mean refers not to a single number but to the vector of means of the variables.&lt;/p&gt;
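
&lt;p&gt;To make the within-cluster sum of squares concrete, the sketch below recomputes it by hand and compares it with the withinss component returned by kmeans. The data are simulated (the names toy, km_toy and ss_by_hand are hypothetical, and the seed and cluster locations are arbitrary choices), so this illustrates the formula rather than any result from the book.&lt;/p&gt;

```r
set.seed(1)  # arbitrary seed so the simulated data are reproducible

# Simulated two-variable data with two well-separated groups
# (hypothetical; not the stock returns discussed in the text)
toy = data.frame(x = c(rnorm(20, mean = 0), rnorm(20, mean = 5)),
                 y = c(rnorm(20, mean = 0), rnorm(20, mean = 5)))
km_toy = kmeans(toy, centers = 2, nstart = 5)

# SS_k by hand: squared distance of each record to the mean vector of its cluster
ss_by_hand = sapply(1:2, function(k) {
  members = toy[km_toy$cluster == k, ]
  sum((members$x - mean(members$x))^2 + (members$y - mean(members$y))^2)
})

print(ss_by_hand)
print(km_toy$withinss)      # per-cluster SS as reported by kmeans
print(km_toy$tot.withinss)  # their sum, the quantity K-means minimizes
```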

&lt;p&gt;A typical use of clustering is to locate natural, separate clusters in the data. Another application is to divide the data into a predetermined number of separate groups, where clustering is used to ensure the groups are as different as possible from one another.&lt;/p&gt;

&lt;p&gt;For example, suppose we want to divide daily stock returns into four groups. K-means clustering can be used to separate the data into the best groupings. Note that daily stock returns are reported in a fashion that is, in effect, standardized, so we do not need to normalize the data.&lt;/p&gt;

&lt;p&gt;In R, K-means clustering can be performed using the kmeans function. For example, the following finds four clusters based on two variables—the daily stock returns for ExxonMobil (XOM) and Chevron (CVX):&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sp500_px&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;row.names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sp500_px&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;2011-01-01&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;XOM&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;CVX&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;km&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kmeans&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The cluster assignment for each record is returned as the cluster component (R):&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cluster&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;km&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;head&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                      &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;XOM&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CVX&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cluster&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2011-01-03&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.73680496&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.2406809&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;       &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2011-01-04&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.16866845&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-0.5845157&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;       &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2011-01-05&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.02663055&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.4469854&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;       &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2011-01-06&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.24855834&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-0.9197513&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;       &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2011-01-07&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.33732892&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.1805111&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;       &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2011-01-10&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.00000000&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-0.4641675&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;       &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The first six records are assigned to either cluster 1 or cluster 2. The means of the clusters are also returned (R):&lt;/p&gt;

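&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data.frame&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cluster&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;km&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cluster&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;XOM&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;        &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CVX&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;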
      &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-0.3284864&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-0.5669135&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.2410159&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.3342130&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-1.1439800&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;-1.7502975&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0.9568628&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1.3708892&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Clusters 1 and 3 represent “down” markets, while clusters 2 and 4 represent “up” markets.&lt;/p&gt;

&lt;p&gt;As the K-means algorithm uses randomized starting points, the results may differ between subsequent runs and different implementations of the method. In general, you should check that the fluctuations aren’t too large.&lt;/p&gt;
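
&lt;p&gt;One way to check how much the results fluctuate is sketched below, again on simulated data (the object names and seeds are arbitrary assumptions, not taken from the book). It repeats a single-start run several times to show the spread in total within-cluster SS, then uses the nstart argument of kmeans, which tries several random starts and keeps the best one, together with set.seed to make a run repeatable.&lt;/p&gt;

```r
set.seed(42)  # arbitrary seed for the simulated data

# Simulated two-variable data (hypothetical stand-in for the stock returns)
toy = data.frame(x = c(rnorm(50, mean = -1), rnorm(50, mean = 1)),
                 y = c(rnorm(50, mean = -1), rnorm(50, mean = 1)))

# Repeat single-start K-means and record the total within-cluster SS each time
single_start = sapply(1:20, function(i) kmeans(toy, centers = 4)$tot.withinss)
print(range(single_start))  # spread across runs = run-to-run fluctuation

# Several random starts per fit; resetting the seed makes the fit repeatable
set.seed(123)
fit_a = kmeans(toy, centers = 4, nstart = 25)
set.seed(123)
fit_b = kmeans(toy, centers = 4, nstart = 25)
print(identical(fit_a$tot.withinss, fit_b$tot.withinss))
```

&lt;p&gt;If the range printed for the single-start runs is wide, increasing nstart is a simple way to get a more stable solution.&lt;/p&gt;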

&lt;p&gt;In this example, with just two variables, it is straightforward to visualize the clusters and their means:&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;XOM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CVX&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;color&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cluster&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;alpha&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;.3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;XOM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CVX&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stroke&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The resulting plot shows the cluster assignments and the cluster means. Note that K-means will assign records to clusters, even if those clusters are not well separated (which can be useful if you need to optimally divide records into groups).&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/kmeans_fig_1.png&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Figure 1. The clusters of K-means applied to daily stock returns for ExxonMobil and Chevron (the cluster centers are highlighted with black symbols).&lt;/p&gt;

&lt;h2 id=&quot;k-means-algorithm&quot;&gt;K-Means Algorithm&lt;/h2&gt;

&lt;p&gt;K-means is a clustering algorithm that can be applied to a data set with p variables X1, …, Xp. While the exact solution to K-means is computationally very difficult, heuristic algorithms provide an efficient way to compute a locally optimal solution.&lt;/p&gt;

&lt;h3 id=&quot;major-requirements-1&quot;&gt;Major Requirements&lt;/h3&gt;

&lt;ol&gt;
  &lt;li&gt;Specify the number of clusters K and an initial set of cluster means&lt;/li&gt;
  &lt;li&gt;Assign each record to the nearest cluster mean as measured by squared distance&lt;/li&gt;
  &lt;li&gt;Compute the new cluster means based on the assignment of records&lt;/li&gt;
  &lt;li&gt;Iterate steps 2 and 3 until the assignment of records to clusters does not change&lt;/li&gt;
  &lt;li&gt;Specify an initial set of cluster means by randomly assigning each record to one of the K clusters and then finding the means of those clusters&lt;/li&gt;
  &lt;li&gt;Run the algorithm several times using different random samples to initialize the algorithm&lt;/li&gt;
  &lt;li&gt;Use the run that has the lowest within-cluster sum of squares as the K-means result&lt;/li&gt;
&lt;/ol&gt;
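
&lt;p&gt;As a rough illustration, steps 1 to 4 can be sketched in a few lines of R. This is a minimal sketch on simulated data, not the implementation used by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kmeans&lt;/code&gt;. The data and the helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;simple_kmeans&lt;/code&gt; are assumptions made for the example, and the initial means are taken to be K randomly chosen records.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Simulated two-variable data with two loose groups (illustrative only)
set.seed(1)
x &amp;lt;- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 4), ncol = 2))

# Hypothetical helper, written for this example
simple_kmeans &amp;lt;- function(x, k, max_iter = 100) {
  # Step 1: use k randomly chosen records as the initial cluster means
  centers &amp;lt;- x[sample(nrow(x), k), , drop = FALSE]
  cluster &amp;lt;- rep(0L, nrow(x))
  for (iter in seq_len(max_iter)) {
    # Step 2: squared distance from every record to every cluster mean
    d2 &amp;lt;- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, centers[j, ])^2))
    new_cluster &amp;lt;- max.col(-d2, ties.method = &quot;first&quot;)
    # Step 4: stop once the assignments no longer change
    if (all(new_cluster == cluster)) break
    cluster &amp;lt;- new_cluster
    # Step 3: recompute each cluster mean from its assigned records
    for (j in seq_len(k)) {
      if (any(cluster == j)) {
        centers[j, ] &amp;lt;- colMeans(x[cluster == j, , drop = FALSE])
      }
    }
  }
  list(cluster = cluster, centers = centers)
}

fit &amp;lt;- simple_kmeans(x, k = 2)
table(fit$cluster)   # cluster sizes
fit$centers          # final cluster means
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A single run like this only reaches a locally optimal solution, which is why the remaining items recommend repeating it from several random starts.&lt;/p&gt;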

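&lt;p&gt;For two variables x and y, the sum of squares within cluster k measures the squared distance of each record from the cluster mean:&lt;/p&gt;

\[SS_k = \sum_{i \in \text{Cluster}_k} \left[ (x_i - \bar{x}_k)^2 + (y_i - \bar{y}_k)^2 \right]\]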
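
&lt;p&gt;To make the formula concrete, the sketch below computes the within-cluster sum of squares by hand and compares it with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;withinss&lt;/code&gt; component returned by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kmeans&lt;/code&gt;. The simulated data frame &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;xy&lt;/code&gt; and the object names are assumptions made for the example.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Simulated two-variable data (illustrative only)
set.seed(1)
xy &amp;lt;- data.frame(x = rnorm(200), y = rnorm(200))
km_xy &amp;lt;- kmeans(xy, centers = 3, nstart = 10)

# Sum of squared deviations from the cluster mean, one value per cluster
ss_by_hand &amp;lt;- sapply(1:3, function(k) {
  members &amp;lt;- xy[km_xy$cluster == k, ]
  sum((members$x - km_xy$centers[k, &quot;x&quot;])^2 +
      (members$y - km_xy$centers[k, &quot;y&quot;])^2)
})

# The hand-computed values should agree with the kmeans output
all.equal(ss_by_hand, as.numeric(km_xy$withinss))
all.equal(sum(ss_by_hand), km_xy$tot.withinss)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;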

&lt;p&gt;The nstart parameter to the R function kmeans allows you to specify the number of random starts to try. For example, the following code runs K-means to find 5 clusters using 10 different starting cluster means:&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;syms&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;AAPL&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;MSFT&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;CSCO&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;INTC&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;CVX&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;XOM&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;SLB&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;COP&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
           &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;JPM&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;WFC&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;USB&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;AXP&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;WMT&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;TGT&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;HD&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;COST&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sp500_px&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;row.names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sp500_px&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;2011-01-01&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;syms&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;km&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kmeans&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nstart&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The function automatically returns the best solution out of the 10 different starting points. You can use the argument iter.max to set the maximum number of iterations the algorithm is allowed for each random start.&lt;/p&gt;
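
&lt;p&gt;A small sketch of these two arguments is shown below, using simulated data rather than the stock returns, so the data and object names are assumptions. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;iter&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tot.withinss&lt;/code&gt; components of the result report how many iterations the returned run used and its total within-cluster sum of squares.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Simulated data with four variables (illustrative only)
set.seed(42)
sim &amp;lt;- matrix(rnorm(400), ncol = 4)

# Ten random starts, each allowed up to 50 iterations
km_sim &amp;lt;- kmeans(sim, centers = 5, nstart = 10, iter.max = 50)

km_sim$iter           # iterations used by the returned run
km_sim$tot.withinss   # total within-cluster sum of squares of that run
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;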

&lt;h2 id=&quot;interpreting-the-clusters&quot;&gt;Interpreting the Clusters&lt;/h2&gt;

&lt;p&gt;Interpreting the clusters is an important part of cluster analysis. The two most important outputs from kmeans are the sizes of the clusters and the cluster means.&lt;/p&gt;

&lt;h3 id=&quot;major-requirements-2&quot;&gt;Major Requirements&lt;/h3&gt;

&lt;ol&gt;
  &lt;li&gt;Check the sizes of the resulting clusters with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;km$size&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Plot the centers of the clusters with ggplot and gather&lt;/li&gt;
&lt;/ol&gt;
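
&lt;p&gt;As a quick illustration of these two outputs, the sketch below fits K-means to simulated data and inspects the cluster sizes and means. The data frame and object names are assumptions made for the example.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Simulated data with three variables (illustrative only)
set.seed(7)
sim &amp;lt;- data.frame(a = rnorm(150), b = rnorm(150), c = rnorm(150))
km_sim &amp;lt;- kmeans(sim, centers = 3, nstart = 10)

km_sim$size                # number of records in each cluster
round(km_sim$centers, 2)   # one row per cluster, one column per variable

# Every record is assigned to exactly one cluster
sum(km_sim$size) == nrow(sim)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Very unequal sizes can be a hint that some clusters are capturing outliers rather than a distinct group.&lt;/p&gt;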

&lt;p&gt;You can plot the centers of the clusters using the gather function from the tidyr package in conjunction with ggplot:&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;as.data.frame&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;km&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;paste&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Cluster&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Symbol&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;row.names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gather&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;Cluster&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;Mean&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Symbol&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Color&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Mean&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;centers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Symbol&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fill&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Color&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_bar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;identity&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;position&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;identity&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;width&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;.75&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;facet_grid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Cluster&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;~&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;scales&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;free_y&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The resulting plot reveals the nature of each cluster. For example, clusters 4 and 5 correspond to days on which the market is down and up, respectively. Clusters 2 and 3 are characterized by up-market days for consumer stocks and down-market days for energy stocks, respectively. Finally, cluster 1 captures the days in which energy stocks were up and consumer stocks were down.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/kmeans_fig_2.png&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Figure 2. The means of the variables in each cluster (cluster means).&lt;/p&gt;

&lt;h2 id=&quot;selecting-the-number-of-clusters&quot;&gt;Selecting the Number of Clusters&lt;/h2&gt;

&lt;p&gt;The K-means algorithm requires that you specify the number of clusters K. Sometimes the number of clusters is driven by the application, while in other cases a statistical approach can be used.&lt;/p&gt;

&lt;h3 id=&quot;major-requirements-3&quot;&gt;Major Requirements&lt;/h3&gt;

&lt;ol&gt;
  &lt;li&gt;Use practical or managerial considerations to determine the number of desired clusters&lt;/li&gt;
  &lt;li&gt;Use a statistical approach such as the elbow method to find the “best” number of clusters&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A common approach, called the elbow method, is to identify the point at which the set of clusters explains “most” of the variance in the data. Adding new clusters beyond this set contributes relatively little to the variance explained. The elbow is the point where the cumulative variance explained flattens out after rising steeply, hence the name of the method.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/kmeans_fig_3.png&quot; width=&quot;100%&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Figure 3 shows the cumulative percent of variance explained for the default data for the number of clusters ranging from 2 to 15. In this example, there is no obvious elbow point since the incremental increase in variance explained drops gradually. This is fairly typical in data that does not have well-defined clusters. This is perhaps a drawback of the elbow method, but it does reveal the nature of the data.&lt;/p&gt;

&lt;h2 id=&quot;evaluating-the-number-of-clusters&quot;&gt;Evaluating the Number of Clusters&lt;/h2&gt;

&lt;p&gt;In R, the kmeans function doesn’t provide a single command for applying the elbow method, but it can be readily applied from the output of kmeans. You can evaluate how many clusters to retain by considering the replicability of the clusters on new data and whether the clusters are interpretable and related to a general characteristic of the data.&lt;/p&gt;
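
&lt;p&gt;One way to do this is sketched below: fit K-means for a range of K and record the proportion of variance explained, taken here as the ratio of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;betweenss&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;totss&lt;/code&gt; components of the kmeans output. The simulated data, the range of K, and the object names are assumptions made for the example.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Simulated data with three well-separated groups (illustrative only)
set.seed(123)
sim &amp;lt;- rbind(matrix(rnorm(200, mean = 0), ncol = 2),
             matrix(rnorm(200, mean = 5), ncol = 2),
             matrix(rnorm(200, mean = 10), ncol = 2))

# Proportion of variance explained for each candidate number of clusters
ks &amp;lt;- 2:10
pct_var &amp;lt;- sapply(ks, function(k) {
  fit &amp;lt;- kmeans(sim, centers = k, nstart = 20, iter.max = 100)
  fit$betweenss / fit$totss
})

# Look for the K after which the curve flattens out
plot(ks, pct_var, type = &quot;b&quot;,
     xlab = &quot;Number of clusters (K)&quot;,
     ylab = &quot;Proportion of variance explained&quot;)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With data simulated this way the curve would be expected to bend near K = 3. Real data often gives a much less distinct bend.&lt;/p&gt;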

&lt;h3 id=&quot;major-requirements-4&quot;&gt;Major Requirements&lt;/h3&gt;

&lt;ol&gt;
  &lt;li&gt;Apply the elbow method to suggest a reasonable number of clusters&lt;/li&gt;
  &lt;li&gt;Use cross-validation to assess the replicability of the clusters on new data&lt;/li&gt;
  &lt;li&gt;Weigh practical considerations when choosing the number of clusters, since there is no statistically determined optimal number of clusters&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The algorithm iteratively assigns records to the nearest cluster mean until cluster assignments do not change, and the number of desired clusters, K, is chosen by the user.&lt;/p&gt;

&lt;h2 id=&quot;hierarchical-clustering&quot;&gt;Hierarchical Clustering&lt;/h2&gt;

&lt;p&gt;Hierarchical clustering is an alternative to K-means that can yield different clusters and is more sensitive in discovering outlying or aberrant groups or records. It allows the user to visualize the effect of specifying different numbers of clusters and lends itself to an intuitive graphical display for easier interpretation of the clusters.&lt;/p&gt;

&lt;h3 id=&quot;major-advantages&quot;&gt;Major Advantages&lt;/h3&gt;

&lt;ol&gt;
  &lt;li&gt;Can yield different clusters than K-means&lt;/li&gt;
  &lt;li&gt;More sensitive in discovering outlying or aberrant groups or records&lt;/li&gt;
  &lt;li&gt;Allows visualization of the effect of specifying different numbers of clusters&lt;/li&gt;
  &lt;li&gt;Intuitive graphical display for easier interpretation of the clusters&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The result of hierarchical clustering can be displayed as a dendrogram, a tree that shows the relationships between clusters. The user can specify the desired number of clusters by cutting the dendrogram at a specific height.&lt;/p&gt;
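
&lt;p&gt;A minimal sketch of this workflow in base R is shown below, using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dist&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hclust&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cutree&lt;/code&gt;. The simulated data, the choice of complete linkage, and the cut height are assumptions made for the example.&lt;/p&gt;

&lt;div class=&quot;language-R highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Simulated data with two loose groups (illustrative only)
set.seed(99)
sim &amp;lt;- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
             matrix(rnorm(40, mean = 6), ncol = 2))

d &amp;lt;- dist(sim)                          # Euclidean distances between records
hc &amp;lt;- hclust(d, method = &quot;complete&quot;)    # agglomerative clustering
plot(hc)                                   # draw the dendrogram

groups_k &amp;lt;- cutree(hc, k = 2)           # cut to obtain a fixed number of clusters
groups_h &amp;lt;- cutree(hc, h = 5)           # or cut the tree at a chosen height
table(groups_k)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;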

</description>
        <pubDate>Sun, 23 Apr 2023 00:00:00 +0000</pubDate>
        <link>http://localhost:4000//topic/statistics/kmeans/</link>
        <guid isPermaLink="true">http://localhost:4000//topic/statistics/kmeans/</guid>
      </item>
    
  </channel>
</rss>
