<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>jbiol149</ui>
   <ji>1475-4924</ji>
   <fm>
      <dochead>Minireview</dochead>
      <bibl>
         <title>
            <p>Can modular analysis identify disease-associated candidate genes for therapeutics?</p>
         </title>
         <aug>
            <au ca="yes" id="A1">
               <snm>Tegn&#233;r</snm>
               <fnm>Jesper</fnm>
               <insr iid="I1"/>
               <email>jesper.tegner@ki.se</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Medicine, Center for Molecular Medicine, Karolinska University Hospital, 171 76 Solna, Stockholm, Sweden</p>
            </ins>
         </insg>
         <source>Journal of Biology</source>
         <issn>1475-4924</issn>
         <pubdate>2009</pubdate>
         <volume>8</volume>
         <issue>5</issue>
         <fpage>48</fpage>
         <url>http://jbiol.com/content/8/5/48</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">19519937</pubid>
               <pubid idtype="doi">10.1186/jbiol149</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <pub>
            <date>
               <day>28</day>
               <month>05</month>
               <year>2009</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2009</year>
         <collab>BioMed Central Ltd</collab>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>Complex diseases such as allergy change gene expression in several cell types and tissues. Benson and colleagues have now shown, in a paper in <it>BMC Systems Biology</it>, that this complexity can be studied effectively using an integrated experimental and computational modular analysis. Their strategy revealed a core of allergy-associated genes of potential therapeutic value.</p>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="bmcbiol_series_title" id="bmcbiolcommentary">Commentary</classification>
         <classification type="BMC" subtype="bmcbiol_series_editor" id="bmcbiolcommentary"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p/>
         </st>
         <p>Technologies are an important driver of progress in the medical sciences. Recent advances in array-based and sequence-based instrumentation have opened up new ways to monitor the inner molecular world of the cells and tissues that might be relevant to human diseases. Yet it is far from evident how these large datasets should be analyzed and how they can be integrated with other sources of data in order to become informative. At the same time, the medical community expects nothing less than a list of predictive biomarkers reflecting the risk of disease or its progression and an understanding of the cellular mechanisms involved in disease. However, comparing microarray samples from healthy and diseased individuals using a differential gene expression protocol generates a list of thousands of genes, and it is not clear which of these genes matter for which aspect of the disease.</p>
         <p>A key idea, originating from engineering science in general and computer science in particular, is the notion of 'divide and conquer', which refers to first breaking down a problem into smaller subproblems that are simple enough to analyze and then combining the solutions to the subproblems to obtain a solution to the original problem. Modular analysis of genomic data implements this strategy by dividing the original genomic data into a smaller number of modules and then conquering the reduced complexity by using these modules for prioritization to give a shorter list of disease-associated genes. Such genes could either be causal drivers of disease or secondary reactions to disease that could potentially be useful biomarkers.</p>
         <p>Benson and colleagues, in a recent paper in <it>BMC Systems Biology </it><abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, have used a modular approach to study allergic asthma. They managed to divide the complexity and arrive at the gene encoding the interleukin-7 receptor (IL7R) as a putative key regulator in allergic asthma. Importantly, their computational analysis is accompanied by experiments. Here, I put their analysis in the context of other modular approaches and discuss the possible use of this methodology for finding and prioritizing useful candidates for therapeutics.</p>
      </sec>
      <sec>
         <st>
            <p>Dividing complex biological data into modules of disease-associated genes</p>
         </st>
         <p>Not surprisingly, there are several different ideas on how to divide and conquer high-throughput functional genomics data. I will restrict my discussion here to gene expression data, although similar remarks could be made for sequence data. Conceptually there are two distinct problems. One is: given a module of disease-associated genes, how can we compute and/or experimentally predict which genes are good candidates for therapeutics? Before discussing this problem I will first give an overview of different approaches to the other problem: identifying a module of genes.</p>
         <p>A module is a group of genes that are related to each other in some way, so a module is effectively defined by a measure of similarity. Grouping genes into modules therefore depends on an exact mathematical definition of similarity. For example, if similarity is defined as the distance in a network, then a graph theoretical calculation will be used. However, if gene functional associations are used, then gene similarity will be measured in terms of gene ontology (GO) or correlation in gene expression values. As a result, different definitions lead to different algorithms for dividing the genes into modules, a fact that could be confusing for the clinical researcher.</p>
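         <p>To make the dependence on a similarity definition concrete, the following minimal sketch uses Pearson correlation between expression profiles as the similarity measure and reports gene pairs above a cut-off. The gene names, expression values and the threshold of 0.9 are hypothetical and chosen only for illustration; they are not taken from any of the studies discussed here.</p>

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)

def similar_pairs(profiles, threshold=0.9):
    """Return gene pairs whose correlation is at least `threshold`."""
    genes = sorted(profiles)
    pairs = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if pearson(profiles[g], profiles[h]) >= threshold:
                pairs.append((g, h))
    return pairs

# Hypothetical expression profiles across five samples (arbitrary units).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises together with geneA
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],  # falls as geneA rises
}

print(similar_pairs(profiles))  # [('geneA', 'geneB')]
```

         <p>Replacing the pearson function with a network distance or a GO-based score would, in general, group the same genes differently, which is the point made above.</p>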
         <p>The need to reduce the complexity of the original high-throughput gene expression data was realized early on in its analysis <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Applying established engineering concepts, such as principal component analysis (PCA) and singular value decomposition (SVD), reduced the dimensionality of the data. Instead of analyzing scattered points (the samples) in a high-dimensional space with as many dimensions as there are genes, the data could thereby be projected into a two- to four-dimensional space. However, it turned out to be difficult to make a biological interpretation of the resulting linear combinations of large numbers of genes. This problem forced the development of different strategies in which the available knowledge on a limited number of genes could be used to predict the functions of as-yet uncharacterized genes.</p>
         <p>In the classic compendium study on yeast data by Rosetta Inpharmatics <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, hierarchical clustering was used to group genes (shown as rows) by their similarity of expression across several experimental conditions (columns). Novel gene function was then predicted by inspecting genes in the same cluster as genes with known functions. Subsequent work by Eran Segal and colleagues <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> developed more statistically sound procedures for identifying robust modules using a Bayesian formalism applied to microarray data generated from cancer samples.</p>
         <p>It became clear, however, that a similarity measure based only on correlations was insufficient, because the clusters (modules) or Bayesian modules did not have an internal network structure that could be used for a more refined analysis. As a consequence, a large number of studies addressing this problem appeared in the literature at the beginning of 2004. The idea was that if we could identify the wiring within cellular networks, various algorithms could be applied to find 'connected groups' in such networks. Such an analysis would then provide more biological insight into the mechanisms of disease.</p>
         <p>Now, how can such networks be found using only a small number of experimental samples with a large number of genes? This is an impossible problem from the point of view of engineering system identification, because the number of possible networks consistent with the data is prohibitively large <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. The key simplifying insight came from Ideker and Lauffenburger <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and was later developed by Nicolas Luscombe and colleagues in a pioneering paper <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. Here, the edges (or connections) in the network were simply defined by transcription factor binding experiments, and gene expression data were used to select the subsets of edges that were active under different conditions.</p>
         <p>This idea of defining edges in a network using a static scaffold has since been reused with various data types (protein-protein interaction data, pathways from a database, text mining and DNA variants). The network of interest is then defined by combining the gene expression data with the scaffold, leaving only the active edges. By searching through such an active network using graph algorithms it is then possible to identify its more densely connected parts in a well-defined manner, thereby providing modules with an intrinsic network structure.</p>
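         <p>The scaffold idea can be sketched in a few lines. The example below is only an illustration under stated assumptions: the scaffold edges, the expression levels and the threshold are hypothetical, an edge is called 'active' when both of its genes are expressed above the threshold, and plain connected components stand in for the more refined graph algorithms used in the published studies.</p>

```python
from collections import defaultdict

def active_network(scaffold_edges, expression, threshold):
    """Keep scaffold edges whose two endpoint genes are both expressed."""
    active = {g for g, level in expression.items() if level >= threshold}
    return [(a, b) for a, b in scaffold_edges if a in active and b in active]

def connected_modules(edges):
    """Group genes into modules, here the connected components of the edges."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, modules = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first walk from the start gene
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            stack.extend(neighbours[gene] - component)
        seen |= component
        modules.append(sorted(component))
    return modules

# Hypothetical static scaffold (for example, curated interactions).
scaffold = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g5", "g6")]
# Hypothetical expression levels in one condition; g3 is not expressed.
expression = {"g1": 5.0, "g2": 6.0, "g3": 0.2, "g4": 7.0, "g5": 4.0, "g6": 3.0}

edges = active_network(scaffold, expression, threshold=1.0)
print(edges)                     # [('g1', 'g2'), ('g5', 'g6')]
print(connected_modules(edges))  # [['g1', 'g2'], ['g5', 'g6']]
```

         <p>Because g3 is inactive in this toy condition, the chain from g1 to g4 in the scaffold is broken, and g4 is left without any active edge and so belongs to no module.</p>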
         <p>All the above approaches basically begin with a large, complex dataset, which is then simplified by dividing the data into smaller modules. Interestingly, Vidal and colleagues <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> demonstrated that this process can be reversed. They instead began with four well characterized breast cancer genes and, by using these ideas, constructed a module in which the genes were 'close' as defined by expression and proteomic data in several species.</p>
      </sec>
      <sec>
         <st>
            <p>Finding an allergy-associated module</p>
         </st>
         <p>Benson and colleagues <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> have now contributed to a disease-oriented modular analysis by combining several of the above ideas in a novel manner, as summarized in the flow chart in Figure <figr fid="F1">1</figr>. First, because allergic disease involves multiple cells in different tissues and because no prior characterization of key genes was available, they turned to several different sets of gene expression microarray data in order to find a reference disease-associated gene around which they could construct a module. Using the idea that disease-associated genes tend to interact, they could search for other disease-associated genes that were 'close'. For this purpose, the authors used a graph algorithm that identified a connected clique of 103 disease-associated genes from the microarray data.</p>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Flowchart of the modular analysis by Benson and colleagues <abbrgrp><abbr bid="B1">1</abbr></abbrgrp></p>
            </caption>
            <text>
               <p>Flowchart of the modular analysis by Benson and colleagues <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Integration of several public gene expression datasets revealed a group of shared (blue) and closely connected clique (red and black) disease-associated genes. A subset of these genes was found to share the T-cell receptor signalling pathway, an observation that was then validated by independent experimentation. To identify a transcription factor (GATA3) regulating one gene in this subset, the <it>ITK </it>gene, a promoter analysis was performed. The final module of 37 disease-associated genes consisted of genes listed in public databases as having relevant expression patterns and interacting with GATA3.</p>
            </text>
            <graphic file="jbiol149-1"/>
         </fig>
         <p>The T-cell receptor signaling pathway turned out to be a pathway shared by these 103 genes, as detected by the Ingenuity Pathway Analysis tool, which identifies physical, transcriptional and enzymatic interactions from the literature <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Experimental analysis of this pathway in patient-derived cells revealed strong activation of the <it>ITK </it>gene, which is also known to be located in the genomic susceptibility region for allergy. Combining a promoter analysis of the <it>ITK </it>gene with expression data revealed that the transcription factor GATA3 regulated <it>ITK</it>.</p>
         <p>Finally, using available databases, 47 genes were identified as interacting with GATA3. The expression data were used to filter out 10 inactive genes, thus leaving a final module of 37 disease-associated genes around the GATA3 transcription factor <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. The construction of this module was accompanied by several experimental tests at various stages, lending confidence to the analysis.</p>
      </sec>
      <sec>
         <st>
            <p>Conquering the modules &#8211; selecting therapeutic targets within the module</p>
         </st>
         <p>The problem of selecting therapeutic targets within a module has not received much attention in studies that have used a modular approach for reducing complexity. There are various ideas from graph theory on how to compute mathematically defined properties, such as clustering and connectivity in large networks, which then could suggest which nodes are essential. However, essentiality is not necessarily equivalent to disease association. Experimental investigators have instead performed target selection using the full dataset in combination with extensive experimental testing. This is, by most measures, an inefficient and expensive procedure.</p>
         <p>The analysis by Benson and colleagues <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> is important because it highlights the difficulty of selecting a disease-associated target from a module of 37 genes despite the elegant prior reduction of complexity. They resorted to using a connectivity criterion, selecting the <it>IL7R </it>gene because it had the largest number of connections, and they were also able to demonstrate that perturbing the <it>IL7R </it>gene affected other genes and the T-cell phenotype. There are probably several other disease-associated genes in their module that warrant further experimental investigation.</p>
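         <p>A connectivity criterion of this kind amounts to ranking the genes in a module by their number of distinct interaction partners. The sketch below shows one simple way to do this; the edge list and node names are hypothetical, and the code is not the procedure used by Benson and colleagues, whose exact scoring is not described here.</p>

```python
from collections import Counter

def rank_by_connectivity(edges):
    """Return (gene, degree) pairs, highest degree first, ties by name."""
    # Treat edges as undirected and drop duplicates and self-loops.
    unique = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for pair in unique:
        degree.update(pair)  # each endpoint gains one connection
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical module; ("n3", "hub") repeats ("hub", "n3") in reverse.
module_edges = [
    ("hub", "n1"),
    ("hub", "n2"),
    ("hub", "n3"),
    ("n1", "n2"),
    ("n3", "hub"),
]

ranking = rank_by_connectivity(module_edges)
print(ranking)        # [('hub', 3), ('n1', 2), ('n2', 2), ('n3', 1)]
print(ranking[0][0])  # hub
```

         <p>A ranking like this only reflects how well a gene is covered by the interaction data, which is one reason why the top-ranked gene still requires experimental perturbation before it can be called a candidate.</p>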
      </sec>
      <sec>
         <st>
            <p>Beyond allergy &#8211; translation to the clinic</p>
         </st>
         <p>Benson and colleagues <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> have introduced a useful procedure for defining a module of disease-associated genes. As with most complex diseases, the study of allergy is complicated by the fact that the disease affects several cell types and tissues. Identifying such modules therefore requires the kind of stringent experimental validation that the Benson team performed <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Despite their careful analysis, there remains a significant risk that several disease-associated genes were not captured in their module, because other transcription factors that regulate the <it>ITK </it>gene are also active in the expression datasets.</p>
         <p>The second step of selecting a gene for therapeutics from a module is even more problematic because we currently lack systematic tools for this selection problem. Furthermore, it is quite possible that an efficient therapy would require targeting several disease-associated genes simultaneously. However, even a small ten-gene module contains 120 distinct combinations of three genes, and the number of combinations quickly exceeds what is experimentally feasible to study as the module grows.</p>
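         <p>The size of this combinatorial problem is easy to check. The short sketch below counts unordered three-gene combinations for a ten-gene module and, as a further illustration, for a module of 37 genes, the size of the final module discussed above; the gene labels are placeholders.</p>

```python
from itertools import combinations
from math import comb

# Placeholder labels for a hypothetical ten-gene module.
module = [f"gene{i}" for i in range(1, 11)]

# Enumerate every unordered combination of three genes.
triples = list(combinations(module, 3))
print(len(triples))  # 120

# The closed-form count agrees with the enumeration.
print(comb(10, 3))   # 120

# The same count for a 37-gene module.
print(comb(37, 3))   # 7770
```

         <p>Each combination would in practice also need to be tested under several conditions, so these counts are lower bounds on the experimental effort.</p>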
         <p>In conclusion, Benson and colleagues <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> have devised an interesting method for finding disease-associated genes, but it needs to be evaluated on other complex diseases. Their study also makes clear that the problem of prioritizing disease-associated genes within a module for therapeutic studies in the clinic is still unsolved.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>I thank the Swedish Research Council for support.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>A module-based analytical strategy to identify novel disease associated genes shows an inhibitory role for interleukin 7 receptor in allergic inflammation</p>
            </title>
            <aug>
               <au>
                  <snm>Mobini</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Andersson</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Erjef&#228;lt</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hahn-Zoric</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Langston</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Perkins</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Cardell</snm>
                  <fnm>L-O</fnm>
               </au>
               <au>
                  <snm>Benson</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>BMC Systems Biol</source>
            <pubdate>2009</pubdate>
            <volume>3</volume>
            <fpage>19</fpage>
            <xrefbib>
               <pubid idtype="doi">10.1186/1752-0509-3-19</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Singular value decomposition for genome-wide expression data and modelling</p>
            </title>
            <aug>
               <au>
                  <snm>Alter</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2000</pubdate>
            <volume>97</volume>
            <fpage>10101</fpage>
            <lpage>10106</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">27718</pubid>
                  <pubid idtype="pmpid" link="fulltext">10963673</pubid>
                  <pubid idtype="doi">10.1073/pnas.97.18.10101</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Functional discovery via a compendium of expression profiles</p>
            </title>
            <aug>
               <au>
                  <snm>Hughes</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Marton</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Roberts</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Stoughton</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Armour</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Bennett</snm>
                  <fnm>HA</fnm>
               </au>
               <au>
                  <snm>Coffey</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Dai</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>YD</fnm>
               </au>
               <au>
                  <snm>Kidd</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>King</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Meyer</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Slade</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lum</snm>
                  <fnm>PY</fnm>
               </au>
               <au>
                  <snm>Stepaniants</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Shoemaker</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>Gachotte</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Chakraburtty</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Simon</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bard</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Friend</snm>
                  <fnm>SH</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2000</pubdate>
            <volume>102</volume>
            <fpage>109</fpage>
            <lpage>126</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0092-8674(00)00015-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">10929718</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Module networks: identifying regulatory modules and their condition specific regulators from gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Segal</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Shapira</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Regev</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pe'er</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Koller</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2003</pubdate>
            <volume>34</volume>
            <fpage>166</fpage>
            <lpage>176</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1165</pubid>
                  <pubid idtype="pmpid" link="fulltext">12740579</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Perturbations to uncover gene networks</p>
            </title>
            <aug>
               <au>
                  <snm>Tegn&#233;r</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bj&#246;rkegren</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2007</pubdate>
            <volume>23</volume>
            <fpage>34</fpage>
            <lpage>41</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tig.2006.11.003</pubid>
                  <pubid idtype="pmpid" link="fulltext">17098324</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Building with a scaffold: emerging strategies for high- to low-level cellular modeling</p>
            </title>
            <aug>
               <au>
                  <snm>Ideker</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Lauffenburger</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Trends Biotechnol</source>
            <pubdate>2003</pubdate>
            <volume>21</volume>
            <fpage>252</fpage>
            <lpage>262</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1016/S0167-7799(03)00115-X</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Genomic analysis of regulatory network dynamics reveals large topological changes</p>
            </title>
            <aug>
               <au>
                  <snm>Luscombe</snm>
                  <fnm>NM</fnm>
               </au>
               <au>
                  <snm>Babu</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Snyder</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Teichmann</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2004</pubdate>
            <volume>431</volume>
            <fpage>308</fpage>
            <lpage>312</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature02782</pubid>
                  <pubid idtype="pmpid" link="fulltext">15372033</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Network modelling links breast cancer susceptibility and centrosome dysfunction</p>
            </title>
            <aug>
               <au>
                  <snm>Pujana</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Han</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Starita</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Stevens</snm>
                  <fnm>KN</fnm>
               </au>
               <au>
                  <snm>Tewari</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ahn</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Rennert</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Moreno</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Kirchhoff</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Gold</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Assmann</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Elshamy</snm>
                  <fnm>WM</fnm>
               </au>
               <au>
                  <snm>Rual</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Levine</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Rozek</snm>
                  <fnm>LS</fnm>
               </au>
               <au>
                  <snm>Gelman</snm>
                  <fnm>RS</fnm>
               </au>
               <au>
                  <snm>Gunsalus</snm>
                  <fnm>KC</fnm>
               </au>
               <au>
                  <snm>Greenberg</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Sobhian</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bertin</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Venkatesan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ayivi-Guedehoussou</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Sol&#233;</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Hern&#225;ndez</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>L&#225;zaro</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Nathanson</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Weber</snm>
                  <fnm>BL</fnm>
               </au>
               <au>
                  <snm>Cusick</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Hill</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Offit</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Livingston</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Gruber</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Parvin</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Vidal</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2007</pubdate>
            <volume>39</volume>
            <fpage>1338</fpage>
            <lpage>1349</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng.2007.2</pubid>
                  <pubid idtype="pmpid" link="fulltext">17922014</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
