<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=us-ascii">
<title>HapSeq2 - A Program for Haplotype Phasing and Genotype Calling from Next Generation Sequencing Data
Using Haplotype Information of Jumping Reads
</title>
</head>
<body bgcolor=white text=navy vlink=green alink=green link=purple>
<basefont font="Times New Roman">
<p>
<center> <font size=+2 color=purple>HapSeq2 - A Program for Haplotype Phasing and Genotype Calling from Next Generation Sequencing
Data Using Haplotype Information of Jumping Reads
<br> <br>
Degui Zhi and Kui Zhang <br> <br>
</font></center>
</p>
<p> <hr size="#3" width="100%" color="RED"></hr> </p>
<p align="left"> <font size=+2 color=purple> <b>About HapSeq2</b> </font></p>
<p align="left"> HapSeq2 is a program for genotype calling and haplotype phasing from next generation sequencing data using haplotype information
from jumping reads. Previously, we developed a Hidden Markov Model (HMM) based method for genotype calling and haplotype phasing from next generation
sequencing data that takes into account jumping read information across two adjacent potential polymorphic sites.
Our method extends the HMM in the Thunder program (Li et al., 2010) and explicitly models jumping read information as emission
probabilities conditional on the states of adjacent sites (Zhi et al., 2012). The method is implemented in the program HapSeq,
which is written in C++ and based on the source code of the Thunder program provided by Drs. Yun Li and Goncalo Abecasis.
A detailed description of the method implemented in HapSeq can be found in our manuscript (Zhi et al., 2012).
</p>
<p>
Recently, we extended our method to incorporate the haplotype information of multiple adjacent and/or non-adjacent sites from sequencing reads.
Our model is inspired by the model implemented in HASH, which was originally designed to phase individual genomes (Bansal et al., 2008).
We developed a new, computationally feasible hybrid MCMC algorithm that combines the Gibbs sampling algorithm of HapSeq with a
Metropolis-Hastings algorithm similar to that of HASH.
We show by simulation and with real data from the 1000 Genomes Project that our model offers superior performance
over existing methods for both haplotype phasing and genotype calling for population NGS data.
The new method is implemented in the program HapSeq2. For a detailed description of the method implemented in HapSeq2,
please refer to our manuscripts (Zhi et al., 2012; Zhang and Zhi, 2013). </p>
<p>
<p align="left"> <font size=+2 color=purple> <b>HapSeq2 Download</b> </font></p>
<ul>
<p>
<li> <a href="HapSeq2.zip">HapSeq2.zip</a>. This file includes the compiled HapSeq2 program for the Windows XP and Linux operating systems, the program manual, and an example with the corresponding input files.
</p>
</ul>
</p>
<p align="left"> <font size=+2 color=purple> <b>Program Manual</b> </font>
<ul>
<p>
<li> <a href="HapSeq2-Manual.pdf">HapSeq2 Manual</a>
</p>
</ul>
</p>
<p align="left"> <font size=+2 color=purple> <b>Input Files</b> </font>
<ul>
<p>
<li> HapSeq2 requires a set of input files. Please refer to the
<a href="HapSeq2-Manual.pdf">HapSeq2 Manual</a>
for a detailed description of these input files. In general, these files should be
extracted from a set of read alignment (BAM) files.
We have implemented several programs to extract the input files required by HapSeq2. These programs and the pipeline for
using them are described on this <a href="http://www.ssg.uab.edu/wiki/x/RoAPAQ">Wiki Page</a>.
</p>
</ul>
</p>
<p align="left"> <font size=+1 color=purple> <b>Old Program</b> </font>
<p>
<ul>
<li> Please go to this <a href="HapSeq/hapseq-index.html">Web Site</a> to find
and download the old version of HapSeq.
</li>
</ul>
</p>
<p align="left"> <font size=+1 color=purple> <b>Program History</b> </font>
<p>
<ul>
<li> December 15, 2011: The program HapSeq was implemented and released.
</li>
<li> January 4, 2013: The program HapSeq2 was implemented and released.
</li>
</ul>
</p>
<p align="left"> <font size=+1 color=purple> <b>References</b> </font>
<p>
<ul>
<li> Bansal V, Halpern AL, Axelrod N, Bafna V. 2008. An MCMC algorithm for haplotype assembly from whole-genome sequence data. <I>Genome Research</I> 18: 1336-1346.
</li>
<li> Li Y, Willer CJ, Scheet P, Abecasis GR. 2010. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. <I>Genetic Epidemiology </I> 34: 816-834.
</li>
<li> Zhi D, Wu J, Liu N, Zhang K. 2012. Genotype calling from next-generation sequencing data using haplotype information of reads. <I> Bioinformatics </I> 28: 938-946.
</li>
<li>
Zhang K, Zhi D. 2013. Joint haplotype phasing and genotype calling of multiple individuals using long-range haplotype informative reads. Submitted to <I>Bioinformatics</I>.
</li>
</ul>
</p>
<p align="left"> <font size=+1 color=purple> <b>How to Cite</b> </font>
<p> Please cite the following papers if you use our program:
<p>
<ul>
<li> Zhi D, Wu J, Liu N, Zhang K. 2012. Genotype calling from next-generation sequencing data using haplotype information of reads. <I> Bioinformatics </I> 28: 938-946.
</li>
<li>
Zhang K, Zhi D. 2013. Joint haplotype phasing and genotype calling of multiple individuals using long-range haplotype informative reads. Submitted to <I>Bioinformatics</I>.
</li>
</ul>
</p>
<p>
<hr size="#3" width="100%" color="RED"></hr>
<p align="left">
<b>We plan to update our program regularly. You are welcome to suggest features
that you would like us to implement. We would greatly appreciate it if you could report any bugs
you encounter when using our program. Our contact information is:
</b>
<ul>
<table cellpadding=30>
<tr>
<td>
<b>Degui Zhi, Ph.D. </b><br>
Section on Statistical Genetics<br>
Department of Biostatistics <br>
University of Alabama at Birmingham<br>
Birmingham, AL, 35294 <br>
(205) 975-9192 (phone) <br>
(205) 975-2540 (fax) <br>
Email: <a href="mailto:[email protected]">[email protected]</a> <br>
Homepage: <a href="http://www.soph.uab.edu/ssg/people/zhi"> http://www.soph.uab.edu/ssg/people/zhi </a> <br>
</td>
<td> <strong> or </strong> </td>
<td>
<b>Kui Zhang, Ph.D. </b><br>
Section on Statistical Genetics<br>
Department of Biostatistics <br>
University of Alabama at Birmingham<br>
Birmingham, AL, 35294 <br>
(205) 996-4094 (phone) <br>
(205) 975-2540 (fax) <br>
Email: <a href="mailto:[email protected]">[email protected]</a> <br>
Homepage: <a href="http://www.soph.uab.edu/ssg/zhang"> http://www.soph.uab.edu/ssg/zhang </a> <br>
</td>
</table>
</ul>
<div align="left">
<i><b>Created Date:</b> January 4, 2013</i><br>
<i><b>Last Updated Date:</b> January 4, 2013</i><br>
</div>
</body>
</html>