First Major Release + GUI
After much testing and feedback from users (thank you!), we have gone through
the code for HyDe and have updated the package architecture and documentation.
Many of the changes in this update are cosmetic and will not change how users
interact with HyDe. However, one addition that is likely to change the user experience
is a new script, hyde_gui.py, which launches a graphical user interface (GUI)
to allow users to set up analyses in an interactive window.
Below are other changes that were made during this update:
- Dropping support for Python 2.7. Python 2.7 is set to lose support in the next year, so we are moving forward with Python 3 development only.
- Submodules in the core folder have been moved up one level. The core module itself no longer exists.
- The visualization module has been removed. We instead provide code for generating the same types of plots in the documentation, but no longer include the plotting functions as part of the software.
Ignore missing or ambiguous sites
- All scripts now have the option to ignore missing or ambiguous bases when running an analysis with HyDe. Just add --ignore_amb_sites as a flag at the command line.
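As a rough illustration of what ignoring such sites can mean, the sketch below drops every alignment column in which any sequence carries something other than A, C, G, or T. This is an assumption made for illustration only: the function name `drop_ambiguous_sites` is hypothetical, and HyDe's own handling (for example, whether sites are skipped per test rather than across the whole alignment) is not described in these notes.

```python
# Illustration only: this is NOT HyDe's internal implementation.
UNAMBIGUOUS = set("ACGT")


def drop_ambiguous_sites(seqs):
    """Keep only alignment columns where every base is A, C, G, or T.

    `seqs` maps a taxon name to its aligned sequence (all the same length).
    """
    names = list(seqs)
    # Walk the alignment column by column.
    columns = zip(*(seqs[name].upper() for name in names))
    kept = [col for col in columns if all(base in UNAMBIGUOUS for base in col)]
    # Transpose the kept columns back into one string per taxon.
    rows = ["".join(chars) for chars in zip(*kept)] if kept else [""] * len(names)
    return dict(zip(names, rows))


example = {"sp1": "ACGTN", "sp2": "AC-TA", "sp3": "ACRTA"}
filtered = drop_ambiguous_sites(example)
```

Here the third column (containing a gap and the ambiguity code R) and the fifth column (containing N) are removed, leaving three sites per taxon.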
Removing dependence on hyde_cpp
- The hyde_cpp executable has been deprecated in favor of using the *_hyde.py scripts, which are just as fast. It will be available as its own GitHub repository for legacy purposes.
- The phyde.analyze submodule has been removed, and all functions calling the hyde_cpp program have been either modified or removed.
- We anticipate that this will make the installation process simpler, and will allow us to distribute the software as purely a Python package (with Cython for speed).
New multithreaded scripts
- All of the *_hyde.py scripts now have multithreaded versions: run_hyde_mp.py, individual_hyde_mp.py, and bootstrap_hyde_mp.py. Parallelization with these scripts works best with data sets that have a lot of species. We use the multiprocess module, which will need to be installed: pip install multiprocess.
- We also added a quiet option (-q or --quiet) to suppress printing to stdout while the program is running.
Input file format update
- All of the *_hyde.py scripts now accept DNA sequence data files in sequential Phylip format (i.e., you don't need to remove the header line). hyde_cpp also accepts data in this format.
- We maintain backwards compatibility with the previous input format without the header information as well.
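A minimal sketch of how a reader might accept both layouts is shown below. It assumes the header is a single line holding two integers (number of taxa, number of sites) and that each remaining line holds a name followed by its sequence. The function name `read_sequences` is hypothetical and is not part of the phyde module.

```python
def read_sequences(lines):
    """Parse sequential Phylip-style lines, with or without a header line.

    Returns (header, seqs), where header is (n_taxa, n_sites) or None and
    seqs maps each taxon name to its sequence string.
    """
    # Split on whitespace and drop blank lines (e.g. an empty line at the end).
    rows = [line.split() for line in lines if line.strip()]
    if not rows:
        return None, {}

    header = None
    first = rows[0]
    # Assumption: a header is exactly two integer fields.
    if len(first) == 2 and first[0].isdigit() and first[1].isdigit():
        header = (int(first[0]), int(first[1]))
        rows = rows[1:]

    # Join any extra fields in case a sequence contains internal spaces.
    seqs = {name: "".join(parts) for name, *parts in rows}
    return header, seqs


with_header = ["3 4", "sp1 ACGT", "sp2 ACGA", "sp3 ACGG", ""]
without_header = with_header[1:]

header_a, seqs_a = read_sequences(with_header)
header_b, seqs_b = read_sequences(without_header)
```

Both calls yield the same sequences; only the first returns a non-empty header, which a caller could compare against the parsed data as a sanity check.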
Bug fixes
- Corrected bootstrap_hyde.py output.
- Bootstrap class now correctly calculates # of bootstrap reps.
- Better handling of files with extra empty line at the end.
Changes introducing incompatibilities with previous versions:
- The ability to do bootstrapping was removed from hyde_cpp. This functionality is now provided by the bootstrap_hyde.py script. All scripts relying on bootstrapping with hyde_cpp have been changed accordingly.
New workflow scripts:
We introduce a new workflow for HyDe using three main Python scripts that can be
called from the command line. They are automatically installed with the phyde
module. A new input file is also introduced for specifying the triples that
are to be tested using the methods implemented in the scripts (see below).
- run_hyde.py: this script remains largely the same as in previous versions. We have also added the ability to test specific hypotheses rather than running a full analysis (all triples in all directions).
- individual_hyde.py: this new script takes a table of specified triples and tests all individuals in the hybrid population separately.
- bootstrap_hyde.py: this new script will bootstrap resample individuals within the hybrid populations that are specified by the table of triples.
Input format for specifying triples:
The three scripts listed above can take as input a three column table where each row specifies the names of taxa that are to be tested for hybridization. The order is (Parent 1, Hybrid, Parent 2) as below:
sp1 sp2 sp3
sp1 sp2 sp4
sp2 sp3 sp4
.
.
.
The individual_hyde.py and bootstrap_hyde.py scripts can also take output files
from previous runs of HyDe. These contain header information that the scripts will
automatically check for and will then parse the first three columns of the file.
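The sketch below shows one way such a table could be read, covering both the plain three-column form and a file that begins with a header row. The function name `read_triples` is hypothetical, and the header check (a first field equal to "P1") as well as the extra column names in the example are assumptions made for illustration; the scripts' actual header detection is not described here.

```python
def read_triples(lines, header_first_field="P1"):
    """Collect (parent1, hybrid, parent2) triples from whitespace-separated lines.

    Only the first three columns are used, so lines with additional
    columns (e.g. from an earlier results file) are accepted as well.
    """
    triples = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        if fields[0] == header_first_field:
            continue  # assumed header row: skip it
        triples.append(tuple(fields[:3]))
    return triples


plain_table = ["sp1 sp2 sp3", "sp1 sp2 sp4", "sp2 sp3 sp4"]

# Hypothetical results-style input: a header row plus extra columns.
previous_output = [
    "P1\tHybrid\tP2\tZscore",
    "sp1\tsp2\tsp3\t0.0",
    "",
]

from_table = read_triples(plain_table)
from_output = read_triples(previous_output)
```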
This release can be used as a legacy version for the old workflow for HyDe. Ideally users should switch to the newest version (see above).
First release of HyDe.
Features:
- hyde_cpp executable for performing full scale hybridization detection analyses on phylogenomic data sets.
- phyde Python module for performing individual hypothesis tests, parsing and analyzing results from hyde_cpp, plotting, and calculating D-Statistics.