Releases: nextstrain/augur
26.1.0
These release notes are automatically extracted from the full changelog.
Features
- ancestral, translate: Add `--skip-validation` as an alias to `--validation-mode=skip`. #1656 (@victorlin)
- clades: Allow customizing the validation of input node data JSON files with `--validation-mode` and `--skip-validation`. #1656 (@victorlin)
- tree: When using iqtree, check for all synonyms of default args when detecting potential conflicts, e.g. `--threads-max` is equivalent to `-ntmax`. Previously, we were only checking for the latter. Also use new, preferred IQtree2 option names (e.g. `--polytomy` instead of `-czb` etc.). #1547 (@corneliusroemer)
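The synonym check described for `tree` can be sketched roughly as follows. This is not Augur's implementation: the `SYNONYMS` table and both function names are hypothetical, and only the two option pairs named in the note above are included.

```python
# Hypothetical table mapping older IQ-TREE spellings to the preferred ones.
# Only the two pairs mentioned in the release note are listed here.
SYNONYMS = {
    "-ntmax": "--threads-max",
    "-czb": "--polytomy",
}


def canonical(option):
    """Return the preferred spelling of an option, or the option unchanged."""
    return SYNONYMS.get(option, option)


def find_conflicts(default_args, user_args):
    """Return user-supplied options that collide with a default in any spelling."""
    defaults = {canonical(arg) for arg in default_args if arg.startswith("-")}
    return [
        arg
        for arg in user_args
        if arg.startswith("-") and canonical(arg) in defaults
    ]


conflicts = find_conflicts(["-ntmax", "4"], ["--threads-max", "8"])
```

Normalizing both sides to one spelling before comparing is what lets `--threads-max` be caught even though the default is written as `-ntmax`.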
Bug Fixes
- index: Previously specifying a directory that does not exist in the path to `--output` would result in an incorrect error stating that the input file does not exist. It now shows the correct path responsible for the error. #1644 (@victorlin)
- curate format-dates: Update help docs and improve failure messages to show use of `--expected-date-formats`. #1653 (@joverlee521)
26.0.0
Major Changes
- filter: Duplicate header names in the FASTA file (`--sequences`) will now result in an error. #1613 (@victorlin)
- parse: When both `strain` and `name` fields are present, the `strain` field will now be used as the sequence ID field. #1629 (@victorlin)
- merge: Generated source columns (e.g. `__source_metadata_{NAME}`) are now omitted by default. They may be explicitly included with `--source-columns=TEMPLATE` or explicitly omitted with `--no-source-columns`. This may be a breaking change for any existing uses of `augur merge` relying on the generated columns, though as `augur merge` is relatively new we believe usage to be scant if extant at all. #1625 #1632 (@tsibley)
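To check a FASTA file for duplicate headers before upgrading, something like the following may help. It is a standalone sketch, not the check `augur filter` performs; the function name is hypothetical, and treating the first whitespace-delimited word of the header as the sequence ID is an assumption.

```python
from collections import Counter


def duplicate_fasta_ids(lines):
    """Return the sorted sequence IDs that appear on more than one header line.

    Assumption: the ID is the first whitespace-delimited word after '>'.
    """
    ids = [
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and len(line.strip()) > 1
    ]
    return sorted(name for name, count in Counter(ids).items() if count > 1)


example = [
    ">A description\n", "ACGT\n",
    ">B\n", "ACGT\n",
    ">A\n", "TTTT\n",
]
duplicates = duplicate_fasta_ids(example)
```

In practice `lines` could be an open file handle, since the function only iterates over it once.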
Bug Fixes
- filter: Previously, when `--subsample-max-sequences` was slightly lower than the number of groups, it was possible to fail with an uncaught `AssertionError`. Internal calculations have been adjusted to prevent this from happening. #1588 #1598 (@victorlin)
25.4.0
Features
- merge: Table-specific id columns and delimiters may now be specified, e.g. `--metadata-id-columns X=id Y=strain` and `--metadata-delimiters X=, Y=';'`, to allow more precise behaviour and avoid ordering issues. #1594 (@tsibley)
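The `TABLE=VALUE` form used by these options can be illustrated with a small parser. This is only a sketch of the argument shape, with a hypothetical function name; it makes no claim about how `augur merge` parses its arguments internally.

```python
def parse_named_values(pairs):
    """Turn ['X=id', 'Y=strain'] into {'X': 'id', 'Y': 'strain'}.

    Splits on the first '=' only, so the value itself may contain '='.
    """
    result = {}
    for pair in pairs:
        name, separator, value = pair.partition("=")
        if not separator or not name:
            raise ValueError(f"expected NAME=VALUE, got {pair!r}")
        result[name] = value
    return result


id_columns = parse_named_values(["X=id", "Y=strain"])
# The shell removes the quotes in Y=';' before the program sees the argument.
delimiters = parse_named_values(["X=,", "Y=;"])
```

Keying each setting by table name is what removes the dependence on argument order that the note mentions.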
Bug Fixes
- filter: Improved warning and error messages in the case of missing columns. #1604 (@victorlin)
- merge: Any user-customized `~/.sqliterc` file is now ignored so it doesn't break `augur merge`'s internal use of SQLite. #1608 (@tsibley)
- merge: Non-id columns in metadata inputs that would conflict with the output id column are now forbidden and will cause an error if present. Previously they would overwrite values in the output id column, causing incorrect output. #1593 (@tsibley)
- import: Spaces in BEAST MCC tree annotations (for example, from a discrete state reconstruction) no longer break `augur import beast`'s parsing. #1610 (@watronfire)
25.3.0
Features
- A new command, `augur merge`, now allows for generalized merging of two or more metadata tables. #1563 (@tsibley)
- Two new commands, `augur read-file` and `augur write-file`, now allow external programs to do i/o like Augur by piping from/to these new commands. They provide handling of compression formats and newlines consistent with the rest of Augur. #1562 (@tsibley)
- A new debugging mode can be enabled by setting the `AUGUR_DEBUG` environment variable to `1` (or any non-empty value). Currently the only effect is to print more information about handled (i.e. anticipated) errors. For example, stack traces and parent exceptions in an exception chain are normally omitted for handled errors, but setting this env var includes them. Future debugging and troubleshooting features, like verbose operation logging, will likely also condition on this new debugging mode. #1577 (@tsibley)
- filter: Added the ability to use weights in subsampling. See help text of `--group-by-weights` and the updated Filtering and Subsampling guide for more information. #1454 (@victorlin)
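The `AUGUR_DEBUG` behaviour described above (any non-empty value turns on fuller reporting of handled errors) follows a common pattern that can be sketched as below. The function names are hypothetical and this is not Augur's code; only the environment variable name comes from the note.

```python
import os
import sys
import traceback


def debug_enabled(environ=os.environ):
    """True when AUGUR_DEBUG is set to any non-empty value."""
    return bool(environ.get("AUGUR_DEBUG"))


def report_handled_error(error, environ=os.environ, stream=sys.stderr):
    """Print a short message normally, or the full chained traceback in debug mode."""
    if debug_enabled(environ):
        # print_exception follows __cause__/__context__, so parent exceptions
        # in the chain are printed as well.
        traceback.print_exception(type(error), error, error.__traceback__, file=stream)
    else:
        print(f"ERROR: {error}", file=stream)
```

Passing `environ` explicitly keeps the check easy to exercise without modifying the real process environment.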
Bug Fixes
- Embedded newlines in quoted field values of metadata files read/written by many commands, annotation files read by `augur curate apply-record-annotations`, and index files written by `augur index` are now properly handled. #1561 #1564 (@tsibley)
- Output written to stderr (e.g. informational messages, warnings, errors, etc.) is now always line-buffered regardless of the Python version in use. This helps with interleaved stderr and stdout. Previously, stderr was block-buffered on Python 3.8 and line-buffered on 3.9 and higher. #1563 (@tsibley)
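As an illustration of what "embedded newlines in quoted field values" means, the sketch below reads a small TSV with Python's standard `csv` module. The sample data is made up, and this shows the general mechanism only, not Augur's own I/O code.

```python
import csv
import io

# A two-record TSV in which the first record's 'note' value spans two lines.
text = 'strain\tnote\nA\t"line one\nline two"\nB\tplain\n'

# newline="" stops Python from translating line endings before the csv module
# sees them, which is what allows newlines inside quoted fields to survive.
handle = io.StringIO(text, newline="")
rows = list(csv.DictReader(handle, delimiter="\t"))
```

Reading the same text by splitting on `"\n"` would instead produce three broken records, which is the kind of mishandling the fix addresses.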
25.2.0
Features
- export v2: we now limit numerical precision on floats in the JSON. This should not change how a dataset is displayed / interpreted in Auspice but allows the gzipped & minimised JSON filesize to be reduced by around 30% (dataset-dependent). #1512 (@jameshadfield)
- traits, export v2: `augur traits` now reports all confidence values above 0.1% rather than limiting them to the top 4 results. There is no change in the eventual Auspice dataset as `augur export v2` will still only consider the top 4. #1512 (@jameshadfield)
- curate: Excel (`.xlsx` and `.xls`) and OpenOffice (`.ods`) spreadsheet files are now also supported as metadata inputs (`--metadata`). The first sheet in the workbook is read as tabular data. #1550 (@tsibley)
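Limiting float precision in a JSON structure, as described for `export v2`, might look something like the sketch below. The function name and the choice of 6 significant figures are assumptions made for illustration; the notes do not state the precision Augur uses.

```python
import json


def limit_precision(value, sig_figs=6):
    """Recursively round every float in a JSON-like structure.

    sig_figs=6 is an illustrative default, not Augur's actual setting.
    """
    if isinstance(value, float):
        return float(f"{value:.{sig_figs}g}")
    if isinstance(value, dict):
        return {key: limit_precision(item, sig_figs) for key, item in value.items()}
    if isinstance(value, list):
        return [limit_precision(item, sig_figs) for item in value]
    return value


node = {"div": 0.123456789, "num_date": [2020.123456789, 2021], "name": "A"}
compact = json.dumps(limit_precision(node), separators=(",", ":"))
```

Shorter float literals are where the file-size saving comes from; integers and strings pass through untouched.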
Bug Fixes
25.1.1
25.1.0
Features
- Support xopen major version 2. Support for xopen v1 is deprecated and scheduled for removal around November 2024. #1532 (@corneliusroemer)
- Support networkx major version 3. #1534 (@corneliusroemer)
25.0.0
Major changes
- curate format-dates: Raises an error if provided date field does not exist in records. #1509 (@joverlee521)
- All curate subcommands: Verifies all input records have the same fields and raises an error if a record does not have matching fields. #1518 (@joverlee521)
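The field-consistency check now applied by the curate subcommands can be pictured as follows. This is a minimal sketch with a hypothetical function name and error type, not the code Augur runs.

```python
def check_uniform_fields(records):
    """Yield records unchanged, raising if any record's fields differ from the first."""
    expected = None
    for index, record in enumerate(records):
        fields = set(record)
        if expected is None:
            expected = fields
        elif fields != expected:
            raise ValueError(
                f"record {index} has fields {sorted(fields)}, "
                f"expected {sorted(expected)}"
            )
        yield record


good = list(check_uniform_fields([{"strain": "A", "date": "2020"},
                                  {"date": "2021", "strain": "B"}]))
```

Comparing sets means field order does not matter, only which fields are present.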
Features
- Added a new sub-command `augur curate apply-geolocation-rules` to apply user curated geolocation rules to the geolocation fields in a metadata file. Previously, this was available as a script within the nextstrain/ingest repo. #1491 (@victorlin)
- Added a default color for the "Asia" region that will be used in `augur export` if no custom colors are provided. #1490 (@joverlee521)
- Added a new sub-command `augur curate apply-record-annotations` to apply user curated annotations to existing fields in a metadata file. Previously, this was available as the `merge-user-metadata` script in the nextstrain/ingest repo. #1495 (@joverlee521)
- Added a new sub-command `augur curate abbreviate-authors` to abbreviate lists of authors to " et al." Previously, this was available as the `transform-authors` script within the nextstrain/ingest repo. #1483 (@genehack)
- Added a new sub-command `augur curate parse-genbank-location` to parse the `geo_loc_name` field from GenBank records. Previously, this was available as the `translate-genbank-location` script within the nextstrain/ingest repo. #1485 (@genehack)
- curate format-dates: Added defaults to `--expected-date-formats` so that ISO 8601 dates (`%Y-%m-%d`) and its various masked forms (e.g. `%Y-XX-XX`) are automatically parsed by the command. #1501 (@joverlee521)
- Added a new sub-command `augur curate transform-strain-name` to filter strain names based on matching a regular expression. Previously, this was available as the `transform-strain-names` script within the nextstrain/ingest repo. #1514 (@genehack)
- Added a new sub-command `augur curate rename` to rename field / column names. Previously, a similar version was available as the `transform-field-names` script within the nextstrain/ingest repo; however, the behaviour is slightly changed here. #1506 (@jameshadfield)
Bug Fixes
- filter: Improve speed of checking duplicates in metadata, especially for large files. #1466 (@victorlin)
- curate: Stop adding double quotes to the metadata TSV output when field values have internal quotes. #1493 (@joverlee521)
- curate format-dates: Mask empty date values as `XXXX-XX-XX` to represent unknown dates. #1509 (@joverlee521)
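The `format-dates` behaviour in this release (default ISO 8601 formats including masked forms, and empty values masked as `XXXX-XX-XX`) can be approximated with the sketch below. The function name and the exact contents of `DEFAULT_FORMATS` are assumptions; the notes only name `%Y-%m-%d` and `%Y-XX-XX` as examples.

```python
from datetime import datetime

# Illustrative defaults; the real default list may differ.
DEFAULT_FORMATS = ("%Y-%m-%d", "%Y-%m-XX", "%Y-XX-XX", "XXXX-XX-XX")


def format_date(value, expected_formats=DEFAULT_FORMATS):
    """Return the date as YYYY-MM-DD, masking any part the matched format lacks."""
    value = value.strip()
    if not value:
        return "XXXX-XX-XX"  # empty means unknown
    for fmt in expected_formats:
        try:
            # 'XX' in a format is matched literally by strptime.
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue
        year = f"{parsed.year:04d}" if "%Y" in fmt else "XXXX"
        month = f"{parsed.month:02d}" if "%m" in fmt else "XX"
        day = f"{parsed.day:02d}" if "%d" in fmt else "XX"
        return f"{year}-{month}-{day}"
    raise ValueError(f"{value!r} matches none of {list(expected_formats)}")
```

Formats are tried in order, so a caller with non-ISO inputs would pass its own list, for example `format_date("15/03/2020", ["%d/%m/%Y"])`.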
24.4.0
Features
- All commands: Allow repeating an option that takes multiple values. Previously, if multiple option flags were specified (e.g. `--exclude-where 'region=A' --exclude-where 'region=B'`), only the last one was used. Now, all values are used. #1445 (@victorlin)
- ancestral, translate: output node data files are now validated. The argument `--validation-mode` is added which controls this behaviour (default: error). This argument also controls validation of the input node-data file (ancestral only). #1440 (@jameshadfield)
- export: Updated default latitudes and longitudes for geography traits. This only applies if you are not using `--lat-longs` to override the built-in mappings. #1449 (@trvrb)
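The difference between "only the last one was used" and "all values are used" for repeated options can be reproduced with the standard library's `argparse`. The sketch below uses the built-in `extend` action (Python 3.8+) as one way to get the accumulating behaviour; it is not necessarily how Augur implements it, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser(action):
    """Build a parser whose --exclude-where option uses the given action."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--exclude-where", nargs="+", action=action, default=[])
    return parser


argv = ["--exclude-where", "region=A", "--exclude-where", "region=B"]

# Default 'store' action: each repeat overwrites the previous values.
overwritten = build_parser("store").parse_args(argv).exclude_where

# 'extend' action: values from every repeat are accumulated.
accumulated = build_parser("extend").parse_args(argv).exclude_where
```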
Bug Fixes
- validation: we no longer exit with a non-zero exit code when the requested validation mode is "warn" #1440 (@jameshadfield)
- validation: we no longer perform any validation when the requested validation mode is "skip" #1440 (@jameshadfield)
- filter: Send all log messages to `stderr`. This allows output to be written to `stdout` (e.g. `--output-strains /dev/stdout`). #1459 (@victorlin)
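The three validation modes referred to in these fixes (error, warn, skip) can be summarized with a small dispatcher. This is a hypothetical sketch: `ValidationError`, `run_validation` and its return values are illustrative names, not Augur's API.

```python
import sys


class ValidationError(Exception):
    """Raised by a check when the data does not validate (hypothetical type)."""


def run_validation(check, mode="error", stream=sys.stderr):
    """Run a zero-argument check according to the validation mode.

    Returns True if the check passed, False if it failed in 'warn' mode,
    and None if validation was skipped.
    """
    if mode not in ("error", "warn", "skip"):
        raise ValueError(f"unknown validation mode: {mode!r}")
    if mode == "skip":
        return None  # the check is never called
    try:
        check()
    except ValidationError as error:
        if mode == "warn":
            # Warnings go to stderr and do not abort the run.
            print(f"WARNING: {error}", file=stream)
            return False
        raise
    return True
```

In this sketch only `error` mode lets the exception propagate, which corresponds to a non-zero exit; `warn` reports and continues, and `skip` does no work at all.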
24.3.0
Features
- filter: Added a new option `--max-length` to filter out sequences that are longer than a given number of base pairs. #1429 (@victorlin)
- parse: Added support for environments that use pandas 2.x. #1436 (@emollier, @victorlin)
Bug Fixes
- filter: Updated docs with an example of tiered subsampling. #1425 (@victorlin)
- export: Fixes bug #1433, introduced in v23.1.0, that causes validation to fail when gene names start with `nuc`, e.g. `nucleocapsid`. #1434 (@corneliusroemer)
- import: Fixes bug introduced in v24.2.0 that prevented `import beast` from running. #1439 (@tomkinsc)
- translate, ancestral: Compound CDS are now exported as segmented CDS and are now viewable in Auspice. #1438 (@jameshadfield)