Releases: nextstrain/augur

24.2.3

23 Feb 22:08

These release notes are automatically extracted from the full changelog.

Bug Fixes

  • filter: Updated the help and report text of --min-length to explicitly state that the minimum length filter only counts standard nucleotide characters A, C, G, or T (case-insensitive). This has been the behavior since version 3.0.3.dev1, but has never been explicitly documented. #1422 (@joverlee521)
  • frequencies: Fixed a bug introduced in 24.2.0 and 24.1.0 that prevented --regions from working when providing regions other than the default "global" region. #1424
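The --min-length rule described above can be illustrated with a minimal sketch. This is not Augur's implementation; the function names `count_standard_nucleotides` and `passes_min_length` are hypothetical and exist only to show what "counts only A, C, G, or T (case-insensitive)" means in practice.

```python
# Illustrative only: hypothetical helpers, not part of Augur's API.
VALID_NUCLEOTIDES = frozenset("ACGT")


def count_standard_nucleotides(sequence: str) -> int:
    """Count A, C, G, and T characters, ignoring case."""
    return sum(1 for base in sequence.upper() if base in VALID_NUCLEOTIDES)


def passes_min_length(sequence: str, min_length: int) -> bool:
    """Return True when the sequence has at least min_length standard nucleotides."""
    return count_standard_nucleotides(sequence) >= min_length


# 16 characters in total, but N, gap, and ambiguity codes are not counted.
sequence = "acgtNNNN--ACGTRY"
print(len(sequence), count_standard_nucleotides(sequence))
```

Under this rule, a sequence padded with N or gap characters can be much longer than its counted length, which is why a record may be dropped even though its raw length exceeds the threshold.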

24.2.2

16 Feb 22:58

Bug Fixes

  • filter: In versions 24.2.0 and 24.2.1, --query stopped working in cases where internal optimizations added in version 24.2.0 failed to parse the columns from the query. It now falls back to non-optimized behavior that allows queries to work. #1418 (@victorlin)
  • filter: Handle backtick quoting in internal optimizations of --query. #1417 (@victorlin)

24.2.1

14 Feb 00:36

Bug Fixes

  • frequencies: Fixed a bug introduced in 24.2.0 that prevented --method diffusion from working alongside --tree. #1412 (@victorlin)

24.2.0

12 Feb 21:07

Features

  • filter: Added a new option --query-columns that allows specifying what columns are used in --query along with the expected data types. If unspecified, automatic detection of columns and types is attempted. #1294 (@victorlin)
  • augur.io.read_metadata: A new optional columns argument allows specifying a subset of columns to load. The default behavior still loads all columns, so this is not a breaking change. #1294 (@victorlin)
  • augur parse: A new optional --output-id-field argument allows the user to select any ID field for the produced FASTA file (e.g. 'accession' instead of 'name' or 'strain'). #1403 (@j23414)
    • When no --output-id-field is given and the data has both name and strain fields, continue to preferentially use name over strain as the sequence ID field; but, throw a deprecation warning that the order will be switched to prefer strain over name in the future to be consistent with the rest of Augur.
    • Added entry to DEPRECATED.md.
  • Compression should now be supported for all input and output files. Please open an issue if you find one that doesn't! #1381 (@victorlin)
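The ID-field selection described for augur parse can be sketched as follows. This is a hedged illustration of the rule in the notes, not Augur's code: the function `choose_id_field` is hypothetical, and the fallback to the first field when neither name nor strain is present is an assumption.

```python
import warnings


def choose_id_field(fields, output_id_field=None):
    """Pick the field used as the FASTA sequence ID (hypothetical helper)."""
    # An explicitly requested field always wins.
    if output_id_field is not None:
        if output_id_field not in fields:
            raise ValueError(f"Field {output_id_field!r} is not among the parsed fields.")
        return output_id_field

    # Both present: keep preferring 'name' for now, but warn about the planned switch.
    if "name" in fields and "strain" in fields:
        warnings.warn(
            "Both 'name' and 'strain' are present; 'name' is used for now, "
            "but 'strain' will be preferred in the future.",
            DeprecationWarning,
        )
        return "name"

    for candidate in ("name", "strain"):
        if candidate in fields:
            return candidate

    # Assumption: fall back to the first field when neither is present.
    return fields[0]


print(choose_id_field(["strain", "name", "accession"], output_id_field="accession"))
```

Passing the field explicitly avoids the deprecation warning and keeps output stable once the default preference changes.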

Bug Fixes

  • filter: In version 24.1.0, automatic conversion of boolean columns was accidentally removed. It has been restored with additional support for empty values evaluated as None. #1410 (@victorlin)
  • filter: The order of rows in --output-metadata and --output-strains now reflects the order in the original --metadata. #1294 (@victorlin)
  • filter, frequencies, refine: Performance improvements to reading the input metadata file. #1294 (@victorlin)
    • For filter, this comes with increased writing times for --output-metadata and --output-strains. However, net I/O time still decreased during testing of this change.
  • filter: Updated the help text of --include and --include-where to explicitly state that this can add strains that are missing an entry from --sequences. #1389 (@victorlin)
  • filter: Fixed the summary messages to properly reflect force-inclusion of strains that are missing an entry from --sequences. #1389 (@victorlin)
  • filter: Updated wording of summary messages. #1389 (@victorlin)
  • Enforce UTF-8 encoding when reading and writing files. Improve error messages when a non-UTF-8 file is used. #1381 (@victorlin)
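As a rough illustration of what enforcing UTF-8 on read looks like, the sketch below opens a file with an explicit encoding and converts the decoding failure into a clearer message. The helper `read_text_utf8` and the wording of its error are assumptions for illustration, not Augur's implementation.

```python
import os
import tempfile


def read_text_utf8(path):
    """Read a text file strictly as UTF-8 (hypothetical helper)."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except UnicodeDecodeError as error:
        # Replace the low-level decoding error with an actionable message.
        raise ValueError(
            f"File {path!r} contains data that is not valid UTF-8. "
            "Try re-saving the file with UTF-8 encoding."
        ) from error


content = "strain\tcountry\nA\tCôte d'Ivoire\n"

with tempfile.TemporaryDirectory() as tmp:
    good_path = os.path.join(tmp, "good.tsv")
    bad_path = os.path.join(tmp, "bad.tsv")

    with open(good_path, "w", encoding="utf-8") as handle:
        handle.write(content)
    # Latin-1 encodes 'ô' as a single byte that is invalid in this position for UTF-8.
    with open(bad_path, "wb") as handle:
        handle.write(content.encode("latin-1"))

    good_text = read_text_utf8(good_path)
    try:
        read_text_utf8(bad_path)
        bad_error = None
    except ValueError as error:
        bad_error = str(error)

print(bad_error is not None)
```

Specifying the encoding explicitly also avoids depending on the platform's default locale encoding, which differs between systems.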

24.1.0

30 Jan 20:56

Features

  • augur.io.read_metadata: A new optional dtype argument allows custom data types for all columns. Automatic type inference still happens by default, so this is not a breaking change. #1252 (@victorlin)
  • augur.io.read_vcf has been removed, and its usage has been replaced with TreeTime's function of the same name, which has improved validation of the VCF file. #1366 (@jameshadfield)

Bug Fixes

  • filter, frequencies, refine: Speed up reading of the metadata file. #1252 (@victorlin)
  • traits: Previously, columns with only numeric values were treated as numerical data. These are now treated as categorical data for discrete trait analysis. #1252 (@victorlin)
  • Support Biopython ≥1.82 by requiring bcbio-gff ≥0.7.1. #1400 (@victorlin)

24.0.0

22 Jan 23:25

Major Changes

  • ancestral, translate: For VCF inputs please ensure you are using TreeTime 0.11.2 or later. A large number of bugfixes and improvements have been added in both Augur and TreeTime. #1355 and TreeTime #263 (@jameshadfield)
  • ancestral, translate: GenBank files now require the (GFF mandatory) source feature to be present. #1351 (@jameshadfield)
  • ancestral, translate: For GFF files, we extract the genome/sequence coordinates by inspecting the sequence-region pragma, region type and/or source type. This information is now required. #1351 (@jameshadfield)

Features

  • ancestral, translate: Improvements to VCF inputs / outputs. #1355 and TreeTime #263 (@jameshadfield)
    • Output VCF will better match the input VCF, including CHROM name and ploidy encoding.
    • VCF inputs now require --vcf-reference-output
    • AA sequences are now exported for the tree root
    • VCF writing is now 3 orders of magnitude faster (dataset dependent)
  • ancestral, translate: A range of improvements to how we parse GFF and GenBank reference files. #1351 (@jameshadfield)
    • translate will now always export a 'nuc' annotation in the output JSON, allowing it to pass validation
    • Gene/CDS names of 'nuc' are now forbidden.
    • If a Gene/CDS in the GFF/GenBank file is unparsed we now print a warning.
  • ancestral: For VCF alignments, a VCF output file is now only created when requested via --output-vcf. #1344 (@jameshadfield)
  • ancestral: Improvements to command line arguments. #1344 (@jameshadfield)
    • Incompatible arguments are now checked, especially related to VCF vs FASTA inputs.
    • --vcf-reference and --root-sequence are now mutually exclusive.
  • translate: Tree nodes are checked against the node-data JSON input to ensure sequences are present. #1348 (@jameshadfield)
  • utils::load_features: This function may now raise AugurError. #1351 (@jameshadfield)
  • export v2: Automatically minify large outputs. Use --no-minify-json to disable this default behavior. #1352 (@victorlin)
  • Added a new file DEPRECATED.md to document timelines and progress of deprecated features in the Augur CLI and Python API. #1371 (@victorlin)
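To show what minifying a JSON output means for export v2, here is a small sketch using only the standard library. The dataset below is made up, and the criterion Augur uses to decide that an output is "large" is not stated in these notes, so no threshold is modelled here.

```python
import json

# Made-up stand-in for a dataset JSON; real exports are far larger.
dataset = {
    "version": "v2",
    "tree": {
        "name": "root",
        "children": [
            {"name": f"tip_{i}", "node_attrs": {"div": i}} for i in range(3)
        ],
    },
}

# Human-readable form: indentation and newlines.
pretty = json.dumps(dataset, indent=2)

# Minified form: no indentation and no spaces after separators.
minified = json.dumps(dataset, indent=None, separators=(",", ":"))

print(len(pretty), len(minified))
```

Both strings parse to the same data; only whitespace differs, so minification trades readability for a smaller file.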

Bug Fixes

  • ancestral, translate: Various fixes to VCF inputs / outputs. #1355 and TreeTime #263 (@jameshadfield)
    • Fix incorrect (but passing) tests
    • Fix case-sensitive sequence comparisons between the root and reference sequences.
    • Fix a bug where ambiguous alleles are not inferred (see #1380 for full details).
    • Fix a bug where positions with no sequence information were assigned a base because the mask was not being computed (see #1382 for full details).
    • More than one ALT allele is now correctly parsed
    • Mutations followed by an insertion are now parsed
    • Unchanged ref genotypes are now encoded as '0' rather than '.'
    • ALT alleles "*" are now valid (introduced in VCF spec 4.2, but observed in VCF 4.1 files)
    • Positions with no variation are no longer exported
  • ancestral, translate: Fixes for JSON (non-VCF) inputs. #1355 (@jameshadfield)
    • The "reference" translations are now from the provided reference sequence, not from the root of the tree. #1355 (@jameshadfield)
    • Fix a bug where positions with no sequence information were assigned a base because the mask was not applied (see #1382 for full details)
  • ancestral, translate: Avoid incompatibilities with Biopython >=1.82. #1374, #1387 (@victorlin)
  • ancestral, translate: Address Biopython deprecation warnings. #1379 (@victorlin)
  • ancestral: Previously, the help text for --genes falsely claimed that it could accept a file. Now, it can truly claim that. #1353 (@victorlin)
  • translate: The 'source' ID for GFF files is now ignored as a potential gene feature (it is still used for overall nuc coords). #1348 (@jameshadfield)
  • translate: Improvements to command line arguments. #1348 (@jameshadfield)
    • --tree and --ancestral-sequences are now required arguments.
    • VCF-only arguments are now separated into their own group.
  • translate: Fixes a bug in the parsing behaviour of GFF files whereby the presence of the --genes command line argument would change how we read individual GFF lines. Issue #1349, PR #1351 (@jameshadfield)
  • If TreeTimeError is encountered Augur now exits with code 2 rather than 0. (This restores the original behaviour.) #1367 (@jameshadfield)
  • Deprecate read_strains from augur.utils and add it to the public API under augur.io. #1353 (@victorlin)

23.1.1

07 Nov 21:42

Bug Fixes

23.1.0

22 Sep 16:44

Features

  • Support TreeTime 0.11.*. #1310 (@corneliusroemer)
  • export: Allow minimal export using only a (newick) tree in augur export v2. #1299 (@jameshadfield)
  • A number of schema updates and improvements. #1299 (@jameshadfield)
    • We now require all nodes to have node_attrs on them with one of div or num_date present
    • Some never-used properties are removed from the schemas, including a pattern for defining nucleotide INDELs which was never used by augur or auspice.
    • Tip label defaults are now settable within the auspice-config JSON
    • Empty colorings definitions are allowed (the tree will be grey in Auspice)

Bug Fixes

  • ancestral: Export amino acid sequences inferred for the root node of the tree in the node data JSON output for compatibility with augur translate output. #1317 (@huddlej)

23.0.0

05 Sep 19:17

Major Changes

Features

  • export v2: Allow the root-sequence data to be included (inlined) in the main dataset JSON file, avoiding the need for a sidecar _root-sequence.json file. #1295 (@jameshadfield)

22.4.0

29 Aug 21:01

Features

  • refine: Export covariance matrix and standard deviation for clock rate regression in the node data JSON output when these values are calculated by TreeTime. These new values appear in the clock data structure of the JSON output as cov and rate_std keys, respectively. #1284 (@huddlej)
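A minimal sketch of reading the new keys from a node data JSON follows. The JSON text is invented for illustration, and the `rate` key is an assumption; only `clock`, `cov`, and `rate_std` come from the notes above. The check that `rate_std` equals the square root of the first diagonal entry of `cov` is also an assumption about how the two values relate and should be verified against real output.

```python
import json
import math

# Invented example of a node data JSON; values are not from a real analysis.
node_data_text = """
{
  "clock": {
    "rate": 0.003,
    "cov": [[4e-8, -8e-5], [-8e-5, 0.16]],
    "rate_std": 0.0002
  }
}
"""

clock = json.loads(node_data_text)["clock"]

# The keys are only written when TreeTime calculates them, so read defensively.
cov = clock.get("cov")
rate_std = clock.get("rate_std")

if cov is not None and rate_std is not None:
    # Assumption: the first diagonal entry of cov is the variance of the rate.
    consistent = math.isclose(rate_std, math.sqrt(cov[0][0]))
    print(rate_std, consistent)
```

Using `.get()` keeps downstream scripts working on node data produced without these values.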

Bug Fixes

  • clades: Fix outputs for genes named NA (previously the value was replaced by nan). #1293 (@rneher)
  • distance: Improve documentation by describing how gaps get treated as indels and how users can ignore specific characters in distance calculations. #1285 (@huddlej)
  • Fix help output compatibility with non-Unicode streams. #1290 (@victorlin)