--resume option unclear #164

Closed
RNieuwenhuis opened this issue Jul 8, 2022 · 2 comments

Comments

@RNieuwenhuis

Hi,

The option for resuming a failed run is not clear to me: the program refuses to resume at all unless I use the --force_rewrite argument. But that argument makes everything run again anyway, so what is the point of having a resume function?

@IsmailM
Member

IsmailM commented Jul 8, 2022

Hi,

In the Help text (genevalidator --help), the --force_rewrite and --resume options are explained as follows:

    -f, --force_rewrite              Rewrites over existing output.
    ...
    -r, --resume [DIR]               Resumes an analysis. This works by using previously generated
                                     temporary files instead of recomputing the analysis where possible.
                                     A new output directory is created where the output files are
                                     generated. This assumes that the input file is the same as that
                                     used in the analysis you are resuming from.

As alluded to here, when resuming you cannot reuse the original output directory: --resume and --output_dir must not point to the same directory, so you need to supply a different output directory.

When resuming, all the relevant files from the directory passed to --resume are copied across to the new output folder, and the command continues from where it left off.

On the other hand, --force_rewrite simply deletes the output folder before starting the analysis, so nothing from the earlier run is reused.
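To make the contrast concrete, here is a conceptual sketch of the two behaviours described above. It is not GeneValidator's code: the function names `simulate_resume` and `simulate_force_rewrite` are hypothetical, and the sketch copies every file, whereas the tool only copies the files it considers relevant.

```shell
# Conceptual sketch only; this is NOT how GeneValidator is implemented.
# Both function names are hypothetical and exist just for illustration.

# Resume: earlier results are copied into a fresh output directory,
# and the original directory is left untouched.
simulate_resume() {
  # $1 = directory of the interrupted run, $2 = new output directory
  mkdir -p "$2"
  cp -R "$1"/. "$2"/
}

# Force rewrite: the output directory is wiped, so the run starts from scratch.
simulate_force_rewrite() {
  # $1 = output directory to delete and recreate empty
  rm -rf "$1"
  mkdir -p "$1"
}
```

After `simulate_resume`, both directories hold the earlier results; after `simulate_force_rewrite`, the earlier results are gone, which is why that option recomputes everything.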

I have created a new GitHub issue (#165) to address this.

So, for example, if you run the following command and then press ^C before it can finish:

❯ genevalidator -n 20 -d blast_db/swissprot -o orig_output_dir exemplar_data/protein_data.fa
==> Analysing input arguments

==> Running BLAST. This may take a while.

==> Extracting fasta sequences for each BLAST HSP from the BLAST database
==> Validating input sequences

 No     Score             Identifier    No_Hits LengthCluster   LengthRank      GeneMerge       Duplication     MissingExtraSequences
 10         0             PB26730-RA          0 Not enough evidence     Not enough evidence     Not enough evidence     Not enough evidence     Not enough evidence
  5        67             PB18768-RA          7 132 [124, 137]  43%     0.0     1.0     Not enough evidence
  1        67             GB10056-PA         10 339 [317, 365]  40%     -3.5    1.0     100% conserved; 100% extra; 0% missing.
  6        64          SI2.2.0_02651         34 660 [404, 1242] 15% (too short) 0.0     1.0     100% conserved; 100% extra; 0% missing.
  3        64             GB10113-PA         19 659 [300, 506]  37%     -2.4    0.25    100% conserved; 100% extra; 0% missing.
^C/home/ismailm/tmp/genevalidator/lib/vendor/ruby/2.4.0/gems/genevalidator-2.1.11/lib/genevalidator/pool.rb:49:in `join': Interrupt
        from /home/ismailm/tmp/genevalidator/lib/vendor/ruby/2.4.0/gems/genevalidator-2.1.11/lib/genevalidator/pool.rb:49:in `map'
        from /home/ismailm/tmp/genevalidator/lib/vendor/ruby/2.4.0/gems/genevalidator-2.1.11/lib/genevalidator/pool.rb:49:in `shutdown'
        from /home/ismailm/tmp/genevalidator/lib/vendor/ruby/2.4.0/gems/genevalidator-2.1.11/lib/genevalidator/validation.rb:64:in `run_validations'
        from /home/ismailm/tmp/genevalidator/lib/vendor/ruby/2.4.0/gems/genevalidator-2.1.11/lib/genevalidator.rb:58:in `run'
        from /home/ismailm/tmp/genevalidator/lib/app/bin/genevalidator:423:in `<main>'

You can then resume as follows; it should only take a few seconds to get back to the point where the run was stopped:

❯ genevalidator -n 20 -d blast_db/swissprot -r orig_output_dir -o new_output_dir exemplar_data/protein_data.fa
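If you script this, a small guard can catch the mistake of pointing both options at the same directory. The following is a minimal sketch: `resume_cmd` is a hypothetical helper, not part of GeneValidator, and it only prints the command, using the flags and example values shown in this thread.

```shell
# Hypothetical helper (not part of GeneValidator): prints a resume command,
# refusing to reuse the interrupted run's directory as the new output directory.
resume_cmd() {
  # $1 = directory of the interrupted run (-r)
  # $2 = new output directory (-o)
  # $3 = input FASTA file, which should be the same one used originally
  if [ "$1" = "$2" ]; then
    echo "error: --resume and --output_dir must point to different directories" >&2
    return 1
  fi
  # -n 20 and -d blast_db/swissprot are copied from the example above.
  echo "genevalidator -n 20 -d blast_db/swissprot -r $1 -o $2 $3"
}
```

For example, `resume_cmd orig_output_dir new_output_dir exemplar_data/protein_data.fa` prints the command shown above, while passing the same directory twice returns a non-zero status. The sketch does not quote its arguments in the printed command, so paths containing spaces would need extra care.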

Hopefully, that clarifies things - feel free to re-open the issue if you have any further questions.

@IsmailM IsmailM closed this as completed Jul 8, 2022
@RNieuwenhuis
Author

Yes, that is indeed enlightening! Thank you very much for the swift reply and for the development and maintenance of this software.
