Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

17054 custom name #1477

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

lijiayong
Copy link
Contributor

No description provided.

@codecov
Copy link

codecov bot commented Aug 4, 2021

Codecov Report

Merging #1477 (8393482) into main (462377e) will decrease coverage by 9.79%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1477      +/-   ##
==========================================
- Coverage   65.82%   56.03%   -9.80%     
==========================================
  Files          89       45      -44     
  Lines       15624     7845    -7779     
  Branches     3935     1975    -1960     
==========================================
- Hits        10285     4396    -5889     
+ Misses       4251     2927    -1324     
+ Partials     1088      522     -566     
Impacted Files Coverage Δ
workflow_job.py 80.00% <0.00%> (-0.50%) ⬇️
process.py 69.11% <0.00%> (ø)
cwltool/argparser.py
cwltool/__main__.py
cwltool/stdfsaccess.py
cwltool/udocker.py
cwltool/pathmapper.py
cwltool/__init__.py
cwltool/expression.py
cwltool/validate_js.py
... and 35 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 462377e...8393482. Read the comment docs.

Copy link
Member

@mr-c mr-c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting. Like id or label, but allowed to be a CWL expression.

Can you run make format ? Also, please run make diff_pydocstyle_report and fix the complaint at https://github.com/common-workflow-language/cwltool/pull/1477/checks?check_run_id=3243796649#step:7:28

@tetron
Copy link
Member

tetron commented Aug 4, 2021

The use case here is that if you have something like a wide scatter over samples, the name of the step that is visible to the user can incorporate useful information such as the sample id. It is a label for a specific instance of an execution of a step, not the description of the step (which is what id or label represent).

This was specifically requested by a user for a large pharma company who is known to run gigantic workflows.

runtimeContext.name = shortname(self.id)

stepnameReq, is_required = self.step.get_requirement("http://commonwl.org/cwltool#StepName")
if is_required is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be better to use stepnameReq is not None instead

@mr-c
Copy link
Member

mr-c commented Aug 5, 2021

The use case here is that if you have something like a wide scatter over samples, the name of the step that is visible to the user can incorporate useful information such as the sample id. It is a label for a specific instance of an execution of a step, not the description of the step (which is what id or label represent).

Then should it be called a "job name" to better disambiguate? As steps already have an id, label, and description field.

Alternatively, logs could be improved to always include the name of the scattered variable and its current value when scattering. Thus hand-specifying this field would be unnecessary.

@tetron
Copy link
Member

tetron commented Aug 5, 2021

Then should it be called a "job name" to better disambiguate? As steps already have an id, label, and description field.

Yes. That is a good idea. @lijiayong could you rename it "JobName" instead of "StepName" ?

Alternatively, logs could be improved to always include the name of the scattered variable and its current value when scattering. Thus hand-specifying this field would be unnecessary.

Logging the name of the scattered variable doesn't generalize terribly well, we don't know for sure what parameters are most relevant to the user to identify a job, and users may want to do some string munging to display only the most useful part. So I'm pretty users would still ask to be able to control the job names.

@mr-c
Copy link
Member

mr-c commented Aug 5, 2021

Alternatively, logs could be improved to always include the name of the scattered variable and its current value when scattering. Thus hand-specifying this field would be unnecessary.

Logging the name of the scattered variable doesn't generalize terribly well, we don't know for sure what parameters are most relevant to the user to identify a job, and users may want to do some string munging to display only the most useful part. So I'm pretty users would still ask to be able to control the job names.

Sure, but we should implement the improved auto naming of scattered jobs as that will enhance the logging in the vast majority of situations.

@mr-c
Copy link
Member

mr-c commented Aug 5, 2021

Is this intended to be part of a future version of the CWL standards?

Then there should be an issue opened to discuss the syntax.

I'm a bit uncomfortable with the number of cwltool extensions that have yet to be included in the standards.

If this needs to be fast tracked for Arvados and it's customer, then we can merge the supporting parts without the cwltool:JobName and Arvados can implement arv:JobName

@tetron
Copy link
Member

tetron commented Aug 5, 2021

Is this intended to be part of a future version of the CWL standards?

Then there should be an issue opened to discuss the syntax.

I'm a bit uncomfortable with the number of cwltool extensions that have yet to be included in the standards.

I don't understand this, there are precisely 3 cwltool extensions right now, this would only be a 4th one. One is highly experimental (ProcessGenerator) and the others (including this one) could be standards track, but that's a separate discussion. The development strategy in the past was generally to implement something and then iterate?

If this needs to be fast tracked for Arvados and it's customer, then we can merge the supporting parts without the cwltool:JobName and Arvados can implement arv:JobName

I'm not sure I understand, it needs to check for the hint type:

self.step.get_requirement("http://commonwl.org/cwltool#StepName")

Are you suggesting that it would not be added to the extensions yml file, or that the code in cwltool would look for the extension with the arvados namespace instead of the cwltool namespace?

@mr-c
Copy link
Member

mr-c commented Aug 18, 2021

I'd prefer there was a callback that Arvados uses for their own arv prefixed extension; and the default without that uses the name of the scattered variable(s) and their current value(s) when scattering.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants