Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introduce --dry-run run mode #1881

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
48 changes: 27 additions & 21 deletions metaflow/cli.py
Original file line number Diff line number Diff line change
Expand Up @@ -796,17 +796,15 @@ def before_run(obj, tags, decospecs):
# in two places in this module and make sure _init_step_decorators
# doesn't get called twice.

# We want the order to be the following:
# - run level decospecs
# - top level decospecs
# - environment decospecs
all_decospecs = (
list(decospecs or [])
+ obj.tl_decospecs
+ list(obj.environment.decospecs() or [])
# decospecs have this precedence order:
# - run-level
# - top-level
# - environment
decospecs = (
list(decospecs or []) + obj.decospecs + list(obj.environment.decospecs() or [])
)
if all_decospecs:
decorators._attach_decorators(obj.flow, all_decospecs)
if decospecs:
decorators._attach_decorators(obj.flow, decospecs)
obj.graph = FlowGraph(obj.flow.__class__)

obj.check(obj.graph, obj.flow, obj.environment, pylint=obj.pylint)
Expand Down Expand Up @@ -901,6 +899,12 @@ def version(obj):
type=click.Choice(MONITOR_SIDECARS),
help="Monitoring backend type",
)
@click.option(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of having options cause other options to be ignored with no warning. I would add a check to see if there are conflicting options. dry-run is also a bit misleading (it's not dry run and things may happen, for example conda package caching, anything the code does as well, etc).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also feel like --dry-run is a bit misleading as to what it actually does under the hood. Some alternatives that would maybe describe the behavior better could be --local-only or --local

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another thing that would speak for the rename is the following:

python HelloFlow.py --dry-run run --with kubernetes

results in an error

The @kubernetes decorator requires --datastore=s3 or --datastore=azure or --datastore=gs at the moment.

which would maybe make more sense with a renamed flag? e.g. "trying to run kubernetes locally, got an error".
With --dry-run my expectation at least is that remote decorators would be omitted instead

"--dry-run",
is_flag=True,
expose_value=True,
help="Set metadata & datastore to local",
)
@click.pass_context
def start(
ctx,
Expand Down Expand Up @@ -929,7 +933,13 @@ def start(

echo("Metaflow %s" % version, fg="magenta", bold=True, nl=False)
echo(" executing *%s*" % ctx.obj.flow.name, fg="magenta", nl=False)
echo(" for *%s*" % resolve_identity(), fg="magenta")
if ctx.params["dry_run"]:
# Set metadata & datastore to local for dry-run execution
# TODO: Consider setting monitor, tracer and logger to null too
ctx.params["metadata"] = metadata = "local"
ctx.params["datastore"] = datastore = "local"
echo(" in dry run mode", fg="magenta", nl=False)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if there is any way around this, but exceptions raised in decorator init (f.ex. kubernetes checking for datastore type) will lead to this echo being eaten so there is no trace for the cause of a raised exception that the user has in their stdout, only if they check the command will it maybe be obvious

echo(" for *%s* " % resolve_identity(), fg="magenta")

cli_args._set_top_kwargs(ctx.params)
ctx.obj.echo = echo
Expand Down Expand Up @@ -998,11 +1008,9 @@ def start(
deco_options,
)

# In the case of run/resume, we will want to apply the TL decospecs
# *after* the run decospecs so that they don't take precedence. In other
# words, for the same decorator, we want `myflow.py run --with foo` to
# take precedence over any other `foo` decospec
ctx.obj.tl_decospecs = list(decospecs or [])
# run/resume can add more decorators with --with, which take precedence
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

am fine with this (and associated change) but would re-add the comment about how precedence is determined by order right now (first one wins) so we need to save to apply in before_run. (ie: this comment is correct but doesn't say why we do it this way).

# over any other decospecs
ctx.obj.decospecs = list(decospecs or [])

# initialize current and parameter context for deploy-time parameters
current._set_env(flow=ctx.obj.flow, is_running=False)
Expand All @@ -1013,11 +1021,9 @@ def start(
if ctx.invoked_subcommand not in ("run", "resume"):
# run/resume are special cases because they can add more decorators with --with,
# so they have to take care of themselves.
all_decospecs = ctx.obj.tl_decospecs + list(
ctx.obj.environment.decospecs() or []
)
if all_decospecs:
decorators._attach_decorators(ctx.obj.flow, all_decospecs)
decospecs = list(decospecs or []) + list(ctx.obj.environment.decospecs() or [])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, use ctx.obj.decospecs (no point redoing the same computation and conversion)

if decospecs:
decorators._attach_decorators(ctx.obj.flow, decospecs)
# Regenerate graph if we attached more decorators
ctx.obj.graph = FlowGraph(ctx.obj.flow.__class__)

Expand Down
Loading