Complex conditional dataset scheduling is not displayed in DAG graph #46288

Open
1 of 2 tasks
kullachartp opened this issue Jan 30, 2025 · 1 comment
Labels
affected_version:2.10 Issues Reported for 2.10 area:core area:datasets Issues related to the datasets feature area:UI Related to UI/UX. For Frontend Developers. kind:bug This is a clearly a bug

Comments


kullachartp commented Jan 30, 2025

Apache Airflow version

Other Airflow 2 version (please specify below)

If "Other Airflow 2 version" selected, which one?

2.10.2

What happened?

I am trying to schedule a DAG on datasets combined with the & and | operators, in order to implement data lineage using Datasets. The dataset condition that the DAG depends on is as follows:

(
      (Dataset("table1") & Dataset("table2") & Dataset("table3")) |
      (Dataset("tableA") & Dataset("tableB") & Dataset("tableC") & Dataset("dummy"))
)

The intention is that table1, table2, and table3 are updated every day, while tableA, tableB, and tableC are updated on an ad-hoc basis, and I do not want those ad-hoc updates to trigger the DAG.

However, after deploying this DAG to Airflow, the DAG graph does not display the datasets that the DAG depends on. It only displays the tasks and the outlet datasets.

I also tried other variations, such as:

(
    Dataset("table1") & Dataset("table2") & Dataset("table3") | Dataset("tableA")
)

and

(
    Dataset("table1") & Dataset("table2") & Dataset("table3") | Dataset("tableA") | Dataset("tableB")
)

Both of these display correctly. However, when using:

(
    Dataset("table1") & Dataset("table2") & Dataset("table3") | Dataset("tableA") & Dataset("tableB")
)

the DAG graph does not display the datasets. The Dataset graph, however, displays the dataset relationships correctly in every example.
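One pattern in the four expressions above may help narrow this down. Because & binds more tightly than | in Python, the two expressions that fail to render each reduce to an OR over two AND groups, while the two that render contain only one AND group. The sketch below is a plain-Python model of that grouping. `Leaf`, `Group`, and `and_groups_under_or` are hypothetical stand-ins written for illustration, not Airflow classes, and this is an observation about the inputs rather than a confirmed root cause.

```python
class Node:
    """Hypothetical stand-in for a dataset condition; not an Airflow class."""

    def __and__(self, other):
        return Group("all", _flatten("all", self) + _flatten("all", other))

    def __or__(self, other):
        return Group("any", _flatten("any", self) + _flatten("any", other))


class Leaf(Node):
    """A single dataset, identified by its name."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


class Group(Node):
    """An 'all' (AND) or 'any' (OR) combination of conditions."""

    def __init__(self, kind, members):
        self.kind = kind
        self.members = members

    def __repr__(self):
        return f"{self.kind}({', '.join(repr(m) for m in self.members)})"


def _flatten(kind, node):
    # Merge a group of the same kind into its parent so that a & b & c
    # becomes one three-member group instead of a nested pair.
    if isinstance(node, Group) and node.kind == kind:
        return list(node.members)
    return [node]


def and_groups_under_or(expr):
    """Count the AND groups that sit directly under a top-level OR."""
    if not (isinstance(expr, Group) and expr.kind == "any"):
        return 0
    return sum(1 for m in expr.members if isinstance(m, Group) and m.kind == "all")


t1, t2, t3 = Leaf("table1"), Leaf("table2"), Leaf("table3")
tA, tB, tC, dummy = Leaf("tableA"), Leaf("tableB"), Leaf("tableC"), Leaf("dummy")

e1 = (t1 & t2 & t3) | (tA & tB & tC & dummy)  # not displayed in the DAG graph
e2 = t1 & t2 & t3 | tA                        # displayed
e3 = t1 & t2 & t3 | tA | tB                   # displayed
e4 = t1 & t2 & t3 | tA & tB                   # not displayed

for expr in (e1, e2, e3, e4):
    print(and_groups_under_or(expr), expr)
```

In this model the counts are 2, 1, 1, and 2, so the expressions that do not render are exactly the ones with more than one AND group under the OR.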

What you think should happen instead?

No response

How to reproduce

Create a DAG with conditional dataset scheduling that combines & and | as in the description. Deploy it to Airflow and open the DAG graph view.

Operating System

N/A

Versions of Apache Airflow Providers

No response

Deployment

Google Cloud Composer

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@kullachartp kullachartp added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Jan 30, 2025

boring-cyborg bot commented Jan 30, 2025

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@dosubot dosubot bot added area:datasets Issues related to the datasets feature area:UI Related to UI/UX. For Frontend Developers. labels Jan 30, 2025
@RNHTTR RNHTTR removed the needs-triage label for new issues that we didn't triage yet label Jan 31, 2025
@vikramkoka vikramkoka added the affected_version:2.10 Issues Reported for 2.10 label Feb 3, 2025

3 participants