Add IMDB(JOB) Benchmark [2/N] (imdb queries) #12529
Open: austin362667 wants to merge 17 commits into apache:main from austin362667:imdb-benchmark
+2,150
−2
austin362667 force-pushed the imdb-benchmark branch 2 times, most recently from c1ccd0b to c3b4b8c, on September 19, 2024 00:11
github-actions bot added the development-process label on Sep 19, 2024
austin362667 force-pushed the imdb-benchmark branch from 837d9ba to ef99ebf on September 19, 2024 16:00
Signed-off-by: Austin Liu <[email protected]>
Fix `get_query_sql()` for CI roundtrip test
Prepare IMDB dataset
austin362667 force-pushed the imdb-benchmark branch from ef99ebf to 0d89553 on September 20, 2024 07:13
Labels
development-process: Related to development process of DataFusion
sqllogictest: SQL Logic Tests (.slt)
Which issue does this PR close?
Partially closes #12311.
[...] (`.csv`, `.parquet`).
`query_id` `5` indicates query `2a`, [...]
Rationale for this change
[...] `imdb.slt`, just like what we did to `tpch.slt`.
Unlike TPC-H, the IMDB dataset is not generated and has a fixed size, so there is no scaling factor and we don't need another Docker container to generate data and answers.
I have also cross-checked the answers against the CSV files at https://github.com/duckdb/duckdb/tree/main/benchmark/imdb/answers .
What changes are included in this PR?
IMDB (JOB) queries don't have an incremental `query_id`, so I hard-coded a mapping from the benchmark runner's `query_id` (integers 1, 2, 3, 4, ... 113) to the actual IMDB query names (strings 1a, 1b, 1c, 1d, 2a, ... 33c; there is no pattern) via lots of `if`.
Currently, I've only added SLT for:
Are these changes tested?
Yes, please check `test_files/imdb` for details.
Are there any user-facing changes?
No.
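For readers following the `query_id` mapping described under "What changes are included in this PR?", here is a minimal sketch of the same idea written as a lookup table rather than a chain of `if`s. This is not the PR's code. `QUERY_NAMES_PREFIX` and `query_name` are hypothetical names, and the table is deliberately truncated to the five entries the description spells out (1 to 4 map to 1a to 1d, and 5 maps to 2a); the real mapping covers 113 queries.

```rust
/// Hypothetical, truncated lookup table: position `i` holds the JOB query
/// name for benchmark `query_id == i + 1`. Only the entries named in the PR
/// description are listed here; the full table would hold 113 names.
static QUERY_NAMES_PREFIX: [&str; 5] = ["1a", "1b", "1c", "1d", "2a"];

/// Map a 1-based integer `query_id` to its JOB query name.
/// Returns `None` for 0 and for ids beyond the (truncated) table.
fn query_name(query_id: usize) -> Option<&'static str> {
    query_id
        .checked_sub(1) // 0 has no predecessor, so it maps to None
        .and_then(|idx| QUERY_NAMES_PREFIX.get(idx))
        .copied()
}
```

With this table, `query_name(5)` yields `Some("2a")`, matching the example in the description, and an out-of-range id yields `None` instead of panicking.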