Support for Dataframe mode with parquet #25
Merged
sfc-gh-kjimenezmorales
requested changes
Dec 13, 2024
...ectors/src/snowflake/snowpark_checkpoints_collector/snow_connection_model/snow_connection.py
snowpark-checkpoints-validators/src/snowflake/snowpark_checkpoints/checkpoint.py
snowpark-checkpoints-validators/src/snowflake/snowpark_checkpoints/checkpoint.py
Outdated
snowpark-checkpoints-validators/src/snowflake/snowpark_checkpoints/utils/constant.py
Outdated
sfc-gh-fgonzalezmendez
requested changes
Dec 13, 2024
Please take a look at the comments I left.
snowpark-checkpoints-validators/src/snowflake/snowpark_checkpoints/utils/constant.py
Outdated
snowpark-checkpoints-validators/src/snowflake/snowpark_checkpoints/utils/constant.py
Outdated
snowpark-checkpoints-validators/src/snowflake/snowpark_checkpoints/checkpoint.py
Outdated
snowpark-checkpoints-validators/src/snowflake/snowpark_checkpoints/utils/constant.py
Outdated
snowpark-checkpoints-validators/src/snowflake/snowpark_checkpoints/utils/utils_checks.py
Outdated
snowpark-checkpoints-validators/src/snowflake/snowpark_checkpoints/checkpoint.py
Outdated
…e in demo and test files
…ext type to Optional in various functions
…nstants in checkpoint validation
…e no exceptions are raised
0d64e7c to b7430ec
sfc-gh-fgonzalezmendez
approved these changes
Dec 16, 2024
sfc-gh-kjimenezmorales
approved these changes
Dec 16, 2024
Motivation & Context
JIRA: SNOW-1854994
Description
This pull request includes significant updates to the `Demos/demo_pyspark_pipeline.py` and `Demos/demo_snowpark_pipeline.py` files, focusing on the handling of schema and data checkpoints, as well as some minor changes to the `snow_connection.py` and `summary_stats_collector.py` files.
Changes to schema and data checkpoints:
- `Demos/demo_pyspark_pipeline.py`: Added import for `CheckpointMode` and updated the `collect_dataframe_checkpoint` function to include a new mode parameter.
- `Demos/demo_snowpark_pipeline.py`: Updated import statements to include `CheckpointMode` and `validate_dataframe_checkpoint`. Replaced `check_dataframe_schema_file` with `validate_dataframe_checkpoint` and added a new mode parameter.
Commenting out specific data types:
- `Demos/demo_pyspark_pipeline.py`: Commented out `float`, `binary`, `timestamp`, and `timestamp_ntz` data types in the schema and sample data rows.
- `Demos/demo_snowpark_pipeline.py`: Commented out `float`, `binary`, `timestamp`, and `timestamp_ntz` data types in the schema and sample data rows.
Minor changes:
- `snow_connection.py`: Updated the `CREATE_STAGE_STATEMENT_FORMAT` to remove the `TEMPORARY` keyword.
- `summary_stats_collector.py`: Added a newline for better readability.
How Has This Been Tested?
Checklist
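For the `snow_connection.py` change listed under Minor changes, here is a minimal sketch of what removing the `TEMPORARY` keyword from a stage statement template could look like. The PR names `CREATE_STAGE_STATEMENT_FORMAT` but does not show its text, so both template strings, the helper `build_create_stage_statement`, and the stage name `CHECKPOINT_STAGE` are assumptions made for illustration only.

```python
# Hypothetical template values: the real text of CREATE_STAGE_STATEMENT_FORMAT
# is not shown in this PR, so these strings are illustrative assumptions.
OLD_CREATE_STAGE_STATEMENT_FORMAT = "CREATE TEMPORARY STAGE IF NOT EXISTS {}"
NEW_CREATE_STAGE_STATEMENT_FORMAT = "CREATE STAGE IF NOT EXISTS {}"


def build_create_stage_statement(statement_format: str, stage_name: str) -> str:
    """Fill the stage name into a statement template (hypothetical helper)."""
    return statement_format.format(stage_name)


# "CHECKPOINT_STAGE" is a made-up stage name used only for this example.
old_statement = build_create_stage_statement(
    OLD_CREATE_STAGE_STATEMENT_FORMAT, "CHECKPOINT_STAGE"
)
new_statement = build_create_stage_statement(
    NEW_CREATE_STAGE_STATEMENT_FORMAT, "CHECKPOINT_STAGE"
)

print(old_statement)
print(new_statement)
```

In Snowflake, a temporary stage is dropped at the end of the session that created it, so a statement without `TEMPORARY` creates a stage that persists across sessions.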