Issues: IBM/data-prep-kit
Issues list
[Bug] chunking fails on PDFs with one line text
bug
Something isn't working
#590
opened Sep 14, 2024 by
sujee
1 of 2 tasks
[Feature] Enable pure python transforms in new spark runtime.
enhancement
New feature or request
#586
opened Sep 12, 2024 by
daw3rd
1 of 17 tasks
[Bug] possible regression on ededupe code in release dev3
bug
Something isn't working
#585
opened Sep 11, 2024 by
sujee
1 of 2 tasks
[Bug] Testing Rag notebook with latest release of pdf2Parquet, eDedup and DocID
bug
Something isn't working
#583
opened Sep 10, 2024 by
touma-I
1 of 2 tasks
[Bug] issues running ray transformations on Google colab
bug
Something isn't working
#582
opened Sep 10, 2024 by
sujee
1 of 2 tasks
[Feature] Need better documentation of fuzzy dedupe
enhancement
New feature or request
#578
opened Sep 6, 2024 by
sujee
2 tasks done
[Feature] need an example of using doc_quality plugin with installed pypi packages
enhancement
New feature or request
#575
opened Sep 6, 2024 by
sujee
1 of 2 tasks
[Bug] Intermittent doc_id test-src failures in ci/cd.
bug
Something isn't working
#574
opened Sep 5, 2024 by
daw3rd
2 tasks done
[Bug] improve performance of pdf2parquet
enhancement
New feature or request
#573
opened Sep 5, 2024 by
sujee
1 of 2 tasks
[Bug] test/publish-image targets are disabled for pii_redactor/ray due to OSError
bug
Something isn't working
#571
opened Sep 4, 2024 by
daw3rd
1 of 2 tasks
[Feature] Remove or merge older examples from examples/notebooks/archive
enhancement
New feature or request
#568
opened Sep 4, 2024 by
daw3rd
2 tasks done
[Feature] Allow selected columns to be ignored in non-launcher tests of transforms that generate parquet files.
enhancement
New feature or request
#564
opened Sep 3, 2024 by
daw3rd
2 tasks done
[Feature] HTML to Markdown (based on HTML2Parquet trafilatura code)
enhancement
New feature or request
#559
opened Aug 30, 2024 by
touma-I
2 tasks done
[Bug] header_cleanser fails when running in OpenShift
bug
Something isn't working
#557
opened Aug 30, 2024 by
dtsuzuku-ibm
1 of 2 tasks
[Feature] Publish data-prep-kit core and transforms NIGHTLY into pypi
enhancement
New feature or request
#554
opened Aug 29, 2024 by
sujee
1 of 2 tasks
[Bug] pdf2parquet is now failing ci/cd builds
bug
Something isn't working
#552
opened Aug 28, 2024 by
daw3rd
1 of 2 tasks
[Feature] Provide an operator that loads files content to parquet
enhancement
New feature or request
#543
opened Aug 26, 2024 by
touma-I
2 tasks done
[Feature] Allow selected metadata fields to be ignored during tests.
enhancement
New feature or request
#536
opened Aug 23, 2024 by
daw3rd
1 of 2 tasks
[Feature] Allow a transform to define the file extensions it supports
enhancement
New feature or request
#535
opened Aug 23, 2024 by
daw3rd
1 of 2 tasks
[Feature] Publish Single Wheel for Doc Quality Transform
enhancement
New feature or request
#533
opened Aug 23, 2024 by
touma-I
2 tasks done
Look into the Quay security scanner checks that show a couple of critical severity issues with our images (e.g., as related to the existence of old pyarrow versions)
bug
Something isn't working
#529
opened Aug 22, 2024 by
shahrokhDaijavad
1 of 2 tasks
Enhance Code2Parquet module to handle non-code text as well
enhancement
New feature or request
#520
opened Aug 19, 2024 by
shahrokhDaijavad
1 of 2 tasks
[Feature] Develop ability to run ci/cd testing only on the portion of repo that has changed and its dependencies
enhancement
New feature or request
#515
opened Aug 19, 2024 by
daw3rd
1 of 2 tasks
[Feature] New num_processors python launcher option needs doc
enhancement
New feature or request
#503
opened Aug 14, 2024 by
daw3rd
1 of 2 tasks
[Feature] Organize transforms as sub-packages each with a unique name for distribution and import
enhancement
New feature or request
#501
opened Aug 14, 2024 by
touma-I
2 tasks done