[ci] [python-package] scikit-learn compatibility tests fail with scikit-learn 1.6.dev0 #6653

Closed
jameslamb opened this issue Sep 15, 2024 · 2 comments · Fixed by #6651

Comments

@jameslamb
Collaborator

Description

Starting a few days ago, the scikit-learn compatibility checks here have been failing with the following errors:

E AssertionError: Estimator LGBMClassifier should not set any attribute apart from parameters during init. Found attributes ['min_data_in_bin'].

E AssertionError: Estimator LGBMClassifier doesn't check for NaN and inf in fit.

E AssertionError: ('_more_tags() was removed in 1.6. Please use __sklearn_tags__ instead.',)

And the same for LGBMRegressor.

This is only happening with the 1.6.dev0 nightlies of scikit-learn.

Reproducible example

This is happening across all pull requests here, even those not related to the Python package in any way. For example, build log from #6648: https://github.com/microsoft/LightGBM/actions/runs/10776737786/job/29884208680?pr=6648

Environment info

installed packages

contourpy-1.3.1.dev1 cycler-0.12.1 fonttools-4.53.1 joblib-1.4.2 kiwisolver-1.4.7 matplotlib-3.10.0.dev587+g1c892c2033 numpy-2.2.0.dev0 pandas-3.0.0.dev0+1452.g80b6850271 pillow-11.0.0.dev0 pyparsing-3.1.4 python-dateutil-2.9.0.post0 scikit-learn-1.6.dev0 scipy-1.15.0.dev0 six-1.16.0 threadpoolctl-3.5.0 tzdata-2024.1

build link: https://github.com/microsoft/LightGBM/actions/runs/10776737786/job/29884208680?pr=6648#step:4:140

Additional Comments

Where this test is configured:

test-latest-versions:

@vnherdeiro started investigating 1 of the 3 issues (the one about _more_tags()) in #6651. Some notes from there:

Other related discussions:

@vnherdeiro
Contributor

vnherdeiro commented Sep 16, 2024

@jameslamb I think that #6651 will fix the first issue you are quoting (the min_data_in_bin one): it's being raised because the tag to not check parameters defined outside of BaseClassifier subclass constructor is missing because of not using the new __sklearn_tags__ API. Note that one of the tags is an xfail bypass:
"check_no_attributes_set_in_init": "scikit-learn incorrectly asserts that private attributes "

edit: and likewise for "LGBMClassifier doesn't check for NaN and inf in fit.", which depends on the allow_nan flag.
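
To illustrate the point about the missing tags, here is a minimal sketch of an estimator that exposes the same allow_nan tag through both the old dict-based _more_tags() and the new __sklearn_tags__() API. ToyEstimator and the stand-in BaseEstimator fallback are hypothetical names for illustration only; this is not LightGBM's code, and the actual fix in #6651 may look different. It assumes the tags object returned by scikit-learn >= 1.6 has an input_tags.allow_nan field.

```python
try:
    from sklearn.base import BaseEstimator
except ImportError:
    # Stand-in so the sketch stays importable without scikit-learn.
    # This is NOT the real class and provides no tag support.
    class BaseEstimator:
        pass


class ToyEstimator(BaseEstimator):
    """Hypothetical estimator used only to illustrate the two tag APIs."""

    def __init__(self, min_data_in_bin=3):
        self.min_data_in_bin = min_data_in_bin

    def _more_tags(self):
        # Dict-based tags, read by scikit-learn < 1.6.
        return {"allow_nan": True}

    def __sklearn_tags__(self):
        # Attribute-based tags, read by scikit-learn >= 1.6.
        # The parent method does not exist on older versions, so look it up
        # defensively instead of calling it unconditionally.
        parent = getattr(super(), "__sklearn_tags__", None)
        if parent is None:
            raise AttributeError("__sklearn_tags__() requires scikit-learn >= 1.6")
        tags = parent()
        # Reuse the old dict so both APIs report the same value.
        tags.input_tags.allow_nan = self._more_tags()["allow_nan"]
        return tags
```

Keeping _more_tags() as the single source of truth means older scikit-learn versions keep working, while newer ones get the same values through __sklearn_tags__().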

@jameslamb
Collaborator Author

it's being raised because the tag to not check parameters defined outside of BaseClassifier subclass constructor is missing because of not using the new __sklearn_tags__ API

Ah yep, you are right! I agree with your analysis, thank you.
