Explainable boosting parameters #6335
base: master
Conversation
Merge upstream updates
@microsoft-github-policy-service agree
I changed the status of this pull request to "draft" while waiting for the maintainers' opinions on the parameters added.
Thanks for your contribution. I believe this will be a very helpful feature! I'll look into this PR.
Some initial questions I have:

1. How should this be made consistent with all the other feature-selection parameters? There are already multiple ways to control which features are used, and I'm concerned that adding 3 (!) more parameters will be more confusing to users of the library than it is helpful, and that it might be very difficult to provide expected behavior in the face of all these combinations. For example, consider a mix of `interaction_constraints` and `tree_interaction_constraints` that cannot both be satisfied (see the sketch after this list): what happens? A runtime error before boosting begins? Or consider combining these constraints with random feature sub-sampling (`feature_fraction < 1.0`): what happens if a randomly-selected set of features violates all of the constraints?
2. How should these work for multiclass classification? In multiclass classification, LightGBM trains one tree per class. I think it's not uncommon for different features to be more important for predicting one class than another. So should these constraints apply separately to each class's trees, or across the whole model?
3. How should this work with …?
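To make the first concern concrete, here is a minimal sketch of a configuration in which the branch-level and tree-level constraints cannot both be satisfied. The parameter values are hypothetical, and `tree_interaction_constraints` only exists on this PR's branch:

```python
import numpy as np
import lightgbm as lgb

X, y = np.random.rand(200, 4), np.random.rand(200)

params = {
    "objective": "regression",
    # existing branch-level constraint: only features 0 and 1 may appear in the same branch
    "interaction_constraints": [[0, 1]],
    # proposed tree-level constraint: a tree may only use features 2 and 3
    "tree_interaction_constraints": [[2, 3]],
}

# Features 2 and 3 are the only ones allowed at the tree level, but they are not
# allowed to appear together in a branch. Should this raise an error before
# boosting starts, emit a warning, or silently produce degenerate trees?
booster = lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=5)
```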
Hi @jameslamb,
Thus, I would say that … So, focusing now on only one parameter, I will try to propose a solution to your questions:

1. I think …
2. I suggest replicating the behavior of `interaction_constraints`.
3. Again, we should be consistent with the behavior of `interaction_constraints`.
I totally understand. I also do not want to insist on adding them to the main branch. I just opened the pull request because I thought it might be interesting to other users to have (some of) them.
I would be glad to help if needed :)
Thanks for the response.
I don't know for sure what the answers to my questions for … should be.

If we do move forward with adding tree-level feature constraints like this, I definitely support adding only this one parameter instead of all 3, to limit the complexity. It seems to me that the …

@shiyu1994 I'll wait until you have time to look into this proposal (the linked paper and code samples here) and give a specific opinion on whether LightGBM should take on some subset of this.
In my investigations, I focused on creating ensembles of trees using only 1 or 2 features per tree, and thus other parameters, such as the already mentioned …
Thanks very much for that! Totally makes sense to me. Like I mentioned in #6335 (comment), I will defer to @shiyu1994 (or maybe @guolinke if interested) to move this forward if they want. They're much better qualified than me to decide on how this could fit into LightGBM. If we do decide to go forward with it, I'll be happy to help with the testing, documentation, etc.
Hi all,
First of all, thank you to the maintainers for keeping this project updated; this is one of the best libraries for gradient boosting I have ever tried.
I am making this pull request because, two years ago, I forked LightGBM to create a proof of concept for this paper: https://dl.acm.org/doi/10.1145/3477495.3531840
Since I am now improving and expanding the experiments for a new interpretable LambdaMART, I decided to give back to this project by polishing the proof of concept and opening a pull request.
Issue addressed
This pull request is a starting point to address issue #3905 (also mentioned in #2302) by adding parameters that constrain the training process to produce more interpretable models.
Parameters added
I added three parameters:
- `tree_interaction_constraints`
- `max_tree_interactions`
- `max_interactions`
tree_interaction_constraints
The parameter `tree_interaction_constraints`, similar to `interaction_constraints`, limits the interactions between features. While `interaction_constraints` controls which features can appear in the same branch, `tree_interaction_constraints` controls which features can appear in the same tree.
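A minimal usage sketch, assuming the parameter accepts the same list-of-lists format (groups of feature indices) as `interaction_constraints`, and that you are running the branch from this PR:

```python
import numpy as np
import lightgbm as lgb

X, y = np.random.rand(1000, 6), np.random.rand(1000)
train_set = lgb.Dataset(X, label=y)

params = {
    "objective": "regression",
    "num_leaves": 31,
    # assumed format, mirroring interaction_constraints: each inner list is a group
    # of feature indices that are allowed to appear together in the same tree
    "tree_interaction_constraints": [[0, 1, 2], [3, 4, 5]],
    "verbose": -1,
}

# with this constraint, every tree should split only on features {0, 1, 2}
# or only on features {3, 4, 5}, never on a mix of the two groups
booster = lgb.train(params, train_set, num_boost_round=50)
```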
max_tree_interactions
The parameter `max_tree_interactions` limits the number of features that can appear in the same tree, in a greedy fashion: e.g., if `max_tree_interactions = 5`, after the fifth distinct feature has been used in a tree, no further features will be introduced, and the same 5 will be reused to find the split with the maximum gain.
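A sketch of how this behavior could be observed, assuming `max_tree_interactions` is a plain integer available on this PR's branch; the check uses `Booster.trees_to_dataframe()` to count the distinct split features per tree:

```python
import numpy as np
import lightgbm as lgb

X, y = np.random.rand(1000, 10), np.random.rand(1000)
train_set = lgb.Dataset(X, label=y)

params = {"objective": "regression", "max_tree_interactions": 5, "verbose": -1}
booster = lgb.train(params, train_set, num_boost_round=100)

# each tree should use at most 5 distinct features
tree_df = booster.trees_to_dataframe()
features_per_tree = (
    tree_df.dropna(subset=["split_feature"])   # drop leaf rows
           .groupby("tree_index")["split_feature"]
           .nunique()
)
assert (features_per_tree <= 5).all()
```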
max_interactions
The parameter `max_interactions` limits the number of feature interactions that can be used in the whole forest, in a greedy fashion. Every tree is associated with a set of features, and we say that those features interact with each other. If a tree uses a subset of the features of another tree, we say that the two trees use the same set of feature interactions. For example, if `max_interactions = 5` and the first 5 trees use 5 disjoint sets of features, the sixth tree will be forced to use a subset of one of the feature sets used by the first 5 trees.
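A sketch of how this limit could be checked, under the same assumptions as above (the proposed `max_interactions` parameter from this branch): count the maximal per-tree feature sets, since a tree whose feature set is contained in another tree's set does not introduce a new interaction.

```python
import numpy as np
import lightgbm as lgb

X, y = np.random.rand(1000, 10), np.random.rand(1000)
params = {"objective": "regression", "max_interactions": 5, "verbose": -1}
booster = lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=100)

tree_df = booster.trees_to_dataframe()
feature_sets = (
    tree_df.dropna(subset=["split_feature"])   # drop leaf rows
           .groupby("tree_index")["split_feature"]
           .apply(frozenset)                   # feature set used by each tree
)

# keep only the maximal sets: a tree whose features are a proper subset of another
# tree's features reuses that tree's interaction
maximal = {s for s in feature_sets if not any(s < other for other in feature_sets)}
assert len(maximal) <= 5
```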
Tests
Other than having extensively tested the new parameters in the proofs of concept for my papers, I added three unit tests in `python-package/lightgbm/engine.py`, namely:
- `test_tree_interaction_constraints`
- `test_max_tree_interactions`
- `test_max_interactions`
These functions are only a starting point for testing the new features added.
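As an illustration of the kind of check such a test could perform (this is a hedged sketch, not the test from the PR; the parameter format and behavior are assumed as described above), a `test_tree_interaction_constraints`-style test could verify that every tree's features fit inside one allowed group:

```python
import numpy as np
import lightgbm as lgb


def _tree_features(node):
    """Recursively collect the feature indices used by one tree of dump_model()."""
    if "split_feature" not in node:  # leaf node
        return set()
    return (
        {node["split_feature"]}
        | _tree_features(node["left_child"])
        | _tree_features(node["right_child"])
    )


def test_tree_interaction_constraints():
    rng = np.random.default_rng(42)
    X = rng.random((1000, 4))
    y = X[:, 0] * X[:, 1] + X[:, 2] * X[:, 3] + rng.random(1000) * 0.01
    constraints = [[0, 1], [2, 3]]  # assumed list-of-lists format
    params = {
        "objective": "regression",
        "tree_interaction_constraints": constraints,
        "verbose": -1,
    }
    booster = lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=50)

    allowed = [set(group) for group in constraints]
    for tree in booster.dump_model()["tree_info"]:
        used = _tree_features(tree["tree_structure"])
        # the features of every single tree must fit inside one allowed group
        assert any(used <= group for group in allowed)
```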
I have not tested the integration in R.