-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MusicbrainzNGS replacement #4936
base: master
Are you sure you want to change the base?
Conversation
62900d8
to
6351d61
Compare
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward? This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Hello, it is still relevant but the merge is corrupted since the massive style conformance that was made last summer. Unfortunately I have a serious lack of availability to fix it. However the replacement library mbzero is not sufficiently experimented and would expect some review. To move forward, we could split this PR to prepare for a replacement and drop the last commit. I would be happy to work on that when I have time, but if anyone wants to take the lead, it's also fine. As musicbrainzngs is still unmaintained, I would recommend to keep that PR open |
4280735
to
6cbc3f2
Compare
Hello, as the library mbzero evolved (test coverage got pretty good), I updated the request to match the new style. Tests done: py312 py312-lint format format_check Again, read commit per commit. Suggestions are welcome (ping @sampsyo) |
Thanks for the PR! First off, I made a bunch of comments on the code but there's one issue to deal with before anything else: Do you have a solution to that? Because if not, this PR will have to be placed on hold until 2025-10. I see you're the maintainer for |
Hello, I saw that last week. I don't think there is any blocker, 3.10 was my python version at the time of development. Let me check on 3.8 and 3.9. |
Some of the work that was done in musicbrainzngs requires to be done in beets to lighten the musicbrainz API. Signed-off-by: Louis Rannou <[email protected]>
The structure of data changes a bit Signed-off-by: Louis Rannou <[email protected]>
d3147cf
to
267c1af
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for checking that! There's a couple things to address with the code though
beets/autotag/mb.py
Outdated
limit_interval = 1.0 | ||
limit_requests = 1 | ||
do_rate_limit = True | ||
|
||
|
||
def set_rate_limit(limit_or_interval=1.0, new_requests=1): | ||
"""Sets the rate limiting behavior of the module. Must be invoked | ||
before the first Web service call. | ||
If the `limit_or_interval` parameter is set to False then | ||
rate limiting will be disabled. If it is a number then only | ||
a set number of requests (`new_requests`) will be made per | ||
given interval (`limit_or_interval`). | ||
""" | ||
global limit_interval | ||
global limit_requests | ||
global do_rate_limit | ||
if isinstance(limit_or_interval, bool): | ||
do_rate_limit = limit_or_interval | ||
else: | ||
if limit_or_interval <= 0.0: | ||
raise ValueError("limit_or_interval can't be less than 0") | ||
if new_requests <= 0: | ||
raise ValueError("new_requests can't be less than 0") | ||
do_rate_limit = True | ||
limit_interval = limit_or_interval | ||
limit_requests = new_requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to use something other than these types of global variables? It's doesn't fit with beet's OOP model. Perhaps a singleton class for the rate limiter would work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can undertsand that ! So far the PR s a 1st step. That's a simple copy from what used to be in musicbrainzngs.
If the whole idea is fine for beets, I'm ok to find a solution. Would you have any tip for the singleton ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would look up singletons in Python, but they're pretty simple. If you don't know what they are (and sorry for being presumptive, if you do know then ignore), they are classes that can only create a single instance. You store an instance in the class data and that is the only one that is every created, which allows you to keep track of state like with these global variables, but in an OOP way.
Here is an explanation of how to create a singleton in Python. Again, ignore if you're already aware.
For the implementation, I would suggest perhaps a class that manages requests and handles the rate limiting. The RateLim decorator would be the best place, you can make it a singleton and then just use the decorator. If that doesn't work, then there can be a separate object that manages the rate inside the decorator should be fine. It's basically just trying to keep track of a counter and maybe a lock for the sake of concurrency. That's prime singleton stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, as far as first steps go, this is good! My idea on PRs is that they should be complete enough to not need immediate changes. I think the global variables would be that type of thing; it's just a change that should be done, and there's no reason not to do it now rather than add them and then have to remove them in a separate PR.
beets/autotag/mb.py
Outdated
useragent = "beets/{} (https://beets.io/)".format(beets.__version__) | ||
mbi = MbInterface(useragent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More global variables. Is there a way to not have global variables for this? The user agent isn't much of a problem as a global but the MbInterface really shouldn't be one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do agree. I had the idea to create an other file mbinterface.py that would replace the MbInterface class. The drawback is that we would have two files for the musicbrainz connection.
I'm not sure what's good here. Could we opt for an acceptable solution until mb.py is restructured ? That file is not really OOP btw
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that's a solution, properly. The issue is that the mbi
object is being initialised in the module directly. Honestly a better solution, based on its structure and that the user-agent is a constant, we should just make a new object of the MbInterface
whenever we call it, or even convert them to class functions. The latter is probably best. There's no actual reason to make an MbInterface
object unless you're changing the source from the MusicBrainz servers to something else.
Even integrating it with the rate limiting code as a singleton could work. It's tricky. If we didn't offer support for other sources, I would just say make them class functions. Maybe passing in a destination object or reading that directly from the config could work.
What do you think?
mbzero is much lighter and things have to be implemented in beets: - there is no helper such as get_release_by_id - no checking, so wrong requests will raise an exception - there is nothing to limit the requests, it needs to be implemented - mocks are moved to the interface Signed-off-by: Louis Rannou <[email protected]>
mbzero 0.4 has support for python 3.8 and 3.9 Signed-off-by: Louis Rannou <[email protected]>
Signed-off-by: Louis Rannou <[email protected]>
Signed-off-by: Louis Rannou <[email protected]>
"release": { | ||
"title": releases[id_][0], | ||
"id": id_, | ||
"medium-list": [ | ||
{ | ||
"track-list": [ | ||
{ | ||
"id": "baz", | ||
"recording": { | ||
"title": "foo", | ||
"id": "bar", | ||
"length": 59, | ||
}, | ||
"position": 9, | ||
"number": "A2", | ||
} | ||
], | ||
"position": 5, | ||
} | ||
], | ||
"artist-credit": [ | ||
{ | ||
"artist": { | ||
"name": releases[id_][1], | ||
"id": "some-id", | ||
}, | ||
} | ||
], | ||
"release-group": { | ||
"id": "another-id", | ||
}, | ||
"status": "Official", | ||
} | ||
"title": releases[id_][0], | ||
"id": id_, | ||
"media": [ | ||
{ | ||
"tracks": [ | ||
{ | ||
"id": "baz", | ||
"recording": { | ||
"title": "foo", | ||
"id": "bar", | ||
"length": 59, | ||
}, | ||
"position": 9, | ||
"number": "A2", | ||
} | ||
], | ||
"position": 5, | ||
} | ||
], | ||
"artist-credit": [ | ||
{ | ||
"artist": { | ||
"name": releases[id_][1], | ||
"id": "some-id", | ||
}, | ||
} | ||
], | ||
"release-group": { | ||
"id": "another-id", | ||
}, | ||
"status": "Official", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll need to do a full review of the PR with these new changes, but there is one thing that could help me get through it a bit faster. Could you explain why the structure of the response dict has changes here? It just raises a red flag when the test fixture has this type of change.
It only works if everywhere in the code that expected the old structure has changed as well, which I'll need to check and can be time consuming. Because changing the tests without changing the codebase can lead to a mismatch between the tests and the actual code if the fixtures and mocks don't actually relate. Unnesting a value like this can cause big problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand. For a better understanding, you may go commit by commit.
The structure of the response change because the library relies on the json interface instead of the xml as suggested in #4651 (comment)
The structure of the response is a bit different.
The difference is mainly :
- the "release" sublevel has disappeared
- "xxx-list" has become "xxxs"
- "medium-list" has become "media"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello. Eventually, I my split the work and first have a replacement that keeps the xml interface. That would leave the test. The xml is supported by mbzero anyway.
When that's fine, we can think about dropping the xml in favor of json.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks! I still need to review, my semester has hit like a truck so I have had very, very little spare time or energy to do code review. I'll get to it soon hopefully :)
Description
Suggestion for musicbrainz substitution in relation with discussion #4660
Tests have been ran with command
tox -e py311-test
To Do
docs/
to describe it.)docs/changelog.rst
near the top of the document.)