Import from spotify extended streaming history #1800

Draft
wants to merge 17 commits into
base: master
from

Conversation

amCap1712
Member

It is possible to get one's entire streaming history from Spotify. The export available on their website has timestamps rounded to minutes, but their policy mentions that one can email them to get an extended history. I asked for it and received the data after 3 weeks. The zip file sent by Spotify contains a number of JSON files, one of which is endsong.json; this contains the Spotify stream data. (There's also an endvideo.json. For me there wasn't anything useful in it, but there may be for others, so we may want to look into it as well.)

Here's what one Spotify stream looks like in it:

{
    "ts": "2021-11-15T13:41:44Z",
    "username": "foobar",
    "platform": "web_player windows 10;chrome 95.0.4638.69;desktop",
    "ms_played": 82932,
    "conn_country": "IN",
    "ip_addr_decrypted": "127.0.0.1", 
    "user_agent_decrypted": "Mozilla%2F5.0%20(Windows%20NT%2010.0;%20Win64;%20x64)%20AppleWebKit%2F537.36%20(KHTML,%20like%20Gecko)%20Chrome%2F95.0.4638.69%20Safari%2F537.36",
    "master_metadata_track_name": "Ghost",
    "master_metadata_album_artist_name": "Justin Bieber",
    "master_metadata_album_album_name": "Justice",
    "spotify_track_uri": "spotify:track:6I3mqTwhRpn34SLVafSH7G",
    "episode_name": null,
    "episode_show_name": null,
    "spotify_episode_uri": null,
    "reason_start": "trackdone",
    "reason_end": "fwdbtn",
    "shuffle": false,
    "skipped": null,
    "offline": false,
    "offline_timestamp": 0,
    "incognito_mode": false
}

The stream contains the user's actual IP address. There's also the country and platform, but the IP address is probably the most sensitive information there. So we might want to handle processing of this stream file client-side, which is what the PR currently does.
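As a rough illustration of the client-side idea, here is a minimal sketch (not the code in this PR) that maps one stream to a listen while copying only the fields needed, so the IP address, user agent, username and country never leave the browser. The `SpotifyStream` and `Listen` types and the `streamToListen` helper are hypothetical names. The use of `music_service` and `spotify_id` inside `additional_info`, and treating `ts` directly as the listen time, are assumptions to check against the ListenBrainz API documentation.

```typescript
// Hypothetical shapes for illustration; not the types used in this PR.
interface SpotifyStream {
  ts: string;
  ms_played: number;
  master_metadata_track_name: string | null;
  master_metadata_album_artist_name: string | null;
  master_metadata_album_album_name: string | null;
  spotify_track_uri: string | null;
  // Sensitive fields present in the export but never copied below.
  username?: string;
  ip_addr_decrypted?: string;
  user_agent_decrypted?: string;
  conn_country?: string;
}

interface Listen {
  listened_at: number; // Unix timestamp in seconds
  track_metadata: {
    artist_name: string;
    track_name: string;
    release_name?: string;
    additional_info: Record<string, string | number>;
  };
}

const TRACK_URI_PREFIX = "spotify:track:";

// Build a listen from an allowlist of fields; returns null for entries
// without track metadata (for example podcast episodes).
function streamToListen(stream: SpotifyStream): Listen | null {
  const track = stream.master_metadata_track_name;
  const artist = stream.master_metadata_album_artist_name;
  if (!track || !artist) {
    return null;
  }

  // Assumed additional_info keys; verify against the ListenBrainz docs.
  const additionalInfo: Record<string, string | number> = {
    music_service: "spotify.com",
  };
  const uri = stream.spotify_track_uri;
  if (uri && uri.indexOf(TRACK_URI_PREFIX) === 0) {
    additionalInfo.spotify_id =
      "https://open.spotify.com/track/" + uri.slice(TRACK_URI_PREFIX.length);
  }

  const listen: Listen = {
    // Assumption: `ts` is used as-is for the listen time.
    listened_at: Math.floor(Date.parse(stream.ts) / 1000),
    track_metadata: {
      artist_name: artist,
      track_name: track,
      additional_info: additionalInfo,
    },
  };
  if (stream.master_metadata_album_album_name) {
    listen.track_metadata.release_name = stream.master_metadata_album_album_name;
  }
  return listen;
}
```

Building the listen from an allowlist, rather than deleting known-sensitive keys from a copy of the stream, means any field Spotify adds to the export later is dropped by default.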

I have also put it on the Last.fm importer page for now. We'll probably need to overhaul the page a bit to make space for the Spotify importer. We should also add documentation on how people can get their streaming history from Spotify.

Member

@MonkeyDo MonkeyDo left a comment

I'll be able to test this locally once I get my files with ms precision from Spotify.

I've been thinking we'll want to make it easier for users to request their info from Spotify, so we could have a button that opens a precomposed email to the Spotify support address, maybe with a short blurb about timestamp precision (not sure if this is necessary; perhaps for users who are already tracking their listens, to avoid duplicates?).
Then we can see how to link to that on a "getting started" page to help onboard new users.

listenbrainz/webserver/static/js/src/types.d.ts (outdated, resolved)
@saxobroko

You no longer need to email Spotify; they have a link to request your extended history at https://www.spotify.com/us/account/privacy/

image

@amCap1712 amCap1712 force-pushed the import-extended-streaming branch 2 times, most recently from de9479a to 8e5022a Compare March 12, 2023 14:24
@MonkeyDo
Member

@amCap1712 Let me know if you need some help with testing or with the front-end side of things

@freddie-freeloader

Hey there! I am quite interested in this feature and was about to build something myself using the API until I saw this pull request. Is there any particular reason why this pull request has not been merged yet?
Thank you for your work ☺️

@BoostCookie

I have written a small importer script.
https://gitlab.com/BoostCookie/spotify-to-listenbrainz-history-import

const listens = streams
.filter(
(stream) =>
stream.ms_played > 30000 &&

According to the ListenBrainz documentation, a listen should only be submitted if it was played for more than half the track or for at least 4 minutes.
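The rule described above could be sketched roughly as follows. This is only an illustration: `qualifiesAsListen` and its parameters are hypothetical names, and the sample stream shown in this PR carries no track duration, so `durationMs` would have to come from another source. When the duration is unknown, the sketch falls back to a fixed threshold, defaulting to the 30000 ms used by the filter in the diff; that fallback is an assumption, not something the documentation prescribes.

```typescript
const FOUR_MINUTES_MS = 4 * 60 * 1000;

// Hypothetical helper: decide whether a stream counts as a listen.
// msPlayed:   how long the track was played (ms_played in the export)
// durationMs: full track length in ms, if known from another source
// fallbackMs: threshold used only when the duration is unknown
function qualifiesAsListen(
  msPlayed: number,
  durationMs?: number,
  fallbackMs: number = 30000
): boolean {
  // At least 4 minutes played always qualifies.
  if (msPlayed >= FOUR_MINUTES_MS) {
    return true;
  }
  // Without a known duration the half-track rule cannot be applied.
  if (durationMs === undefined) {
    return msPlayed > fallbackMs;
  }
  // More than half the track played.
  return msPlayed > durationMs / 2;
}
```

With a known duration of 180000 ms, a stream with 100000 ms played qualifies and one with 80000 ms does not.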

@kellnerd
Contributor

kellnerd commented May 8, 2024

I think I had already mentioned this on IRC, but there are a few caveats with the data from Spotify. For some of them I have found workarounds while implementing my own parser; feel free to integrate them into the official importer.
See my implementation and the accompanying documentation.

6 participants