Show HN: Spotify streaming GDPR dump local analyzer

github.com

104 points by pdubouilh 3 years ago · 25 comments


punnerud 3 years ago

And ChatGPT is good at adding comments along the way in the code, so when they do a new dump from GitHub they have way better training data

  • nuccy 3 years ago

    I already find the original ChatGPT (free version based on GPT-3.X) "smarter" than Bing's one based on GPT-4. The latter gives up much more easily, just saying that it could not find anything, while ChatGPT replies based on its "knowledge" even on quite advanced science questions. Having an LLM trained on content generated by other LLMs may be a recipe for disaster in the future: filtering true from false (but convincing) information at scale would be hard, to say the least. In light of recent troubles with the Internet Archive, AI companies like OpenAI are among the beneficiaries of the Internet history "made on Earth by humans" :)

    • seaal 3 years ago

      I think Bing Chat’s safety features make it very neutered. Lately I’ve been enjoying using Bard more due to its longer answers by default.

      • beebeepka 3 years ago

        You want longer responses? One of my problems with ChatGPT is its tendency to use too many words for no good reason. Most of it is just filler imho

    • hobs 3 years ago

      You can't prevent an LLM from having a hallucination.

      The current state of the art is retrieval + summarization; all that knowledge it was trained on still exists. When performing a search, having no reference for the knowledge is a decent signal that it may not exist at all and you may be talking to a liar.

      • capableweb 3 years ago

        You can though, by designing your prompt against it and using very low temperature values so it's more deterministic.
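For readers wondering what "very low temperature" does mechanically, here is a minimal sketch of temperature-scaled softmax over next-token scores. The logits are made-up numbers and `softmax_with_temperature` is a hypothetical helper, not any vendor's API. It only shows why sampling becomes more deterministic; it says nothing about whether the dominant token is factually right.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate tokens (illustrative only).
logits = [2.0, 1.0, 0.5]

for temperature in (1.0, 0.2):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At the lower temperature almost all of the probability mass lands on the top-scoring token, so repeated sampling returns nearly the same output each time.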

mikae1 3 years ago

Well, this could be close to what I dreamed of when I did the GDPR takeout two years ago! Haven't used Spotify since, so the data should actually still be up to date. :)

Never used Spotify on a mobile device though, so the location scripts will likely not be interesting at all.

dag11 3 years ago

Oh hell yes! I downloaded my GDPR dump a few months ago with the intent of analyzing it like this but never got around to it. Gonna fire this up now.

On the topic of ChatGPT code projects, I recently made another evening side project[1] using ChatGPT. A really nice pattern I found is to use one commit per ChatGPT iteration (including commits where it breaks the program, just don't push to main until it's good again). And in each commit, I store the full prompt or reply I gave ChatGPT as a prompt.txt[2]. I'll probably tack it onto the commit description next time for ease of reading. But other friends have found it really useful to be able to see exactly how I+ChatGPT evolved the software with each commit, and I can look back and reference useful prompting patterns I used.

[1] https://github.com/dag10/timelapse [2] https://github.com/dag10/timelapse/commits/e77d11baaaf4e2a5f...

rickdeveloper 3 years ago

There's an iPhone/Android app that does something similar: https://stats.fm

amrb 3 years ago

Feels a bit creepy that they have the "mood" and location in a database.

  • hoffs 3 years ago

    Not sure what you mean, Spotify tracks have long exposed "features" that include various metrics to judge mood. https://developer.spotify.com/documentation/web-api/referenc...

    Also it's pretty apparent since Spotify suggests different playlists depending on time of day, weather, etc.

    • luckylion 3 years ago

      Do they only judge a song's mood, or do they judge the listener's mood and store that alongside the location?

      • input_sh 3 years ago

        Each song gets analysed and assigned ~10 audio parameters. If you've tended to like tracks with, say, high danceability in the past X days, you're gonna get recommended more tracks with high danceability.

        If you listen to a specific genre, it's gonna work great. If you tend to listen to unrelated tracks, your recommendations are gonna be shit. If you listen to a couple of albums that are really not what you listen to usually, it's gonna take you a while to get rid of similar tracks from your recommendations.

        So to answer your question, it's more of a snapshot of the mood of the tracks you've interacted with recently, not your mood specifically.
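A rough sketch of the "snapshot of the mood of the tracks you've interacted with recently" idea described above: average each audio feature over the plays inside a recent window. This is a guess at the shape of the computation, not Spotify's actual recommender. The function name, the 7-day window, and the sample data are all assumptions; `danceability`, `energy` and `valence` are borrowed from the names used in Spotify's public audio-features documentation.

```python
from datetime import datetime, timedelta


def recent_feature_profile(plays, now, window_days=7):
    """Average each audio feature over plays within the last `window_days` days.

    `plays` is a list of dicts with a `played_at` datetime and a `features`
    dict (hypothetical structure, chosen for this example).
    """
    cutoff = now - timedelta(days=window_days)
    recent = [p for p in plays if p["played_at"] >= cutoff]
    if not recent:
        return {}
    names = recent[0]["features"].keys()
    return {
        name: sum(p["features"][name] for p in recent) / len(recent)
        for name in names
    }


now = datetime(2023, 5, 1)
plays = [  # invented sample data
    {"played_at": datetime(2023, 4, 30),
     "features": {"danceability": 0.9, "energy": 0.8, "valence": 0.7}},
    {"played_at": datetime(2023, 4, 28),
     "features": {"danceability": 0.7, "energy": 0.6, "valence": 0.5}},
    {"played_at": datetime(2023, 3, 1),  # outside the window, ignored
     "features": {"danceability": 0.1, "energy": 0.1, "valence": 0.1}},
]

print(recent_feature_profile(plays, now))
```

Under this model a couple of out-of-character albums shift the averages until those plays age out of the window, which matches the "takes a while to get rid of similar tracks" experience.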

  • pdubouilhOP 3 years ago

    I guess I didn't realise that they kept full IPs for such a wide range of time - I was kind of hoping they were anonymizing them after some time...

cynicalsecurity 3 years ago

What's this? What does GDPR have to do with it?

nemof 3 years ago

made a lot more sense when i looked up what endsong_*.json was. you can download your data (prob relates to gdpr requirements i guess) here: https://www.spotify.com/us/account/privacy/ details as to what is included in the json dump are noted here: https://support.spotify.com/us/article/understanding-my-data...
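For anyone who wants to poke at the endsong_*.json files directly, a minimal sketch follows. It is not the code from the linked project. It assumes each file is a JSON array of play records carrying `ms_played` and `master_metadata_album_artist_name` fields, as in the extended streaming history export; check your own dump, since field names may differ. `top_artists` and the `my_spotify_data` directory are hypothetical names.

```python
import json
from collections import Counter
from pathlib import Path


def top_artists(dump_dir, n=10):
    """Return the n most-played artists as (artist, hours listened) pairs."""
    totals = Counter()
    for path in sorted(Path(dump_dir).glob("endsong_*.json")):
        with path.open(encoding="utf-8") as f:
            for entry in json.load(f):
                # Assumed field names; podcast plays have no artist and are skipped.
                artist = entry.get("master_metadata_album_artist_name")
                if artist:
                    totals[artist] += entry.get("ms_played") or 0
    return [(artist, ms / 3_600_000) for artist, ms in totals.most_common(n)]


if __name__ == "__main__":
    # "my_spotify_data" is a placeholder for wherever you unzipped the dump.
    for artist, hours in top_artists("my_spotify_data"):
        print(f"{hours:8.1f} h  {artist}")
```

Everything runs locally on the unzipped dump, so nothing has to be uploaded anywhere.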

rvcdbn 3 years ago

Really wish there were more of these kinds of tools to make sense of GDPR dumps.

  • grogenaut 3 years ago

    make some? point out some gdpr dumps you'd like tools for, what they should do?
