In a bold move that has shaken the music streaming industry, the activist group Anna’s Archive announced on December 20 that it had successfully scraped a vast portion of Spotify’s music library. Known primarily as a “shadow library” that archives books and academic papers, the group framed the action as a cultural preservation effort, claiming to have created the world’s first fully open music archive.
The scrape includes 256 million rows of track metadata—covering titles, artists, albums, ISRC codes, and more—representing nearly 99.9% of Spotify’s catalog. Additionally, Anna’s Archive claims to have obtained 86 million audio files in OGG format, totaling around 300 terabytes and accounting for approximately 99.6% of all listening activity on the platform. The data is prioritized by Spotify’s own popularity metrics, with popular tracks preserved in original quality (160 kbit/s OGG Vorbis) and lesser-streamed ones compressed further.
As of today, only the metadata (about 200GB compressed) has been fully released via torrents. Audio files are being distributed in stages, starting with the most popular content. The archive covers music up to July 2025, with some newer popular tracks included but potential gaps in post-July releases.
Anna’s Archive argues that existing music preservation efforts are flawed: overly focused on high-quality versions of popular tracks, neglecting niche or rare music, and lacking a centralized, open torrent list for “all music ever produced.” They position Spotify’s catalog as a “great start” for a decentralized, mirrorable archive resistant to platform failures, license expirations, or corporate decisions.
Spotify’s Response and Ongoing Investigation
Spotify quickly confirmed the incident, describing it as unauthorized access. In statements to outlets like Billboard, Android Authority, and Cybernews, a spokesperson said: “An investigation into unauthorized access identified that a third party scraped public metadata and used illicit tactics to circumvent DRM to access some of the platform’s audio files. We are actively investigating the incident.”
The company downplayed the scale in some responses, referring to only “some” audio files being affected, and characterized the group as anti-copyright extremists. No user data appears to have been compromised: this is not a traditional breach but a large-scale scrape that bypassed content protections.
Potential Impacts on Spotify
While the full ramifications are unfolding, experts and industry observers highlight several key areas where this could affect Spotify:
- Security and DRM Vulnerabilities Exposed
The scrape demonstrates that even robust DRM can be circumvented at scale. This raises questions about the effectiveness of streaming platforms’ protections against mass downloading. Spotify may need to invest heavily in enhanced anti-scraping measures, rate limiting, and API security, potentially increasing operational costs.
- Piracy and User Retention Risks
Tech commentators, including Yoav Zimmerman (CEO of Third Chair, an AI legal tools startup), noted that individuals could theoretically build personal offline Spotify clones using tools like Plex media servers. Once the roughly 300TB of audio is fully available via torrents, dedicated users might create ad-free, unlimited libraries, though storage requirements (hundreds of terabytes) and legal risks limit widespread adoption.
For casual listeners, Spotify’s convenience (recommendations, playlists, cross-device syncing) remains superior, so immediate subscriber churn is unlikely. However, in regions with poor internet or high subscription costs, this could fuel piracy resurgence.
- Copyright and Label Relations
Spotify licenses music from major labels under strict terms. This incident could strain those relationships, prompting demands for better safeguards or renegotiated deals. Similar past scrapes (e.g., YouTube datasets) have fueled unlicensed AI music training, a growing concern. If this archive is used to train AI models, it could complicate Spotify’s own AI features (such as AI-generated playlists) and industry-wide licensing negotiations.
- Financial and Market Reaction
Despite the headlines, Spotify’s stock (SPOT) showed resilience: shares rose about 3% on Friday, December 19 (pre-announcement leak), closing around $582, and continued gaining in early trading today. Investors appear to view it as a non-existential threat: streaming dominance relies on exclusivity, discovery algorithms, and live features, not just raw access to music files.
- Broader Industry Debate
This revives tensions between digital preservation and copyright. Anna’s Archive echoes arguments from groups like the Internet Archive, but on a massive commercial scale. Legal action seems inevitable: record labels sought up to $621 million in damages from the Internet Archive for digitizing old records before that case was settled, which suggests severe penalties could follow if labels pursue claims.
In the short term, Spotify’s core business—over 600 million users and growing subscriptions—seems secure. The platform’s value lies in curation and ease, not sole ownership of files. Long-term, however, this underscores streaming’s fragility: one vulnerability can democratize (or pirate) an entire catalog overnight.
As investigations continue and more audio torrents seed, the music world watches closely. Preservation or piracy? The line has rarely been blurrier.
