The U.S. DMCA notice-and-takedown system has generated heated debate for many years, with supporters arguing that the safe harbour is essential and rights holder critics countering that the growing number of takedown notices sent to Google illustrates mounting piracy concerns. In recent months, there have been several reports that raise questions about the reliability of takedown notices. A study released last year by the University of California, Berkeley and Columbia University found that approximately 30% of notices were questionable, while a TorrentFreak report this week identified tens of millions of fake DMCA takedown notices sent to Google concerning a website with virtually no traffic. An earlier report also raised questions about dubious takedown practices.
Yet those reports pale in comparison to data just released by Google in its submission to the Register of Copyrights as part of the review of the DMCA notice-and-takedown system. Google reports that the overwhelming majority of takedown notices sent to Google Search through its Trusted Copyright Removal Program do not involve pages that are actually in its search index. The submission states:
a significant portion of the recent increases in DMCA submission volumes for Google Search stem from notices that appear to be duplicative, unnecessary, or mistaken. As we explained at the San Francisco Roundtable, a substantial number of takedown requests submitted to Google are for URLs that have never been in our search index, and therefore could never have appeared in our search results. For example, in January 2017, the most prolific submitter submitted notices that Google honored for 16,457,433 URLs. But on further inspection, 16,450,129 (99.97%) of those URLs were not in our search index in the first place. Nor is this problem limited to one submitter: in total, 99.95% of all URLs processed from our Trusted Copyright Removal Program in January 2017 were not in our index.
These numbers are simply staggering, with only a tiny fraction of the millions of requests reflecting actual pages in the search index. Rather, 99.95% of the processed URLs from Google’s trusted submitter program are machine-generated URLs that do not involve actual pages in the search index. Given that data, Google notes that the claim that the large number of requests correlates to infringing content on the Internet is incorrect:
Nor is the large number of takedown requests to Google a good proxy even for the volume of infringing material available on the Internet. Many of these submissions appear to be generated by merely scrambling the words in a search query and appending that to a URL, so that each query makes a different URL that nonetheless leads to the same page of results.
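To make the mechanism Google describes concrete, here is a minimal Python sketch of how scrambling the words in a single search query can produce many distinct URLs that all point to the same results. It is an illustration only, not the method used by any actual submitter: the site `example.com`, the `q` parameter, and the helper names `scrambled_urls` and `canonical_key` are hypothetical, and the sketch assumes that word order does not change the results page.

```python
from itertools import permutations
from urllib.parse import parse_qs, quote_plus, urlsplit

# Hypothetical search endpoint, used purely for illustration.
BASE = "https://example.com/search?q="


def scrambled_urls(query):
    """Build one URL per ordering of the words in the query."""
    words = query.split()
    return sorted({BASE + quote_plus(" ".join(p)) for p in permutations(words)})


def canonical_key(url):
    """Reduce a URL to host, path, and the sorted set of query words.

    Assumption: word order does not affect which results page is shown.
    """
    parts = urlsplit(url)
    terms = parse_qs(parts.query).get("q", [""])[0].split()
    return (parts.netloc, parts.path, tuple(sorted(terms)))


urls = scrambled_urls("free movie download")
distinct_pages = {canonical_key(u) for u in urls}

# Six different URLs, but only one underlying page of results.
print(len(urls), len(distinct_pages))  # prints "6 1"
```

Under these assumptions, a three-word query already yields six URLs for one page, and the count grows factorially with each added word, which is why raw URL counts can vastly overstate the amount of distinct content at issue.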
The incredible volume of fake claims regarding allegedly infringing pages represents a serious problem. Indeed, the Google data points to a massive fraud in search index takedown requests, calling into question claims about the scope of infringing material on the Internet. The Register of Copyrights' review of the DMCA continues, with written submissions on empirical research due next month.