
Misleading on Fair Dealing, Part 2: Why Access Copyright’s Claim of 600 Million Uncompensated Copies Doesn’t Add Up

November 20, 2018

My series on how the copyright review has been misled on fair dealing continues with one of Access Copyright’s most attention-getting claims: that “600 million pages of copyright-protected content is being copied for free each year by the education sector” (Part 1 covered the inconsistent claims on the effect of the 2012 reforms). The 600 million page figure has attracted widespread coverage; it is regularly cited by Access Copyright supporters (Association of Canadian Publishers, Writers’ Union of Canada) and has been noted by Members of Parliament. However, a closer examination reveals that the number is the result of outdated guesswork using data that is more than a decade old and deeply suspect assumptions.

Where does the 600 million figure come from?

Questions about the source of the 600 million figure have arisen a couple of times during the copyright review hearings. For example, MP Julie Dabrusin joined the Industry Committee for a hearing featuring the publishers and the writers’ union, leading to the following exchange:

Ms. Julie Dabrusin: I’ve also heard a reference to 600 million pages every year being copied. I’m not sure, first of all, if you could tell us what the source is for that information, because that might help.

Mr. John Degen: My understanding is that most of that number, or at least over half of that number, comes from the York trial, so we’re talking about evidence. We’re talking about actual testing of copying that happened on a university campus in Canada. The rest would be what’s historical from the licence.

Ms. Kate Edwards: They’re from submissions to the Copyright Board and assessments of the tariff. Again, these are evidence-based, real copies made in Canadian institutions, and that has not changed in the last five years.

In fact, both Degen and Edwards are wrong. As discussed below, over half of the number does not come from the York University trial and the data says very little about copying practices over the past five years.

NDP MP Brian Masse posed the same question directly to Access Copyright during its appearance. Access Copyright’s Roanie Levy was somewhat evasive during the exchange:

Mr. Brian Masse: I want to get clarification with regard to the illegal copying that’s taking place. I’ve heard different numbers. What exactly are they and where do the numbers come from? Those have also been used by several witnesses prior to today’s testimony. I think they’re all using these numbers, the 600 million, as well as some other ones. Where do they come from?

Ms. Roanie Levy: There is a document, which I distributed earlier, that has some of the key numbers in it and explains where they come from. You’ll see where the “600 million pages of published works copied every year” comes from.

Mr. Brian Masse: For the record where exactly does that come from?

Ms. Roanie Levy: It comes from a couple of places. The first place is the Copyright Board decision in the elementary/secondary school sector. There were 380 million pages copied there per year.

Mr. Brian Masse: How did they come to that conclusion?

Ms. Roanie Levy: A study was done.

Mr. Brian Masse: I want to make sure because we’re hearing that number quite a bit.

Ms. Roanie Levy: Yes, absolutely.

Mr. Brian Masse: If people are following perhaps they want to know.

Ms. Roanie Levy: A study was done and that’s where the 380 million came from. Of that 380 million, the Copyright Board concluded that 60% of that is fair dealing, and therefore not compensable.

Mr. Brian Masse: Okay.

Ms. Roanie Levy: The remaining 40% is compensable. So, in fact, there were 150 million pages that still need to be compensated, but the ministers of education are still refusing to pay that. They’re claiming the whole thing under fair dealing, even the amount that the Copyright Board said had to be paid for.

Mr. Brian Masse: There’s about 150 million pages in outstanding invoices?

Ms. Roanie Levy: Outstanding payments that they’re claiming fair dealing on, 380 million pages in total for elementary/secondary. In the post-sec sector, we did a study on York University. In that case, as a result of the study that looked at the copies loaded on learning management systems and course packs, we see, on average, 360 pages per student per year being copied.

Mr. Brian Masse: Okay. So, it’s—

Ms. Roanie Levy: When you use all of that data, on a conservative end, you end up with 600 million pages that have been copied and not paid for. These are copies that are not licensed, nor have transactional licences been obtained for them that are not available under open access licences.

Why the 600 million page figure is misleading: K-12 data

A closer look at the sources cited by Access Copyright reveals that the 600 million page figure is highly misleading, overstating the impact of fair dealing while relying on data that almost entirely pre-dates the 2012 reforms. In fact, some of the data is so old that there are cabinet ministers who could have been the actual students in the K-12 schools at the time they were surveyed on copying practices.

Access Copyright says the 600 million figure combines data from a Copyright Board decision involving copying practices at K-12 schools with several data points from York University. The Copyright Board data reflects 13-year-old copying survey information from 2005-2006. The Copyright Board noted that the data is now quite old, warning both Access Copyright and school boards in 2016 that it may no longer be representative:

Both of the parties rely on the 2005-2006 Volume Study in their calculation of a royalty rate for the Proposed Tariffs. While we believe it is still possible to use this study for the purposes of establishing a royalty rate for this Tariff, it may not be so in the future. The study may then be sufficiently dated as to call into question its representativeness.

To suggest that 2005 copying data – which obviously pre-dates the 2012 reforms and Supreme Court of Canada decisions by many years – can be used as the basis to make claims about copying in 2018 is simply not credible.

Since the data is so old, it should not be used to make assertions about copying practices more than a decade later. Yet even if the data were reliable, Access Copyright’s inference that it represents 380 million unpaid copies due to fair dealing is misleading. The Copyright Board concluded that there were 382.2 million copies made by schools and school boards at issue, of which about half were books and the other half were newspapers, periodicals, and consumables (which can be any number of different works). After conducting a fair dealing analysis, the board concluded that 154.3 million copies – just over 40 per cent – were not covered by fair dealing. The majority of those copies were consumables.

Access Copyright has rounded the 382.2 million copies to 380 million and now argues that all are uncompensated. In fact, Canadian schools paid millions of dollars to Access Copyright from 2010 to 2012 for copying in advance of an actual tariff being set by the Copyright Board of Canada. Having been found to have overpaid by tens of millions of dollars (based on Copyright Board analysis that was upheld by the Federal Court of Appeal), the education groups asked for a refund of the overpayment. When Access Copyright refused – it said it was holding the money to cover later copying under a licence from which schools had opted out – the schools were forced to file a lawsuit for a return of the overpayment. In other words, there was overpayment for 150 million copies from 2010-2012 and disputes/litigation over the withheld funds since that time.
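The arithmetic behind the K-12 portion of the claim can be laid out in a few lines. The Python sketch below uses only figures quoted in this post (the 600 million headline, the 380 million attributed to K-12 in committee testimony, and the Copyright Board's 382.2 million and 154.3 million). The variable names are illustrative, and the "adjusted" total is only a sensitivity check that swaps in the Board's figure. It is an assumption of this sketch, not a corrected estimate of actual copying.

```python
# All figures are in millions of pages per year, as quoted in this post.
HEADLINE_M = 600.0              # Access Copyright's headline claim
K12_CLAIMED_M = 380.0           # K-12 portion cited in committee testimony
K12_TOTAL_M = 382.2             # copies at issue per the Copyright Board
K12_NOT_FAIR_DEALING_M = 154.3  # copies the Board found not covered by fair dealing

# Share of the K-12 copies that the Board found were not fair dealing.
k12_compensable_share = K12_NOT_FAIR_DEALING_M / K12_TOTAL_M

# What remains of the headline once the claimed K-12 portion is removed.
post_secondary_m = HEADLINE_M - K12_CLAIMED_M

# Sensitivity check only (an assumption of this sketch): replace the claimed
# K-12 figure with the Board's non-fair-dealing figure and recompute the total.
adjusted_total_m = K12_NOT_FAIR_DEALING_M + post_secondary_m
gap_m = HEADLINE_M - adjusted_total_m

print(f"K-12 share not covered by fair dealing: {k12_compensable_share:.1%}")
print(f"Implied post-secondary portion: {post_secondary_m:.1f}M")
print(f"Total using the Board's K-12 figure: {adjusted_total_m:.1f}M")
print(f"Gap versus the headline: {gap_m:.1f}M")
```

On these inputs, the K-12 share works out to just over 40 per cent, and using the Board's own figure in place of the claimed 380 million would by itself lower the total by more than 200 million pages, before any question about the post-secondary data is considered.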

Why the 600 million page figure is misleading: York data

The remaining 220 million copies that Access Copyright claims are uncompensated each year are derived from a study conducted as part of the York University litigation that is currently before the Federal Court of Appeal. The precise source of this estimate is difficult to discern since the Access Copyright expert in the case cited several York-related data sets, including internal coursepack copying from 2005-2011, licensed copyshops from 2011-2015, York printing services from 2011-2013, and learning management system posting of works from 2011-2013. Access Copyright’s Roanie Levy seemed to suggest to the committee that it comes primarily from the 2005-2011 internal coursepack copying data, since she pointed to 360 pages per student per year and that report estimated an average of 387.5 exposures per full-time student (the estimate for LMS copying was far lower at 92 copies in 2012 and 186 in 2013 per full-time student).

If so, some of the York data is as old as the K-12 data, dating back as much as 13 years for printed coursepacks that have dramatically declined in usage. Not only is the data old, but the 220 million figure represents a remarkable extrapolation by Access Copyright from a single university to estimate copying for all universities and colleges across Canada many years later.

Indeed, even if the York study were not outdated and were somehow representative of all Canadian universities and colleges, the numbers significantly overstate the amount of potential non-fair dealing copying. First, the numbers do not account for copying covered by alternative site licences. York has spent millions of dollars on site licences to pay for access to content, yet the study does not account for that licensed copying. In fact, even the trial judge in the York case acknowledged the potential risk of overstating the copying:

While Gauthier [Access Copyright’s expert] did not adjust his estimate of print and digital exposures to account for York’s claim that some of the items captured in the sampling were copied or posted with permission, this does not significantly undermine the conclusions which can be drawn. The report may overstate some of the copying, but since York had the data it was incumbent on them to establish quantum and materiality.

While the trial judge was nevertheless satisfied with that approach, a second study on the same data concluded that 96.6 per cent of the copying in course packs was covered by alternative permissions that York had obtained to include the materials. For the course management system study, the percentage of documents subject to an alternate permission was 26.2 per cent. The precise amount of copying subject to alternate licences is open to debate, but there is no doubt that a sizable percentage of the copying was covered by a non-Access Copyright permission.

Second, although none of the data is current to the last few years, the most recent data comes from access to materials on online learning management systems (LMS). As will be discussed in an upcoming post, the days of paper-based coursepacks have largely disappeared with universities shifting to an LMS approach. That digital shift has significant licensing implications since it allows universities to license both access and reproduction in the same licence directly from publishers and database providers, unlike the Access Copyright licence that covers only reproduction.

The digital shift also has implications for determining how many copies – or instances of access – have occurred. According to Access Copyright, there should be a one-to-one ratio, meaning that every registered student should be deemed to have accessed every page posted to an LMS. If the 220 million figure includes LMS, then the “copying” is based on this assumption with a multiplier of every posted page per course for every registered student. Yet as any professor or student will tell you, it is incredibly unlikely that every student accessed every single page posted to an LMS.

In fact, the York study contained evidence that enrolled students often outnumbered those who actually accessed content on the LMS. The data indicated that enrolled students outnumbered unique accesses of the LMS 34 per cent of the time, equal access occurred just 5 per cent of the time, and 14 per cent of the time there were more unique accesses than enrolled students (46 per cent of the time the unique access data was missing). While not conclusive, the data unsurprisingly points to many instances where fewer students access content than are enrolled in a course. Indeed, the Copyright Board asked Access Copyright in the post-secondary case to provide data on alternate assumptions that 75%, 50%, and 25% of students accessed materials posted to an LMS.
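To see how much the deemed-access assumption drives the totals, consider a rough Python sketch. The course size and page count below are entirely hypothetical and are not drawn from the York study; only the access rates (the one-to-one assumption plus the 75%, 50%, and 25% alternatives requested by the Copyright Board) come from the discussion above. The function name is likewise made up for illustration.

```python
def deemed_pages_accessed(pages_posted: int, enrolled_students: int,
                          access_rate: float) -> float:
    """Pages counted as 'copied' when a given share of enrolled students
    is assumed to have accessed every page posted to the LMS."""
    if not 0.0 <= access_rate <= 1.0:
        raise ValueError("access_rate must be between 0 and 1")
    return pages_posted * enrolled_students * access_rate


# Hypothetical course, for illustration only (not York data).
PAGES_POSTED = 100
ENROLLED_STUDENTS = 50

# 1.00 is the one-to-one assumption; the rest are the Board's alternatives.
ACCESS_RATES = (1.00, 0.75, 0.50, 0.25)

estimates = {
    rate: deemed_pages_accessed(PAGES_POSTED, ENROLLED_STUDENTS, rate)
    for rate in ACCESS_RATES
}

for rate, pages in estimates.items():
    print(f"{rate:.0%} of students accessing: {pages:,.0f} pages")
```

Because the count scales linearly with the assumed access rate, any estimate built on the one-to-one assumption moves in direct proportion to it: halving the share of students who actually open the material halves the number of "copies".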

The net effect is that the York data is an unrepresentative, small sample of copying practices, much of which pre-dates the 2012 reforms. Moreover, much of the content may have been covered by alternate site licences and the number of copies or instances of use is undoubtedly overstated. Access Copyright and its allies have repeatedly cited the 600 million copy figure, but on closer inspection it overstates the situation by hundreds of millions of copies and is based on unreliable, old data.