Gerrish, Charlotte. « European Copyright Law and the Text and Data Mining Exceptions and Limitations » [thesis]

« We are in a digital age with Big Data at the heart of our global online environment. Exploiting Big Data by manual means is virtually impossible. We therefore need to rely on innovative methods such as Machine Learning and AI to allow us to fully harness the value of Big Data available in our digital society. One of the key processes allowing us to innovate using new technologies such as Machine Learning and AI is TDM, which is carried out on large volumes of Big Data. Whilst there is no single definition of TDM, it is universally acknowledged that TDM involves the automated analytical processing of raw and unstructured data sets through sophisticated ICT tools in order to obtain valuable insights for society or to enable efficient Machine Learning and AI development. Some of the source text and data on which TDM is performed is likely to be protected by copyright, which creates difficulties regarding the balance between the exclusive rights of copyright holders and the interests of innovators developing TDM technologies and performing TDM, for both research and commercial purposes, who need as much unfettered access to source material as possible in order to create the most performant AI solutions. As technology has grown so rapidly over the last few decades, the copyright law framework must adapt to avoid becoming redundant. This paper looks at the European approach to copyright law in the era of Big Data, and specifically its approach to TDM exceptions in light of the recent DSM Directive, and whether this approach has been, or is, a furtherance or hindrance to innovation in the EU. (…) »