Artificial intelligence (AI) has advanced considerably in recent years and has become widely accessible to ordinary users. This progress has also raised copyright questions about the use of copyrighted materials in AI training, as our legal trainee Sofia Järvinen describes in her blog.

AI's ability to generate content rests on the large amount of data on which it has been trained; everything it creates builds on this data. The right to use such data is one of the concerns in the relationship between AI and copyright. In the United States, this conflict has led to lawsuits against AI companies.

Lawsuits in the United States

In a lawsuit filed by the stock photo provider Getty Images, AI company Stability AI is accused of misusing over 12 million images in the training of its AI image generator. A group of visual artists has filed a class-action complaint against Stability AI on similar grounds.

These lawsuits revolve around the concept of "fair use" in U.S. copyright law, which allows the use of copyright-protected material in certain circumstances. When determining fair use, four factors are considered: the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used in relation to the copyrighted work as a whole, and the effect of the use on the potential market for or value of the copyrighted work. Additionally, other factors may be taken into account, and the evaluation is always done on a case-by-case basis. Examples of types of uses that can be considered fair use include using materials for criticism, commentary, news reporting, teaching, and research.

These lawsuits, along with other similar cases, are expected to clarify how U.S. copyright law applies to AI. However, they are not expected to be concluded within this year. The legal uncertainty in this area has also been acknowledged by, for example, the U.S. Copyright Office, which launched an initiative in March to address questions related to copyright and AI technology, including the use of copyrighted materials in AI training. Given the significance of the U.S. as a market and its influence on other countries, it will be interesting to see how these questions are resolved.

Copyright limitations in Finland

Finnish copyright legislation has no principle equivalent to the U.S. concept of "fair use". However, copyright in Finland is subject to certain limitations. These include the right to quote a work, which allows the use of quotations from published works in accordance with proper usage to the extent necessary for the purpose, as well as the right to reproduction for private use, which allows making single copies of a work for private use with certain restrictions. Temporary reproduction for the purpose of transmitting a work in a network between third parties, or for the lawful use of the work, is also allowed if it is transient or incidental, forms an essential part of a technological process, and has no independent economic significance.

Based on these pre-existing limitations in the law, it is difficult to give a definite answer regarding the copyright status of materials used in AI training. However, the changes made to the Finnish Copyright Act to implement the DSM Directive bring some clarity to the matter. The revised Copyright Act, which came into effect in April, includes a limitation on the reproduction of parts of works for text and data mining purposes. Accordingly, a person with legal access to a work may reproduce parts of it for the purpose of text and data mining and retain copies for this purpose, unless the author has expressly reserved this right. Such data mining is used in AI training.

If disputes like the ongoing lawsuits in the United States were to arise in Finland, they would need to be considered on a case-by-case basis. In such cases, the operating model of the individual AI program would be examined against the limitations of copyright. Based on these considerations, it appears that the use of copyrighted materials could be allowed if the utilized content is legally accessible to the AI company and the author has not reserved the use of the work for data mining purposes.

Sources:

Getty Images lawsuit says Stability AI misused photos to train AI – Blake Brittain, Reuters (6.2.2023)

Stability AI swerves copyright infringement allegations in response to Getty lawsuit – Tim Smith and Kai Nicol-Schwarz, Sifted (3.5.2023)

Generative AI Has an Intellectual Property Problem – Gil Appel, Juliana Neelbauer and David A. Schweidel, Harvard Business Review (7.4.2023)

U.S. Copyright Office Fair Use Index