
Copyright and Licenses

'Intellectual Property (IP)' law is a complex subject. However, some understanding of it is important for anyone producing creative works governed by it, including software, datasets, graphics and more. This is true irrespective of the nature of your project: closed commercial projects building on open tooling, commercial projects maintaining an open resource, and open community-driven and/or non-profit projects may each need to make slightly different licensing choices from the beginning to be compatible with their goals. Decisions about licensing made at the inception of a project can have long-lasting and significant ramifications. The choices that you make about how your work is licensed shape who can and cannot legally use your work and for what purpose.

The licensing of software, data, AI/ML models, hardware and other creative works such as visuals shares many common concepts, which are covered here.

Intellectual property is an umbrella term that refers to a number of distinct areas of law, primarily these three:

What these have in common is the attempt to extend property rights to intangible goods, meaning their use by others can be prevented or licensed. Governments with such laws effectively create a limited grant of monopoly over these goods for their creators, and other holders of these rights. This is generally done with the ostensible intent to incentivise the creation and improvement of such goods, but can in practice result in perverse incentives which fail to do so.

Note

It is important to consider that copyright, licenses, and patents are all legal concepts. As such, they are subject to what the law prescribes, which may change over time and from place to place. Simply put, different countries have different laws, and follow different procedures with regard to enforcing them. The content provided here is broadly based on American and European law and legal traditions. It might not be applicable or relevant in your particular context, and might even be contraindicated. However, most nations are signatories to international treaties which somewhat harmonise these laws, notably the Berne Convention, the TRIPS Agreement, and others under the World Intellectual Property Organization (WIPO). Whilst international efforts have sought to harmonise copyright enforcement, the real world is a messy place.

Another note

Good legal advice is timely, specific, and given by an expert; this chapter is none of these. It was written by engineers & scientists, not by lawyers, and it is a heavily simplified overview of a very complex field. The intent is to give you an overview of the basics so that you will know when to check whether something you want to do has potential legal ramifications. Do not make any important decisions based solely on the contents of this chapter.

So do not take the descriptions provided or viewpoints shared as legal advice; this document is not intended to be used in that manner. Consult a legal expert for actual legal advice on your case.

Under German copyright law (UrhG), (research) data per se are not protected by copyright. Copyright only applies to a “personal intellectual creation” (§ Abs. 2 UrhG), that is, the work must meet the criterion of individuality. Data do not meet this criterion, because data are regarded as facts. For data, this means that only their structure can be protected by copyright (§ 4 UrhG). For example, the initial table with the data is not protected by copyright, but if the researchers invested in comprehensive restructuring of the data (e.g., by adding new derivatives of the data after pre-processing), a copyright held by the creator is valid. The same is true for metadata. The holder of the copyright is always the creator, i.e., the researcher. If the dataset was created in collaboration, all collaborators share the same copyright.

Professors are not obligated to share the datasets they create with the university, because the research work of professors is understood not to be bound by instructions of the university (“nicht weisungsgebunden”). It is different for students and other scientific staff, as their work is usually bound by instructions of the university (“weisungsgebunden”). This means that the university holds usage rights for datasets specifically created by students for theses (Bachelor, Master, Dissertation). Third-party funders such as the DFG do not have any usage rights, but expect researchers to take care of this themselves. There are no regulations specifying to what extent researchers should share their data, but funders strongly recommend sharing as much as possible.

By default, if you make a work publicly available, you retain the copyright to that work and all rights that this gives you over it. Anyone wishing to re-use that work must seek to license the right to do so from you, or open themselves to the possibility of a lawsuit for infringing on your copyright.


Licenses

What are 'Usage Restricting' Licenses?

Usage restricting licenses seek to affirmatively protect users or others affected by the use of the work by placing specific restrictions on its use. This curtails freedom 0, the freedom to use software 'for any purpose', by prohibiting the use of the software, or other system, for unethical purposes. Both 'Ethical Source' and 'Responsible AI' licenses are examples of this approach: they restrict the uses to which licensees can put the software or machine learning systems licensed in this fashion. Consequently, under the classical definitions of free and open source software from the FSF and OSI, these licenses would not be considered free or open source licenses. They do, however, generally resemble them in the other three criteria of the definition. Their merits versus conventional open source licenses have been the subject of some debate, and their adoption has thus far been relatively limited.

Even an attribution requirement (the BY in CC-BY) can in some cases be considered a usage restriction. For example, the Debian project found the Common Public Attribution License (CPAL) to be incompatible with their free-software guidelines for this reason, whilst it is approved by the Open Source Initiative. In the case of academic works, attribution requirements can serve to reinforce the citation convention with the force of copyright law.

Where to find open licenses for different types of work

License enforcement

A number of successful legal cases have been brought in defence of the terms of copyleft licenses, obliging the parties abusing these terms to appropriately release their code. However, such violations can be hard to discover, as it is not immediately obvious from looking at a black box proprietary end product whether copyleft code has been used in it.

Organisations which take legal action in defence of free software, and which can provide information and resources for anyone else seeking to do the same, include:

Contributor License Agreements

The holder of the copyright on a copyleft project can still re-license that project or dual-license it under a different license, for example to grant exclusive rights to commercially distribute that project with proprietary extensions, or to make future versions proprietary. In a large community-developed project, this would require the consent of all contributors, as they each own the copyright to their contributions. To get around this, some companies that develop copyleft projects and commercially license proprietary extensions to them ask their contributors to sign contributor license agreements (CLAs). A CLA may assign the contributor's copyright to the company, or include other provisions so that the company can legally dual-license the project.

How and where to add licenses

Wherever you share your project, it is likely to be organised in a hierarchy of directories. Place a plain text file containing the license in the top-level directory of your project. If it is a git project shared on a git forge such as GitHub or GitLab, using a standard file name like LICENSE will allow your license to be picked up by the host and displayed on your project. If the license that you have used has a standardised short name from SPDX, then this will be displayed as a small icon on your project's home page by these hosts. It can also be useful to include license information in the form of standard strings at the top of each text file in your project. This is especially true if your project contains material that is licensed in multiple different ways, or if a part of your project is being used in someone else's project which uses a different (compatible) license. Useful tools which automate this are available from REUSE, a project from the FSFE, which developed the spec.
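As a rough illustration of the "standard strings at the top of each text file" idea, the sketch below builds a two-line comment header from the `SPDX-FileCopyrightText` and `SPDX-License-Identifier` tags and prepends it to a file that does not yet declare a license. The function names `spdx_header` and `add_header` are hypothetical helpers written for this example; they are not part of the REUSE tooling, which handles many more cases (binary files, other comment styles, separate license files) and is the better choice for a real project.

```python
from pathlib import Path


def spdx_header(license_id: str, holder: str, year: int, prefix: str = "#") -> str:
    """Build a two-line comment header from the two SPDX tags.

    `prefix` is the comment marker of the target file type ("#" suits
    Python, shell, YAML and similar formats).
    """
    return (
        f"{prefix} SPDX-FileCopyrightText: {year} {holder}\n"
        f"{prefix} SPDX-License-Identifier: {license_id}\n"
    )


def add_header(path: Path, header: str) -> bool:
    """Prepend `header` to the file unless it already declares a license.

    Returns True if the file was modified, False if it was left untouched.
    """
    text = path.read_text(encoding="utf-8")
    if "SPDX-License-Identifier:" in text:
        return False  # already tagged; do not add a second header

    if text.startswith("#!"):
        # Keep a shebang as the very first line, then add the header.
        first_line, _, rest = text.partition("\n")
        text = f"{first_line}\n{header}\n{rest}"
    else:
        text = f"{header}\n{text}"

    path.write_text(text, encoding="utf-8")
    return True
```

Calling `add_header(Path("analysis.py"), spdx_header("MIT", "Jane Doe", 2024))` would tag a single file; the file name, the holder and the choice of `MIT` are placeholders, and the license identifier should be the SPDX short name of whatever license sits in your top-level LICENSE file.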

Task

Go to the Creative Commons License Chooser and select a license. Then, go to your OSF project and add a LICENSE file.