Machine-Centric Science | Transcript: FAIR Implementation Profile (FIP) Ontology

July 15, 2022 • 10 Minutes

FAIR Implementation Profile (FIP) Ontology

A FAIR implementation profile is a way to communicate how you're implementing the FAIR principles. It's a way for people, communities of practice, to share how they're addressing the FAIR principles, the choices that they've made, or considerations they've made about those choices, what enabling resources they're using to make those choices or what challenges they have, what they're planning. Also, to associate themselves with a community of practice so that people can perhaps, in similar fields, adopt similar answers to some of the questions about how do you implement FAIR.

Hello, and welcome to Machine-Centric Science. My name is Donny Winston, and I'm here to talk about the FAIR principles in practice for scientists who want to compound their impacts, not their errors. Today, we're going to be talking about the FAIR Implementation Profile ontology.

So the nice thing about an ontology, what an ontology does, is it allows you to have qualified relationships among concepts. So the concepts here are, first, there are the FAIR principles. The other concept here is a FAIR Implementation Profile Question. There are a bunch of Questions that each refer to the principles. So for example, principle F1: (meta)data are assigned a globally unique and persistent identifier. In the FAIR Implementation Profile ontology, there's a question, F1-D, which is, What globally unique, persistent, resolvable identifiers do you use for datasets?

There's also one for metadata records: What globally unique, persistent, resolvable identifiers do you use for metadata records? So it's getting very specific about what resources you're using, how you are actually implementing these things. Another concept in the FAIR Implementation Profile ontology is a so-called declaration.

So this is an expression of how you address a question. Your declaration might be a no-choice declaration: no choice has been made yet about this question. You might also have a declaration where you're declaring the current use of some resource, the planned development of some resource, the planned use of a resource, or perhaps the planned replacement of a resource you're currently using to address the question. These declarations refer to the questions. Another named concept, which you can attach to any declaration, is considerations: any considerations that have led to a given declaration, whatever made something a choice or a challenge. Finally, what are the declarations about? If you're currently using or planning on using something, that concept in the ontology is called a FAIR-Enabling Resource.

So this is something that's either to be developed or available. And it's some resource that actually lets you address the question. So, for example, for a related question of, What identifiers are you using for data? You might declare that you're using DOIs, and you might point to DOIs as the enabling resource, and say that you're currently using those or planning to use them, et cetera.
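To make the concepts so far concrete, here is a minimal sketch in Python of questions, declarations, considerations, and enabling resources. It is an illustration only: the class names, field names, and the consideration text are hypothetical and are not the ontology's actual identifiers.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class DeclarationKind(Enum):
    """The kinds of declaration mentioned in the episode (names are illustrative)."""
    NO_CHOICE = "no choice"
    CURRENT_USE = "current use"
    PLANNED_USE = "planned use"
    PLANNED_DEVELOPMENT = "planned development"
    PLANNED_REPLACEMENT = "planned replacement"


@dataclass(frozen=True)
class Question:
    code: str       # e.g. "F1-D"
    principle: str  # the FAIR principle the question refers to, e.g. "F1"
    text: str


@dataclass(frozen=True)
class EnablingResource:
    label: str
    available: bool  # False means "to be developed"


@dataclass
class Declaration:
    question: Question
    kind: DeclarationKind
    resource: Optional[EnablingResource] = None
    considerations: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # A no-choice declaration names no resource; every other kind must name one.
        if (self.kind is DeclarationKind.NO_CHOICE) != (self.resource is None):
            raise ValueError("resource must be set exactly when a choice is declared")


# The DOI example from the episode, expressed with the hypothetical classes above.
f1_d = Question(
    "F1-D",
    "F1",
    "What globally unique, persistent, resolvable identifiers do you use for datasets?",
)
doi = EnablingResource("DOI", available=True)
declaration = Declaration(
    f1_d,
    DeclarationKind.CURRENT_USE,
    doi,
    considerations=["placeholder consideration text (made up for this sketch)"],
)
```

The check in `__post_init__` is a design choice of this sketch, not something the ontology is known to enforce: it simply keeps a "challenge" (no choice yet) distinct from a choice that points at a resource.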

The last concept is the implementation community: each declaration can be declared by some implementation community. So you can declare that you're part of some community, of some domain, that might have a data steward. And the whole idea of having this structured ontology of concepts with these qualified relations, where you talk about the FAIR principles, specific questions related to the principles, declarations about how you address a question, either as a challenge or a choice, the actual enabling resources that are the targets of those declarations, and the implementation communities those declarations are connected to, the net effect of all of this is that you can populate a model. You can have a bunch of declarations by different people in different communities, referring to different resources, and you can essentially search this knowledge graph and interrogate it to see what makes sense for you to use. What are people in your field using? What challenges do they have, et cetera?
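As a rough sketch of what interrogating a populated model could look like, the Python below counts which resources are in current use for a question and lists which communities still face a challenge. Everything in it is an assumption for illustration: the community names and the declarations are made-up placeholder data, and a flat list stands in for a real knowledge graph.

```python
from collections import Counter, defaultdict
from typing import Dict, List, NamedTuple, Optional, Tuple


class Declaration(NamedTuple):
    community: str           # hypothetical implementation community
    question: str            # FIP question code, e.g. "F1-D"
    kind: str                # "current use", "planned use", "no choice", ...
    resource: Optional[str]  # enabling resource label, or None for no choice


# Made-up example data, not real survey results.
DECLARATIONS = [
    Declaration("community-a", "F1-D", "current use", "DOI"),
    Declaration("community-b", "F1-D", "current use", "DOI"),
    Declaration("community-c", "F1-D", "planned use", "Handle"),
    Declaration("community-a", "A2", "no choice", None),
]


def resources_in_use(declarations: List[Declaration], question: str) -> List[Tuple[str, int]]:
    """What are people currently using to address this question?"""
    counts = Counter(
        d.resource
        for d in declarations
        if d.question == question and d.kind == "current use"
    )
    return counts.most_common()


def open_challenges(declarations: List[Declaration]) -> Dict[str, List[str]]:
    """Which questions have no-choice declarations, and from which communities?"""
    challenges = defaultdict(list)
    for d in declarations:
        if d.kind == "no choice":
            challenges[d.question].append(d.community)
    return dict(challenges)


print(resources_in_use(DECLARATIONS, "F1-D"))
print(open_challenges(DECLARATIONS))
```

With real declarations in place of the placeholder list, the same two questions from the episode ("what are people in your field using?" and "what challenges do they have?") become simple filters over structured data.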

And this can be a lot more useful than just a folder of documents -- it can be very structured. So one of my goals now, is I'm beginning to walk through each of these questions and I would like to construct and share a populated model of people's articulations -- in the terminology of the ontology, declarations -- of choices they've made or challenges they face, where challenge would be a no-choice declaration, with regard to addressing each question, as well as the considerations that you associate with any choice or challenge.

In previous episodes, I've gone over each of the FAIR principles. What I'd like to do now is briefly read aloud all of the questions in this FAIR Implementation Profile ontology. I'll also include them in the show notes. So these are the concrete things about which you can make a choice or face a challenge. And there are certain resources that might help you implement them or not. But this is sort of central to evaluating whether your outputs, your process, your research life cycle, and all of the digital objects that are part of that, are FAIR in practice. So I'm going to go ahead and read the questions aloud. Some of them have to do with data or metadata and they separate them. I'll read them all.

So, F1-D. What globally unique, persistent, resolvable identifiers do you use for datasets? F1-MD for metadata, What globally unique, persistent, resolvable identifiers do you use for metadata records?

FIP question F2: Which metadata schemas do you use for findability?

Question F3. What is the technology that links the persistent identifiers of your data to the metadata description? F4-D: In which search engines are your datasets indexed? F4-MD: In which search engines are your metadata records indexed?

Question A1.1-D: Which standardized communication protocol do you use for datasets? A1.1-MD: Which standardized communication protocol do you use for metadata records? Question A1.2-D: Which authentication and authorization technique do you use for datasets? A1.2-MD: Which authentication and authorization technique do you use for metadata records?

A2: Which metadata longevity plan do you use?

Going to I1-D, Which knowledge representation languages (allowing machine interoperation) do you use for datasets? I1-MD, Which knowledge representation languages -- again, allowing machine interoperation -- do you use for metadata records? I2-D: which structured vocabularies do you use to encode your data sets?

I2-MD, which structured vocabularies do you use to annotate your metadata records? I3-D, Which models, schema(s) do you use for your datasets? And then I3-MD, Which models, or schemas, do you use for your metadata records?

Going to the R principles, R1.1-D: which usage license do you use for your data sets? R1.1-MD, which usage license do you use for your metadata records? R1.2-D, Which metadata schemas do you use for describing the provenance of your datasets? And R1.2-MD, Which metadata schemas do you use for describing the provenance of your metadata records?

So one thing to note here is you might have different ways of addressing all of the principles for your datasets as well as for the metadata records, which point to your data sets. In general, you want your metadata records to be open. They're the things that are going to be indexed, and so you might have different schemas for your data sets versus your records. In general, your data sets are probably going to be a lot more domain-specific in how you encode things. But with metadata, things could be considerably more general. And you might make choices one way or the other.
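One way to see the dataset/metadata split is to treat the question codes read out above as data. The sketch below, in Python, groups them by their `-D` or `-MD` suffix; the helper name `split_code` and the "not split" label are assumptions of this sketch, while the codes themselves are the ones read in this episode.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# The FIP question codes as read aloud in this episode.
QUESTION_CODES = [
    "F1-D", "F1-MD", "F2", "F3", "F4-D", "F4-MD",
    "A1.1-D", "A1.1-MD", "A1.2-D", "A1.2-MD", "A2",
    "I1-D", "I1-MD", "I2-D", "I2-MD", "I3-D", "I3-MD",
    "R1.1-D", "R1.1-MD", "R1.2-D", "R1.2-MD",
]

TARGETS = {"D": "datasets", "MD": "metadata records"}


def split_code(code: str) -> Tuple[str, str]:
    """Split a code like 'A1.1-MD' into its principle and what it is asked about."""
    principle, _, suffix = code.partition("-")
    # Codes without a suffix (F2, F3, A2) are asked once rather than per target.
    return principle, TARGETS.get(suffix, "not split")


def group_by_target(codes: List[str]) -> Dict[str, List[str]]:
    groups = defaultdict(list)
    for code in codes:
        principle, target = split_code(code)
        groups[target].append(principle)
    return dict(groups)


groups = group_by_target(QUESTION_CODES)
for target, principles in groups.items():
    print(f"{target}: {', '.join(principles)}")
```

Keeping the two groups separate makes it easy to record a different declaration for your datasets than for your metadata records under the same principle.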

So I hope that's helpful to you, to understand these various concepts about FAIR implementation and how to come up with a profile to share with others and compare how you address the FAIR principles.

So these concepts are: questions, declarations, considerations, enabling resources, and implementation communities. I will include a link to this in the show notes. And, yeah, hopefully this has been helpful. I plan to go into each question in a bit more detail and talk about actual FAIR-Enabling Resources, available or yet to be developed, that people might declare in addressing each of these questions.

All right folks. That's it for today. I'm Donny Winston and I hope you join me again next time for Machine-Centric Science.