Stuck Data Mining Again (Lodi)

Things got bad, and things got worse. I guess you will know the tune.

Don't Silo Me In

with apologies to Cole Porter and Robert Fletcher

Shreyas Cholia

I interview Shreyas Cholia, currently at the Lawrence Berkeley National Laboratory in Berkeley, California. Topics we spoke about included: data lifecycles, edge computing for data firehoses, provenance, standards, broad versus detailed domain vocabularies, scope for common APIs, and identifier leveling.

Patrick Huck

I interview Patrick Huck, currently staff on the Materials Project at the Lawrence Berkeley National Laboratory in Berkeley, California, United States. We talk about choices and considerations in implementing FAIR.

FAIR Implementation Profile (FIP) Ontology

A FAIR implementation profile (FIP) is a way to communicate how you're implementing the FAIR principles. It lets communities of practice share the choices they've made, the considerations behind those choices, the enabling resources they use, the challenges they face, and what they're planning. It also lets people associate themselves with a community of practice, so that others in similar fields can adopt similar answers to questions about how to implement FAIR.

R1.3: (Meta)data meet domain-relevant community standards

The 15th of the 15 FAIR principles, R1.3: (meta)data meet domain-relevant community standards. An overview of the fundamentals of relevance and ranking in your search for standards.

R1.2: Metadata and data are associated with detailed provenance

The 14th of the 15 FAIR principles, R1.2: metadata and data are associated with detailed provenance. A dive into the World Wide Web Consortium (W3C) Provenance Data Model -- what are the different parts of provenance, and what are some terms that can be used in order to manage it?
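
As a rough illustration of the parts of provenance mentioned above, here is a minimal sketch that records a few statements as plain (subject, relation, object) triples. The relation names (`prov:wasGeneratedBy`, `prov:used`, `prov:wasAssociatedWith`, `prov:wasDerivedFrom`, `prov:wasAttributedTo`) are terms from the W3C PROV model; the `ex:` identifiers and the helper functions `objects` and `sources` are made up for this example and are not part of any library.

```python
# Hypothetical provenance statements. Only the "prov:" relation names come
# from the W3C PROV model; every "ex:" identifier is invented for illustration.
triples = [
    ("ex:cleaned.csv", "prov:wasGeneratedBy", "ex:cleaning-run"),   # entity <- activity
    ("ex:cleaning-run", "prov:used", "ex:raw.csv"),                 # activity -> entity
    ("ex:cleaning-run", "prov:wasAssociatedWith", "ex:alice"),      # activity -> agent
    ("ex:cleaned.csv", "prov:wasDerivedFrom", "ex:raw.csv"),        # entity -> entity
    ("ex:cleaned.csv", "prov:wasAttributedTo", "ex:alice"),         # entity -> agent
]


def objects(subject, relation):
    """Return every object linked to `subject` by `relation`, in listed order."""
    return [o for s, r, o in triples if s == subject and r == relation]


def sources(entity):
    """Follow prov:wasDerivedFrom links transitively from `entity`."""
    found = []
    stack = [entity]
    while stack:
        current = stack.pop()
        for parent in objects(current, "prov:wasDerivedFrom"):
            if parent not in found:
                found.append(parent)
                stack.append(parent)
    return found
```

Calling `sources("ex:cleaned.csv")` walks back to the raw file, and `objects("ex:cleaning-run", "prov:wasAssociatedWith")` answers who ran the activity. A real deployment would more likely use an RDF or PROV library than hand-rolled triples.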

R1.1: (Meta)data are released with a clear and accessible data usage license

The 13th of the 15 FAIR principles, R1.1: (meta)data are released with a clear and accessible data usage license. Overview of Creative Commons licenses for data and various licenses (BSD, MIT, GPL, oh my!) for code.

R1: (Meta)data are richly described with a plurality of accurate and relevant attributes

The 12th of the 15 FAIR principles, R1: metadata and data are richly described with a plurality of accurate and relevant attributes.

I3: (Meta)data include qualified references to other (meta)data

It's more powerful when our references are indexed by nature rather than by number. On the 11th of the 15 FAIR principles, I3: metadata and data include qualified references to other metadata and data.

I2: (Meta)data use vocabularies that follow the FAIR principles

The 10th of the 15 FAIR principles, I2: metadata and data use vocabularies that follow the FAIR principles.

I1: (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation

About the 9th of the 15 FAIR principles, I1: (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation. You need controlled term sets, vocabularies, ontologies, thesauri, whatever you want to call it, ideally having globally unique, persistent, resolvable identifiers. And apart from these controlled vocabularies you need actual models, well-defined frameworks, to describe and structure the metadata according to those controlled vocabularies.

A2. Metadata are accessible, even when the data are no longer available

Data may be, or become, inaccessible by design, or on request, or by accident. While it was accessible, it may have been used by others. If someone has a reference to data by ID, can they minimally understand the nature and provenance of the data?

A1.2: The protocol allows for authentication and authorisation where necessary

FAIR does not mean open. You're certainly allowed to authenticate and to authorize. HTTP is pretty great for this.

A1.1: The protocol is open, free and universally implementable

Open -- free as in speech; free -- free as in beer; and universally implementable -- NOT free as in puppies.

A1: (Meta)data are retrievable by their identifier using a standardized communication protocol

So, you've identified a digital resource. Now it's time to retrieve it and/or its metadata record. TL;DR - Use HTTP(S) if possible.
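
To make "use HTTP(S)" a little more concrete, here is a minimal sketch that builds (but does not send) a GET request for an identifier that is itself an HTTPS URL. The URL is a placeholder on the reserved `example.org` domain, the helper name `build_request` is hypothetical, and asking for the metadata record through an `Accept: application/ld+json` header is an assumption about how a given server might work, not something every resolver supports.

```python
from urllib.request import Request


def build_request(identifier_url: str, want_metadata: bool = False) -> Request:
    """Build a GET request for a resource named by an HTTP(S) identifier.

    When `want_metadata` is True, an Accept header asks for a JSON-LD
    metadata record instead of the data itself (assumes the server
    supports content negotiation).
    """
    headers = {}
    if want_metadata:
        headers["Accept"] = "application/ld+json"
    return Request(identifier_url, headers=headers, method="GET")


# Placeholder identifier; example.org is a reserved documentation domain.
request = build_request("https://example.org/dataset/42", want_metadata=True)
```

Passing `request` to `urllib.request.urlopen` would actually perform the retrieval; it is left out here so the sketch runs without network access.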

F4: (Meta)data are registered or indexed in a searchable resource

Increasing leverage: the ratio of machine action to user action. Indexing as leverage via sorting.

F3: Metadata clearly and explicitly include the identifier of the data they describe

Literature references with and without DOIs. Tables of data in articles with and without unique identifiers in each row for what that row is about. The magic of including identifiers in the metadata you share.
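
A small sketch of the idea: a metadata record that carries the identifier of the data it describes can be matched to that data by a machine, while a record without one cannot. The field names, the placeholder identifier, and the helper `describes` are all hypothetical and chosen only for illustration.

```python
# Hypothetical metadata records; the identifier is a placeholder on the
# reserved example.org domain, and the field names are not from any standard.
with_identifier = {
    "title": "Example measurements",
    "identifier": "https://example.org/dataset/42",
    "creator": "A. Researcher",
}
without_identifier = {
    "title": "Example measurements",
    "creator": "A. Researcher",
}


def describes(metadata: dict, data_id: str) -> bool:
    """Return True only if the record explicitly names `data_id`."""
    return metadata.get("identifier") == data_id
```

Here `describes(with_identifier, "https://example.org/dataset/42")` is true, while the second record gives a machine nothing to match on beyond a title.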

F2: Data are described with rich metadata

"intrinsic" vs "extrinsic" metadata. Other-than-technical interoperability. Qualification vs. "Quality". Feature detection. Search-engine "rich results".

F1: (Meta)data have globally unique, persistent identifiers

Today, we'll be talking about the first of the FAIR principles, F1: (Meta)data are assigned globally unique and persistent identifiers.

What to expect from this podcast

A weekly podcast for scientific researchers getting far with FAIR.