Stuck Data Mining Again (Lodi)
Things got bad, and things got worse. I guess you will know the tune.
Don't Silo Me In
with apologies to Cole Porter and Robert Fletcher
Shreyas Cholia
I interview Shreyas Cholia, currently at the Lawrence Berkeley National Laboratory in Berkeley, California. Topics we spoke about included: data lifecycles, edge computing for data firehoses, provenance, standards, broad versus detailed domain vocabularies, scope for common APIs, and identifier leveling.
Patrick Huck
I interview Patrick Huck, currently staff on the Materials Project at the Lawrence Berkeley National Laboratory in Berkeley, California, United States. We talk about choices and considerations in implementing FAIR.
FAIR Implementation Profile (FIP) Ontology
A FAIR Implementation Profile (FIP) is a way to communicate how you're implementing the FAIR principles. It lets people and communities of practice share the choices they've made in addressing the principles, the considerations behind those choices, the enabling resources they use, the challenges they face, and what they're planning. It also lets them associate themselves with a community of practice, so that people in similar fields can adopt similar answers to the questions of how to implement FAIR.
R1.3: (Meta)data meet domain-relevant community standards
FAIR principle R1.3: (meta)data meet domain-relevant community standards. An overview of the fundamentals of relevance and ranking in your search for standards.
R1.2: Metadata and data are associated with detailed provenance
The 14th of the 15 FAIR principles, R1.2: metadata and data are associated with detailed provenance. A dive into the World Wide Web Consortium (W3C) Provenance Data Model -- what are the different parts of provenance, and what are some terms that can be used in order to manage it?
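As a rough illustration of the parts of provenance named in the W3C PROV Data Model, the sketch below records two entities, an activity, and an agent, and links them with a few PROV relations. The `ex:` identifiers and the `sources_of` helper are hypothetical, chosen only for this example; this is not code from the episode.

```python
# Minimal sketch of W3C PROV core terms as plain (subject, relation, object) triples.
# All "ex:" identifiers below are hypothetical examples.

# The three core PROV types: Entity, Activity, Agent.
types = {
    "ex:raw.csv": "prov:Entity",
    "ex:clean.csv": "prov:Entity",
    "ex:cleaning-run": "prov:Activity",
    "ex:alice": "prov:Agent",
}

# A few PROV relations connecting them.
statements = [
    ("ex:cleaning-run", "prov:used", "ex:raw.csv"),
    ("ex:clean.csv", "prov:wasGeneratedBy", "ex:cleaning-run"),
    ("ex:cleaning-run", "prov:wasAssociatedWith", "ex:alice"),
    ("ex:clean.csv", "prov:wasAttributedTo", "ex:alice"),
    ("ex:clean.csv", "prov:wasDerivedFrom", "ex:raw.csv"),
]


def sources_of(entity):
    """Return the entities that `entity` was directly derived from."""
    return [
        obj
        for subj, rel, obj in statements
        if subj == entity and rel == "prov:wasDerivedFrom"
    ]


print(sources_of("ex:clean.csv"))
```

In practice these statements would more likely be serialized in an RDF or JSON format than kept as Python tuples; the tuples are only meant to show which terms connect which kinds of things.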
R1.1: (Meta)data are released with a clear and accessible data usage license
FAIR principle R1.1: (meta)data are released with a clear and accessible data usage license. Overview of Creative Commons licenses for data and various licenses (BSD, MIT, GPL, oh my!) for code.
R1: (Meta)data are richly described with a plurality of accurate and relevant attributes
The 12th of the 15 FAIR principles, R1: metadata and data are richly described with a plurality of accurate and relevant attributes.
I3: (Meta)data include qualified references to other (meta)data
It's more powerful when our references are indexed by nature rather than by number. On the 11th of the 15 FAIR principles, I3: metadata and data include qualified references to other metadata and data.
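One possible reading of "qualified" is sketched below: each reference states how the two resources relate, instead of being a bare entry in a list. The DOIs and the `targets` helper are hypothetical, and the relation names are borrowed from the DataCite `relationType` vocabulary as an assumption; the episode may have other vocabularies in mind.

```python
# Sketch: an unqualified reference list versus qualified references.
# The identifiers are hypothetical placeholders.

# Unqualified: the reader only learns that some link exists.
unqualified = ["doi:10.1234/raw", "doi:10.1234/paper"]

# Qualified: each reference says what kind of link it is.
qualified = [
    {"relation": "IsDerivedFrom", "target": "doi:10.1234/raw"},
    {"relation": "IsSupplementTo", "target": "doi:10.1234/paper"},
]


def targets(references, relation):
    """Return the targets of all references that carry the given relation."""
    return [ref["target"] for ref in references if ref["relation"] == relation]


print(targets(qualified, "IsDerivedFrom"))
```

With the qualified form, a machine can answer "what was this derived from?" directly, which the bare list cannot support.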
I2: (Meta)data use vocabularies that follow the FAIR principles
The 10th of the 15 FAIR principles, I2: metadata and data use vocabularies that follow the FAIR principles.
I1: (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
About the 9th of the 15 FAIR principles, I1: (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation. You need controlled term sets, vocabularies, ontologies, thesauri, whatever you want to call them, ideally with globally unique, persistent, resolvable identifiers. And apart from these controlled vocabularies, you need actual models, well-defined frameworks, to describe and structure the metadata according to those vocabularies.
A2: Metadata are accessible, even when the data are no longer available
Data may be, or become, inaccessible by design, or on request, or by accident. While it was accessible, it may have been used by others. If someone has a reference to data by ID, can they minimally understand the nature and provenance of the data?
A1.2: The protocol allows for authentication and authorisation where necessary
FAIR does not mean open. You're certainly allowed to authenticate and to authorize. HTTP is pretty great for this.
A1.1: The protocol is open, free and universally implementable
Open -- free as in speech; free -- free as in beer; and universally implementable -- NOT free as in puppies.
A1: (Meta)data are retrievable by their identifier using a standardized communication protocol
So, you've identified a digital resource. Now it's time to retrieve it and/or its metadata record. TL;DR - Use HTTP(S) if possible.
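A minimal sketch of the "use HTTP(S)" advice, assuming the identifier is a DOI resolved through doi.org with content negotiation: the request below asks for a machine-readable metadata record rather than a landing page. The DOI is a hypothetical placeholder, so the request is only built, not sent, and `metadata_request` is a name invented for this example.

```python
# Sketch: prepare an HTTPS request asking a resolver for metadata about an identifier.
# The DOI used below is a hypothetical placeholder, so nothing is sent over the network.
from urllib.request import Request


def metadata_request(doi):
    """Build a GET request for a metadata record via the doi.org resolver."""
    return Request(
        f"https://doi.org/{doi}",
        # Ask for CSL JSON metadata instead of the HTML landing page.
        headers={"Accept": "application/vnd.citationstyles.csl+json"},
    )


req = metadata_request("10.1234/example")
print(req.get_method(), req.full_url)
```

With a real DOI, passing the request to `urllib.request.urlopen` would send it; whether a given media type is returned depends on the registration agency behind the DOI.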
F4: (Meta)data are registered or indexed in a searchable resource
Increasing leverage: the ratio of machine action to user action. Indexing as leverage via sorting.
F3: Metadata clearly and explicitly include the identifier of the data they describe
Literature references with and without DOIs. Tables of data in articles with and without unique identifiers in each row for what that row is about. The magic of including identifiers in the metadata you share.
F2: Data are described with rich metadata
"intrinsic" vs "extrinsic" metadata. Other-than-technical interoperability. Qualification vs. "Quality". Feature detection. Search-engine "rich results".
F1: (Meta)data have globally unique, persistent identifiers
Today, we'll be talking about the first of the FAIR principles, F1: (meta)data are assigned globally unique and persistent identifiers.
What to expect from this podcast
A weekly podcast for scientific researchers getting far with FAIR.