# Croissant (metadata format)


**Croissant** is a metadata format designed to support the sharing of datasets for machine learning applications. It is a platform-agnostic schema used to standardize metadata in data repositories such as Hugging Face, Kaggle, Dataverse and OpenML.

## Structure

Croissant builds upon schema.org, primarily uses JSON-LD, and divides metadata into four “layers”: *Dataset Metadata*, *Resource*, *Structure* and *Semantic*:

- The *Dataset Metadata* layer constrains which schema.org properties should be used and adds further properties, linking the resources (*files*) of the dataset to general metadata such as licensing and citation information.

- The *Resource* layer describes individual files and sets of files using two new classes, *FileObject* and *FileSet*. A *FileSet* may be, for example, a collection of related images.

- The *Structure* layer specifies how the files are organized in the dataset. A *RecordSet* class describes how the content of the resources is presented, a configuration that can vary considerably between modalities. This specification facilitates interoperability between datasets.

…
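As a rough illustration of how the first three layers could sit together in one JSON-LD document, the following Python sketch assembles a simplified Croissant-style description. It is not a validated Croissant file: the dataset name, URLs and identifiers are hypothetical, the `@context` is heavily abbreviated, and the `cr:`-prefixed property names (`cr:recordSet`, `cr:field`, `cr:includes`, `cr:dataType`) are assumptions that should be checked against the Croissant specification.

```python
import json


def build_croissant_sketch() -> dict:
    """Assemble a simplified, illustrative Croissant-style JSON-LD document."""
    # Abbreviated context (assumption): real Croissant files use a longer one.
    context = {
        "@vocab": "https://schema.org/",
        "sc": "https://schema.org/",
        "cr": "http://mlcommons.org/croissant/",
    }

    # Resource layer: one archive (FileObject) and the images inside it (FileSet).
    resources = [
        {
            "@type": "cr:FileObject",
            "@id": "images.zip",
            "contentUrl": "https://example.org/toy/images.zip",  # hypothetical URL
            "encodingFormat": "application/zip",
        },
        {
            "@type": "cr:FileSet",
            "@id": "image-files",
            "containedIn": {"@id": "images.zip"},
            "cr:includes": "*.jpg",  # assumed property name
            "encodingFormat": "image/jpeg",
        },
    ]

    # Structure layer: a RecordSet whose single field is read from the FileSet.
    record_sets = [
        {
            "@type": "cr:RecordSet",
            "@id": "images",
            "cr:field": [
                {
                    "@type": "cr:Field",
                    "@id": "images/image",
                    "cr:dataType": "sc:ImageObject",
                }
            ],
        }
    ]

    # Dataset Metadata layer: general properties that tie everything together.
    return {
        "@context": context,
        "@type": "sc:Dataset",
        "name": "toy_images",  # hypothetical dataset
        "license": "https://creativecommons.org/licenses/by/4.0/",
        "distribution": resources,
        "cr:recordSet": record_sets,
    }


def summarize_layers(doc: dict) -> dict:
    """List the resources and record sets declared in the sketch document."""
    return {
        "dataset": doc["name"],
        "resources": [(r["@type"], r["@id"]) for r in doc["distribution"]],
        "record_sets": [rs["@id"] for rs in doc["cr:recordSet"]],
    }


if __name__ == "__main__":
    doc = build_croissant_sketch()
    print(json.dumps(doc, indent=2))
    print(summarize_layers(doc))
```

Building the document as a plain dictionary keeps the sketch free of third-party dependencies. A real workflow would more likely generate and validate the file with dedicated Croissant tooling.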

*Source: [Wikipedia](https://en.wikipedia.org/wiki/Croissant_%28metadata_format%29)*

---

## Metadata

- **URL:** https://wpsearchai.com/croissant-metadata-format/
- **Published:** 2026-01-28T18:48:18+00:00
- **Modified:** 2026-01-28T18:48:18+00:00
- **Author:** admin
- **Categories:** Machine learning
