meerkat.datasets.audioset package

Module contents

build_audioset_dp(dataset_dir: str, splits: Optional[List[str]] = None, audio_column: bool = True, overwrite: bool = False) → Dict[str, DataPanel]

Build DataPanels for the AudioSet dataset downloaded to dataset_dir. By default, the resulting DataPanels are written to dataset_dir under the filenames “audioset_examples.mk” and “audioset_labels.mk”. If these files already exist and overwrite is False, the DataPanels are not rebuilt and are loaded from disk instead.

Parameters
  • dataset_dir – The directory where the dataset is stored

  • splits – A list of splits to include. Defaults to [“eval_segments”]. Other splits: “balanced_train_segments”, “unbalanced_train_segments”.

  • audio_column (bool) – Whether to include an AudioColumn. Defaults to True.

  • overwrite (bool) – Whether to overwrite existing DataPanels saved to disk. Defaults to False.
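
The load-or-rebuild behavior controlled by overwrite can be illustrated with a minimal sketch. This is not Meerkat's implementation: the helper build_or_load, the JSON cache file, and fake_build are hypothetical stand-ins chosen so the example runs with only the standard library.

```python
import json
import os
import tempfile


def build_or_load(path, build_fn, overwrite=False):
    """Hypothetical helper: reuse the cached file unless overwrite is set."""
    if os.path.exists(path) and not overwrite:
        # Cache hit: skip the expensive build and read from disk.
        with open(path) as f:
            return json.load(f)
    # Cache miss (or forced rebuild): build, then write the result to disk.
    data = build_fn()
    with open(path, "w") as f:
        json.dump(data, f)
    return data


calls = []  # records how many times the build step actually ran


def fake_build():
    # Stand-in for an expensive dataset build.
    calls.append(1)
    return {"build_number": len(calls)}


with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "examples.json")
    first = build_or_load(cache, fake_build)                   # builds and saves
    second = build_or_load(cache, fake_build)                  # loads from disk
    third = build_or_load(cache, fake_build, overwrite=True)   # rebuilds
```

Here the second call returns the saved result without invoking fake_build, while the third call rebuilds because overwrite is True.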

build_ontology_dp(dataset_dir: str) → Dict[str, DataPanel]

Build DataPanels from the ontology.json file.

Parameters

dataset_dir – The directory where the ontology.json file is stored
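
As a rough illustration of the kind of data involved, the sketch below flattens ontology-style entries into parent/child rows using plain Python. It assumes each entry carries an "id" and a "child_ids" list; the sample entries, the "/x/..." identifiers, the parent_id/child_id field names, and the helper ontology_to_relations are hypothetical, and this is not necessarily how build_ontology_dp structures its output.

```python
import json

# Hypothetical sample in the assumed shape of ontology.json entries.
ontology_json = """
[
  {"id": "/x/animal", "name": "Animal", "child_ids": ["/x/dog", "/x/cat"]},
  {"id": "/x/dog", "name": "Dog", "child_ids": ["/x/bark"]},
  {"id": "/x/cat", "name": "Cat", "child_ids": []},
  {"id": "/x/bark", "name": "Bark", "child_ids": []}
]
"""


def ontology_to_relations(entries):
    """Emit one row per (parent, child) link found in the entries."""
    return [
        {"parent_id": entry["id"], "child_id": child_id}
        for entry in entries
        for child_id in entry.get("child_ids", [])
    ]


entries = json.loads(ontology_json)
relations = ontology_to_relations(entries)
```

A table of such rows makes it straightforward to look up the direct children of any category.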

find_submids(id: Union[List[str], str], relations: Optional[DataPanel] = None, dataset_dir: Optional[str] = None) → List[str]

Returns a list of IDs of all subcategories of an audio category.

Parameters
  • id – ID or list of IDs for which to find the subcategories

  • relations – A DataPanel built from the ontology.json file.

  • dataset_dir – Alternatively, the directory where the ontology.json file is stored, from which the DataPanel is constructed
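
Collecting all subcategories amounts to walking a parent/child hierarchy. The following is a minimal sketch of that idea using plain tuples and a breadth-first traversal, not Meerkat's implementation. The function find_descendants, the "/x/..." identifiers, and the choice to exclude the input IDs from the result are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def find_descendants(ids, relations):
    """Return every ID reachable from `ids` via (parent, child) links."""
    if isinstance(ids, str):
        ids = [ids]  # accept a single ID or a list of IDs

    # Index children by parent for fast lookup.
    children = defaultdict(list)
    for parent, child in relations:
        children[parent].append(child)

    seen = set()
    found = []
    queue = deque(ids)
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in seen:  # guard against duplicates and cycles
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


# Hypothetical hierarchy: animal -> dog -> bark, animal -> cat.
relations = [
    ("/x/animal", "/x/dog"),
    ("/x/animal", "/x/cat"),
    ("/x/dog", "/x/bark"),
]

all_under_animal = find_descendants("/x/animal", relations)
under_dog_and_cat = find_descendants(["/x/dog", "/x/cat"], relations)
under_leaf = find_descendants("/x/bark", relations)
```

The traversal visits direct children before deeper levels, so nested subcategories appear after their parents in the result.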