meerkat.contrib.wilds package
WILDS integration for Meerkat.
- class WILDSInputColumn(dataset_name: str = 'fmow', version: str | None = None, root_dir: str | None = None, split: str | None = None, use_transform: bool = True, **kwargs)
Bases: AbstractColumn
- get_y_column()
Get a NumpyArrayColumn holding the targets for the dataset.
Warning: `WildsDataset`s may remap indexes in arbitrary ways, so it is important not to access the underlying data structures directly. Instead, rely on the `y_array` and `metadata_array` properties, which are universal across WILDS datasets.
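The warning above can be illustrated with a toy stand-in. The sketch below is not WILDS or Meerkat code: `ToyWildsDataset`, `_labels_on_disk`, and `_index_map` are hypothetical names, and plain lists stand in for the arrays a real dataset would expose. It only shows why an index-aligned property such as `y_array` is safer than reaching into internal storage once indexes have been remapped.

```python
class ToyWildsDataset:
    """Hypothetical stand-in for a WILDS dataset; not part of WILDS or Meerkat."""

    def __init__(self):
        # Labels in on-disk order (an internal detail callers should not rely on).
        self._labels_on_disk = ["cat", "dog", "bird", "fish"]
        # Internal remapping from dataset index to on-disk position.
        self._index_map = [2, 0, 3, 1]

    @property
    def y_array(self):
        # Public, index-aligned view: y_array[i] is the target of example i.
        return [self._labels_on_disk[j] for j in self._index_map]

    def __getitem__(self, i):
        j = self._index_map[i]
        return f"input_{j}", self._labels_on_disk[j]


dataset = ToyWildsDataset()

# The public property always agrees with what indexing returns.
aligned = all(dataset.y_array[i] == dataset[i][1] for i in range(4))

# Reading the internal storage by dataset index gives the wrong target.
mismatches = [i for i in range(4) if dataset._labels_on_disk[i] != dataset[i][1]]

print(aligned, mismatches)
```

In this toy setup every index is mismatched when the internal list is read directly, while the index-aligned property stays consistent with `dataset[i]`.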
- get_wilds_datapanel(dataset_name: str, root_dir: str, version: str | None = None, column_names: List[str] | None = None, info: DatasetInfo | None = None, split: str | None = None, use_transform: bool = True, include_raw_input: bool = True)
Get a DataPanel that holds a WILDSInputColumn alongside NumpyArrayColumns for targets and metadata.
Example: Run inference on the dataset and store predictions alongside the data.

.. code-block:: python

    import torch

    dp = get_wilds_datapanel("fmow", root_dir="/datasets/", split="test")
    model = ...  # get the model
    model.to(0).eval()  # move to GPU 0 and switch to inference mode

    @torch.no_grad()
    def predict(batch: dict):
        # Class probabilities for the batch, computed on GPU 0.
        out = torch.softmax(model(batch["input"].to(0)), dim=-1)
        return {"pred": out.cpu().numpy().argmax(axis=-1)}

    dp = dp.update(function=predict, batch_size=128, is_batched_fn=True)

- Parameters:
dataset_name (str) – dataset name, e.g., “fmow”.
version (str, optional) – dataset version number, e.g., ‘1.0’. Defaults to the latest version.
root_dir (str) – the directory where the WILDS dataset is downloaded. See https://wilds.stanford.edu/ for download instructions.
split (str, optional) – the dataset split to load, e.g., “test”. Defaults to None.
use_transform (bool, optional) – Whether to apply the transform from the WILDS example directory on load. Defaults to True.
column_names (List[str], optional) – [description]. Defaults to None.
info (DatasetInfo, optional) – [description]. Defaults to None.
include_raw_input (bool, optional) – whether to include a column holding the input without the transform applied, which is useful for visualizing images. Defaults to True.
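The `predict` function in the example above takes a dictionary of batched columns and returns a dictionary of new columns. The sketch below illustrates that calling pattern in plain Python. It is not Meerkat's implementation: `batched_update` is a hypothetical helper, a dict of lists stands in for the DataPanel, and a thresholding rule stands in for the model.

```python
from typing import Callable, Dict, List


def batched_update(
    columns: Dict[str, List],
    function: Callable[[Dict[str, List]], Dict[str, List]],
    batch_size: int,
) -> Dict[str, List]:
    """Hypothetical helper: run a batched function and append its outputs as columns."""
    n_rows = len(next(iter(columns.values())))
    new_columns: Dict[str, List] = {}
    for start in range(0, n_rows, batch_size):
        # Slice every column to build one batch, keyed by column name.
        batch = {name: values[start:start + batch_size] for name, values in columns.items()}
        # The function returns one list per new column, aligned with the batch.
        for name, values in function(batch).items():
            new_columns.setdefault(name, []).extend(values)
    return {**columns, **new_columns}


def predict(batch: Dict[str, List]) -> Dict[str, List]:
    # Stand-in for a model: predict 1 when the input is positive, else 0.
    return {"pred": [int(x > 0) for x in batch["input"]]}


table = {"input": [0.5, -1.2, 3.0, -0.1, 2.2], "y": [1, 0, 1, 0, 1]}
table = batched_update(table, predict, batch_size=2)
print(table["pred"])  # prints [1, 0, 1, 0, 1]
```

Because the outputs are concatenated batch by batch, the new `pred` column stays row-aligned with the existing columns even when the last batch is smaller than `batch_size`.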
Submodules
meerkat.contrib.wilds.config module
WILDS configuration defaults and operations.
All default configurations are integrated from the WILDS repository: p-lambda/wilds
- populate_config(config, template: dict, force_compatibility=False)
Populates missing (key, val) pairs in config with (key, val) in template.
Example usage: populate config with defaults.
- Parameters:
config – namespace
template – dict
force_compatibility – option to raise errors if config.key != template[key]
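The behavior described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: `populate_config_sketch` is a hypothetical name, and treating an absent or `None` attribute as "missing" is an assumption made for the example.

```python
from argparse import Namespace


def populate_config_sketch(
    config: Namespace, template: dict, force_compatibility: bool = False
) -> Namespace:
    """Illustrative stand-in for populate_config; not Meerkat's implementation."""
    values = vars(config)  # the namespace's attribute dict; edits apply to config
    for key, val in template.items():
        if values.get(key) is None:
            # Assumption: an absent or None attribute counts as "missing".
            values[key] = val
        elif force_compatibility and values[key] != val:
            raise ValueError(
                f"config.{key}={values[key]!r} conflicts with template value {val!r}"
            )
    return config


# Values already set on the config are kept; missing ones come from the template.
config = Namespace(batch_size=64, lr=None)
populate_config_sketch(config, {"batch_size": 32, "lr": 1e-3, "n_epochs": 10})
print(config.batch_size, config.lr, config.n_epochs)
```

With `force_compatibility=True`, the same call would raise a `ValueError` in this sketch, because `config.batch_size` (64) differs from the template value (32).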