meerkat.ml package

Submodules

meerkat.ml.activation module

meerkat.ml.callbacks module

meerkat.ml.embedding_column module

class EmbeddingColumn(data: Optional[Sequence] = None, *args, **kwargs)[source]

Bases: TensorColumn

build_faiss_index(index=None, overwrite=False)[source]
pca(n_components=2)[source]
search(query, k: int)[source]
umap(n_neighbors=15, n_components=2)[source]
visualize_umap(n_neighbors=15, n_components=2, point_size=4)[source]

meerkat.ml.huggingfacemodel module

meerkat.ml.instances_column module

meerkat.ml.metrics module

accuracy(predictions: Union[list, array, Tensor], labels: Union[list, array, Tensor])[source]

Calculate accuracy.

class_distribution(labels: Union[list, array, Tensor], num_classes: Optional[int] = None, min_label: int = 0)[source]

Calculate the aggregated class distribution.
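The docstring does not say whether the distribution is returned as raw counts or as fractions, nor exactly how min_label is applied. The sketch below is one plausible reading, not the library's implementation: it assumes integer labels starting at min_label and returns the fraction of examples per class. The name class_distribution_sketch is hypothetical.

```python
from collections import Counter
from typing import List, Optional, Sequence


def class_distribution_sketch(
    labels: Sequence[int],
    num_classes: Optional[int] = None,
    min_label: int = 0,
) -> List[float]:
    """Return the fraction of examples per class (assumed semantics)."""
    if num_classes is None:
        # Assumption: infer the class count from the largest label seen.
        num_classes = max(labels) - min_label + 1
    counts = Counter(labels)
    total = len(labels)
    # Classes that never occur get a fraction of 0.
    return [counts.get(min_label + c, 0) / total for c in range(num_classes)]


dist = class_distribution_sketch([0, 0, 1, 3], num_classes=4)
print(dist)  # [0.5, 0.25, 0.0, 0.25]
```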

compute_metric(metric: str, predictions: Union[Sequence, Tensor], labels: Union[Sequence, Tensor], num_classes: int) -> Union[float, ndarray, Tensor][source]

Compute metric given predictions and target labels.

Parameters
  • metric (str) – name of metric

  • predictions (Union[Sequence, torch.Tensor]) – a sequence of predictions (rouge metrics) or a torch Tensor (other metrics) containing predictions

  • labels (Union[Sequence, torch.Tensor]) – a sequence of labels (rouge metrics) or a torch Tensor (other metrics) containing target labels

  • num_classes (int) – number of classes

Returns

the calculated metric value

dice(predictions: Union[list, array, Tensor], labels: Union[list, array, Tensor])[source]

Calculate Dice Score.
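For reference, the Dice score of two binary masks A and B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that formula on flat 0/1 sequences. It is not the library's code; the name dice_sketch and the convention of returning 1.0 for two empty masks are assumptions.

```python
from typing import Sequence


def dice_sketch(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Dice score for two flat binary (0/1) masks."""
    # Positions that are positive in both masks.
    intersection = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    denom = sum(predictions) + sum(labels)
    # Assumption: two empty masks count as a perfect match.
    return 2.0 * intersection / denom if denom else 1.0


score = dice_sketch([1, 1, 0, 0], [1, 0, 1, 0])
print(score)  # 0.5
```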

f1(predictions: Union[list, array, Tensor], labels: Union[list, array, Tensor])[source]

Calculate F1 score for binary classification.

f1_macro(predictions: Union[list, array, Tensor], labels: Union[list, array, Tensor])[source]

Calculate macro F1 score for multi-class classification.

f1_micro(predictions: Union[list, array, Tensor], labels: Union[list, array, Tensor])[source]

Calculate micro F1 score for multi-class classification.
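The three F1 variants differ only in how per-class counts are combined. A minimal sketch using the standard definitions follows (macro averages the per-class F1 scores, micro pools true positives, false positives, and false negatives before computing F1). The helper names are hypothetical and the code is not taken from the library.

```python
from typing import Sequence, Tuple


def _counts(preds: Sequence[int], labels: Sequence[int], cls: int) -> Tuple[int, int, int]:
    """Return (tp, fp, fn), treating `cls` as the positive class."""
    tp = sum(1 for p, y in zip(preds, labels) if p == cls and y == cls)
    fp = sum(1 for p, y in zip(preds, labels) if p == cls and y != cls)
    fn = sum(1 for p, y in zip(preds, labels) if p != cls and y == cls)
    return tp, fp, fn


def _f1(tp: int, fp: int, fn: int) -> float:
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_binary_sketch(preds, labels, positive: int = 1) -> float:
    return _f1(*_counts(preds, labels, positive))


def f1_macro_sketch(preds, labels) -> float:
    # Unweighted mean of per-class F1 scores.
    classes = sorted(set(preds) | set(labels))
    return sum(_f1(*_counts(preds, labels, c)) for c in classes) / len(classes)


def f1_micro_sketch(preds, labels) -> float:
    # Pool the counts over all classes, then compute a single F1.
    classes = sorted(set(preds) | set(labels))
    totals = [sum(t) for t in zip(*(_counts(preds, labels, c) for c in classes))]
    return _f1(*totals)


labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
macro = f1_macro_sketch(preds, labels)  # mean of 0.5, 0.8 and 2/3
micro = f1_micro_sketch(preds, labels)  # 4 of 6 correct, so 2/3
```

In single-label multi-class problems the micro F1 equals accuracy, which is why micro is 4/6 here.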

format_summary(x: str) -> str[source]

Format summary text for computing rouge.

get_metric(name: str) -> Callable[source]

Get metrics from string names.
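Looking up a metric by name is commonly done with a dictionary that maps names to callables. The sketch below illustrates that pattern only. The registry METRICS, the functions ending in _sketch, and the ValueError on unknown names are assumptions, not the library's actual behavior.

```python
from typing import Callable, Dict, Sequence


def accuracy_sketch(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical registry: metric name -> function(predictions, labels).
METRICS: Dict[str, Callable] = {"accuracy": accuracy_sketch}


def get_metric_sketch(name: str) -> Callable:
    try:
        return METRICS[name]
    except KeyError:
        raise ValueError(f"Unknown metric: {name!r}") from None


def compute_metric_sketch(metric: str, predictions, labels) -> float:
    # Resolve the name, then apply the metric.
    return get_metric_sketch(metric)(predictions, labels)


value = compute_metric_sketch("accuracy", [1, 0, 1, 1], [1, 0, 0, 1])
print(value)  # 0.75
```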

iou_score(predictions: Union[list, array, Tensor], labels: Union[list, array, Tensor], num_classes: Optional[int] = None)[source]

Calculate IoU.
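Intersection over union for class c is |pred = c and label = c| / |pred = c or label = c|. The following minimal sketch computes it per class on flat integer label sequences. Whether the library returns per-class scores or their mean is not stated here, so the per-class return value, the NaN for absent classes, and the name iou_sketch are assumptions.

```python
from typing import List, Optional, Sequence


def iou_sketch(
    predictions: Sequence[int],
    labels: Sequence[int],
    num_classes: Optional[int] = None,
) -> List[float]:
    """Per-class IoU for flat integer label sequences."""
    if num_classes is None:
        # Assumption: classes are 0..max label seen.
        num_classes = max(max(predictions), max(labels)) + 1
    scores = []
    for c in range(num_classes):
        inter = sum(1 for p, y in zip(predictions, labels) if p == c and y == c)
        union = sum(1 for p, y in zip(predictions, labels) if p == c or y == c)
        # A class absent from both sequences has no defined IoU.
        scores.append(inter / union if union else float("nan"))
    return scores


ious = iou_sketch([0, 0, 1, 1], [0, 1, 1, 1])
print(ious)  # class 0: 1/2, class 1: 2/3
```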

meerkat.ml.model module

class Model(model: Module, is_classifier: Optional[bool] = None, task: Optional[str] = None, device: Optional[str] = None)[source]

Bases: Module

activation(dataset: DataPanel, target_module: str, input_columns: List[str], batch_size=32) -> EmbeddingColumn[source]

An Operation that stores model activations in a new Embedding column.

Parameters
  • dataset (DataPanel) – the meerkat DataPanel containing the model inputs.

  • target_module (str) – the name of the submodule of model (i.e. an intermediate layer) that outputs the activations we'd like to extract. For nested submodules, specify a path separated by "." (e.g. ActivationCachedOp(model, "block4.conv")).

  • input_columns (List[str]) – columns containing the model inputs
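A "."-separated target_module path such as "block4.conv" can be resolved by walking attributes from the root model. The sketch below shows only that lookup, with plain Python objects standing in for a torch.nn.Module tree. resolve_submodule is a hypothetical helper, not part of meerkat. In PyTorch, the resolved module would typically have a forward hook registered on it (register_forward_hook) to capture its output.

```python
from functools import reduce
from types import SimpleNamespace


def resolve_submodule(root, path: str):
    """Follow a "."-separated attribute path starting at `root`."""
    return reduce(getattr, path.split("."), root)


# Stand-in objects; a real model would be a torch.nn.Module tree.
conv = SimpleNamespace(name="conv")
model = SimpleNamespace(block4=SimpleNamespace(conv=conv))

target = resolve_submodule(model, "block4.conv")
print(target.name)  # conv
```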

classification(dataset: DataPanel, input_columns: List[str], batch_size: int = 32, num_classes: Optional[int] = None, multi_label: bool = False, one_hot: Optional[bool] = None, threshold=0.5) -> DataPanel[source]
evaluate(dataset: DataPanel, target_column: List[str], pred_column: List[str], metrics: List[str], num_classes: Optional[int] = None)[source]
forward(input_batch: Dict) -> Dict[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static remap_labels(output_dict: Dict, label_map: List[int]) -> Dict[source]

Map the output labels of the model.

Example: 3-way classification, with label_map = [1, 2, 0] => (model label 0 -> dataset label 1, model label 1 -> dataset label 2, …).
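The mapping in the example above can be checked with plain lists. This is a sketch of the index arithmetic only. How remap_labels treats the individual entries of output_dict is not shown here, so the variable names and the probability reordering are illustrative assumptions.

```python
# label_map[i] is the dataset label that corresponds to model label i.
label_map = [1, 2, 0]

# Predicted class indices in the model's label space.
model_preds = [0, 1, 2, 0]
dataset_preds = [label_map[p] for p in model_preds]
print(dataset_preds)  # [1, 2, 0, 1]

# Per-class probabilities can be reordered the same way: the probability
# of model label i moves to position label_map[i].
model_probs = [0.7, 0.2, 0.1]
dataset_probs = [0.0] * len(label_map)
for model_label, dataset_label in enumerate(label_map):
    dataset_probs[dataset_label] = model_probs[model_label]
print(dataset_probs)  # [0.1, 0.7, 0.2]
```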

training: bool

meerkat.ml.prediction_column module

class ClassificationOutputColumn(logits: Optional[Union[Sequence, ndarray, Series, Tensor]] = None, probs: Optional[Union[Sequence, ndarray, Series, Tensor]] = None, preds: Optional[Union[Sequence, ndarray, Series, Tensor]] = None, num_classes: Optional[int] = None, multi_label: bool = False, one_hot: Optional[bool] = None, threshold=0.5, *args, **kwargs)[source]

Bases: TensorColumn

bincount() -> TensorColumn[source]

Compute the count (cardinality) for each category.

Categories which are not available will have a count of 0.

If self.multi_label=True, the bincount will include the total number of times the category is seen. If an example is marked with two categories, the count increases for both. Note that the counts can therefore sum to more than the number of examples N.

Returns

A 1D tensor of length self.num_classes.

Return type

torch.Tensor
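The counting rules described above can be illustrated with plain lists. This is a sketch, not the column's implementation. The function names and the multi-hot row encoding for the multi-label case are assumptions.

```python
from typing import List, Sequence


def bincount_sketch(preds: Sequence[int], num_classes: int) -> List[int]:
    """Single-label case: count how often each class index occurs."""
    counts = [0] * num_classes
    for p in preds:
        counts[p] += 1
    return counts


def bincount_multilabel_sketch(rows: Sequence[Sequence[int]]) -> List[int]:
    """Multi-label case: rows are 0/1 indicator vectors, summed per class."""
    return [sum(col) for col in zip(*rows)]


single = bincount_sketch([0, 2, 2], num_classes=4)
print(single)  # [1, 0, 2, 0]

multi = bincount_multilabel_sketch([[1, 1, 0], [0, 1, 0]])
print(multi)  # [1, 2, 0]
```

In the multi-label example the counts sum to 3 although there are only N = 2 examples, because the first example belongs to two categories.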

entropy() -> TensorColumn[source]

Compute the entropy for each example.

If self.multi_label is True, each category is treated as a binary classification problem. There will be an entropy calculation for each category as well. For example, if the probabilities are of shape (N, C), there will be NxC entropy values.

In the multi-dimensional case, this returns the entropy for each element. For example, if the probabilities are of shape (N, C, A, B), there will be NxAxB entropy values.

Returns

Tensor of entropies

Return type

TensorColumn
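The two cases described above can be sketched as follows: in the single-label case the entropy is taken over the class axis, giving one value per example, and in the multi-label case each category is treated as a two-outcome distribution, giving one value per (example, category) pair. The natural logarithm and the function names are assumptions. The library's log base is not stated here.

```python
import math
from typing import List, Sequence


def entropy_sketch(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; terms with p == 0 contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def multilabel_entropy_sketch(probs: Sequence[Sequence[float]]) -> List[List[float]]:
    """Treat each category probability p as the distribution (p, 1 - p)."""
    return [[entropy_sketch([p, 1.0 - p]) for p in row] for row in probs]


# Single-label: shape (N, C) -> N entropy values.
per_example = [entropy_sketch(row) for row in [[0.5, 0.5], [1.0, 0.0]]]

# Multi-label: shape (N, C) -> N x C entropy values.
per_category = multilabel_entropy_sketch([[0.5, 1.0]])
```

A uniform two-class distribution gives the maximum entropy, log 2, while a one-hot distribution gives 0.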

logits() -> ClassificationOutputColumn[source]
mode()[source]
predictions() -> ClassificationOutputColumn[source]

Compute predictions.

preds() -> ClassificationOutputColumn

Compute predictions.

probabilities() -> ClassificationOutputColumn[source]
probs() -> ClassificationOutputColumn

meerkat.ml.segmentation_column module

meerkat.ml.tensormodel module

Module contents