meerkat.ml package
Submodules
meerkat.ml.activation module
meerkat.ml.callbacks module
meerkat.ml.embedding_column module
meerkat.ml.huggingfacemodel module
meerkat.ml.instances_column module
meerkat.ml.metrics module
- accuracy(predictions: Union[list, numpy.array, torch.Tensor], labels: Union[list, numpy.array, torch.Tensor])[source]
Calculate accuracy.
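The idea behind accuracy can be sketched in plain Python. This is an illustrative re-implementation, not meerkat's code: the name `accuracy_sketch` is hypothetical, and it assumes predictions and labels are equal-length sequences of class indices.

```python
from typing import Sequence


def accuracy_sketch(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of positions where the prediction equals the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy of an empty sequence")
    # Count exact matches, then normalize by the number of examples.
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


acc = accuracy_sketch([0, 1, 1, 2], [0, 1, 2, 2])
print(acc)  # 0.75 (3 of 4 predictions match)
```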
- class_distribution(labels: Union[list, numpy.array, torch.Tensor], num_classes: Optional[int] = None, min_label: int = 0)[source]
Calculate the aggregated class distribution.
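A minimal sketch of one way such a distribution could be computed. It makes two assumptions that the text above does not confirm: that the result is normalized to fractions (rather than raw counts), and that `min_label` is the smallest label value, used as an offset into the output. The name `class_distribution_sketch` is hypothetical.

```python
from typing import List, Optional, Sequence


def class_distribution_sketch(
    labels: Sequence[int],
    num_classes: Optional[int] = None,
    min_label: int = 0,
) -> List[float]:
    """Fraction of examples per class (assumed normalization)."""
    if len(labels) == 0:
        raise ValueError("labels must not be empty")
    if num_classes is None:
        # Infer the number of classes from the largest label seen.
        num_classes = max(labels) - min_label + 1
    counts = [0] * num_classes
    for y in labels:
        counts[y - min_label] += 1
    total = len(labels)
    return [c / total for c in counts]


dist = class_distribution_sketch([0, 0, 1, 3])
print(dist)  # [0.5, 0.25, 0.0, 0.25]
```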
- compute_metric(metric: str, predictions: Union[Sequence, torch.Tensor], labels: Union[Sequence, torch.Tensor], num_classes: int) Union[float, numpy.ndarray, torch.Tensor][source]
Compute metric given predictions and target labels.
- Parameters
metric (str) – name of metric
predictions (Union[Sequence, torch.Tensor]) – a sequence of predictions (rouge metrics) or a torch Tensor (other metrics) containing predictions
labels (Union[Sequence, torch.Tensor]) – a sequence of labels (rouge metrics) or a torch Tensor (other metrics) containing target labels
num_classes (int) – number of classes
- Returns
the calculated metric value
- dice(predictions: Union[list, numpy.array, torch.Tensor], labels: Union[list, numpy.array, torch.Tensor])[source]
Calculate Dice Score.
- f1(predictions: Union[list, numpy.array, torch.Tensor], labels: Union[list, numpy.array, torch.Tensor])[source]
Calculate F1 score for binary classification.
- f1_macro(predictions: Union[list, numpy.array, torch.Tensor], labels: Union[list, numpy.array, torch.Tensor])[source]
Calculate macro F1 score for multi-class classification.
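The relationship between binary F1 and macro F1 can be sketched as follows. This is an illustrative pure-Python version, not the library's implementation: the function names and the `positive` parameter are hypothetical, and a class with no true or predicted members is assumed to score 0. For binary labels, the Dice score uses the same formula, 2·TP / (2·TP + FP + FN).

```python
from typing import Sequence


def f1_binary_sketch(
    predictions: Sequence[int], labels: Sequence[int], positive: int = 1
) -> float:
    """F1 for one class treated as positive: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p == positive and y == positive)
    fp = sum(1 for p, y in pairs if p == positive and y != positive)
    fn = sum(1 for p, y in pairs if p != positive and y == positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_macro_sketch(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(labels) | set(predictions))
    scores = [f1_binary_sketch(predictions, labels, positive=c) for c in classes]
    return sum(scores) / len(scores)


preds = [1, 0, 1, 1]
target = [1, 0, 0, 1]
binary_f1 = f1_binary_sketch(preds, target)  # TP=2, FP=1, FN=0
macro_f1 = f1_macro_sketch(preds, target)    # mean of class-0 and class-1 F1
print(binary_f1, macro_f1)
```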
meerkat.ml.model module
- class Model(model: torch.nn.modules.module.Module, is_classifier: Optional[bool] = None, task: Optional[str] = None, device: Optional[str] = None)[source]
Bases: torch.nn.modules.module.Module
- activation(dataset: meerkat.datapanel.DataPanel, target_module: str, input_columns: List[str], batch_size=32) meerkat.ml.embedding_column.EmbeddingColumn[source]
An Operation that stores model activations in a new Embedding column.
- Parameters
dataset (DataPanel) – the meerkat DataPanel containing the model inputs.
target_module (str) – the name of the submodule of model (i.e. an intermediate layer) that outputs the activations we’d like to extract. For nested submodules, specify a path separated by “.” (e.g. ActivationCachedOp(model, “block4.conv”)).
input_columns (List[str]) – names of the columns containing the model inputs
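The dotted-path convention for target_module can be illustrated without PyTorch. The sketch below uses plain Python objects as stand-ins for nested submodules; the classes `Conv`, `Block`, `Net` and the helper `resolve_submodule` are hypothetical and only show how a path such as "block4.conv" walks one attribute at a time. How meerkat resolves the path internally is not stated here.

```python
from functools import reduce


class Conv:
    """Stand-in for an inner layer."""


class Block:
    def __init__(self) -> None:
        self.conv = Conv()


class Net:
    def __init__(self) -> None:
        self.block4 = Block()


def resolve_submodule(root: object, path: str) -> object:
    """Follow a '.'-separated attribute path starting from root."""
    # "block4.conv" -> getattr(getattr(root, "block4"), "conv")
    return reduce(getattr, path.split("."), root)


net = Net()
target = resolve_submodule(net, "block4.conv")
print(type(target).__name__)  # Conv
```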
- classification(dataset: meerkat.datapanel.DataPanel, input_columns: List[str], batch_size: int = 32, num_classes: Optional[int] = None, multi_label: bool = False, one_hot: Optional[bool] = None, threshold=0.5) meerkat.datapanel.DataPanel[source]
- evaluate(dataset: meerkat.datapanel.DataPanel, target_column: List[str], pred_column: List[str], metrics: List[str], num_classes: Optional[int] = None)[source]
- forward(input_batch: Dict) Dict[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- static remap_labels(output_dict: Dict, label_map: List[int]) Dict[source]
Map the output labels of the model.
Example: 3-way classification, with label_map = [1, 2, 0] => (model label 0 -> dataset label 1, model label 1 -> dataset label 2, …).
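The mapping in the example can be sketched on plain lists. This is an assumption-laden illustration: the structure of output_dict is not described here, so the hypothetical helpers below operate on a list of predicted labels and on a single per-class score vector instead.

```python
from typing import List, Sequence


def remap_predictions(preds: Sequence[int], label_map: Sequence[int]) -> List[int]:
    """Translate model labels into dataset labels: label_map[model] = dataset."""
    return [label_map[p] for p in preds]


def remap_scores(scores: Sequence[float], label_map: Sequence[int]) -> List[float]:
    """Reorder per-class scores so index i holds the score of dataset label i."""
    remapped = [0.0] * len(label_map)
    for model_label, dataset_label in enumerate(label_map):
        remapped[dataset_label] = scores[model_label]
    return remapped


label_map = [1, 2, 0]
new_preds = remap_predictions([0, 1, 2], label_map)
new_scores = remap_scores([0.7, 0.2, 0.1], label_map)
print(new_preds)   # [1, 2, 0]
print(new_scores)  # [0.1, 0.7, 0.2]
```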
- training: bool
meerkat.ml.prediction_column module
- class ClassificationOutputColumn(logits: Optional[Union[Sequence, numpy.ndarray, pandas.core.series.Series, torch.Tensor]] = None, probs: Optional[Union[Sequence, numpy.ndarray, pandas.core.series.Series, torch.Tensor]] = None, preds: Optional[Union[Sequence, numpy.ndarray, pandas.core.series.Series, torch.Tensor]] = None, num_classes: Optional[int] = None, multi_label: bool = False, one_hot: Optional[bool] = None, threshold=0.5, *args, **kwargs)[source]
Bases: meerkat.columns.tensor_column.TensorColumn
- bincount() meerkat.columns.tensor_column.TensorColumn[source]
Compute the count (cardinality) for each category.
Categories which are not available will have a count of 0.
If self.multi_label=True, the bincount will include the total number of times the category is seen. If an example is marked as 2 categories, the bincount will increase the count for both categories. Note, this means the sum of the counts across classes can be more than the number of examples N.
- Returns
A 1D tensor of length self.num_classes.
- Return type
torch.Tensor
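The multi-label counting behavior can be sketched in plain Python. This is not the column's implementation: it assumes predictions are given as one-hot (multi-hot) rows, and the name `multilabel_bincount` is hypothetical.

```python
from typing import List, Sequence


def multilabel_bincount(rows: Sequence[Sequence[int]], num_classes: int) -> List[int]:
    """Count how many examples are marked with each category."""
    counts = [0] * num_classes
    for row in rows:
        for category, flag in enumerate(row):
            counts[category] += int(flag)
    return counts


# Three examples, three categories; the first example has two categories.
rows = [
    [1, 0, 1],
    [0, 0, 1],
    [1, 0, 0],
]
counts = multilabel_bincount(rows, num_classes=3)
print(counts)       # [2, 0, 2]: category 1 never appears, so its count is 0
print(sum(counts))  # 4, which exceeds the number of examples (3)
```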
- entropy() meerkat.columns.tensor_column.TensorColumn[source]
Compute the entropy for each example.
If self.multi_label is True, each category is treated as a binary classification problem. There will be an entropy calculation for each category as well. For example, if the probabilities are of shape (N, C), there will be NxC entropy values.
In the multi-dimensional case, this returns the entropy for each element. For example, if the probabilities are of shape (N, C, A, B), there will be NxAxB entropy values.
- Returns
Tensor of entropies
- Return type
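The two cases described for entropy() can be sketched with the standard library. This is an illustration under assumptions, not the column's code: the natural logarithm is assumed as the base, and the function names are hypothetical. In the single-label case each example yields one value; in the multi-label case each of the C categories yields its own binary entropy.

```python
import math
from typing import List, Sequence


def categorical_entropy(probs: Sequence[float]) -> float:
    """-sum(p * log p) over classes; zero-probability terms contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def binary_entropy(p: float) -> float:
    """Entropy of a single category treated as a yes/no problem."""
    return categorical_entropy([p, 1.0 - p])


def multilabel_entropy(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """For probabilities of shape (N, C), return N x C binary entropies."""
    return [[binary_entropy(p) for p in row] for row in rows]


single = categorical_entropy([0.25, 0.25, 0.25, 0.25])  # equals log(4)
multi = multilabel_entropy([[0.5, 0.9], [0.1, 0.5]])    # 2 x 2 values
print(single)
print(multi)
```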
- predictions() meerkat.ml.prediction_column.ClassificationOutputColumn[source]
Compute predictions.
- preds() meerkat.ml.prediction_column.ClassificationOutputColumn
Compute predictions.
- probabilities() meerkat.ml.prediction_column.ClassificationOutputColumn[source]