langspace.probe package

Submodules

langspace.probe.base module

class langspace.probe.base.LatentSpaceProbe(model: LangVAE, data: Iterable[Sentence], sample_size: int, **kwargs)[source]

Bases: ABC

Abstract base class for probing the latent space of a language VAE.

batched_encoding(data: Iterable[Sentence], annotations: Dict[str, List[str]] = None, batch_size: int = 100) Tensor[source]

Encodes the sentences in batches of batch_size and returns their latent representation.

Parameters:
  • data (Iterable[Sentence]) – sentences to encode

  • annotations (Dict[str, List[str]]) – optional annotations to be used, if available in the data, and their respective possible values

  • batch_size (int) – number of sentences to be processed simultaneously

Returns:

Latent representation

Return type:

Tensor
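
The effect of batch_size can be sketched without the library: the input is consumed in chunks of at most batch_size sentences, and the per-chunk results are concatenated in order. The names iter_batches, fake_encode and batched_encode_sketch below are hypothetical and are not part of langspace; the real method returns a Tensor and may be implemented differently.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(items: Iterable[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield successive lists of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def fake_encode(batch: List[str]) -> List[List[float]]:
    # Stand-in for the model encoder (an assumption): one 2-dimensional
    # "latent" per sentence, built from its character and word counts.
    return [[float(len(s)), float(s.count(" ") + 1)] for s in batch]


def batched_encode_sketch(sentences: Iterable[str], batch_size: int = 100) -> List[List[float]]:
    # Encode chunk by chunk and concatenate the results in input order.
    latents: List[List[float]] = []
    for batch in iter_batches(sentences, batch_size):
        latents.extend(fake_encode(batch))
    return latents
```

In this sketch the output has one row per input sentence regardless of batch_size; only the number of sentences handled per step changes.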

decoding(prior: Tensor, cvars_emb: List[Tensor] = None) List[str][source]

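Decode latent vectors into sentences.

Parameters:
  • prior (Tensor) – latent vectors, of shape sent_num by latent_dim

  • cvars_emb (List[Tensor]) – optional conditional variable embeddings

Returns:

List of decoded sentences

Return type:

List[str]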

encoding(data: Iterable[Sentence], annotations: Dict[str, List[str]] = None) Tuple[Tensor, Tensor, Tensor, List[Tensor]][source]

Encode the input data and return the mean, standard deviation, and latent representation.

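Parameters:
  • data (Iterable[Union[str, Sentence]]) – The input data to encode.

  • annotations (Dict[str, List[str]]) – Optional annotations to be used, if available in the data, and their respective possible values.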

Returns:

A tuple containing the mean, standard deviation, and latent representation as tensors, followed by the list of conditional variable embeddings.

Return type:

Tuple[Tensor, Tensor, Tensor, List[Tensor]]

get_tokenized_data_seed(data: Iterable[Sentence], annotations: Dict[str, List[str]] = None) TokenizedDataSet[source]

abstract report() DataFrame[source]

Generate a report from the probe.

Returns:

The generated report.

Return type:

DataFrame
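
Because report() is abstract, a concrete probe has to override it before it can be instantiated. The pattern can be sketched with a stand-in base class. ProbeSketch and LengthProbe are hypothetical names, the constructor is simplified (no model argument), and rows are returned as a list of dicts to keep the example dependency-free, whereas the real method returns a DataFrame.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ProbeSketch(ABC):
    """Stand-in for LatentSpaceProbe (an assumption, not the library class)."""

    def __init__(self, data: List[str], sample_size: int):
        # Keep only the first sample_size items; the real sampling
        # strategy is not documented here and may differ.
        self.data = data[:sample_size]

    @abstractmethod
    def report(self) -> List[Dict[str, object]]:
        """Return one row per probed sentence."""


class LengthProbe(ProbeSketch):
    """Toy probe that reports the whitespace token count of each sentence."""

    def report(self) -> List[Dict[str, object]]:
        return [{"sentence": s, "n_tokens": len(s.split())} for s in self.data]
```

Instantiating ProbeSketch directly raises TypeError, while LengthProbe can be constructed because it supplies report().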
