Utilities

Utility functions and classes.

lydata.utils.get_default_column_map() → _ColumnMap

Get the default column map.

This map defines which short column names can be used to access columns in the DataFrames.

>>> from lydata import accessor, loader
>>> df = next(loader.load_datasets(institution="usz"))
>>> df.ly.surgery   
0      False
...
286    False
Name: (patient, #, neck_dissection), Length: 287, dtype: bool
>>> df.ly.smoke   
0       True
...
286     True
Name: (patient, #, nicotine_abuse), Length: 287, dtype: bool
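
The idea behind the column map can be sketched with a plain dictionary. The snippet below is a hypothetical stand-in and not lydata's `_ColumnMap`: the names `SHORT_TO_LONG` and `resolve` are invented for illustration, and only the two entries shown are taken from the doctest output above.

```python
# Hypothetical stand-in for the column map (not lydata's _ColumnMap).
# Only these two entries are known from the doctest output above.
SHORT_TO_LONG = {
    "surgery": ("patient", "#", "neck_dissection"),
    "smoke": ("patient", "#", "nicotine_abuse"),
}


def resolve(short_name: str) -> tuple[str, str, str]:
    """Translate a short column name into its three-level column key."""
    try:
        return SHORT_TO_LONG[short_name]
    except KeyError:
        raise KeyError(f"unknown short column name: {short_name!r}") from None


print(resolve("smoke"))  # ('patient', '#', 'nicotine_abuse')
```

With a lookup like this, an accessor such as `df.ly.smoke` only has to resolve the short name and index the DataFrame with the resulting tuple.
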
class lydata.utils.ModalityConfig(*, spec: Annotated[float, Ge(ge=0.5), Le(le=1.0)], sens: Annotated[float, Ge(ge=0.5), Le(le=1.0)], kind: Literal['clinical', 'pathological'] = 'clinical')

Define a diagnostic or pathological modality.

model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model. It should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'kind': FieldInfo(annotation=Literal['clinical', 'pathological'], required=False, default='clinical', description='Clinical modalities cannot detect microscopic disease.'), 'sens': FieldInfo(annotation=float, required=True, description='Sensitivity of the modality.', metadata=[Ge(ge=0.5), Le(le=1.0)]), 'spec': FieldInfo(annotation=float, required=True, description='Specificity of the modality.', metadata=[Ge(ge=0.5), Le(le=1.0)])}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.
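
To make the constraints in the signature concrete, here is a minimal standard-library sketch. `ModalitySketch` is a hypothetical stand-in, not the pydantic model from lydata. It enforces the same bounds (0.5 to 1.0 for both `spec` and `sens`) and the same two allowed values for `kind`. The `confusion_matrix` helper is an added illustration based on the textbook definitions of sensitivity and specificity. Whether lydata exposes anything similar is not stated in this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModalitySketch:
    """Hypothetical stand-in mirroring the ModalityConfig signature."""

    spec: float
    sens: float
    kind: str = "clinical"

    def __post_init__(self) -> None:
        # Same bounds as the Ge(0.5) / Le(1.0) annotations in the signature.
        for name in ("spec", "sens"):
            value = getattr(self, name)
            if not 0.5 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0.5, 1.0], got {value}")
        if self.kind not in ("clinical", "pathological"):
            raise ValueError(f"unknown kind: {self.kind!r}")

    def confusion_matrix(self) -> list[list[float]]:
        """Rows: true state (healthy, involved); columns: (negative, positive)."""
        return [
            [self.spec, 1.0 - self.spec],
            [1.0 - self.sens, self.sens],
        ]


# Illustrative numbers only, not values from get_default_modalities().
example = ModalitySketch(spec=0.9, sens=0.8)
print(example.confusion_matrix()[0][0])  # 0.9
```

The lower bound of 0.5 rules out modalities that perform worse than chance, which would otherwise make a positive finding evidence against involvement.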

lydata.utils.get_default_modalities() → dict[str, ModalityConfig]

Get default values for the sensitivities and specificities of modalities.

Taken from de Bondt et al. (2007) and Kyzas et al. (2008).

lydata.utils.enhance(dataset: DataFrame, infer_sublevels_kwargs: dict[str, Any] | None = None, infer_superlevels_kwargs: dict[str, Any] | None = None, combine_kwargs: dict[str, Any] | None = None) → DataFrame

Enhance the dataset by inferring additional columns from the data.

This performs the following steps in order:

  1. Infer the superlevel involvement for each diagnostic modality using the infer_superlevels() method.

  2. Infer the sublevel involvement for each diagnostic modality using the infer_sublevels() method. This skips all LNLs that were computed in the previous step.

  3. Compute the maximum likelihood estimate of the true state of the patient using the combine() method.
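
The first two steps can be illustrated on a single patient record. The sketch below rests on assumed inference rules, not on lydata's actual implementation: a superlevel counts as involved if any of its sublevels is involved and as healthy if all of them are healthy, while a healthy superlevel implies healthy sublevels. The names `SUBLEVELS`, `infer_superlevel`, and `enhance_sketch` are hypothetical.

```python
from typing import Optional

# True = involved, False = healthy, None = unknown.
Involvement = Optional[bool]

# Illustrative superlevel-to-sublevel layout (an assumption for this sketch).
SUBLEVELS = {"I": ["Ia", "Ib"], "II": ["IIa", "IIb"]}


def infer_superlevel(sublevels: list[Involvement]) -> Involvement:
    """Assumed rule: any involved sublevel implies an involved superlevel."""
    if any(s is True for s in sublevels):
        return True
    if sublevels and all(s is False for s in sublevels):
        return False
    return None


def enhance_sketch(record: dict[str, Involvement]) -> dict[str, Involvement]:
    result = dict(record)
    computed = set()
    # Step 1: fill in unknown superlevels from their sublevels.
    for sup, subs in SUBLEVELS.items():
        if result.get(sup) is None:
            inferred = infer_superlevel([result.get(s) for s in subs])
            if inferred is not None:
                result[sup] = inferred
                computed.add(sup)
    # Step 2: fill in unknown sublevels, skipping superlevels from step 1.
    for sup, subs in SUBLEVELS.items():
        if sup in computed:
            continue
        if result.get(sup) is False:
            for s in subs:
                if result.get(s) is None:
                    result[s] = False
    return result


patient = {"I": False, "IIa": True, "IIb": None}
print(enhance_sketch(patient)["II"])  # True
```

Skipping the superlevels computed in step 1 avoids feeding an inferred value back into the sublevels it was derived from.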

Important

Performing these operations in any other order may lose information or even produce conflicting LNL involvement data.

The result contains all LNLs of interest in the head and neck region, as well as the best estimate of the true state of the patient under the top-level key max_llh.
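
A maximum likelihood estimate of this kind can be sketched for a single LNL as follows. This is an illustration under stated assumptions, not the code behind combine(): it treats the modalities as conditionally independent given the true state and multiplies their likelihoods. The function name `most_likely_state` and the sensitivity and specificity values are made up for the example and are not the values returned by get_default_modalities().

```python
from typing import Optional


def most_likely_state(
    observations: dict[str, Optional[bool]],
    modalities: dict[str, tuple[float, float]],
) -> Optional[bool]:
    """Return True (involved), False (healthy), or None if undecided.

    `modalities` maps a modality name to (specificity, sensitivity).
    """
    llh_healthy = 1.0
    llh_involved = 1.0
    for name, observed in observations.items():
        if observed is None:
            continue  # a missing observation carries no information
        spec, sens = modalities[name]
        # P(observation | healthy) and P(observation | involved)
        llh_healthy *= (1.0 - spec) if observed else spec
        llh_involved *= sens if observed else (1.0 - sens)
    if llh_healthy == llh_involved:
        return None
    return llh_involved > llh_healthy


# Illustrative values only (assumed for this sketch).
modalities = {"CT": (0.8, 0.8), "pathology": (1.0, 1.0)}

# A perfect pathology report overrides a positive CT finding.
print(most_likely_state({"CT": True, "pathology": False}, modalities))  # False
```

Because a perfectly specific and sensitive modality assigns zero likelihood to the contradicting state, it always decides the outcome in this sketch, whatever the less reliable modalities report.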