Utilities

Utility functions and classes.

lydata.utils.get_github_auth(token: str | None = None, user: str | None = None, password: str | None = None) → github.Auth

Get the GitHub authentication object.

lydata.utils.update_and_expand(left: DataFrame, right: DataFrame, **update_kwargs: Any) → DataFrame

Update left with values from right, also adding columns from right.

The added feature of this function over pandas’ update() is that it also adds columns that are present in right but not in left.

Any keyword arguments are passed directly to update().

>>> left = pd.DataFrame({"a": [1, 2, None], "b": [3, 4, 5]})
>>> right = pd.DataFrame({"a": [None, 3, 4], "c": [6, 7, 8]})
>>> update_and_expand(left, right)
     a  b  c
0  1.0  3  6
1  3.0  4  7
2  4.0  5  8
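
The behavior described above can be approximated with plain pandas. The following is a minimal sketch, not the library's actual implementation: the name update_and_expand_sketch is hypothetical, and it assumes that the function returns a new frame rather than modifying left in place.

```python
import pandas as pd


def update_and_expand_sketch(
    left: pd.DataFrame, right: pd.DataFrame, **update_kwargs
) -> pd.DataFrame:
    """Hypothetical re-implementation for illustration only."""
    result = left.copy()
    # Overwrite values in shared columns with the non-missing values of `right`.
    result.update(right, **update_kwargs)
    # Append the columns that exist only in `right`.
    for column in right.columns.difference(result.columns):
        result[column] = right[column]
    return result


left = pd.DataFrame({"a": [1, 2, None], "b": [3, 4, 5]})
right = pd.DataFrame({"a": [None, 3, 4], "c": [6, 7, 8]})
result = update_and_expand_sketch(left, right)
```

Copying left first keeps the caller's frame untouched, which is a design choice of this sketch rather than a documented guarantee of the library function.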

lydata.utils.get_default_column_map() → _ColumnMap

Get the default column map.

This map defines which short column names can be used to access columns in the DataFrames.

>>> from lydata import accessor, loader
>>> df = next(loader.load_datasets(institution="usz"))
>>> df.ly.surgery   
0      False
...
286    False
Name: (patient, #, neck_dissection), Length: 287, dtype: bool
>>> df.ly.smoke   
0       True
...
286     True
Name: (patient, #, nicotine_abuse), Length: 287, dtype: bool

class lydata.utils.ModalityConfig(*, spec: Annotated[float, Ge(ge=0.5), Le(le=1.0)], sens: Annotated[float, Ge(ge=0.5), Le(le=1.0)], kind: Literal['clinical', 'pathological'] = 'clinical')

Define a diagnostic or pathological modality.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
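
The signature above constrains both spec and sens to the interval [0.5, 1.0]. The sketch below shows how such bounds behave in a pydantic model. ModalityConfigSketch is a hypothetical stand-in that mirrors the documented fields, not the class shipped with lydata, and the numeric values are illustrative only.

```python
from typing import Annotated, Literal

from pydantic import BaseModel, Field, ValidationError


class ModalityConfigSketch(BaseModel):
    """Hypothetical stand-in mirroring the documented signature."""

    spec: Annotated[float, Field(ge=0.5, le=1.0)]
    sens: Annotated[float, Field(ge=0.5, le=1.0)]
    kind: Literal["clinical", "pathological"] = "clinical"


# Values inside the allowed range are accepted; `kind` falls back to its default.
modality = ModalityConfigSketch(spec=0.8, sens=0.8)

# A specificity below 0.5 violates the lower bound and is rejected.
try:
    ModalityConfigSketch(spec=0.3, sens=0.8)
    rejected = False
except ValidationError:
    rejected = True
```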

lydata.utils.get_default_modalities() → dict[str, ModalityConfig]

Get default values for sensitivities and specificities of modalities.

Taken from de Bondt et al. (2007) and Kyzas et al. (2008).

lydata.utils.infer_all_levels(dataset: DataFrame, infer_superlevels_kwargs: dict[str, Any] | None = None, infer_sublevels_kwargs: dict[str, Any] | None = None) → DataFrame

Infer all levels of involvement for each diagnostic modality.

This function first infers sublevel (e.g. ‘IIa’ and ‘IIb’) involvement for each modality using infer_sublevels(). Then, it infers superlevel (e.g. ‘II’) involvement for each modality using infer_superlevels().
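
The relationship between sublevels and superlevels can be illustrated without DataFrames. The sketch below works under stated assumptions: a superlevel counts as involved if any sublevel is involved and as healthy only if all sublevels are healthy, while a healthy superlevel implies healthy sublevels and an involved one leaves them unknown. These rules, the SUBLEVELS mapping, and both function names are hypothetical and may differ from what infer_sublevels() and infer_superlevels() actually do.

```python
from typing import Optional

# Hypothetical mapping from a superlevel to its sublevels.
SUBLEVELS = {"II": ["IIa", "IIb"]}


def infer_superlevel(involvement: dict, superlevel: str) -> Optional[bool]:
    """Derive the superlevel state from its sublevels (None means unknown)."""
    states = [involvement.get(sub) for sub in SUBLEVELS[superlevel]]
    if any(state is True for state in states):
        return True
    if all(state is False for state in states):
        return False
    return None


def infer_sublevel(involvement: dict, superlevel: str) -> Optional[bool]:
    """Derive a sublevel state from its superlevel (None means unknown)."""
    # Only a healthy superlevel pins down its sublevels; an involved or
    # unknown superlevel does not say which sublevel is affected.
    return False if involvement.get(superlevel) is False else None


involved = infer_superlevel({"IIa": False, "IIb": True}, "II")
unknown = infer_superlevel({"IIa": False, "IIb": None}, "II")
healthy_sub = infer_sublevel({"II": False}, "II")
```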

lydata.utils.infer_and_combine_levels(dataset: DataFrame, infer_superlevels_kwargs: dict[str, Any] | None = None, infer_sublevels_kwargs: dict[str, Any] | None = None, combine_kwargs: dict[str, Any] | None = None) → DataFrame

Enhance the dataset by inferring additional columns from the data.

This performs the following steps in order:

1. Infer the superlevel involvement for each diagnostic modality using infer_superlevels().
2. Infer the sublevel involvement for each diagnostic modality using infer_sublevels(). This skips all LNLs that were computed in the previous step.
3. Compute the maximum likelihood estimate of the true state of the patient using combine().

Important

Performing these operations in any other order may lead to the loss of some information or even to conflicting LNL involvement information.

The result contains all LNLs of interest in the head and neck region, as well as the best estimate of the true state of the patient under the top-level key max_llh.
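
To make the idea of a maximum likelihood estimate concrete, the following sketch combines the observations of several modalities for a single LNL. It assumes that the modalities are conditionally independent given the true state and that missing observations are skipped. The Modality class, the function max_llh_sketch, the modality names, and all numeric values are hypothetical and are not taken from lydata's combine().

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Modality:
    """Hypothetical stand-in holding specificity and sensitivity."""

    spec: float
    sens: float


def max_llh_sketch(observations: dict, modalities: dict) -> Optional[bool]:
    """Return the more likely true state, or None if nothing was observed."""
    llh_healthy, llh_involved = 1.0, 1.0
    seen_any = False
    for name, observed in observations.items():
        if observed is None:  # modality not available for this LNL
            continue
        seen_any = True
        modality = modalities[name]
        if observed:
            llh_involved *= modality.sens        # true positive
            llh_healthy *= 1.0 - modality.spec   # false positive
        else:
            llh_involved *= 1.0 - modality.sens  # false negative
            llh_healthy *= modality.spec         # true negative
    if not seen_any:
        return None
    # Ties are resolved in favor of "healthy" in this sketch.
    return llh_involved > llh_healthy


modalities = {
    "modality_a": Modality(spec=0.9, sens=0.7),
    "modality_b": Modality(spec=0.8, sens=0.8),
}
conflicting = max_llh_sketch({"modality_a": False, "modality_b": True}, modalities)
both_negative = max_llh_sketch({"modality_a": False, "modality_b": False}, modalities)
```

With these illustrative numbers, the conflicting case yields likelihoods of about 0.24 for "involved" and 0.18 for "healthy", so the estimate is involved.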
