Sensospot Tools

Some small tools for working with parsed Sensospot data.

Selecting and splitting a pandas data frame

select(data: DataFrame, column: str, value: Any) -> DataFrame

Selects the rows of a data frame that have a given value in a given column.

Example:


    from sensospot_tools import select

    print(data)
        category  value
    0      dog      1
    1      cat      2
    2    horse      3
    3      cat      4

    print(select(data, "category", "cat"))
        category  value
    1      cat      2
    3      cat      4

split(data: DataFrame, *on: Any) -> Iterator[tuple[Any, ..., DataFrame]]

Splits a data frame on the unique values in one or more columns.

Returns a generator of tuples with at least two elements. The last element is the resulting partial data frame, the element(s) before are the values used to split up the original data.

Example:


    from sensospot_tools import split

    print(data)
        category  value
    0      dog      1
    1      cat      2
    2    horse      3
    3      cat      4

    # column names are passed positionally, matching the *on signature
    result = dict(split(data, "category"))

    print(result["dog"])
        category  value
    0      dog      1

    print(result["cat"])
        category  value
    1      cat      2
    3      cat      4

    print(result["horse"])
        category  value
    2    horse      3
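When more than one column is passed, each yielded tuple carries one value per column, followed by the partial data. The following is only a sketch of that tuple shape, not the library's implementation: `split_rows` is a hypothetical helper, and a plain list of dicts stands in for the data frame so the example runs without pandas. The "size" column is made up for illustration.

```python
from collections import defaultdict


def split_rows(rows, *on):
    """Yield (value_1, ..., value_n, matching_rows) per unique key combination.

    Hypothetical stand-in for illustration; groups are yielded in the
    order in which their key first appears.
    """
    groups = defaultdict(list)
    for row in rows:
        # one key element per column named in *on
        key = tuple(row[column] for column in on)
        groups[key].append(row)
    for key, members in groups.items():
        yield (*key, members)


rows = [
    {"category": "dog", "size": "big", "value": 1},
    {"category": "cat", "size": "small", "value": 2},
    {"category": "dog", "size": "small", "value": 3},
    {"category": "cat", "size": "small", "value": 4},
]

# two columns give tuples with three elements: two values and the partial data
for category, size, part in split_rows(rows, "category", "size"):
    print(category, size, [row["value"] for row in part])
```

With a single column the tuples have two elements, which is why the result can be passed directly to dict() as in the example above.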

Working with data with multiple exposure times

select_hdr_data(data: DataFrame, spot_id_columns: list[str], time_column: str, overflow_column: str) -> DataFrame

Selects the data for increased dynamic measurement range.

To increase the dynamic range of a measurement, multiple exposures of one microarray might be taken.

This function selects the data of only one exposure time per spot, based on whether the spot is in overflow. It starts with the longest exposure time, which captures the weakest signals, and moves to the next shorter exposure time whenever the value in the overflow_column is True.

This is done for each spot, and therefore a spot needs a way to be identified across multiple exposure times. Examples for this are:

- for a single array: the spot id (e.g. "Pos.Id")
- for multiple arrays: the array position and the spot id (e.g. "Well.Name" and "Pos.Id")
- for multiple runs: the name of the run, the array position and the spot id (e.g. "File.Name", "Well.Name" and "Pos.Id")

The function will raise a KeyError if any of the provided column names is not present in the data frame.
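The selection rule described above can be sketched in plain Python. This is an illustration of the logic only, not the library's code: `select_hdr_rows` is a hypothetical helper, a list of dicts stands in for the data frame, and the column names "Exposure.Time" and "Overflow" are made up. The sketch also assumes that a spot in overflow at every exposure time falls back to the shortest one, which the text above does not state explicitly.

```python
def select_hdr_rows(rows, spot_id_columns, time_column, overflow_column):
    """Pick one row per spot: the longest exposure that is not in overflow."""
    by_spot = {}
    for row in rows:
        # a missing column name raises KeyError here
        key = tuple(row[column] for column in spot_id_columns)
        by_spot.setdefault(key, []).append(row)

    selected = []
    for exposures in by_spot.values():
        # longest exposure time first
        ordered = sorted(exposures, key=lambda r: r[time_column], reverse=True)
        # assumption: if every exposure overflows, keep the shortest one
        chosen = next(
            (row for row in ordered if not row[overflow_column]), ordered[-1]
        )
        selected.append(chosen)
    return selected


rows = [
    {"Pos.Id": 1, "Exposure.Time": 100, "Overflow": True},
    {"Pos.Id": 1, "Exposure.Time": 10, "Overflow": False},
    {"Pos.Id": 2, "Exposure.Time": 100, "Overflow": False},
    {"Pos.Id": 2, "Exposure.Time": 10, "Overflow": False},
    {"Pos.Id": 3, "Exposure.Time": 100, "Overflow": True},
    {"Pos.Id": 3, "Exposure.Time": 10, "Overflow": True},
]

for row in select_hdr_rows(rows, ["Pos.Id"], "Exposure.Time", "Overflow"):
    print(row["Pos.Id"], row["Exposure.Time"])
```

Spot 1 drops to the shorter exposure because the long one overflows, spot 2 keeps the long exposure, and spot 3 ends at the shortest exposure under the fallback assumption.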

normalize(data: DataFrame, normalized_time: Union[int, float], time_column: str, value_columns: list[str], template: str) -> DataFrame

Normalizes the values in the given columns to a normalized exposure time.

Raises a KeyError if any column is not in the data frame and a ValueError if no template string is provided.
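As a rough illustration of what such a normalization could look like, the sketch below scales each value linearly by normalized_time / exposure time and stores it under a name built from the template. Both the linear scaling and the "Normalized.{}" template format are assumptions made for this example and are not confirmed by this README. `normalize_rows` is a hypothetical helper working on a list of dicts instead of a data frame, and "Exposure.Time" and "Signal" are made-up column names.

```python
def normalize_rows(rows, normalized_time, time_column, value_columns, template):
    """Return copies of the rows with linearly rescaled value columns added."""
    if not template:
        raise ValueError("a template string is required")

    result = []
    for row in rows:
        # assumption: signal scales linearly with exposure time
        factor = normalized_time / row[time_column]
        new_row = dict(row)
        for column in value_columns:
            # a missing column raises KeyError here
            new_row[template.format(column)] = row[column] * factor
        result.append(new_row)
    return result


rows = [
    {"Exposure.Time": 50, "Signal": 200.0},
    {"Exposure.Time": 25, "Signal": 100.0},
]

normalized = normalize_rows(rows, 100, "Exposure.Time", ["Signal"], "Normalized.{}")
for row in normalized:
    print(row["Exposure.Time"], row["Signal"], row["Normalized.Signal"])
```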

Development

To install the development version of Sensospot Tools:

    git clone https://git.cpi.imtek.uni-freiburg.de/holgi/sensospot_tools.git

    # create a virtual environment and install all required dev dependencies
    cd sensospot_tools
    make devenv

To run the tests, use make tests, or use make coverage for a complete report.

To generate the documentation pages, use make docs, or use make serve-docs to start a web server with the generated documentation.