Some simple tools for working with parsed Sensospot data.

Holger Frey 66844969d8 the function `selection.split()` now accepts multiple columns for iteration		2 years ago
docs	added mkdocks for documentation	3 years ago
src/sensospot_tools	the function `selection.split()` now accepts multiple columns for iteration	2 years ago
tests	the function `selection.split()` now accepts multiple columns for iteration	2 years ago
.gitignore	changed isolated test runner from tox to nox	3 years ago
.pre-commit-config.yaml	Formatting changes after switching to "ruff" for linting	3 years ago
CONTRIBUTING.md	import of project template	3 years ago
LICENSE	import of project template	3 years ago
Makefile	added downstream testing to nox	2 years ago
README.md	the function `selection.split()` now accepts multiple columns for iteration	2 years ago
mkdocs.yml	added mkdocks for documentation	3 years ago
noxfile.py	simplified noxfile	2 years ago
pyproject.toml	changed isolated test runner from tox to nox	3 years ago

README.md

Sensospot Tools

Some small tools for working with parsed Sensospot data.

Selecting and spliting a pandas data frame

select(data: DataFrame, column: str, value: Any) -> DataFrame

Selects rows of a dataframe based on a value in a column

Example:


    from sensospot_tools import select

    print(data)
        category  value
    0      dog      1
    1      cat      2
    2    horse      3
    3      cat      4

    print(select(data, "category", "cat"))
          category  value
        1      cat      2
        3      cat      4

split(data: DataFrame, *on: Any) -> Iterator[tuple[Any, ..., DataFrame]]

Splits a data frame on unique values in multiple columns

Returns a generator of tuples with at least two elements. The last element is the resulting partial data frame, the element(s) before are the values used to split up the original data.

Example:


    from sensospot_tools import split

    print(data)
        category  value
    0      dog      1
    1      cat      2
    2    horse      3
    3      cat      4

    result = dict( split(data, column="category") )

    print(result["dog"])
        category  value
    0      dog      1

    print(result["cat"])
        category  value
    1      cat      2
    3      cat      4

    print(result["horse"])
        category  value
    2    horse      3

Working with data with multiple exposure times

select_hdr_data(data: DataFrame, spot_id_columns: list[str], time_column: str, overflow_column: str) -> DataFrame:

Selects the data for increased dynamic measurement range.

To increase the dynamic range of a measurement, multiple exposures of one microarray might be taken.

This function selects the data of only one exposure time per spot, based on the information if the spot is in overflow. It starts with the weakest signals (longest exposure time) first and chooses the next lower exposure time, if the result in the overflow_column is True.

This is done for each spot, and therfore a spot needs a way to be identified across multiple exposure times. Examples for this are: - for a single array: the spot id (e.g. "Pos.Id") - for multiple arrays: the array position and the spot id (e.g. "Well.Name" and "Pos.Id") - for multiple runs: the name of the run, array position and the spot id (e.g. "File.Name", "Well.Name" and "Pos.Id")

The function will raise a KeyError if any of the provided column names is not present in the data frame

normalize(data: DataFrame, normalized_time: Union[int, float], time_column: str, value_columns: list[str], template: str) -> DataFrame:

normalizes values to a normalized exposure time

Will raise a KeyError, if any column is not in the data frame; raises ValueError if no template string was provided.

Development

To install the development version of Sensospot Tools:

git clone https://git.cpi.imtek.uni-freiburg.de/holgi/sensospot_tools.git

# create a virtual environment and install all required dev dependencies
cd sensospot_tools
make devenv

To run the tests, use make tests or make coverage for a complete report.

To generate the documentation pages use make docs or make serve-docs for starting a webserver with the generated documentation