# snippets

Miscellaneous code snippets that I need from time to time and always have to look up how they work.

## linear_regression.py

Calculates the linear regression on two columns of a data frame.

The resulting object has a `predict()` method that calculates the y value for a given x, or the x value for a given y.

```python
import pandas as pd
from linear_regression import linear_regression

df = pd.DataFrame({"temperature":[...], "signal":[...]})
regression = linear_regression(df, x="temperature", y="signal")

repr(regression) == "Regression(intercept=1, coefficient=3, score=0.9998)"
regression.predict(x=3) == 10  # y = intercept + coefficient * x
regression.predict(y=7) == 2   # x = (y - intercept) / coefficient
```

## split_uniques.py

Splits a data frame on the unique values in one or more columns.

Returns a generator of tuples with at least two elements. The _last_ element is the resulting partial data frame; the element(s) before it are the values used to split the original data.

```python
import pandas as pd
from split_uniques import split_uniques

df = pd.DataFrame({
    "A": [1, 2, 2],
    "B": [3, 4, 3],
    "C": ["x", "y", "z"]
})
result = list(split_uniques(df, ["B"]))

# column "B" has two unique values: 3 and 4
assert len(result) == 2

# compare with .equals(); `==` on data frames is element-wise
# and cannot be used directly in an assert
value, data = result[0]
assert value == 3
assert data.reset_index(drop=True).equals(pd.DataFrame({
    "A": [1, 2],
    "B": [3, 3],
    "C": ["x", "z"]
}))

value, data = result[1]
assert value == 4
assert data.reset_index(drop=True).equals(pd.DataFrame({
    "A": [2],
    "B": [4],
    "C": ["y"]
}))
```

This construct might look a little weird, but it makes the function easy to use in a loop definition:

```python
for well, probe, partial_data in split_uniques(full_data, ["Well", "Probe"]):
    ...
```
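
For reference, the tuple shape described above could be produced with `DataFrame.groupby()`. This is only a rough sketch and not necessarily how `split_uniques.py` does it: the name `split_uniques_sketch` is made up for illustration, and the actual function may order the groups or handle the index differently.

```python
import pandas as pd


def split_uniques_sketch(data, columns):
    """Yield (value_1, ..., value_n, partial_frame) per unique combination.

    Hypothetical stand-in for split_uniques, for illustration only.
    """
    # sort=False keeps the groups in order of first appearance
    for key, partial in data.groupby(list(columns), sort=False):
        # older pandas versions return a scalar key for a single column,
        # newer ones a 1-tuple; normalize to a tuple in both cases
        if not isinstance(key, tuple):
            key = (key,)
        # flatten: the split values first, the partial data frame last
        yield (*key, partial)


df = pd.DataFrame({
    "A": [1, 2, 2],
    "B": [3, 4, 3],
    "C": ["x", "y", "z"]
})

# one split column -> tuples of two elements
for value, partial in split_uniques_sketch(df, ["B"]):
    print(value, len(partial))

# two split columns -> tuples of three elements
for a, b, partial in split_uniques_sketch(df, ["A", "B"]):
    print(a, b, len(partial))
```

Putting the data frame last means the number of loop variables is always the number of split columns plus one, which is what makes the unpacking in the `for` statement work.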