# snippets
Miscellaneous code snippets that I need now and then and always have to look up again.
## linear_regression.py
Calculates a linear regression on two columns of a data frame. The resulting
object has a `predict()` method that returns the y value for a given x, or the
x value for a given y.
```python
import pandas as pd
from linear_regression import linear_regression
df = pd.DataFrame({"temperature":[...], "signal":[...]})
regression = linear_regression(df, x="temperature", y="signal")
repr(regression) == "Regression(intercept=1, coefficient=3, score=0.9998)"
regression.predict(x=3) == 10
regression.predict(y=7) == 2
```
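For orientation, the behaviour described above can be approximated with an ordinary least-squares fit in plain Python. This is only a sketch and not the code in `linear_regression.py`: the name `linear_regression_sketch`, the `Regression` dataclass and the use of the squared correlation coefficient as `score` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Regression:
    # Hypothetical result type; the field names mirror the repr shown above
    intercept: float
    coefficient: float
    score: float

    def predict(self, *, x=None, y=None):
        """Return y for a given x, or x for a given y (pass exactly one)."""
        if (x is None) == (y is None):
            raise ValueError("pass exactly one of x or y")
        if x is not None:
            return self.intercept + self.coefficient * x
        # Invert the fitted line to get x from y
        return (y - self.intercept) / self.coefficient


def linear_regression_sketch(data, x, y):
    """Least-squares fit of data[y] against data[x].

    `data` can be anything indexable by column name, for example a
    pandas DataFrame or a dict of lists.
    """
    xs = [float(v) for v in data[x]]
    ys = [float(v) for v in data[y]]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical, no line can be fitted")
    coefficient = sxy / sxx
    intercept = mean_y - coefficient * mean_x
    # Assumption: score is the coefficient of determination (R squared)
    score = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return Regression(intercept, coefficient, score)
```

Because the sketch only indexes `data` by column name, it can be tried without pandas, for example with `linear_regression_sketch({"temperature": [0, 1, 2, 3], "signal": [1, 4, 7, 10]}, x="temperature", y="signal")`.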
## split_uniques.py
Splits a data frame on the unique values in one or more columns.
Returns a generator of tuples with at least two elements.
The _last_ element is the resulting partial data frame;
the element(s) before it are the values used to split up the original data.
```python
import pandas as pd
from split_uniques import split_uniques
df = pd.DataFrame({
    "A": [1, 2, 2],
    "B": [3, 4, 3],
    "C": ["x", "y", "z"]
})
result = list(split_uniques(df, ["B"]))
assert len(result) == 2
value, data = result[0]
assert value == 3
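# Rows 0 and 2 have B == 3; the index is reset in case the original one is kept
pd.testing.assert_frame_equal(data.reset_index(drop=True), pd.DataFrame({
    "A": [1, 2],
    "B": [3, 3],
    "C": ["x", "z"]
}))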
value, data = result[1]
assert value == 4
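# Only row 1 has B == 4; the index is reset in case the original one is kept
pd.testing.assert_frame_equal(data.reset_index(drop=True), pd.DataFrame({
    "A": [2],
    "B": [4],
    "C": ["y"]
}))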
```
Putting the data frame last might look a little bit weird, but it makes it easy
to unpack the result directly in a loop header:
```python
for well, probe, partial_data in split_uniques(full_data, ["Well", "Probe"]):
    ...
```
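The tuple layout described in this section can be reproduced with `DataFrame.groupby`. The following is a minimal sketch under that assumption, not the code in `split_uniques.py`; the name `split_uniques_sketch` is made up for illustration.

```python
import pandas as pd


def split_uniques_sketch(data, columns):
    """Yield (value_1, ..., value_n, partial_frame) per unique combination."""
    # groupby iterates over the unique value combinations in sorted order
    for key, group in data.groupby(list(columns)):
        # With a single grouping column the key is a scalar or a 1-tuple,
        # depending on the pandas version, so normalise it to a tuple
        if not isinstance(key, tuple):
            key = (key,)
        # Values first, partial data frame last
        yield (*key, group)


df = pd.DataFrame({
    "A": [1, 2, 2],
    "B": [3, 4, 3],
    "C": ["x", "y", "z"]
})
# Number of rows per unique value in column "B"
sizes = {value: len(part) for value, part in split_uniques_sketch(df, ["B"])}
```

Two details of this sketch are worth knowing: each partial data frame keeps the row index of the original, and `groupby` by default skips rows that have missing values in the grouping columns.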