|
arsenalgear-py
A library containing general purpose Python utils.
|
Functions | |
| def | AMS_score (x_cut, predictions, label_vectors, weights) |
| "AMS_score" function More... | |
| def | RemoveOutliers (array, max_deviations) |
| RemoveOutliers. More... | |
| def | RemoveOutliersDF (dataframe, max_deviations, show_progressb=True) |
| RemoveOutliersDF. More... | |
| def arsenalgear.datascience.AMS_score | ( | x_cut, | |
| predictions, | |||
| label_vectors, | |||
| weights | |||
| ) |
"AMS_score" function
Function used to compute the AMS score.
Args:
x_cut ( double ): s the cut parameter of the AMS. It ranges from 0.5 to 1 in steps of 0.1.
predictions ( numpy array ): is a binary array, defined from the set of data that we're considering.
label_vectors ( numpy array ): is a binary array constructed from the dataset, used for each model, that distinguishes an event between signal and background.
weights ( numpy array ): it takes the weights associated to each data of my dataset.
Returns:
double: returns the AMS score.
| def arsenalgear.datascience.RemoveOutliers | ( | array, | |
| max_deviations | |||
| ) |
RemoveOutliers.
Function used to remove outliers from an array.
Args:
array (numpy.array): the interested array.
max_deviations (int): the maximum number of std.
Returns:
numpy.array: the array without outliers.
Testing:
>>> RemoveOutliers( np.array([1, 1, 1, 1, 1, 1, 42, 1, 1]), 2 )
array([1, 1, 1, 1, 1, 1, 1, 1])
>>> RemoveOutliers( np.array([20220314062656, 20220314092546, 20220314092736, 20220314092928, 20220314093120, 20220314092407, 20220314092642, 20220314092831, 20220314093026, 20220314094935]), 2 )
array([20220314092546, 20220314092736, 20220314092928, 20220314093120,
20220314092407, 20220314092642, 20220314092831, 20220314093026,
20220314094935])
>>> RemoveOutliers( np.array([20220314085233, 20220314092407, 20220314092547, 20220314092643, 20220314092738, 20220314092832, 20220314092930, 20220314093026, 20220314093121, 20220314094315]), 2 )
array([20220314092407, 20220314092547, 20220314092643, 20220314092738,
20220314092832, 20220314092930, 20220314093026, 20220314093121,
20220314094315])
>>> RemoveOutliers( np.array([20220314063832, 20220314092412, 20220314092551, 20220314092647, 20220314092741, 20220314092836, 20220314092933, 20220314093031, 20220314093125, 20220314102110]), 1 )
array([20220314092412, 20220314092551, 20220314092647, 20220314092741,
20220314092836, 20220314092933, 20220314093031, 20220314093125])
| def arsenalgear.datascience.RemoveOutliersDF | ( | dataframe, | |
| max_deviations, | |||
show_progressb = True |
|||
| ) |
RemoveOutliersDF.
Function used to remove outliers from a particular dataframe.
Args:
dataframe (pandas.DataFrame): the pandas dataframe.
max_deviations (int): the maximum number of std.
show_progressb (boolean): set progressbar visualization to True / False.
Returns:
pandas.DataFrame: the modified dataframe.
Testing:
>>> df = pd.DataFrame()
>>> df = df.append( { "Channel": 0, "20220314063832": 0, "20220314092412": 0, "20220314092551": 0, "20220314092647": 0, "20220314092741": 0, "20220314092836": 0, "20220314092933": 0, "20220314093031": 0, "20220314093125": 0, "20220314102110": 0 }, ignore_index = True )
>>> df = RemoveOutliersDF( df, 1, show_progressb = False )
>>> df.columns
Index(['20220314092412', '20220314092551', '20220314092647', '20220314092741',
'20220314092836', '20220314092933', '20220314093031', '20220314093125',
'Channel'],
dtype='object')