arsenalgear-py
A library containing general purpose Python utils.
arsenalgear.datascience Namespace Reference

Functions

def AMS_score (x_cut, predictions, label_vectors, weights)
 Computes the AMS score.
 
def RemoveOutliers (array, max_deviations)
 Removes outliers from an array.
 
def RemoveOutliersDF (dataframe, max_deviations, show_progressb=True)
 Removes outliers from a dataframe.
 

Function Documentation

◆ AMS_score()

def arsenalgear.datascience.AMS_score (   x_cut,
  predictions,
  label_vectors,
  weights 
)

"AMS_score" function

Function used to compute the AMS score.

Args:
    x_cut ( double ): the cut parameter of the AMS. It ranges from 0.5 to 1 in steps of 0.1.
    predictions ( numpy array ): a binary array defined from the set of data under consideration.
    label_vectors ( numpy array ): a binary array constructed from the dataset, used for each model, that labels each event as signal or background.
    weights ( numpy array ): the weights associated with each event in the dataset.

Returns:
    double: the AMS score.
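
The documentation above does not state the formula. The following is a minimal sketch only, assuming AMS stands for the approximate median significance as commonly defined for signal/background classification, with a regularization term of 10. The name `ams_sketch`, the `b_reg` parameter, and the rule that an event is selected when its prediction exceeds `x_cut` are assumptions for illustration, not the library's confirmed behavior.

```python
import math


def ams_sketch(x_cut, predictions, label_vectors, weights, b_reg=10.0):
    """Hypothetical AMS sketch; not the library's implementation."""
    s = 0.0  # weighted sum of selected signal events (label 1)
    b = 0.0  # weighted sum of selected background events (label 0)
    for pred, label, weight in zip(predictions, label_vectors, weights):
        # Assumption: an event is selected when its prediction exceeds the cut.
        if pred > x_cut:
            if label == 1:
                s += weight
            else:
                b += weight
    # AMS = sqrt(2 * ((s + b + b_reg) * ln(1 + s / (b + b_reg)) - s))
    radicand = 2.0 * ((s + b + b_reg) * math.log(1.0 + s / (b + b_reg)) - s)
    return math.sqrt(radicand)


# Three events pass the cut: signal weights 1.0 and 4.0, background weight 2.0.
score = ams_sketch(0.5, [0.9, 0.8, 0.3, 0.7], [1, 0, 1, 1], [1.0, 2.0, 3.0, 4.0])
```

The sketch iterates with `zip`, so it accepts plain lists as well as NumPy arrays.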

◆ RemoveOutliers()

def arsenalgear.datascience.RemoveOutliers (   array,
  max_deviations 
)

RemoveOutliers.

Function used to remove outliers from an array.

Args:
    array (numpy.array): the input array.
    max_deviations (int): the maximum number of standard deviations allowed before a value is treated as an outlier.

Returns:
    numpy.array: the array without outliers.
    
Testing:
    >>> RemoveOutliers( np.array([1, 1, 1, 1, 1, 1, 42, 1, 1]), 2 )
    array([1, 1, 1, 1, 1, 1, 1, 1])
    >>> RemoveOutliers( np.array([20220314062656, 20220314092546, 20220314092736, 20220314092928, 20220314093120, 20220314092407, 20220314092642, 20220314092831, 20220314093026, 20220314094935]), 2 )
    array([20220314092546, 20220314092736, 20220314092928, 20220314093120,
           20220314092407, 20220314092642, 20220314092831, 20220314093026,
           20220314094935])
    >>> RemoveOutliers( np.array([20220314085233, 20220314092407, 20220314092547, 20220314092643, 20220314092738, 20220314092832, 20220314092930, 20220314093026, 20220314093121, 20220314094315]), 2 )
    array([20220314092407, 20220314092547, 20220314092643, 20220314092738,
           20220314092832, 20220314092930, 20220314093026, 20220314093121,
           20220314094315])
    >>> RemoveOutliers( np.array([20220314063832, 20220314092412, 20220314092551, 20220314092647, 20220314092741, 20220314092836, 20220314092933, 20220314093031, 20220314093125, 20220314102110]), 1 )
    array([20220314092412, 20220314092551, 20220314092647, 20220314092741,
           20220314092836, 20220314092933, 20220314093031, 20220314093125])
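
The examples above are consistent with a simple rule: keep only the values that lie strictly within `max_deviations` population standard deviations of the mean. The sketch below illustrates that rule using only the standard library. The name `remove_outliers_sketch` and the exact rule (mean-centered, population standard deviation, strict inequality) are assumptions inferred from the examples, not the library's confirmed implementation.

```python
import statistics


def remove_outliers_sketch(values, max_deviations):
    """Hypothetical sketch; the library's actual rule may differ."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    # Keep values strictly closer to the mean than max_deviations * std.
    return [v for v in values if abs(v - mean) < max_deviations * std]


# Same input as the first example above: only 42 is far enough from the mean.
kept = remove_outliers_sketch([1, 1, 1, 1, 1, 1, 42, 1, 1], 2)
```

Under this rule a single extreme value inflates the standard deviation itself, which is why a smaller `max_deviations` (as in the last example above) can be needed to remove outliers on both sides.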

◆ RemoveOutliersDF()

def arsenalgear.datascience.RemoveOutliersDF (   dataframe,
  max_deviations,
  show_progressb = True 
)

RemoveOutliersDF.

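Function used to remove outliers from a dataframe. In the example below, the columns whose numeric labels are outliers are dropped, while the non-numeric "Channel" column is kept.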

Args:
    dataframe (pandas.DataFrame): the input dataframe.
    max_deviations (int): the maximum number of standard deviations allowed before a value is treated as an outlier.
    show_progressb (boolean): whether to display a progress bar (default: True).

Returns:
    pandas.DataFrame: the modified dataframe.
    
Testing:
    >>> df = pd.DataFrame()
    >>> df = df.append( { "Channel": 0, "20220314063832": 0, "20220314092412": 0, "20220314092551": 0, "20220314092647": 0, "20220314092741": 0, "20220314092836": 0, "20220314092933": 0, "20220314093031": 0, "20220314093125": 0, "20220314102110": 0 }, ignore_index = True )
    >>> df = RemoveOutliersDF( df, 1, show_progressb = False )
    >>> df.columns
    Index(['20220314092412', '20220314092551', '20220314092647', '20220314092741',
           '20220314092836', '20220314092933', '20220314093031', '20220314093125',
           'Channel'],
          dtype='object')