gatohep.data_generation
- gatohep.data_generation.generate_toy_data_1D(n_signal=100000, n_bkg=100000, xs_signal=0.5, xs_bkg1=100, xs_bkg2=80, xs_bkg3=50, xs_bkg4=20, xs_bkg5=10, lumi=100.0, noise_scale=0.3, seed=None)
Generate 1D toy data for signal and background events.
- Parameters:
n_signal (int, optional) – Number of signal events to generate. Default is 100000.
n_bkg (int, optional) – Number of background events to generate. Default is 100000.
xs_signal (float, optional) – Cross-section for signal events. Default is 0.5.
xs_bkg1 (float, optional) – Cross-section for the first background component. Default is 100.
xs_bkg2 (float, optional) – Cross-section for the second background component. Default is 80.
xs_bkg3 (float, optional) – Cross-section for the third background component. Default is 50.
xs_bkg4 (float, optional) – Cross-section for the fourth background component. Default is 20.
xs_bkg5 (float, optional) – Cross-section for the fifth background component. Default is 10.
lumi (float, optional) – Luminosity for scaling event weights. Default is 100.
noise_scale (float, optional) – Scale of the noise applied to the generated data. Default is 0.3.
seed (int or None, optional) – Seed for the random number generator. Default is None.
- Returns:
A dictionary of DataFrames, each containing the generated toy data with columns “NN_output” and “weight”.
- Return type:
dict of pandas.DataFrame
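The cross-section and luminosity parameters above control the event weights. The sketch below illustrates one common convention, in which each generated event carries a weight of `xs * lumi / n_generated`, so that the weights of a sample sum to its expected yield `xs * lumi`. This convention is an assumption made for illustration; the documentation above does not state the formula the function actually uses, and the helper `per_event_weight` is hypothetical.

```python
def per_event_weight(xs, lumi, n_generated):
    """Weight that makes n_generated events sum to the yield xs * lumi.

    Assumed convention for illustration only; not taken from gatohep.
    """
    if n_generated <= 0:
        raise ValueError("n_generated must be positive")
    return xs * lumi / n_generated


# Documented defaults of generate_toy_data_1D.
lumi = 100.0
xs_signal, n_signal = 0.5, 100_000
xs_bkg = {"bkg1": 100, "bkg2": 80, "bkg3": 50, "bkg4": 20, "bkg5": 10}

# Signal: per-event weight and the yield obtained by summing all weights.
w_signal = per_event_weight(xs_signal, lumi, n_signal)
expected_signal_yield = w_signal * n_signal

# Backgrounds: under this convention the expected yield is xs * lumi,
# independent of how many events are generated for each component.
expected_bkg_yields = {name: xs * lumi for name, xs in xs_bkg.items()}

print(f"signal per-event weight: {w_signal:.6f}")
print(f"expected signal yield:   {expected_signal_yield:.1f}")
print(f"expected background sum: {sum(expected_bkg_yields.values()):.1f}")
```

Under this assumption, the number of generated events only sets the statistical precision of the toy sample, while the physical normalisation comes entirely from the cross-sections and the luminosity.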
- gatohep.data_generation.generate_toy_data_3class_3D(n_signal1=100000, n_signal2=100000, n_bkg=500000, xs_signal1=0.5, xs_signal2=0.1, xs_bkg1=100, xs_bkg2=80, xs_bkg3=50, xs_bkg4=20, xs_bkg5=10, lumi=100.0, noise_scale=0.3, seed=None)
Generate 3D Gaussian data for 2 signal and 5 background classes.
For each point, compute likelihood-ratio-based 3-class scores: [score_signal1, score_signal2, score_background].
- Parameters:
n_signal1 (int, optional) – Number of events for signal1. Default is 100000.
n_signal2 (int, optional) – Number of events for signal2. Default is 100000.
n_bkg (int, optional) – Total number of background events. Default is 500000.
xs_signal1 (float, optional) – Cross-section for signal1. Default is 0.5.
xs_signal2 (float, optional) – Cross-section for signal2. Default is 0.1.
xs_bkg1 (float, optional) – Cross-section for background1. Default is 100.
xs_bkg2 (float, optional) – Cross-section for background2. Default is 80.
xs_bkg3 (float, optional) – Cross-section for background3. Default is 50.
xs_bkg4 (float, optional) – Cross-section for background4. Default is 20.
xs_bkg5 (float, optional) – Cross-section for background5. Default is 10.
lumi (float, optional) – Luminosity for scaling event weights. Default is 100.0.
noise_scale (float, optional) – Scale of multiplicative noise applied to the data. Default is 0.3.
seed (int or None, optional) – Seed for the random number generator. Default is None.
- Returns:
A dictionary of DataFrames, each containing the generated toy data with columns:
  - ‘NN_output’: 3-vector of scores [score_signal1, score_signal2, score_background].
  - ‘weight’: Event weight.
- Return type:
dict of pandas.DataFrame
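To make the likelihood-ratio-based 3-class scores more concrete, here is a minimal sketch of one way such scores can be computed: each score is a class density divided by the sum of the three class densities, so the scores are non-negative and sum to one. The isotropic unit-width Gaussians, the class means, the equal class priors, and the helper names `gaussian_pdf_3d` and `three_class_scores` are all assumptions chosen for illustration; the documentation above does not specify the function's internals.

```python
import math


def gaussian_pdf_3d(x, mean, sigma=1.0):
    """Density of an isotropic 3D Gaussian with standard deviation sigma."""
    r2 = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    norm = (2.0 * math.pi * sigma**2) ** -1.5
    return norm * math.exp(-0.5 * r2 / sigma**2)


def three_class_scores(x, mean_sig1, mean_sig2, bkg_means, bkg_fractions):
    """Return [score_signal1, score_signal2, score_background] for point x.

    The background density is a mixture of the background components.
    Equal class priors are assumed (illustrative choice).
    """
    p1 = gaussian_pdf_3d(x, mean_sig1)
    p2 = gaussian_pdf_3d(x, mean_sig2)
    pb = sum(f * gaussian_pdf_3d(x, m) for f, m in zip(bkg_fractions, bkg_means))
    total = p1 + p2 + pb
    return [p1 / total, p2 / total, pb / total]


# Hypothetical class means; not taken from gatohep.
mean_sig1 = (2.0, 0.0, 0.0)
mean_sig2 = (0.0, 2.0, 0.0)
bkg_means = [
    (-1.0, -1.0, -1.0),
    (-1.5, 0.0, -0.5),
    (0.0, -1.5, -0.5),
    (-0.5, -0.5, -1.5),
    (-1.0, -1.0, 0.0),
]

# Mix the background components in proportion to the documented
# default cross-sections (xs_bkg1 ... xs_bkg5).
xs_bkg = [100, 80, 50, 20, 10]
bkg_fractions = [xs / sum(xs_bkg) for xs in xs_bkg]

# Score a point sitting exactly at the signal1 mean.
scores = three_class_scores(mean_sig1, mean_sig1, mean_sig2, bkg_means, bkg_fractions)
print([round(s, 4) for s in scores])
```

For a point at the signal1 mean, the first score dominates, which is the behaviour one would expect from a well-separated classifier output. Points far from every mean can drive all densities towards zero, so a production version would typically work with log-densities instead.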