OpTHyLiC
Online code documentation
Introduction
This page describes how to use the OpTHyLiC code. For other information, please consult
the home page.
Contents:
- Configuration:
- Text and LaTeX outputs:
- Pseudo-experiments generation and derived quantities:
- Access to histograms:
Instance creation and configuration
To create an instance of the OpTHyLiC class, the following code is needed:
- OpTHyLiC oth(syst , stat [, random [, seed [, comb]]]);
The first two arguments are mandatory while the last three are optional. The meaning of the
arguments is explained in the documentation.
syst is the type of systematics interpolation/extrapolation function, within the following list:
OTH::SystMclimit → same function as in McLimit
OTH::SystLinear → linear interpolation and extrapolation
OTH::SystExpo → exponential interpolation and extrapolation
OTH::SystPolyexpo → polynomial interpolation and exponential extrapolation (recommended)
stat is the type of statistical PDF generator, within the following list:
OTH::StatNormal → normal (Gaussian) distribution
OTH::StatLogN → log-normal distribution
OTH::StatGammaHyper → gamma function from hyperbolic prior (recommended)
OTH::StatGammaUni → gamma function from uniform prior
OTH::StatGammaJeffreys → gamma function from Jeffreys prior
random is the type of pseudo-random generator engine, within the following list:
OTH::TR3 → using the TRandom3 class provided by the ROOT software
OTH::STD_minstd_rand → provided by the <random> header of the C++11 standard library
OTH::STD_minstd_rand0 → provided by the <random> header of the C++11 standard library
OTH::STD_mt19937 → provided by the <random> header of the C++11 standard library
OTH::STD_mt19937_64 → provided by the <random> header of the C++11 standard library
OTH::STD_ranlux24_base → provided by the <random> header of the C++11 standard library
OTH::STD_ranlux48_base → provided by the <random> header of the C++11 standard library
OTH::STD_ranlux24 → provided by the <random> header of the C++11 standard library
OTH::STD_ranlux48 → provided by the <random> header of the C++11 standard library
OTH::STD_knuth_b → provided by the <random> header of the C++11 standard library
seed is the generator seed to be used.
comb is the type of systematics combination, within the following list:
OTH::CombAdditive → addition of systematics
OTH::CombMultiplicative → multiplication of systematics
OTH::CombAutomatic → automatic choice, depending on the type of systematics interpolation/extrapolation
Example:
- OpTHyLiC oth(OTH::SystPolyexpo,OTH::StatGammaHyper,OTH::STD_mt19937);
Here the seed and comb arguments are omitted, so the generator seed is 0 (i.e. a random seed is set) and the
systematics combination is automatic.
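To give an intuition for the syst and comb options, the sketch below shows common textbook forms of a linear and an exponential interpolation/extrapolation, and of an additive and a multiplicative combination. It is only an illustration under stated assumptions: the function names are hypothetical and the exact formulas used inside OpTHyLiC are those given in the documentation, not necessarily these.

```cpp
#include <cmath>
#include <vector>

// Relative yield factor for one systematic at nuisance value eta
// (in units of sigma). up and down are the relative variations of the
// yield at +1 sigma and -1 sigma, e.g. 0.15 and -0.13.

// Assumed linear form: a straight line on each side of eta = 0.
double linearFactor(double eta, double up, double down) {
    return eta >= 0.0 ? 1.0 + eta * up : 1.0 - eta * down;
}

// Assumed exponential form: stays positive as long as up and down
// are both greater than -1.
double expoFactor(double eta, double up, double down) {
    return eta >= 0.0 ? std::pow(1.0 + up, eta) : std::pow(1.0 + down, -eta);
}

// Assumed additive combination: the relative shifts are summed.
double combineAdditive(const std::vector<double>& factors) {
    double total = 1.0;
    for (double f : factors) total += f - 1.0;
    return total;
}

// Assumed multiplicative combination: the factors are multiplied.
double combineMultiplicative(const std::vector<double>& factors) {
    double total = 1.0;
    for (double f : factors) total *= f;
    return total;
}
```

With up=0.15 and down=-0.13, both forms give a factor of 1.15 at +1σ and 0.87 at -1σ; they differ only away from these two points.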
Confidence level setting
By default, the confidence level is set to 95%. The following method allows setting it to a different
value:
Example:
to set the confidence level to 99%.
It is possible to retrieve the current value with the command
- double cl=oth.getConfLevel();
Single channel description and input file syntax
The following method allows adding a new single-bin channel:
- unsigned int idCh=oth.addChannel(name , file);
and returns an identifier for this new channel. The two arguments are strings:
name is the internal name of the channel,
file is the name of the file that contains the description of the channel. Example:
- oth.addChannel("Channel1","channel1.dat");
The input file is a simple text file, complying with the following rules:
- each line must contain a single keyword
- blank lines and lines starting with # are ignored
- the keywords and their arguments are:
+nameLaTeX <latex text> → sets the LaTeX name of the channel (optional)
+bg <name> <yield> <err> → adds a new background sample with the given name (no space inside), yield and statistical uncertainty
+sig <name> <yield> <err> → defines the signal sample with the given name (no space inside), yield and statistical uncertainty
.nameLaTeX <latex text> → sets the LaTeX name of the current sample (optional)
.syst <name> <up> <down> → adds a new systematic uncertainty to the current sample, with its name and the up and down relative variations of the yield when the source of this systematic uncertainty is varied by +1σ and -1σ
+data <yield> → defines the observed number of events in the data
Example of input file:
- # example of channel
- +nameLaTeX 1$^{st}$ channel
-
- +bg sample1 25.8 7.3
- .nameLaTeX First sample
- .syst syst1 0.15 -0.13
-
- +sig signal 3.7 0.3
- .syst syst1 0.02 -0.02
- .syst syst2 0.1 -0.1
-
-
- +data 27
where two different systematics have been defined (syst1 and syst2). The first (syst1) affects both the
background and the signal with 100% correlation, while the second (syst2) does not affect the background
and has an effect of ±10% on the signal.
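The syntax rules above can be illustrated with a small stand-alone reader. This is a minimal sketch, not the parser used by OpTHyLiC: the Entry type and the parseChannelText function are hypothetical names, and the sketch only splits each line into a keyword and its whitespace-separated arguments without interpreting them.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One non-empty, non-comment line of a channel description.
struct Entry {
    std::string keyword;            // e.g. "+bg", ".syst", "+data"
    std::vector<std::string> args;  // whitespace-separated arguments
};

// Splits the text of a channel file into entries.
// Blank lines are skipped, as are lines whose first non-blank
// character is '#'. A free-text argument such as the one of
// +nameLaTeX ends up split into several args and would have to be
// re-joined by the caller.
std::vector<Entry> parseChannelText(const std::string& text) {
    std::vector<Entry> entries;
    std::istringstream input(text);
    std::string line;
    while (std::getline(input, line)) {
        std::istringstream fields(line);
        Entry entry;
        if (!(fields >> entry.keyword)) continue;  // blank line
        if (entry.keyword[0] == '#') continue;     // comment line
        std::string arg;
        while (fields >> arg) entry.args.push_back(arg);
        entries.push_back(entry);
    }
    return entries;
}
```

Applied to the example file above, such a reader would yield one entry per keyword line, for instance the keyword +bg with the three arguments sample1, 25.8 and 7.3.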
Single channel description without input file
The main way to add a single-bin channel is the input file method.
However, it is also possible to describe a channel directly in the code, without any input file:
- unsigned int idCh=oth.addChannel(name);
- unsigned int idS1=oth.getChannel(idCh)->addBkgSample(name , yield , err);
- oth.getChannel(idCh)->addBkgSystematics(idS1 , systname , up , down);
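and with similar methods for the signal:
- oth.getChannel(idCh)->setSigSample(name , yield , err);
- oth.getChannel(idCh)->addSigSystematics(systname , up , down);
and for the data:
- oth.getChannel(idCh)->setYieldData(yield);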
Example:
- unsigned int idCh=oth.addChannel("Channel1");
- unsigned int idS1=oth.getChannel(idCh)->addBkgSample("sample1",25.8,7.3);
- oth.getChannel(idCh)->addBkgSystematics(idS1,"syst1",0.15,-0.13);
- oth.getChannel(idCh)->setSigSample("signal",3.7,0.3);
- oth.getChannel(idCh)->addSigSystematics("syst1",0.02,-0.02);
- oth.getChannel(idCh)->addSigSystematics("syst2",0.1,-0.1);
- oth.getChannel(idCh)->setYieldData(27);
Multiple channel description and input file syntax for discriminant variables distributions
When using distributions of discriminant variables as inputs, each bin of the distribution is
considered as a single-bin channel.
The following method allows adding a new channel, using histograms as input:
- unsigned int idCh=oth.addChannel(name , file [, clean]);
and returns an identifier for the last bin of this new channel.
The first two arguments are strings and are mandatory:
name is the internal name of the channel (each bin will correspond to a channel whose name is name followed by the bin index),
file is the name of the file that contains the description of the channel.
The last argument is an optional boolean. When it is not specified, the default value (true) is taken
and all the intermediate files are removed. When it is set to false, the files are kept and may be used for debugging or
to run on a subset of the bins.
Example:
- oth.addChannel("Channel1","channel1.dat");
The input file is a simple text file, complying with the following rules:
- each line must contain a single keyword
- blank lines and lines starting with # are ignored
- the keywords and their arguments are:
+nameLaTeX <latex text> → sets the LaTeX name of the channel (optional)
+bg <name> <root-file>(<hname>) → adds a new background sample with the given name (no space inside), from the histogram whose name is <hname> in the file <root-file>
+sig <name> <root-file>(<hname>) → defines the signal sample with the given name (no space inside), from the histogram whose name is <hname> in the file <root-file>
.nameLaTeX <latex text> → sets the LaTeX name of the current sample (optional)
.syst <name> <root-file-up>(<hname>) <root-file-down>(<hname>) → adds a new systematic uncertainty to the current sample, with its name and the up and down absolute yields when the source of this systematic uncertainty is varied by +1σ and -1σ
.syst <name> <up> <down> → adds a new systematic uncertainty to the current sample, with its name and the up and down relative variations of the yield when the source of this systematic uncertainty is varied by +1σ and -1σ (in this case all bins are assumed to have the same relative variation)
+data <root-file>(<hname>) → defines the observed number of events in the data
- prior to the above lines, an optional setup block can be defined:
+setup → starts the setup block
.directory <prefix> → defines a default prefix to be added in front of all <root-file*>
.histoName <hname> → defines the default histogram name to be used for all samples; in this case (<hname>) can be removed from the +bg, +sig, +data and .syst lines
Example of input file:
- # example of channel
- +nameLaTeX 1$^{st}$ channel
-
- +bg sample1 ../histos/chan1/sample1.root(variable1)
- .nameLaTeX First sample
- .syst syst1 ../histos/chan1/sample1_syst1.root(variable1up) ../histos/chan1/sample1_syst1.root(variable1down)
-
- +sig signal ../histos/chan1/signal.root(variable1)
- .syst syst1 ../histos/chan1/signal_syst1up.root(variable1) ../histos/chan1/signal_syst1down.root(variable1)
- .syst syst2 0.1 -0.1
-
-
- +data ../histos/chan1/data.root(variable1)
Another example of input file:
- # example of channel
- +nameLaTeX 2$^{nd}$ channel
-
- +setup
- .directory ../histos/chan2
- .histoName variable1
-
- +bg sample1 sample1.root
- .nameLaTeX First sample
- .syst syst1 sample1_syst1up.root sample1_syst1down.root
-
- +sig signal signal.root
- .syst syst1 signal_syst1up.root signal_syst1down.root
- .syst syst2 0.1 -0.1
-
-
- +data data.root
Access to a single channel
Any channel can be accessed directly, using one of the two following commands:
- OTH::Channel *pChannel=oth.getChannel(idCh);
- OTH::Channel *pChannel=oth.getChannel(name);
where idCh is the channel identifier returned by the addChannel method and name is
the name of the channel. Example:
- OTH::Channel *pChannel=oth.getChannel("Channel1");
Yields and uncertainties printout
It is possible to print on the cout stream the list of the channels
and samples together with their yields and uncertainties with the command
There is also the possibility to print the yields and uncertainties for a single channel:
LaTeX table creation for the input yields
It is possible to create LaTeX tables containing the input yields as well as the
total uncertainties with the command
- oth.createInputYieldTable(out [, prec]);
The first argument is mandatory while the second one is optional.
out is the output stream to which the tables will be written.
prec is the number of digits to be printed after the decimal point. If not
specified, the default value (-1) is taken, meaning that the precision will be automatically
determined for each yield, using the magnitude of the uncertainties.
Example:
- std::ofstream file("tables.tex");
- oth.createInputYieldTable(file,2);
LaTeX table creation for the systematic uncertainties
It is possible to create LaTeX tables containing the details of the
input systematic uncertainties with the command
- oth.createSysteTables(out , dict [, prec]);
The first two arguments are mandatory while the last one is optional.
out is the output stream to which the tables will be written.
dict is the name of a file containing a dictionary for LaTeX names (see below
for the syntax).
prec is the number of digits to be printed after the decimal point. If not
specified, the default value (2) is taken. The value -1 means that the precision will be
automatically determined for each yield, using the magnitude of the uncertainties.
Example:
- std::ofstream file("tables.tex");
- oth.createSysteTables(file,"dict.txt",1);
The dictionary file is a simple text file, complying with the following rules:
- each line must contain a single dictionary entry, with the following syntax:
<name> <latex text> → the systematic uncertainty named <name> is replaced by the LaTeX name <latex text>
- blank lines and lines starting with # are ignored
Example of dictionary file:
- syst1 Uncertainty on energy
- syst2 Signal rate uncertainty
LaTeX table creation for the generated yields
It is possible to generate distributions of yields from pseudo-experiments and then
create LaTeX tables containing the median yields as well as the
±1σ ranges with the command
- oth.createGeneratedYieldTable(out [, prec [, nbexp]]);
The first argument is mandatory while the last two are optional.
out is the output stream to which the tables will be written.
prec is the number of digits to be printed after the decimal point. If not
specified, the default value (-1) is taken, meaning that the precision will be automatically
determined for each yield, using the magnitude of the uncertainties.
nbexp is the number of pseudo-experiments generated to compute the values. If
not specified, the default value (10^6) is taken.
Example:
- std::ofstream file("tables.tex");
- oth.createGeneratedYieldTable(file,2);
Generation of expected yields
It is possible to generate distributions of expected yields from pseudo-experiments,
taking into account all variations due to statistical and systematic uncertainties
but not applying Poisson variations, with the command
nbexp
is the number of pseudo-experiments generated to compute the
distributions.
Example:
There exists another command that automatically calls the generateDistrYield method
for all channels and creates a LaTeX table from the result.
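The idea of generating expected yields without Poisson variations can be pictured with a small stand-alone toy. This sketch is not the OpTHyLiC implementation: toyExpectedYields is a hypothetical name, and it deliberately uses simple stand-ins, namely a Gaussian statistical fluctuation and a linear interpolation of a single systematic, rather than the recommended gamma and polynomial/exponential options.

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Generates nbexp toy expected yields for one sample.
// yield and statErr are the nominal yield and its statistical
// uncertainty; up and down are the relative variations of one
// systematic at +1 sigma and -1 sigma. No Poisson fluctuation is
// applied. Assumptions: Gaussian statistics, linear systematics.
std::vector<double> toyExpectedYields(double yield, double statErr,
                                      double up, double down,
                                      unsigned int nbexp, unsigned int seed) {
    std::mt19937 engine(seed);
    std::normal_distribution<double> gauss(0.0, 1.0);
    std::vector<double> yields;
    yields.reserve(nbexp);
    for (unsigned int i = 0; i < nbexp; ++i) {
        double fluctuated = yield + statErr * gauss(engine);  // statistical part
        double eta = gauss(engine);                           // systematic nuisance value
        double factor = eta >= 0.0 ? 1.0 + eta * up : 1.0 - eta * down;
        yields.push_back(std::max(0.0, fluctuated * factor)); // yields cannot be negative
    }
    return yields;
}
```

Calling toyExpectedYields(25.8, 7.3, 0.15, -0.13, 100000, 12345) would produce a distribution centred close to the nominal yield of sample1 from the examples above; the same seed reproduces the same sequence with a given standard library.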
Scan of CLs as a function of the signal strength
It is possible to compute the CLs value for several signal strengths, with the commands
- oth.scanCLsVsMu(min , max , steps , nbexp , type);
- TGraph *pG=oth.getCLsVsMu();
min and max are the first and last values of the signal strength to be tested.
steps is the total number of signal strengths to be tested.
nbexp is the number of pseudo-experiments generated to compute each CLs.
type is the type of CLs to be computed, within the following list:
OTH::LimObserved → using observed test-statistic value
OTH::LimExpectedMed → using median expected test-statistic value
OTH::LimExpectedM1sig → using median-1σ expected test-statistic value
OTH::LimExpectedP1sig → using median+1σ expected test-statistic value
OTH::LimExpectedM2sig → using median-2σ expected test-statistic value
OTH::LimExpectedP2sig → using median+2σ expected test-statistic value
Example:
- oth.scanCLsVsMu(0.5,3,6,100000,OTH::LimObserved);
- oth.getCLsVsMu()->Draw("alp");
When using several channels, it is also possible to do the scan for a single one, not
combining with the other channels, with the commands
where the arguments are the same as above.
Computation of observed and expected limits
The computation of observed or expected limits is performed by the command
- double cls;
- double limit=oth.sigStrengthExclusion(type , nbexp , cls [, mu [, method]]);
that returns the signal strength excluded at the given confidence level. The first three
arguments are mandatory while the last two are optional.
type is the type of limit to be computed, within the following list:
OTH::LimObserved → observed limit
OTH::LimExpectedMed → median expected limit
OTH::LimExpectedM1sig → median-1σ expected limit
OTH::LimExpectedP1sig → median+1σ expected limit
OTH::LimExpectedM2sig → median-2σ expected limit
OTH::LimExpectedP2sig → median+2σ expected limit
nbexp is the number of pseudo-experiments to be generated to compute each CLs.
cls is a variable that receives the value of the final CLs.
mu is a reasonable starting value for the result. It has no effect on the final computed value but may
speed up the process. If not specified, the default value (1) is taken.
method is the type of method to be used for the computation, within the following list:
OTH::MethDichotomy → computation using dichotomy
OTH::MethExtrapol → computation using an extrapolation (the result may not
be very accurate but the computation is faster)
If not specified, the default value (OTH::MethDichotomy) is taken.
Example:
- double cls;
- double limit=oth.sigStrengthExclusion(OTH::LimExpectedMed,100000,cls,3.5);
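The principle of a limit computation by dichotomy can be illustrated on a toy case. The sketch below is an assumption-laden illustration and not the OpTHyLiC algorithm: it treats a single counting channel without any uncertainty, uses the number of events as test statistic so that CLs has a closed form, and all function names are hypothetical. It requires a strictly positive signal yield.

```cpp
#include <cmath>

// Poisson cumulative probability P(n <= nObs | mean).
double poissonCdf(int nObs, double mean) {
    double term = std::exp(-mean);  // k = 0 term
    double sum = term;
    for (int k = 1; k <= nObs; ++k) {
        term *= mean / k;           // builds mean^k / k! iteratively
        sum += term;
    }
    return sum;
}

// Toy CLs for one counting channel without uncertainties:
// CLs(mu) = P(n <= nObs | mu*sig + bkg) / P(n <= nObs | bkg).
double toyCLs(double mu, double sig, double bkg, int nObs) {
    return poissonCdf(nObs, mu * sig + bkg) / poissonCdf(nObs, bkg);
}

// Dichotomy (bisection) on mu until toyCLs(mu) = 1 - confLevel.
// Relies on toyCLs decreasing with mu, which holds for sig > 0.
double toyLimit(double sig, double bkg, int nObs, double confLevel = 0.95) {
    const double target = 1.0 - confLevel;
    double lo = 0.0;
    double hi = 1.0;
    while (toyCLs(hi, sig, bkg, nObs) > target) hi *= 2.0;  // bracket the solution
    for (int i = 0; i < 60; ++i) {
        double mid = 0.5 * (lo + hi);
        if (toyCLs(mid, sig, bkg, nObs) > target) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}
```

In the real computation each CLs evaluation requires generating nbexp pseudo-experiments, which is why a good starting value mu and the number of dichotomy steps matter for the running time.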
When using several channels, it is also possible to compute a limit for a single one, not
combining with the other channels, with the command
- double limit=oth.getChannel(idCh)->sigStrengthExclusion(type , nbexp , cls [, mu [, method]]);
where the arguments are the same as in the combined limit computation above.
Alternative computation of expected limits
The alternative method described in the documentation to compute
all expected limits (median, ±1σ, ±2σ) from the distribution of the
expected signal strengths is performed by the command
- double median=oth.expectedSigStrengthExclusion(nbmu , nbexp);
that returns the median expected limit for a given confidence level.
The other quantiles (±1σ, ±2σ) are printed but not returned.
nbmu is the number of entries in the expected signal strengths distribution.
nbexp is the number of pseudo-experiments generated to compute each CLs.
Example:
- double median=oth.expectedSigStrengthExclusion(100000,100000);
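Extracting the median and the ±1σ, ±2σ values from a distribution of signal strengths amounts to reading off quantiles. The sketch below shows one way of doing this on a plain vector; it is only an illustration, with hypothetical names, a nearest-rank quantile definition chosen for simplicity, and the usual Gaussian cumulative probabilities as quantile levels. How OpTHyLiC itself evaluates the quantiles is not specified here.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Nearest-rank quantile: smallest sample value such that at least a
// fraction p of the sample is less than or equal to it.
// Returns NaN for an empty sample.
double quantile(std::vector<double> sample, double p) {
    if (sample.empty()) return std::nan("");
    std::sort(sample.begin(), sample.end());
    std::size_t rank = static_cast<std::size_t>(std::ceil(p * sample.size()));
    if (rank < 1) rank = 1;
    if (rank > sample.size()) rank = sample.size();
    return sample[rank - 1];
}

// Quantiles of the signal strength distribution, labelled by level.
struct ExpectedBand {
    double m2sig, m1sig, median, p1sig, p2sig;
};

ExpectedBand expectedBand(const std::vector<double>& mu) {
    // Gaussian cumulative probabilities at -2, -1, 0, +1 and +2 sigma.
    return { quantile(mu, 0.02275), quantile(mu, 0.15866), quantile(mu, 0.5),
             quantile(mu, 0.84134), quantile(mu, 0.97725) };
}
```

The accuracy of the outer quantiles depends on the number of entries, which is why nbmu should be large when the ±2σ values are needed.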
When using several channels, it is also possible to compute the limits for a single one, not
combining with the other channels, with the command
where the arguments are the same as in the combined limit computation above.
Significance computation
The computation of the significance of an observation can be performed by the command
- std::pair<double,double> s=oth.significance(type , nbexp [, mu]);
that returns the p-value (s.first) and the z-value (s.second) of the observation.
The first two arguments are mandatory while the last one is optional.
type is the type of significance to be computed, within the following list:
OTH::SignifObserved → observed significance
OTH::SignifExpectedMed → median expected significance
OTH::SignifExpectedM1sig → median-1σ expected significance
OTH::SignifExpectedP1sig → median+1σ expected significance
OTH::SignifExpectedM2sig → median-2σ expected significance
OTH::SignifExpectedP2sig → median+2σ expected significance
nbexp is the number of pseudo-experiments generated to compute each test-statistic distribution.
mu is the signal strength to be used for the computation. If not specified, the default
value (1) is taken.
Example:
- std::pair<double,double> s=oth.significance(OTH::SignifExpectedMed,100000);
- double pval=s.first;
- double zval=s.second;
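The p-value and the z-value carry the same information. The sketch below shows the usual one-sided Gaussian relation between them, p = 1 - Φ(z); that OpTHyLiC uses exactly this convention is an assumption here, and the function names are hypothetical. Since the C++ standard library has no inverse error function, the inversion is done by bisection.

```cpp
#include <cmath>

// One-sided Gaussian tail probability for a given z-value:
// p = 1 - Phi(z) = erfc(z / sqrt(2)) / 2.
double pFromZ(double z) {
    return 0.5 * std::erfc(z / std::sqrt(2.0));
}

// Inverse relation, obtained by bisection on z.
// pFromZ decreases with z, so a p-value above the target means
// that z must be increased.
double zFromP(double p) {
    double lo = -40.0;
    double hi = 40.0;
    for (int i = 0; i < 200; ++i) {
        double mid = 0.5 * (lo + hi);
        if (pFromZ(mid) > p) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}
```

With this convention a z-value of 5 corresponds to a p-value of about 2.87×10^-7, so a very large number of pseudo-experiments is needed to resolve such small p-values directly.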
When using several channels, it is also possible to compute the significance for a single one, not
combining with the other channels, with the command
where the arguments are the same as in the combined significance computation above.
Generation of test-statistic distributions
It is possible to generate the test-statistic distributions under the background only and
signal plus background hypotheses with the commands
- oth.setSigStrength(mu);
- oth.generateDistrLLR(nbexp);
mu is the given signal strength for the signal plus background hypothesis.
nbexp is the number of pseudo-experiments generated to compute each distribution.
Example:
- oth.setSigStrength(1.5);
- oth.generateDistrLLR(1000000);
When using several channels, it is also possible to generate the distributions for a single one,
not combining with the other channels, with the command
Computation of p-value, CLs and test-statistic value
It is possible to compute the p-value, the CLs value and the test-statistic value
for the observed data and a given signal strength with the commands
- double pval=oth.pValueData();
- double cls=oth.computeCLsData();
- double llr=oth.computeLLRdata();
In order to compute these values, the test-statistic distributions must
already have been generated, either by generating them explicitly or
after a limit computation.
When using several channels, it is also possible to compute these values for a single one,
not combining with the other channels, with the commands
obs
is the number of observed events in the data.
Computation of the excluded yield for a single channel
It is possible to compute the yield that is excluded for a given channel and a given
signal strength with the command
In order to compute this value, the test-statistic distributions must
already have been generated, either by generating them explicitly or
after a limit computation.
Test-statistic distributions
It is possible to access the generated test-statistic distribution for the background only or
the signal plus background hypothesis with the commands
- TH1 *pH=oth.getHistoLLRb();
- TH1 *pH=oth.getHistoLLRsb();
In order to access these histograms, the test-statistic distributions must
already have been generated, either by generating them explicitly or
after a limit computation.
Example:
- oth.getHistoLLRb()->Draw();
- oth.getHistoLLRsb()->Draw("same");
When using several channels, it is also possible to access the test-statistic distribution
for a single one with the commands
Distributions of observed number of events in pseudo-data
It is possible to access the generated distribution of the observed number of events in pseudo-data for a given channel,
under the background only or the signal plus background hypothesis, with the command
The type of distribution must be within the following list:
OTH::Channel::hDistrBg → background only hypothesis
OTH::Channel::hDistrSB → signal plus background hypothesis
In order to access these histograms, the test-statistic distributions must
already have been generated, either by generating them explicitly or
after a limit computation.
Example:
Distributions of systematic uncertainties
It is possible to access the generated distribution of a given systematic uncertainty for a given channel
and the signal sample or a given background sample with the commands
syst is the name of the systematic uncertainty,
sample is the name of the background sample.
In order to access these histograms, the distributions must
already have been generated, either by generating a test-statistic distribution,
after a limit computation, or after a yield generation.
Example:
Distributions of expected yields
It is possible to access the generated distributions of the expected yields for a given channel and a given sample
(not taking into account the Poisson fluctuations), for the signal, a given background and the total background
with the commands
name is the name of the requested background sample.
In order to access these histograms, the distributions must already have been
generated.
Example:
Distribution of signal strengths
It is possible to access the distribution of excluded signal strengths from which the
alternative computation of expected limits is performed with the commands
for a combined limit or a single-channel limit.
Excluded signal strength evolution
After an alternative computation of expected limits for a single channel,
it is possible to access the evolution of the excluded signal strength as a function of the number of
observed events with the command
Example:
Last change: 2015-07-22