
Commit 93cec25

Merge pull request #1 from columnflow/chapters/exercise
Add additional chapters + exercise environment
2 parents 4bd944b + 99a5fb9 commit 93cec25

10 files changed

+271
-21
lines changed

Diff for: chapters/advanced.tex

+6
@@ -0,0 +1,6 @@
1+
\chapter{Advanced Topics}\label{chap:advanced}
2+
3+
\include{sections/categorizer}
4+
\include{sections/shifts}
5+
\include{sections/event_weights}
6+
\include{sections/inference}

Diff for: chapters/exercise.tex

+5-2
@@ -1,6 +1,9 @@
1+
\chapter{Basic Functionalities}\label{chap:basics}
2+
3+
This chapter illustrates how to employ the most basic features of \columnflow. By the end of it, you should be able to perform a calibration, apply a selection, calculate an observable, and produce the corresponding distribution for multiple processes. Note that this chapter only summarizes the most important aspects of these features. For a more in-depth discussion, please refer to Ref.~\cite{cf_repo}.
4+
5+
\include{sections/configs}
16
\include{sections/taskarrayfunctions}
27
\include{sections/calibrator}
38
\include{sections/selector}
49
\include{sections/producer}
5-
\include{sections/categorizer}
6-
\include{sections/inference}

Diff for: main.tex

+65-4
@@ -31,7 +31,9 @@
3131
\usepackage{tabu}
3232
\usepackage{tabularray}
3333
\usepackage[most]{tcolorbox}
34-
34+
\usepackage{tocloft} % for lists for custom environments
35+
\usepackage{xparse}
36+
\usepackage{quotes}
3537

3638
\NewTblrTheme{longtable1}{
3739
\DefTblrTemplate{conthead-text}{fancy}{}
@@ -75,6 +77,7 @@
7577
% Following line generates undefined control sequence: section for me, so comment it out for now
7678
%\renewcommand{\sectionmark}[1]{\markright{{\sectionname\ \thesection.\ #1}}{}}
7779

80+
7881
\newcommand*\NewPage{\newpage\null\thispagestyle{empty}\newpage}
7982

8083
\setlength{\parskip}{0.6em}
@@ -181,6 +184,57 @@
181184
\definecolor{codegray}{gray}{0.9}
182185
\newcommand{\code}[1]{\colorbox{codegray}{\texttt{#1}}}
183186

187+
% define style for exercise environment
188+
\tcbuselibrary{skins, breakable}
189+
190+
\tcbset{
191+
myboxstyle/.style={
192+
breakable,
193+
colback=white, % Background color for the content
194+
colframe=black, % Frame color
195+
colbacktitle=gray!15, % Background color for the title
196+
coltitle=codeLimeGreen, % Text color for the title
197+
fonttitle=\bfseries, %
198+
title={#1}, % The title content will be passed as an argument
199+
enhanced,
200+
attach boxed title to top left={yshift=-2mm, xshift=2mm},
201+
boxed title style={
202+
colframe=black,
203+
arc=1mm,
204+
outer arc=1.5mm,
205+
boxrule=0.5mm, % Border thickness for the title
206+
},
207+
overlay={},
208+
}
209+
}
210+
211+
% create a new list for the exercises, with its own counter
212+
\newlistof[chapter]{xrcise}{ex}{List of Exercises}
213+
214+
% Define the 'exercise' environment to use the custom style
215+
216+
\NewDocumentEnvironment{exercise}{mo}{
217+
% #1: Title for Exercise
218+
% #2: Solution
219+
\refstepcounter{xrcise}
220+
\par
221+
\begin{tcolorbox}[myboxstyle={Exercise \thexrcise: #1}]
222+
}{
223+
\ExplSyntaxOn
224+
\IfNoValueTF{#2}
225+
{}
226+
{\newline The~solution~can~be~found~in~\code{#2}}%
227+
\ignorespaces
228+
\ExplSyntaxOff
229+
\end{tcolorbox}
230+
\addcontentsline{ex}{xrcise}{Exercise \protect\numberline{\thexrcise}: \protect#1}
231+
\par
232+
}
233+
234+
% also create an automatically-generated list for the exercises
235+
236+
237+
184238
\input{style_declarations}
185239

186240
% define which files to consider for compilation.
@@ -192,17 +246,20 @@
192246
sections/goal,
193247
sections/setup,
194248
sections/strategy,
249+
sections/configs,
195250
sections/taskarrayfunctions,
196251
sections/calibrator,
197252
sections/selector,
198253
sections/producer,
199254
sections/categorizer,
255+
sections/shifts,
256+
sections/event_weights,
200257
sections/inference,
201258
}
202259
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
203260

204261
\begin{document}
205-
\pagenumbering{roman}#
262+
\pagenumbering{roman}
206263

207264
\begin{titlepage}
208265
\frontmatter
@@ -235,15 +292,19 @@
235292
\tableofcontents
236293
%\listoffigures
237294
%\listoftables
295+
\listofxrcise
238296

239297
\mainmatter
240298
\thispagestyle{empty}
241299

242300
\pagestyle{fancy}
243301
\captionsetup{justification=raggedright,singlelinecheck=false}
244302

245-
\input{chapters/intro}
246-
\input{chapters/exercise}
303+
304+
\input{chapters/intro}
305+
\input{chapters/exercise}
306+
\input{chapters/advanced}
307+
247308
\appendix
248309
\printbibliography
249310

Diff for: sections/calibrator.tex

+84-1
Original file line numberDiff line numberDiff line change
@@ -1 +1,84 @@
1-
\section{Writing a Calibrator}\label{sec:calibrator}
1+
\section{Writing a Calibrator}\label{sec:calibrator}
2+
3+
\CCSPStlye{Calibrator}s are dedicated \CCSPStlye{TaskArrayFunction}s that perform a calibration of objects, such as jets, leptons or missing transverse energy.
4+
Since this calibration modifies the four-momenta in the events, it can influence the selection of a given analysis.
5+
Therefore, calibrations should generally be performed before applying analysis selections.
6+
The associated task within the workflow is \CCSPStlye{cf.CalibrateEvents}, which is executed before the selection modules within the task graph.
7+
8+
\columnflow provides commonly used calibrations for different objects, which follow the common (CMS) guidelines.
9+
For more information about which calibrations are implemented and how to use them, we recommend consulting the current documentation~\cite{cf_repo}.
10+
11+
The \code{H4L} analysis includes an exemplary \CCSPStlye{Calibrator} in \code{h4l/calibration/jets.py}.
12+
In this module, we perform a calibration of jets in our events, which is based on the implementation that comes with \columnflow itself.
13+
14+
% TODO: include code here?
15+
%\begin{lstlisting}[language=python]
16+
% # coding: utf-8
17+
%
18+
% """
19+
% Jet energy calibration methods.
20+
% """
21+
%
22+
% from columnflow.calibration import Calibrator, calibrator
23+
% from columnflow.calibration.cms.jets import jec, jer
24+
% from columnflow.util import maybe_import
25+
%
26+
% ak = maybe_import("awkward")
27+
%
28+
%
29+
% # custom jec calibrator that only runs nominal correction
30+
% jec_nominal = jec.derive("jec_nominal", cls_dict={"uncertainty_sources": []})
31+
%
32+
%
33+
% @calibrator(
34+
% uses={jec_nominal},
35+
% produces={jec_nominal},
36+
% )
37+
% def jet_energy(self: Calibrator, events: ak.Array, **kwargs) -> ak.Array:
38+
% """
39+
% Common calibrator for jet energy corrections, applying nominal JEC for data, and JEC with
40+
% uncertainties plus JER for MC. Information about used and produced columns and dependent
41+
% calibrators is added in a custom init function below.
42+
% """
43+
% # correct jet energy scale
44+
% events = self[jec_nominal](events, **kwargs)
45+
%
46+
% # jet energy resolution smearing (MC only)
47+
% if self.dataset_inst.is_mc:
48+
% events = self[jer](events, **kwargs)
49+
%
50+
% return events
51+
%
52+
%
53+
% @jet_energy.init
54+
% def jet_energy_init(self: Calibrator) -> None:
55+
% # return immediately if dataset object has not been loaded yet
56+
% if not getattr(self, "dataset_inst", None):
57+
% return
58+
%
59+
% # add columns produced by the JER smearing calibrator (MC only)
60+
% if self.dataset_inst.is_mc:
61+
% self.uses.add(jer)
62+
% self.produces.add(jer)
63+
%
64+
%\end{lstlisting}
65+
66+
First, the relevant modules are imported.
67+
Note that \code{awkward} is loaded with the \code{maybe\_import} mechanism.
68+
This is necessary due to the encapsulated structure of the underlying software stack.
69+
In the scope of this exercise, we don't want to consider all the different sources of uncertainties that are associated with jet calibration yet.
70+
Therefore, we use the \code{derive} mechanism of \CCSPStlye{TaskArrayFunction}s to define a new class called \code{jec\_nominal}, which inherits from the original \code{jec} \CCSPStlye{Calibrator} but overwrites its \code{uncertainty\_sources} class attribute with an empty list.
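
The idea behind such a \code{derive} call can be illustrated with a minimal, self-contained Python sketch. It assumes nothing about the actual \columnflow implementation: \code{ToyFunction}, its \code{derive} method, and the listed source names are hypothetical stand-ins that only mimic the shape of the call used in \code{h4l/calibration/jets.py}.

```python
class ToyFunction:
    """Hypothetical stand-in for a TaskArrayFunction-like class (not columnflow code)."""

    # class-level setting that derived classes may overwrite;
    # the source names are placeholders chosen for illustration
    uncertainty_sources = ["Total", "Absolute"]

    @classmethod
    def derive(cls, name, cls_dict=None):
        # build a new subclass called `name`; entries of cls_dict become
        # class attributes that shadow those of the parent class
        return type(name, (cls,), dict(cls_dict or {}))


# derived class that inherits everything but drops all uncertainty sources
ToyNominal = ToyFunction.derive("ToyNominal", cls_dict={"uncertainty_sources": []})
```

In this sketch the parent class is left untouched, so other code that relies on the full list of sources is not affected by the derived class.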
71+
72+
Next, we define our new \CCSPStlye{Calibrator} class \code{jet\_energy} as shown before.
73+
Since we want to call the \code{jec\_nominal} class within this \CCSPStlye{Calibrator}, we need to add it to the \code{uses} set.
74+
This will load all columns that \code{jec\_nominal} needs, and will additionally make \code{jec\_nominal} accessible within the main body of our new \CCSPStlye{Calibrator} via \code{self[jec\_nominal]}.
75+
We also want to save all columns that \code{jec\_nominal} produces to disk for later use, which is why we need to add it to the \code{produces} set as well.
76+
77+
All \CCSPStlye{TaskArrayFunction}s have access to information about the current point within the task graph, such as the \code{config} object mentioned in Sec.~\ref{sec:configs} and the dataset that is currently being processed.
78+
Their behavior can depend on this information, as illustrated by the jet energy resolution~(JER) calibration in our new \code{jet\_energy} module.
79+
JER smearing needs to be applied to simulated samples only, so the code calls \code{jer} only if \code{self.dataset\_inst.is\_mc} is true.
80+
The set of columns to be loaded from disk is also dynamically configured in the \code{init} function of the \code{jet\_energy} \CCSPStlye{Calibrator} such that columns corresponding to JER are only added to the \code{uses} and \code{produces} sets if necessary.
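
The dataset-dependent configuration described above can be sketched in plain Python. This is only a toy model under stated assumptions: \code{ToyDataset}, \code{configure\_columns} and \code{calibration\_steps} are hypothetical names, and the strings stand in for the \code{jec\_nominal} and \code{jer} \CCSPStlye{Calibrator}s rather than calling any \columnflow code.

```python
from dataclasses import dataclass


@dataclass
class ToyDataset:
    """Hypothetical stand-in for a dataset object carrying an is_mc flag."""

    name: str
    is_mc: bool


def configure_columns(dataset):
    # the nominal energy scale correction is always registered
    uses = {"jec_nominal"}
    produces = {"jec_nominal"}

    # resolution smearing is only registered for simulated samples
    if dataset.is_mc:
        uses.add("jer")
        produces.add("jer")

    return uses, produces


def calibration_steps(dataset):
    # the steps that are run mirror the sets configured above
    steps = ["jec_nominal"]
    if dataset.is_mc:
        steps.append("jer")
    return steps
```

Keeping the two checks consistent matters: a step that is executed but not registered would lack its input columns, and a registered step that never runs would promise output columns that are not produced.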
81+
82+
\begin{exercise}{Writing a Calibrator}%[h4l/calibration/jet.py]
83+
Familiarize yourself with how the \code{jet\_energy} \CCSPStlye{Calibrator} works.
84+
\end{exercise}

Diff for: sections/configs.tex

+11
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
\section{Configuring the workflow}\label{sec:configs}
2+
3+
% TODO: write this section
4+
Concepts to (briefly) introduce here
5+
\begin{itemize}
6+
\item Order objects: Analysis, configs
7+
\item law config for module resolution
8+
\item brief walk-through of the demo config?
9+
\end{itemize}
10+
11+
Might want to move this to Chapter 1.

Diff for: sections/event_weights.tex

+1
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
\section{Defining Sets of Weights to Use for Templates}\label{sec:event_weights}

Diff for: sections/selector.tex

+9-1
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,15 @@
11
\section{Writing a Selector}\label{sec:selector}
22

3-
The lepton selection we want to implement in this exercise is the following:
3+
The \CCSPStlye{Selector} class should be used to implement analysis selections.
4+
This is a crucial step in the workflow since the decision to keep or reject objects or even whole events is performed here.
5+
Since the selection usually depends on, for example, the four-momenta of the objects within the events, it is executed after the calibration.
6+
The corresponding task is called \CCSPStlye{cf.SelectEvents}.
7+
For more information, please refer to Ref.~\cite{cf_repo}.
48

9+
\begin{table}[t]
10+
\Caption{Selection criteria for leptons}{Shown are the selection criteria for electrons (muons) at the 'loose' and 'tight'}
11+
\end{table}
12+
In this part of the tutorial, we will write selections for electrons and muons.
513
\textbf{\underline{Loose Electrons}}
614
\begin{itemize}
715
\item

Diff for: sections/setup.tex

+20-12
Original file line numberDiff line numberDiff line change
@@ -1,13 +1,13 @@
11
\section{Installation \& setup}
22
\justifying
33
\begin{tcolorbox}[colback=green!5!white,colframe=green!75!black,width=\textwidth]
4-
Note: ColumnFlown only runs on Linux and may require up to 4 GB of disc space. \tcblower
4+
Note: \columnflow only runs on Linux and may require up to 4 GB of disk space. \tcblower
55
Also, the machine on which you run this exercise must have CERN AFS mounted.
66
\end{tcolorbox}
77

88
Start by going to the GitLab repository of this exercise:
99

10-
\texttt{\textcolor{LimeGreen}{\href{https://gitlab.cern.ch/cms-analysis/analysisexamples/columnflow-demo}{\underline{https://gitlab.cern.ch/cms-analysis/analysisexamples/columnflow-demo}}}}
10+
\CCSPStlye{\href{https://gitlab.cern.ch/cms-analysis/analysisexamples/columnflow-demo}{\underline{https://gitlab.cern.ch/cms-analysis/analysisexamples/columnflow-demo}}}
1111

1212
To have your own copy of the code, fork the repository into your personal area. You can do this by clicking the \code{Fork} button in the upper-right corner of the page. To set your Project URL, please type your CERN username in the \code{Select a namespace} option.
1313

@@ -41,14 +41,20 @@ \section{Installation \& setup}
4141
git clone --recursive ssh://git@gitlab.cern.ch:7999/<cern_username>/columnflow-demo.git
4242
\end{lstlisting}
4343

44-
The directory you have thus created will be referred to as \code{basedir}. You can now go inside your local repository and install ColumnFlow. The \code{setup.sh} bash script will initialize the software environment with \code{micromamba}. Here, we define \code{dev} as the setup name, but you are free to name it as you wish.
44+
The directory you have thus created will be referred to as \code{basedir}. You can now go inside your local repository and install \columnflow. The \code{setup.sh} bash script will initialize the software environment with \code{micromamba}. Here, we define \code{dev} as the setup name, but you are free to name it as you wish.
4545

4646
\begin{lstlisting}[language=bash]
4747
cd columnflow-demo
4848
source setup.sh dev
4949
\end{lstlisting}
5050

51-
You will be asked to define a series of variables, the first of which is your CERN username. For all other variables you can keep the default value by just pressing \code{Enter}. Variables specific to this exercise will start with \code{H4L\_}, while ColumnFlow specific variables start with \code{CF\_}. You can find all variables in the \code{.setups/dev.sh} bash file. We invite you to check out this file and familiarize yourself with these variables.
51+
52+
You will be asked to define a series of variables, the first of which is your CERN username.
53+
For all other variables, you can keep the default value by just pressing \code{Enter}.
54+
Variables specific to this exercise will start with \code{H4L\_}, while \columnflow-specific variables start with \code{CF\_}.
55+
You can find all variables in the \code{.setups/dev.sh} bash file.
56+
We invite you to check out this file and familiarize yourself with these variables.
57+
5258

5359
\begin{figure}[!h]
5460
\centering
@@ -57,9 +63,11 @@ \section{Installation \& setup}
5763

5864
Note that the first installation of the software can take \underline{up to several minutes}.
5965

60-
Every time you want to work with ColumnFlow (e.g.\ if you open a new terminal window), you will need to source the \code{setup.sh} script again.
66+
Every time you want to work with \columnflow (e.g.\ if you open a new terminal window), you will need to source the \code{setup.sh} script again.
67+
68+
69+
Once the installation is complete, you should see a line of green text stating that the analysis has been successfully set up. You are now ready to start working with \columnflow!
6170

62-
Once the installation is complete you should see a line of green text stating that the analysis has been successfully set up. You are now ready to start working with ColumnFlow!
6371

6472
\begin{figure}[!h]
6573
\centering
@@ -72,7 +80,7 @@ \section{Installation \& setup}
7280
\includegraphics[scale=0.62]{images/CF_demo.png}
7381
\end{figure}
7482

75-
%\subsection{ColumnFlow Tasks}
83+
%\subsection{\columnflow Tasks}
7684

7785
This exercise is organized in the form of \code{law} tasks, where different tasks create some form of output. You can view the available tasks by running:
7886
\begin{lstlisting}[language=bash]
@@ -82,11 +90,11 @@ \section{Installation \& setup}
8290
This exercise will focus on the following tasks:
8391

8492
\begin{itemize}
85-
\item \texttt{\textcolor{LimeGreen}{cf.CalibrateEvents}} / \texttt{\textcolor{LimeGreen}{cf.SelectEvents}}
86-
\item \texttt{\textcolor{LimeGreen}{cf.ProduceColumns}}
87-
\item \texttt{\textcolor{LimeGreen}{cf.PlotCutflow}}
88-
\item \texttt{\textcolor{LimeGreen}{cf.PlotVariables1D}} / \texttt{\textcolor{LimeGreen}{cf.PlotVariables2D}}
89-
\item \texttt{\textcolor{LimeGreen}{cf.CreateDatacards}}
93+
\item \CCSPStlye{cf.CalibrateEvents} / \CCSPStlye{cf.SelectEvents}
94+
\item \CCSPStlye{cf.ProduceColumns}
95+
\item \CCSPStlye{cf.PlotCutflow}
96+
\item \CCSPStlye{cf.PlotVariables1D} / \CCSPStlye{cf.PlotVariables2D}
97+
\item \CCSPStlye{cf.CreateDatacards}
9098
\end{itemize}
9199

92100
By default, these tasks will save their output on a remote file system (e.g.\ \texttt{WLCG}), for which you will require a \code{voms-proxy}. If you would like to save certain or all outputs locally, we recommend creating a directory on a system with a larger amount of disk space (e.g.\ \texttt{EOS}). In such cases, you will need to update the \code{law.cfg} file accordingly.

Diff for: sections/shifts.tex

+1
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
\section{Defining Systematic Uncertainties}\label{sec:shifts}
