
Event Reconstruction and Analysis
Calibration and Simulation

Simulation of an electron shower in a calorimeter crystal.
Electromagnetic Calorimeter
The CMS electromagnetic calorimeter ("Ecal") consists of 75,848 tightly packed lead-tungstate crystals. The Ecal identifies electrons and photons by their heavily ionizing showers, which illuminate one to two dozen neighboring crystals near the point of impact. Energy depositions are measured with percent-level precision by collecting the light emitted. Cornell developed the software that describes the detailed geometrical layout of the crystals, their support structures, and their electronics; this description is used for both simulation and reconstruction of particle interactions. Some of this code, such as the parts that store, access, and apply individual cell geometries and alignments, has also been adapted for the CMS hadron calorimeters ("Hcal"). We also manage and continue to improve the software packages that model the response of the electronic readout, an essential ingredient in verifying key physics signatures to be observed in CMS data.
Group: Brian Heltsley
Event Reconstruction
Missing ET significance
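As a rough numerical illustration of the quantity discussed in this section: the missing ET vector is the negative vector sum of the transverse momenta of all visible particles, and a significance can be estimated by comparing that vector with the expected measurement resolution. The Python sketch below is not the Cornell or CMS tool. It assumes Gaussian resolutions summarized by a 2x2 covariance matrix, and the function names, particle list, and covariance values are invented for illustration.

```python
import math


def missing_et(particles):
    """Return (mex, mey, met) for visible particles given as (px, py) in GeV.

    The missing transverse momentum is minus the vector sum of the
    visible transverse momenta.
    """
    mex = -sum(px for px, _ in particles)
    mey = -sum(py for _, py in particles)
    return mex, mey, math.hypot(mex, mey)


def met_significance(mex, mey, cov):
    """Gaussian-approximation significance S = E^T V^-1 E.

    `cov` is an assumed 2x2 covariance matrix [[vxx, vxy], [vyx, vyy]]
    describing the resolution on the summed transverse momentum.
    Large S means the imbalance is unlikely to come from resolution alone.
    """
    (vxx, vxy), (vyx, vyy) = cov
    det = vxx * vyy - vxy * vyx
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    return (vyy * mex * mex - (vxy + vyx) * mex * mey + vxx * mey * mey) / det


# Hypothetical event: three visible particles, (px, py) in GeV.
particles = [(30, 0), (-10, 5), (-5, -5)]
mex, mey, met = missing_et(particles)

# Hypothetical resolution: 5 GeV in each direction, uncorrelated.
cov = [[25.0, 0.0], [0.0, 25.0]]
significance = met_significance(mex, mey, cov)
```

In this toy event the imbalance is 15 GeV against a 5 GeV resolution, so the significance is the square of a three-standard-deviation effect. A realistic tool would build the covariance from the measured resolution of each reconstructed object.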
In proton-proton collisions, the total momentum component transverse to the beam must balance among all the particles produced in the collision. If a particle that does not interact appreciably with ordinary matter is produced, it escapes undetected and leaves an imbalance in the measured transverse momentum, which we dub "missing ET". Many scenarios for new, fundamental interactions predict such particles, so a large missing ET is a powerful, generic signature of new particles. When an event exhibits missing ET, it is essential to verify that the imbalance is real, so Cornell is providing a tool, the MET significance, that indicates how likely it is that the imbalance is simply the result of measurement errors.
Group: Lawrence Gibbons, Jim Alexander, Aleko Khukhunaishvili, Xin Shi
Analysis Tools and Data Access
The LHC imposes formidable challenges for analysis software and data management. Cornell has been deeply involved in developing the CMS approach to solving them. The following sections describe Cornell's contributions.
Fireworks event display
Fireworks is a lightweight tool for rendering collisions in the CMS detector. It is used by CMS physicists as a quick and easy way to see individual collisions.
Fireworks display of one of the first events at 7 TeV.
- Fireworks: A physics event display for CMS, J. Phys.: Conf. Ser. 219, 032014 (2010).
FWLite
All CMS data is stored in files that can be read directly from within the ROOT analysis environment. When viewed this way, though, only the basic information archived for each object is accessible, and that basic form is often inconvenient for use within a high-level analysis. The FWLite environment, developed at Cornell, combines ROOT with the CMS object definitions, allowing full use of the CMS objects within the ROOT environment. It gives physicists the best of both worlds: the speed, visualization, and flexible analysis capabilities of ROOT, together with the physically intuitive view of the CMS data and the security of working with data of known, tracked provenance.
Group: Lawrence Gibbons, Dan Riley
Data Aggregation Services
The CMS data needed for a particular study, analysis task, or calibration task can depend on the LHC running conditions, the beam energy, the detector configuration, the trigger type, the version of the processing code, or any of a wide variety of other quantities, and the data can be located at a number of sites across the world. The different types of information are stored in a number of databases, each of which is maintained by a different subgroup within CMS or by the LHC itself. To provide physicists with a single, integrated view of the information, Cornell has written a very generic Data Aggregation Service (DAS) based on the MongoDB document-oriented database. DAS has wide applicability, and the DISCOVER Research Service Group has taken up DAS support and dissemination as one of its core tasks.
- The CMS Data Aggregation System, ICCS 2010, Procedia Computer Science, Volume 1, 1529 (2010).
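The aggregation idea described in this section can be sketched in a few lines of Python. This is a toy illustration, not the DAS code or its interface: the `aggregate` function, the two in-memory "databases", and every field name and value are hypothetical, and plain dictionaries stand in for both the underlying services and the document store.

```python
def aggregate(key, sources):
    """Merge the records that each source holds for `key` into one document.

    `sources` maps a source name to a lookup callable that returns a dict
    of fields for the key, or None if the source knows nothing about it.
    """
    merged = {"key": key, "sources": []}
    for name, lookup in sources.items():
        record = lookup(key)
        if record is None:
            continue  # this source has no information for the key
        merged["sources"].append(name)
        for field, value in record.items():
            # Keep the first value seen if two sources report the same field.
            merged.setdefault(field, value)
    return merged


# Hypothetical stand-ins for independently maintained databases.
conditions_db = {"run-1": {"beam_energy_tev": 3.5}}
location_db = {"run-1": {"sites": ["site_a", "site_b"]}}

sources = {"conditions": conditions_db.get, "locations": location_db.get}
view = aggregate("run-1", sources)
missing = aggregate("run-2", sources)
```

The caller asks one question and receives one document, along with a record of which sources contributed to it. A key that no source knows about yields a document with an empty `sources` list.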
Data Discovery
CMS will accumulate petabytes of data, with specialized samples scattered around the world. Cornell developed the Data Bookkeeping Service (DBS), which allows CMS physicists to find and access the data.
- The CMS DBS query language, J. Phys.: Conf. Ser. 219, 042043 (2010).
- Rapid web development using python and AJAX, J. Phys.: Conf. Ser. 119, 042011 (2008).
- A multi-dimensional view on information retrieval of CMS data, J. Phys.: Conf. Ser. 119, 072013 (2008).
- The CMS Dataset Bookkeeping Service, J. Phys.: Conf. Ser. 119, 072001 (2008).
- CMS Offline Web Tools, J. Phys.: Conf. Ser. 119, 082007 (2008).
Provenance Tracking
To be reproducible, a data analysis must keep careful track of the versions of the software used in reconstruction. Provenance tracking shifts this responsibility from the physicist to the software environment, which automatically stores all version information directly in the file. Cornell has contributed significantly to the design of the provenance-tracking infrastructure and has provided the initial tools for examining provenance within CMS.
- File level provenance tracking in CMS, J. Phys.: Conf. Ser. 219, 032011 (2010).
- Provenance in High-Energy Physics Workflows, Computing in Science & Engineering, June 2008, Vol 10, No. 3, p. 22 (2008).
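The principle of letting the software record provenance alongside the data can be illustrated with a small Python sketch. It does not reproduce the CMS framework or its file format: the `process` helper, the record layout, and the step names and version strings are all assumptions made for this example.

```python
import json


def process(record, step, version, transform):
    """Apply `transform` to the payload and append this step to the history.

    The caller never writes version information by hand; every processing
    step adds its own entry, so the history travels with the data.
    """
    history = record.get("provenance", []) + [{"step": step, "version": version}]
    return {"payload": transform(record["payload"]), "provenance": history}


# Hypothetical raw record with an empty processing history.
raw = {"payload": [1.0, 2.0, 3.0], "provenance": []}

calibrated = process(raw, "calibrate", "v1", lambda xs: [1.5 * x for x in xs])
summed = process(calibrated, "sum", "v2", lambda xs: sum(xs))

# Serializing the record stores the provenance in the same "file" as the data.
text = json.dumps(summed)
restored = json.loads(text)
```

Reading the stored record back recovers both the result and the ordered list of steps and versions that produced it, which is the information needed to reproduce the result.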