C++ traceback for CLEO legacy application
INTRODUCTION
Locating bugs in an application is one of the most common tasks a developer faces.
When a program crashes, tracing the error back to its origin is often far from obvious.
For the CLEO experiment, a C++ traceback module was requested that would
keep track of loaded modules and report invalid data requests.
CLEO software is a good example of a data-on-demand system.
It consists of a set of dynamically loaded modules which perform various tasks.
Among them:
- sinks write data to a persistent store
- processors request, analyse, and filter data
- sources read data from a persistent store
- producers produce data on demand
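The data-on-demand idea can be illustrated with a small sketch. This is not the CLEO code: LazyProxy, its member names, and the invalidate() step are hypothetical names chosen for illustration. The sketch assumes only that a producer is run the first time its data is requested and that the result is reused afterwards.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical proxy: wraps a producer and runs it only on the first request.
template <typename T>
class LazyProxy {
public:
    explicit LazyProxy(std::function<T()> producer)
        : producer_(std::move(producer)) {}

    // Return the data, producing it on demand if it is not cached yet.
    const T& get() {
        assert(static_cast<bool>(producer_));  // a producer must be attached
        if (!cache_) {
            cache_.reset(new T(producer_()));
        }
        return *cache_;
    }

    // Assumed hook: drop the cached value, e.g. when a new record starts.
    void invalidate() { cache_.reset(); }

    // True once the producer has been run and its result is cached.
    bool produced() const { return cache_ != nullptr; }

private:
    std::function<T()> producer_;
    std::unique_ptr<T> cache_;
};
```

With this shape, a module that never asks for a given object never pays for producing it, and repeated requests within the same record reuse the cached result.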
REQUIREMENTS
- monitor loaded modules and the objects they produce during execution
- avoid adding overhead to the execution process
- be informative upon an invalid data request and help users understand
the underlying problem
IMPLEMENTATION
The code was designed to register every loaded module in a separate stack, which is
kept up to date during execution. Every module provides a proxy that loads and registers
a particular object under a descriptive key (type, usage tag, production tag).
Once an exception is thrown, the stack contents are unrolled and reported back to the
proxy key which initiated the data request. This rollback mechanism allows users to
understand the data flow and easily identify the source of the problem.
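The mechanism described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the original module: DataKey, ExtractTracker, and all member names are hypothetical, and the report layout is only loosely modeled on a traceback of extract calls. Each extract call pushes its key onto a stack, and when an exception first passes through, the current chain of keys is saved before the stack is unwound.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical descriptive key: (type, usage tag, production tag).
struct DataKey {
    std::string type;
    std::string usage;
    std::string production;
};

// Hypothetical tracker holding the keys of all extract calls in progress.
class ExtractTracker {
public:
    // Run `produce` while `key` is registered on the stack.
    void extract(const DataKey& key, const std::function<void()>& produce) {
        active_.push_back(key);
        try {
            produce();
        } catch (...) {
            // Innermost failing call: snapshot the whole chain once.
            if (failed_.empty()) failed_ = active_;
            active_.pop_back();
            throw;  // let the caller decide how to handle the error
        }
        assert(!active_.empty());
        active_.pop_back();
    }

    // Format the saved chain, marking the request that failed.
    std::string report(const std::string& origin) const {
        std::ostringstream out;
        out << "Starting from " << origin << " we called extract for\n";
        for (std::size_t i = 0; i < failed_.size(); ++i) {
            const DataKey& k = failed_[i];
            out << "[" << (i + 1) << "] type \"" << k.type
                << "\" usage \"" << k.usage
                << "\" production \"" << k.production << "\"";
            if (i + 1 == failed_.size()) out << " <== exception occurred";
            out << "\n";
        }
        return out.str();
    }

    const std::vector<DataKey>& failedChain() const { return failed_; }
    std::size_t activeDepth() const { return active_.size(); }
    void clear() { failed_.clear(); }  // call after the report is printed

private:
    std::vector<DataKey> active_;  // extract calls currently in progress
    std::vector<DataKey> failed_;  // chain saved when an exception was seen
};
```

In this sketch the bookkeeping on the normal path is one push and one pop per request, and the report is built only after a failure. A top-level handler would catch the exception, print report() with the name of the module that started the request, and then call clear().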
EXAMPLE
In this example we show a dynamically generated traceback for an invalid
data request initiated by the Level4Proc object.
Tue Mar 30 12:17:25 2004 Run: 202126 Event: 471 Stop: event
%% WARNING-Processor.Level3Proc: No data of type "Constants" "" ""
in Record beamenergyshift
A proxy for this data exists, but it's unable to deliver the data for this
beamenergyshift record.; skipping crystal decision
%% ERROR-JobControl.ProcessingPaths:
Starting from Level4Proc we called extract for
[1] type "Level4Decision" usage "" production ""
[2] type "FATable" usage "" production ""
[3] type "FATable" usage "" production ""
[4] type "FATable" usage "" production ""
[5] type "DoitTrackFinder" usage "" production ""
[6] type "ADRSenseWireStore" usage "" production ""
[7] type "DGDetectorStore" usage "DR3" production ""
[8] type "Constants" usage "" production "" <== exception occured
caught a DAException:
"No data of type "Constants" "" "" in Record dralignment
A proxy for this data exists, but it's unable to deliver the data for this
dralignment record." ; will continue...
As can be seen, the report lists which data objects were requested and
indicates which data were missing.
RESULTS
This implementation significantly sped up the debugging of client code. In most
cases no additional debugger was required. We benchmarked our exception library
and measured only 5% additional overhead in execution time, which was
acceptable for our legacy application.
REFERENCES
For more information about CLEO software, please visit this
link. Specific talks about CLEO software can be found in the
Computing in High Energy Physics (CHEP) conference proceedings.