Monitoring CMS Tracker construction and data quality using a grid/web service based on a visualization tool

G. Zito, M.S. Mennea, A. Regano, University & INFN, Bari, Italy

Abstract

The complexity of the CMS Tracker (more than 50 million channels to monitor), now under construction in ten laboratories worldwide with hundreds of people involved, requires new tools for monitoring both the hardware and the software. In our approach we use both visualization tools and Grid services to make this monitoring possible. Visualization enables us to represent all those millions of channels on a single computer screen at once. The Grid makes it possible to get enough data and computing power to check every channel, and also to reach the experts everywhere in the world, allowing the early discovery of problems. We report here on a first prototype developed using the Grid environment already available in CMS, i.e. LCG2. This prototype consists of a Java client which implements the GUI for Tracker Visualization and a few data servers connected to the tracker construction database, to Grid catalogs of event datasets, or directly to the data acquisition of test beam setups. All communication between client and servers uses data encoded in XML and standard Internet protocols. We report on the experience acquired developing this prototype and on possible future developments in the framework of an interactive Grid and a virtual counting room allowing complete detector control from anywhere in the world.

Introduction

The complexity of the CMS Tracker (more than 50 million channels to monitor), now under construction in ten laboratories worldwide with hundreds of people involved, requires new tools for monitoring both the hardware and the software. In our approach we use both visualization tools and Grid services to make this monitoring possible. Visualization enables us to represent all those millions of channels on a single computer screen at once. The Grid, in turn, makes it possible to get enough data and computing power to check every channel.

Deploying this visualization tool as a Grid service also means that it can be used everywhere in the world, enabling the experts to stay in constant touch with the detector and allowing the early discovery of problems in hardware and software. At this early stage the tool is used to monitor the detector construction and the quality of simulated data.

We report here on a first prototype developed using the Grid environment already available in CMS, i.e. LCG2. This prototype consists of a client which implements the GUI for Tracker Visualization. This part is implemented in Java and can run on any computer connected to the Internet. When this program needs data to be displayed, it uses Grid and Web services to get them. These data may come from a database interfaced to the Web (like the tracker construction database) or from events in some dataset registered in the RLS. The data are provided by servers that access and analyze them using CMS software and send the result to the visualization client. All communication between client and servers uses data encoded in XML and standard Internet protocols.

The client can display a generic tracker defined as a set of "modules" organized in a hierarchy with rings, layers, etc. The structure of the detector is read from a set of XML files available from the same server that provides the event data. In this way the client can be used to monitor a series of test prototypes in addition to the final complete detector.

We will report on the experience acquired developing this prototype and present our opinion on some questions concerning CMS software in the field of interactive analysis and monitoring, and on the possible evolution of this project:

  1. Is a virtual counting room with complete detector control from everywhere in the world feasible?
  2. How difficult is it to transform an "offline" application into a Web/Grid service?
  3. How relevant are interactive Web and Grid services for CMS?
  4. What is the best way to integrate CMS software with Web/Grid interactive applications?

Design Criteria

Overall

The system should consist of a highly portable, lightweight client capable of running on many different platforms connected to the Internet. The term lightweight implies that the client should be independent of the huge legacy libraries making up the main reconstruction and analysis code.

The server application should incorporate standard analysis tools and may be in direct contact with the event store or other databases. The connection protocol between client and server should be standardised.

Visualisation client

The usual way to represent HEP detectors and events is in 3D, with standard projections and "layered projections", to provide both good overall pictures for public relations and images useful for event analysis. In our case, the sheer complexity of the tracker makes these standard representations insufficient. Successful monitoring requires representations where the whole detector can be seen with each one of its 17000 modules visible. For this reason we have developed a new 2D representation. This is the main representation for detector monitoring and is supplemented by the usual 3D/2D representations.

Why Web services

The use of Web services is essential to ensure the interoperability of our visualisation application with different data sources. Some of these sources (the event store) are on the Grid and their use requires Grid services, which are based on Web services. Other data sources are relational databases. In both cases Web services let you establish a standard protocol for data exchange between disparate data sources and a consumer. They serve as a layer of abstraction, isolating the client from the details of data representation, exchange and query. For example, the visualisation client handles in the same way tracker data coming from the construction relational database and from a Grid dataset.

System Architecture

The developed system has a client-server architecture, with a highly portable, lightweight client capable of running on many different platforms connected to the Internet. The server application works as a gateway in direct contact with the construction database or with the event store where the standard analysis programs (ORCA, the CMS reconstruction project) write their data. The connection between client and server uses standard Internet protocols. To extract tracker data from XML files we use the technique called XML data binding, as implemented by the open source software Castor[1].

Visualization client implementation

The visualization client, which we named Tmon, is implemented in Java. It can execute on any computer with Java Runtime 1.4 installed. This allows the use of our software from any computer connected to the Internet without the need to install special software. The client communicates with the data servers using standard Internet protocols; all files exchanged contain data encoded in XML.

This client implements the graphical user interface; its main tasks are to fetch the tracker description and the data to be monitored from the data servers, decode them and display them.


To extract tracker data from XML files we have used the technique called XML data binding, via the open source program Castor[1]. Castor uses the schema definition file directly to create a set of Java classes that can be used to read (unmarshal) the tracker data from an XML file like the one shown in the next section and to access them in the visualisation client. The client is available for download from the Internet[3].
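
As an illustration, here is a minimal sketch of how the client could load the tracker layout with the Castor-generated beans. The class name Tracker is assumed to follow the root element of the schema shown in the next section, and the URL of the layout file is hypothetical; the actual Tmon code may differ.

import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;

public class TrackerLoader {

    /** Fetch the tracker layout XML from a data server and unmarshal it with Castor. */
    public static Tracker loadLayout(String layoutUrl) throws Exception {
        // Hypothetical URL of the XML file describing the tracker layout
        URL url = new URL(layoutUrl);
        Reader reader = new InputStreamReader(url.openStream());
        try {
            // Tracker is the bean generated by Castor from the XML schema;
            // unmarshal() validates the XML and builds the object tree in memory
            return Tracker.unmarshal(reader);
        } finally {
            reader.close();
        }
    }
}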

Data servers implementation

As data server we have used Tomcat[6] + Axis[7]. Tomcat can be used as a normal Web server over the HTTP protocol. The client requests events or other data from the server by putting the information about the request directly in the URL with the GET method. In some cases the POST method is used and a file containing the information in XML format is sent. The server answers by sending the data in XML format. To get events from Monte Carlo federations we have used the so-called REST approach to Web services. This consists of naming all the resources provided with standard Web addresses (URIs) and managing them with the commands already available in the HTTP protocol (i.e. GET, PUT, DELETE, POST), all of which Tomcat implements. We instead use Axis (running as a Tomcat servlet) to provide Web services following the W3C definition, with SOAP and WSDL. This approach is used to serve the data in the construction database.
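
A minimal sketch of this REST-style exchange, as seen from the client side, is shown below; the endpoint path and the query parameter are illustrative and do not reproduce the actual Tmon interface.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class EventFetcher {

    /** Request the next event from the data server; the request is encoded in the URL (GET). */
    public static String getNextEventXml(String serverBase) throws Exception {
        // Hypothetical endpoint: the servlet answers with the event encoded in XML
        URL url = new URL(serverBase + "/tmon/event?cmd=next");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");

        BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()));
        StringBuffer xml = new StringBuffer();
        String line;
        while ((line = in.readLine()) != null) {
            xml.append(line).append('\n');
        }
        in.close();
        conn.disconnect();
        return xml.toString();  // later unmarshalled with the Castor beans
    }
}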

Up to now the data servers can provide data from three different sources:

  1. federations of Monte Carlo events for the full detector;
  2. real data from a prototype in a test beam (not yet implemented);
  3. data from the construction database.

Serving data from the construction database

The construction database is based on an Oracle relational database, which accepts any SQL query embedded in a special XML command; the result is an XML file containing the answer to the query. To get the data we have implemented an Axis Web service named tracker. This service accepts requests for information in the database as normal URIs and transforms them into the XML/SQL request to the Oracle database. The resulting XML file is transformed into the standard CMS tracker event format and sent to the visualization client using SOAP. The Web interface is described by a WSDL file. The service is of the "document" type; the operations implemented so far return, for example, the number of dead strips in each module, which the client represents with a color code.
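
As a hedged sketch, an Axis client could invoke such an operation through the dynamic Call interface as shown below; the endpoint URL, namespace and operation name are assumptions made for illustration, the real interface being defined by the WSDL file.

import java.net.URL;
import javax.xml.namespace.QName;
import org.apache.axis.client.Call;
import org.apache.axis.client.Service;

public class ConstructionDbClient {

    /** Query the construction-database Web service for one module (illustrative only). */
    public static Object queryModule(String endpoint, int moduleId) throws Exception {
        Service service = new Service();
        Call call = (Call) service.createCall();
        call.setTargetEndpointAddress(new URL(endpoint));
        // Namespace and operation name are hypothetical; the real ones come from the WSDL
        call.setOperationName(new QName("urn:tracker", "getDeadStrips"));
        return call.invoke(new Object[] { new Integer(moduleId) });
    }
}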

Serving data from Monte Carlo federations

The data server connected to Monte Carlo events is implemented with a Tomcat servlet that basically answers the request "get the next event in the tracker". The event is taken from a local disk cache and returned in XML format. The servlet obtains the events by querying a number of Grid computers; the events may come from any number of them. Each available computer runs a simple ORCA application that generates new Monte Carlo events and writes the tracker rec hits in a disk area also accessible to the Tomcat servlet. The servlet checks from time to time whether a new event is available and, if so, copies it to the local cache.
Figure: example of Monte Carlo event (around 100000 rec hits, overlayed view) represented by Tmon; the same event in separated-mode view.
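
The following minimal servlet sketch illustrates this scheme under stated assumptions: the cache directory, the file naming and the way events reach the cache are invented for the example and do not reproduce the actual Tmon server.

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class EventServlet extends HttpServlet {

    // Hypothetical local disk cache filled by the ORCA jobs running on the Grid
    private static final String CACHE_DIR = "/data/tmon/cache";
    private int nextEvent = 0;

    /** Answer "get next event" by streaming the cached XML file to the client. */
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        File event = new File(CACHE_DIR, "event" + nextEvent + ".xml");
        if (!event.exists()) {
            resp.sendError(HttpServletResponse.SC_NOT_FOUND, "no new event available");
            return;
        }
        nextEvent++;
        resp.setContentType("text/xml");
        InputStream in = new FileInputStream(event);
        OutputStream out = resp.getOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) > 0) {
            out.write(buf, 0, n);
        }
        in.close();
    }
}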

Detector and event data representation in xml

The XML files describing the various tracker layouts follow the tracker data model developed by the same authors to implement a detailed tracker visualization in the CMS offline software[2]. A generic tracker can be described as a hierarchy of modules. Starting from the single module (a trapezoidal or rectangular box) we have the following groupings: rings, layers, detector parts (i.e. endcap +z, endcap -z, barrel), subdetectors (i.e. pixel detector, silicon inner detector, silicon outer detector), and the full tracker.

This structure is mapped by the following hierarchy of XML tags:

<tracker> 
 <detector> 
  <subdetector> 
   <detectorPart> 
    <layer> 
     <ring> 
      <module id="0" xcenter="-0.553762" ycenter=-"0.094102" 
          zcenter="-2.6308" type="0" /> 
      <module id="1" xcenter="-0.560815" ycenter=-"0.0315087" 
          zcenter="-2.6278" type="0" /> 
           ...
     </ring> 
    </layer> 
   </detectorPart> 
  </subdetector> 
 </detector> 

<moduletypes> 
 <moduletype id="0" length="0.1151" width="0.0714" thickness="0.0003" 
       widthAtHalfLength="0.06465" nStrips="512" />
    ...
</moduletypes> 

</tracker> 

The corresponding XML schema definition, as well as the XML description of the whole tracker, is available online.

N.B. We do not use the DDD directly: it is too complex for our purposes and also incomplete. We use a simpler tracker description and populate it with the data extracted (once and for all) from the DDD.

We have also defined the XML structure for any type of data coming from the tracker: event data, monitoring data and data from construction. Here is an example of event data containing reconstructed hits in the tracker:

<trackerEvent> 
 <recHits module="0"> 
 <recHit globalX="-0.557684" globalY="-0.0389969" globalZ="-2.6375"
   localX="-2.34123" localY="0" errorX="0.0191961" errorY="11.04" />
  ...
 </recHits> 
 <recHits module="1"> 
  ...
 </recHits> 
  ...
</trackerEvent> 
A complete event is available online[4].

XML data binding with Castor and web service implementation with Axis

Castor uses the XML schema describing the data model to generate automatically the Java classes implementing it (the Castor beans). The Web service itself is described in the WSDL language; the WSDL file uses the XML schema as the definition of the way the Web service should transfer data to the client. The program WSDL2Java (provided with the Axis distribution) uses the WSDL description of the Web service to generate the Java code implementing it. Serialization and deserialization (converting data to, and extracting data from, XML files) are done by Castor. Although not shown here, the event data model and the event transfer in the Web service are implemented in the same way. Note also that the only difference, at the level of the data transfer protocol, between the REST and SOAP services is that in the SOAP services the events and the tracker description are sent inside a SOAP envelope; the XML schema used is the same in both cases.
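
For completeness, a minimal sketch of the Castor serialization step on the server side might look as follows; TrackerEvent is assumed to be the bean generated by Castor from the event schema.

import java.io.StringWriter;
import org.exolab.castor.xml.Marshaller;

public class EventSerializer {

    /** Marshal a Castor-generated event bean into the XML string sent to the client. */
    public static String toXml(TrackerEvent event) throws Exception {
        StringWriter writer = new StringWriter();
        // Castor writes the object tree as XML conforming to the event schema
        Marshaller.marshal(event, writer);
        return writer.toString();
    }
}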

Performance

The data servers run on a Pentium IV PC. We have tried the visualisation client on many Linux and Windows PCs with Java Runtime 1.4 installed. The time needed to fetch the description of the whole tracker and unmarshal it is 10 seconds. This time is reasonable, taking into account that the XML file sent is more than 1 MB in size and that the Java program has to validate and load the data. Reading a single event or the data from the construction database takes a time that depends on the data size and, for typical events, stays within about 10 seconds. These times are obtained because both event reading and database access go through servlets that work as gateways to the real data provider, caching the results of the queries.

The table below gives some examples of the response time for events of different sizes (no optimization, compression or other tricks are used to decrease this time).

  # rec hits in event   XML file size (MB)   response time (sec)
         6600                  1.0                   3
        10000                  1.6                   4
        36000                  6                    10
        86000                 14                    21
       100000                 16                    25

Lessons learned and conclusion

This was first of all a didactic exercise to get ready for the Grid. In fact the next step is to transform the Web service into a Grid service. In this second phase we plan to try frameworks like MonALISA[8] and Clarens[9], already used to provide experimental Grid-enabled data analysis for CMS.

To implement our software we had to learn a lot of new technologies, mostly connected to Java and XML, and it may be of interest to know what our difficulties were in grasping these technologies and what our opinion of them is. In any case, all the software used has proven to be reliable and to perform well.

At the end one asks oneself whether undertaking this ordeal was really necessary. Our answer is yes: XML is the future. It is the future like Java was the future only a few years ago: perhaps Java has been a failure, but its main concept was so compelling that Microsoft has reinvented it with .NET. To illustrate this point we would like to mention another XML dialect discovered while doing this work, SVG (Scalable Vector Graphics). SVG is very interesting in our application because it could completely replace the Java client; for this reason we are investigating its use. How does it work? Instead of sending the data, the servlet would send a complete image of the tracker with the data in SVG format. This image can be read by a Web browser (with an appropriate plugin) and carries all the code needed to let the user interact with it, including zooming, selection, etc. Yes, XML is the future.
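
A hedged sketch of how such an SVG response could be produced is given below: the servlet would emit one SVG rectangle per module, coloured according to the monitored quantity. The module coordinates, the colour scale and the interaction handler are purely illustrative.

import java.io.PrintWriter;

public class SvgTrackerMap {

    /** Write a (much simplified) SVG tracker map: one rectangle per module,
        filled with a colour encoding, for example, the number of dead strips. */
    public static void writeMap(PrintWriter out, double[][] modules, String[] colors) {
        out.println("<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"800\" height=\"400\">");
        for (int i = 0; i < modules.length; i++) {
            // modules[i] = {x, y, width, height} of the module in the 2D tracker map
            out.println("  <rect x=\"" + modules[i][0] + "\" y=\"" + modules[i][1]
                + "\" width=\"" + modules[i][2] + "\" height=\"" + modules[i][3]
                + "\" fill=\"" + colors[i] + "\" onclick=\"alert('module " + i + "')\"/>");
        }
        out.println("</svg>");
    }
}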

Is monitoring a detector like the CMS tracker from the Internet feasible? Yes. The results obtained are promising and indicate that an expert can get a detailed update of the detector status in a few minutes. The tracker map, used as a convenient way to summarize monitoring data about the tracker, is essential since it limits the amount of data that has to travel between server and client.

But also essential to the success of this scheme is the development by CMS of a standard Web and Grid interface to its monitoring data. Developing our application we found that access to the construction database was very easy, because Oracle databases have a standard Web interface. Access to CMS event data, instead, was very difficult, because CMS has no Web interface defined and we had to invent it from scratch, for example by defining an XML format for CMS events. In our opinion, developing such a standard Web interface is absolutely essential to CMS, especially if it wants to fully exploit the capabilities of the Grid. There are efforts going in this direction, at least in the offline analysis environment, based on software like the monitoring framework MonALISA[8] and Clarens[9]; whether they answer the requirements of CMS tracker online monitoring remains to be checked.

References

  1. http://www.castor.org/
  2. M.S. Mennea, A. Regano, G. Zito, "CMS Tracker Visualisation", CMS-NOTE-2004-009, Geneva: CERN, 8 Jun 2004
  3. http://cmsdoc.cern.ch/~gzito/tmon.zip
  4. Events used by tmon
  5. "Create Web services using Apache Axis and Castor", http://www-106.ibm.com/developerworks/webservices/library/ws-castor/
  6. http://jakarta.apache.org/tomcat/
  7. http://ws.apache.org/axis/
  8. http://monalisa.cacr.caltech.edu/
  9. http://clarens.sourceforge.net/
