Gather Model Calibration Data and Analyse Log Files

Short description of how to analyse log files to

  • calibrate Palladio Component Models
  • and/or validate model prediction results

In order to calibrate Palladio models, one needs information to estimate the resource demands. Furthermore, to judge whether performance prediction results make sense, one needs to compare the predictions with measurements. The most common way to do both is to use log data created by the execution infrastructure (web server, application server, middleware, etc.). This page summarises which data are required and which particularities have to be taken into account.

Metrics, Measured Data

This section lists base information for building calibrated Palladio Component Models. Calibrated Palladio Component Models contain reliable information on branching probabilities, branching conditions, resource demands, and the number and behaviour of users in a system. Hence, such models are a reliable basis for predictions. Some of the following metrics can be captured from a static system; others require executing the system under study.

Usage Profile

For the estimation of the usage profile of each individual component (system and per-component level) and for the estimation of branching probabilities:

  • Response time and throughput per provided service. Important: do not use the average response time; if available, prefer the full distribution (values per individual request) or at least the median
  • Distribution of requests among provided services (which provided services are requested how often; e.g. browse front page and list available products of a web shop; required for the system level and individual components)
  • Typical data passed to provided services (e.g. size of uploaded files, size of arrays to sort, number of items to search for)
  • Typical number of parallel requests (i.e. number of concurrent users of a system)
  • Percentage of failing requests (i.e. error rate)

Execution Environment

Static information:

  • Number of CPU cores per server
  • Network latency between servers
  • Network throughput (median, max, distribution)

Monitor the following per machine of the execution environment during the measurements:

  • Processor utilisation
  • IO throughput

While measuring, one should monitor the processor utilisation (Unix: top) and the IO (Unix: iostat). If values reach or come close to the expected maximum, the experiments and the gathered data become unreliable due to potential bottleneck effects. Such data can only be used for validation, but not for model calibration.
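For longer measurement runs, it can be more reliable to record the utilisation automatically instead of watching top. A minimal Java sketch, assuming a HotSpot JVM (Java 7 or later) where com.sun.management.OperatingSystemMXBean provides the system-wide CPU load; the class name and sampling interval are arbitrary:

  import java.lang.management.ManagementFactory;

  public class CpuUtilisationLogger {
      public static void main(String[] args) throws InterruptedException {
          // HotSpot's extended MXBean; the cast fails on JVMs that lack it
          com.sun.management.OperatingSystemMXBean os =
                  (com.sun.management.OperatingSystemMXBean)
                          ManagementFactory.getOperatingSystemMXBean();
          while (true) {
              // getSystemCpuLoad() returns a value in [0.0, 1.0],
              // or a negative value while no sample is available yet
              double load = os.getSystemCpuLoad();
              System.out.printf("%d;%.3f%n", System.currentTimeMillis(), load);
              Thread.sleep(1000); // one sample per second
          }
      }
  }

The resulting CSV-style output (timestamp;load) can be correlated with the response time log afterwards.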

Software

Static information:

  • Limited parallelism (e.g. thread pools, restrictions on the number of processes, data mutexes, database connection pools)
  • Distribution strategies of load balancers and proxies (e.g. round robin, first come first served)
  • Presence of caches: cache hit/miss probability; if relevant, broken down by the kind of requested data

Adapt Logging Formats

Apache Web Server

  • Use %D in the log format to enable logging of the delivery time (available starting from Apache 2): the time spent to serve a request in microseconds (see the configuration sketch after this list)
  • Do not use %T: this log format option (delivery time in seconds) is deprecated and too coarse.
  • Apache logs only the delivery time (= response time + network time to deliver the response data). Hence, the values also depend on the external network connection.
  • Link: http://www.ducea.com/2008/02/06/apache-logs-how-long-does-it-take-to-serve-a-request/
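As a minimal sketch, the following httpd.conf lines extend the common log format by the delivery time in microseconds as the last field (the format name timing and the log file path are arbitrary; LogFormat and CustomLog are part of Apache's mod_log_config):

  LogFormat "%h %l %u %t \"%r\" %>s %b %D" timing
  CustomLog "logs/access_log" timing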

Tomcat

  • Configuration of the access log valve (example; note that in Tomcat, %D logs the request processing time in milliseconds, unlike Apache's microseconds)
  <Valve className="org.apache.catalina.valves.AccessLogValve"
    directory="logs" prefix="timing." suffix=".log"
    pattern="%t %U %s %D" resolveHosts="false" />

Glassfish

  • Log response time: switch the monitoring level for the EJB containers to HIGH or LOW (see Monitoring and log formats for Glassfish)
  • Change the access log pattern format: use the format attribute of the <access-log> subelement of <http-service> to specify the access log pattern of virtual servers and add %time-taken%. This option is available for Glassfish v2.0 and newer; time-taken is the time in milliseconds.
 format="%client.name% %auth-user-name% %datetime% %time-taken% [...]"

Analysing Log Files

  • ProM 6: import and clean-up of log files (predominantly process description formats: Causal Net, CFG Open Net, COSA Petri Net, CPN, CPN XES, CSV, CTLS, EPML EPC, LoLA, LTL, PND, PNML, ProM, TPN, TSML, XPDL, YAWL)
  • Microsoft Log Parser: log file import, export, and query functionality
  • Pentaho Kettle: data aggregation
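For simple analyses, no dedicated tool is strictly necessary. A minimal Java sketch (Java 8 or later) that computes a response-time distribution from an access log written with the Tomcat pattern %t %U %s %D shown above; the class name and the reported percentiles are arbitrary, and the log file is passed as the first argument:

  import java.io.IOException;
  import java.nio.file.Files;
  import java.nio.file.Paths;
  import java.util.ArrayList;
  import java.util.Collections;
  import java.util.List;
  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  public class AccessLogStats {

      // Matches the Tomcat pattern "%t %U %s %D" from the valve example above,
      // e.g.: [10/Feb/2014:13:55:36 +0100] /shop/browse 200 42
      private static final Pattern LINE =
              Pattern.compile("\\[([^\\]]+)\\] (\\S+) (\\d{3}) (\\d+)");

      public static void main(String[] args) throws IOException {
          List<Long> times = new ArrayList<>();
          for (String line : Files.readAllLines(Paths.get(args[0]))) {
              Matcher m = LINE.matcher(line);
              if (m.matches()) {
                  times.add(Long.parseLong(m.group(4))); // processing time in ms
              }
          }
          if (times.isEmpty()) {
              System.out.println("no matching log lines");
              return;
          }
          Collections.sort(times);
          // Report the distribution, not just the mean (see "Usage Profile" above)
          System.out.println("requests : " + times.size());
          System.out.println("median   : " + percentile(times, 0.5) + " ms");
          System.out.println("90th pct : " + percentile(times, 0.9) + " ms");
          System.out.println("max      : " + times.get(times.size() - 1) + " ms");
      }

      private static long percentile(List<Long> sorted, double p) {
          int index = (int) Math.ceil(p * sorted.size()) - 1;
          return sorted.get(Math.max(index, 0));
      }
  }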

Measuring Response Time in Code

Plain Java

  • Java: Use System.nanoTime() (see the sketch after this list):
    • platform-independent: available on all platforms running JRE version 5 or later
    • relatively precise (high resolution) at relatively low overhead: accuracy of 1000 ns or better, invocation cost of 2000 ns or better
    • not wall-clock time, but it does not shift or fluctuate as long as the JVM is not restarted; across processes, however, it may use different offsets/epochs and thus deliver measurements that must be aligned across processes
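A minimal measurement sketch; the measured method doWork is a placeholder for the actual service invocation:

  public class ResponseTimeProbe {

      public static void main(String[] args) {
          long start = System.nanoTime();
          long result = doWork(); // placeholder for the provided service call
          long elapsed = System.nanoTime() - start;
          // Only differences between nanoTime() values are meaningful; the
          // absolute value is unrelated to wall-clock time.
          System.out.printf("result %d took %.3f ms%n", result, elapsed / 1e6);
      }

      // Stand-in for the operation to measure.
      private static long doWork() {
          long sum = 0;
          for (int i = 0; i < 1000000; i++) { sum += i; }
          return sum;
      }
  }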

Perf4J

  • Perf4J: performance data instrumentation framework with attached basic performance analysis capabilities
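A minimal usage sketch, assuming Perf4J is on the classpath; the tag browseFrontPage and the measured method are placeholders. By default, LoggingStopWatch prints its records to System.err; the Log4JStopWatch variant routes them to a logger instead:

  import org.perf4j.LoggingStopWatch;
  import org.perf4j.StopWatch;

  public class Perf4jExample {

      public static void main(String[] args) {
          StopWatch watch = new LoggingStopWatch("browseFrontPage");
          try {
              browseFrontPage(); // placeholder for the measured service call
          } finally {
              watch.stop(); // logs the elapsed time under the given tag
          }
      }

      private static void browseFrontPage() {
          // stand-in for the actual request handling
          try { Thread.sleep(25); } catch (InterruptedException ignored) { }
      }
  }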

Converting JMeter Timestamps to the Microsoft Excel Date/Time Format

  • You can configure JMeter to save the time stamps in a standard date/time format:
-Jjmeter.save.saveservice.timestamp_format="yyyy-MM-dd HH:mm:ss"
  • Converting the typical JMeter time stamps to Excel is easy, though:
    • Timestamp (JMeter): the number of milliseconds since midnight, January 1, 1970 UTC
    • Time (Excel): number of days since January 1, 1900; plus a fraction of a day for the time
  • Note: JMeter logs the times in UTC, so if you live in Central Europe you need to add 1 hour. For Central European Summer Time (CEST), you need to add 2 hours.
  • Say we have the JMeter timestamps in column A and want the formatted times in column B; then B1 is (here with the 2-hour summer time offset):
=A1 / (24*3600*1000) + DATEVALUE("1-1-1970") + TIMEVALUE("02:00")

In German Office installations:

=A1 / (24*3600*1000) + DATWERT("1-1-1970") + ZEITWERT("02:00") 
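Alternatively, the conversion can be scripted in plain Java (version 8 or later), where java.time applies the correct UTC offset, including summer time, automatically; the timestamp value below is only an example:

  import java.time.Instant;
  import java.time.ZoneId;
  import java.time.format.DateTimeFormatter;

  public class JMeterTimestamp {
      public static void main(String[] args) {
          long jmeterTimestamp = 1355314332265L; // milliseconds since 1970-01-01 UTC
          // java.time handles the UTC offset (including summer time) instead
          // of adding one or two hours by hand as in the Excel formula
          String formatted = Instant.ofEpochMilli(jmeterTimestamp)
                  .atZone(ZoneId.of("Europe/Berlin"))
                  .format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"));
          System.out.println(formatted); // prints: 2012-12-12 13:12:12
      }
  }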

Keywords: Log File, Analysis, Log Format, Log Pattern, Palladio, Model Calibration, Resource Demand, Response Time, Throughput