Stateful Software Performance Engineering

This page describes a scientific approach to assessing how the predicted performance of a system depends on the context- and history-dependent internal state of the system (or its components). The questions that arise for current prediction models are (i) how to include different types of stateful information in a prediction model, and (ii) how to balance the expressiveness and complexity of the resulting models through an effective abstraction of state modelling. Only a few performance prediction approaches deal with modelling state in component-based systems. Currently, there is consensus neither on the definition of state nor on the method for including stateful information in prediction models. We conduct a state-of-the-art survey of existing approaches, addressing their expressive power to model stateful dependences. Based on the results, we introduce a classification scheme and present the state-defining and state-dependent model parameters, implemented in a chosen model-based performance prediction approach, the Palladio Component Model (PCM). Moreover, we study the performance impact of the individual state categories and discuss the model-size costs they imply. The most important observations are formulated as a set of heuristics guiding system engineers in state modelling, and their validity is evaluated experimentally.

In the following, the basics of the approach are described, as well as examples of its application and experiments.

Background

The question that arises for current performance models is how to include the software application properties identified above in a performance model, and how to build more accurate and expressive models of stateful component-based systems. In this respect, we can identify four main issues.

State definition: The property of statefulness can be identified in various artifacts of component-based systems, varying over several system life-cycle stages. Existing literature neither localizes the state-holding information identifiable in component-based systems nor classifies it into a transparent set of categories. Available surveys consider the capability to model state only partially or not at all.
Performance impact: The benefits of state modelling include increased expressive power of the models and higher accuracy of predictions. However, as a number of authors have observed, it is not well studied how much prediction accuracy state modelling gains, especially in comparison to the increased effort for modelling and analysis.
Prediction difficulty: The balance between expressiveness (state modelling) and complexity (model size increase) is a challenging research question. Only when we understand what costs must be paid for the increase in prediction accuracy can we competently decide on a suitable abstraction of state modelling (that is, to what extent the state-related information present in the analysed system should be included in the model).
State support in component models: The lack of work addressing the discussed issues can be explained by insufficient support for stateful information in existing performance-prediction models. Industrial models (like EJB, CCM or CORBA) have been designed to support internal state, since it is one of the crucial implementation details, but they lack support for broad analysis of system properties. Academic research-oriented stateful component models (like SOFA) are often accompanied by a special analysis method for a set of functional system properties (model checking), but not for performance, which is our interest here. The performance-driven research-oriented component models either lack support for state modelling or model state only partially (often abstracting state dependency to such an extent that the models do not mirror reality).

Stateful Models

We extended the component behaviour model of the PCM (especially the SEFF specification) to allow the modelling of component internal state. With this extension, system-specific global state can also be modelled by adding a blackboard component that makes its internal state available to other components in the system. Only two additions to the PCM metamodel are required to model component internal state and global system state. First, we declare a set of state variables for a component. Only declared state variables can be used within a SEFF. Second, we add a SetStateAction to the SEFF, which sets a state variable to the value of a given expression. The expression can use the input data of the SEFF, the values of other state variables, and the previous value of the state variable itself. The state variable can then be used as a parameter in branch conditions or resource demands.
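The two additions can be pictured with a small Python sketch. It is not PCM or SimuCom code: the names `Component`, `set_state` and `process_data`, the `mode` variable and all cost values are hypothetical and only mirror the concepts above (declared state variables, a SetStateAction assigning an expression over input data and the previous value, and a branch condition reading the state).

```python
class Component:
    """Toy stand-in for a component with declared state variables."""

    def __init__(self, declared):
        # Only variables declared here may be used by the behaviour (SEFF).
        self._state = dict(declared)

    def get_state(self, name):
        if name not in self._state:
            raise KeyError(f"undeclared state variable: {name}")
        return self._state[name]

    def set_state(self, name, expression, **inputs):
        """SetStateAction analogue: evaluate an expression and store the result.

        The expression receives the previous value and the call's input data.
        """
        previous = self.get_state(name)
        self._state[name] = expression(previous, **inputs)


def process_data(component, size):
    """Hypothetical SEFF: returns the resource demand of one call."""
    # Branch condition that depends on the state variable.
    if component.get_state("mode") == "compress":
        demand = 0.5 * size
    else:
        demand = 1.0 * size
    # SetStateAction analogue: switch to compress mode after a large input.
    component.set_state(
        "mode",
        lambda prev, size: "compress" if size > 100 else prev,
        size=size,
    )
    return demand


c = Component({"mode": "full"})
demands = [process_data(c, s) for s in (50, 200, 50)]
print(demands)  # [50.0, 200.0, 25.0]
```

The third call is cheaper than the first although both carry the same input, because the second call changed the state in between.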

An extended PCM model can be analysed with the extended version of the SimuCom simulation to obtain the performance metrics. At simulation runtime, each component is instantiated and holds its state variables. When a SetStateAction is executed, its expression is evaluated and the result is stored in the state variable. When BranchActions and InternalActions access state variables, the current value is retrieved. The extension increases the expressive power of SEFFs and allows programming, although the language does not become Turing complete (all loops are bounded). As multiple requests to the system are analysed concurrently, we can encounter race conditions and resulting unexpected behaviour. In our example above, race conditions are excluded because the branch condition and SetStateAction are evaluated in the same simulation event (no time passes in simulation). However, in general, if a resource demand is executed between reading the state in a BranchAction and setting the state in one of the branches, the two actions are executed in separate simulation events. Here, a second request to the component could read or change the state in between, leading to race conditions. With the extended state modelling capability, steady-state behaviour is no longer guaranteed. While this limits analysability, it can also help to detect problems in a software design.
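The race condition described above can be reproduced with a minimal event-queue sketch. This is not SimuCom; the `simulate` function, the counter variable and the arrival times are assumptions made for illustration. Each request reads the state, spends `demand` time units, and then writes the value it read plus one.

```python
import heapq


def simulate(arrivals, demand):
    """Return the final counter after all requests have read and written it."""
    state = {"counter": 0}
    events = []  # entries: (time, sequence number, kind, request id)
    seq = 0
    for rid, t in enumerate(arrivals):
        heapq.heappush(events, (t, seq, "read", rid))
        seq += 1
    seen = {}  # value each request observed when it read the state
    while events:
        time, _, kind, rid = heapq.heappop(events)
        if kind == "read":
            seen[rid] = state["counter"]
            if demand == 0:
                # Read and write happen in the same simulation event.
                state["counter"] = seen[rid] + 1
            else:
                # The write is a separate, later simulation event.
                heapq.heappush(events, (time + demand, seq, "write", rid))
                seq += 1
        else:
            state["counter"] = seen[rid] + 1
    return state["counter"]


print(simulate([0.0, 1.0], demand=0))    # 2: no time passes between read and write
print(simulate([0.0, 1.0], demand=5.0))  # 1: the second request read a stale value
```

With a demand of 5 time units the second request reads the counter before the first one has written it, so one update is lost.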

For example, assume a system service that becomes more expensive the more requests it has served. Then the response time of the system keeps increasing (The Ramp antipattern) and no steady state can be reached. With the extended state modelling capability, such performance antipatterns can be detected in the simulation results.
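A minimal sketch of how such a ramp shows up in results, assuming a hypothetical service whose demand grows linearly with a `served` state variable (the base and growth values in milliseconds are invented):

```python
def ramp_response_times(n_requests, base_ms=10, growth_ms=1):
    """Response times of a service whose demand grows with requests served."""
    served = 0  # state variable: number of requests served so far
    times = []
    for _ in range(n_requests):
        times.append(base_ms + growth_ms * served)  # state-dependent demand
        served += 1  # SetStateAction analogue
    return times


times = ramp_response_times(1000)
mean_first = sum(times[:100]) / 100   # mean over the first 100 requests
mean_last = sum(times[-100:]) / 100   # mean over the last 100 requests
print(mean_first, mean_last)  # 59.5 959.5
```

Comparing windowed means over the time series is one simple way to notice that no steady state is reached: the mean of a later window keeps exceeding that of an earlier one.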

State-Effect Analysis

When studying the performance impact of state modelling, we compared stateful models to their approximations by probabilistic models. The experiments detailed here and discussed in the paper [1] show that even if the probabilities in the stateless models reflect system usage and environment, the results of the performance evaluation may deviate significantly from those of the stateful models. The deviation is most visible in the probability distribution of the response-time values and in the time series, which are the most fine-grained response-time metrics. The variance and best/worst case also differ considerably, with the stateless models showing a higher variance. On the other hand, the median and mean values tend to be quite stable, often deviating only slightly from the stateful model. Detailed experiments and a discussion of their results follow.

Protocol State

Example Protocol State: A SEFF of open()

Description: The protocol state, which is the only state category included in this class, is used for a very specific purpose. It holds information about the service calls of a component that are currently acceptable.

Example: File managing component.

Observations:

The performance impact of the protocol-state modelling highly depends on the a-priori knowledge of the usage profile, which in general cannot be guaranteed since component behaviour and usage profile are typically defined independently by different developer roles.
Even if the usage profile is known, the actual probabilities of service execution depend on the component's environment, through which the usage profile is propagated, and thus can be very hard to quantify.
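The dependence on the usage profile can be illustrated with a sketch of a file-managing component. The class `FileManager`, its costs and the call sequence are hypothetical. The stateful variant derives the cost of each `open()` from the protocol state, while the stateless variant needs a guessed probability that the file is closed when `open()` is called.

```python
from fractions import Fraction


class FileManager:
    OPEN_COST = 10  # hypothetical demand for actually opening the file
    NOOP_COST = 1   # hypothetical demand when nothing has to be done

    def __init__(self):
        self.is_open = False  # protocol state

    def open(self):
        if self.is_open:
            return self.NOOP_COST
        self.is_open = True
        return self.OPEN_COST

    def close(self):
        self.is_open = False
        return self.NOOP_COST


def stateful_open_costs(usage):
    """Replay a call sequence and collect the cost of every open() call."""
    fm = FileManager()
    costs = []
    for call in usage:
        if call == "open":
            costs.append(fm.open())
        else:
            fm.close()
    return costs


def stateless_open_mean(p_closed):
    """Mean open() cost when the state is replaced by a branch probability."""
    return p_closed * FileManager.OPEN_COST + (1 - p_closed) * FileManager.NOOP_COST


usage = ["open", "open", "open", "close"] * 5
costs = stateful_open_costs(usage)
stateful_mean = sum(costs) / len(costs)
print(stateful_mean)                         # 4.0
print(stateless_open_mean(Fraction(1, 3)))   # 4, only if the probability is known exactly
print(stateless_open_mean(0.5))              # 5.5, with a misjudged probability
```

The stateless mean matches only when the branch probability happens to equal the share of `open()` calls that meet a closed file, which the component developer typically cannot know in advance.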

Heuristic 1.1

Definition: The importance of protocol-state modelling rises with lower knowledge of the usage profile.

Experimental evaluation: Our experiments have shown that even a very small inaccuracy in the usage profile may lead to a very imprecise stateless (i.e. probabilistic-abstraction) model, since the inaccuracies can easily be magnified by the system control flow. While in the stateful model any user input is readily propagated to the corresponding system state (this holds for the protocol, internal and global state in particular), in the stateless model the effects of the input may be distributed throughout the whole system model. Two performance-related arguments justify the inclusion of a state-dependent parameter in the model in this case. Firstly, significantly more effort is required in the stateless model to update its transition probabilities to a more accurate usage profile. Secondly, adapting the probabilities in the stateless model may not be sufficient to reflect the usage-profile change accurately; a structural change of the model may be necessary (this holds for all response-time metrics). To illustrate these two phenomena, we have experimented with an explicit (i.e. state-free) propagation of usage-profile changes into the system model, applying it to the protocol, internal and global state alike. We present the evaluation details in Heuristic 2.1, which is the analogue of Heuristic 1.1 for internal/global state.

Heuristic 1.2

Definition: The importance of protocol-state modelling rises with higher complexity of the component's environment.

Experimental evaluation: In some situations common in complex systems, it may be very hard or even impossible to estimate the probabilities for the stateless model precisely. A simple model illustrating this phenomenon can be built on the fact that a probabilistic abstraction can hardly be determined in advance for models in which the same service is called twice and behaves differently each time, depending on the current protocol-state value, which may change in the meantime. In such a case, two models of the same service would need to be present in the stateless system model to make it accurate. Otherwise, all the response-time characteristics of the probabilistic model (even the mean value, which tends to be very stable) may deviate significantly from the values of the more precise stateful model.
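A minimal sketch of this situation, with invented numbers: a hypothetical `read()` service is called twice in one scenario, and the protocol state changes in between, so the first call takes the cheap branch and the second the expensive one. The probability 9/10 for the shared model is an assumption standing in for an estimate taken from the component's overall usage.

```python
from fractions import Fraction

CHEAP, EXPENSIVE = 2, 12  # hypothetical demands of read() on an open / closed file


def stateful_total():
    """Two calls to read(); the protocol state changes between them."""
    file_open = True
    total = CHEAP if file_open else EXPENSIVE    # first call: file is open
    file_open = False                            # environment closes the file
    total += CHEAP if file_open else EXPENSIVE   # second call: file is closed
    return total


def stateless_totals(p_open_first, p_open_second):
    """Distribution of the total when each call site has its own probability."""
    dist = {}
    for c1, p1 in ((CHEAP, p_open_first), (EXPENSIVE, 1 - p_open_first)):
        for c2, p2 in ((CHEAP, p_open_second), (EXPENSIVE, 1 - p_open_second)):
            if p1 * p2:
                dist[c1 + c2] = dist.get(c1 + c2, 0) + p1 * p2
    return dist


def mean(dist):
    return sum(t * p for t, p in dist.items())


# One shared model of read(): the same probability at both call sites.
shared = stateless_totals(Fraction(9, 10), Fraction(9, 10))
# Two copies of read(), one per call site, each with its own probability.
split = stateless_totals(Fraction(1), Fraction(0))

print(stateful_total(), mean(shared), mean(split))  # 14 6 14
```

In this toy setting the single shared model misses even the mean, while the variant with two copies of the service reproduces the stateful result at the price of a larger model.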

Detailed discussion of the experiments can be found here.

Internal/Global State

Example Internal State: A SEFF of processData()

Description: The internal state, as well as the global state, holds local (resp. global) information used to coordinate the behaviour of the system or its components.

Example: Full/compress mode of a component.

Observations:

The performance impact of the internal/global-state modelling highly depends on the a-priori knowledge of the usage profile and the complexity of the system (environment of each studied component).
See the example above with strongly positively correlated branches (let us denote the alternatives in the first branch A and B, and in the second branch C and D). Note that while in the stateful model, there are only two possible service executions (either A followed by C, or B followed by D), in the probabilistic model, four alternatives are possible (both A and B can be followed by both C and D).
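The difference between two and four possible executions can be quantified with a small exact calculation. The costs of the alternatives A, B, C and D and the probability of the full mode are hypothetical values chosen for illustration; they model a component for which the expensive processing alternative is followed by the cheap transfer alternative and vice versa.

```python
from fractions import Fraction
from itertools import product

COST = {"A": 2, "B": 10, "C": 8, "D": 3}  # hypothetical demands per alternative
P_FULL = Fraction(1, 2)                   # assumed probability of the first mode


def moments(dist):
    """Mean and variance of a list of (response time, probability) pairs."""
    mean = sum(p * t for t, p in dist)
    var = sum(p * (t - mean) ** 2 for t, p in dist)
    return mean, var


# Stateful model: one state value drives both branches (A then C, or B then D).
stateful = [
    (COST["A"] + COST["C"], P_FULL),
    (COST["B"] + COST["D"], 1 - P_FULL),
]

# Stateless model: each branch is an independent draw with the same probabilities.
first = [("A", P_FULL), ("B", 1 - P_FULL)]
second = [("C", P_FULL), ("D", 1 - P_FULL)]
stateless = [
    (COST[x] + COST[y], px * py)
    for (x, px), (y, py) in product(first, second)
]

print(moments(stateful))   # (Fraction(23, 2), Fraction(9, 4))
print(moments(stateless))  # (Fraction(23, 2), Fraction(89, 4))
print(min(t for t, _ in stateless), max(t for t, _ in stateless))  # 5 18
```

With these numbers both models report the same mean, but the stateless model admits the combinations A with D and B with C, which widens the range from 10..13 to 5..18 and increases the variance roughly tenfold.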

Heuristic 2.1

Definition: The importance of internal/global-state modelling rises with lower knowledge of the usage profile.

Experimental evaluation: Our experiments revealed that even a very small inaccuracy in the usage profile may lead to a very imprecise stateless (i.e. probabilistic-abstraction) model, since the inaccuracies can easily be magnified by the system control flow. This observation is valid for three types of state (due to their usage dependence): the protocol, internal and global state. We demonstrate the observation on a global-state example of a simple library search functionality. In the evaluation, we first studied the effects of usage-profile propagation on the control flow and then designed a number of usage scenarios validating the observation.

Detailed discussion of the experiments can be found here.

Heuristic 2.2

Definition: The importance of internal/global-state modelling rises with higher complexity of the system (environment of each component).

Experimental evaluation: The evaluation of Heuristic 2.2 can be built on reasoning analogous to that of Heuristic 1.2. Two models presented earlier can be used to demonstrate the observations. First, the model employed in Heuristic 1.2 could be modelled with an internal state and used here. Second, the model employed in Heuristic 2.1 also integrates the complexity of the environment and demonstrates this observation.

Heuristic 2.3

Definition: The importance of internal/global-state modelling rises with higher correlation of subsequent state-driven decisions.

Experimental evaluation: In the experimental evaluation, we used a number of models analogous to the example outlined above, with an internal action between the branches. The experiments show that even if the probabilities of the branches accurately reflect the usage profile, the results computed from the stateless model can be very imprecise. Notice that even in a very simple model (one service with two or three branches), the probability distribution (mainly the variance and best/worst case) of the stateless-model results deviates significantly from the stateful-model distribution. The mean and median values tend to be quite stable for simple examples, and start to deviate when more complexity is introduced into the models.

Detailed discussion of the experiments can be found here.

Allocation/Configuration State

Description: This class comprises four state categories, namely the component-specific and system-specific allocation and configuration state, all of which coordinate system behaviour according to a fixed (deployment or configuration) parameter.

Example: The maximal length of a queue used by a component/system.

Observations:

In contrast to the categories discussed so far, the general influence of the allocation/configuration state on system performance is independent of the usage and the environment. For each service, the state-guarded branches are evaluated in a fixed way, irrespective of the service clients.
On the other hand, the prediction accuracy is highly dependent on the knowledge of deployment/configuration parameters, which allows the architect to cut off the behavioural branches in the stateless model that contradict the expected value of the parameter. When such information is not available to the component developer (since it is determined by a different role and/or later in the development process) and the uncertainty about the state value needs to be expressed with probabilities, the probabilistic model exhibits high inaccuracies.

Heuristic 3.1

Definition: The importance of allocation/configuration-state modelling rises with lower knowledge of deployment/configuration parameters.

Experimental evaluation: The experimental evaluation reveals that whenever there is any uncertainty about the value of the parameters, which therefore has to be modelled with probabilities in the stateless model, the model may become very imprecise. The reason is that in the stateful model the parameter value remains the same for the whole system execution (the uncertainty about the parameter value is included in only one place), whereas the stateless model also includes behaviours reflecting the unrealistic cases of parameter changes during system execution (similarly to the phenomenon observed in Heuristic 2.3). Interestingly, the deviation of the stateless model from the stateful results tends to exhibit a common pattern in the probability distribution of the reported values. In particular, while the mean and the median of the results tend to be the same (or very similar), the variance of the stateless results tends to be significantly higher, with a much smaller best value (fastest response) and a much higher worst value (slowest response) compared to the accurate results of the stateful model.

Detailed discussion of the experiments can be found here.

Session/Persistent State

Description: The session state, as well as the persistent state, holds information remembered for each individual user (within a session or even across sessions), and is used to customize system behaviour accordingly.

Example: Sale scenario in a supermarket where the information about customerCard is propagated through the whole scenario.

Observations:

The impact scale is not very dependent on the usage profile and environment, but highly dependent on the knowledge of the distribution of the state values (similarly to the knowledge of deployment parameters in case of the allocation/configuration state).
Since the subsequent queries on the state value are highly correlated, probabilistic models can hardly model session/persistent-state dependent behaviour faithfully (similarly to the internal/global state).

Heuristic 4.1

Definition: The importance of session/persistent-state modelling rises with lower knowledge of the corresponding user-given parameter values.

Experimental evaluation: Similarly to Heuristic 3.1, whenever there is any uncertainty about the value of the user-given parameter, which therefore has to be modelled with probabilities in the stateless model, the model may become very imprecise. The reason is that in the stateful model the (session/persistent) parameter value often remains unchanged for the whole system execution or changes very rarely (the uncertainty about the state value is included in only one place, the usage profile), whereas the stateless model also includes behaviours reflecting the unrealistic cases of state changes during system execution (similarly to the phenomenon observed in Heuristic 3.1). Interestingly, the deviation of the stateless model from the stateful results tends to exhibit a common pattern in the probability distribution of the reported values. In particular, while the mean and the median of the results tend to be the same (or very similar), the variance of the stateless results tends to be significantly higher, with a much smaller best value (fastest response) and a much higher worst value (slowest response) compared to the accurate results of the stateful model.

Detailed discussion of the experiments can be found here.

Heuristic 4.2

Definition: The importance of session/persistent-state modelling rises with higher correlation of subsequent state-driven decisions, which is typically very high.

Experimental evaluation: The evaluation is built on a set of examples analogous to the set employed in the evaluation of Heuristic 2.3. Moreover, it demonstrates that the correlation can be very high, since the state value (for both the session and persistent state) is highly stable throughout system execution (i.e. also between the state-dependent decisions).

Detailed discussion of the experiments can be found here.

Tool Support

Tool support for stateful systems modelling and prediction is realized as a set of Eclipse plug-ins included in the PCM installation. To get the PCM 3.0 stable version, follow the instructions on the Palladio download page.

Team

Dr. Lucia (Kapova) Happe, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Dr. Barbora Buhnova, Masaryk University, Brno, Czech Republic
Prof. Dr. Ralf Reussner, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany

Links

Palladio Component Model (PCM) homepage
HTML Documentation of version 3.3 of the PCM meta-model (as of 2010-07-14).

References

[1] L. (Kapova) Happe, B. Buhnova, R. Reussner: Stateful Component-Based Performance Models. Submitted to Software and Systems Modeling, Springer, 2012.