Exploration of Markov Decision Processes (MDPs) with Probabilistic Model Checking (PMC) Results

Screenshot of PMC-Vis, showing 3 panes (a, b, c). Pane (a) shows the initial states of the MDP (as a grid of unconnected nodes above, and as polylines in the Parallel Coordinates Plot below). Pane (b) shows a set of selected actions (in orange). Pane (c) shows a selection over the possible outcomes of a chosen action (states, in blue). Lastly, the settings sidebar can be seen in (d), currently linked to pane (b) as indicated by the green border on top of this pane. 

Usage Scenarios

To validate our results with respect to DG-1 and DG-2, we illustrate the usage of PMC-Vis on the basis of an example model of a self-adaptive Server Management System (SMS). This model has a total of 93588 states, of which 84 are initial states. Roughly, the model describes how a number of tasks with differing workload must be distributed to a number of servers. Tasks can be distributed to various servers, but a single server can only compute for one task at a time. The workload that a server can manage depends on its hardware and software settings. Roughly speaking, the behavior of this model can be split onto 3 cyclic phases: (1) generating tasks, (2) re-configuring, and (3) assigning work.

These phases repeat until some termination criteria is reached (e.g., a set number of phases have been completed). With this example SMS, we will illustrate three individual scenarios. For simplicity, we consider the scenarios separately, although, in reality, users may work on their tasks in parallel or seamlessly switch between them. Scenario 1 deals with the model configuration, in which the user is interested in finding a good server setup. A server setup describes how many servers and what hardware to use. Scenario 2 covers the model exploration, which aims at developing an understanding of its behavior and structure. Finally, scenario 3 focuses on finding managing strategies that fulfil typical SMS objectives: In this example, (1) finishing all tasks at the end of a phase and (2) minimizing ongoing costs (e.g., energy costs).

Scenario-1: Model Configuration

Once the user opens the model in PMC-Vis, the graph view and attribute view of a single pane are populated with the 84 initial states of our model, encoding all the SMS parameters that cannot change at runtime (like server number and type). Finding a good initial configuration for our model (i.e., finding a server setup satisfying the users specifications) simply entails making a selection of the best candidates from those 84 initial states.

These are the PMC results for our main criteria, maximized and minimized over the MDP:
  • the probability of finishing all assigned tasks (PrMinHappy, PrMaxHappy),
  • the expected energy consumption over the task (LeastEnergy, MostEnergy), and
  • the accrued violations for tasks not fulfilled (LeastViolations, MostViolations).


Using the parallel coordinates plot, the user can hone onto the configurations that fulfil the desired criteria. In our case, the user would want to maximize the probability of finishing all tasks (by brushing over PrMaxHappy as close to 1 as possible), and from the resulting options, a second selection should be made for minimal energy consumption (brush over the lowest values in MostEnergy or LeastEnergy), resulting in 9 selected configurations. The user can then note that the property MostViolations further refines this selection onto three groups at values 0, 3 and 6. Naturally, the user would decide to minimize the violations, and by brushing over the value 0, the selection is refined to only 3 best candidates. With this, the user has found a set of good server setups.
It is worth noting that the patterns discernible in the parallel coordinates plot also point to relations between the PMC results, which may be insightful for the user.
For instance, it is possible to see that some configurations that keep violations at minimum are expected to use a lot of energy (connections from low values in LeastViolations to high values in MostEnergy).
Likewise, it can be seen that the minimum probability of finishing all tasks (PrMinHappy) is the same for all configurations.

Scenario-2: Model Exploration

Instead of randomly selecting initial states to explore, the previous scenario has yielded a small set of initial configurations that are worth exploring. Starting from these configurations it is possible to explore how the model develops by expanding the state nodes. Repeatedly expanding within the same pane is helpful to see the general structure of the model, but it would soon lead to many irrelevant states cluttering the view.
For example, once a set of initial configurations has been chosen, the rest are no longer relevant to the user, meaning that if Scenario-1 yielded 3 promising initial states, 81 will remain in the pane without purpose.
While it would be possible to hide irrelevant states, actions and edges, the user may want to adjust their selection later, or they may want to explore sub-optimal paths in the model when debugging. Using our multi-pane approach, the user can create a checkpoint on their current graph and expand the new nodes to explore onto a new pane, without affecting the state of the previous panes. By progressively expanding the graph in either the same pane or multiple, the user can see the general structure of the MDP through the patterns that appear in the graph view, as illustrated in this image:
Here we can see the three phases of be computed over an even distribution. The patterns correspond to (a) a single MDP action assigning tasks to (b) a number of sequential reconfiguration choices, which may (or may not) affect the SMS and its prospects, and in (c) tasks are assigned to each server, until all servers have started, leading back to task generation (a) for each.

Scenario-3: Model Reconfiguration & Strategies

In this Scenario, the user wants to find a good strategy for managing the servers. This means finding a strategy that re-configures the system into a fitting state (in phase (b)), as well as assigns the right servers to the right tasks (for phase (c)). In order to find such a strategy, the user must understand the possible outcomes of certain actions in the graph and compare the alternatives.
Once a decision of interest has been identified, the user may open PMC-Vis provides multiple ways to inspect and compare the actions, towards the creation of a strategy:
  • Per-node details: Every node can show a tooltip with properties of interest (e.g., per-state PMC results, variable/parameter configurations, state labels and reward functions). If the user wants to change which properties appear in these tooltips and parallel coordinates plots, the settings sidebar provides a selection of all available properties.
  • Parallel coordinates comparison: Nodes can be selected depending on the behavior set in the settings sidebar (only states, only actions, or both simultaneously). Action nodes summarize the properties of all states reachable from them, and, similar to states, they can be inspected in the parallel coordinates plots. In a process similar to Scenario-1, the user can compare selected actions by properties of interest to make a decision on which path to follow to continue exploring.
  • Schedulers: In order to understand how non-determinism is resolved for maximum or minimum PMC results, the user can check the extracted scheduler information from the model checker. By selecting the property of interest in the scheduler settings, PMC-Vis will highlight all edges related to the actions a scheduler optimizing this property could take. For example, selecting PrMaxHappy as the scheduler property will indicate which actions to take to ensure that all system tasks will be terminated successfully. This not only allows the user to quickly discard sub-optimal actions and states, but also enables a initial understanding of optimal strategies for single goals with minimal effort. From the set of actions and states marked by the scheduler, it is also possible to refine the selection using the parallel coordinates plot as described in Scenario-1.
  • Multiple Schedulers: Although only one scheduler can be selected at a time for a pane, multiple SMS objectives may need to be considered simultaneously (e.g., since a higher number of tasks completed goes along with higher energy consumption). To find a fitting strategy, we need to combine the strategies of multiple schedulers. While it is possible to define properties that combine our objectives, or use Pareto functions to find compromises, these alternatives do not allow dynamic adjustment of the trade-off applicable. For example, one may not be interested in a equal compromise between energy consumption and task fulfilment, but instead want to favor one side over the other. With this in mind, PMC-Vis allows cloning panes as an alternative which we believe suffices to support such multi-scheduler decisions in most cases. As stated earlier, each parallel coordinates plot can be manually set to select configurations optimal to multiple simultaneous criteria. With multiple simultaneous panes showing the same decision space, a side-by-side comparison of the parallel coordinates plot selections as well as the scheduler highlighting on the graph view can be inspected.
  • Exporting the strategy: Aided by the previously discussed features, the user has accrued a sequence of decisions that altogether constitute the desired strategy to manage the servers, until the final states of the MDP are reached. However, the free exploration of the MDP in several panes means that not only the desired choices, but also several dangling nodes and half-explored paths may be present. To support the user in organizing their work across several panes PMC-Vis allows marking nodes of interest (via the context menu), which can then be exported (i.e., downloaded as JSON) independently of the pane content for all panes at once. This exported sequence may contain gaps within the MDP, either because the user did not mark all the relevant nodes that they found, or because it was unclear which states best suited the strategy at certain points. For such cases, it is possible to import the working strategy onto a pane of PMC-Vis, and in a similar process to \emph{Scenario-2}, the user may continue to explore the gaps in the system until a strategy has been completed.

Video Walk-throughs

Since our paper discusses a visual tool which is more easily explained by watching it in action, we prepared the following videos to illustrate the discussed usage scenarios.

  • Scenario 1: Intro to the Server Management System + Finding good initial configurations:


  • Scenario 2: Understanding the model behavior by finding patterns


  • Scenario 3: Reconfiguration and schedulers



DOI to the reproducibility package in Zenodo


This package includes the docker images, experiments, source code and instructions on how to replicate our results.