Operations & Maintenance

A Holistic Approach for Pump-System Analytics

By Stylianos Giannoulakis, Sulzer

A step-by-step approach is presented to address the modern requirements of a pump-system digital solution

In recent years, significant developments have been achieved in the fields of cloud computing and machine learning. In conjunction with increased awareness of the benefits of digital transformation, among both users and manufacturers of industrial equipment, a unique opportunity is emerging. This article describes a comprehensive solution that addresses pump-system monitoring, predictive analytics and optimization in a holistic manner. The goal of the proposed approach is to leverage all available system data and infrastructure and to address the shortcomings of traditional monitoring solutions. Traditional performance-monitoring and predictive solutions typically accept the errors of raw sensor data, of modeling assumptions, or of both. In addition, reliability alerts are based on thresholds of individual sensors, which are usually triggered only at the time of failure, leaving the user no reaction time.

As shown in Figure 1, five distinct steps are defined as part of an integrated solution: 1) connect remote industrial equipment to the internet; 2) monitor the state of the equipment; 3) analyze the operating conditions; 4) predict the behavior of individual components and of the complete system by training customized models; and 5) optimize the pump system with respect to efficiency, reliability and profitability. This article presents the background of the successful deployment of the first three steps at a major pipeline operator, along with the current research and development efforts to cover the remaining two steps.

Figure 1. Visual representation of a step-by-step approach for pump system analytics


Different connectivity options are available today; however, the preferred solution exploits existing infrastructure. Typically, most pump systems of medium or high criticality are equipped with a certain number of permanent sensors, registering values of fluid volumetric flow, pressures, density, pump status, vibrations, temperatures and driver energy consumption. All sensor values are stored in a central local database and can be accessed by the operator. The proposed solution extends the process by fetching all required values from the local database and transferring them to a cloud solution (Figure 2). This approach leverages existing infrastructure, does not disrupt existing operations and delivers all necessary data to a cloud environment, which offers global access to multiple users and leverages modern techniques of cloud computing.

Figure 2. This diagram shows the automated data flow of asset-sensor values to the cloud infrastructure for further evaluation

Raw sensor data are averaged over a short pre-defined time interval (typically a few minutes), and a selection of monitored parameters is consolidated into a file, which is subsequently pushed to a secure cloud service. The data transfer is one-way only, enabling the user to push the required data while ensuring that no access is possible through the firewall. Each file pushed from the user includes sensor values representing the equipment’s operating conditions at a discrete timestamp. All data files arrive at the cloud landing zone, where they are queued for the first levels of pre-processing. A selection process ensures proper sequencing of data according to their timestamps. Other standardization layers ensure that data are converted into default units of measure and structured into a relational database. The data are structured following the relations and topology of the physical equipment. More details on equipment topology are provided below.
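The edge-side aggregation step described above can be sketched as follows. This is a minimal illustration only: the sensor tags, averaging window and JSON payload format are assumptions, not details from the deployed solution.

```python
# Sketch: average raw sensor readings over a fixed interval and build a
# timestamped payload ready to be pushed (one-way) to the cloud landing zone.
from statistics import mean
from datetime import datetime, timezone
import json

def aggregate_interval(raw_samples):
    """raw_samples: {sensor_tag: [values sampled during the interval]}."""
    return {tag: mean(values) for tag, values in raw_samples.items()}

def build_payload(raw_samples):
    # One file/payload per discrete timestamp, as described in the text.
    record = aggregate_interval(raw_samples)
    record["timestamp"] = datetime.now(timezone.utc).isoformat()
    return json.dumps(record)

# Illustrative sensor tags and values:
samples = {"flow_m3h": [412.0, 415.5, 413.2], "discharge_bar": [21.4, 21.6, 21.5]}
payload = build_payload(samples)
```

In practice, the payload would be written to a file and transferred through the one-way push mechanism behind the firewall; the transfer itself is omitted here.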


Analytics architecture

This section discusses the overall data flows and analytics algorithms, providing a better understanding of the targeted architecture, which is shown in Figure 3.

Figure 3. Shown here is an overview of data flows and analytics algorithms

Following the standardization path of the previous section, for every new timestamp, raw sensor values are stored in the cloud database. Apart from operational data, static information is also stored in the same database, which characterizes the different components of the physical process. Such model parameters define the nominal expected behavior of the equipment.

The first step of the process is to evaluate the collected data for every timestamp. To minimize both measurement errors and model uncertainties, a data-reconciliation method is employed. Such a method leverages conservation equations and component modeling to correct both measurement values and model parameters. The adapted values are stored back in the database and serve as the basis of all downstream calculations.

After establishing the monitoring calculation, the user can assess the validated operating conditions of the asset, along with the deviation of the corrected model parameters from the nominal ones. The latter can provide insights into degradation patterns over time.

Corrected model parameters from recent operation are used to simulate the performance of the pump system. Instead of relying on theoretical values, the system performance is estimated based on the latest measurements. Finally, an optimization engine triggers simulations to identify the optimal operating conditions satisfying the user’s requirements for overall system performance.

Downstream calculations also leverage the monitoring results. Deviations from the pump’s best efficiency point (BEP) provide insights into both performance losses and the accumulated impact on the pump’s reliability. Finally, machine-learning algorithms are employed to identify abnormalities in pump operation and to notify the user in case of an imminent failure.


From monitoring to analytics

Following a digital-twin monitoring approach, a physical representation of the system (in addition to the sensor data) is required. Input shall be provided concerning the system topology, along with technical characteristics of the individual assets (that is, pumps, drivers, couplings, pipes, valves, sensors and so on). The combination of operational data and system characteristics offers the basis for pump-system monitoring. In a traditional monitoring system, every component is evaluated based on nearby sensor measurements and the overall performance of the system by using a subset of the local sensors. In the proposed approach, local raw measurements are combined with equipment topology and modeling. This strategy enriches the available sensor data, imposing physical and numerical constraints to the solution.

In detail, a steady-state network flow approach is followed to construct the analyzed pumping system. System equipment (pump, pipe, valve, electrical driver and so on) are represented as system nodes and pre-defined physical equations describe their behavior. The combination of these nodes can represent complex pumping configurations. During the digital-twin configuration, node combinations define the asset topology. Furthermore, a number of model parameters need to be specified to describe the nominal performance of the specific equipment. All required topology and model parameters are stored in a central cloud database to be accessible by the analytics algorithms.

Both process data and equipment modeling suffer from some degree of error, either random or systematic. A data-reconciliation method is applied, inspired by applications in chemical and power plants [1]. Such methods are able to evaluate non-linear models of equipment. They rely on data redundancies (that is, sensor values, equipment topology and physical equations) to optimally adjust measured quantities and model parameters while respecting problem constraints and conservation equations. This method derives the most probable operating conditions of both the complete system and of individual components (pumps, motors, pipes, valves and so on). This global approach reduces the level of uncertainty for every evaluated timestamp and derives customized component characteristics according to the latest status of the equipment. Any general data-reconciliation procedure must solve the following constrained least-squares problem; that is, to minimize Equation (1) subject to the constraint of Equation (2):



J(ŷ, ẑ) = (y − ŷ)ᵀ V⁻¹ (y − ŷ)   (1)


Subject to:

A_y ŷ + A_z ẑ = 0   (2)


Where y and z are the raw measured and non-measured parameters, ŷ and ẑ are the respective adapted parameters, A_y and A_z are the incidence matrices and V is the variance matrix.
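A minimal numerical sketch of Equations (1) and (2) follows, solving the constrained least-squares problem through its KKT (Lagrange-multiplier) linear system. The two-node flow network, the measured values and the variance matrix are illustrative assumptions, not data from the article.

```python
# Data reconciliation: minimize (y - y_hat)^T V^-1 (y - y_hat)
# subject to A_y y_hat + A_z z_hat = 0, via the KKT system.
import numpy as np

def reconcile(y, V, A_y, A_z):
    n_y, n_z, m = len(y), A_z.shape[1], A_y.shape[0]
    Vinv = np.linalg.inv(V)
    # Stationarity, A_z^T lambda = 0, and the linear constraints,
    # with unknowns stacked as [y_hat, z_hat, lambda]:
    K = np.block([
        [2 * Vinv,             np.zeros((n_y, n_z)), A_y.T],
        [np.zeros((n_z, n_y)), np.zeros((n_z, n_z)), A_z.T],
        [A_y,                  A_z,                  np.zeros((m, m))],
    ])
    rhs = np.concatenate([2 * Vinv @ y, np.zeros(n_z), np.zeros(m)])
    sol = np.linalg.solve(K, rhs)
    return sol[:n_y], sol[n_y:n_y + n_z]  # adapted y_hat, z_hat

# Example: two measured flows around one unmeasured internal stream z,
# with conservation at two nodes: y1 - z = 0 and z - y2 = 0.
y = np.array([10.0, 9.0])        # redundant, mutually inconsistent readings
V = np.eye(2)                    # equal measurement variances
A_y = np.array([[1.0, 0.0], [0.0, -1.0]])
A_z = np.array([[-1.0], [1.0]])
y_hat, z_hat = reconcile(y, V, A_y, A_z)  # both flows reconciled to 9.5
```

Because the two measurements are redundant through the conservation constraints, both are pulled to the common value 9.5, exactly the behavior exploited in the pump/driver example of Figure 4.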

Figure 4. This diagram shows a pump/electrical-driver system with measurements for volumetric flow (Q), suction and discharge pressures (P), rotational speed (N) and electrical current (I)

A simplified example is shown in Figure 4, where a number of measurements are available for a pump/electrical-driver system. One can also take into account the physical modeling equations that describe the expected behavior of the system. In such a case, an overdetermined system is derived, where measurements will not fit the nominal expected behavior of the components. A data-reconciliation method is applied, and Table 1 illustrates the corresponding results. It is noted that, depending on the amount of confidence given to measurements and model parameters, the adaptation results can vary. In this example, higher confidence is given to measured values and lower to model parameters, which corresponds to larger adaptations of the model parameters. Moreover, the sensor data of the evaluated timestamp indicate that the pump performance deviates from the nominal expected behavior (by 18% in efficiency and 5% in head). Increasing deviation over time can be an indication of performance degradation.


Additional constraints can be added to the system by increasing the number of available sensors and the complexity of the pumping system. This leads to a problem closer to the scale of modern pumping systems, and data reconciliation proves to be an appropriate method for reducing measurement and model uncertainties.

Every timestamp is evaluated following the above methodology, which offers greater confidence in the collected dataset for analysis. The analysis step provides a clear picture of the current and historical operation of every pump and of the rest of the equipment. For every component, the validated operating conditions can be compared to the nominal and preferred operation. Pump operation is evaluated against either API-recommended or vendor-recommended operating zones. A visual comparison between real operating data and recommended operating regions is shown in Figure 5.
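The comparison of a validated operating point against recommended zones can be sketched as a simple classification. The 70–120% of BEP flow band used below is a commonly cited API-style preferred-operating-region rule of thumb, included here only as an illustrative assumption; in practice the vendor-specific or API limits for the particular pump would be used.

```python
# Classify a validated operating point relative to BEP flow.
# The (0.70, 1.20) band is an assumed preferred-operating-region example.
def classify_operating_point(flow, bep_flow, por=(0.70, 1.20)):
    ratio = flow / bep_flow
    if por[0] <= ratio <= por[1]:
        return "preferred"
    return "outside-preferred"

# Illustrative values: operating at ~67% of BEP flow
status = classify_operating_point(flow=300.0, bep_flow=450.0)
```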

Figure 5. This graph shows the comparison between sensor data and the nominal curve and recommended operating regions

Conclusions can be drawn concerning the equipment’s performance and reliability status. Not only performance but also the pump’s reliability is influenced by deviation from the best efficiency point (BEP) flowrate, for example through the damaging effects of part-load recirculation [2]. The impact of historical operation can be analyzed, providing insights into operational improvements.


Performance and reliability

Greater financial value for the user can be achieved when predictions are made concerning future equipment operation. Performance predictions can offer significant insights into the system capabilities. The user is able to test different operational scenarios and assess their impact. This can improve the scheduling of future operation and match demand with supply requirements more accurately. The proposed solution offers performance predictions for pump systems by leveraging the derived customized equipment characteristics, as described in the previous paragraph. In combination with the system topology and specified boundary conditions, the user can trigger simulations to derive the operating conditions of individual components and of the complete system. The simulation process involves the complete pump-system calculation, and the final solution must fulfill all physical modeling equations.

Figure 6. This diagram shows a pump/electrical-driver system with volumetric flow (Q), suction pressure (P) and rotational speed (N) as system boundary conditions

As in the previous section, a simulation example of a simplified pump/electrical-driver system is shown in Figure 6, with the corresponding results in Table 2. Comparison results are provided both by using the nominal pump-performance characteristics and by applying the degradation factors calculated in the previous section. As expected, in comparison to nominal behavior, the pump discharge pressure is decreased, whereas the power consumption is increased.


Physical modeling of every component is used similarly to the data-reconciliation step. The modeling equations describe conservation laws and the equipment’s performance. In comparison to data reconciliation, where an overdetermined system is solved, in simulation the number of unknowns exactly matches the number of available equations. This requirement dictates the number of parameters that need to be specified in advance as system boundary conditions (in this example, three). An approach for solving nonlinear equations is employed [3], by iteratively solving a system of linear equations:


Solve per iteration:

A_z ∙ δz = −F   (3)


Apply corrections:

ẑ = z + δz   (4)


Where z are the original unknown parameters and ẑ are the calculated parameters after applying the respective corrections δz; A_z is the Jacobian matrix and F is the residual vector.
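The iteration of Equations (3) and (4) can be sketched on a toy pump-system simulation: finding the operating point where an assumed quadratic pump curve intersects an assumed quadratic system curve. The curve coefficients are illustrative, not values from the article.

```python
# Newton scheme from Eqs. (3)-(4): solve A_z · δz = -F for the correction,
# then update z ← z + δz, until the residuals vanish.
import numpy as np

def newton_solve(residual, jacobian, z0, tol=1e-10, max_iter=50):
    z = np.array(z0, dtype=float)
    for _ in range(max_iter):
        F = residual(z)
        if np.max(np.abs(F)) < tol:
            break
        delta = np.linalg.solve(jacobian(z), -F)  # Eq. (3)
        z = z + delta                             # Eq. (4)
    return z

# Unknowns z = [Q, H]: pump head a - b*Q^2 must equal system head c + k*Q^2.
a, b, c, k = 80.0, 0.05, 20.0, 0.10  # assumed curve coefficients

def residual(z):
    Q, H = z
    return np.array([H - (a - b * Q**2), H - (c + k * Q**2)])

def jacobian(z):
    Q, _ = z
    return np.array([[2 * b * Q, 1.0], [-2 * k * Q, 1.0]])

Q, H = newton_solve(residual, jacobian, z0=[10.0, 50.0])
# Converges to Q = 20, H = 60 for these coefficients.
```

In the full solution the same scheme scales up: every component node contributes its modeling equations to F, and the boundary conditions fix exactly enough parameters to make the system square.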

Following the proposed methodology, complex pump systems can be simulated, where the performance characteristics of each component are adapted in advance according to recent sensor values and data-reconciliation results. This approach enables a great number of what-if scenarios for evaluating the current performance of the pump system. The operator can test operating requirements by performing system simulations reflecting the different available operating scenarios. Due to the nature of the approach, boundary parameters can be specified according to the desired operating conditions. If throughput is of major interest, the volumetric flow is provided as a boundary condition and the system resolves the other parameters. Similarly, if discharge head or pump operation close to BEP is of interest, these are defined as boundary conditions.

Concerning reliability predictions, the target of the proposed approach is to alert the user with sufficient notice to prevent an imminent failure. A clear benefit is gained by applying corrective measures before the failure occurs: this can reduce the number of catastrophic failures, which have a significant impact on repair costs and downtime. To achieve this reliability goal, machine-learning techniques are employed. More precisely, a combination of unsupervised anomaly detection and pump-physics-driven modeling is selected. While the specificity of anomaly-detection techniques can be inadequate, the problem space may be reduced considerably by imposing constraints on variables, especially by modeling the correlations between components [4].

The target is to determine whether a pump is operating according to normal standards or whether evidence of abnormality indicates an imminent failure. For that reason, training data are used to learn a model of the normal behavior. During inference, new data are compared to the expectation and then classified. A review of common such methods can be found in Ref. 5. As seen in Figure 7, after every major maintenance action on the pump, a model representing the healthy asset is trained, which is used as the benchmark for any future operation. Such models represent healthy pump operation over the full range of operating conditions. For every new timestamp, the pump conditions are compared to the healthy benchmark model, and every deviation or abnormality is a potential indication of a developing issue.
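The benchmark-model idea can be illustrated with one simple unsupervised choice; the article does not specify the actual algorithm, so the multivariate-Gaussian model, the Mahalanobis-distance score and the 99th-percentile threshold below are all assumptions for the sketch.

```python
# Illustrative stand-in for the healthy-benchmark model: fit a multivariate
# Gaussian to healthy operating data collected after a maintenance action,
# then flag new timestamps whose Mahalanobis distance exceeds a threshold.
import numpy as np

class HealthyBenchmark:
    def fit(self, X):
        """X: (n_samples, n_sensors) array of healthy operating data."""
        self.mu = X.mean(axis=0)
        self.cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
        # Derive the alarm threshold from the healthy data itself:
        d = np.array([self._distance(x) for x in X])
        self.threshold = np.percentile(d, 99)
        return self

    def _distance(self, x):
        diff = x - self.mu
        return float(np.sqrt(diff @ self.cov_inv @ diff))

    def is_anomalous(self, x):
        return self._distance(x) > self.threshold

# Illustrative two-sensor data (e.g. flow and head) for a healthy pump:
rng = np.random.default_rng(0)
healthy = rng.normal([100.0, 60.0], [2.0, 1.0], size=(500, 2))
model = HealthyBenchmark().fit(healthy)
alarm = model.is_anomalous(np.array([100.0, 45.0]))  # far-off head reading
```

Because the score combines sensors jointly, a reading that is plausible for each sensor in isolation can still be flagged when the combination violates the learned correlations, which is the advantage over per-sensor thresholds discussed below.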

Figure 7. The sequence at the top shows how an unsupervised machine-learning anomaly-detection algorithm works over time. The benchmark model is trained from healthy data (graph, lower left), which can then be used to identify operational abnormalities (graph, lower right)

The key point of the proposed solution is that such a method does not rely on individual sensors’ threshold values. Instead, multiple sensors are combined, and pump key performance indicators are evaluated and compared to their expected values. In several cases, this approach can provide an indication of the developing issue much earlier than the traditional threshold-monitoring approach, which usually triggers an alert only seconds before the actual failure.


Optimization capabilities

By establishing performance and reliability predictions, the next target of this solution is to offer optimization capabilities to the user. An optimizer triggers simulations of the pump system and, based on an optimization strategy and boundary conditions, the operation can be optimized for single or multiple targets. The pump-system performance can be optimized by ensuring that minimum energy is consumed for given load requirements. Another target can be shifting operation closer to preferred conditions, thereby also maximizing the reliability index. Furthermore, the use of drag-reducing agents can be optimized, while ensuring that the pump system operates at the desired conditions. It is apparent that all discussed optimization strategies also have a positive impact on the financial targets of the system operation.

An optimization example is shown in Figure 8, where competing targets are illustrated. A typical target of a pipeline operator is throughput maximization. In addition, any reduction in operational costs (that is, electricity or fuel consumption) is of great interest. Last but not least, ideally all pumps shall operate as close as possible to their BEP, to ensure performance and reliability benefits. Based on the established data reconciliation method, physical pump and pipeline models are adapted to represent the current equipment behavior.

Figure 8. Shown here is a multiple-target optimization problem of a pump system. Underlying performance models are adapted to represent latest equipment behavior

In the proposed approach, an optimization method sets the boundary conditions and triggers simulations of the pump-system digital twin. The optimizer evaluates the simulation results and classifies them according to the pre-defined target or targets. Several simulation-optimization methods are available in the literature (for instance, gradient-based search, heuristic methods and so on); a critical review of available methods is presented in Ref. 6.

An example of the expected optimization results is shown in Figure 9. In the case of two competing targets (that is, maximum system throughput and pump operation at BEP), a number of optimal solutions form a Pareto front. The algorithm offers a number of viable solutions to the operator to choose from.
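Extracting a Pareto front from candidate simulation results can be sketched with a simple non-dominated filter over the two competing targets of Figure 9. The candidate points below are illustrative, not results from the article.

```python
# Keep the non-dominated candidates among (throughput, BEP-deviation) pairs:
# maximize throughput while minimizing deviation from BEP.
def pareto_front(points):
    """A point is dominated if another point has throughput >= its own and
    BEP deviation <= its own, with the two points not identical."""
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] <= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Illustrative simulation outcomes: (throughput in m3/h, relative BEP deviation)
candidates = [(900, 0.05), (950, 0.12), (980, 0.20), (940, 0.18), (910, 0.15)]
front = pareto_front(candidates)  # (940, 0.18) and (910, 0.15) are dominated
```

The surviving points are the trade-off curve offered to the operator: no remaining candidate can improve one target without worsening the other.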

Figure 9. Shown here is a Pareto front between two competing targets of a pipeline system



This article was first presented at the VDMA’s 4th International Rotating Equipment Conference, Wiesbaden, Germany, September 24–25, 2019.



1. Romagnoli, J.A. and Sanchez M.C., “Data Processing and Reconciliation for Chemical Process Operations,” Academic Press, California, 2000.

2. Gülich, J.F., “Centrifugal Pumps,” 2nd ed., Springer, Heidelberg, Germany, 2010.

3. Larock, B.E., Jeppson, R.W. and Watters, G.Z., “Hydraulics of Pipeline Systems,” CRC Press, Florida, 1999.

4. Svendsen, N. and Wolthusen, S., Using Physical Models for Anomaly Detection in Control Systems, in: Palmer, C. and Shenoi, S. (Eds.), “Critical Infrastructure Protection III,” IFIP Advances in Information and Communication Technology, vol. 311, Springer, Berlin, Heidelberg, 2009.

5. Pimentel, M.A.F., Clifton, D.A., Clifton, L. and Tarassenko, L., A Review of Novelty Detection, Signal Processing, vol. 99, pp. 215–249, 2014.

6. Carson, Y. and Maria, A., Simulation Optimization: Methods and Applications, Proceedings of the 1997 Winter Simulation Conference, Atlanta, Ga., 1, pp. 118–126, 1998.



Stylianos Giannoulakis is a data scientist and machine diagnostics engineer in the Global Technology division of Sulzer Pumps Equipment (Neuwiesenstrasse 15, 8401 Winterthur, Switzerland; Email: stylianos.giannoulakis@sulzer.com). He received a Diploma in Mechanical Engineering from the National Technical University of Athens (NTUA) and an M.Sc. in Energy Science and Technology from the Swiss Federal Institute of Technology (ETH Zurich). He leads data analytics, machine learning and software development prototyping for pumps equipment. He supervises the complete lifecycle of data-analytics activities, from research and prototyping to operationalizing new developments as customer product offerings.
