Analytics builds the competitive advantage, part 3 – data modelling

A key link in business analytics is data modelling, the process through which extracted information is turned into knowledge that supports more effective business decisions. Put briefly, modelling turns raw data into conclusions with practical applications, and the resulting models become reusable tools. Modelling makes it possible to forecast future business results, assess risk, evaluate the current situation and, more broadly, manage time, money and resources effectively.


Models can take the form of functions, equations or parameters. A model is developed using various computational methods, e.g. linear or logistic regression, decision or regression trees, random forests, neural networks, clustering, association rules or Bayesian methods. The term ‘model’ can refer to the result obtained by applying a single method; however, one model can also combine several methods at once. We are limited only by our imagination, our data (and time).

Model building starts from our own experience. We know what causes what: we know where to look for the reasons behind a rise or fall in production. Even here, though, statistical methods come in handy for selecting the factors (variables) to be modelled, so that we can focus on the most important ones. We can also discover dependencies we did not know about before. The purpose of a model is to find the strongest connection between the result that interests us and the other data (variables). If a change in the result is the consequence of changes in the variables, we can speak of a dependency and try to model it, i.e. describe it using parameters and coefficients.

This set of parameters and coefficients is the acquired information. It is a general picture extracted from the depths of the data. It is a tool for forecasting what will happen and for steering the future. A model is like a cooking recipe: it tells us that if we add a certain amount of ingredient A, we will get something hot and spicy, and if we add B, the outcome will be sweet.
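The idea of picking the strongest dependency and describing it with coefficients can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the production figures, the candidate variables (`orders`, `temperature`) and the helper names `pearson` and `fit_line` are all made up for this example, and a straight line fitted by least squares stands in for whatever method a real project would choose.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Hypothetical monthly data (invented numbers, not from the article).
production = [120, 135, 150, 160, 180, 195]
candidates = {
    "orders": [60, 68, 75, 80, 90, 98],
    "temperature": [12, 18, 9, 15, 11, 14],
}

# Rank candidate variables by the strength of their link to the result.
ranked = sorted(
    candidates,
    key=lambda name: abs(pearson(candidates[name], production)),
    reverse=True,
)
strongest = ranked[0]

# The fitted coefficients are the "recipe": the model itself.
slope, intercept = fit_line(candidates[strongest], production)
print(f"production = {slope:.2f} * {strongest} + {intercept:.2f}")
```

The two numbers printed at the end are the model in the sense used above: a compact description of the dependency that can be reused for forecasting.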

Data modelling brings order to a complex, multidimensional reality. The purpose of a model is to make an understandable and useful generalisation and to extract the essential features of our business. This is where we can store our experience and knowledge of the business we manage. It is a lens through which we can look at the changing world.

Statistics and models are a convenient way of describing reality: a description we can refer to, confront and question. The principles used to build the statistics matter, and as long as nothing changes we can expect significant repeatability. But be wary! Statistics and statistical models offer nothing “for certain”: probability is king. That is why statistics are expressed as intervals rather than precise values. The colloquial phrase “more or less”, so often used in business, gives a taste of these statistical intervals. Despite all this, the cognitive power of statistics and models is significant. They use numbers to show how certain phenomena depend on one another. They quantify dependencies and measure their probability.
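A small sketch of what “more or less” can look like in numbers. It assumes hypothetical weekly shipment counts and uses a normal approximation for the interval around the average; with a sample this small, a t-distribution would give a somewhat wider interval. The function name `mean_interval` is invented for this example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_interval(values, confidence=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(values)
    centre = mean(values)
    standard_error = stdev(values) / sqrt(n)
    # Two-sided quantile of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return centre - z * standard_error, centre + z * standard_error

# Hypothetical weekly shipment counts (invented numbers).
weekly_shipments = [102, 98, 105, 110, 95, 101, 99, 104, 107, 100, 96, 103]

low, high = mean_interval(weekly_shipments)
print(f"average weekly shipments: more or less {low:.1f} to {high:.1f}")
```

Asking for less confidence, for example `mean_interval(weekly_shipments, 0.80)`, narrows the interval: precision and certainty trade off against each other.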

The most important thing is to use the result in running the business. We can use models to make decisions ourselves or build them into the systems used by the company. The latter is a step towards Machine Learning. Either way, we can count on improving the precision of our decisions and actions. However, if a model stops working, it may mean that a new circumstance has arisen which the model does not take into account. It must be detected, and business operations need to focus on it as soon as possible.
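One simple way to notice that a model has stopped working is to compare its recent forecast errors with the errors it made historically. The sketch below is only one possible check, with invented numbers; the function names and the `tolerance` factor of 2.0 are assumptions made for this illustration, not a recommended standard.

```python
def mean_abs_error(actuals, forecasts):
    """Average absolute difference between actual and forecast values."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)

def model_needs_review(hist_actual, hist_forecast,
                       recent_actual, recent_forecast, tolerance=2.0):
    """Flag the model when the recent error exceeds tolerance x historical error."""
    baseline_error = mean_abs_error(hist_actual, hist_forecast)
    recent_error = mean_abs_error(recent_actual, recent_forecast)
    return recent_error > tolerance * baseline_error

# Hypothetical history in which the model worked well (errors of 1 to 2 units).
hist_actual = [100, 104, 98, 107]
hist_forecast = [101, 103, 99, 105]

# Hypothetical recent periods in which something new has happened.
recent_actual = [90, 88, 85]
recent_forecast = [100, 103, 98]

flagged = model_needs_review(hist_actual, hist_forecast,
                             recent_actual, recent_forecast)
print("review the model" if flagged else "model still fits")
```

A flag of this kind does not explain what has changed; it only tells us where to start looking.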

Seasonality is also a model

The seasonal model can be found in many companies. For instance, quarterly seasonality can be written down as the following equation:

This is our statistical model. We see that in the 1st quarter operations drop by 8 units on average and in the 2nd they grow by 3 units, whereas the largest increase comes in the last quarter of the year. Assuming that we have fitted this model to reality correctly, we can draw several conclusions. For example, rationalization projects or investments should be carried out at the beginning of the year, when the workload is lighter. Or, in order to avoid downtime, we should look for customers who have their peaks in the first or third quarter. We can also plan holiday leave based on such a model, because we know how much less work to expect. We can better estimate the demand for materials, and before the peak season it is worth reminding our customers about our offer.
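A quarterly seasonal model of this kind can be thought of as a baseline level plus an average effect for each quarter. The sketch below estimates such effects from hypothetical data. The volumes and years are invented and chosen so that the first two effects match the values quoted above (a drop of 8 and a growth of 3); the third and fourth quarter effects are illustrative assumptions only, and the function names are made up for this example.

```python
from statistics import mean

# Hypothetical quarterly volumes for two years (invented numbers).
volumes = {
    2015: [91, 104, 97, 108],
    2016: [93, 102, 99, 106],
}

def quarterly_effects(volumes_by_year):
    """Return the overall baseline and each quarter's average deviation from it."""
    all_values = [v for year in volumes_by_year.values() for v in year]
    baseline = mean(all_values)
    effects = []
    for q in range(4):
        quarter_mean = mean(year[q] for year in volumes_by_year.values())
        effects.append(quarter_mean - baseline)
    return baseline, effects

def forecast(baseline, effects, quarter):
    """Expected volume for a quarter numbered 1 to 4."""
    return baseline + effects[quarter - 1]

baseline, effects = quarterly_effects(volumes)
for q, effect in enumerate(effects, start=1):
    print(f"Q{q}: {effect:+} units relative to the baseline of {baseline}")
```

With effects in hand, `forecast(baseline, effects, 1)` gives the expected volume for the first quarter, which is the figure needed for planning investments, leave or material orders.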

Smart organizations build models

The use of models in organizations is not common. Although many of the methods were developed before the 20th century, they still pose a psychological barrier. There are a few major reasons for that.

We still rely on reports in the form of charts and tables; we calculate sums, averages and percentages, and we stop at that. On the one hand, managers do not request models for themselves and, on the other, employees find it hard to justify the significant additional workload just to present something. Business is not familiar with models, so it does not want them. Applying the results can also be problematic. You must be able to embed the model in operational applications. And, above all, you must trust models as part of your work methodology. Applying results calls for a data-driven corporate culture, for instance one that includes them in the decision-making procedure. Lack of tools is a problem as well. Office packages do not provide functions for developing models based on the known techniques. Commercial software is quite expensive, even for a large company. Finally, specialists who know how to build such models are also needed.

These are the obstacles which must be overcome in order to make use of data through statistical models. For many years now, three sectors of the economy have commonly used models: banking, retail and telecommunications. They most often use models of customer clustering and retention, of the shopping basket and of the shopping sequence. Retail chains launch loyalty programs which use discounts and bonuses to gather information about buying habits. The customer’s basket shows the force of habit, which can be used to safely launch a new product from the same segment, to plan shelves in shops, to create advertisements and to build the price strategy. The telecommunications industry launches retention plans intended to pre-empt the takeover of the customer by the competition, so, for instance, phone calls about extending the contract are made 2, 3 or even 6 months before the current one ends. These examples are quite well known.
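The shopping basket model mentioned above usually boils down to two simple measures: support (how often two products appear together) and confidence (how often a customer who buys one product also buys the other). A minimal sketch, using invented baskets and hypothetical function names:

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (invented data).
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "cereal"},
    {"bread", "butter", "jam"},
]

def pair_support(baskets):
    """Share of all baskets that contain each pair of products."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: count / len(baskets) for pair, count in counts.items()}

def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

support = pair_support(baskets)
print("support(bread, butter):", support[("bread", "butter")])
print("confidence(bread -> butter):", confidence(baskets, "bread", "butter"))
print("confidence(butter -> bread):", confidence(baskets, "butter", "bread"))
```

Pairs with high support and high confidence are the habits referred to above: they suggest which products to place together on a shelf or to promote jointly.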

Many new companies which base their business on digital products use complex algorithms that take the results of statistical models as their control parameters. Traditional companies, too, as they digitize, buy and develop programs which use statistical data modelling to support management or even to co-manage.

Now it’s your turn...


Adam Karolewski

Genius Lab Analyst

An analyst with an educational background in logistics. Involved in data analysis for many years. Has been working for Raben Group since January 2016. Currently the head analyst at Genius Lab, responsible for the research and development of the Group.

