organization of files + readme.md completed
lucazav committed Jan 30, 2022
1 parent b0b4345 commit 6accd68
Showing 10 changed files with 53 additions and 4 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,2 +1,4 @@
test-code.R
tables.xlsx
.gitignore
/interactivets/.tmp/*
55 changes: 51 additions & 4 deletions README.md
@@ -2,13 +2,13 @@

In addition to being a fantastic data transformation and modeling tool, Power BI gives the analyst the ability to organize data graphically in a very appealing way.

Assuming you want to display a time-series including also the prediction interval for the forecasting part, Power BI by default allows you to do so, but it is the tool itself that determines the forecasting according to the type of series through two different algorithms: a [seasonal (ETS AAA)](https://powerbi.microsoft.com/es-es/blog/describing-the-forecasting-models-in-power-view/#ETSAAA) and a [non-seasonal (ETS AAN)](https://powerbi.microsoft.com/es-es/blog/describing-the-forecasting-models-in-power-view/#ETSAAN) algorithm.
If you want to display a time-series together with the prediction interval for the forecast part, Power BI lets you do so by default through the _Line chart_ visual. However, the tool itself chooses the forecasting algorithm according to the type of series, picking between two options: a [seasonal (ETS AAA)](https://powerbi.microsoft.com/es-es/blog/describing-the-forecasting-models-in-power-view/#ETSAAA) and a [non-seasonal (ETS AAN)](https://powerbi.microsoft.com/es-es/blog/describing-the-forecasting-models-in-power-view/#ETSAAN) algorithm.

Take the following time-series as an example:

![img01](/img/01.png)

One can guess by eye that an annual seasonality is present. If we add a forecast in Power BI and let the tool automatically determine the seasonality, the results are not the best:
One can guess by eye that an annual seasonality is present. If we add a forecast in Power BI through the _Analytics_ tab and let the tool determine the seasonality automatically, the results are not the best:

![img01](/img/02.png)

@@ -22,7 +22,7 @@ For this reason, the analyst may prefer not to use the default forecasting algor

Now suppose you perform batch scoring with the aforesaid model and therefore have a CSV file as output with all the information needed to draw the plot. __There is no visual__ in Power BI that lets you draw actual values, predicted values, and the upper and lower bounds of the prediction interval taken from an existing table.

This Power BI custom visual that I am sharing with you was created to fill the gap just described. Thanks to it, it is possible to plot the shared time-series forecasting data in this repository (_ts_forecast.csv_) as well:
This Power BI custom visual that I am sharing with you was created to fill the gap just described. Thanks to it, it is possible to plot the shared time-series forecasting data in this repository (_data/ts_forecast.csv_) as well:

![img01](/img/04.png)

@@ -32,7 +32,7 @@ Unlike the previous forecasting provided by Power BI, it is evident that the val

The main goal I set for myself was not only to give the analyst the ability to plot a time-series forecasting plot given all its component values, but also to create an __interactive__ custom visual.

Since I'm a big fan of the R language and aware that Power BI allows you to create custom visuals using R, I got to thinking about what R packages might be able to generate a time-series forecasting plot. Anyone who knows a little about R and the [Tidyverse framework](https://www.tidyverse.org/) will surely know about the [Tidymodels framework](https://www.tidymodels.org/) and, specifically for time-series, Matt Dancho's fantastic [_timetk_](https://github.com/business-science/timetk) and [_modeltime_](https://github.com/business-science/modeltime) packages. In particular, the _modeltime_ package contains the _plot_modeltime_forecast()_ function that allows you to plot a time-series with the predicted values from one or more models. Moreover, the output of this function can be an interactive plot thanks to the use of [_plotly.R_](https://github.com/plotly/plotly.R). It is exactly what I was looking for!
Since I'm a big fan of the R language and I'm aware that Power BI allows you to create custom visuals using R, I started thinking about which R packages might be able to generate a time-series forecasting plot. Anyone who knows a little about R and the [Tidyverse framework](https://www.tidyverse.org/) will surely know about the [Tidymodels framework](https://www.tidymodels.org/) and, specifically for time-series, Matt Dancho's fantastic [_timetk_](https://github.com/business-science/timetk) and [_modeltime_](https://github.com/business-science/modeltime) packages. In particular, the _modeltime_ package contains the _plot_modeltime_forecast()_ function, which plots a time-series together with the predicted values from one or more models. Moreover, the output of this function can be an interactive plot thanks to [_plotly.R_](https://github.com/plotly/plotly.R). It is exactly what I was looking for!

The only problem I encountered was that the Power BI service has an outdated R engine (_Microsoft R Open 3.4.4_) and the R packages available there include neither _timetk_ nor _modeltime_. Because of this, to be able to publish a report that uses this custom visual to the service, I had to extract the _plot_modeltime_forecast()_ dependencies from these packages and make very minor changes to integrate them with each other. The files extracted from the various packages are as follows:

@@ -62,4 +62,51 @@ Right after clicking on the new icon, you will be asked to enable the custom vis

![img01](/img/07.png)

The visual expects the following data fields:

* __Date__: (mandatory) variable containing date values
* __Value__: (mandatory) values of the time-series, both actual and predicted
* __Value Type__: (mandatory) label stating the type of each value ("actual" or "prediction")
* __Confidence Low__: (optional) lower bound of the prediction interval
* __Confidence High__: (optional) upper bound of the prediction interval
* __Model ID__: (optional) numerical ID of the model that produced the predicted value in each row. If all the predicted values in the dataset come from a single model, this field may be omitted
* __Model Description__: (optional) description of the model that produced the predicted value in each row. If all the predicted values in the dataset come from a single model, this field may be omitted
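
Before loading a table into the visual, it can help to check that the mandatory fields are filled in. The following is a minimal Python sketch, not part of the visual itself. It assumes the CSV column headers match the field names listed above, which may not be true of _ts_forecast.csv_, and `check_fields` is a hypothetical helper name.

```python
import csv
import io

# Assumption: column headers are identical to the field names in the list above.
MANDATORY = ("Date", "Value", "Value Type")
ALLOWED_TYPES = {"actual", "prediction"}


def check_fields(rows):
    """Return a list of problems found in an iterable of dict rows."""
    problems = []
    for n, row in enumerate(rows, start=1):
        # A mandatory field counts as missing when it is absent or empty.
        missing = [name for name in MANDATORY if not row.get(name)]
        if missing:
            problems.append(f"row {n}: missing {', '.join(missing)}")
        elif row["Value Type"] not in ALLOWED_TYPES:
            problems.append(f"row {n}: unexpected Value Type {row['Value Type']!r}")
    return problems


# Tiny in-memory sample; the second row uses a label the visual does not expect.
sample = io.StringIO(
    "Date,Value,Value Type\n"
    "2022-01-01,10.0,actual\n"
    "2022-01-02,11.5,forecast\n"
)
problems = check_fields(csv.DictReader(sample))
print(problems)  # ["row 2: unexpected Value Type 'forecast'"]
```

The optional fields are deliberately not checked here, since they may be left out entirely.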

Once you have added data to the above fields, you can explore the _Format_ tab of the custom visual. In particular, the "_Plot Settings_" section lets you adjust the elements of the displayed plot:

![img01](/img/08.png)

Now let's look at the expected format of the input dataset that feeds the data fields.

## Expected Format of the Input Dataset

Suppose you have a time-series running from time T<sub>0</sub> to time T<sub>n</sub>, for a total of _n+1_ observations. The series then contains _n+1_ actual values. Suppose now that you want to forecast the values in the time range from T<sub>i</sub> to T<sub>n</sub>, that is, the last _n-i+1_ values of the series. The input dataset has a single variable holding the values (_Value_), and the variable _Value Type_ distinguishes their role. _Value_ therefore contains both the actual values (the _n+1_ rows for which _Value Type = "actual"_) and the predicted values (the _n-i+1_ rows for which _Value Type = "prediction"_). As a result, the _Date_ column contains the range from T<sub>i</sub> to T<sub>n</sub> twice, once for the actual values and once for the predicted values:

![img01](/img/09.png)
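
The stacking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary keys reuse the field names of the visual (the real CSV headers may differ), the bounds of actual rows are left empty, and `build_long_format` is a hypothetical helper.

```python
from datetime import date, timedelta


def build_long_format(dates, actuals, predictions, lower, upper):
    """Stack actual and predicted values into one long table.

    `predictions`, `lower` and `upper` cover only the last
    len(predictions) entries of `dates`.
    """
    if len(dates) != len(actuals):
        raise ValueError("dates and actuals must have the same length")
    if not (len(predictions) == len(lower) == len(upper)):
        raise ValueError("predictions, lower and upper must have the same length")
    if len(predictions) > len(dates):
        raise ValueError("more predictions than dates")

    # One row per actual observation; bounds left empty (an assumption).
    rows = [
        {"Date": d, "Value": v, "Value Type": "actual",
         "Confidence Low": None, "Confidence High": None}
        for d, v in zip(dates, actuals)
    ]
    # The forecast dates repeat the tail of the actual date range.
    forecast_dates = dates[len(dates) - len(predictions):]
    rows += [
        {"Date": d, "Value": p, "Value Type": "prediction",
         "Confidence Low": lo, "Confidence High": hi}
        for d, p, lo, hi in zip(forecast_dates, predictions, lower, upper)
    ]
    return rows


# Toy series: 6 daily observations, with a forecast for the last 2 of them.
dates = [date(2022, 1, 1) + timedelta(days=k) for k in range(6)]
actuals = [10.0, 12.0, 11.0, 13.0, 14.0, 15.0]
rows = build_long_format(dates, actuals, [13.5, 15.2], [12.0, 13.0], [15.0, 17.4])
print(len(rows))  # 8: 6 actual rows + 2 prediction rows
```

In the result, the last two dates appear twice, once with _Value Type = "actual"_ and once with _Value Type = "prediction"_.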

If you want to plot the forecasts obtained from 2 models, in addition to adding another _n-i+1_ rows to the dataset, you have to introduce two new variables: _Model ID_ and _Model Description_. These two variables associate each prediction in the _Value_ field with the specific model that produced it. Everything becomes clearer in the following figure:

![img01](/img/10.png)
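
As a rough sketch of the multi-model layout, the Python snippet below appends one block of prediction rows per model. Several things here are assumptions made for illustration: the dictionary keys mirror the field names of the visual, actual rows carry empty model fields, `stack_forecasts` is a hypothetical helper, and the model names are placeholders.

```python
def stack_forecasts(actuals, forecasts):
    """Combine one block of actual rows with one prediction block per model.

    actuals:   list of (date, value) pairs.
    forecasts: dict mapping (model_id, model_description) to a list of
               (date, predicted_value) pairs.
    """
    # Actual rows belong to no model, so the model fields stay empty (assumption).
    rows = [
        {"Date": d, "Value": v, "Value Type": "actual",
         "Model ID": None, "Model Description": None}
        for d, v in actuals
    ]
    # Each model contributes its own rows, tagged with its ID and description.
    for (model_id, description), predictions in forecasts.items():
        rows += [
            {"Date": d, "Value": v, "Value Type": "prediction",
             "Model ID": model_id, "Model Description": description}
            for d, v in predictions
        ]
    return rows


actuals = [("2022-01-01", 10.0), ("2022-01-02", 12.0),
           ("2022-01-03", 11.0), ("2022-01-04", 13.0)]
forecasts = {
    (1, "ARIMA"): [("2022-01-03", 11.4), ("2022-01-04", 12.6)],  # placeholder model
    (2, "ETS"): [("2022-01-03", 10.9), ("2022-01-04", 13.3)],    # placeholder model
}
rows = stack_forecasts(actuals, forecasts)
print(len(rows))  # 8: 4 actual rows + 2 prediction rows per model
```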

Let's now put all of this into practice.

## Showing everything in a demo

You can find the ts_forecasting.pbix file in the "demo" folder of the repository. It highlights the differences between forecasting a time-series provided in a CSV file with the standard Power BI features and with the time-series custom visual. The CSV file is located in the "data" folder of the repository.

You can also test the time-series custom visual in a live demo published via Power BI's "Publish to web" option. Here is [the link](https://app.powerbi.com/view?r=eyJrIjoiNTZlZTNkZTctMzZiNi00NTUzLTlkMzgtMTZkZTRlNDc1YmU2IiwidCI6IjA4MjRlOGM5LWQzNWEtNDAwMC1hYTYxLTQ3YjM5MDdjMDEyZSIsImMiOjF9&pageName=ReportSection) to the live demo.

## References

Below is a list of links to key references:

* [Describing the forecasting models in Power View](https://powerbi.microsoft.com/es-es/blog/describing-the-forecasting-models-in-power-view/)
* [Timetk, time series analysis in the tidyverse](https://business-science.github.io/timetk/)
* [Modeltime, tidy time series forecasting with tidymodels](https://business-science.github.io/modeltime/)
* [Create visuals by using R packages in the Power BI service](https://docs.microsoft.com/en-us/power-bi/connect-data/service-r-packages-support)

If you want to learn how to create a Power BI R Custom Visual from scratch, as well as many other topics on how to better integrate Python and R into Power BI, you can find detailed guides in my book "[_Extending Power BI with Python and R_](https://www.amazon.com/Extending-Power-Python-transform-analytical/dp/1801078203/)":

<a href="https://www.amazon.com/Extending-Power-Python-transform-analytical/dp/1801078203/" rel="Extending Power BI with Python and R">![img01](/img/11.png)</a>


File renamed without changes.
Binary file not shown.
Binary file added images.pptx
Binary file not shown.
Binary file added img/08.png
Binary file added img/09.png
Binary file added img/10.png
Binary file added img/10.snagx
Binary file not shown.
Binary file added img/11.png
