If 2020 taught us anything, it is to expect the unexpected. Yet industries such as retail make some of their most crucial decisions months ahead, based on their ability to predict the future.
Now, retailers have seen for themselves that past data alone can’t predict the future. So what do you do when shuttered stores, drops in footfall, and a switch in customer journeys have completely distorted 2020 data? What can retailers do with the data they have to make sure it works for them in the future?
Our Head of Data Science, Marcos Peñamil, explains why retail data is so distorted this year, some of the pitfalls of attempting to directly use it for forecasting, and how retailers can treat both data and models to set themselves up for success in the future.
What happened in 2020 from the data perspective?
Data is essential because it gives you the ability to comprehend the reality of your retail business, and becomes the main source for your models to forecast and plan the future. Therefore, the quality of your future predictions is extremely dependent on the inclusion and quality of your past data and forecasting models.
Missing or distorted data have a big impact on what is reflected in your KPIs, but since forecasting models are so complex, the impact on these models will probably be bigger, severely impairing your ability to look ahead and predict the future.
The biggest factors distorting retail data
If you take a quick look at your 2020 data, what do you see? While this might seem like an obvious question, let’s see what it’s likely telling you:
- The overall volume of sales is lower due to closed stores. But on the other hand, e-commerce demand picked up significantly, even to historic levels.
- Store-specific demand patterns have also changed. Stores with high sales pre-COVID-19 might have experienced lower sales, while smaller stores might have increased sales (e.g. increased demand in local stores and lower sales in areas of high tourism).
- The drop and changes in demand across stores have left higher-than-usual stock levels.
- At the same time there was greater volatility in terms of frequency, average ticket, etc.
- There are changes in the relative weight of families. For example, the demand for loungewear and activewear has grown this year while that of formal wear and business apparel has dropped.
- Perhaps you have data quality issues or even missing data due to limited staffing during lockdown.
While these anomalies may just seem like symptoms of the current, abnormal situation, we can’t ignore their limitations. These data sets are the most recent information we have, and they must be properly factored into our future forecasts.
More data coming into play: additional complexities that you must consider
To make things more challenging, your typical data doesn’t just look strange. There is also a lot of new, relevant data that you should consider in your forecasting models, both to predict the future and to retroactively understand past data.
For instance, different COVID-19 policies applied across countries and regions should be considered, such as store closures, movement restrictions, and limitations on opening hours and occupancy. These resulted in performance variations across time and locations, with no clear pattern to follow other than lockdown periods, which can’t be ignored.
What would happen if you applied your forecasting models as if nothing had happened?
In summary, 2020’s data is full of “noise”. So much, in fact, that it’s generally useless for building, for example, a seasonality curve. And like a domino effect, you’ll likely encounter problems across your business decisions:
- How much/what type of products to buy: buying usually relies on the assumption that comparable products have similar sales behavior. If you can’t differentiate what was exceptional from what was seasonal, you can’t assume that new products will behave the same way, so it will be hard to define which quantities to buy.
- Determining allocation: again, if you can’t assume that new products will perform like similar products in the past did, it will be hard to know how to allocate them across stores.
- Managing replenishments: how will you be able to model the higher demand volatility? It will be challenging to determine trends and the impact of COVID-19 policies at a local level and understand how to manage them.
- Planning promotions: how will you calculate elasticity of demand if your base demand is distorted?
- Avoiding overstocks: without a reliable seasonality curve, you won’t get visibility of how to act tactically to avoid overstocks across stores. This might result in piles of stock in one place, and not enough in others.
What you can do to regain control
Retailers must identify and remove the anomalies in past data, considering also those resulting from external events. Luckily, there are actions you can take to “reduce this noise” and make 2020 data meaningful.
Add external data, and continue doing so systematically
Distorted data isn’t the only problem that retailers are facing. The lack of relevant data may also cause forecasting models to stop working properly. Therefore, it’s important for retailers to record all relevant events and external factors that occur at a very granular level to reduce the anomalies that we usually find.
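As a minimal sketch of what recording events at a granular level could look like, the snippet below keeps one record per store and date range and attaches the matching event labels to each daily sales row. The `StoreEvent` fields, the `tag_sales` helper, the store IDs, and the dates are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class StoreEvent:
    store_id: str   # hypothetical store identifier
    start: date     # first day affected (inclusive)
    end: date       # last day affected (inclusive)
    kind: str       # free-text label, e.g. "closure" or "reduced_hours"


def tag_sales(rows, events):
    """Attach the kinds of all overlapping events to each (store_id, day, units) row."""
    tagged = []
    for store_id, day, units in rows:
        kinds = [
            e.kind
            for e in events
            if e.store_id == store_id and e.start <= day <= e.end
        ]
        tagged.append((store_id, day, units, kinds))
    return tagged


# Made-up example: one store closed during lockdown, another unaffected.
events = [StoreEvent("S1", date(2020, 3, 15), date(2020, 5, 10), "closure")]
rows = [("S1", date(2020, 3, 20), 0), ("S2", date(2020, 3, 20), 40)]
tagged = tag_sales(rows, events)
```

Keeping events in their own table, rather than overwriting the sales figures, means the original data stays intact and each model can decide how to treat tagged days.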
Another approach is to automatically detect events, though this is also complicated and has its limitations, because properly interpreting and tagging those events requires knowing what’s happening in the stores at all times. A good trade-off can be found in finding, analyzing, and tagging at least the bigger anomalies in past sales data.
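One simple way to surface the bigger anomalies for manual tagging is to compare each day with the median of the preceding days. The sketch below is only an illustration, not a recommended production method: the window length, the threshold, and the floor on the scale are assumptions that would need tuning per store and product family.

```python
from statistics import median


def flag_anomalies(sales, window=7, threshold=3.5):
    """Return indices of days that deviate strongly from the trailing median.

    Uses the median absolute deviation (MAD) of the previous `window` days
    as a robust measure of normal variation.
    """
    flagged = []
    for i in range(window, len(sales)):
        history = sales[i - window:i]
        med = median(history)
        mad = median(abs(x - med) for x in history)
        # Assumed floor: never treat less than 10% of the median (or 1 unit)
        # as "normal variation", so flat histories do not flag tiny changes.
        scale = max(1.4826 * mad, 0.1 * abs(med), 1.0)
        if abs(sales[i] - med) / scale > threshold:
            flagged.append(i)
    return flagged


# Made-up daily units for one store, with two zero-sales days.
daily_units = [100, 102, 98, 101, 99, 103, 100, 101, 0, 0, 99]
anomalous_days = flag_anomalies(daily_units)  # indices 8 and 9
```

A detector like this only says that something unusual happened. Deciding whether it was a closure, a stockout, or a data error still requires knowledge of what was going on in the store.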
While COVID-19 is making it clearer than ever that retailers must factor additional sources of data into their forecasting models, this is not a new issue. Historical data alone is not enough to have a clear picture of what happened in the past or to predict what may happen in the future.
Allow forecasting models to reflect the estimated impact on future demand
As for the models themselves, it is not enough to take past events into account. They must also be built to include the impact of both past and potential future events. The most practical step is to allow your models to reflect the estimated impact on future demand across different regions and channels, which may come from an analysis of comparable situations.
For example, after the first wave of the pandemic, you could closely observe sales in regions that had begun reopening stores, to understand and model what sales behavior might look like when comparable stores in other regions reopened later. The key is having flexible models to which you can add external events, applying short-term bets when you identify a sudden change in demand.
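A very rough sketch of the comparable-region idea: measure how actual sales compared with the pre-COVID baseline forecast in regions that have already reopened, then apply that ratio to the baseline of stores that have yet to reopen. The function names and all figures are hypothetical, and a single ratio is a deliberate simplification of what a real model would estimate.

```python
def reopening_factor(reopened_actuals, reopened_baseline):
    """Ratio of actual to baseline sales in regions that already reopened."""
    baseline_total = sum(reopened_baseline)
    if baseline_total == 0:
        raise ValueError("baseline sales must be non-zero")
    return sum(reopened_actuals) / baseline_total


def adjust_forecast(baseline_forecast, factor):
    """Scale a baseline forecast by the observed reopening factor."""
    return [round(units * factor, 2) for units in baseline_forecast]


# Made-up weekly units: reopened regions sold 240 against a baseline of 300.
factor = reopening_factor([70, 80, 90], [100, 100, 100])
adjusted = adjust_forecast([200, 150], factor)
```

Keeping the adjustment as a separate, explicit factor makes it easy to revise or remove the short-term bet once real post-reopening data arrives.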
Additionally, models must also be able to differentiate between what is a seasonal effect and an external event. In order to predict future sales, when looking at this year’s sales you need to be able to differentiate between recurrent/seasonal events, and those due to a particular situation affecting particular stores.
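To illustrate why the distinction matters, the sketch below computes a simple monthly seasonality index twice: once from all data, and once leaving out the months tagged as driven by an external event. The data layout and the `seasonal_index` helper are assumptions for illustration, and real seasonality models are usually more elaborate.

```python
from collections import defaultdict
from statistics import mean


def seasonal_index(monthly_sales, excluded=frozenset()):
    """Return {month: index}, where 1.0 means an average month.

    monthly_sales: {(year, month): units}
    excluded: set of (year, month) pairs tagged as event-driven
    """
    by_month = defaultdict(list)
    for (year, month), units in monthly_sales.items():
        if (year, month) not in excluded:
            by_month[month].append(units)
    month_avg = {m: mean(v) for m, v in by_month.items()}
    overall = mean(list(month_avg.values()))
    return {m: avg / overall for m, avg in sorted(month_avg.items())}


# Made-up units for March and April; April 2020 had no sales due to closures.
history = {(2019, 3): 100, (2019, 4): 120, (2020, 3): 100, (2020, 4): 0}
naive = seasonal_index(history)
cleaned = seasonal_index(history, excluded={(2020, 4)})
```

With these made-up figures, the naive index makes April look like a weak month (0.75), while the cleaned index shows it as stronger than March, which is what the 2019 data suggests.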
Review and monitor your forecasting models
Not only is it important to treat, monitor, and track your data closely, you should also review your models to make sure they are robust, especially when it comes to black-box models. And by evaluating which data your models require, you can better understand how COVID-19-related and other data anomalies might be impacting them.
By analyzing your models, you can see which assumptions don’t hold anymore. For example, if your model needs a minimum amount of sales to generate robust predictions, what happens if you apply data that includes months with no sales at all due to lockdown measures? The model simply won’t work or may even break.
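An assumption like this can be checked explicitly before the data ever reaches the model. The following is a minimal sketch, with a hypothetical threshold of 50 units per month and made-up sales figures:

```python
def months_below_minimum(monthly_units, min_units=50):
    """Return the months whose sales fall below the minimum the model assumes."""
    return [month for month, units in monthly_units.items() if units < min_units]


# Made-up monthly units for one product family in 2020.
units_2020 = {"2020-02": 480, "2020-03": 210, "2020-04": 0, "2020-05": 35}
violations = months_below_minimum(units_2020)
```

Months returned by a check like this can then be excluded, imputed, or handled by a different model, instead of being passed through silently.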
Continue to monitor your models and build regression tests to see if the accuracy of any of your models is degrading. Also be sure to monitor the data your models are using, investigating any anomalies. If something goes wrong you should know immediately.
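A regression test for accuracy can be as simple as comparing a current error metric with a stored baseline. The sketch below uses the weighted absolute percentage error (WAPE), one common choice among several. The baseline value and the tolerance are assumptions you would set from your own history.

```python
def wape(actuals, forecasts):
    """Weighted absolute percentage error: total absolute error / total actuals."""
    total = sum(abs(a) for a in actuals)
    if total == 0:
        raise ValueError("WAPE is undefined when all actuals are zero")
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / total


def accuracy_degraded(actuals, forecasts, baseline_wape, tolerance=0.05):
    """True if the current WAPE exceeds the baseline by more than the tolerance."""
    return wape(actuals, forecasts) > baseline_wape + tolerance


# Made-up weekly units for one store.
actuals = [100, 200, 100]
stable = accuracy_degraded(actuals, [90, 220, 110], baseline_wape=0.08)
degraded = accuracy_degraded(actuals, [50, 300, 150], baseline_wape=0.08)
```

Note that the metric itself rests on an assumption: it cannot be computed for a period with no sales at all, which is exactly the kind of case lockdown data introduces.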
It’s not too late to start making 2020 data meaningful
While 2020 data might seem like it is working against you and your ability to accurately predict future demand, you can still take action to make it work in your favor. Not only can you draw insights from it, you can also mitigate inaccurate demand forecasting caused by existing gaps and distortions.
You can begin recording, tagging and tracking COVID-19 effects on your data now, even for events that occurred in the past. Not only is it helpful, it is essential to getting demand forecasting right in the future.
Sound daunting? If you’re struggling with how to manage 2020 data to forecast demand, or are generally looking for a more data-driven approach to your merchandising to help you remain agile during times of uncertainty, you’re not alone. Contact us to find out how Nextail can help!