Demand Forecasting in Supply Chain Management: A Time-Series Approach (2/2)

Cleaning and preprocessing

df_features.head()
first 5 entries in features
df_features.isna().sum()
Null values in df_features
df_features.CPI.plot();
CPI plot
df_features.Unemployment.plot();
Unemployment plot
CPI for store #20
for i in range(1, 46):
    # interpolate within each store so values never bleed across stores
    mask = df_features.Store == i
    df_features[mask] = df_features[mask].interpolate()
CPI for store #20 post imputation
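The loop above fills gaps one store at a time so that one store's readings never leak into the next. Below is a minimal sketch of the same idea using `groupby(...).transform` on a made-up two-store frame. The toy values and the helper name `interpolate_per_store` are assumptions for illustration, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical values): store 1 has an interior gap,
# store 2 has a leading gap and an interior gap.
toy = pd.DataFrame({
    "Store": [1, 1, 1, 1, 2, 2, 2, 2],
    "CPI": [210.0, np.nan, 212.0, 213.0, np.nan, 131.0, np.nan, 133.0],
})


def interpolate_per_store(df, columns):
    """Linearly interpolate each column separately within every store."""
    out = df.copy()
    for col in columns:
        out[col] = out.groupby("Store")[col].transform(lambda s: s.interpolate())
    return out


filled = interpolate_per_store(toy, ["CPI"])
print(filled)
```

Store 2's leading gap stays empty because there is no earlier value inside that store to interpolate from. A plain `interpolate()` over the whole column would have filled it using store 1's last reading.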
df_features[df_features.columns[4:9]] = df_features[df_features.columns[4:9]].fillna(0)
df_stores.head()
df_sales.head()
df_all_1 = df_features.merge(df_sales, how = 'right', on = ['Date', 'Store', 'IsHoliday'])
df_all = df_all_1.merge(df_stores, how = 'left', on = 'Store')
df_all = df_all.sort_values('Date')
df_all.reset_index(inplace = True)
df_all.replace({'IsHoliday':{True:1, False:0}}, inplace=True)
df_all.replace({'Type':{'A':3, 'B':2, 'C':1}}, inplace=True)

EDA

df_by_date = df_all.groupby('Date', as_index=False).agg({'Temperature': 'mean', 'Fuel_Price': 'mean', 'CPI': 'mean', 'Unemployment': 'mean', 'Weekly_Sales': 'sum', 'IsHoliday': 'mean'})
df_by_date.Date = pd.to_datetime(df_by_date.Date, errors='coerce')
df_by_date.set_index('Date', inplace=True)
df_by_date.head()
# .bfill() is equivalent to fillna(method='bfill'), which recent pandas versions deprecate
df_by_date_new = df_by_date.resample('W').mean().bfill()
First 10 samples of the new dataframe df_by_date_new
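To make the resampling step concrete, here is a small sketch of what `resample('W').mean()` followed by a backward fill does when a week is missing. The three observations are invented for illustration and are not taken from the dataset.

```python
import pandas as pd

# Three Friday observations; the week of 2010-02-19 is missing (toy values).
toy = pd.Series(
    [10.0, 20.0, 40.0],
    index=pd.to_datetime(["2010-02-05", "2010-02-12", "2010-02-26"]),
)

# 'W' creates one row per week, labelled with the Sunday that ends it.
# The week with no observation becomes NaN.
weekly = toy.resample("W").mean()

# Backward fill copies the next available value into the gap.
weekly_filled = weekly.bfill()

print(weekly)
print(weekly_filled)
```

Backward filling copies a later week's value into the gap, so any filled row repeats a later observation rather than adding new information.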
from statsmodels.tsa.seasonal import seasonal_decompose
multi_plot = seasonal_decompose(df_by_date_new['Weekly_Sales'], model = 'add', extrapolate_trend='freq')
multi_plot.observed.plot(title = 'weekly sales')
multi_plot.trend.plot(title = 'trend')
A fairly flat trend (note that y-axis limits are close to each other)
multi_plot.seasonal.plot(title = 'seasonal')
Strong seasonality that tends to kick in during the Nov-Dec period
multi_plot.resid.plot(title = 'residual')
Roughly negligible noise (except for 2012 Nov-Dec)
sns.heatmap(df_by_date_new.corr('spearman'), annot = True)
  • strong positive correlation between Fuel_Price and CPI
  • strong negative correlations between Unemployment and Fuel_Price, and between Unemployment and CPI
  • surprisingly, the unemployment rate doesn’t seem to affect weekly sales (at least not directly), suggesting that the stores might be overstaffed.
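The heatmap uses Spearman rather than Pearson correlation. Spearman is the Pearson correlation computed on ranks, so it measures monotonic rather than strictly linear association. A short sketch on invented data (the columns `x` and `y` are hypothetical):

```python
import numpy as np
import pandas as pd

# y grows monotonically with x, but far from linearly (toy data).
x = np.arange(1, 11, dtype=float)
toy = pd.DataFrame({"x": x, "y": np.exp(x)})

pearson = toy.corr(method="pearson").loc["x", "y"]
spearman = toy.corr(method="spearman").loc["x", "y"]

# Spearman is the Pearson correlation of the ranked columns.
pearson_on_ranks = toy.rank().corr(method="pearson").loc["x", "y"]

print(pearson, spearman, pearson_on_ranks)
```

A high Spearman value in the heatmap therefore says two variables tend to move in the same (or opposite) direction, not that one is a linear function of the other.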
sns.boxplot(data = df_by_date, x = 'IsHoliday', y = 'Weekly_Sales');
Holiday weeks don’t necessarily mean higher weekly sales, but that is often the case
df_by_store = df_all.groupby('Store').agg({'Weekly_Sales': 'sum', 'Type': 'max'})
sns.boxplot(data = df_by_store, x = 'Type', y = 'Weekly_Sales')
There’s a clear hierarchy here
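# .dt needs a datetime column; the conversion is a no-op if Date is already datetime
monthly_sales = df_all.groupby(pd.to_datetime(df_all.Date, errors='coerce').dt.month).agg({'Weekly_Sales':'sum'})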

sns.barplot(x=monthly_sales.index, y=monthly_sales.Weekly_Sales);
df_by_dept = df_all.groupby('Dept').agg({'Weekly_Sales':'sum'})
df_by_dept.sort_values(by = 'Weekly_Sales', ascending = False, inplace = True)
The five best- and worst-performing departments

Forecasting using the Holt-Winters Model

from statsmodels.tsa.holtwinters import ExponentialSmoothing as es
# fit on the first 120 weeks; the remaining weeks are held out for evaluation
fit_model = es(df_by_date_new['Weekly_Sales'][:120], trend = 'add',
               seasonal = 'add', seasonal_periods = 52).fit()
prediction = fit_model.forecast(34)
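# label= is needed here: a bare third positional string is parsed as a format string
plt.plot(df_by_date_new.index[120:], prediction, label = 'predicted')
plt.plot(df_by_date_new.index[120:], df_by_date_new.Weekly_Sales[120:], label = 'actual')
plt.legend();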
Our model follows the general trend until the seasonal component kicks in during the Christmas period. Note that a similar peak was observed at Christmas time in all the other years as well.
def mean_absolute_percentage_error(y_true, y_pred):
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100
print("Mean Absolute Percentage Error = {a}%".format(a=mean_absolute_percentage_error(df_by_date_new.Weekly_Sales[120:], prediction)))
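As a quick sanity check on the metric, here is a small worked example of MAPE on three invented points. The function name `mape`, the input checks and the numbers are assumptions for illustration. Converting to NumPy arrays means values are compared by position, not by pandas index.

```python
import numpy as np


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    if np.any(y_true == 0):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)


actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

# Per-point errors are 10%, 10% and 0%, so the mean is 20/3 percent.
score = mape(actual, forecast)
print(round(score, 2))
```

Because each error is divided by the actual value, a miss of a given size counts for more in a low-sales week than in a high-sales week.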
fit_model = es(df_by_date_new['Weekly_Sales'][:-2],
               trend = 'add', seasonal = 'add',
               seasonal_periods = 52).fit()
preds_2013 = fit_model.forecast(56)
plt.plot(df_by_date_new.index, df_by_date_new.Weekly_Sales)
plt.plot(preds_2013, '--')
plt.legend(['2010-2012 actual', '2013 forecast'])
Additive time series with a gradual downtrend and a strong seasonality component.
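For readers curious about what "additive trend, additive seasonality" means mechanically, the sketch below hand-rolls the textbook additive Holt-Winters recursions with fixed smoothing parameters. It is an illustration under stated assumptions, not the statsmodels implementation: statsmodels estimates the smoothing parameters when `.fit()` is called with no arguments, whereas here `alpha`, `beta` and `gamma` are chosen by hand, and the initialisation from the first two seasons is one simple choice among several.

```python
import numpy as np


def holt_winters_additive(y, m, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive trend and seasonality.

    y: observations, m: season length, alpha/beta/gamma: smoothing
    parameters for level, trend and seasonal terms (all in [0, 1]).
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    if n < 2 * m:
        raise ValueError("need at least two full seasons to initialise")

    # Initial slope from the change in seasonal means; initial level is
    # the value of the trend line one step before the first observation.
    trend = (y[m:2 * m].mean() - y[:m].mean()) / m
    level = y[:m].mean() - trend * (m - 1) / 2 - trend
    # seasonal[t] holds the seasonal term for time t - m.
    seasonal = [y[i] - (level + trend * (i + 1)) for i in range(m)]

    for t, obs in enumerate(y):
        s_prev = seasonal[t]
        prev_level, prev_trend = level, trend
        level = alpha * (obs - s_prev) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonal.append(
            gamma * (obs - prev_level - prev_trend) + (1 - gamma) * s_prev
        )

    forecasts = []
    for h in range(1, horizon + 1):
        # Most recent seasonal estimate for the same position in the cycle.
        k = n - m + (h - 1) % m
        forecasts.append(level + h * trend + seasonal[k + m])
    return np.array(forecasts)


# Toy series: straight line plus a repeating 4-step pattern (made-up numbers).
pattern = np.array([3.0, -1.0, -2.0, 0.0])
t = np.arange(12)
toy = 10 + 0.5 * t + pattern[t % 4]

preds = holt_winters_additive(toy, m=4, alpha=0.3, beta=0.1, gamma=0.2, horizon=6)
print(preds)
```

On this noise-free toy series the recursions reproduce the line-plus-pattern exactly, which is a handy check that the level, trend and seasonal updates are wired together correctly. Real sales data will not be this clean.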

The Business Club Of NITT

Sigma
