Filling out dates in Pandas dataframes

October 3, 2018 | Data science, Python

Time-series data often has gaps: there isn’t a record for every interval. For example, consider the following transaction data:
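As a stand-in for that table, here is a small made-up dataset with the column names the rest of the post uses (Date, Category, Amount). The categories and amounts are invented for illustration and are not the original data.

```python
import pandas as pd

# Hypothetical transactions; note there is nothing on 2018-01-03 or 2018-01-05
df = pd.DataFrame({
    'Date': pd.to_datetime(['2018-01-01', '2018-01-01', '2018-01-02',
                            '2018-01-04', '2018-01-06']),
    'Category': ['Food', 'Fuel', 'Food', 'Fuel', 'Food'],
    'Amount': [12.50, 40.00, 8.25, 35.00, 20.00],
})
print(df)
```

Parsing Date with pd.to_datetime up front matters later, because resampling only works on a datetime index.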

You may want to look at a pivot of that data, with one row per date and one column per Category:


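# One row per Date, one column per Category; empty cells become 0.
# Note: pivot_table averages by default; pass aggfunc='sum' if you want totals.
dfPiv = pd.pivot_table(df, values='Amount', columns='Category', index='Date').fillna(0)
dfPiv.head()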

 

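But if you want to compute something like a rolling average, there is a problem: the pivot only has rows for days that had transactions, so a window of N rows no longer covers N calendar days.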
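To see which days are missing, one option is to compare the pivot's index with a complete daily range. This is a sketch on a small hypothetical pivot (the dates and values are made up), and it assumes the index is a DatetimeIndex.

```python
import pandas as pd

# Hypothetical pivot with gaps: no rows for Jan 3 or Jan 5
dfPiv = pd.DataFrame(
    {'Food': [12.5, 8.25, 0.0, 20.0], 'Fuel': [40.0, 0.0, 35.0, 0.0]},
    index=pd.to_datetime(['2018-01-01', '2018-01-02', '2018-01-04', '2018-01-06']),
)

# Every calendar day between the first and last row
all_days = pd.date_range(dfPiv.index.min(), dfPiv.index.max(), freq='D')

# Days that have no row in the pivot
missing = all_days.difference(dfPiv.index)
print(missing)

# A 3-row window here spans more than 3 calendar days
print(dfPiv['Food'].rolling(3).mean())
```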

An easy way to fill in these missing days is to use the pandas resample method. It needs a datetime index, so convert Date with pd.to_datetime first if it was read in as text.


# 'D' = calendar-day frequency; days with no rows sum to 0
dfPiv = dfPiv.resample('D').sum()
dfPiv.head()
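If the index already has at most one row per day, asfreq is an alternative that only inserts the missing dates without aggregating anything. A small sketch on a hypothetical pivot (values made up) showing that both approaches give the same result in that case:

```python
import pandas as pd

# Hypothetical pivot with one row per day that had transactions
dfPiv = pd.DataFrame(
    {'Food': [12.5, 8.25, 0.0, 20.0], 'Fuel': [40.0, 0.0, 35.0, 0.0]},
    index=pd.to_datetime(['2018-01-01', '2018-01-02', '2018-01-04', '2018-01-06']),
)

# resample groups rows into daily bins and sums each bin (empty bins give 0)
daily_sum = dfPiv.resample('D').sum()

# asfreq just reindexes to a daily frequency and fills the new rows with 0
daily_asfreq = dfPiv.asfreq('D', fill_value=0)

print(daily_sum)
print(daily_asfreq)
```

The difference shows up when a day has several rows: resample combines them, while asfreq expects the index to be unique.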

 

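Now there is a row for every day between the first and last transaction date, with 0 for days that had no transactions, so rolling means and similar window calculations work correctly.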
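Putting the steps together, here is a minimal end-to-end sketch on made-up data (the category names and amounts are hypothetical). It passes aggfunc='sum' to pivot_table, which is an assumption about the intent; the default aggregation is the mean.

```python
import pandas as pd

# Hypothetical transactions with no activity on Jan 3 or Jan 5
df = pd.DataFrame({
    'Date': pd.to_datetime(['2018-01-01', '2018-01-01', '2018-01-02',
                            '2018-01-04', '2018-01-06']),
    'Category': ['Food', 'Fuel', 'Food', 'Fuel', 'Food'],
    'Amount': [12.50, 40.00, 8.25, 35.00, 20.00],
})

# Daily totals per category; missing combinations become 0
dfPiv = pd.pivot_table(df, values='Amount', columns='Category',
                       index='Date', aggfunc='sum').fillna(0)

# Insert a zero row for every day without transactions
daily = dfPiv.resample('D').sum()

# Each 3-row window now covers exactly 3 calendar days
rolling = daily.rolling(3).mean()
print(rolling)
```

The first two rows of the rolling result are NaN because a full 3-day window is not available yet; passing min_periods=1 to rolling would relax that.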