Andrew Drinkwater
When I was responsible for enrolment forecasting at a large university, we used to update our scenario data monthly during the fall term, and then at the end of the fiscal year. After this, all went quiet, until the dreaded rollover to the next academic year in summer. The rollover always caused panic:
What would break in my code? What if someone changed the input tables? The rollover involved a lot of steps, far too many of which were manual. The database and the R model we used worked together, but not particularly well, and definitely not for rollover.
Back in March, I wrote about When is a New Scenario a New Scenario?
In that article, we discussed ways to think about scenario versions:
When we think about scenario versions, it’s helpful to break them down into different buckets:
- Changes in user-controlled parameters.
- Changes in automatically updated (non-user-controlled) parameters.
- Changes in the underlying model.
Our best candidate for automation is #2 (automatically updated parameters) followed by #3 (changes in the underlying model).
Why would you want to automate this?
- Save staff time
- Ensure that data is updated in a timely manner even if everyone is on vacation
- Improve trust in the data as stakeholders know when and why updates occur
Automatically updated parameters can include:
- Updated student behaviour data available in your data
- A weekly, biweekly, monthly or similar schedule of data refreshes
- A rollover
Ideally, you want your forecast outputs to update every time there is a change in inputs such as student behaviour.
However, my experience is that having your model update every day is confusing for your stakeholders, unless it is important to have real time changes reflected. Given we’re not trying to sell the next item on an online store, but rather predict how many students will arrive or continue in future terms, real time is probably not necessary. Additionally, if data is changing every day, it becomes difficult to make apples-to-apples comparisons between different scenarios: if one stakeholder ran on Monday and another on Tuesday with a data refresh between, it’s hard to distinguish differences due to user-controlled parameters compared to automatically updated parameters. The major exception to this idea is the admissions world, where individual days and times make a huge difference to outcomes.
Let’s say that we want our model to update every Monday morning. What would that look like:
- Monday morning, 4:00am: new data available, whether as a data warehouse loads completing or in another form.
- Monday morning, 4:30am: every existing scenario for the current cycle is run with a new iteration.
- Monday morning, 5:00am: Tableau or other visualization tool refreshes any data extracts and notifies users of new data available.
- Monday morning, 8:30am: your enrolment management meeting has the most up-to-date forecast available.
How do we make this possible? You’ll need:
- A tool that can run your data updates. Often, this is an ETL (extract, transform, load) tool such as Safe Software’s FME, Alteryx, Informatica or many others. This would update your data warehouse or other data in the form needed for forecasting and could call your forecast model to run updates.
- A way to update your forecast scenarios. Maybe your tool is built in Python, or tools like FME or Alteryx can call the model and tell it what to update.
- Make sure you have a way to identify each iteration within a scenario, as opposed to comparing across scenarios.
- Depending on the scale of your operations, you might need a queuing service. You could use one of the ETL tools, or use an open source alternative like Celery for Python .
- A way to bring your forecast results into a visualization tool. Tableau Server or Cloud are really good at this – they can refresh on a schedule, or when called by a tool such as FME or Alteryx.
- A business process that permits using data that is automatically updated.
- An extra touch would be a way to automatically identify changes between different scenario iterations. This can help your staff (and especially your budget professionals) zero in on the deltas that matter and need to be explained.
Remember that dreaded rollover? The same concepts apply there – it’s just another chance to update existing scenarios.
The one part that’s different?
You may have to decide which scenario(s) to kick start your new cycle with. At many institutions, the new base scenarios are the ones that were approved by Senate the prior year, or the actuals from the prior year. If you’re only rolling those scenarios forward, you may not need to update all of the historical ones.
Last but not least: as you update your scenarios, be sure to check for actuals that they could be compared with. Reflecting on these differences will help you improve your model over time.