Spreadsheets versus a Data Warehouse – the two sides of the Analysis coin
Ash van der Spuy from our Sydney office shares his thoughts on how IT and business can work more closely to bridge the gap of familiar spreadsheet analysis vs reporting from a data warehouse for Financial Analysts.
Spreadsheets have been the mainstay of the Financial Analyst for decades, and its blank canvas and powerful calculation functions ensures that it remains the “go to” tool for both quick and complicated calculations. There are no tools as efficient for mocking up a quick analysis as a spreadsheet and it is important to keep this in mind, because there is a tendency is to criticise Excel so as to highlight the importance of formal reporting and analysis. While spreadsheets cannot do everything, similarly data warehouses cannot always meet the demands of the Financial Analysts; for proper analytics these functions have to meet each other half way.
1. The IT side of the coin
Every tool has its purpose and spreadsheets have their place. The same is to be said for formal reporting and analysis tools such as the ones developed by Tableau, Qlik, Microsoft, IBM, Oracle, SAS and SAP. These reporting and analysis tools however work best with well-structured data, and even better if this data is in a database in a proper data model. There are tools which are improving data exploration using diverse data sets, but their functionality is limited. When it comes to complicated calculations using snippets of information, often from temporary or ad hoc data sources to answer a specific pressing question; the spreadsheet rules supreme.
This is the side of the coin that the Business Intelligence or Information Management professionals need to support. Some questions need immediate answers and pose a question a data warehouse (DW) may not accommodate, but this doesn’t mean that the DW should just leave the Analysts to blunder ahead.
Business and Financial Analysts in particular, can easily reach a place where they view the data warehouse as slow and cumbersome. They have pressing questions, and can’t wait until tomorrow or next week to answer them and this is when they often resort to entering the data into a spreadsheet.
But it is exactly this haste that leads to keying errors, mistakes in formulas or using the wrong data. Worse than that, is using the correct data without knowing that it is of bad quality – Just because data comes from a financial system or an ERP, its quality is not guaranteed.
I have seen that the best analytics and reporting comes from organisations where the Data Warehouse and Business work together; and Analysts and IT professionals often forget that. The goal for a data warehouse should be to ensure that for those questions that must be answered in a spread sheet, the largest proportion of the data must come from good quality, trusted sources which the data warehouse has prepared and approved. When that doesn’t happen, the perfect storm is unavoidable.
The perfect storm is the scenario where a spread sheet contains both good data from a data warehouse and bad quality data and incorrect formulae input by analysts, which is then used by Management to make decisions. Since some of the data is from the data warehouse, the results are taken with high regard and decisions are made with confidence, even when affecting people’s lives and jobs.
We have seen this happen to companies big and small, both locally and globally. Take the example of Barclay’s and their overpayment of hundreds of millions for subprime debt. During the GFC, hidden columns in a spread sheet led to this their perfect storm. http://www.computerworld.com/s/article/9117143/Excel_error_leaves_Barclays_with_more_Lehman_assets_than_it_bargained_for
Another example has come from an economics study which was strategically used, largely by Europe and America to deal with the GFC. Upon further analysis, academics found the study was based on results from a spreadsheet containing errors and the resulting fallout from this was international.
Recommendations for a way forward
Here are my 6 key observations and recommendations for Business and IT to meet in the middle:
- Analysts have to keep inmind that the formal systems and policies employed by the DW are there to
- A. Supply automatic results that accelerate decision making which is also more efficient in the future.
- B. Ensure the data is of a high quality, and can be trusted for decision making.
- Once Analysts realise they are answering the same ad hoc repeatedly, and/or major decisions are being made based on the answers, the help of the DW should be sought. This is to automate the results and to ensure that the data is cleansed and verified if necessary.
- Cleansing and verifying the data will give further weight to the results and allow Analysts to focus on the next set of ad hoc questions.
- The automation process will also deconstruct the formulae in the analysis, providing another level of confidence to the results.
- IT professionals have to keep in mind that business needs are ever changing, and there is no way that they can keep up with every requirement. Instead the focus should be to grow that middle ground, and feed it with more and more trusted data.
- Alternatives to the traditional DW architectures should be assessed for fit, as a sandbox or multi-modal (this could include Lambda architecture) data platform can go a long way towards lowering the latency requirements that drive users away from the DW.
Spreadsheets, business intelligence software, analysts and specialists all have their place, they key is to understand the functions of each, and promote synergy through collaboration, and thereby better assist the broader company.
In future blogs we will explore how sandboxes and multi-modal, including Lambda architecture are changing the Information Management landscape.