Tableau Does It All – But Don’t Forget Your Data Strategy

OK, so you’ve invested, or are at least thinking about investing, in a powerful BI Platform with many features and many supported usage scenarios.  “If you build it, they will come” applies here: Tableau Dashboards and Views are easy to create and actually fun to use, so user adoption is almost guaranteed.  User adoption will increase demand for more Views and Dashboards, more data requirements, more measures, more systems.  Realization sets in for the Business Intelligence Team as a few well-received BI Applications evolve into a cross-functional and then holistic view of the organization.  The need for BI Standardization hits hard.

Business Intelligence Platforms can directly contribute to an organization’s competitive advantage.  In fact, I would go so far as to declare BI one of the last frontiers for an organization to gain competitive advantage (OK, so I might be biased).  Let’s face it, Business Intelligence Platforms are complex systems and therefore require standardization, policy and governance.  This is a blog and we can’t boil the ocean here, so I wanted to offer some low-hanging fruit as a point of awareness: Understanding Your Data Strategy.  Data Strategy refers to an organization’s solution for structuring its data based on that organization’s requirements.  Tableau can connect directly to a data source, or the data can be extracted to work in-memory.  A data strategy often includes a combination of both types of data access.

By default, Tableau provides a real-time experience by issuing a new query every time the user changes their analysis.  While this can be very beneficial, it can also be an issue if datasets are large or data sources are underperforming or even offline.  When data is not constantly changing (unlike volatile transactional data), real-time queries create unnecessary workload.
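
To make the workload point concrete, here is a minimal sketch in Python.  It is not Tableau code: CountingSource, live_totals and snapshot_totals are hypothetical names, and an in-memory SQLite table stands in for the real data source.  It only illustrates how a query-per-interaction pattern compares with pulling the rows once and answering locally.

```python
import sqlite3


class CountingSource:
    """Hypothetical stand-in for a data source; counts the queries it receives."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        self.conn.executemany(
            "INSERT INTO sales VALUES (?, ?)",
            [("East", 100.0), ("West", 250.0), ("East", 50.0)],
        )
        self.query_count = 0

    def query(self, sql, params=()):
        self.query_count += 1
        return self.conn.execute(sql, params).fetchall()


def live_totals(source, regions):
    # Live pattern: every change of the region filter is a new query on the source.
    totals = {}
    for region in regions:
        rows = source.query(
            "SELECT SUM(amount) FROM sales WHERE region = ?", (region,)
        )
        totals[region] = rows[0][0]
    return totals


def snapshot_totals(source, regions):
    # Snapshot pattern: one query pulls the rows; filter changes are answered locally.
    rows = source.query("SELECT region, amount FROM sales")
    return {
        region: sum(amount for reg, amount in rows if reg == region)
        for region in regions
    }
```

With three filter changes, the live pattern sends three queries to the source while the snapshot pattern sends one.  The trade-off is that the snapshot is only as fresh as the last time it was pulled.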

Some reasons why real-time requirements cannot be met are:

  • Information requirements are too great for as-is retrieval;
  • The business is interested in analyzing data that often requires extensive preparation (cleansing, subject orientation, routine aggregation, advanced formulae, etc.);
  • Data in separate business systems cannot be joined;
  • Data residing in multiple systems is inconsistent;
  • Data requires extensive manipulation before it can be used as information;

In such cases where real-time data is not required or not feasible, Tableau offers an extract capability that brings data back from an initial query and stores it locally.  The extract is stored in Tableau’s columnar database, which is highly compressed and structured for rapid retrieval.  A summary of extract use cases:

  • Extracts can be created for all data types except multi-dimensional databases (cubes);
  • Used for highly transactional systems that cannot afford the resources for real-time queries;
  • Can be refreshed nightly and available offline to users the next day;
  • Can be based on a fixed number of rows or filtered;
  • Can be incremental in nature. An incremental extract will update the existing extract with new data;
  • Extracts are necessary to share packaged workbooks (.twbx file type). Packaged workbooks contain all the data, which makes them both portable and sharable with other users;
  • Temporary disk space used to build an extract can be significant (e.g. star schema with long fact table);
  • Tableau Server provides centralized management of Tableau Data Extracts by managing data connections and joins, calculated fields, field definitions, sets and groups, user filters;
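
The incremental idea in the list above can be sketched in a few lines of Python.  This is not Tableau's implementation or API: incremental_refresh is a hypothetical helper, the extract is a plain list of dictionaries, and the sketch assumes new rows can be recognized by a column (here order_date) whose value only ever increases.

```python
from datetime import date


def incremental_refresh(extract, source_rows, key="order_date"):
    """Append source rows that are newer than the newest row already extracted.

    Returns the number of rows added. Assumes `key` only ever increases
    for new rows (an assumption of this sketch, not a Tableau guarantee).
    """
    # The high-water mark is the largest key value already in the extract.
    high_water = max((row[key] for row in extract), default=None)
    new_rows = [
        row for row in source_rows
        if high_water is None or row[key] > high_water
    ]
    extract.extend(new_rows)
    return len(new_rows)


# Usage: the extract holds one day; the source has since gained a second day.
extract = [{"order_date": date(2024, 1, 1), "amount": 10}]
source = list(extract) + [{"order_date": date(2024, 1, 2), "amount": 20}]
added = incremental_refresh(extract, source)
```

Only the rows past the high-water mark are copied, which is why an incremental refresh is cheaper than rebuilding the whole extract.  The flip side is that changes to rows already in the extract are not picked up; that takes a full refresh.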

Data Strategy is only one facet of the challenge, albeit a big one.  Hope you’ve enjoyed the article.  Let us know!  More to come.