Business Intelligence in the Cloud
There is a lot of excitement about ‘cloud’ solutions right now. But do you know how cloud solutions can be used for your Business Intelligence needs?
What is the cloud?
The ‘cloud’ in cloud computing can be defined as the set of hardware, networks, storage, services, and interfaces that combine to deliver aspects of computing as a service. So rather than having to set up your own server, database, ETL and BI/reporting tools and then employ staff to maintain all of these items, the cloud gives you the option of renting these things as a service on a pay-as-you-go basis. Cloud services also generally have the added benefit of being scalable, meaning they can grow or shrink based on your needs.
There are several types of cloud service, as shown in Figure 1. The most popular are:
- Infrastructure as a Service (IaaS)
- Platform as a Service (PaaS)
- Software as a Service (SaaS)
Cloud Infrastructure as a Service (IaaS) is defined by Gartner as ‘computing resources, along with associated storage and network resources, offered to the customer via self-service in a highly-automated way, on-demand and in near-real-time.’ IaaS offerings are generally barebones environments with networking infrastructure provided. Examples of IaaS services include Amazon Web Services (AWS) Simple Storage Service (S3), AWS Elastic Compute Cloud (EC2), Windows Azure Virtual Machines and Rackspace.
Cloud Platform as a Service (PaaS) is cloud middleware, and provides a computing platform and a solution stack (software) as a service. These computing platforms typically include the operating system, programming language execution environment, database and/or web server. This means that time-consuming operational tasks such as configuring, optimizing and continuously updating your environment are handled on your behalf. Examples of PaaS services include AWS Relational Database Service (RDS), AWS Elastic Beanstalk, Heroku, Force.com and Google App Engine.
Cloud Software as a Service (SaaS) is where the service provider hosts both the application and the data. Instead of owning and installing the software on a local computer, you access it via the internet (usually through a web browser). Examples of SaaS services include Salesforce and Tableau Online.
Figure 1: A comparison of traditional packaged software models versus the different cloud service offerings.
How can cloud services be used for your Business Intelligence needs?
Gartner defines Business Intelligence as ‘an umbrella term that includes the applications, infrastructure, tools and best practices that enable access to and analysis of information to improve and optimize decisions and performance.’ Almost everything that you can do in-house for Business Intelligence can be done using cloud services.
Consider a simple but common example, where you want to bring together data from multiple sources, build a data warehouse from it and perform some reporting and analysis on this data. You might choose to Extract, Transform and Load (ETL) your data from the source systems using an AWS EC2 instance running Linux and an ETL tool such as Pentaho Data Integration or Talend. You might then choose to store your data in an Amazon RDS instance running SQL Server Standard Edition, or in a Redshift database. When producing reports, you can then connect to this RDS or Redshift instance and produce reports in the same way as if that service were hosted locally. You might also choose to move the reporting layer into the cloud by using cloud reporting tools like Tableau Online or Tibco Spotfire Cloud.
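The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two in-memory CSV sources, the `sales` table and its column names are hypothetical, and the standard-library `sqlite3` module stands in for the RDS or Redshift warehouse. In practice an ETL tool such as Pentaho Data Integration or Talend would usually do this work.

```python
import csv
import io
import sqlite3

# Hypothetical source extracts; in practice these would come from
# separate source systems (for example a CRM export and a spreadsheet).
CRM_CSV = "customer,region,amount\nAcme,north,100.50\nBolt,south,200.00\n"
FINANCE_CSV = "customer,region,amount\nAcme,NORTH,50.25\nCore,south,75.00\n"


def extract(text):
    """Extract: read rows from one CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows, source):
    """Transform: normalise region names, convert amounts, tag the source."""
    return [
        (source, r["customer"].strip(), r["region"].strip().lower(), float(r["amount"]))
        for r in rows
    ]


def load(conn, records):
    """Load: append the cleaned records to the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(source TEXT, customer TEXT, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", records)
    conn.commit()


def sales_by_region(conn):
    """Report: total sales per region, queried like any other SQL database."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cur.fetchall()


# An in-memory SQLite database stands in for the cloud warehouse (assumption).
warehouse = sqlite3.connect(":memory:")
for name, text in (("crm", CRM_CSV), ("finance", FINANCE_CSV)):
    load(warehouse, transform(extract(text), name))

report = sales_by_region(warehouse)
print(report)
```

Pointing a pipeline like this at a hosted database would mainly mean replacing the connection with one from that database's own driver. The reporting query would look much the same, which is the point made above about reporting against a cloud-hosted warehouse.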
What are the benefits of using cloud services for your BI?
There are a number of benefits of using cloud services for your BI needs. The most commonly noted are:
- Services are easy to get running. Rather than having to wait for servers to be purchased, software to be installed and networks to be set up, you can simply “spin up” an instance which can take just a matter of minutes.
- Services are maintained by your cloud service provider. For example, Tableau automatically updates its Tableau Online software for you, so there is no need to do this yourself.
- Storage services in particular are highly reliable. Amazon’s S3 service is designed for 11 9’s of durability (99.999999999%), which means that “if you store 10,000 objects you can expect an average loss of a single object every 10 million years”. It is also designed to be 99.99% available.
- Services are often cheaper. You no longer need to pay for hardware and maintenance staff, you don’t have to pay up-front for separate licenses for operating systems, databases and software, and you can use these services on a pay-as-you-go basis.
- Services can be scalable and somewhat flexible. Most (not all) cloud services can be ‘scaled up’ as required. For example, say you allocated 200GB of storage to a cloud database instance and realised a year after setting up the service that it wasn’t enough. You can easily increase this to 1TB if you want. If the server is likely to have a heavy access load, you can also increase its capacity to handle a higher request rate. Many cloud solutions can also process large volumes of data through tools like AWS’s Elastic MapReduce (EMR) or Windows Azure HDInsight, which use the Hadoop MapReduce framework to retrieve results quickly.
- Services are easy to access from multiple devices and locations. For example, analysts or managers who want to access reports on the go could do so from a mobile device like an iPhone on the train, or from a laptop at home. This removes the need to copy data onto a USB drive and move it between locations.
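The storage figures quoted in the list above can be checked with some simple arithmetic. The sketch below assumes, purely for illustration, that the eleven-nines figure is an annual, independent per-object survival probability, and that the 99.99% availability target applies across a 365-day year. The function names are hypothetical.

```python
from fractions import Fraction

# Figures quoted in the text: eleven nines of durability, 99.99% availability.
# Fractions keep the arithmetic exact (floats would introduce rounding noise).
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)
AVAILABILITY = Fraction(9_999, 10_000)

MINUTES_PER_YEAR = 365 * 24 * 60  # ignores leap years


def years_per_lost_object(object_count, durability=DURABILITY):
    """Average number of years between single-object losses.

    Assumes the durability figure is an annual, independent
    per-object survival probability (an illustrative assumption).
    """
    expected_losses_per_year = object_count * (1 - durability)
    return 1 / expected_losses_per_year


def downtime_minutes_per_year(availability=AVAILABILITY):
    """Minutes of unavailability per year implied by an availability target."""
    return float((1 - availability) * MINUTES_PER_YEAR)


print(years_per_lost_object(10_000))
print(downtime_minutes_per_year())
```

Under these assumptions, 10,000 stored objects work out to one expected loss every 10 million years, matching the quoted statement, and 99.99% availability leaves room for a little under an hour of downtime per year.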
What are the challenges of using cloud services for your BI?
While there are many benefits there are also a number of challenges. These include:
- Disparate Data Sources. This is a challenge of most Business Intelligence endeavours, where data is stored in multiple source systems. Many companies have data stored in both the cloud (e.g. in Google Analytics) and locally (e.g. in Databases and Excel), and these need to be brought together for consumption. The use of an ETL (Extract, Transform and Load) tool that can access the multiple data source types can assist with this process.
- Data Security. Data security is one of the most commonly cited reasons for not using cloud services. However, if configured correctly, cloud services can be just as secure as many in-house services. Consider also that most in-house databases provide some level of connectivity through networks, and that many data sources are already in the cloud, such as Salesforce data that usually contains private customer details. Occasionally there may be data security requirements that cannot be met by public cloud solutions, and in these instances a private cloud could be an alternative. For many solutions, though, security might not be as big a concern as it is commonly made out to be.
- Maturity of Cloud Products. Some offerings are quite new to the market, so there may be concerns around the maturity of their features and their reliability. Most products do experience some teething issues. For example, AWS is not without fault and has experienced quite public outages in the past. This publicity is partially due to the large number of major websites relying on these services, including Thomson Reuters, Netflix, Reddit, Spotify and Airbnb. SaaS services may also experience minor teething issues when first launched; however, these generally improve over time as the service matures.
- Feature Differences. One important consideration is that a particular platform or piece of software in the cloud may have different features to an in-house hosted solution. For example, SQL Server on Amazon RDS operates slightly differently to a self-hosted SQL Server installation. One gotcha is the different backup mechanism on RDS, where the backup is stored in an Amazon restore format rather than the traditional .bak file, because you have no access to the filesystem where your SQL Server instance runs. At the time of writing, you also cannot restore a .bak file onto an Amazon RDS SQL Server instance.
So is BI in the cloud for you?
Here are some questions and considerations to help with your self-assessment.
- Do you already have data in the cloud?
- Consider revisiting any existing BI tools: which functionality is lacking? What do you expect to be different with cloud BI?
- Are you interested in the analysis of unstructured data?
- Do you need to integrate different data sources? Do you need to integrate in-house data sources with cloud data sources?
- If you are choosing between in-house and cloud BI, consider the investment structure as well as the investment amount. The pay-as-you-go model allows for greater investment flexibility, and its costs can often be treated as operational expenses.
- What is the past user experience with BI/cloud software in your company? Do you expect any issues with user acceptance?
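When weighing the investment amount against the investment structure, a rough break-even calculation can help frame the discussion. The sketch below is a simplification under stated assumptions: the function name and all figures are hypothetical placeholders rather than real prices, and it ignores factors such as discounting, staff time and changing usage.

```python
def break_even_month(inhouse_upfront, inhouse_monthly, cloud_monthly, horizon=120):
    """First month in which cumulative cloud spend exceeds in-house spend.

    Returns None if cloud spend never exceeds in-house spend
    within the horizon (given in months).
    """
    for month in range(1, horizon + 1):
        cloud_total = cloud_monthly * month
        inhouse_total = inhouse_upfront + inhouse_monthly * month
        if cloud_total > inhouse_total:
            return month
    return None


# Placeholder figures for illustration only, not real prices.
month = break_even_month(
    inhouse_upfront=60_000,  # hypothetical hardware and licence purchase
    inhouse_monthly=1_000,   # hypothetical running cost
    cloud_monthly=3_000,     # hypothetical pay-as-you-go bill
)
print(month)
```

A result of `None` means the pay-as-you-go option stays at or below the in-house total for the whole period considered. A numeric result shows how long the lower up-front commitment of the cloud option holds before the totals cross over.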
Altis has experience helping clients deliver value to their business by implementing cloud BI solutions. If you are considering a cloud solution and would like to discuss further with us how we might be able to help you, please contact us.