Qlikview and Google BigQuery – Data Visualization for Big Data


Introduction

Google have launched its BigQuery cloud service in May to support interactive analysis of massive datasets up to billions of rows. Shortly after this launch Qliktech, one of the market leaders in BI solutions who is known for its unique associative architecture based on colunm store, in-memory database demonstrated a Qlikview Google BigQuery application that provided data visualization using BigQuery as backend. This post is about how Qlikview and Google BigQuery can be intagrated to provide easy-to-use data analytics application for business users who work on large datasets.

Qlikview and Google BigQuery

Qlikview has two capabilities depending on the needs and the volume of the data:

Qlikview BigQuery Connector: this add-on is written in .NET – thus it requires Microsoft .NET 4 framework to be installed on your computer -, it loads the data into the in-memory data model and various view types (table, barchart, etc) can be then created on the fly to visualize the data or its subset.

Qlikview BigQuery Extension Object: in case of a huge volume of data not all the data can be loaded into memory. Qlikview BigQuery Extension Object provides a web-based solution, it is built upon Google Javascript API. Users can navigate using the extension object and get only the relevant portion of the data from BigQuery.

Preparing the dataset in Google BigQuery

Before we start working with Qlikview BigQuery solutions, we need to create a dataset in Google  BigQuery. We are going to use Apple marketdata donwloaded from finance.yahoo.com site in csv format.

First we need to create a dataset called apple using Google BigQuery browser tool:

Then we need to create a table called marketdata:

The next step is to upload the csv file into the  table – you need to use Chrome browser, as Internet Explorer does not work for file upload as of the writing of this post. The schema that was used is {date:string, open:float, high:float, low:float, close:float, volume:integer, adjclose: float}- just to demonstrate the Google BigQuery is capable of handling various data types:

Finally we can run a simple SQL query to validate that the data has been successfully uploaded:

So far so good, we have the data loaded into Google BigQuery

Qlikview BigQuery Connector

We need to have Qlikview installed, in my test I used Qlikview Personal Edition that can be dowloaded for free from Qliktech website. Then we need to download Qlikview BigQuery Connector  from Qlikview market.

Once Qlikview BigQuery Connector is installed, it appears in a similar way as any other connectors (just like ODBC or OLE DB). Go to Edit Script and then choose BigQuery as database:

Once we click on Connect, an authorization window pops up on the screen – Google BigQuery relies on OAuth2.0, thus we need to have OAUth2.0 client id and client secret. The client id and client secret can be created using Google API console. Select ‘Installed application’ and ‘Other’ options.

In Qlikview we need to authenticate ourself using the client id and client secret:

After authentication the next step is to define the Select statement that will be used to load the data from Google BigQuery into memory:

When we click on OK button the data is being fetched into Qlikview in-memory data model (in our case it is 7,000+ lines):

We can start processing and visualizing the data within Qlikview. First we are going to create a table view by right click and then selecting New Object Sheet:

Let us then define another visualization object, a Line Chart:

And then a BarChart – so we will get the following dashboard to present the data that was loaded from Google BigQuery backend into Qlikview in-memory column store:

Qlikview BigQuery Extension Object

As said before, not necessarily all the data can  fit into the memory – even if Qlikview is very strong at compressing data, we are talking about massive datasets, aren’t we – that is what big data is all about. In this case Qlikview BigQuery Extension Object comes to the rescue. We need to download it from Qlikview market and install it.

As it is a web-based solution using Javascript (Google Javascript API), we’d better have a Google client id and client secret for web applications, we can create it in the same way as described above for Qlikview connector. The ‘javascript origins’ attribute needs to be modified to http://qlikview.

Then we need to turn on WebView in Qlikview:

Now we are ready to create a new visualization object by right click and selecting New Sheet Object:

Select Extenstion Objects, BigQuery and drag it onto the Qlikview window

We need to define the project, dataset and table – bighadoop, apple, marketdata respectively:

Then we have to define the visualization type (Table in this case) and the select statement  to fetch the data. Please, note that we limited the data to 100 lines using ‘select date, open, close, high, low from apple.marketdata limit 100; SQL statement:

We can define various visualization objects, similarly to the BigQuery Connector scenario:

Qlikview on Mobile

QlikTech promotes a unified approach for delivering BI solutions for different platforms based on HTML5, no need for additional layers to support data analytics and visualization on mobile devices. Qlikview Server is capable of recognizing mobile browsers and supports touch-screen functionalities.

Reference:  http://www.slidshare.net/QlikView_UK/qlikview-on-mobile 

Conclusion

Qlikview has gained significant popularity among BI tools, Gartner positioned QlikTech in the  leaders zone of the Business Inteligent Platforms Magic Quadrant in 2012. It provides highly interactive, easy-to-use graphical user interface for business users and the technology partnership with Google to provide seamless integration with BigQuery can just  further strengthen its position.

Mobile BI and Big Data – How to use AWS Elastic MapReduce results with Roambi Mobile BI Analyics


So far we covered server-side/cloud components – how to process data with  MapReduce running in the cloud or on our own Hadoop cluster. This time it is about client-side.

If you have a look at Mary Meeker’s latest brilliant presentation about the Internet trends, one of the key messages is the significant increase in mobile 3G subscriptions and the mind-boggling sales figures for tablets (read: iPad) and smartphones (read: iPhone and Android):

Internet goes mobile and the applications follow the trend – that can be seen in mobile business intelligence, too that has shown a significant momentum recently. People are on the move with mobile devices that have similar performance as a notebook a few years ago, see geekbench results in here. It is time to use this power at hand for business intelligence, too. The tools are already out there to analyse big data and then publish results to mobile devices.

Amazon Elastic MapReduce

In the March post we covered Amazon Elastic MapReduce. Having talked about the mobile internet subscriptions and the enourmous growth in that area, this time we will analyse mobile subscriptions data from Worldbank. This data is about subscriptions to a public mobile telephone service using cellular technology, postpaid and prepaid subscriptions included.

To create an AWS Elastic MapReduce job requires 3 steps: upload input data to an S3 bucket/folder, run an EMR job (e.g. Hive, Pig, custom java), and download the output from an S3 folder.

The S3 storage looks like this for our test :there is a mobilesubscriptions bucket, then there are two folders: one for hive-scripts and one for mobilesubs data (folder). In the mobilesubs folder there is an input folder where we upload the mobile_subscriptions.csv file. The output will be created under s3://mobilesubscriptions/mobilesubs/output folder in csv format.

Its format is like:

Country Name,Country Code,2010
Afghanistan,AFG,37.80718336
Albania,ALB,141.8972543
Algeria,DZA,92.42180275
American Samoa,ASM,
Andorra,AND,77.17642345
Angola,AGO,46.68902631....

(2010 is the last year where we had data)

The hive script the we use for data processing is – this will show the top 100 countries with the highest number of subscriptions:

CREATE EXTERNAL TABLE mobilesubs (
    country_name STRING, country_code STRING, subscriptions FLOAT
)
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION 's3://mobilesubscriptions/mobilesubs/input/';

CREATE TABLE top100_mobilesubs (
    country_name STRING, country_code STRING, subscriptions FLOAT
)
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

INSERT OVERWRITE TABLE top100_mobilesubs
SELECT country_name, country_code, subscriptions
FROM  mobilesubs
ORDER BY subscriptions DESC
LIMIT 100;

INSERT OVERWRITE DIRECTORY 's3://mobilesubscriptions/mobilesubs/output/'
SELECT * from top100_mobilesubs;

The job that will process the data using AWS EMR is configured as follows:

Once we run the job, it will create a 000000_0 file under s3://mobilesubscriptions/mobilesubs/output directory.

This ouput files needs to be downloaded and processed to replaces the SOH characters with comma (,), in order to be able to publish it with Roambi Mobile BI analytics. This can be done by any text processing tool (e.g. notepad++)

Roambi Analytics

Roambi Analytics has a cloud based publishing services and a mobile BI visualizer tool available for iPad and iPhones. The application can be installed on the mobile devices from Apple AppStore for free.

The Roambi publisher has 3 versions: Roambi Lite that is free and has limited functionality (support for csv, excel and html format), Roambi Pro (with additional Google docs and salesforce.com support) and Roambi Enterprise (with Oracle, SAP BusinessObjects, SAS, Microsoft, IBM Cognos, etc support).

This demo is based on Roambi Lite. First you need to create an account or login using Google Account (OpenID) at https://secure.roambi.com:

Then click on Publish:

Select the approriate view (e.g. CataList) and import data (this will be the mobilesubs_result.csv that we downloaded from AWS EMR s3://mobilesubscriptions/mobilesubs/output folder and prepared for Roambi Analytics as described above.

You can refine the data if you wish and then publish it:

The file will be pushed to the mobile devices (iPad or iPhone). In case of Roambi Lite e.g.  you can push it to your own device.

Roambi Analytics Visualizer

On the handset you can retrieve the result using Roambi Analytics Visualiser. You can create an email or screenshot from the report, you can add it to favorites, etc.

iPhone screenhots:

iPad screenshot:

Email sent from Roambi Analytics Visualizer:

As you can see, mobile BI and BigData in the cloud can free users from being a desktop slave: no need for datacenter infrastructure and no need for traditional desktop – just the joy of mobility spiced with the power of cloud computing.