10.2 Implementing Real-time BI Solutions
10.2.2 Integrating Stream Analytics and Power BI
Now that you’ve learned about real-time BI, let me walk you through a sample solution that demonstrates how Azure Stream Analytics and Power BI can help you implement real-time dashboards. Suppose that Adventure Works is interested in analyzing customer sentiment from messages that are posted on Twitter. This is immediate feedback from its customer base, which can help the company improve its products and services. Adventure Works therefore wants to monitor the average customer sentiment about specific topics in real time. Figure 10.11 shows the process flow diagram.
Figure 10.11 This solution demonstrates how you can integrate Stream Analytics with Power BI.
Instead of building the entire solution from scratch, I decided to use the Real-time Twitter sentiment analysis sample by Microsoft. You can download the code from GitHub (https://github.com/Azure/azure-stream-analytics/tree/master/DataGenerators/TwitterClient) and read the documentation at https://github.com/Azure/azure-content/blob/master/articles/stream-analytics/stream-analytics-twitter-sentiment-analysis-trends.md. You can also read the documentation online at https://azure.microsoft.com/en-us/documentation/articles/stream-analytics-twitter-sentiment-analysis-trends.
NOTE This sample demonstrates how remarkably simple it is to implement real-time cloud solutions with Stream Analytics. You only need to write custom code to send events to Event Hubs. By contrast, a similar StreamInsight-based application would require much more coding on your part, as the Big Data Twitter Demo (http://twitterbigdata.codeplex.com) demonstrates. That’s because you’d need to write the plumbing code for observers, adapters, sinks, and more.
Understanding the client application
Designed as a C# console application, the client app uses the Twitter APIs to filter tweets for specific keywords that you specify in the app.config file. To personalize the demo for our fictitious bike manufacturer (Adventure Works), I used the keywords “Bike” and “Adventure”. In the same file, you must specify the Twitter OAuth settings that you obtain when you register a custom application with Twitter. For more information about registering an application with Twitter and about obtaining the security settings, read the “Tokens from dev.twitter.com” topic at https://dev.twitter.com/oauth/overview/application-owner-access-tokens.
Note that the client app (as coded by Microsoft) doesn’t have any error handling. If you don’t configure it correctly, it won’t show any output and won’t give you any indication of what’s wrong. To see the actual error, I recommend that you rethrow the exception in every catch block in the EventHubObserver.cs file:
catch (Exception)
{
    // Rethrow with "throw;" (not "throw ex;") to preserve the original stack trace.
    throw;
}
The application integrates with an open-source tool (Sentiment140) to assign a sentiment value to each tweet (0: negative, 2: neutral, 4: positive). The tweet events are then sent to an Azure event hub. Therefore, to test the application successfully, you must first set up an event hub and configure Stream Analytics. If all is well, the application shows the stream of tweets in the console window as they’re sent to the event hub.
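To make the shape of a tweet event more concrete, here is a minimal Python sketch (the sample itself is written in C#, so this is only an illustration). The field names Topic, SentimentScore, and CreatedAt are taken from the standing query shown later in this section. The Text field, the function name, and the label-to-score lookup are assumptions made for this example and aren't the sample's actual code.

```python
import json
from datetime import datetime, timezone

# Sentiment140 polarity values as described in the text.
SENTIMENT_SCORES = {"negative": 0, "neutral": 2, "positive": 4}


def build_tweet_event(topic, text, sentiment, created_at):
    """Return a JSON string for one tweet event (hypothetical payload shape)."""
    event = {
        "CreatedAt": created_at.isoformat(),
        "Topic": topic,
        "SentimentScore": SENTIMENT_SCORES[sentiment],
        "Text": text,  # assumed field; the standing query doesn't reference it
    }
    return json.dumps(event)


if __name__ == "__main__":
    sample = build_tweet_event(
        "Bike", "Love my new bike!", "positive",
        datetime(2016, 1, 1, 12, 0, 3, tzinfo=timezone.utc),
    )
    print(sample)
```

Each event carries its own timestamp, which is what allows the query to group events by the time the tweet was created rather than by the time it arrived.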
Understanding Stream Analytics setup
The documentation that accompanies the sample provides step-by-step instructions to configure the Azure part of the solution. You can perform the steps using the old Azure portal (https://manage.windowsazure.com) or the new Azure portal (http://portal.azure.com).
Instead of reiterating the steps, I’ll just emphasize a few points that might not be immediately clear:
1. Before setting up a new Stream Analytics job, you must create an event hub that ingests the data stream.
2. After you create the hub, you need to copy the connection information from the hub registration page and paste it in the EventHubConnectionString setting in the client application app.config file. This is how the client application connects to the event hub.
3. When you set up the Stream Analytics job, you can use sample data to test the standing query. You must run the client application before this step so that the event hub has some data in it. In addition, make sure that the date range you specify for sampling actually returns tweet events.
4. This is the standing query that I used for the Power BI dashboard:
SELECT System.Timestamp AS Time, Topic, COUNT(*), AVG(SentimentScore),
       MIN(SentimentScore), MAX(SentimentScore), STDEV(SentimentScore)
FROM TwitterStream TIMESTAMP BY CreatedAt
GROUP BY TUMBLINGWINDOW(s, 5), Topic
This query divides time into five-second intervals. A tumbling window is one of the window types that are supported by both Stream Analytics and SQL Server StreamInsight (read about it at https://msdn.microsoft.com/en-us/library/azure/dn8350110.aspx). Within each interval, the query groups the incoming events by topic and calculates the event count, the minimum, maximum, and average sentiment score, and the standard deviation. Because Stream Analytics queries are described in a SQL-like grammar, you can leverage your SQL query skills.
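If it helps to see the grouping logic outside of the query language, the following Python sketch mimics what the query computes over a small in-memory list of events. It isn't how Stream Analytics is implemented. It assumes integer timestamps in seconds, and it assumes that each window includes its end boundary and excludes its start boundary. The function names and the output column names are made up for this illustration.

```python
from collections import defaultdict
from statistics import mean, stdev

WINDOW_SECONDS = 5


def window_end(ts, size=WINDOW_SECONDS):
    """Map an integer timestamp (seconds) to the end of its tumbling window.

    Assumes each window covers (end - size, end], so an event that falls
    exactly on a boundary belongs to the window that ends there.
    """
    return -(-ts // size) * size  # ceiling division


def aggregate(events):
    """Group (ts, topic, score) tuples by window and topic, then aggregate."""
    groups = defaultdict(list)
    for ts, topic, score in events:
        groups[(window_end(ts), topic)].append(score)

    rows = []
    for (end, topic), scores in sorted(groups.items()):
        rows.append({
            "Time": end,
            "Topic": topic,
            "Count": len(scores),
            "Avg": mean(scores),
            "Min": min(scores),
            "Max": max(scores),
            # Sample standard deviation; left undefined for a single event.
            "StDev": stdev(scores) if len(scores) > 1 else None,
        })
    return rows


if __name__ == "__main__":
    events = [(1, "Bike", 4), (3, "Bike", 0), (4, "Adventure", 2), (7, "Bike", 2)]
    for row in aggregate(events):
        print(row)
```

Because tumbling windows don't overlap, every event lands in exactly one window, which is why a single ceiling division is enough to find its group.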
NOTE Unlike SQL SELECT queries, which execute once, Stream Analytics queries are standing. To understand this, imagine that the stream of events passes through the query. As long as the Stream Analytics job is active, the query is active and it’s always working. In this case, the query divides the stream into five-second intervals and calculates the aggregates on the fly.
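One way to picture a standing query is as a generator that sits on the stream and emits a result each time a window closes. The Python sketch below illustrates the idea with a simple event count. It's a conceptual analogy rather than the Stream Analytics engine, and it assumes that integer timestamps arrive in non-decreasing order and that a window includes its end boundary. The function name is hypothetical.

```python
def standing_count(stream, size=5):
    """Yield (window_end, count) each time a tumbling window closes.

    `stream` is any iterable of integer timestamps in non-decreasing order.
    The generator keeps working for as long as events keep arriving, and it
    only emits windows that contain at least one event.
    """
    current_end = None
    count = 0
    for ts in stream:
        end = -(-ts // size) * size  # end of the window that contains ts
        if current_end is not None and end != current_end:
            yield current_end, count  # the previous window just closed
            count = 0
        current_end = end
        count += 1
    if current_end is not None:
        yield current_end, count  # flush the last window when the stream ends


if __name__ == "__main__":
    for result in standing_count([1, 2, 4, 6, 9, 10, 12]):
        print(result)
```

The caller never reruns the query. It simply keeps consuming results while events flow in, which is the sense in which the query is always working.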
Understanding Power BI setup
A Stream Analytics job can have multiple outputs. For example, it can save the query results to durable storage for offline analysis and also display them on a real-time dashboard. Figure 10.12 shows the outputs that Stream Analytics currently supports. Power BI is just one of these outputs. Follow these steps to configure Stream Analytics to send the output to Power BI:
1. In the “Add an output to your job” step, select Power BI.
2. In the “Authorize Connection” step, sign in to Power BI.
Figure 10.12 Stream Analytics supports different types of outputs for sending query results.
3. In the “Microsoft Power BI Settings” step, name the output and specify the names of the Power BI dataset and table (see Figure 10.13). You can send results from different queries to separate tables within the same dataset. If a table with the same name already exists, it’ll be overwritten.
Figure 10.13 Configure which dataset and table the query results will be sent to.
Creating a real-time dashboard
Now that all the setup steps are behind us, we’re ready to have fun with the data:
1. Run the Stream Analytics job. It might take a while for the job to start, so monitor its progress on the Stream Analytics dashboard page.
2. Once the job has started, it’ll send the query results to Power BI, assuming there are incoming events that match the query criteria. Power BI will create a dataset with the name you specified.
3. Log in to Power BI Service and explore the dataset.
4. To show the data in real time, create a dashboard by pinning a report visualization. The dashboard tile in Figure 10.14 is based on a report visualization that uses a Combo Chart. It shows the count of events as columns and the average sentiment score as a line. Time is measured on the shared axis.
Figure 10.14 Power BI generates a real-time dashboard when Stream Analytics Service streams the events.
5. Watch the dashboard update itself as new data comes in, and you gain real-time insights!