- Part 1 of 2: Text Analytics of Movie Reviews using Azure Data Lake, Cognitive Services and Power BI
- Take a CSV file and analyze it with a U-SQL script in Azure Data Lake
- Part 2 of 2: Text Analytics of Movie Reviews using Azure Data Lake, Cognitive Services and Power BI
- Visualize processed data in Azure Data Lake with Power BI Desktop
- Publish the Power BI report to a SharePoint Online page with the Power BI Web Part (preview)
Applicable Business Scenario
Marketing or data analysts need to review the sentiments and key phrases of a very large data set of consumer-based movie reviews.
Applied Technologies
- Azure Data Lake Store
- Azure Data Lake Analytics
- Visual Studio with Azure Data Lake Analytics Tools
- Power BI Desktop & Power BI Service
- SharePoint Online site and preview of Power BI Web Part
Power BI Desktop
Use Power BI Desktop as the report authoring tool.
Data Source
Get Data from Azure Data Lake Store. Retrieve the output of the U-SQL script executed in Azure Data Lake Analytics in part 1 of this blog series.
Data Source to Azure Data Lake Store
Point to the folder containing the .tsv (tab-delimited) files that were the output of the U-SQL script execution.
Provide credentials for an account that has permissions to the Azure Data Lake Store. In this case, it was an Azure AD account.
Queries
Create a query for each .tsv file.
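Each query essentially parses one tab-delimited file into a table. As a rough illustration outside Power BI, the sketch below does the same with Python's standard `csv` module. The sample rows and the column names (`Id`, `Sentiment`, `Conf`) are assumptions made for this example, not the actual output schema of the U-SQL script.

```python
import csv
import io

def read_tsv(text, fieldnames):
    """Parse tab-delimited text into a list of dicts keyed by fieldnames."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=fieldnames, delimiter="\t")
    return list(reader)

# Hypothetical sample standing in for one .tsv output file (no header row).
sample = "1\tPositive\t0.93\n2\tNegative\t0.81\n"

# Column names are assumptions for illustration only.
rows = read_tsv(sample, ["Id", "Sentiment", "Conf"])
for row in rows:
    print(row["Id"], row["Sentiment"], float(row["Conf"]))
```

Passing the column names explicitly keeps the sketch usable for files written without a header row; if the files do carry a header, `csv.DictReader` can read the names from the first line instead.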
Relationships
Define a one-to-many relationship based on the ID of each movie review.
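To make the idea of a one-to-many relationship concrete, here is a minimal Python sketch, assuming one table holds a single sentiment row per review and another holds several key-phrase rows per review. The table contents and the column names (`Id`, `Sentiment`, `KeyPhrase`) are hypothetical and only illustrate how rows are matched on the review ID.

```python
from collections import defaultdict

# Hypothetical "one" side: a single sentiment row per review ID.
sentiments = [
    {"Id": "1", "Sentiment": "Positive"},
    {"Id": "2", "Sentiment": "Negative"},
]

# Hypothetical "many" side: several key-phrase rows per review ID.
key_phrases = [
    {"Id": "1", "KeyPhrase": "great acting"},
    {"Id": "1", "KeyPhrase": "strong plot"},
    {"Id": "2", "KeyPhrase": "slow pacing"},
]

def group_by_id(rows):
    """Index the 'many' side by review ID."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["Id"]].append(row["KeyPhrase"])
    return grouped

def join_one_to_many(one_side, many_side):
    """Attach to each 'one' row the list of matching 'many' values."""
    phrases = group_by_id(many_side)
    return [{**row, "KeyPhrases": phrases.get(row["Id"], [])} for row in one_side]

joined = join_one_to_many(sentiments, key_phrases)
for row in joined:
    print(row["Id"], row["Sentiment"], row["KeyPhrases"])
```

In Power BI the relationship is defined in the model rather than by materializing a joined table, but the matching rule is the same: every key-phrase row points back to exactly one review through its ID.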
Reports/Visualization
Sentiment confidence value for each of the 2000 movie reviews
Publish
Click on ‘Publish’ to upload your report to the Power BI Service in the Microsoft cloud. You can view it at http://app.powerbi.com with your Office 365 or Microsoft account.
SharePoint Online
If you want to publish and share this report with a wide audience via a SharePoint Online site, you can leverage the new Power BI Web Part (currently in preview as of Feb 2017). I have displayed this report in the latest SPO modern page experience as a publishing page. Each user who views the report must have a Power BI Pro license, which is not free.
To configure it, you need to create a modern publishing page that displays the Power BI report via the Power BI Web Part (preview).
Web Part in Edit Mode
Enter the report link which you get from Power BI Service at http://app.powerbi.com
Options to further extend this solution
- For the movie reviews .csv file, one can add date/time, the movie being reviewed, genre, location and any other descriptive metadata, thus supporting more reporting and insights.
- Overlay this data set against other data sets for correlation, such as related news events, weather, popular movies trending, other movie review sources, etc. This is to find any cause-and-effect relationships for diagnostic insights, that is, answering “Why is this happening?”.
- To get data from an internal network to Azure Data Lake or any Azure storage account, one option is to use the Data Management Gateway. It is installed within the internal network and allows files and data from other internal data sources to be transferred with little to no corporate firewall changes. See: Move data between on-premises sources and the cloud with Data Management Gateway
Closing Remarks
Azure Cognitive Services built into Azure Data Lake Analytics is a suitable option for very high-volume, unstructured and complex data processing, where scalable computing power is needed. In addition, it is priced on a pay-per-use model, making it cost-effective in many scenarios. The agility of Azure services allows you to experiment, iterate quickly and fail fast in finding the right technical solution and applying the right techniques and approach. This article highlights how data can be ingested, analyzed/processed, modeled, visualized and then published to a business audience.