- Part 1 of 2: Text Analytics of Movie Reviews using Azure Data Lake, Cognitive Services and Power BI
  - Take a .csv file and analyze it with a U-SQL script in Azure Data Lake
- Part 2 of 2: Text Analytics of Movie Reviews using Azure Data Lake, Cognitive Services and Power BI
  - Visualize the processed data in Azure Data Lake with Power BI Desktop
  - Publish the Power BI report to a SharePoint Online page with the Power BI Web Part (preview)
Applicable Business Scenario
Marketing or data analysts who need to review the sentiment and key phrases in a large data set of consumer movie reviews.
Applied Technologies
- Azure Data Lake Store
- Azure Data Lake Analytics
- Visual Studio with Azure Data Lake Analytics Tools
- Power BI Desktop & Power BI Service
- SharePoint Online site and preview of Power BI Web Part
Azure Data Lake Store
Upload the .csv file of 2,000 movie reviews to a folder in Azure Data Lake Store. The script in this post reads it from /TextAnalysis/moviereviews.csv.
Azure Data Lake Analytics
Execute the following U-SQL script either in the Azure Portal (Azure Data Lake Analytics > Jobs > New Job) or in Visual Studio with the Azure Data Lake Analytics Tools.
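The script references the Cognitive Services assemblies, which are available in the Azure Data Lake master database. If the REFERENCE ASSEMBLY statements fail, the U-SQL extensions may first need to be installed on the Data Lake Analytics account (in the Azure Portal, under Sample Scripts > Install U-SQL Extensions).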
U-SQL Script
The following script reads the moviereviews.csv file from Azure Data Lake Store, then runs sentiment analysis and key phrase extraction on each review. It produces two .tsv files: one with the sentiment and key phrases for each movie review, and another that lists each individual key phrase with a foreign key (RowNumber) pointing back to its parent movie review.
REFERENCE ASSEMBLY [TextCommon];
REFERENCE ASSEMBLY [TextSentiment];
REFERENCE ASSEMBLY [TextKeyPhrase];
@comments =
    EXTRACT Text string
    FROM @"/TextAnalysis/moviereviews.csv"
    USING Extractors.Csv();

@sentiment =
    PROCESS @comments
    PRODUCE Text,
            Sentiment string,
            Conf double
    READONLY Text
    USING new Cognition.Text.SentimentAnalyzer(true);

@keyPhrases =
    PROCESS @sentiment
    PRODUCE Text,
            Sentiment,
            Conf,
            KeyPhrase string
    READONLY Text,
             Sentiment,
             Conf
    USING new Cognition.Text.KeyPhraseExtractor();

@keyPhrases =
    SELECT *,
           ROW_NUMBER() OVER () AS RowNumber
    FROM @keyPhrases;

OUTPUT @keyPhrases
TO "/TextAnalysis/out/MovieReviews-keyPhrases.tsv"
USING Outputters.Tsv();

// Split the key phrases.
@kpsplits =
    SELECT RowNumber,
           Sentiment,
           Conf,
           T.KeyPhrase
    FROM @keyPhrases
         CROSS APPLY
             new Cognition.Text.Splitter("KeyPhrase") AS T(KeyPhrase);

OUTPUT @kpsplits
TO "/TextAnalysis/out/MovieReviews-kpsplits.tsv"
USING Outputters.Tsv();
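To illustrate how the second output relates back to the first, here is a minimal Python sketch that groups the split key phrases by the RowNumber of their parent review. It is not part of the original solution. The sample rows are invented for illustration, and the column order (RowNumber, Sentiment, Conf, KeyPhrase), the absence of a header row, and the quoted string values are assumptions based on the SELECT list and typical Outputters.Tsv() output. The function name group_key_phrases is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Invented stand-in for the contents of MovieReviews-kpsplits.tsv.
# Assumed column order: RowNumber, Sentiment, Conf, KeyPhrase (no header row).
KPSPLITS_TSV = (
    '1\t"Positive"\t0.91\t"great acting"\n'
    '1\t"Positive"\t0.91\t"strong plot"\n'
    '2\t"Negative"\t0.78\t"slow pacing"\n'
)


def group_key_phrases(tsv_text):
    """Group key phrases by the RowNumber of their parent review."""
    phrases_by_review = defaultdict(list)
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quotechar='"')
    for row_number, _sentiment, _conf, key_phrase in reader:
        # RowNumber acts as the foreign key back to the per-review file.
        phrases_by_review[int(row_number)].append(key_phrase)
    return dict(phrases_by_review)


grouped = group_key_phrases(KPSPLITS_TSV)
print(grouped)
```

For a real run, the downloaded .tsv file would be read instead of the inline sample, and the column order should be checked against the actual output first.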
Azure Portal > Azure Data Lake Analytics U-SQL execution
Create a new job to execute a U-SQL script.
Visual Studio Option
You need the Azure Data Lake Tools for Visual Studio. Create a U-SQL project, paste the script, and submit it to Azure Data Lake Analytics for execution. Once the job completes, Visual Studio shows a job summary indicating that it succeeded.
Continue to Part 2 of 2 of this blog series.