Generative AI and Power BI: A Powerful Duo for Data Analysis
What Is Generative AI?
Generative AI is an umbrella term for models capable of producing original outputs based on the information provided, hence the moniker “generative”. Outputs can be new text (e.g., ChatGPT), new images (e.g., DALL·E 2), new audio (e.g., AudioLM), or even new code (e.g., GitHub Copilot).
In the simplest terms, generative AI models are trained on large data samples. The goal of the training is to teach the model to associate various inputs with the features or labels provided (e.g., a model learns that an apple is “red”, “round”, “juicy”, etc.). The datasets need to be substantial: GPT-3, a model with 175 billion parameters, was trained on 45 terabytes of text data.
The latest generative AI models are powered by neural networks — a machine learning method that uses interconnected nodes (neurons) in a layered structure, similar to the human brain.
Basically, a neural network gives generative AI models the “wits” to produce creative outputs, make data-backed decisions, and perform a variety of other tasks.
The big boon of generative AI models is that they can be directed with natural language (typed prompts) rather than complex code. You ask for a new recipe, and ChatGPT gives you one in seconds.
Because of the ease of use and the speed of outputs, generative AI models can massively improve workers’ productivity and deliver substantial economic benefits.
Generative AI models can take over a number of routine, low-value tasks. For example, in the business intelligence domain, generative AI models can help with data querying, analysis, and visualization. In software engineering, generative tools can help with code reviews and refactoring, plus a wide range of infrastructure management tasks.
By some industry estimates, present-day generative AI models can automate work activities that consume 60%-70% of employees’ time.
Generative AI can also make data analytics more accessible to a wider audience. Users with different backgrounds and skill levels can interact with data using text-based commands to receive personalized results. Likewise, generative AI models can be tasked to build data visualizations and dashboards, plus provide extra commentary on the analyzed data sample(s), data sources, and the statistical methods it used to arrive at certain conclusions.
Such explainable AI (XAI) helps human users better understand the model’s reasoning and mitigate the risks of biases or inconsistencies.
Since generative AI models are trained on vast amounts of data, they are more capable of noticing unique patterns and correlations, then presenting overlooked or completely novel findings to human users as predictive or prescriptive insights.
This creative “shtick” can enhance the innovative capabilities of your business. For example, Simcenter has a generative AI tool that helps discover optimal system architectures. The model scans through thousands of possibilities based on the input product characteristics and then suggests the best-fit system architecture pattern.
Microsoft has a substantial track record in AI innovation. The company’s Cloud AI developer services have been recognized among the best by Gartner Magic Quadrant for four consecutive years. The company’s AI services help with custom model development, deployment, and management in production.
Here is a summary of three main options for deploying AI in Power BI.
Copilot for Power BI
Copilot is an integrated generative AI model developed, trained, and operated by Microsoft. Copilot brings the power of large language models into Microsoft applications, helping users get more work done with smart prompts and task automation.
Copilot mode for Power BI was announced in May 2023. With the new AI assistant, users can type natural-language commands to query data, edit Data Analysis Expressions (DAX) calculations, and generate reports or data visualizations. Copilot also provides conversational answers to questions about data and can create narrative summaries for reporting.
ChatGPT Integration
Developers can also integrate ChatGPT directly with Power BI to get assistance from the famous OpenAI model. ChatGPT can help build complex calculations and advanced queries for Power BI models, troubleshoot errors, and optimize report generation.
Overall, ChatGPT can come in handy to users who need extra support when working with Power Query M functions. ChatGPT can help you troubleshoot issues with such functions. It can also help generate data prep steps from scratch and even write Power Query code using descriptive text instructions.
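As a rough illustration of how such an integration could look from the developer's side, here is a minimal Python sketch that asks an OpenAI chat model to draft a Power Query M step from a plain-language description. The model name, system prompt, and helper names are illustrative assumptions, not an official Power BI feature:

```python
# Hypothetical sketch: generating Power Query M code from a text instruction.
# Assumes `pip install openai` and an OPENAI_API_KEY environment variable.
import os

def build_messages(task_description: str) -> list:
    """Wrap a plain-language request in a chat prompt that asks for M code only."""
    system_prompt = (
        "You are a Power BI assistant. Reply only with valid Power Query M code "
        "that implements the requested data-preparation step."
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": task_description},
    ]

def generate_m_code(task_description: str) -> str:
    """Send the prompt to the OpenAI chat completions API and return the reply."""
    from openai import OpenAI
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name; substitute any chat-capable model
        messages=build_messages(task_description),
    )
    return response.choices[0].message.content

# Example call (requires network access and a valid key):
# print(generate_m_code("Remove rows where Revenue is null, then sort by Date."))
```

The returned M code still needs human review before it is pasted into the Power Query editor, since generated code may reference columns or types that do not exist in your model.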
Azure Cognitive Services
The most advanced AI capabilities for Power BI come from Azure Cognitive Services: pre-trained, customizable AI models packaged as application programming interfaces (APIs). Deployable to any cloud or edge application with containers, Cognitive Services add advanced analytical capabilities to products.
Power BI offers an option to enrich existing dataflows with available Cognitive Services models via a graphical interface. At present, Cognitive Services in Power BI support the following actions:
- Automatic language detection and text recognition in 120 languages
- Keywords and phrases extraction from unstructured texts
- Sentiment analysis for smaller text documents
- Image tagging, capable of identifying over 2,000 objects
Cognitive Services help data analysts handle large datasets more effectively, spending less time on data cleansing, labeling, and preparation for self-service analytics.
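To make the sentiment-analysis action above concrete, here is a sketch of calling the Cognitive Services sentiment endpoint directly over REST. The endpoint and key are placeholders; inside Power BI the same model is applied through the dataflow enrichment GUI rather than code, and the v3.1 route shown here is the Text Analytics API path documented at the time of writing:

```python
# Sketch: scoring text sentiment via the Azure Text Analytics REST API.
# `endpoint` and `key` come from your own Cognitive Services resource.
import json
from urllib import request

def build_sentiment_payload(texts, language="en"):
    """Shape input documents the way the Text Analytics sentiment API expects."""
    return {
        "documents": [
            {"id": str(i), "language": language, "text": text}
            for i, text in enumerate(texts, start=1)
        ]
    }

def analyze_sentiment(endpoint, key, texts):
    """POST the documents and return the parsed JSON sentiment scores."""
    req = request.Request(
        f"{endpoint}/text/analytics/v3.1/sentiment",
        data=json.dumps(build_sentiment_payload(texts)).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )
    with request.urlopen(req) as response:
        return json.load(response)
```

The response attaches a positive/neutral/negative score to every document, which can then be merged back into a Power BI dataset as an extra column.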
Power BI also lets you build fully custom machine learning models to run against your data using automated machine learning (AutoML) tools from Azure Machine Learning service. AutoML provides tools for deploying supervised learning techniques such as binary prediction, classification, and regression models.
You do not need an Azure subscription to use AutoML in Power BI, since the tool entirely manages the process of training and hosting ML models.
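Binary prediction, the simplest of the supervised techniques mentioned above, is what AutoML automates end to end. For intuition, here is a from-scratch sketch of the same idea: logistic regression trained with stochastic gradient descent on a toy one-feature dataset (the data and hyperparameters are illustrative):

```python
# Minimal logistic regression for binary prediction, trained with SGD.
# AutoML in Power BI performs model selection, training, and validation
# like this behind the GUI; this sketch only shows the underlying idea.
import math

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weight and bias so that sigmoid(w*x + b) approximates P(y=1|x)."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted probability
            w += lr * (y - p) * x                      # gradient step on weight
            b += lr * (y - p)                          # gradient step on bias
    return w, b

def predict(w, b, x):
    """Classify as 1 when the predicted probability reaches 0.5."""
    return 1 if 1.0 / (1.0 + math.exp(-(w * x + b))) >= 0.5 else 0
```

In practice AutoML also handles feature selection, cross-validation, and metric reporting, which is exactly the labor this tiny sketch leaves out.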
Generative AI Use Cases in Data Analytics and BI
Whether you are using Power BI or another self-service BI tool, generative AI models can come in handy on many occasions.
Data silos remain a pervasive problem, as does poor data quality. On average, data analysts spend 80% of their productive time on data discovery, cleansing, and preparation, and only 20% on actual model development and analysis.
Generative AI models can help with the labor-intensive tasks of data classification, tagging, anonymization, segmentation, and enrichment. Moreover, auxiliary products for data cataloging and management can help improve data lineage and accessibility. Microsoft, for example, launched Microsoft Fabric together with Copilot mode for Power BI. Fabric is a SaaS data management platform that provides convenient tools for data consolidation, governance, and distribution across all deployed analytics models.
Most analytics use cases pursue one goal: the discovery of new intel. The problem, however, is that traditional models often inherit the “thinking process” of their creators. For example, your domain experts may have preconceived notions and biases that would be incorporated into the model. Likewise, some users may struggle to ask the right questions or be open to the idea of analyzing the data from a new angle.
Well-trained generative AI models can uncover new data dimensions and correlations and present them to business users for consideration. Generative AI models can ideate at fast speeds by building thousands of associations in a matter of seconds and pitch various new concepts for evaluation.
Generative AI models can deliver concise data summaries from larger reports or even write the entire copy, using the data provided. Such models are great for contextualizing findings and conveying them to others in a short, succinct manner. Data visualizations are another great task generative AI models can handle at much faster speeds than the average human.
The latest generation of models can assess the feasibility and consequences of completing certain actions to reach the desired outcomes. Using historical records, generative models can create accurate forecasts by juggling multiple variables and evaluating different scenarios. These insights can then be communicated to business users as feedback on their ideas or transformed into prescriptive recommendations.
Making Generative AI a Reality for Data Analysts
Generative AI can bring game-changing performance gains to data teams. However, despite all the hype and surging interest, many IT leaders are also wary of the potential risks.
Main Concerns with Generative AI
To profit from all the benefits of present-day generative AI capabilities, as well as future model improvements, organizations need to address these concerns through several best practices:
Focus on Data Security
Many commercial generative AI models use the input data for model training purposes, which may not be ideal for privacy-focused industries. Likewise, analysts may include sensitive data in proprietary models by mistake.
To mitigate data privacy and security risks of GenAI usage, organizations need to:
- Establish auditable trails for GenAI data collection, storage, and processing practices.
- Use data anonymization and/or aggregation to mask sensitive data, which is shared with AI models.
- Implement granular access controls to restrict data access to authorized users or processes.
- Employ secure user authentication methods and role-based authorization for different types of data manipulations.
In other words, organizations need to create a strong data governance process, one that establishes full data traceability across the organization.
Each created dataset needs a clear owner and a list of users with access/modification permissions. It should also be validated against the organization's compliance rules. Modern solutions like Microsoft Purview and Microsoft Fabric help establish a clear data ownership structure, paired with scalable, secure data-sharing practices. These tools can help control which data is consumed by GenAI models and prevent accidental disclosures.
Implement Quality Assurance Processes for Developed Models
Generative AI models gain their “knowledge” from the data and design choices of their creators. Issues at the design level lead to subpar model performance and biased results. Without extensive quality assurance and model observability, unconscious biases will make their way into new models.
Amazon once built an AI resume-rating tool that unfavorably ranked female candidates because of their gender. An early version of a Google Photos image-recognition algorithm misclassified images of Black people. In data analytics, such flaws can result in plainly wrong calculations or data interpretations, as happened when Bing AI presented inaccurate analyses of earnings reports for selected companies.
To avoid biases in AI model design, it is a good practice to:
- Select the best-fit learning model for the selected use case. GenAI models can be built with unsupervised or semi-supervised learning techniques. Each has its pros and cons when it comes to the quality and accuracy of produced outputs.
- Train models on datasets representative of your organization and industry. The data provided must be comprehensive and balanced. It’s a good idea to train models on proprietary data rather than public datasets.
- Implement model observability to analyze the model’s behavior, data, and performance across its lifecycle. Observability helps detect and investigate drifts in performances and anomalies in a timely fashion.
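One concrete observability check, sketched under simple assumptions, is flagging drift when a feature's live mean moves more than a chosen number of training-set standard deviations away from the training mean. Real observability stacks track many such statistics, plus prediction quality, continuously:

```python
# A single drift check: compare the live mean of a feature against the
# training distribution. The z-score threshold of 3.0 is an illustrative default.
from statistics import mean, stdev

def mean_shift_drift(train_values, live_values, threshold=3.0):
    """Return True when the live mean drifts beyond the allowed z-score."""
    mu, sigma = mean(train_values), stdev(train_values)
    if sigma == 0:
        return mean(live_values) != mu
    return abs(mean(live_values) - mu) / sigma > threshold
```

When such a check fires, the usual response is to investigate the data source first and retrain the model only if the shift reflects a genuine change in the underlying population.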
Build a Corporate Culture of AI Usage
AI makes certain people uncomfortable for one reason or another. Without a clear understanding of the benefits and use cases of GenAI, adoption will always remain an uphill battle.
At the technology level, organizations will need to establish a better data management infrastructure for continuous dataset creation and self-service access to insights. This step alone requires major transformations in both the supporting infrastructure and the supporting processes.
At the process level, leaders will need to identify the problems that can be effectively solved with GenAI. The adoption process should center on solving actual business challenges, not on treating AI as an end in itself.
At the people level, your employees will need to be educated on the purpose, benefits, constraints, and risks of using the available AI solutions, as well as briefed on security and privacy best practices.