7.2 Managing Schema and Data Changes
7.2.1 Managing Connections and Tables
It’s not uncommon for a model to have several tables connected to different data sources so that you can integrate data from multiple places. As a modeler, you need to understand how to manage connections and tables.
Managing data sources
Suppose you need to import additional tables from a data source that you’ve already set up a connection to. One option is to use Get Data again. However, increasing the number of connections will increase the effort required to manage them. For example, if the security credentials change, you’ll need to update multiple connection definitions. A better way is to use the Recent Sources button in the ribbon’s Home tab (see Figure 7.6).
Figure 7.6 Click the Recent Sources button to access the data sources you’ve already used.
If you connect to a data source that has multiple entities, such as a relational database, when you click the data source in Recent Sources, Power BI Desktop will bring you straight to the Navigator window so that you can select and import another table. And you can use the File > Options and Settings > Data Source Settings menu to manage the security credentials in one place. The Data Source Settings window (see Figure 7.7) shows the data sources that you connected to from Power BI Desktop on your computer.
NOTE Readers familiar with Excel data modelling might recall that Power Pivot has an Existing Connections window that allows you to manage connections to data sources in one place. The Data Source Settings window fulfills a similar purpose, but it doesn’t allow you to change the server and database names. If the server name or database name changes, you need to update the queries in the Query Editor accordingly.
You can select a data source and click the Edit button to change connection credentials or encryption options if the data source supports encryption. These changes will apply to all the queries and Power BI Desktop models that use this data source. That’s because Power BI Desktop encrypts the connection credentials and stores them in the local AppData folder on your computer.
Figure 7.7 Use the Data Source Settings window to manage the data source credentials and encryption options in one place.
Deleting data sources
Every time you use Get Data and connect to a data source, Power BI Desktop adds an entry to the list for each distinct combination of server and database name. For example, if you don’t use a database name, a data source with only the server name is added. If you use both localhost and (local) to connect, you’ll end up with two data sources (although only one of them might be used by a query). This can clutter the list of data sources in the Data Source Settings window.
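To see why the list grows, it may help to model it as a collection keyed on the server and database text exactly as you typed them. The following Python sketch is only an illustration under that assumption; the register function and the list layout are hypothetical and don’t reflect how Power BI Desktop actually stores its data sources.

```python
# Hypothetical model of the data source list; not Power BI Desktop's actual storage format.

def register(sources, server, database=None):
    """Add an entry unless the same (server, database) text pair is already listed."""
    key = (server, database)  # database stays None when no database name is given
    if key not in sources:
        sources.append(key)
    return key


sources = []
register(sources, "localhost", "AdventureWorksDW")
register(sources, "(local)", "AdventureWorksDW")    # same machine, different spelling
register(sources, "localhost")                      # server name only
register(sources, "localhost", "AdventureWorksDW")  # already listed, nothing added

for server, database in sources:
    print(server, database or "<no database>")
```

In this model, four connections leave three entries, because localhost and (local) are compared as text even though they point to the same machine.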
To tidy the list up, you can delete data sources. When you do so, Power BI Desktop deletes the associated encrypted credentials. Unfortunately, the Data Source Settings window doesn’t show which data sources are used by queries. If you delete a data source that’s used by a query, nothing gets broken. However, the next time you refresh, you’ll be asked to specify credentials and encryption options as you did the first time you used Get Data to connect to that data source. Then the data source will be added to the Recent Sources list.
Changing server and database names
Another limitation of the Data Source Settings window is that it doesn’t allow you to change the server and database names, such as when you move from a development to a production environment. To do so, you must go to the Query Editor and double-click the Source step in the Query Settings pane, as shown in Figure 7.8. If you have left the database name empty, you can also use the Navigation step to change the database.
Figure 7.8 You can double-click the Source query step to change the server and database names.
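Because the server and database names live in each query’s Source step, moving to another environment means touching every query that points to the old server. The Python sketch below models that idea only. The repoint function, the query names, and the server names (devsql, prodsql, filesrv) are all hypothetical, and nothing here calls Power BI Desktop.

```python
# Hypothetical model: each query carries its own server and database in its Source step.

def repoint(queries, old_server, new_server):
    """Return copies of the queries with old_server replaced in the Source step."""
    updated = []
    for query in queries:
        source = dict(query["source"])  # copy so the original definition is untouched
        if source["server"] == old_server:
            source["server"] = new_server
        updated.append({"name": query["name"], "source": source})
    return updated


queries = [
    {"name": "Orders", "source": {"server": "devsql", "database": "AdventureWorksDW"}},
    {"name": "Products", "source": {"server": "devsql", "database": "AdventureWorksDW"}},
    {"name": "Budget", "source": {"server": "filesrv", "database": None}},
]

production = repoint(queries, "devsql", "prodsql")
for query in production:
    print(query["name"], query["source"]["server"])
```

Queries that use a different server, such as the hypothetical Budget query, are left alone, which mirrors the fact that you change each Source step individually.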
Importing additional tables
Besides wholesale data, the Adventure Works data warehouse stores retail data for direct sales to individual customers. Suppose that you need to extend the Adventure Works model to analyze direct sales to customers who placed orders on the Internet. Follow these steps to import three additional tables:
NOTE Other self-service tools on the market restrict you to analyzing single datasets only. If that’s all you need, feel free to skip this exercise as the model has enough tables and complexity already. However, chances are that you might need to analyze data from different subject areas side by side. This requires you to import multiple fact tables and join them to common dimensions. And this is where Power BI excels because it allows you to implement self-service models whose features are on a par with professional models. So, I encourage you to stay with me as the complexity cranks up and master these features so you never say “I can’t meet this requirement”.
1.In the ribbon’s Home tab, expand the Recent Sources button, and then click the SQL Server instance that hosts the AdventureWorksDW database.
2.In the Navigator window, expand the AdventureWorksDW database, and then check the DimCustomer, DimGeography, and FactInternetSales tables. In the AdventureWorksDW database, the DimGeography table isn’t related directly to the FactInternetSales table. Instead, DimGeography joins DimCustomer, which joins FactInternetSales. This is an example of a snowflake schema, which I covered in Chapter 5.
3.Click the Edit button. In the Queries pane of the Query Editor, select DimCustomer and change the query name to Customer.
4.In the Queries pane, select DimGeography and change the query name to Geography.
5.Select the FactInternetSales query and change its name to InternetSales. Use the Choose Columns transformation to exclude the RevisionNumber, CarrierTrackingNumber, and CustomerPONumber columns.
6.Click “Close & Apply” to add the three tables to the Adventure Works model and to import the new data.
7.In the Data View, select the Customer table. Hide the CustomerKey and GeographyKey columns. Rename the CustomerAlternateKey column to CustomerID.
8.Select the Geography table, and hide the GeographyKey column.
9.Select the InternetSales table, and hide the first eight columns (the ones with the “Key” suffix).
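If you want a compact checklist of what the steps above do to the three tables, the following Python sketch restates them as data. It’s only a model: the table and column names that the steps mention are taken from the exercise, but the column lists are an abbreviated, hypothetical subset of the real schema, and the sketch folds the Query Editor changes (renaming queries, Choose Columns) and the Data View changes (renaming and hiding columns) into one pass for brevity.

```python
# Names mentioned in the steps are from the exercise; the column lists below are an
# abbreviated, hypothetical subset, not the full AdventureWorksDW schema.
QUERY_NAMES = {
    "DimCustomer": "Customer",
    "DimGeography": "Geography",
    "FactInternetSales": "InternetSales",
}
EXCLUDED_COLUMNS = {
    "FactInternetSales": {"RevisionNumber", "CarrierTrackingNumber", "CustomerPONumber"},
}
COLUMN_RENAMES = {"Customer": {"CustomerAlternateKey": "CustomerID"}}
HIDDEN_COLUMNS = {
    "Customer": {"CustomerKey", "GeographyKey"},
    "Geography": {"GeographyKey"},
    "InternetSales": {"ProductKey", "CustomerKey"},  # two of the eight key columns
}


def shape(source_tables):
    """Rename queries, drop excluded columns, and rename columns."""
    model = {}
    for source_name, columns in source_tables.items():
        table = QUERY_NAMES.get(source_name, source_name)
        excluded = EXCLUDED_COLUMNS.get(source_name, set())
        renames = COLUMN_RENAMES.get(table, {})
        model[table] = [renames.get(c, c) for c in columns if c not in excluded]
    return model


def visible(model):
    """Hidden columns stay in the model but drop out of the field list."""
    return {t: [c for c in cols if c not in HIDDEN_COLUMNS.get(t, set())]
            for t, cols in model.items()}


source_tables = {
    "DimCustomer": ["CustomerKey", "GeographyKey", "CustomerAlternateKey", "FirstName"],
    "DimGeography": ["GeographyKey", "City"],
    "FactInternetSales": ["ProductKey", "CustomerKey", "RevisionNumber",
                          "CarrierTrackingNumber", "CustomerPONumber", "SalesAmount"],
}

model = shape(source_tables)
print(visible(model))
```

Note that excluded columns never reach the model, whereas hidden columns are still there for relationships and calculations; they just don’t appear in the field list.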