Introduction
This article aims to clarify how to effectively use Power BI Append and Merge functions, which are vital tools in Power Query. Both functions facilitate the joining of multiple datasets into one, but they serve distinct purposes. In summary, Merge adds new columns to your dataset, while Append adds more rows. Let's delve into each function for a better understanding.
Power BI Merge
Basic Merge
A Merge functions similarly to a Vlookup in Excel. To perform a merge, you'll need two tables with at least one pair of columns that holds matching values. Using the Adventureworks2017 database as an example, I have one table named HumanResources Department and another called HumanResources Employee. Both contain a column labeled Department ID.
Now, if you'd like to replace the Department ID in the Employee table with the actual department name, Merge is the solution. Start the merge by clicking the “Merge Query” button on the “Home” ribbon. After clicking it, a new menu will appear asking you to choose the second table. At this point, you can select the Department table.
Next, select the columns you want to merge by clicking on them. Finally, you'll need to choose the type of merge. The various merge types are a comprehensive subject that warrants a separate discussion, which I have detailed in another post! Once you click OK, a new column will appear on the right, signaling a successful merge.
Click the “Expand” icon to see which columns you can add to your dataset. If you check the box for “use original column name as prefix,” the new columns will be prefixed with “HumanResources Department.” If you leave this box unchecked, the resulting column names won't have the prefix.
Caution Regarding Duplicates! One common pitfall I encountered when first using the merge function was that the number of rows could increase after expanding the merged table. For instance, there were 290 rows initially, but this jumped to 296 afterward. This typically occurs when duplicates exist in the table being merged. For example, some employees might belong to more than one department due to transitions.
To mitigate this, make sure the column you select for merging in the second table contains unique values. If there are duplicates, clean them out prior to merging to avoid this issue.
Merging on Multiple Columns
You aren’t limited to merging just based on one column from each table. You can also merge using multiple columns! In my example, I created an Excel file that includes columns for Territory ID, StartDate, and EndDate, and I wish to import the Name column into my SalesTerritoryHistory table.
To do this:
Initiate the merge using the merge button.
Hold CTRL to select multiple columns.
As a result, the Name column will be successfully included in your dataset.
Merge with Aggregate
You'll notice an option to aggregate values instead of just expanding them, as seen in the previous examples. I’ll illustrate this function using a different table called Sales_SalesPerson, where you have a salesperson ID, their sales quota, and commission.
When expanding the SalesOrderHeader Table, you can switch from “Expand” to “Aggregate.” At this point, select the columns you wish to aggregate, and choose the aggregation type using the downward arrow. For this instance, I will select the count of customers and the total sales value. After clicking OK, the added columns will be visible in your data.
Power BI Append
Basic Append
Append is utilized when you have two datasets with identical structures and wish to combine them into a single one. For instance, consider the sales data for SalesPersonID = 279 and the similar data for SalesPersonID = 274.
To combine these, hit the append button. A menu will prompt you to select the second table for appending. After this, click OK, and a new step will appear in the applied steps menu, indicating both salespeople's data is now in a single dataset!
Beware of Column Name Discrepancies! Let’s look at the two datasets again. The one for SalesPersonID = 279 has a column labeled “SalesPersonID,” while the other one for SalesPersonID = 276 features a column named “SalesPerson.” If we append these, you’ll end up with two columns - “SalesPersonID” and “SalesPerson” - each will contain empty values for half of the rows.
To prevent this issue, ensure that your column names are consistent. Rename the “SalesPerson” column in the dataset for SalesPersonID = 276 to “SalesPersonID.” This will resolve the problem.
Append Three or More Tables
When you are in the original Order Header - 279 tables and click “append,” you can opt for “Three or more tables.” You will then be able to select additional tables to append from the left. Click “Add” to include these tables, and they will appear on the right. After confirming with OK, the tables will be successfully appended, and you’ll see all SalesPersonIDs consolidated into one table.
Append as New
You have the option to append queries as new. By selecting this option, you will encounter the same menu as before. However, once you finish your append, a new query titled “Append1” will appear in your Power Query. This new table will contain the combined data from all the tables you chose to append.
Power BI Append vs Merge Summary
To conclude, it's essential to remember that the merge function focuses on adding new columns to your dataset, while append concentrates on including new rows. I hope this article aids you in utilizing the Power BI Append and Merge functions effectively!
