Tableau Prep continues to be fantastically helpful with its visual aids and data profiling, even on something as seemingly mundane as applying a join. Now, I want to join spaceshipmfginfo to this freshly joined sales and customer data. If fields have mismatched names, Tableau will create a new field, simply adding nulls to entries where no data exists for that field name. This can be rectified by highlighting the mismatched fields and telling Tableau to merge them together. It will automatically name the new merged field as a combination of the two.
Tableau Desktop Answer: The following instructions use the Sample - Superstore data source. Drag Region to Columns. Drag SUM(Sales) to Rows. Drag Measure Names to Filters and select Profit, Order Quantity and Shipping Cost, then click OK. Drag Measure Values to Rows.
Aggregate, join, or union your data to group or combine data for analysis.
Note: Starting in version 2020.4.1, you can now create and edit flows in Tableau Server and Tableau Online. The content in this topic applies to all platforms, unless specifically noted. For more information about authoring flows on the web, see Tableau Prep on the Web.
Sometimes you’ll need to adjust the granularity of some data, either to reduce the amount of data produced from the flow, or to align data with other data you might want to join or union together. For example, you might want to aggregate sales data by customer before joining a sales table with a customer table.
If you need to adjust the granularity of your data, use the Aggregate option to create a step to aggregate or group data. Whether data is aggregated or grouped depends on the data type (string, number, or date).
In the Flow pane, click the plus icon, and select Aggregate. A new aggregation step displays in the Flow pane and the Profile pane updates to show the aggregate and group profile.
To group or aggregate fields, drag them from the left pane to one of the columns in the right pane.
You can also:
Drag and drop fields between the two panes.
Search for fields in the list and select only the fields you want to include in your aggregation.
Double-click a field to add it to the left or right pane.
Change the function of the field to automatically add it to the appropriate pane.
Click Add All or Remove All to bulk apply or remove fields.
Apply certain cleaning operations to fields. For more information about which cleaning options are available, see About cleaning operations.
Fields are distributed between the Grouped Fields and Aggregated Fields columns based on their data type. Click the group or aggregation type (for example, AVG or SUM) headings to change the group or aggregation type.
In the data grids below the aggregation and group profile, you can see a sample of the members of the group or aggregation.
Any cleaning operations that are made to the fields are tracked in the Changes pane.
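As a rough illustration of what an aggregate step computes, the following Python sketch groups sales rows by customer and sums the sales amounts, which is the kind of granularity change you might make before joining to a customer table. The field names and values are hypothetical, and this is only a sketch of the idea, not how Tableau Prep implements aggregation.

```python
from collections import defaultdict

# Hypothetical sales rows; the field names are assumptions for illustration.
sales = [
    {"customer": "A", "sales": 100.0},
    {"customer": "B", "sales": 40.0},
    {"customer": "A", "sales": 25.0},
]

def aggregate_sum(rows, group_field, value_field):
    """Group rows by group_field and SUM value_field (one output row per group)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_field]] += row[value_field]
    return [{group_field: key, value_field: total} for key, total in totals.items()]

# One row per customer, ready to join with a customer table.
sales_by_customer = aggregate_sum(sales, "customer", "sales")
print(sales_by_customer)
```

Here the grouped field plays the role of a field in the Grouped Fields column and the summed field plays the role of a field in the Aggregated Fields column.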
The data that you want to analyze is often made up of a collection of tables that are related by specific fields. Joining is a method for combining the related data on those common fields. The result of combining data using a join is a table that’s typically extended horizontally by adding fields of data.
Joining is an operation you can do anywhere in the flow. Joining early in a flow can help you understand your data sets and expose areas that need attention right away.
Tableau Prep supports the following join types:
|Left||For each row, includes all values from the left table and corresponding matches from the right table. When a value in the left table doesn't have a corresponding match in the right table, you see a null value in the join results.|
|Inner||For each row, includes values that have matches in both tables.|
|Right||For each row, includes all values from the right table and corresponding matches from the left table. When a value in the right table doesn't have a corresponding match in the left table, you see a null value in the join results.|
|leftOnly||For each row, includes only values from the left table that don't match any values from the right table. Field values from the right table show as null values in the join results.|
|rightOnly||For each row, includes only values from the right table that don't match any values from the left table. Field values from the left table show as null values in the join results.|
|notInner||For each row, includes all of the values from the right and the left table that don't match.|
|Full||For each row, includes all values from both tables. When a value from either table doesn't have a match with the other table, you see a null value in the join results.|
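To make the differences between these join types concrete, here is a minimal Python sketch that applies each of them to two small lists of rows. The `join` function, the table contents, and the field names are hypothetical and exist only for illustration; the sketch mirrors the descriptions in the table above rather than Tableau Prep's actual implementation.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`. `how` is one of the seven join types.

    Note: this sketch compares keys with ==, so two None keys would match,
    which may differ from how a database treats nulls.
    """
    # All-None stand-ins used to pad rows that have no match on the other side.
    null_left = dict.fromkeys({f for row in left for f in row})
    null_right = dict.fromkeys({f for row in right for f in row})

    matched, left_only, right_only = [], [], []
    matched_right = set()
    for l_row in left:
        hits = [i for i, r_row in enumerate(right) if r_row[key] == l_row[key]]
        if hits:
            matched_right.update(hits)
            matched.extend({**l_row, **right[i]} for i in hits)
        else:
            left_only.append({**null_right, **l_row})
    for i, r_row in enumerate(right):
        if i not in matched_right:
            right_only.append({**null_left, **r_row})

    return {
        "inner": matched,
        "left": matched + left_only,
        "right": matched + right_only,
        "full": matched + left_only + right_only,
        "leftOnly": left_only,
        "rightOnly": right_only,
        "notInner": left_only + right_only,
    }[how]

# Hypothetical tables.
customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [{"id": 2, "total": 50}, {"id": 3, "total": 70}]

for how in ("inner", "left", "right", "full", "leftOnly", "rightOnly", "notInner"):
    print(how, join(customers, orders, "id", how))
```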
To create a join, do the following:
Join two tables using one of the following methods:
Note: If you connect to a table that has table relationships defined and includes related fields, you can select Join and select from a list of related tables. Tableau Prep creates the join based on the fields that make up the relationship between the two tables.
For more information about connectors with table relationships, see Join data in the Input step.
A new join step is added to the flow and the profile pane updates to show the join profile.
To review and configure the join, do the following:
Review the Summary of Join Results to see the number of fields included and excluded as a result of the join type and join conditions.
Under Join Type, click in the Venn diagram to specify the type of join you want.
Under Applied Join Clauses, click the plus icon or, on the field chosen for the default join condition, specify or edit the join clause. The fields you selected in the join condition are the common fields between the tables in the join.
You can also click the recommended join clauses shown under Join Clause Recommendations to add the clause to the list of applied join clauses.
The summary in the join profile shows metadata about the join to help you validate that the join includes the data you expect.
Applied Join Clauses: By default, Tableau Prep defines the first join clause based on common field names in the tables being joined. Add or remove join clauses as needed.
Join Type: By default, when you create a join, Tableau Prep uses an inner join between the tables. Depending on the data that you connect to, you might be able to use left, inner, right, leftOnly, rightOnly, notInner, or full joins.
Summary of Join Results: The Summary of Join Results shows you the distribution of values that are included and excluded from the tables in the join.
Click each Included bar to isolate and see the data in the join profile included in the join.
Click each Excluded bar to isolate and see the data in the join profile that are excluded from the join.
Click any combination of the Included and Excluded bars to see a cumulative perspective of the data.
Join Clause Recommendations: Click the plus icon next to the recommended join clause to add it to the Applied Join Clauses list.
Join Clauses pane: In the Join Clauses pane, you can see the values in each field in the join clause. The values that don't meet the criteria for the join clause are displayed in red text.
Join Results pane: If you see values in the Join Results pane that you want to change, you can edit the values in this pane.
If you don't see the results you expect after joining your data, you may need to do some additional cleaning on your field values. The following issues cause Tableau Prep to read the values as not matching and to exclude them from the join:
Different capitalization: My Sales and my sales
Different spelling: Hawaii and Hawai'i
Misspelling or data entry errors: My Company Health and My Company Heath
Name changes: John Smith and John Smith Jr.
Abbreviations: My Company Limited and My Company Ltd
Extra separators: Honolulu and Honolulu (Hawaii)
Extra spaces: This includes extra space between characters, tabbed spaces or extra leading or trailing spaces
Inconsistent use of periods: Returned, not needed. and Returned, not needed
The good news is that if your field values have any of these issues, you can fix the field values directly in the Join Clauses pane, or work with excluded values by clicking the Excluded bars in the Summary of Join Results and using the cleaning operations in the profile card menu.
For more information about the different cleaning options available in the Join step, see About cleaning operations.
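Some of the mismatches listed above (capitalization, extra spaces, trailing periods) are mechanical and can be described as a normalization of the join key. The sketch below shows one possible normalization in Python; the `normalize_key` function is hypothetical and is not a Tableau Prep feature. Differences in spelling, abbreviations, or name changes are not mechanical and still need to be fixed by editing the values.

```python
import re

def normalize_key(value):
    """Normalize a join-key string so that cosmetic differences do not block a match."""
    text = re.sub(r"\s+", " ", value).strip()  # tabs and repeated spaces become one space
    text = text.rstrip(".")                     # ignore an inconsistent trailing period
    return text.casefold()                      # ignore capitalization

print(normalize_key("  My   Sales ") == normalize_key("my sales"))
print(normalize_key("Returned, not needed.") == normalize_key("Returned, not needed"))
# Spelling differences are not resolved by normalization.
print(normalize_key("Hawaii") == normalize_key("Hawai'i"))
```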
You can fix mismatched fields right in the join clause. Double-click or right-click the value and select Edit Value from the context menu on the field that you want to fix and enter a new value. Your data changes are tracked and added to the Changes pane right in the Join step.
You can also select multiple values to keep, exclude or filter in the Join Clauses panes, or apply other cleaning operations in the Join Results pane. Depending on which fields you change and where they are in the join process, your change is applied either before or after the join to give you the corrected results.
For more information about cleaning fields, see Apply cleaning operations.
Union is a method for combining data by appending rows of one table onto another table. For example, you might want to add new transactions in one table to a list of past transactions in another table. Make sure the tables you union have the same number of fields, the same field names, and the fields are the same data type.
Tip: To maximize performance, a single union can have a maximum of 10 inputs. If you need to union more than 10 files or tables, try unioning files in the Input step. For more information about this type of union, see Union files and database tables in the Input step.
Similar to a join, you can use the union operation anywhere in the flow.
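The following Python sketch illustrates the idea of a union that matches fields by name: rows are appended, and a field that is missing from one of the inputs is filled with null (`None`) for that input's rows. The `union` function and the table contents are hypothetical; the sketch is meant to show why differently named fields produce extra, partly empty fields, not to reproduce Tableau Prep's behavior exactly.

```python
def union(tables):
    """Append the rows of several tables (name -> list of dicts), matching fields by name.

    Returns the unioned rows and the list of fields that at least one table lacks.
    """
    # Every field name seen across the inputs, in first-seen order.
    fields = []
    for rows in tables.values():
        for row in rows:
            for field in row:
                if field not in fields:
                    fields.append(field)

    # Missing fields are filled with None (null).
    result = [{f: row.get(f) for f in fields} for rows in tables.values() for row in rows]

    # A field is mismatched if some input table never supplies it.
    mismatched = [
        f for f in fields
        if any(all(f not in row for row in rows) for rows in tables.values())
    ]
    return result, mismatched

# Hypothetical inputs: the second table names the customer field differently.
tables = {
    "past_transactions": [{"order_id": 1, "customer": "Ann"}],
    "new_transactions": [{"order_id": 2, "cust": "Bob"}],
}
rows, mismatched = union(tables)
print(rows)
print(mismatched)
```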
To create a union, do the following:
After you add at least two tables to the flow pane, select and drag a related table to the other table until you see the Union option. You can also click the plus icon and select Union from the menu. A new union step is added in the Flow pane, and the Profile pane updates to show the union profile.
Add additional tables to the union by dragging tables toward the unioned tables until you see the Add option.
In the union profile, review the metadata about the union. You can remove tables from the union as well as see details about any mismatched fields.
After you create a union, inspect the results of the union to validate that the data in the union is what you expect. To validate your unioned data, check the following areas:
Review the union metadata: The union profile shows some metadata about the union. Here you can see the tables that make up the union, the resulting number of fields and any mismatched fields.
Review the colors for each field: Next to each field listed in the Union summary and above each field in the union profile, is a set of colors. The colors correspond to each table in the union.
If all table colors show for that field, then the union performed correctly for that field. A missing table color indicates that you have mismatched fields.
Mismatched fields are fields that might have similar data but are different in some way. You can see the list of fields that don't match in the Union summary and the tables where they came from. If you want to take a closer look at the data in the fields, select the Show only mismatched fields check box to isolate the mismatched fields in the Union profile.
To fix these fields, follow one of the suggestions in the Fix fields that don’t match section below.
When tables in a union don’t match, the union produces extra fields. The extra fields are valid data being excluded from their appropriate context.
To resolve a field mismatch issue, you must merge the mismatched fields together.
There are a number of reasons why fields might not match.
Corresponding fields have different names: If corresponding fields between tables have different names, you can use union recommendations, manually merge fields in the Mismatched Fields list, or rename the field in the union profile to merge the mismatched fields together.
To use union recommendations, do the following:
In the Mismatched Fields list, click a mismatched field. If a suggested match exists, the matching field is highlighted in yellow.
Suggested matches are based on fields with similar data types and field names.
Hover on the highlighted field and click the plus button to merge the fields.
To manually merge fields in the Mismatched Fields list, do the following:
Select one or more fields in the list.
Right-click or Ctrl-click (MacOS) a selected field and if the merge is valid, the Merge Fields menu option appears.
If you see No options available when you right-click the field, the fields are not eligible to merge. For example, this happens when you try to merge two fields from the same input.
Click Merge Fields to merge the selected fields.
To rename the field in the union profile pane, right-click the field name and click Rename Field.
Corresponding fields have the same name but are a different type: By default, when the names of corresponding fields match but their data types don’t, Tableau Prep changes the data type of one of the fields so they are compatible with each other. If Tableau Prep makes this change, it’s noted at the top of the merged field by the Change Data Type icon.
In some cases, Tableau Prep might not pick the correct data type. If that happens and you want to undo the merge, right-click or Ctrl-click (MacOS) the Change Data Type icon and select Separate Inputs with Different Types.
You can then merge the fields again by first changing the data type of one of the fields and then using the suggestions in Additional merge field options.
Corresponding tables have a different number of fields: To union tables, each table in the union must contain the same number of fields. If a union results in extra fields, merge each extra field into an existing field.
In addition to the methods described in the above section for merging fields, you can also use one of the following methods to merge fields. You can merge fields in any step, except for the Output step.
For information about how to merge fields in the same file, see Merge fields.
To merge fields, do one of the following:
Drag and drop one field onto another. A Drop to merge fields indicator displays.
Select multiple fields and right-click within the selection to open the context menu, and then click Merge Fields.
Select multiple fields, and then click Merge Fields on the context-sensitive toolbar.
You can union your data to combine two or more tables by appending values (rows) from one table to another. To union your data in a Tableau data source, the tables must come from the same connection.
If your data source supports union, the New Union option displays in the left pane of the data source page after you connect to your data. Supported connectors may vary between Tableau Desktop, Tableau Server, and Tableau Online.
For best results, the tables that you combine using a union must have the same structure. That is, each table must have the same number of fields, and related fields must have matching field names and data types.
For example, suppose you have the following customer purchase information stored in three tables, separated by month. The table names are 'May2016,' 'June2016,' and 'July2016.'
A union of these tables creates the following single table that contains all rows from all tables.
Use this method to manually union distinct tables. This method allows you to drag individual tables from the left pane of the Data Source page and into the Union dialog box.
On the data source page, double-click New Union to set up the union.
Drag a table from the left pane to the Union dialog box.
Select another table from the left pane and drag it directly below the first table.
Tip: To add multiple tables to a union at the same time, press Shift or Ctrl (Shift or Command on a Mac), select the tables you want to union in the left pane, and then drag them directly below the first table.
Click Apply or OK to union.
Use this method to set up search criteria to automatically include tables in your union. Use the wildcard character, which is an asterisk (*), to match a sequence or pattern of characters in the Excel workbook and worksheet names, Google Sheets workbook and worksheet names, text file names, JSON file names, .pdf file names, and database table names.
When working with Excel, text file, JSON file, or .pdf file data, you can also use this method to union files across folders, and worksheets across workbooks. Search is scoped to the selected connection. The connection and the tables available in a connection are shown on the left pane of the Data source page.
On the data source page, double-click New Union to set up the union.
Click Wildcard (automatic) in the Union dialog box.
Enter the search criteria that you want Tableau to use to find tables to include in the union.
For example, you can enter *2016 in the Include text box to union tables in Excel worksheets that end with '2016' in their names. Search criteria like this will result in the union of May2016, June2016, and July2016 tables (Excel worksheets), from the selected connection. In this case, the connection is called Sales, and the connection made to the Excel workbook containing the worksheets you wanted was in the quarter_3 folder in the sales directory (e.g., Z:\sales\quarter_3).
Click Apply or OK to union.
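The wildcard pattern from the example can be tried out in Python with the standard `fnmatch` module, which also uses an asterisk to match any sequence of characters. This is only an analogy for how the *2016 pattern selects names: the list of sheet names is hypothetical, and `fnmatch` additionally gives special meaning to `?` and `[...]`, which this topic does not describe for Tableau.

```python
from fnmatch import fnmatchcase

# Hypothetical worksheet names in the connected workbook.
sheet_names = ["May2016", "June2016", "July2016", "Summary", "May2017"]

# "*2016" matches any name that ends with 2016.
matches = [name for name in sheet_names if fnmatchcase(name, "*2016")]
print(matches)
```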
The tables initially available to union are scoped to the connection you've selected. If you want to union more tables that are located outside of the current folder (for Excel, text, JSON, .pdf files) or in a different workbook (for Excel worksheets), select one or both check boxes in the Union dialog box to expand your search.
For example, suppose you want to union all Excel worksheets that end with '2016' in their names outside of the current folder. The initial connection is made to an Excel workbook located in the same directory as in the above example, Z:\sales\quarter_3.
Include: If you enter *2016 in the Include text box and leave the remaining search criteria of the dialog as is, Tableau looks for all Excel worksheets that end with '2016' in their names inside the current folder.
In the diagram below, the yellow highlighted item represents the current location, that is, the Excel workbook that you created a connection to in the 'quarter_3' folder. The green box represents the tables belonging to workbooks and sheets that are unioned as a result of this search criteria.
Include + Expand search to subfolders: If you enter *2016 in the Include text box and select the Expand search to subfolders check box, Tableau does the following:
Looks for all Excel worksheets that end with '2016' in their names inside the current folder.
Looks for additional Excel worksheets that end with '2016' in their names that are located in Excel workbooks in subfolders of the 'quarter_3' folder.
In the diagram below, the yellow highlighted item represents the current location, that is, the Excel workbook that you created a connection to in the 'quarter_3' folder. The green box represents the tables belonging to workbooks and worksheets that are unioned as a result of this search criteria.
Include + Expand search to parent folder: If you enter *2016 in the Include text box and select the Expand search to parent folder check box, Tableau does the following:
Looks for all Excel worksheets that end with '2016' in their names inside the current folder, 'quarter_3.'
Looks for additional Excel worksheets that end with '2016' in their names that are located in parallel folders of the 'quarter_3' folder. In this example, 'quarter_4' is the parallel folder.
In the diagram below, the yellow highlighted item represents the current location, that is, the Excel workbook that you created a connection to in the 'quarter_3' folder. The green boxes represent the tables belonging to the workbook and worksheets that are unioned as a result of this search criteria.
Include + Expand search to subfolders + Expand search to parent folder: If you enter *2016 in the Include text box and select both check boxes, Tableau does the following:
Looks for all Excel worksheets that end with '2016' in their names inside the current folder, 'quarter_3.'
Looks for additional Excel workbooks that are located in the subfolders of the current folder, 'quarter_3.'
Looks for additional Excel workbooks that are located in parallel folders and subfolders of the 'quarter_3' folder. In this example, 'quarter_4' is the parallel folder.
In the diagram below, the yellow highlighted item represents the current location, that is, the Excel workbook that you created a connection to. The green box represents the tables belonging to the workbook and worksheets that are unioned as a result of this search criteria.
Note: When working with Excel data, wildcard search includes named ranges but excludes tables found by Data Interpreter.
Perform basic union tasks directly in the canvas of the Data Source page.
To rename a union
Double-click the logical table that contains unioned physical tables.
Double-click the union table on the physical layer canvas.
Enter a new name for the union.
To add or remove tables in the union
Double-click the logical table that contains unioned physical tables.
Click the union drop-down arrow and then select Edit Union.
Drag additional tables that you want to union from the left pane, or hover over a table until the remove icon displays and then click the icon to remove the table.
Click Apply or OK to complete the task.
To remove the union
Double-click the logical table that contains unioned physical tables, and then click the union drop-down arrow and select Remove.
Tables in a union are combined by matching field names. When working with Excel, Google Sheets, text file, JSON file or .pdf file data, if there are no matching field names (or your tables do not contain column headers), you can tell Tableau to combine tables based on the order of the fields in the underlying data by creating the union and then selecting Generate field names automatically option from the union drop-down menu.
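When tables have no usable headers, combining them by field order can be pictured as follows. In this Python sketch each row is a plain list of values, and generic names are assigned by position; the names `F1`, `F2`, and so on, the `union_by_position` function, and the data are all assumptions made for illustration, not a description of the exact names Tableau generates.

```python
def union_by_position(tables):
    """Union header-less tables (lists of value lists) by column position."""
    all_rows = [row for rows in tables for row in rows]
    width = max((len(row) for row in all_rows), default=0)
    # Assumed generic field names: F1, F2, ...
    names = [f"F{i + 1}" for i in range(width)]
    # Pad short rows with None so every row has the same fields.
    return [dict(zip(names, list(row) + [None] * (width - len(row)))) for row in all_rows]

# Hypothetical header-less tables.
may = [["Ann", 2], ["Bob", 1]]
june = [["Cara", 5]]
combined = union_by_position([may, june])
print(combined)
```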
After you create a union, additional fields about the union are generated and added to the grid. The new fields provide information about where the original values in the union come from, including the sheet and table names. These fields are useful when unique information that is critical to your analysis is embedded in the sheet or table name.
For example, the tables used in the example above have unique month and year information stored in the table name instead of in the data itself. In this case, you can use the Table Name field that is generated by the union to access this information and use it in your analysis.
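As a sketch of how a generated Table Name field could be put to use, the Python code below attaches the source table name to each unioned row and then parses the month and year out of names such as 'May2016'. The row contents and the extra `Month` field are hypothetical; only the table names come from the example above.

```python
from datetime import datetime

# Hypothetical rows for two of the monthly tables from the example.
tables = {
    "May2016": [{"Customer": "Ann", "Purchases": 2}],
    "June2016": [{"Customer": "Bob", "Purchases": 1}],
}

# Union the rows and record where each row came from, like the Table Name field.
unioned = [
    {**row, "Table Name": name}
    for name, rows in tables.items()
    for row in rows
]

def month_of(table_name):
    """Parse a name like 'May2016' into the first day of that month."""
    return datetime.strptime(table_name, "%B%Y").date()

for row in unioned:
    row["Month"] = month_of(row["Table Name"])

print(unioned)
```

The parsing step assumes full English month names followed by a four-digit year; other naming schemes would need a different format string.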
If a named range is used in a union, null values display under the Sheet field.
Note: You can use the fields generated by a union, such as Sheet or Table Name, as join keys. You can use a unioned table in a join with another table or unioned table.
When field names in the union do not match, fields in the union contain null values. You can merge the non-matching fields into a single field using the merge option to remove the null values. When you use the merge option, the original fields are replaced by a new field that displays the first non-null value for each row in the non-matching fields.
You can also create your own calculation or, if possible, modify the underlying data to combine the non-matching fields.
For example, suppose a fourth table, 'August2016', is added to the underlying data. Instead of the standard 'Customer' field name, it contains an abbreviated version called 'Cust.'
A union of these tables creates a single table that contains all rows from tables, with several null values. You can use the merge option to combine the related customer fields into a single field.
Union (with null values)
Union (with columns that have been merged)
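The merge behavior described above, where the new field shows the first non-null value for each row, can be sketched in Python as follows. The `merge_fields` function and the sample rows are hypothetical and only illustrate the first-non-null rule using the 'Customer' and 'Cust' fields from the example.

```python
def merge_fields(rows, sources, target):
    """Replace the `sources` fields with one `target` field holding the first non-null value."""
    merged_rows = []
    for row in rows:
        # Keep every field except the ones being merged.
        merged = {k: v for k, v in row.items() if k not in sources}
        # First non-null value among the source fields, or None if all are null.
        merged[target] = next(
            (row.get(s) for s in sources if row.get(s) is not None), None
        )
        merged_rows.append(merged)
    return merged_rows

# Hypothetical unioned rows with null values caused by the 'Cust' field name.
unioned = [
    {"Customer": "Ann", "Cust": None, "Purchases": 2},
    {"Customer": None, "Cust": "Bob", "Purchases": 1},
]
merged = merge_fields(unioned, ["Customer", "Cust"], "Customer")
print(merged)
```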
After you merge fields, you can use the field generated from the merge in a pivot or split, or use the field as a join key. You can also change the data type of the field generated from a merge.
To merge mismatched fields
Select two or more columns in the grid.
Click the column drop-down arrow, and then select Merge mismatched fields.
A unioned table can be used in a join.
A unioned table can be used in a join with another unioned table.
The fields generated by a union, Sheet and Table Name, can be used as the join key.
If a named range is used in union, null values display under the Sheet field.
The field generated from a merge can be used as a join key.
The data type of the field generated from a merge can be changed.
Union tables from within the same connection. That is, you cannot union tables from different databases.
When working with Excel data, wildcard search includes named ranges but excludes tables found by Data Interpreter.
The field generated from a merge can be used in a pivot or split.
To union a JSON file, it must have a .json, .txt, or .log extension. For more information about working with JSON data, see JSON File.
When using wildcard search to union tables in a .pdf file, the result of the union is scoped to the pages that were scanned in the initial .pdf file you connected to. For more information about working with data in .pdf files, see PDF File.
Stored procedures cannot be unioned.
When working with database data, you can convert your union into custom SQL.