Merging Data Files in SPSS

To merge SPSS data files that contain the same cases but different variables, use MATCH FILES. Make sure your case identifier, if any, does not contain duplicate values and that the files are sorted in ascending order on it. The result contains all cases from both files (like a full outer join in SQL). The same merges are available from the menus under Data > Merge Files, which can combine the active dataset with another open dataset or SPSS Statistics data file containing either the same variables but different cases, or the same cases but different variables.

1. Introduction

When you have two data files, you can combine them by merging them side by side, matching up observations based on an identifier. For example, below we have a file containing dads and we have a file containing faminc. We would like to match merge the files together so we have the dads observation on the same line with the faminc observation based on the key variable famid.

After match merging the dads and faminc, the data would look like this.

2. One-to-one merge

Let’s start by creating the files that we will be merging. Below we create the files dads.sav and faminc.sav.
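The original data listing is not reproduced in this text, so the following is only a minimal sketch of how the two files could be created. The variable names famid, name, inc, faminc96, faminc97 and faminc98 come from the text; the data values and formats are illustrative assumptions, and the files are written to the current working directory.

```spss
* Illustrative values only, entered unsorted on purpose.
DATA LIST LIST / famid (F8.0) name (A4) inc (F8.0).
BEGIN DATA
2 Art 22000
1 Bill 30000
3 Paul 25000
END DATA.
SAVE OUTFILE='dads.sav'.
LIST.

* Family income for the same three families (values are assumptions).
DATA LIST LIST / famid (F8.0) faminc96 (F8.0) faminc97 (F8.0) faminc98 (F8.0).
BEGIN DATA
3 75000 76000 77000
1 40000 40500 41000
2 45000 45400 45800
END DATA.
SAVE OUTFILE='faminc.sav'.
LIST.
```

Each LIST command prints the cases just read, which is a quick way to confirm that the data were entered as intended.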

The output of these statements is shown below, confirming that we have read the data properly.

There are three steps to match merge dads.sav with faminc.sav. (Note that this is a one to one merge because there is a one to one correspondence between the dads and faminc records.) These three steps are illustrated below.

  1. Use SORT CASES to sort dads on famid and save that file (we will call it dads2.sav)
  2. Use SORT CASES to sort faminc on famid and save that file (we will call it faminc2.sav)
  3. Use MATCH FILES to merge the dads2.sav and faminc2.sav files based on famid

Below we show the commands for performing the merge.
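The three steps listed above can be sketched as follows, assuming dads.sav and faminc.sav already exist in the current working directory and both contain the key variable famid.

```spss
* Step 1: sort dads on famid and save as dads2.sav.
GET FILE='dads.sav'.
SORT CASES BY famid.
SAVE OUTFILE='dads2.sav'.

* Step 2: sort faminc on famid and save as faminc2.sav.
GET FILE='faminc.sav'.
SORT CASES BY famid.
SAVE OUTFILE='faminc2.sav'.

* Step 3: match merge the two sorted files on famid.
MATCH FILES FILE='dads2.sav'
  /FILE='faminc2.sav'
  /BY famid.
LIST.
```

The final LIST both displays the merged cases and forces SPSS to carry out the pending MATCH FILES transformation.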

The output below shows that the match merge worked properly.

3. One-to-many merge

The next example considers a one to many merge where one observation in one file may have multiple matching records in another file. Imagine that we had a file with dads like we saw in the previous example, and we had a file with kids where a dad could have more than one kid. You see why this is called a one to many merge since you are matching one dad observation to one or more (many) kids observations. Remember that the dads file is the file with one observation, and the kids file is the one with many observations. Below, we create the data file for the dads and for the kids.
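As a minimal sketch, the two files could be created as shown here. The kids variables kidname, birth and age, and all data values, are illustrative assumptions, since the original listing is not reproduced in this text.

```spss
* One record per dad (illustrative values).
DATA LIST LIST / famid (F8.0) name (A4) inc (F8.0).
BEGIN DATA
2 Art 22000
1 Bill 30000
3 Paul 25000
END DATA.
SAVE OUTFILE='dads.sav'.
LIST.

* One or more kids per family (hypothetical variables and values).
DATA LIST LIST / famid (F8.0) kidname (A4) birth (F8.0) age (F8.0).
BEGIN DATA
1 Beth 1 9
1 Bob 2 6
2 Andy 1 8
2 Al 2 6
3 Pete 1 6
3 Pam 2 4
END DATA.
SAVE OUTFILE='kids.sav'.
LIST.
```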

As you see below, the steps for doing a one to many merge are similar to those for the one to one merge that we saw above.

  1. Use SORT CASES BY to sort dads on famid and save that file (we will call it dads2)
  2. Use SORT CASES BY to sort kids on famid and save that file (we will call it kids2)
  3. Use MATCH FILES to merge the dads2 and kids2 files. However, since the dads file is the file with one observation, use /TABLE='dads2.sav', not /FILE='dads2.sav' to specify the dads file.
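These steps can be sketched as follows, assuming dads.sav and kids.sav exist in the current working directory and share the key variable famid.

```spss
* Sort both files on the key variable.
GET FILE='dads.sav'.
SORT CASES BY famid.
SAVE OUTFILE='dads2.sav'.

GET FILE='kids.sav'.
SORT CASES BY famid.
SAVE OUTFILE='kids2.sav'.

* The "many" file goes on FILE and the "one" file goes on TABLE.
MATCH FILES FILE='kids2.sav'
  /TABLE='dads2.sav'
  /BY famid.
LIST.
```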

The output below shows that this merge worked as we hoped.

The key difference between a one to one merge and a one to many merge is that you need to use /TABLE='dads2.sav' instead of /FILE='dads2.sav'. For your data, when you do a one to many merge, ask yourself which file plays the role of one (in one to many). For that file, use /TABLE= instead of /FILE=.

Let’s intentionally make an error and use /FILE='dads2.sav' and see what SPSS does.
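A sketch of the erroneous command, assuming the sorted files dads2.sav and kids2.sav from the steps above already exist:

```spss
* Deliberately wrong: the dads file is named on FILE instead of TABLE.
MATCH FILES FILE='dads2.sav'
  /FILE='kids2.sav'
  /BY famid.
LIST.
```

The only change from the correct version is the subcommand used for dads2.sav, which is why this mistake is easy to make and worth checking for.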

The first thing we notice is that SPSS gives us the warning shown below. This is telling us that there are multiple kids for a given dad.

As SPSS advises, we will inspect the results carefully. Indeed, we see the results are not what we desired. When there were multiple kids per dad, it only merged the dad with the first kid, and then the following kids with the same dads were assigned missing values for the dads information (name and inc). When we used the /TABLE= subcommand in the previous example, SPSS carried the dads information across all of the kids.

4. Ordering the variables in the new file

You can use the /MAP subcommand with the MATCH FILES command to see the order of the variables in the new file, as illustrated below. If you would like to rearrange the order of the variables in the new file, you can also add the /KEEP subcommand to the MATCH FILES command. The variables will be ordered in the new file in the order that you list them on the /KEEP subcommand. If you do not list all of the variables on the /KEEP subcommand, the variables not listed will not be present in the new file. Also note that you can list just the first few variables if they are the only ones that need to be reordered, and then use the keyword ALL to have the rest of the variables included in the new file. The variables covered by ALL will remain in the order in which they appear in the original files.
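A minimal sketch of both subcommands, assuming the sorted files dads2.sav and faminc2.sav from the one-to-one example. Whether the original example used ALL is not known; it is included here as an assumption so that the faminc variables are retained after the three reordered variables.

```spss
* KEEP sets the variable order, ALL keeps the remaining variables,
* and MAP prints the variables of the resulting file.
MATCH FILES FILE='dads2.sav'
  /FILE='faminc2.sav'
  /BY famid
  /KEEP=name famid inc ALL
  /MAP.
LIST.
```

Dropping the keyword ALL from this sketch would leave only name, famid and inc in the new file.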

As you can see, the variables in the new file are now in the order name, famid, inc.

5. Problems to look out for

5.1 Mismatching records in one-to-one merge

The two data files may have records that do not match. Below we illustrate this by including an extra dad (Karl in famid 4) who does not have a corresponding family, and there are two extra families (5 and 6) in the family file that do not have a corresponding dad.
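The following sketch builds such files and merges them with an /IN= subcommand on each input file. Karl in famid 4 and the extra families 5 and 6 come from the text; every data value is an illustrative assumption.

```spss
* Dads file with an extra dad (Karl, famid 4).
DATA LIST LIST / famid (F8.0) name (A4) inc (F8.0).
BEGIN DATA
2 Art 22000
1 Bill 30000
3 Paul 25000
4 Karl 95000
END DATA.
SORT CASES BY famid.
SAVE OUTFILE='dads2.sav'.

* Family file with two extra families (famid 5 and 6).
DATA LIST LIST / famid (F8.0) faminc96 (F8.0) faminc97 (F8.0) faminc98 (F8.0).
BEGIN DATA
3 75000 76000 77000
1 40000 40500 41000
2 45000 45400 45800
5 55000 65000 70000
6 22000 24000 28000
END DATA.
SORT CASES BY famid.
SAVE OUTFILE='faminc2.sav'.

* Each IN creates a 0/1 flag for the file named just before it.
MATCH FILES FILE='dads2.sav' /IN=fromdad
  /FILE='faminc2.sav' /IN=fromfam
  /BY famid.
LIST.
CROSSTABS /TABLES=fromdad BY fromfam.
```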

As you see above, we use /IN=fromdad to create a 0/1 variable that indicates whether the resulting file contains a record with data from the dads file. Likewise, we use /IN=fromfam to indicate if the resulting file has a record from the faminc file. The LIST and CROSSTABS commands then show us the mismatching records.

The output from the LIST command shows us what happened when there were mismatching records. For famid 4, the value of fromdad is 1 and fromfam is 0, as we would expect since there was data from dads for famid 4, but no data from faminc. Also, as we expect, this record has valid data for the variables from the dads file (name and inc) and missing data for the variables from faminc (faminc96, faminc97 and faminc98). We see the reverse pattern for famid 5 and 6.

If we look at the fromdad and fromfam variables, we can see that there are three records that have matching data, one that has data from the dads file only, and two records that have data from the faminc file only. The crosstab below shows us the same results, and is easier than tallying the matches by hand.

When matching files, we suggest that you use this strategy to check the matching of the two files. If there are unexpected mismatched records, then you should investigate to understand the cause of the mismatched records.

You can use SELECT IF to eliminate some of the non-matching records. For example, if you wanted to keep just the records where the dads matched with the family information, you could type
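A minimal sketch of such a command, assuming the active dataset is the merged file and contains the indicator variables fromdad and fromfam created with /IN=:

```spss
* Keep only the records that have data from both files.
SELECT IF (fromdad = 1 AND fromfam = 1).
LIST.
```

Note that SELECT IF permanently drops the unselected cases from the active dataset, so save the full merged file first if you may need the mismatched records later.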

The results are shown below, including just the three matching records.

5.2 Mismatching records in one-to-many merge

SPSS handles the inclusion of mismatched records in a one-to-many merge differently than in a one-to-one merge. Remember that in a one-to-many merge, there is a file that has one observation that matches to many observations in the other file; let us refer to these as the one file and the many file. If there are observations in the one file that do not match to the many file, then these observations will not appear in the merged file at all. If there are observations in the many file that do not match the one file, those records will appear in the merged file. If this is what you desire, then you can merge the files as illustrated in Section 3, and use the /IN= subcommand as illustrated in the prior section to track the matching. However, if you would like mismatched records from both the one file and the many file to appear in the merged file, then you can use the matching strategy outlined below.

Below we use our example to merge dads with kids, and in this example we have mismatched records in both files. Below we match the files to include all mismatched records in the merged file. The parts that are different are indicated in red.
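The strategy can be sketched as follows. The file names dadid and temp come from the text; the data values, the kids variables, and the unmatched kid in famid 5 are illustrative assumptions. The block marked "Extra step" corresponds to the part the text describes as shown in red.

```spss
* Dads file: Karl (famid 4) has no kids.
DATA LIST LIST / famid (F8.0) name (A4) inc (F8.0).
BEGIN DATA
2 Art 22000
1 Bill 30000
3 Paul 25000
4 Karl 95000
END DATA.
SORT CASES BY famid.
SAVE OUTFILE='dads2.sav'.

* Kids file: the kid in famid 5 has no dad.
DATA LIST LIST / famid (F8.0) kidname (A4) birth (F8.0) age (F8.0).
BEGIN DATA
1 Beth 1 9
1 Bob 2 6
2 Andy 1 8
3 Pete 1 6
3 Pam 2 4
5 Ned 1 7
END DATA.
SORT CASES BY famid.
SAVE OUTFILE='kids2.sav'.

* Extra step: add every famid from the dads file to the kids file.
GET FILE='dads2.sav' /KEEP=famid.
SAVE OUTFILE='dadid.sav'.
MATCH FILES FILE='dadid.sav'
  /FILE='kids2.sav'
  /BY famid.
SAVE OUTFILE='temp.sav'.

* One-to-many merge as before, using temp in place of kids2.
MATCH FILES FILE='temp.sav'
  /TABLE='dads2.sav'
  /BY famid.
LIST.
```

Because kids2.sav has several records per famid, SPSS may repeat its duplicate-key warning during the extra step. That is harmless here, since dadid.sav contributes only the key variable and no other values can be lost.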

The section in red adds an extra step to the matching. The purpose of this step is to add any values of famid that are only in the dads file to the kids file. It does this by doing a one-to-one merge between dadid and the kids file and saving the result as temp. Since dadid contains just the famid of each of the dads, this merge adds an observation for any famid that is in the dads file but not in the kids file. We can then merge temp with dads2, and temp will have a famid for every observation in the dads2 file. This ensures that the resulting file will include all observations from the dads file, even if they do not have a matching record in the kids file. The result is shown below. Indeed, the file contains the observation for the dad Karl, who does not have any matching kids. If we omitted the extra code in this step, that record would not have been included in this file.

5.3 Variables with the same name, but different information

Below we have the files with the information about the dads and family, but look more closely at the names of the variables. In the dads file, there is a variable called inc98, and in the family file there are variables inc96, inc97 and inc98. Let’s go ahead and merge these files and see what SPSS does.
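A minimal sketch of this situation, using the variable names from the text (inc98 in the dads file; inc96, inc97 and inc98 in the family file) with illustrative data values:

```spss
* Dads file with a variable named inc98.
DATA LIST LIST / famid (F8.0) name (A4) inc98 (F8.0).
BEGIN DATA
2 Art 22000
1 Bill 30000
3 Paul 25000
END DATA.
SORT CASES BY famid.
SAVE OUTFILE='dads2.sav'.

* Family file that also has a variable named inc98.
DATA LIST LIST / famid (F8.0) inc96 (F8.0) inc97 (F8.0) inc98 (F8.0).
BEGIN DATA
3 75000 76000 77000
1 40000 40500 41000
2 45000 45400 45800
END DATA.
SORT CASES BY famid.
SAVE OUTFILE='faminc2.sav'.

* Merge without renaming; both files supply inc98.
MATCH FILES FILE='dads2.sav'
  /FILE='faminc2.sav'
  /BY famid.
LIST.
```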

The results are shown below. As you see, the variable inc98 has the data from the dads file, the file that appeared first in the MATCH FILES command. When you match files that have the same variable, SPSS will use the values from the file that appears earliest in the MATCH FILES command.

There are a couple of ways you can solve this problem.

Solution #1. The most obvious solution is to choose variable names in the original files that will not conflict with each other. However, you may receive files where the names have already been chosen.

Solution #2. You can rename the variables in the MATCH FILES command (which renames the variables before doing the matching). This allows you to select variable names that do not conflict with each other, as illustrated below.
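A sketch of this approach, assuming sorted files dads2.sav (with inc98) and faminc2.sav (with inc96, inc97 and inc98). The new names dadinc98 and faminc96 to faminc98 are hypothetical choices, not necessarily those of the original example.

```spss
* Each RENAME applies to the file named just before it.
MATCH FILES FILE='dads2.sav' /RENAME=(inc98 = dadinc98)
  /FILE='faminc2.sav' /RENAME=(inc96 inc97 inc98 = faminc96 faminc97 faminc98)
  /BY famid.
LIST.
```

The renaming affects only the merged result; the variable names stored in the input files on disk are unchanged.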

As you can see below, the variables were renamed as we specified.

5.4 The same variables with different dictionary information

This problem is similar to the one outlined above. In this example, we have two variables with the same name and the same information, but with different dictionary information associated with them. This dictionary information could include value labels and/or variable labels. As with the example above, SPSS will take the information from the file listed first in the MATCH FILES command. No error or warning message will be issued to let you know that the information from the variable in the later file has been lost. The solution to this problem is to list first in the MATCH FILES command the file with the dictionary information that you want in the resulting file.
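To see the effect, one could try a sketch like the following. The file names a.sav and b.sav, the variable labels, and the data values are all hypothetical. Because b.sav is listed first, its label for inc is the one expected in the merged file.

```spss
* File A: inc carries one variable label.
DATA LIST LIST / famid (F8.0) inc (F8.0).
BEGIN DATA
1 30000
2 22000
END DATA.
VARIABLE LABELS inc 'Income (label from file A)'.
SAVE OUTFILE='a.sav'.

* File B: same variable and values, different label.
DATA LIST LIST / famid (F8.0) inc (F8.0).
BEGIN DATA
1 30000
2 22000
END DATA.
VARIABLE LABELS inc 'Income (label from file B)'.
SAVE OUTFILE='b.sav'.

* The file listed first supplies the dictionary information.
MATCH FILES FILE='b.sav'
  /FILE='a.sav'
  /BY famid.
EXECUTE.
DISPLAY DICTIONARY.
```

Swapping the order of the two input files in the MATCH FILES command is a simple way to check which label ends up in the result.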

5.5 You have run the MATCH FILES command, and nothing happened

If you run just the MATCH FILES command, as shown below, SPSS will not do anything. However, you will see a note in the lower right corner of the Data Editor saying 'Transformations pending'.

Solution: Add either the EXECUTE command or a procedure command that will force the execution of the transformation, such as the LIST command or the CROSSTABS command.
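For instance, assuming the sorted files dads2.sav and faminc2.sav from the earlier examples, a sketch of the fix is:

```spss
* MATCH FILES alone is left pending until something reads the data.
MATCH FILES FILE='dads2.sav'
  /FILE='faminc2.sav'
  /BY famid.
EXECUTE.
```

Replacing EXECUTE with LIST has the same effect and also prints the merged cases.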

6. For more information

  • For more information about match merging data files, see the MATCH FILES command in the SPSS Syntax Reference Guide.
  • For information on concatenating data files, see the SPSS Learning Module on Concatenating Data Files in SPSS.
