In Microsoft Excel, I have used many different solutions to compare lists. Sometimes the solution finds duplicates across one or more columns; sometimes the solution focuses on finding unique values. But what if you want to compare two sets of data in Excel to determine if a record is missing from one of the sets?
TO SEE: Microsoft Excel: Become an expert with this online training (TechRepublic Academy)
For example, in Excel, you want to compare a set of inbound purchase orders against a comprehensive list of distributed purchase orders. Fortunately, it’s not as difficult as you might think. In this tutorial, I’ll show you how to use VLOOKUP() in Excel to find missing records in one dataset compared to a comprehensive list.
I use Microsoft 365 desktop on a Windows 10 64-bit system, but you can use any version and Excel for the web supports VLOOKUP(). For your convenience, you can: download the demonstration .xlsx and .xls files†
What is the problem we are trying to solve in Excel?
If your company offers continuous support, you are probably familiar with the purchase order system. When a customer calls and makes a request, grab a blank purchase order, fill in the details, and then proceed to fulfill the request. When the employee complies with the request, he completes the purchase order form and returns it to the system. I use the term open to describe a pending purchase order and closed to describe a completed purchase order. You can think of the closed list as a subset of the open list. With most systems now online, the paper purchase order is a thing of the past, but it gives us a simple example.
Image A shows two table objects. PODistributed, on the left, keeps track of the purchase orders that have been distributed to the staff – the open purchase orders. POReceived, on the right, is a list of purchase orders and their return date, meaning the job is done – these are closed purchase orders.
Image A
As it is, there is no reconciliation between the two tables. We need a way to mark the open purchase orders if there is no matching number in the closed list, which means the purchase order and customer request are still in play. Even with such short lists, you’re likely to make a mistake if you’re only using it visually.
Now that you understand what it takes, let’s add the VLOOKUP() function in Excel to reveal the open purchase orders.
TO SEE: 98 Excel Tips Every User Should Master (TechRepublic)
Using VLOOKUP() to compare two sets of data in Microsoft Excel
The VLOOKUP() function has been the cornerstone of many Excel apps. Because tables do not support dynamic array functions, such as the newer XLOOKUP(), this solution uses the older VLOOKUP().
Excel’s VLOOKUP() function uses the following syntax:
VLOOKUP(lookupvalue, tablearray, colindex, [rangelookup])
The lookupvalue argument identifies the value to match, tablearray identifies the range to search for lookupvalue, colindex identifies the return value when the function finds a match, and the optional rangelookup argument lets you specify an exact match. When FALSE is used for this optional argument, VLOOKUP() returns a #N/A value if lookupvalue doesn’t match, which is what we’re counting on.
To continue, enter one of the following in D3 on the Distributed Purchase Order sheet, the list that opens, and copy to the remaining cells:
=VLOOKUP([@[PO Number]],'PO Received'!$B$3:$B$13,1,FALSE)
=VLOOKUP('PO Distributed'!B3,'PO Received'!$B$3:$B$13,1,FALSE)
Use the latter if you are not using a Table object. The !$B$3:$B$13 reference in both must be absolute.
Excel automatically expands the table object (Figure B† There are several records where the function in the closed purchase order list did not match the master purchase order list. These records are still open. Knowing which purchase orders are important, especially if it has been open for a long time.
Figure B
The function is a bit complicated, so let’s see how it evaluates using the record in row 4:
=VLOOKUP(‘PO distributed’!B3,’PO received’!$B$3:$B$13,1,FALSE)
= VLOOKUP(102, {103; 101; 106; 104; 110; 109; 107; 108; 113; 115; 111}, 1, FALSE)
= VLOOKUP(102, {103; 101; 106; 104; 110; 109; 107; 108; 113; 115; 111}, 102, FALSE)
The closed purchase order array does not contain the value 102, so the FALSE argument forces the function to return #N/A. If the value were 102 in the array, the function would return 102.
You can easily stop if you know what the #N/A values mean; however, we can add a conditional formatting rule based on the results of Excel’s VLOOKUP() function to highlight records where the VLOOKUP() function returns this error. Follow these steps to add the conditional formatting rule.
- Select B3:D19, these are the open purchase orders in the PODistributed table. Do not include the header cells.
- On the Home tab, click Conditional Formatting, and then choose New Rule from the drop-down list.
- In the resulting dialog box, click the Use a formula to determine which cells to format option in the top pane.
- In the lower pane, enter the function: = ISNA ($D3) and click Format. The column reference must be absolute ($D).
- Click the Fill tab, select red, and then click OK once. Figure C shows the function and format.
- Click OK to return to the sheet. The result is a visual focus on the open purchase orders.
Figure C
You can decide to hide the helper column, but I recommend that you don’t. It’s easy to forget hidden values later, making dependency errors difficult to fix. Because both lists are tables, the system is automatically updated when you enter new records in the two lists.