Amazon

Interview

Perform an Intersection on Two Unsorted Lists

Intersect two unsorted linked lists and return it as a new set. The new set should be only what the two lists have in common. ( Source: AWS Interviewer)

Perform an Intersection on Two Unsorted Lists, Solutions

Solution #1: Classic Java Iterator Design Pattern, Immutable parameters

The following image is a snap shot of the solution developed using the Eclipse Mars.2 IDE using my iMac and the results shown on collabedit:

Right-click and "Save Image as ..." to get a better look at this image!

Despite getting lots of sleep and having prepared for the interview for three days, when it finally time for the interview my mind froze. It happens, usually when your asked a question in terms not immediately familiar to you. Such was the occasion this time around. I was asked by a lady with a degree in computer science to perform an intersection on two linked lists. The term 'intersection' through me for a loop but not because I didn't know what an intersection meant, but because I was, once again, expecting a far more difficult question such as 'rebuild a binary tree algorithm' from scratch. Once the woolies left my mind we were on our way. All she was really looking for was how I would implement a solution to finding a subset on a linked list. That being said, in my mind I had to take into consideration the speed of execution on the method depending upon the size of the linked list. In my mind, if the linked list was sorted, you could substantially reduce the amount of execution time merely be accessing the source list using an index as opposed to sequential access on every item in the query set.

I chose to implement the solution in Java but I was still thinking in terms of C++ and STL because in C++ and STL when you perform a sort on a list, the list is accessible for an indexed search. In Java, things work differently. Just because you sort a link list in Java, does not mean that the 'contains' test will make use of an index. Hence, my initial solution, while correct in terms of C++ and STL, and while it executes correctly, only utilizes a sequential access. However, the original problem does not require an index or a sort, so my solution again met the spec however the interviewer's primary design question ensued: "What are the primary design considerations for such a solution?" (paraphrased). I think she was looking for the keyword 'log' as in 'logarithmic' as in Computer Science degree terms, that as the larger the list, the longer it would take to complete the search. I used the word 'exponential' which means the same thing, and I think she heard that and that qualified me to move on to the next stage of the interview process, "You should hear from the recruiters in two to three days, if not, be sure to contact to follow up." (paraphrased)

Here is my initial solution:

List<Integer> findIntersection(List<Integer> list1, List<Integer> list2) {

List<Integer> result = new ArrayList<Integer>();

Collections.sort(list1);

Iterator<Integer> iterator = list2.iterator();

while (iterator.hasNext()) {

Integer i = iterator.next();

if ( list1.contains(i) )

result.add(i);

}

return result;

}

Essentially what we are doing is reading each item of the query list (list2) and testing to see if the search list (list1) contains that item. This met the spec, the original question just fine but I was concerned about performance on large, very large lists as was the interviewer. My solution for that was to index the search list (list1). However, as mentioned, I forgot and I didn't have time to verify this issue properly with the time allotted in a short interview, all I could do is demonstrate what needed to be done. So, in some parallel universe, if Java's Collection.sort() method caused the 'contains' method of the sorted link list to be searched using an index then my solution would be correct.

Remember earlier I said that even though I implemented the initial solution in Java, I was still think in terms of C++ & STL. Turns out the interviewer wasn't all that familiar with C++ or the STL however her preferred language just happened to be Java, so I lucked out there. But I don't know if she was aware that doing a Collections.sort doesn't not necessarily mean that an index will be used internally any search operations on the list. I only came to realize this after wards and as soon as I inquired by way of the Internet, sure enough, I found out that I was wrong! Conceptually correct yes, but Java's implementation of Collection.sort() works differently than the behaviour of C++ and STL. So now in this article I want to see what would be required to do in Java what is being handled in C++ and STL. In essence I need to make use of a binary tree. Fortunately Java does have a built in binary tree but it's implementation is not as straight forward as C++'s STL.

She also wanted to know about issues related to insertion on link list performance, for that I referred to Bjarne Stroustrup's most excellent lecture on precisely this issue (see Test Suite).

Solution #2: Classic HashMap Solution, Immutable parameters

Fortunately for me, the interviewer was able to clearly indicate what she was looking for by specifying the interface of the function. I was looking for something a bit more complicated taking into consideration the possible size of the link list (into the millions). For most day to day operations, link lists can be fairly small and a simple iteration through the list will suffice, but in the case of millions of items, the performance of the function will slow down exponentially or (logarithmically) as she indicated:

Right-click and "Save Image as ..." to get a better look at this image!

Hence the need for an indexed solution. In Java, this indexed capability is readily available using the built-in Java HashMap container. The second solution assumes the link lists passed to the function are not sorted, therefore has to make use of it's own mapping which requires space on the heap:

List<Integer> findIntersection(List<Integer> list1, List<Integer> list2) {

Map<Integer,Integer> searchList = new HashMap<Integer,Integer>();

for (Integer value : list1) {

searchList.put(value, value);

}

List<Integer> result = new ArrayList<Integer>();

for (Integer test : list2) {

if ( searchList.containsKey(test))

result.add(test);

}

// Performance optimization secret:

// Drop a hint to the GC to remove this

// item from the heap as soon as possible!

searchList = null;

return result;

}

There is also a Java performance hint dropped to the garbage collector (GC) to remove the HashMap as soon as possible. These secrets are the type you pick up after years of working with Java. In my case, it was a secret learned when working on a video conferencing package purely in Java to work over the Internet. I would discover that when you assign a variable to null, it drops a hint to Java's GC to remove it from it's internal storage.

Right-click and "Save Image as ..." to get a better look at this image!

So that's all there really is to this solution. I could go to talk able providing a solution if dynamically allocating a HashMap were not an option but that would be rambling.

Right-click and "Save Image as ..." to get a better look at this image!

Solution #3: C++/STL Solution, Mutable parameters

An STL solution to this problem is actually a variation to the merge solution.

Synopsis:

The section shows a C++ / STL coding equivalent to the Java HashMap coding solution. Below you will see the merge solution used in the Bloomberg interview:

// Merge Solution #1

class Solution {

public:

ListNode* mergeTwoLists(ListNode* l1, ListNode* l2) {

std::list<int> intList;

ListNode* h1 = l1, *h2 = l2;

while ( h1!=NULL ) {

intList.push_back(h1->val);

h1 = h1->next;

}

while ( h2!=NULL ) {

intList.push_back(h2->val);

h2 = h2->next;

}

ListNode* listNode = new ListNode(0), *p = listNode, *result = listNode;

intList.sort();

std::list<int>::const_iterator iterator;

for (iterator = intList.begin(); iterator != intList.end(); ++iterator) {

int value = *iterator;

p = listNode;

listNode->val = value;

listNode = new ListNode(0);

p->next = listNode;

}

p->next = NULL;

if ( p == listNode ) {

delete listNode;

return NULL;

}

return result;

}

};

All we have to do here is change the merge operations to one of an insertion operation, returning null if the result set is empty. We perform an optional sort on the result set for clarity purposes:

// Intersection Solution #1

struct ListNode {

int val;

ListNode *next;

ListNode(int x) : val(x), next(NULL) {}

};

class Solution {

public:

ListNode* intersectTwoLists(ListNode* l1, ListNode* l2) {

if ( l1 == NULL )

return NULL;

std::map<int,ListNode*> searchList;

ListNode* h1 = l1, *h2 = l2;

while ( h1!=NULL ) {

searchList.insert ( std::pair<int,ListNode*>(h1->val,h1) );

h1 = h1->next;

}

std::list<int> intList;

while(h2 != NULL)

{

if(searchList.find(h2->val) != searchList.end()) {

intList.push_back(h2->val);

}

h2 = h2->next;

}

if ( intList.size() == 0 )

return NULL;

intList.sort(); // optional

ListNode* listNode = new ListNode(0), *p = listNode, *result = listNode;

std::list<int>::const_iterator iterator;

for (iterator = intList.begin(); iterator != intList.end(); ++iterator) {

int value = *iterator;

p = listNode;

listNode->val = value;

listNode = new ListNode(0);

p->next = listNode;

}

p->next = NULL;

return result;

}

};

Even though the STL C++ solution works (and to the technically inclined works as expected), it's easy to see that the Java solution shines in it's coding equivalent and simplicity.

Right-click and "Save Image as ..." to get a better look at this image!

Intersection Test Suite

Now along with a coding solution Claire also requested a test suite to run against the solution to demonstrate that the solution worked in against a host of parameters:

Find the intersection of two integer arrays.

[1, 2, 3], [4, 5, 6] -> []

[1, 2], [2] -> [2]

Using the notation that she provided I described typical parameters that would be the foundation for an exhaustive test against the proposed solution:

Find the intersection of two integer arrays.

[1, 2, 3], [4, 5, 6] -> []

[1, 2], [2] -> [2]

[1, 2, 3], [4, 5, 6] -> []

[1, 2, 3], [-1, -2, -3] -> []

[1, 2, 3], [1, 2, 3] -> [1,2,3]

[1, 2, 3], [] -> []

[], [1,2,3] -> []

[2, 1, 3], [] -> []

[2, 1, 3], [1,2,3] -> [1,2,3]

Again it was important to pass in unsorted test lists to ensure that our binary solution worked as expected:

Find the intersection of two integer arrays.

[2, 1, 3], [1,2,3] -> [1,2,3]

In Java a JUnit test case could be implemented as follows:

import java.util.ArrayList;

import java.util.List;

import org.junit.Assert;

import org.junit.Before;

import org.junit.Test;

public class SolutionTest {

List<Integer> list1;

List<Integer> list2;

@Before

public void setUp() throws Exception {

list1 = new ArrayList<Integer>();

list2 = new ArrayList<Integer>();

}

public void loadLists(int a[], int b[]) {

for (int x : a) {

list1.add(x);

}

for (int y : b) {

list2.add(y);

}

@Test

public final void testFindIntersectionTC01() {

int a[] = { 1, 3, 7 };

int b[] = { 2, 3, 8 };

loadLists(a,b);

Solution s = new Solution();

List<Integer> list3 = s.findIntersection(list1, list2);

Assert.assertTrue(list3.size()==1);

Assert.assertTrue(list3.get(0)==3);

}

// Find the intersection of two integer arrays.

// [1, 2, 3], [4, 5, 6] -> []

@Test

public final void testFindIntersectionTC02() {

int a[] = { 1, 2, 3 };

int b[] = { 4, 5, 6 };

loadLists(a,b);

Solution s = new Solution();

List<Integer> list3 = s.findIntersection(list1, list2);

Assert.assertTrue(list3.size()==0);

}

// [1, 2], [2] -> [2]

@Test

public final void testFindIntersectionTC03() {

int a[] = { 1, 2 };

int b[] = { 2 };

loadLists(a,b);

Solution s = new Solution();

List<Integer> list3 = s.findIntersection(list1, list2);

Assert.assertTrue(list3.size()==1);

Assert.assertTrue(list3.get(0)==2);

}

// [1, 2, 3], [-1, -2, -3] -> []

@Test

public final void testFindIntersectionTC04() {

int a[] = { 1, 2, 3 };

int b[] = { -1, -2, -3 };

loadLists(a,b);

Solution s = new Solution();

List<Integer> list3 = s.findIntersection(list1, list2);

Assert.assertTrue(list3.size()==0);

}

// [1, 2, 3], [1, 2, 3] -> [1,2,3]

@Test

public final void testFindIntersectionTC05() {

int a[] = { 1, 2, 3 };

int b[] = { 1, 2, 3 };

loadLists(a,b);

Solution s = new Solution();

List<Integer> list3 = s.findIntersection(list1, list2);

Assert.assertTrue(list3.size()==3);

Assert.assertTrue(list3.get(0)==1);

Assert.assertTrue(list3.get(1)==2);

Assert.assertTrue(list3.get(2)==3);

}

// [1, 2, 3], [] -> []

@Test

public final void testFindIntersectionTC06() {

int a[] = { 1, 2, 3 };

int b[] = { };

loadLists(a,b);

Solution s = new Solution();

List<Integer> list3 = s.findIntersection(list1, list2);

Assert.assertTrue(list3.size()==0);

}

// [], [1,2,3] -> []

@Test

public final void testFindIntersectionTC07() {

int a[] = { };

int b[] = { 1, 2, 3 };

loadLists(a,b);

Solution s = new Solution();

List<Integer> list3 = s.findIntersection(list1, list2);

Assert.assertTrue(list3.size()==0);

}

// [2, 1, 3], [] -> []

@Test

public final void testFindIntersectionTC08() {

int a[] = { 2, 1, 3 };

int b[] = { };

loadLists(a,b);

Solution s = new Solution();

List<Integer> list3 = s.findIntersection(list1, list2);

Assert.assertTrue(list3.size()==0);

}

// [2, 1, 3], [1,2,3] -> [1,2,3]

@Test

public final void testFindIntersectionTC09() {

int a[] = { 2, 1, 3 };

int b[] = { 1, 2, 3 };

loadLists(a,b);

Solution s = new Solution();

List<Integer> list3 = s.findIntersection(list1, list2);

Assert.assertTrue(list3.size()==3);

Assert.assertTrue(list3.get(0)==1);

Assert.assertTrue(list3.get(1)==2);

Assert.assertTrue(list3.get(2)==3);

}

// [2, 1, 3], [2,1,3] -> [1,2,3]

@Test

public final void testFindIntersectionTC10() {

int a[] = { 2, 1, 3 };

int b[] = { 2, 1, 3 };

loadLists(a,b);

Solution s = new Solution();

List<Integer> list3 = s.findIntersection(list1, list2);

Assert.assertTrue(list3.size()==3);

Assert.assertTrue(list3.get(0)==1);

Assert.assertTrue(list3.get(1)==2);

Assert.assertTrue(list3.get(2)==3);

}

What's important to note about using a test framework, especially when developing a solution for a customer, is that it is so important to get the tiny details correct where possible!

About the last test case: [2, 1, 3], [2,1,3] -> [1,2,3]

In this case, while writing a test case for this method, such a tiny detail was highlighted in the 10th use case. One where the result set had to be sorted. The original Java solution highlighted in the first tab, overlooked this requirement. But a close look at the test case specified showed that the result on the last test has to be in sorted arrangement. Hence a need to change the original HashMap solution:

List<Integer> findIntersection(List<Integer> list1, List<Integer> list2) {

Map<Integer,Integer> searchList = new HashMap<Integer,Integer>();

for (Integer value : list1) {

searchList.put(value, value);

}

List<Integer> result = new ArrayList<Integer>();

for (Integer test : list2) {

if ( searchList.containsKey(test))

result.add(test);

}

// Performance optimization secret:

// Drop a hint to the GC to remove this

// item from the heap as soon as possible!

searchList = null;

Collections.sort(result);

return result;

}

This test case is by no means complete but for the purposes of this blog it clearly demonstrates the need for test cases in professional software development environments.

Final thoughts!

Adding this section to my blog was a good investment of my time! As showing two different implementations of a solution in both Java and C++ gives us a chance to see the pros and cons of both languages. It is in my humble opinion that Java provides a cleaner implementation for basic but common link list management tools such as List<T>, Hashmap<T> and Collections.

One other thing to keep in mind however is to look out for brain freeze before an important interview. About a 1/2 hour before this interview I was handed a Tim Horton's Maple Ice Capp and perhaps this was the cause of me having recollection issues with respect to the HashMap class.

Perhaps ...

Why you should avoid Linked Lists, Bjarne Stroustrup
Day 1 Keynote, Bjarne Stroustrup: C++11 Style

What you have seen here is a typical day in software development for me. You utilize the tools you have available combined with what you understand as current and correct feedback.

Thank you! Your submission has been received!

Oops! Something went wrong while submitting the form

PERRY ANDERSON: IT CONSULTANT PORTFOLIO