find an element in a sorted matrix [duplicate]

2019-01-08 14:37发布

问题:

This question already has an answer here:

  • How do I search for a number in a 2d array sorted left to right and top to bottom? 19 answers

Problem: Given a matrix in which each row and each column is sorted, write a method to find an element in it.

It is a classic interview question, here is my solution

boolean F(int[][] matrix, int hs, int he, int ws, int we)
{
    if (hs > he || ws > we) 
        return false; 

    int m = (hs + he) / 2; 
    int n = (ws + we) / 2;

    if (matrix[m][n] == t)
    {
        return true;
    }
    else if (matrix[m][n] < t)
    {
        // find the ele in the same row, right to [m][n]
        F(m, m, n + 1, we);

        // find the ele in the same col, upper to [m][n]
        F(m + 1, he, n, n);

        // find the ele in the area, where i>m,j>n 
        F(m + 1, he, n + 1, we);       
    } 
    else if (matrix[m][n] > t)
    {
        // very similar to previous part
    }
}

The running time of the algorithm is log(m) + log(n). I am looking for an algorithm that is more efficient, or with concise code.

Having more comments, I come up with following code:

// return target recurrence in the matrix
int F(int[][] m, int rs, int re, int cs, int ce, int t){
   int r1 = rs, r2 = re;
   int c1 = cs, c2 = ce;
   int r=0 , c = c1;

   while( r1 < r2 && c1 < c2 ){
   // find the last element that <= t in column c
     r  = FlastLess( r1, r2, c, t)

     if( r == -1 ) break;

     else{
       // find the first ele in the row that is >=t
       c = FfirstGreater( r, c1, c2, t);

       if( c == -1)  break;
       else{
         r2 = r; 
         c1 = c; 
       }// else    
     }// else 
   }// while
}// f

Here is the link to function F1 and F2 Find the first element in a sorted array that is greater than the target

void FlastLess(int s, int e, int t){
  int l = s, h = e;
  while( l != h ){
     int mid = (l+h)/2;
     if( mid >=  t) high = mid - 1; 
     else {
       if( high < t) low= mid + 1;
       else low = mid;
     } 
  }

 void FfirstGreater(int s, int e, int t){
  while(l < h){
    mid = (l+h)/2;
    if ( mid <=  t) low = mid+1;
    else high = mid;
  }
 }

}

回答1:

Start in the bottom-left corner of your matrix. Then go to the right until you find the exact number (done), or until you find a number that is bigger.

Then you go upwards in the matrix until you find the exact number (done), or until you find a number that is too small.

Then again you move to the right, ... and so on until you found the number or until you reach the right-side or top of your matrix.

The following images contain some examples, using an Excel table showing the target number in green, and the path that is followed in yellow.

In the last example we look for 207, which isn't in the matrix:

This is just the algorithm. The coding is left for you as an exercise :-)

EDIT: When starting on the bottom row, a binary search might give a better starting point. For the rest of the algorithm it probably doesn't matter.



回答2:

Your algorithm may be O(log m + log n), but it also gives the wrong answer.

Suppose you search for "4" in the following matrix (where the upper-left is row=0, col=0):

0 1 4
1 2 5
2 3 6

Your algorithm starts by looking at the 2 in the center. Since 4 is greater than 2, you proceed to search the same row (not there), same column (not there), and the lower-right corner (not there). Whoops.

The constraint that each row and column is sorted is actually pretty weak. In particular, the elements along the diagonal could be in any order.

I think the correct approach is to do a binary search on the first and last column to narrow down a range of possible rows. Then do binary search on the first and last of those rows to narrow down the possible columns. And so forth.

Not sure how to analyze the performance of this one...



回答3:

Here's what I would try. Given an m by n matrix A, compare the value X with the entry A(m/2,n/2) (use floors if necessary).

If A(m/2,n/2) == X, done.

If A(m/2,n/2) < X, then there are 3 smaller matrices to check:

A(1:m/2, 1:n/2) 
A(1:m/2, n/2+1:n) 
A(m/2+1:m, 1:n/2) 

If A(m/2,n/2) > X, , then there are 3 smaller matrices to check:

A(m/2:m, n/2:n) 
A(1:m/2, n/2+1:n) 
A(m/2+1:m, 1:n/2) 

You can eliminate two of them (not always) by comparing the value to the smallest value in the corresponding matrix (the upper left value). Then you recursively try to find the value in each of the remaining matrices.

The complexity of this is approximately O((n*m)^0.8).


A row and column sorted matrix is a special case of a Young tableau. I did a google search for searching a Young tableau and found this article which gives 3 nice approaches (the first (and worst) of which is the one I described above).



回答4:

For a comparison based algorithm, O(lg(m) + lg(n)) queries is optimal.

Proof

For a comparison based query, each query can only have two results: true or false. An obvious extension of this is that for N queries you can have at most 2N results. Therefore, using N queries, you can only locate elements in a matrix with at most 2N elements.

How many queries then are required to search an m x n matrix? Just solve for N.

2N = mn
lg(2N) = lg(mn)
N = lg(m) + lg(n)

Therefore lg(m) + lg(n) queries is optimal.

Non-comparison based queries

That proof is conclusive, but only for comparison based queries. If you query the matrix in a way that doesn't involve comparisons then you can get near-constant time if you know the distribution of values. I won't give you an algorithm, but I would suggest looking at Radix sort as it contains the kind of non-comparison based techniques that are required to beat the lg(m) + lg(n) lower bound.



回答5:

Reading the previous comments I came up with this algorithm. It basically suppose that by starting in the upper-right corner, the matrix can be used as a BST with some "loops" (we don't care about this loops).

1 4 9
5 6 10

     9
    /  \
   4   10
  / \  /
 1   6
  \ /
   5

This algorithm is the same as searching in a BST and is really easy to understand. The worst case run time is O( n + m ).

public static boolean search( int[][] matrix, int value )
{   
    int rows = matrix.length;
    int columns = matrix[0].length;

    int i = 0;
    int j = columns - 1;

    while( i < rows
           && j >= 0 )
    {   
        if( matrix[i][j] == value )
        {
            System.out.println( "Found at " + i + " " + j );
            return true;
        }

        if( matrix[i][j] < value )
        {
            j--;
        }
        else
        {
            i++;
        }
    }

    return false;
}


回答6:

boolean FindElem(int[][] mat, int elem, int M, int N) {
 int row = 0;
 int col = N-1;
 while (row < M && col >= 0) {
   if (mat[row][col] == elem) {
     return true;
   } else if (mat[row][col] > elem) {
     col--;
   } else {
     row++;
   }
 }
   return false;
}


回答7:

I believe that your algorithm does not have the time bounds that you believe that it does.

To see this, for simplicity let's assume that your grid is an n x n square (let's call this size m). If I can derive a different time bound than O(log n) in this case, I can argue that what you have should not be correct.

In particular, notice that in the worst case you make three recursive calls to problems that are of size (n / 2) x (n / 2) = m / 4. This means that we have the recurrence

T(1) = 1
T(m) = 3T(m / 4) + O(1)

Using the Master Theorem, the runtime for this function would be O(mlog43) = O(n2 log43) = O(n log49) ≈ O(n1.5849625). This is ω(log n + log m); that is, it's asymptotically strictly greater.

As many other people have posted, there are several well-known algorithms that run in O(m + n) that are based by walking one step in the right direction at each step. Consequently, correctness aside, I would not advise using the algorithm you've posted.



回答8:

Given elements in 2d array(be a[n][m]) are increasing horizontally and vertically . So for the given question we need to find the index of the element first. So if we can find the element in quicker way then we can optimize the solution . The question is how do we find it in efficient way. One approach is take the middle element of the matrix and check given element with it

if given element is lesser than middle element then our solution lies in matrix a[0][0] to a[n/2][m/2] because all elements to right and below are greater than middle (since given element is less than middle element) so we have reduced our search space from a[n][m] to a[n/2][m/2] which is one fourth of original size.

if given element is greater than middle element then our solution doesnt lies in matrices a[0][0] to a[n/2][m/2] because all elements to left and above are lesser than middle (since given element is greater than middle element) so our search space is total array minus a[0][0] to a[n/2][m/2] which is three-fourth of original size. Total array minus a[0][0] to a[n/2][m/2] means ,there will be three recursive call with array index

--------->a[0][m/2](start index) to a[n/2][m](end index)  
--------->a[n/2][0](start index) to a[n][m/2](end index)
--------->a[n/2][m/2](start index) to a[n][m](end index)

Now recursively call the same function based on our search space.

Time complexity of our function will as follows . NOTE: In time function n represents total number of element but not no of rows as mentioned .n=(no_of_rows)*(no_of_columns)

                _________________T(n/4)  if given element is less than middle of the array.
               / 
              /
T(n)==========------------------- 1 if n=1 (if element found)
              \
               \_________________3T(n/4) if given element is greater than middle element of array

so out time function will

T(n)=3T(n/4) or T(n)=T(n/4)

In worst case T(n)=3T(n/4)
               T(n)=3{3T(n/4)}
               T(n)=3power(i)T(n/(4)poweri)     equation------> (1) 

But T(1)=1 (guessing given elemet is found in array)

so  n/(4power(i))=1
====> n=2power(2*i)
====> n=2power(2*i)
Talking log to base 2 on both sides (log[n])/2=i ====> i=log(sqrt(n))

substituting equation 1 we get

T(n)=3power(log[sqrt(n)])
T(n)∈ θ( nlog sqrt(3) )..

To my knowledge this is the algorithm that takes minimum number of comparisons.



回答9:

JavaScript solution:

//start from the top right corner
//if value = el, element is found
//if value < el, move to the next row, element can't be in that row since row is sorted
//if value > el, move to the previous column, element can't be in that column since column is sorted

function find(matrix, el) {

  var row = 0; //first row
  var col = matrix[0].length - 1; //last column

  while (row < matrix.length && col >= 0) {
    if (matrix[row][col] === el) { //element is found
      return true;
    } else if (matrix[row][col] < el) {
      row++; //move to the next row
    } else {
      col--; //move to the previous column
    }
  }

  return false;

}