Given a matrix of size mxn
containing 0's and 1's only. I need to find the largest sub-matrix which has equal number of 1's and 0's in it. Brute force approach would be O(m^2*n^2)
Can we do any better than this?
I tried applying dynamic programming, but couldn't find any optimal substructure to it.
I believe a similar one-dimensional version of this problem was discussed here:
Space-efficient algorithm for finding the largest balanced subarray?
which has an O(n)
solution using some extra space.
This algorithm assumes that we search a sub-matrix with contiguous rows and columns and with the largest possible product of height and width.
Start with the following pre-processing:
A = substitute_zero_with_minus_one(InputMatrix)
B = prefix_sum_for_each_column(A)
C = prefix_sum_for_each_row(B)
Now for each pair of rows (i, j) do the folowing:
for each column k:
d = C[k, j] - C[k, i]
if h[d] not empty:
if (k - h[d]) * (j - i) is greater than best result:
update best result
else:
h[d] = k
Time complexity is O(N2 * M), extra space is O(N * M).
We assume m < n, and we can have an O(M * M * N) algorithm.
And if we replace all 0 by -1, we just find the largest submaxtrix whose sum is 0.
- Get the sum of segments(i, j) in each row, we define them c1, c2,
c3....,cn, we can run O(n) algorithm on it by the algorithm you
referenced.
- We should do step 1 M * M times to get the largest submaxtrix whose sum is 0, so the complexity is O(M * M * N).
I assume a submatrice is formed using only contiguous rows\columns of the
original matrix (ie by removing first\last row or columns).
This way, a matrix can be represented as
Mat = {origin(row,col), rowCount, columnCount}
If the original matrix is of dimension M x N, then
rowCount = M - row
columnCount = N - col
Mat = {origin(row,col), M - row, N - col}.
The variable row
and col
has respectively M
and N
possible values, which means
there are O(MxN)
of such matrices.
Idea of Algorithm
- pop matrix
(m, n)
from queue
- Test . If success, output matrix
- Construct all matrices
(m, n-1)
and (m-1, n)
and put on queue
- go back to 1.
Now there are two points:
- When you reduce dimensions there are only 4 possible matrices (2 by removing rows, 2 by removing columns)
- You construct a sub matrix by remove either the first and last row\column. You just need remove the count for the row\column you removed, which takes
O(n)
or O(m)
time. this is the dynamic programming step.
Which mean the complexity is O(max(M,N)MN)
I created a small application that demonstrates an search algorithm optimization. Please let me know if this is what you are looking for.
Notes:
- The program creates a square matrix
- For the easiness of reading, I'm using collections to work with the data. I believe that there is an overload in the processing, but I'm just trying to point out the principle.
Here it is:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
class Program
{
class Matrix
{
public int[][] JaggedInteger2DMatrix { get; set; }
public List<MatrixCell> Cells { get; set; }
public List<MatrixLine> Lines { get; set; }
public int Width { get; set; }
public int Height { get; set; }
public Matrix(int size, int seed)
{
var r = new Random(seed);
int[][] jaggedInteger2DMatrix = new int[size][];
for (int i = 0; i < size; i++)
{
jaggedInteger2DMatrix[i] = new int[size];
for (int j = 0; j < size; j++)
{
jaggedInteger2DMatrix[i][j] = r.Next(2);
//Console.Write(jaggedInteger2DMatrix[i][j]+" ");
}
//Console.Write("\r\n");
}
InitializeMatrix(jaggedInteger2DMatrix);
}
public Matrix(int[][] jaggedInteger2DMatrix)
{
InitializeMatrix(jaggedInteger2DMatrix);
}
private void InitializeMatrix(int[][] jaggedInteger2DMatrix)
{
JaggedInteger2DMatrix = jaggedInteger2DMatrix;
Height = jaggedInteger2DMatrix.GetLength(0);
Width = jaggedInteger2DMatrix[0].GetLength(0);
Cells = new List<MatrixCell>();
Lines = new List<MatrixLine>();
int horizontalLineCounter = 0;
MatrixCell matrixCell = null;
foreach (var horizontalLine in jaggedInteger2DMatrix)
{
int verticalLineCounter = 0;
foreach (var cell in horizontalLine)
{
matrixCell = new MatrixCell()
{
HorizontalLineIndex = horizontalLineCounter,
Value = cell,
VerticalLineIndex = verticalLineCounter
};
Cells.Add(matrixCell);
if (Lines.Where(line => line.LineType == Line.Vertical && line.LineIndex == verticalLineCounter).Count() == 0)
{
var line = new MatrixLine()
{
LineType = Line.Vertical,
LineIndex = verticalLineCounter
};
Lines.Add(line);
}
Lines.Where(line => line.LineType == Line.Vertical && line.LineIndex == verticalLineCounter).FirstOrDefault().Cells.Add(matrixCell);
if (Lines.Where(line => line.LineType == Line.Horizontal && line.LineIndex == horizontalLineCounter).Count() == 0)
{
var line = new MatrixLine()
{
LineType = Line.Horizontal,
LineIndex = horizontalLineCounter
};
Lines.Add(line);
}
Lines.Where(line => line.LineType == Line.Horizontal && line.LineIndex == horizontalLineCounter).FirstOrDefault().Cells.Add(matrixCell);
verticalLineCounter++;
}
horizontalLineCounter++;
}
}
}
class MatrixCell
{
public int Value { get; set; }
public int VerticalLineIndex { get; set; }
public int HorizontalLineIndex { get; set; }
}
class MatrixLine
{
public Line LineType { get; set; }
public int LineIndex { get; set; }
public List<MatrixCell> Cells { get; set; }
public MatrixLine()
{
Cells = new List<MatrixCell>();
}
}
enum Line
{
Horizontal,
Vertical
}
private static void Search(Matrix matrix, bool optimizeCellCount, out IEnumerable<MatrixCell> optimizedSelection, out int iterations)
{
optimizedSelection = null;
var count = 0;
iterations = 0;
for (int i = 0; i < matrix.Width; i++)
{
for (int j = 1; j <= matrix.Width; j++)
{
var selectedVerticalLines = matrix.Lines.Where(line => line.LineType == Line.Vertical).Skip(i).Take(j);
for (int k = 0; k < matrix.Height; k++)
{
for (int l = 1; l <= matrix.Height; l++)
{
/**
* Here's where the search is optimized
**********************************************************************************************
*/
if (optimizeCellCount)
{
//if the submatrix cell count is smaller than the current count, break the iteration
if (count > Math.Min(Math.Abs(matrix.Height - k), l) * Math.Min(Math.Abs(matrix.Height - i), j))
{
continue;
}
}
/*
**********************************************************************************************
*/
iterations++;
var selectedHorizontalLines = matrix.Lines.Where(line => line.LineType == Line.Horizontal).Skip(k).Take(l);
var horizontalCells = selectedHorizontalLines.Aggregate<MatrixLine, List<MatrixCell>>(new List<MatrixCell>(), (a, b) =>
{
a.AddRange(b.Cells);
return a;
});
var verticalCells = selectedVerticalLines.Aggregate<MatrixLine, List<MatrixCell>>(new List<MatrixCell>(), (a, b) =>
{
a.AddRange(b.Cells);
return a;
});
var cells = horizontalCells.Intersect(verticalCells);
if (cells.Count() > count)
{
var sum = cells.Sum(t => t.Value);
var cellsCount = cells.Count();
if (sum != 0)
{
if (cellsCount / (double)sum == 2)
{
//match
optimizedSelection = cells;
count = cellsCount;
}
}
}
}
}
}
}
}
private static float GetLineCost(int width, int startPosition, int length)
{
float cost = 0;
for (int i = startPosition; i < length; i++)
{
cost += Math.Min(Math.Abs(width - i), i + 1);
}
return cost;
}
static void Main(string[] args)
{
Matrix matrix = new Matrix(20, 1);
bool optimizeCellCount = true;
IEnumerable<MatrixCell> optimizedSelection;
int iterations;
var watch = new System.Diagnostics.Stopwatch();
//optimized search
watch.Start();
Search(matrix, optimizeCellCount, out optimizedSelection, out iterations);
watch.Stop();
Console.WriteLine("Full Optimized Search");
Console.WriteLine("position: [{0},{1}],[{2},{3}] size : {4} search time : {5} iterations: {6}",
optimizedSelection.Min(cell => cell.VerticalLineIndex),
optimizedSelection.Min(cell => cell.HorizontalLineIndex),
optimizedSelection.Max(cell => cell.VerticalLineIndex),
optimizedSelection.Max(cell => cell.HorizontalLineIndex),
optimizedSelection.Count(),
watch.Elapsed,
iterations
);
watch.Reset();
//no optimization
watch.Start();
Search(matrix, !optimizeCellCount, out optimizedSelection, out iterations);
watch.Stop();
Console.WriteLine("Non-Optimized Search");
Console.WriteLine("position: [{0},{1}],[{2},{3}] size : {4} search time : {5} iterations: {6}",
optimizedSelection.Min(cell => cell.VerticalLineIndex),
optimizedSelection.Min(cell => cell.HorizontalLineIndex),
optimizedSelection.Max(cell => cell.VerticalLineIndex),
optimizedSelection.Max(cell => cell.HorizontalLineIndex),
optimizedSelection.Count(),
watch.Elapsed,
iterations
);
watch.Reset();
//Console Output:
/***************************************************************************************
* Full Optimized Search
* position: [9,1],[18,19] size : 190 search time : 00:00:02.3963657 iterations: 19108
* Non-Optimized Search
* position: [9,1],[18,19] size : 190 search time : 00:00:05.8385388 iterations: 160000
****************************************************************************************/
}
}