So I've been trying to solve a Leetcode Question, "Given a string, find the length of the longest substring without repeating characters."
For example
Input: "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.
Currently I optimized my algorithm when it comes to figuring out if the substring is unique by using a hash table. However my code still runs in O(n^2) runtime, and as a result exceeds the time limit during submissions.
What i try to do is to essentially go through every single possible substring and check if it has any duplicate values. Am I as efficient as it gets when it comes to the brute force method here? I know there's other methods such as a sliding window method but I'm trying to get the brute force method down first.
# @param {String} s
# @return {Integer}
def length_of_longest_substring(s)
max_length = 0
max_string = ""
n = s.length
for i in (0..n-1)
for j in (i..n-1)
substring = s[i..j]
#puts substring
if unique(substring)
if substring.length > max_length
max_length = substring.length
max_string = substring
end
end
end
end
return max_length
end
def unique(string)
hash = Hash.new(false)
array = string.split('')
array.each do |char|
if hash[char] == true
return false
else
hash[char] = true
end
end
return true
end
Approach
Here is a way of doing that with a hash that maps characters to indices. For a string s
, suppose the characters in the substring s[j..j+n-1]
are unique, and therefore the substring is a candidate for the longest unique substring. The next element is therefore e = s[j+n]
We wish to determine if s[j..j+n-1]
includes e
. If it does not we can append e
to the substring, keeping it unique.
If s[j..j+n-1]
includes e
, we determine if n
(the size of the substring) is greater than the length of the previously-known substring, and update our records if it is. To determine if s[j..j+n-1]
includes e
, we could perform a linear search of the substring, but it is faster to maintain a hash c_to_i
whose key-value pairs are s[i]=>i
, i = j..j_n-1
. That is, c_to_i
maps the characters in the substring to their indices in full string s
. That way we can merely evaluate c_to_i.key?(e)
to see if the substring contains e
. If the substring includes e
, we use c_to_i
to determine its index in s
and add one: j = c_to_i[e] + 1
. The new substring is therefore s[j..j+n-1]
with the new value of j
. Note that several characters of s
may be skipped in this step.
Regardless of whether the substring contained e
, we must now append e
to the (possibly-updated) substring, so that it becomes s[j..j+n]
.
Code
def longest_no_repeats(str)
c_to_i = {}
longest = { length: 0, end: nil }
str.each_char.with_index do |c,i|
j = c_to_i[c]
if j
longest = { length: c_to_i.size, end: i-1 } if
c_to_i.size > longest[:length]
c_to_i.reject! { |_,k| k <= j }
end
c_to_i[c] = i
end
c_to_i.size > longest[:length] ? { length: c_to_i.size, end: str.size-1 } :
longest
end
Example
a = ('a'..'z').to_a
#=> ["a", "b",..., "z"]
str = 60.times.map { a.sample }.join
#=> "ekgdaxxzlwbxixhlfbpziswcoelplhobivoygmupdaexssbuuawxmhprkfms"
longest = longest_no_repeats(str)
#=> {:length=>14, :end=>44}
str[0..longest[:end]]
#=> "ekgdaxxzlwbxixhlfbpziswcoelplhobivoygmupdaexs"
str[longest[:end]-longest[:length]+1,longest[:length]]
#=> "bivoygmupdaexs"
Efficiency
Here is a benchmark comparison to @mechnicov's code:
require 'benchmark/ips'
a = ('a'..'z').to_a
arr = 50.times.map { 1000.times.map { a.sample }.join }
Benchmark.ips do |x|
x.report("mechnicov") { arr.sum { |s| max_non_repeated(s)[:length] } }
x.report("cary") { arr.sum { |s| longest_no_repeats(s)[:length] } }
x.compare!
end
displays:
Comparison:
cary: 35.8 i/s
mechnicov: 0.0 i/s - 1198.21x slower
From your link:
Input: "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3.
That means you need first non-repeated substring.
I suggest here is such method
def max_non_repeated(string)
max_string = string.
each_char.
map.with_index { |_, i| string[i..].split('') }.
map do |v|
ary = []
v.each { |l| ary << l if ary.size == ary.uniq.size }
ary.uniq.join
end.
max
{
string: max_string,
length: max_string.length
}
end
max_non_repeated('pwwkew')[:string] #=> "wke"
max_non_repeated('pwwkew')[:length] #=> 3
In Ruby < 2.6 use [i..-1]
instead of [i..]