Find the longest common starting substring in a se-第5页回答

This is a challenge to come up with the most elegant JavaScript, Ruby or other solution to a relatively trivial problem.

This problem is a more specific case of the Longest common substring problem. I need to only find the longest common starting substring in an array. This greatly simplifies the problem.

For example, the longest substring in [interspecies, interstelar, interstate] is "inters". However, I don't need to find "ific" in [specifics, terrific].

I've solved the problem by quickly coding up a solution in JavaScript as a part of my answer about shell-like tab-completion (test page here). Here is that solution, slightly tweaked:

function common_substring(data) {
  var i, ch, memo, idx = 0
  do {
    memo = null
    for (i=0; i < data.length; i++) {
      ch = data[i].charAt(idx)
      if (!ch) break
      if (!memo) memo = ch
      else if (ch != memo) break
    }
  } while (i == data.length && idx < data.length && ++idx)

  return (data[0] || '').slice(0, idx)
}

This code is available in this Gist along with a similar solution in Ruby. You can clone the gist as a git repo to try it out:

$ git clone git://gist.github.com/257891.git substring-challenge

I'm not very happy with those solutions. I have a feeling they might be solved with more elegance and less execution complexity—that's why I'm posting this challenge.

I'm going to accept as an answer the solution I find the most elegant or concise. Here is for instance a crazy Ruby hack I come up with—defining the & operator on String:

# works with Ruby 1.8.7 and above
class String
  def &(other)
    difference = other.to_str.each_char.with_index.find { |ch, idx|
      self[idx].nil? or ch != self[idx].chr
    }
    difference ? self[0, difference.last] : self
  end
end

class Array
  def common_substring
    self.inject(nil) { |memo, str| memo.nil? ? str : memo & str }.to_s
  end
end

Solutions in JavaScript or Ruby are preferred, but you can show off clever solution in other languages as long as you explain what's going on. Only code from standard library please.

Update: my favorite solutions

I've chosen the JavaScript sorting solution by kennebec as the "answer" because it struck me as both unexpected and genius. If we disregard the complexity of actual sorting (let's imagine it's infinitely optimized by the language implementation), the complexity of the solution is just comparing two strings.

Other great solutions:

"regex greed" by FM takes a minute or two to grasp, but then the elegance of it hits you. Yehuda Katz also made a regex solution, but it's more complex
commonprefix in Python — Roberto Bonvallet used a feature made for handling filesystem paths to solve this problem
Haskell one-liner is short as if it were compressed, and beautiful
the straightforward Ruby one-liner

Thanks for participating! As you can see from the comments, I learned a lot (even about Ruby).

标签： javascript python ruby haskell longest-prefix

30条回答

我命由我不由天

2楼-- · 2020-01-25 04:27

Yet another way to do it: use regex greed.

words = %w(interspecies interstelar interstate)
j = '='
str = ['', *words].join(j)
re = "[^#{j}]*"

str =~ /\A
    (?: #{j} ( #{re} ) #{re} )
    (?: #{j}    \1     #{re} )*
\z/x

p $1

And the one-liner, courtesy of mislav (50 characters):

p ARGV.join(' ').match(/^(\w*)\w*(?: \1\w*)*$/)[1]

0人赞添加讨论(0) 举报

虎瘦雄心在

3楼-- · 2020-01-25 04:27

Python 2.6 (r26:66714, Oct  4 2008, 02:48:43) 

>>> a = ['interspecies', 'interstelar', 'interstate']

>>> print a[0][:max(
        [i for i in range(min(map(len, a))) 
            if len(set(map(lambda e: e[i], a))) == 1]
        ) + 1]

inters

i for i in range(min(map(len, a))), number of maximum lookups can't be greater than the length of the shortest string; in this example this would evaluate to [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
len(set(map(lambda e: e[i], a))), 1) create an array of the i-thcharacter for each string in the list; 2) make a set out of it; 3) determine the size of the set
[i for i in range(min(map(len, a))) if len(set(map(lambda e: e[i], a))) == 1], include just the characters, for which the size of the set is 1 (all characters at that position were the same ..); here it would evaluate to [0, 1, 2, 3, 4, 5]
finally take the max, add one, and get the substring ...

Note: the above does not work for a = ['intersyate', 'intersxate', 'interstate', 'intersrate'], but this would:

 >>> index = len(
         filter(lambda l: l[0] == l[1], 
             [ x for x in enumerate(
                 [i for i in range(min(map(len, a))) 
                     if len(set(map(lambda e: e[i], a))) == 1]
         )]))
 >>> a[0][:index]
 inters

0人赞添加讨论(0) 举报

趁早两清

4楼-- · 2020-01-25 04:27

My Javascript solution:

IMOP, using sort is too tricky. My solution is compare letter by letter through looping the array. Return string if letter is not macthed.

This is my solution:

var longestCommonPrefix = function(strs){
    if(strs.length < 1){
        return '';
    }

    var p = 0, i = 0, c = strs[0][0];

    while(p < strs[i].length && strs[i][p] === c){
        i++;
        if(i === strs.length){
            i = 0;
            p++;
            c = strs[0][p];
        }
    }

    return strs[0].substr(0, p);
};

0人赞添加讨论(0) 举报

啃猪蹄的小仙女

5楼-- · 2020-01-25 04:30

Just for the fun of it, here's a version written in (SWI-)PROLOG:

common_pre([[C|Cs]|Ss], [C|Res]) :-
  maplist(head_tail(C), [[C|Cs]|Ss], RemSs), !,
  common_pre(RemSs, Res).
common_pre(_, []).

head_tail(H, [H|T], T).

Running:

?- S=["interspecies", "interstelar", "interstate"], common_pre(S, CP), string_to_list(CPString, CP).

Gives:

CP = [105, 110, 116, 101, 114, 115],
CPString = "inters".

Explanation:

(SWI-)PROLOG treats strings as lists of character codes (numbers). All the predicate common_pre/2 does is recursively pattern-match to select the first code (C) from the head of the first list (string, [C|Cs]) in the list of all lists (all strings, [[C|Cs]|Ss]), and appends the matching code C to the result iff it is common to all (remaining) heads of all lists (strings), else it terminates.

Nice, clean, simple and efficient... :)

0人赞添加讨论(0) 举报

ゆ、 Hurt°

6楼-- · 2020-01-25 04:30

Oftentimes it's more elegant to use a mature open source library instead of rolling your own. Then, if it doesn't completely suit your needs, you can extend it or modify it to improve it, and let the community decide if that belongs in the library.

diff-lcs is a good Ruby gem for least common substring.

0人赞添加讨论(0) 举报

霸刀☆藐视天下

7楼-- · 2020-01-25 04:30

This is by no means elegant, but if you want concise:

Ruby, 71 chars

def f(a)b=a[0];b[0,(0..b.size).find{|n|a.any?{|i|i[0,n]!=b[0,n]}}-1]end

If you want that unrolled it looks like this:

def f(words)
  first_word = words[0];
  first_word[0, (0..(first_word.size)).find { |num_chars|
    words.any? { |word| word[0, num_chars] != first_word[0, num_chars] }
  } - 1]
end

0人赞添加讨论(0) 举报

上一页 1 2 3 4 5

Find the longest common starting substring in a se

Update: my favorite solutions

Ruby, 71 chars

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间