Find the longest common starting substring in a se

2020-01-25 03:41发布

This is a challenge to come up with the most elegant JavaScript, Ruby or other solution to a relatively trivial problem.

This problem is a more specific case of the Longest common substring problem. I need to only find the longest common starting substring in an array. This greatly simplifies the problem.

For example, the longest substring in [interspecies, interstelar, interstate] is "inters". However, I don't need to find "ific" in [specifics, terrific].

I've solved the problem by quickly coding up a solution in JavaScript as a part of my answer about shell-like tab-completion (test page here). Here is that solution, slightly tweaked:

function common_substring(data) {
  var i, ch, memo, idx = 0
  do {
    memo = null
    for (i=0; i < data.length; i++) {
      ch = data[i].charAt(idx)
      if (!ch) break
      if (!memo) memo = ch
      else if (ch != memo) break
    }
  } while (i == data.length && idx < data.length && ++idx)

  return (data[0] || '').slice(0, idx)
}

This code is available in this Gist along with a similar solution in Ruby. You can clone the gist as a git repo to try it out:

$ git clone git://gist.github.com/257891.git substring-challenge

I'm not very happy with those solutions. I have a feeling they might be solved with more elegance and less execution complexity—that's why I'm posting this challenge.

I'm going to accept as an answer the solution I find the most elegant or concise. Here is for instance a crazy Ruby hack I come up with—defining the & operator on String:

# works with Ruby 1.8.7 and above
class String
  def &(other)
    difference = other.to_str.each_char.with_index.find { |ch, idx|
      self[idx].nil? or ch != self[idx].chr
    }
    difference ? self[0, difference.last] : self
  end
end

class Array
  def common_substring
    self.inject(nil) { |memo, str| memo.nil? ? str : memo & str }.to_s
  end
end

Solutions in JavaScript or Ruby are preferred, but you can show off clever solution in other languages as long as you explain what's going on. Only code from standard library please.

Update: my favorite solutions

I've chosen the JavaScript sorting solution by kennebec as the "answer" because it struck me as both unexpected and genius. If we disregard the complexity of actual sorting (let's imagine it's infinitely optimized by the language implementation), the complexity of the solution is just comparing two strings.

Other great solutions:

Thanks for participating! As you can see from the comments, I learned a lot (even about Ruby).

30条回答
你好瞎i
2楼-- · 2020-01-25 04:13

A ruby version based on @Svante's algorithm. Runs ~3x as fast as my first one.

 def common_prefix set 
   i=0
   rest=set[1..-1]
   set[0].each_byte{|c|
     rest.each{|e|return set[0][0...i] if e[i]!=c}
     i+=1
   }
   set
 end
查看更多
放荡不羁爱自由
3楼-- · 2020-01-25 04:14

My solution in Java:

public static String compute(Collection<String> strings) {
    if(strings.isEmpty()) return "";
    Set<Character> v = new HashSet<Character>();
    int i = 0;
    try {
        while(true) {
            for(String s : strings) v.add(s.charAt(i));
            if(v.size() > 1) break;
            v.clear();
            i++;
        }
    } catch(StringIndexOutOfBoundsException ex) {}
    return strings.iterator().next().substring(0, i);
}
查看更多
一纸荒年 Trace。
4楼-- · 2020-01-25 04:16

The accepted solution is broken (for example, it returns a for strings like ['a', 'ba']). The fix is very simple, you literally have to change only 3 characters (from indexOf(tem1) == -1 to indexOf(tem1) != 0) and the function would work as expected.

Unfortunately, when I tried to edit the answer to fix the typo, SO told me that "edits must be at least 6 characters". I could change more then those 3 chars, by improving naming and readability but that feels like a little bit too much.

So, below is a fixed and improved (at least from my point of view) version of the kennebec's solution:

function commonPrefix(words) {
  max_word = words.reduce(function(a, b) { return a > b ? a : b });
  prefix   = words.reduce(function(a, b) { return a > b ? b : a }); // min word

  while(max_word.indexOf(prefix) != 0) {
    prefix = prefix.slice(0, -1);
  }

  return prefix;
}

(on jsFiddle)

Note, that it uses reduce method (JavaScript 1.8) in order to find alphanumeric max / min instead of sorting the array and then fetching the first and the last elements of it.

查看更多
够拽才男人
5楼-- · 2020-01-25 04:16

While reading these answers with all the fancy functional programming, sorting and regexes and whatnot, I just thought: what's wrong a little bit of C? So here's a goofy looking little program.

#include <stdio.h>

int main (int argc, char *argv[])
{
  int i = -1, j, c;

  if (argc < 2)
    return 1;

  while (c = argv[1][++i])
    for (j = 2; j < argc; j++)
      if (argv[j][i] != c)
        goto out;

 out:
  printf("Longest common prefix: %.*s\n", i, argv[1]);
}

Compile it, run it with your list of strings as command line arguments, then upvote me for using goto!

查看更多
手持菜刀,她持情操
6楼-- · 2020-01-25 04:19

It's a matter of taste, but this is a simple javascript version: It sorts the array, and then looks just at the first and last items.

//longest common starting substring in an array

function sharedStart(array){
    var A= array.concat().sort(), 
    a1= A[0], a2= A[A.length-1], L= a1.length, i= 0;
    while(i<L && a1.charAt(i)=== a2.charAt(i)) i++;
    return a1.substring(0, i);
}

DEMOS

sharedStart(['interspecies', 'interstelar', 'interstate'])  //=> 'inters'
sharedStart(['throne', 'throne'])                           //=> 'throne'
sharedStart(['throne', 'dungeon'])                          //=> ''
sharedStart(['cheese'])                                     //=> 'cheese'
sharedStart([])                                             //=> ''
sharedStart(['prefix', 'suffix'])                           //=> ''
查看更多
可以哭但决不认输i
7楼-- · 2020-01-25 04:21

Javascript clone of AShelly's excellent answer.

Requires Array#reduce which is supported only in firefox.

var strings = ["interspecies", "intermediate", "interrogation"]
var sub = strings.reduce(function(l,r) { 
    while(l!=r.slice(0,l.length)) {  
        l = l.slice(0, -1);
    }
    return l;
});
查看更多
登录 后发表回答