This is a challenge to come up with the most elegant JavaScript, Ruby or other solution to a relatively trivial problem.
This problem is a more specific case of the longest common substring problem. I only need to find the longest common starting substring in an array, which greatly simplifies the problem.
For example, the longest substring in [interspecies, interstelar, interstate] is "inters". However, I don't need to find "ific" in [specifics, terrific].
I've solved the problem by quickly coding up a solution in JavaScript as a part of my answer about shell-like tab-completion (test page here). Here is that solution, slightly tweaked:
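function common_substring(data) {
  var i, ch, memo, idx = 0
  do {
    memo = null
    // compare the character at position idx across all strings
    for (i=0; i < data.length; i++) {
      ch = data[i].charAt(idx)
      if (!ch) break
      if (!memo) memo = ch
      else if (ch != memo) break
    }
    // go on only while every string agreed at idx; the length check guards against an empty array
  } while (i == data.length && data.length > 0 && ++idx)
  return (data[0] || '').slice(0, idx)
}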
This code is available in this Gist along with a similar solution in Ruby. You can clone the gist as a git repo to try it out:
$ git clone git://gist.github.com/257891.git substring-challenge
I'm not very happy with those solutions. I have a feeling the problem could be solved with more elegance and less execution complexity, which is why I'm posting this challenge.
I'm going to accept as an answer the solution I find the most elegant or concise. Here is, for instance, a crazy Ruby hack I came up with, defining the & operator on String:
# works with Ruby 1.8.7 and above
class String
  # Returns the common prefix of self and other.
  def &(other)
    other = other.to_str
    difference = other.each_char.with_index.find { |ch, idx|
      self[idx].nil? or ch != self[idx].chr
    }
    # If no character differs, other is a prefix of self (or equal to it).
    difference ? self[0, difference.last] : self[0, other.length]
  end
end
class Array
  def common_substring
    self.inject(nil) { |memo, str| memo.nil? ? str : memo & str }.to_s
  end
end
Solutions in JavaScript or Ruby are preferred, but you can show off clever solutions in other languages as long as you explain what's going on. Only code from the standard library, please.
Update: my favorite solutions
I've chosen the JavaScript sorting solution by kennebec as the "answer" because it struck me as both unexpected and genius. If we disregard the complexity of actual sorting (let's imagine it's infinitely optimized by the language implementation), the complexity of the solution is just comparing two strings.
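To make the sorting idea concrete, here is a minimal sketch of how it could look. This is not kennebec's exact code; the function name commonPrefixBySort is made up for illustration. It relies on the fact that after a lexicographic sort, the first and last strings are the pair that diverge earliest, so only those two need comparing.

```javascript
// Hypothetical helper name; a sketch of the sort-based idea, not the original answer's code.
function commonPrefixBySort(strings) {
  if (strings.length === 0) return '';
  // Copy before sorting so the caller's array is left untouched.
  const sorted = strings.slice().sort();
  const first = sorted[0];
  const last = sorted[sorted.length - 1];
  let i = 0;
  // Whatever prefix the two extremes share is shared by every string between them.
  while (i < first.length && first.charAt(i) === last.charAt(i)) i++;
  return first.slice(0, i);
}

console.log(commonPrefixBySort(['interspecies', 'interstelar', 'interstate'])); // "inters"
```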
Other great solutions:
- "regex greed" by FM takes a minute or two to grasp, but then the elegance of it hits you. Yehuda Katz also made a regex solution, but it's more complex
- commonprefix in Python: Roberto Bonvallet used a feature made for handling filesystem paths to solve this problem
- Haskell one-liner is short as if it were compressed, and beautiful
- the straightforward Ruby one-liner
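For readers who want to see the "regex greed" trick spelled out, here is a rough JavaScript sketch of the general idea. It is not FM's original code, and the name commonPrefixByRegex is invented for this example. It assumes that no input string contains the NUL character, which is used as a separator. The greedy first group grabs as much of the first string as it can, and the backreference forces the engine to backtrack until every other string starts with the same text.

```javascript
// Hypothetical helper name; assumes no input string contains the NUL character (\u0000).
function commonPrefixByRegex(strings) {
  const joined = strings.join('\u0000');
  // Group 1 is the candidate prefix; each later string must begin with \1.
  const match = joined.match(/^([^\u0000]*)[^\u0000]*(?:\u0000\1[^\u0000]*)*$/);
  return match ? match[1] : '';
}

console.log(commonPrefixByRegex(['interspecies', 'interstelar', 'interstate'])); // "inters"
```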
Thanks for participating! As you can see from the comments, I learned a lot (even about Ruby).
My Haskell one-liner:
EDIT: barkmadley gave a good explanation of the code below. I'd also add that Haskell uses lazy evaluation, so we can be lazy about our use of transpose; it will only transpose our lists as far as necessary to find the end of the common prefix.
Fun alternative Ruby solution:
It might help speed if you used chars = strings.sort_by(&:length).map(&:chars), since the shorter the first string, the shorter the arrays created by zip. However, if you cared about speed, you probably shouldn't use this solution anyhow. :)
It's not code golf, but you asked for somewhat elegant, and I tend to think recursion is fun. Java.
Here's a solution using regular expressions in Ruby:
Instead of sorting, you could just get the min and max of the strings.
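As a rough illustration of that suggestion, the sketch below finds the lexicographically smallest and largest strings in a single pass and compares only those two. The function name commonPrefixByMinMax is an assumption made for this example, and JavaScript is used here for consistency with the question's code.

```javascript
// Hypothetical helper name; a sketch of the min/max idea, avoiding a full sort.
function commonPrefixByMinMax(strings) {
  if (strings.length === 0) return '';
  let min = strings[0];
  let max = strings[0];
  // One linear pass; < and > compare strings by UTF-16 code units, like the default sort.
  for (const s of strings) {
    if (s < min) min = s;
    if (s > max) max = s;
  }
  let i = 0;
  while (i < min.length && min.charAt(i) === max.charAt(i)) i++;
  return min.slice(0, i);
}

console.log(commonPrefixByMinMax(['interspecies', 'interstelar', 'interstate'])); // "inters"
```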
To me, elegance in a computer program is a balance of speed and simplicity. It should not do unnecessary computation, and it should be simple enough to make its correctness evident.
I could call the sorting solution "clever", but not "elegant".
Here's an efficient solution in Ruby. I based the idea on the strategy for a hi/lo guessing game, where you iteratively zero in on the longest prefix.
Someone correct me if I'm wrong, but I think the complexity is O(n log n), where n is the length of the shortest string and the number of strings is considered a constant.
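The hi/lo strategy can be sketched in JavaScript roughly as follows. This is an illustration of the idea rather than the Ruby code from this answer, and the name commonPrefixByBisection is made up. Each round tests a candidate prefix of up to n characters against every string, and the range halves each time, which is where the O(n log n) estimate comes from.

```javascript
// Hypothetical helper name; a binary-search sketch of the hi/lo idea.
function commonPrefixByBisection(strings) {
  if (strings.length === 0) return '';
  const first = strings[0];
  let lo = 0; // a prefix of this length is known to be shared
  let hi = Math.min(...strings.map(s => s.length)); // no shared prefix can be longer
  while (lo < hi) {
    // Round up so that lo always advances when the guess succeeds.
    const mid = Math.ceil((lo + hi) / 2);
    const candidate = first.slice(0, mid);
    if (strings.every(s => s.startsWith(candidate))) lo = mid;
    else hi = mid - 1;
  }
  return first.slice(0, lo);
}

console.log(commonPrefixByBisection(['interspecies', 'interstelar', 'interstate'])); // "inters"
```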
And some checks that it works: