可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I got a list of files and directories List<string> pathes
. Now I'd like to calculate the deepest common branch every path is sharing with each other.
We can assume that they all share a common path, but this is unknown in the beginning.
Let's say I have the following three entries:
- C:/Hello/World/This/Is/An/Example/Bla.cs
- C:/Hello/World/This/Is/Not/An/Example/
- C:/Hello/Earth/Bla/Bla/Bla
This should get the result: C:/Hello/ as Earth is breaking this "chain" of subdirectories.
Second example:
- C:/Hello/World/This/Is/An/Example/Bla.cs
- C:/Hello/World/This/Is/Not/An/Example/
-> C:/Hello/World/This/Is/
How would you proceed? I tried to use string.split(@"/") and start with the first string and check if every part of this array is contained in the other strings. However, this would be a very expensive call as I'm iterating (list_of_entries)^list_of_entries. Is there any better solution available?
My current attempt would be something like the following (C# + LINQ):
public string CalculateCommonPath(IEnumerable<string> paths)
{
int minSlash = int.MaxValue;
string minPath = null;
foreach (var path in paths)
{
int splits = path.Split('\\').Count();
if (minSlash > splits)
{
minSlash = splits;
minPath = path;
}
}
if (minPath != null)
{
string[] splits = minPath.Split('\\');
for (int i = 0; i < minSlash; i++)
{
if (paths.Any(x => !x.StartsWith(splits[i])))
{
return i >= 0 ? splits.Take(i).ToString() : "";
}
}
}
return minPath;
}
回答1:
A function to get the longest common prefix may look like this:
public static string GetLongestCommonPrefix(string[] s)
{
int k = s[0].Length;
for (int i = 1; i < s.Length; i++)
{
k = Math.Min(k, s[i].Length);
for (int j = 0; j < k; j++)
if (s[i][j] != s[0][j])
{
k = j;
break;
}
}
return s[0].Substring(0, k);
}
Then you may need to cut the prefix on the right hand. E.g. we want to return c:/dir
instead of c:/dir/file
for
c:/dir/file1
c:/dir/file2
You also may want to normalize the paths before processing. See Normalize directory names in C#.
回答2:
I dont know whether this is the best performing solution (probably not), but it surely is very easy to implement.
- Sort your list alphabetically
- compare the first entry in that sorted list to the last in that list, character by character, and terminate when you find a difference (the value before the termination is the longest shared substring of both those strings)
Sample Fiddle
Sample code:
List<string> paths = new List<string>();
paths.Add(@"C:/Hello/World/This/Is/An/Example/Bla.cs");
paths.Add(@"C:/Hello/World/This/Is/Not/An/Example/");
paths.Add(@"C:/Hello/Earth/Bla/Bla/Bla");
List<string> sortedPaths = paths.OrderBy(s => s).ToList();
Console.WriteLine("Most common path here: {0}", sharedSubstring(sortedPaths[0], sortedPaths[sortedPaths.Count - 1]));
And that function of course:
public static string sharedSubstring(string string1, string string2)
{
string ret = string.Empty;
int index = 1;
while (string1.Substring(0, index) == string2.Substring(0, index))
{
ret = string1.Substring(0, index);
index++;
}
return ret;
} // returns an empty string if no common characters where found
回答3:
I would iterate over each character in the first path, comparing it with every character in every path (except the first) in the collection of paths:
public string FindCommonPath(List<string> paths)
{
string firstPath = paths[0];
bool same = true;
int i = 0;
string commonPath = string.Empty;
while (same && i < firstPath.Length)
{
for (int p = 1; p < paths.Count && same; p++)
{
same = firstPath[i] == paths[p][i];
}
if (same)
{
commonPath += firstPath[i];
}
i++;
}
return commonPath;
}
You could iterate through the list first to find the shortest path and possibly improve it slightly.
回答4:
First sort the list with the paths to inspect. Then you can split and compare the first and the last item - if they are same proceed to the next dimension until you find a difference.
So you just need to sort once and then inspect two items.
回答5:
To return c:/dir
for
c:/dir/file1
c:/dir/file2
I would code it this way:
public static string GetLongestCommonPrefix(params string[] s)
{
return GetLongestCommonPrefix((ICollection<string>)s);
}
public static string GetLongestCommonPrefix(ICollection<string> paths)
{
if (paths == null || paths.Count == 0)
return null;
if (paths.Count == 1)
return paths.First();
var allSplittedPaths = paths.Select(p => p.Split('\\')).ToList();
var min = allSplittedPaths.Min(a => a.Length);
var i = 0;
for (i = 0; i < min; i++)
{
var reference = allSplittedPaths[0][i];
if (allSplittedPaths.Any(a => !string.Equals(a[i], reference, StringComparison.OrdinalIgnoreCase)))
{
break;
}
}
return string.Join("\\", allSplittedPaths[0].Take(i));
}
And here are some tests for it:
[TestMethod]
public void GetLongestCommonPrefixTest()
{
var str1 = @"C:\dir\dir1\file1";
var str2 = @"C:\dir\dir1\file2";
var str3 = @"C:\dir\dir1\file3";
var str4 = @"C:\dir\dir2\file3";
var str5 = @"C:\dir\dir1\file1\file3";
var str6 = @"C:\dir\dir1\file1\file3";
var res = Utilities.GetLongestCommonPrefix(str1, str2, str3);
Assert.AreEqual(@"C:\dir\dir1", res);
var res2 = Utilities.GetLongestCommonPrefix(str1, str2, str3, str4);
Assert.AreEqual(@"C:\dir", res2);
var res3 = Utilities.GetLongestCommonPrefix(str1, str2, str3, str5);
Assert.AreEqual(@"C:\dir\dir1", res3);
var res4 = Utilities.GetLongestCommonPrefix(str5, str6);
Assert.AreEqual(@"C:\dir\dir1\file1\file3", res4);
var res5 = Utilities.GetLongestCommonPrefix(str5);
Assert.AreEqual(str5, res5);
var res6 = Utilities.GetLongestCommonPrefix();
Assert.AreEqual(null, res6);
var res7 = Utilities.GetLongestCommonPrefix(null);
Assert.AreEqual(null, res7);
}