I got a list of files and directories List<string> pathes
. Now I'd like to calculate the deepest common branch every path is sharing with each other.
We can assume that they all share a common path, but this is unknown in the beginning.
Let's say I have the following three entries:
- C:/Hello/World/This/Is/An/Example/Bla.cs
- C:/Hello/World/This/Is/Not/An/Example/
- C:/Hello/Earth/Bla/Bla/Bla
This should get the result: C:/Hello/ as Earth is breaking this "chain" of subdirectories.
Second example:
- C:/Hello/World/This/Is/An/Example/Bla.cs
- C:/Hello/World/This/Is/Not/An/Example/
-> C:/Hello/World/This/Is/
How would you proceed? I tried to use string.split(@"/") and start with the first string and check if every part of this array is contained in the other strings. However, this would be a very expensive call as I'm iterating (list_of_entries)^list_of_entries. Is there any better solution available?
My current attempt would be something like the following (C# + LINQ):
public string CalculateCommonPath(IEnumerable<string> paths)
{
int minSlash = int.MaxValue;
string minPath = null;
foreach (var path in paths)
{
int splits = path.Split('\\').Count();
if (minSlash > splits)
{
minSlash = splits;
minPath = path;
}
}
if (minPath != null)
{
string[] splits = minPath.Split('\\');
for (int i = 0; i < minSlash; i++)
{
if (paths.Any(x => !x.StartsWith(splits[i])))
{
return i >= 0 ? splits.Take(i).ToString() : "";
}
}
}
return minPath;
}
A function to get the longest common prefix may look like this:
Then you may need to cut the prefix on the right hand. E.g. we want to return
c:/dir
instead ofc:/dir/file
forYou also may want to normalize the paths before processing. See Normalize directory names in C#.
I dont know whether this is the best performing solution (probably not), but it surely is very easy to implement.
Sample Fiddle
Sample code:
And that function of course:
To return
c:/dir
forI would code it this way:
And here are some tests for it:
First sort the list with the paths to inspect. Then you can split and compare the first and the last item - if they are same proceed to the next dimension until you find a difference.
So you just need to sort once and then inspect two items.
I would iterate over each character in the first path, comparing it with every character in every path (except the first) in the collection of paths:
You could iterate through the list first to find the shortest path and possibly improve it slightly.