Lexicographical sorting

2019-03-13 02:21发布

问题:

I'm doing a problem that says "concatenate the words to generate the lexicographically lowest possible string." from a competition.

Take for example this string: jibw ji jp bw jibw

The actual output turns out to be: bw jibw jibw ji jp

When I do sorting on this, I get: bw ji jibw jibw jp.

Does this mean that this is not sorting? If it is sorting, does "lexicographic" sorting take into consideration pushing the shorter strings to the back or something?

I've been doing some reading on lexigographical order and I don't see any point or scenarios on which this is used, do you have any?

回答1:

It seems that what you're looking for is a better understanding of the question, so let me just make it clear. The usual sorting on strings is lexicographic sorting. If you sort the strings [jibw, ji, jp, bw, jibw] into lexicographic order, the sorted sequence is [bw, ji, jibw, jibw, jp], which is what you got. So your problem is not with understanding the word "lexicographic"; you already understand it correctly.

Your problem is that you're misreading the question. The question doesn't ask you to sort the strings in lexicographic order. (If it did, the answer you got by sorting would be correct.) Instead, it asks you to produce one string, got by concatenating the input strings in some order (i.e., making one string without spaces), so that the resulting single string is lexicographically minimal.

To illustrate the difference, consider the string you get by concatenating the sorted sequence, and the answer string:

bwjijibwjibwjp //Your answer
bwjibwjibwjijp //The correct answer

Now when you compare these two strings — note that you're just comparing two 14-character strings, not two sequences of strings — you can see that the correct answer is indeed lexicographically smaller than your answer: your answer starts with "bwjij", while the correct answer starts with "bwjib", and "bwjib" comes before "bwjij" in lexicographic order.

Hope you understand the question now. It is not a sorting question at all. (That is, it is not a problem of sorting the input strings. You could do sorting on all possible strings got by permuting and concatenating the input strings; this is one way of solving the problem if the number of input strings is small.)



回答2:

You can convert this into a trivial sorting problem by comparing word1 + word2 against word2 + word1. In Python:

def cmp_concetanate(word1, word2):
    c1 = word1 + word2
    c2 = word2 + word1
    if c1 < c2:
        return -1
    elif c1 > c2:
        return 1
    else:
        return 0

Using this comparison function with the standard sort solves the problem.



回答3:

I've been using F# in this Facebook hacker cup. Learned quite a bit in this competition. Since the documentation on F# on the web is still rare, I think I might as well share a bit here.

This problem requests you to sort a list of strings based on a customized comparison method. Here is my code snippet in F#.


    let comparer (string1:string) (string2:string) =
         String.Compare(string1 + string2, string2 + string1)

    // Assume words is an array of strings that you read from the input
    // Do inplace sorting there
    Array.sortInPlaceWith comparer words
    // result contains the string for output
    let result = Array.fold (+) "" words



回答4:

//Use this block of code to print lexicographically sorted characters of an array or it can be used in many ways.

  #include<stdio.h>
  #include<conio.h>

  void combo(int,int,char[],char[],int*,int*,int*);

  void main()
  {
      char a[4]={'a','b','c'};
      char a1[10];
      int i=0,no=0;
      int l=0,j=0;
      combo(0,3,a,a1,&j,&l,&no);
      printf("%d",no);
      getch();
  }
  void combo(int ctr,int n,char a[],char a1[],int*j,int*l,int*no)
  {
      int i=0;
      if(ctr==n)
      {
        for(i=0;i<n;i++)
            printf("%c",a1[i]);
        printf("\n");
        (*no)++;
        (*j)++;
        if((*j)==n)
        { 
            *l=0;
             *j=0;
        }
        else
        *l=1;       
        getch();
      }
      else
        for(i=0;i<n;i++)
        {
        if(*l!=1)
            *j=i;
        a1[ctr]=a[*j];
        combo(ctr+1,n,a,a1,j,l,no);
        }
    }


回答5:

The example you posted shows that mere sorting would not generate the lexicographically lowest string. For the given problem, you would need to apply some additional trick to determine which string should come before which(as of now, I can't think of the exact method)

The actual output does not violate the condition for lexicographically lowest word.



回答6:

The sort command on linux also does Lexicographic sorting and generates the output in the order bw ji jibw jibw jp



回答7:

Check what happened here:

If you just apply a lexicographic sort you'll get bw ji jibw jibw jp but if you analyze token by token you'll find that "bwjibw" (bw, jibw) is lexicographicaly lower than "bwjijibw" (bw, ji, jibw) that's why the answer is bw jibw jibw ji jp because first you should append bwjibwjibw and after that you could concatenate ji and jp to get the lowest string.



回答8:

A simple trick involving only sorting, which would work for this problem as the max string length is specified, would be to pad all strings up to max length with the first letter in the string. Then you sort the padded strings, but output the original unpadded ones. For ex. for string length 2 and inputs b and ba you would sort bb and ba which would give you ba and bb, and hence you should output bab.



回答9:

Prasun's trick works if you instead pad with a special "placeholder" character that could be weighted to be greater than "z" in a string sort function. The result would give you the order of lowest lexicographic combination.



回答10:

The contest is over so I am posting a possible solution, not the most efficient but one way of doing it

 #include <iostream>
 #include <fstream>
 #include <string>
    #include <algorithm>
    using namespace std;
   int main()
  {
ofstream myfile;
myfile.open("output.txt");
int numTestCases;
int numStrings;
string* ptr=NULL;
char*ptr2=NULL;
string tosort;
scanf("%d",&numTestCases);
for(int i=0;i<numTestCases;i++)
{
    scanf("%d",&numStrings);
    ptr=new string[numStrings];
    for(int i=0;i<numStrings;i++)
    {
        cin>>ptr[i];
    }
    sort(ptr,ptr+numStrings);
    for(int i=0;i<numStrings;i++)
    {
        next_permutation(ptr,ptr+numStrings);
    }
    tosort.clear();
    for(int i=0;i<numStrings;i++)
    {
        tosort.append(ptr[i]);
    }
    ptr2=&tosort[i];

    cout<<tosort<<endl;
    myfile<<tosort<<endl;   
    delete[]ptr;
}
return 0;
  }

I am using algorithms from the STL library in c++, the prev_permutation function simply generates a permutation sorted lexicographically