Justify text in Java

Posted 2020-02-14 20:53

Question:

I have to read in an integer which will be the length of the succeeding lines. (The lines of text will never be longer than the length provided).

I then have to read in each line of text and convert the spaces to an underscore as evenly as possible. For example:

I would enter a line length of 30, then a line of text: Hello this is a test string. All of the spaces are then converted to underscores and padded out so that the text fills the given line length, like so: Hello__this__is__a_test_string. As you can see, the original text had a length of 27 characters, so to pad it out to 30 characters I had to add 3 extra spaces to the original text and then convert those spaces to the underscore character.

Please can you advise a way that I can go about this?

Answer 1:

What I do is split the sentence into words and figure out how many spaces need to be added. Then iterate over the words, adding a space to each one until you run out of spaces to add. If you need to add more than one space per word (say you have 5 words but need to add 13 spaces), divide the number of spaces by the number of words that receive spaces (every word except the last) and add that many to each word first. Then take the remainder and iterate across the words, adding one more space to each until you're done. Make sure that you only add spaces to all but the last word in the sentence.
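
A minimal sketch of the approach described above, assuming words are separated by single spaces and the line is never longer than the requested width. The class and method names (WordJustifier, justify) are made up for illustration, and underscores are written directly instead of spaces that get converted later.

```java
class WordJustifier {
    // Pads the gaps between words so the result is exactly `width` characters long.
    static String justify(String text, int width) {
        String[] words = text.trim().split("\\s+");
        int gaps = words.length - 1;          // the last word never receives padding
        if (gaps == 0) {
            return words[0];                  // a single word has no gap to widen
        }
        int letters = 0;
        for (String w : words) {
            letters += w.length();
        }
        int fill = width - letters;           // total underscores to place in all gaps
        int perGap = fill / gaps;             // every gap gets at least this many
        int leftover = fill % gaps;           // the first `leftover` gaps get one more
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            out.append(words[i]);
            if (i < gaps) {
                int count = perGap + (i < leftover ? 1 : 0);
                for (int k = 0; k < count; k++) {
                    out.append('_');
                }
            }
        }
        return out.toString();
    }
}
```

Calling WordJustifier.justify("Hello this is a test string", 30) gives Hello__this__is__a_test_string, the result asked for in the question.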



Answer 2:

I had to do something similar to this in Java recently. The code itself is relatively straightforward. What took me the longest was getting my head around the justification process.

I started by making a step by step process of how I would justify text manually.

  1. Find out how long the line is
  2. Find out how long the string is which is on said line
  3. Calculate the number of spaces required to add to the string to equal the line length
  4. Find out how many gaps there are between the words in the string
  5. Calculate how many spaces to add to each gap in the string
  6. Add result to each gap
  7. Calculate how many extra spaces are left over to add serially to the gaps (when the number of spaces to add is not divisible by the number of gaps; for example, if you have 5 gaps but 6 spaces to add)
  8. Add extra spaces to gaps
  9. Convert spaces to underscores
  10. Return string

Doing this made coding the algorithm much simpler for me!

Finding out how long the line and the string on said line are

You said you have already read in the line length and the text on the line, so steps 1 and 2 are done, with step 2 being a simple string.length() call.

Calculating the number of spaces required to add to the string to equal the line length is simply taking the line length and subtracting the length of the string.

int noofspacestoadd = lineLength - string.length();

Finding out how many gaps there are between all the words in the string

There is probably more than one way of doing this. I found that the easiest way was to convert the string into a char[] and then iterate through the characters, checking for ' ' and incrementing a count each time one is found.
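
The counting step could look something like this. It is only a sketch: the names GapCounter and countGaps are invented here, and it assumes each gap is exactly one space.

```java
class GapCounter {
    // Counts the spaces in the line; with single-spaced text this equals the number of gaps.
    static int countGaps(String line) {
        int gaps = 0;
        for (char c : line.toCharArray()) {
            if (c == ' ') {
                gaps++;
            }
        }
        return gaps;
    }
}
```

For the line from the question, GapCounter.countGaps("Hello this is a test string") returns 5.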

Calculating how many spaces to add to each gap

This is a simple division calculation!

int noofspacestoaddtoeachgap = noofspacestoadd / noofgaps;

Note: You have to make sure you're doing this division with integers! Mathematically 5 / 2 = 2.5, which tells you that you have to add 2 spaces to each gap between the words, and division with ints truncates the fractional part to give you exactly that integer.
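
To see the truncation in action, here is a tiny sketch. The class and method names are made up; the parameter names follow the ones used in this answer, and noofgaps is assumed to be greater than zero.

```java
class IntegerDivisionDemo {
    // Spaces every gap receives: int division drops the fractional part.
    static int perGap(int noofspacestoadd, int noofgaps) {
        return noofspacestoadd / noofgaps;
    }

    // Spaces still undistributed after every gap has received perGap(...) spaces.
    static int leftOver(int noofspacestoadd, int noofgaps) {
        return noofspacestoadd % noofgaps;
    }
}
```

With 5 spaces and 2 gaps, perGap returns 2 and leftOver returns 1. With 6 spaces and 5 gaps, both return 1.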

Add the result to each gap

Before you can add the required number of spaces to each gap, you need to convert that number into a string of spaces. So you need to write a method that converts a given integer into a string containing that many spaces. Again, this can be done in different ways. The way I did it was something like this:

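// Builds a string of noofspacestoaddtoeachgap spaces (the method name is illustrative).
static String spaces(int noofspacestoaddtoeachgap)
{
    String s = "";
    for(int i=noofspacestoaddtoeachgap; i>0; i--)
    {
        s+= " ";
    }
    return s;
}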

The way I did this was to convert the string into an array of substrings, with each substring being one word of the string. If you look up the String class in the Javadoc, you should find the methods you can use to achieve this!

When you have your array of substrings, you can then add the string of spaces to the end of each substring to form your new substring!

Calculating how many extra spaces are left over

This is again a simple calculation. Using the % operator you can do a remainder division similar to the division we did earlier.

int noofextraspaces = noofspacestoadd % noofgaps;

The result of this calculation gives us the number of extra spaces required to justify the text.

Add the extra spaces serially to each gap

This is probably the most difficult part of the algorithm, as you have to work out a way of iterating through each gap between the words and add an extra space until there are no more extra spaces left to add!
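
One way to hand out the leftover spaces is sketched below. It assumes the words are already in an array and that the number of extras is smaller than the number of gaps (which holds when it comes from a remainder). The names ExtraSpaces and addExtraSpaces are invented for this example.

```java
class ExtraSpaces {
    // Appends one extra space to each word from the left until the extras run out.
    // The last word is never padded, because no gap follows it.
    static void addExtraSpaces(String[] words, int extra) {
        for (int i = 0; i < words.length - 1 && extra > 0; i++, extra--) {
            words[i] += " ";
        }
    }
}
```

Given the array {"Hello ", "this ", "is"} and one extra space, only the first element changes: it becomes "Hello" followed by two spaces.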

Return string

return string;


Answer 3:

Let's try to break the problem down:

Subtract the length of the string from 30 - that's the number of extra spaces you'll be adding somewhere (3 in this case).

Count the number of existing spaces (5 in this case).

Now you know that you need to distribute that first number of extra spaces into the existing spaces as evenly as possible (in this case, distribute 3 into 5).

Think about how you would distribute something like this in real life, say balls into buckets. You would probably rotate through your buckets, dropping a ball in each one until you ran out. So consider how you might achieve this in your Java code (hint: look at the different kinds of loops).
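
The balls-into-buckets idea might be sketched like this, with gaps as the buckets and extra spaces as the balls. RoundRobin and distribute are hypothetical names chosen for the example.

```java
class RoundRobin {
    // Returns how many extra spaces each gap ends up with.
    static int[] distribute(int extraSpaces, int gaps) {
        int[] perGap = new int[gaps];
        if (gaps == 0) {
            return perGap;                 // nothing to distribute into
        }
        int gap = 0;
        while (extraSpaces > 0) {
            perGap[gap]++;                 // drop one "ball" into the current "bucket"
            extraSpaces--;
            gap = (gap + 1) % gaps;        // move on, wrapping back to the first gap
        }
        return perGap;
    }
}
```

Distributing 3 extra spaces over 5 gaps gives [1, 1, 1, 0, 0]. Distributing 13 over 5 gives [3, 3, 3, 2, 2].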



Answer 4:

The way I would go about this is to use a loop with regular-expression replacements.

  1. Replace all spaces with underscores.
  2. For each char necessary to get the length up to the desired length, replace a single underscore with two underscores. Use regular expressions to make sure that these replacements only happen where the desired number of underscores does not already exist. See the Javadoc for String.replaceFirst(). You'll also need to account for the possibility that you have to replace double underscores with triples.

After you do the initial replacement, I'd suggest you use a while loop, bounded on the length of the string being less than the target size. Initialize int numUnderscores = 1; outside of the while. Then the steps inside the loop will be:

  1. Build the replacement pattern. This should be something like "[^_](_{" + numUnderscores + "})[^_]", which says "any char that is not an underscore, followed by numUnderscores instances of the underscore char, followed by any char that is not an underscore"
  2. Call replaceFirst() to perform the replacement
  3. Check to see if the string contains any remaining instances of the current number of underscores; if it does not, then you must increment numUnderscores

Obviously, since this is a homework problem, I'm leaving the actual process of writing the code as an exercise. If you have specific questions about some piece of it, or about some component of the logic structure I described, just ask in comments!

The benefit of doing things this way is that it will work for any size string, and is very configurable for different situations.
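
For readers who want to compare notes afterwards, here is one possible sketch of the loop. It departs from the pattern described above in one respect: it uses lookarounds, (?<!_) and (?!_), instead of [^_] so that the neighbouring letters are not consumed by the match. The names RegexJustify and justify are invented for the example.

```java
class RegexJustify {
    static String justify(String text, int width) {
        String result = text.replace(' ', '_');
        if (!result.contains("_")) {
            return result;                       // no gap to widen, so nothing can be padded
        }
        int numUnderscores = 1;
        while (result.length() < width) {
            // A run of exactly numUnderscores underscores that is not part of a longer run.
            String pattern = "(?<!_)_{" + numUnderscores + "}(?!_)";
            // "$0_" re-inserts the matched run followed by one more underscore.
            String widened = result.replaceFirst(pattern, "$0_");
            if (widened.equals(result)) {
                numUnderscores++;                // no run of this size left, so move to the next size
            } else {
                result = widened;
            }
        }
        return result;
    }
}
```

RegexJustify.justify("Hello this is a test string", 30) returns Hello__this__is__a_test_string.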



Answer 5:

The hardest thing about this problem is defining "as evenly as possible".

Your example:

 Hello__this__is__a_test_string

... makes all the longer gaps be at the left. Wouldn't:

 Hello__this_is__a_test__string

... fit the imprecise description of the problem better, with the longer gaps spread evenly through the output string?
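
If you did want the longer gaps spread out like that, one possible way (not the only one) is to track how many extra characters are "owed" after each gap, rounded to the nearest whole number. This is a sketch under the assumptions that words are separated by single spaces and the line is no longer than the width. EvenSpread and justify are made-up names.

```java
class EvenSpread {
    static String justify(String text, int width) {
        String[] words = text.split(" ");
        int gaps = words.length - 1;
        int extra = width - text.length();       // characters to add beyond the existing spaces
        StringBuilder out = new StringBuilder(words[0]);
        for (int i = 1; i <= gaps; i++) {
            // Extras owed once gap i is written, and once gap i-1 was written,
            // each rounded to the nearest integer.
            int owedNow = (i * extra + gaps / 2) / gaps;
            int owedBefore = ((i - 1) * extra + gaps / 2) / gaps;
            out.append('_');                     // the original space
            for (int k = 0; k < owedNow - owedBefore; k++) {
                out.append('_');                 // this gap's share of the extras
            }
            out.append(words[i]);
        }
        return out.toString();
    }
}
```

For the sample line and a width of 30 this produces Hello__this_is__a_test__string, the alternative layout shown above.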

However, let's solve it so it gives the sample answer.

  • First you need to know how many extra characters you need to insert -- numNewChars == lengthWanted minus inputString.length()
  • Next you need to count how many gaps there are to distribute these new characters between -- call that numGaps -- it's the number of words minus one.
  • In each space you will insert either n or n+1 new spaces. n is numNewChars / numGaps -- integer division; rounds down.
  • Now, how many times do you need to insert n+1 new spaces instead of n? It's the remainder: plusOnes = numNewChars % numGaps

That's all the numbers you need. Now using whatever method you've been taught (since this is evidently a homework problem, you don't want to use language features or libraries that haven't been covered in your lessons), go through the string:

  • For the first plusOnes spaces, insert n+1 spaces, in addition to the space that's already there.
  • For the remaining spaces, insert n spaces.

One very basic method would be as follows:

String output= "";
for(int i=0; i<input.length(); i++) {
    char c = input.charAt(i);
    if(c == ' ') {
        output += ...; // appropriate number of "_" chars
    } else {
        output += "" + c; // "" + just turns the char into a String.
    }
}


Answer 6:

I wrote a simple method to justify text. It's not 100% accurate, but it works for the most part (it ignores punctuation completely, and there might be some edge cases missing too). Also, Word justifies text in a richer manner (not by adding spaces to fill up the gap, but by evenly distributing the width of the whitespace, which is tricky to do here).

public static void justifyText (String text) {
    final int STR_LENGTH = 80;
    int end, extraSpacesPerWord, spillOverSpace;
    String[] words;

    System.out.println("Original Text: \n" + text);
    System.out.println("Justified Text: ");

    // Emit full lines while more than one line's worth of text remains.
    while(text.length() > STR_LENGTH) {

        // Break at the last space that still lets the line fit.
        end = text.lastIndexOf(" ", STR_LENGTH);
        if(end <= 0) {
            // No usable space (a word longer than the line): hard-break it.
            System.out.println(text.substring(0, STR_LENGTH));
            text = text.substring(STR_LENGTH);
            continue;
        }

        words = text.substring(0, end).split(" ");
        int gaps = words.length - 1;
        // STR_LENGTH - end is the padding needed to reach the full width.
        extraSpacesPerWord = gaps > 0 ? (STR_LENGTH - end) / gaps : 0;
        spillOverSpace = gaps > 0 ? (STR_LENGTH - end) % gaps : 0;

        StringBuilder line = new StringBuilder();
        for(int i = 0; i < words.length; i++) {
            line.append(words[i]);
            if(i < gaps) {
                // Original space + even share + one more for the first few gaps.
                int count = 1 + extraSpacesPerWord + (i < spillOverSpace ? 1 : 0);
                for(int k = 0; k < count; k++) {
                    line.append(' ');
                }
            }
        }
        System.out.println(line);
        text = text.substring(end+1);

    }
    // The last line is printed as-is (left-aligned).
    System.out.println(text);

}


Answer 7:

You just need to call the fullJustify() method, passing in the list of words along with the maximum width you want for each line of output.

// Needs: import java.util.ArrayList; import java.util.List;
public List<String> fullJustify(String[] words, int maxWidth) {
    int n = words.length;
    List<String> justifiedText = new ArrayList<>();
    int currLineIndex = 0;
    int nextLineIndex = getNextLineIndex(currLineIndex, maxWidth, words);
    while (currLineIndex < n) {
        StringBuilder line = new StringBuilder();
        for (int i = currLineIndex; i < nextLineIndex; i++) {
            line.append(words[i] + " ");
        }
        currLineIndex = nextLineIndex;
        nextLineIndex = getNextLineIndex(currLineIndex, maxWidth, words);
        justifiedText.add(line.toString());
    }
    for (int i = 0; i < justifiedText.size() - 1; i++) {
        String fullJustifiedLine = getFullJustifiedString(justifiedText.get(i).trim(), maxWidth);
        justifiedText.remove(i);
        justifiedText.add(i, fullJustifiedLine);
    }
    String leftJustifiedLine = getLeftJustifiedLine(justifiedText.get(justifiedText.size() - 1).trim(), maxWidth);
    justifiedText.remove(justifiedText.size() - 1);
    justifiedText.add(leftJustifiedLine);
    return justifiedText;
}

public static int getNextLineIndex(int currLineIndex, int maxWidth, String[] words) {
    int n = words.length;
    int width = 0;
    while (currLineIndex < n && width < maxWidth) {
        width += words[currLineIndex++].length() + 1;
    }
    if (width > maxWidth + 1)
        currLineIndex--;
    return currLineIndex;
}

public String getFullJustifiedString(String line, int maxWidth) {
    StringBuilder justifiedLine = new StringBuilder();
    String[] words = line.split(" ");
    int occupiedCharLength = 0;
    for (String word : words) {
        occupiedCharLength += word.length();
    }
    int remainingSpace = maxWidth - occupiedCharLength;
    int spaceForEachWordSeparation = words.length > 1 ? remainingSpace / (words.length - 1) : remainingSpace;
    int extraSpace = remainingSpace - spaceForEachWordSeparation * (words.length - 1);
    for (int j = 0; j < words.length - 1; j++) {
        justifiedLine.append(words[j]);
        for (int i = 0; i < spaceForEachWordSeparation; i++)
            justifiedLine.append(" ");
        if (extraSpace > 0) {
            justifiedLine.append(" ");
            extraSpace--;
        }
    }
    justifiedLine.append(words[words.length - 1]);
    for (int i = 0; i < extraSpace; i++)
        justifiedLine.append(" ");
    return justifiedLine.toString();
}

public String getLeftJustifiedLine(String line, int maxWidth) {
    int lineWidth = line.length();
    StringBuilder justifiedLine = new StringBuilder(line);
    for (int i = 0; i < maxWidth - lineWidth; i++)
        justifiedLine.append(" ");
    return justifiedLine.toString();
}

Below is a sample conversion where maxWidth was 80 characters. The following paragraph contains exactly 115 words, and it took 55 ms to write the converted text to an external file.

I've tested this code for a paragraph of about 70k+ words and it took approx 400 ms to write the converted text to a file.

Input

These features tend to make legal writing formal. This formality can take the form of long sentences, complex constructions, archaic and hyper-formal vocabulary, and a focus on content to the exclusion of reader needs. Some of this formality in legal writing is necessary and desirable, given the importance of some legal documents and the seriousness of the circumstances in which some legal documents are used. Yet not all formality in legal writing is justified. To the extent that formality produces opacity and imprecision, it is undesirable. To the extent that formality hinders reader comprehension, it is less desirable. In particular, when legal content must be conveyed to nonlawyers, formality should give way to clear communication.

Output

These  features  tend  to make legal writing formal. This formality can take the
form   of  long  sentences,  complex  constructions,  archaic  and  hyper-formal
vocabulary,  and  a  focus  on content to the exclusion of reader needs. Some of
this formality in legal writing is necessary and desirable, given the importance
of  some  legal documents and the seriousness of the circumstances in which some
legal  documents  are used. Yet not all formality in legal writing is justified.
To   the   extent  that  formality  produces  opacity  and  imprecision,  it  is
undesirable.  To  the  extent that formality hinders reader comprehension, it is
less   desirable.  In  particular,  when  legal  content  must  be  conveyed  to
nonlawyers, formality should give way to clear communication.                   


Answer 8:

I followed Shahroz Saleem's answer (my rep is too low to comment :/). However, I needed one minor change, as it does not take into account words longer than the line length (such as URLs in the text).

import java.util.ArrayList;
import java.util.List;

public class Utils {

    public static List<String> fullJustify(String words, int maxWidth) {

        return fullJustify(words.split(" "), maxWidth);
    }

    public static List<String> fullJustify(String[] words, int maxWidth) {
        int n = words.length;
        List<String> justifiedText = new ArrayList<>();
        int currLineIndex = 0;
        int nextLineIndex = getNextLineIndex(currLineIndex, maxWidth, words);
        while (currLineIndex < n) {
            StringBuilder line = new StringBuilder();
            for (int i = currLineIndex; i < nextLineIndex; i++) {
                line.append(words[i] + " ");
            }
            currLineIndex = nextLineIndex;
            nextLineIndex = getNextLineIndex(currLineIndex, maxWidth, words);
            justifiedText.add(line.toString());
        }
        for (int i = 0; i < justifiedText.size() - 1; i++) {
            String fullJustifiedLine = getFullJustifiedString(justifiedText.get(i).trim(), maxWidth);
            justifiedText.remove(i);
            justifiedText.add(i, fullJustifiedLine);
        }
        String leftJustifiedLine = getLeftJustifiedLine(justifiedText.get(justifiedText.size() - 1).trim(), maxWidth);
        justifiedText.remove(justifiedText.size() - 1);
        justifiedText.add(leftJustifiedLine);
        return justifiedText;
    }

    public static int getNextLineIndex(int currLineIndex, int maxWidth, String[] words) {
        int n = words.length;
        int width = 0;
        int count = 0;
        while (currLineIndex < n && width < maxWidth) {
            width += words[currLineIndex++].length() + 1;
            count++;
        }
        if (width > maxWidth + 1 && count > 1)
            currLineIndex--;

        return currLineIndex;
    }

    public static String getFullJustifiedString(String line, int maxWidth) {
        StringBuilder justifiedLine = new StringBuilder();
        String[] words = line.split(" ");
        int occupiedCharLength = 0;
        for (String word : words) {
            occupiedCharLength += word.length();
        }
        int remainingSpace = maxWidth - occupiedCharLength;
        int spaceForEachWordSeparation = words.length > 1 ? remainingSpace / (words.length - 1) : remainingSpace;
        int extraSpace = remainingSpace - spaceForEachWordSeparation * (words.length - 1);
        for (int j = 0; j < words.length - 1; j++) {
            justifiedLine.append(words[j]);
            for (int i = 0; i < spaceForEachWordSeparation; i++)
                justifiedLine.append(" ");
            if (extraSpace > 0) {
                justifiedLine.append(" ");
                extraSpace--;
            }
        }
        justifiedLine.append(words[words.length - 1]);
        for (int i = 0; i < extraSpace; i++)
            justifiedLine.append(" ");
        return justifiedLine.toString();
    }

    public static String getLeftJustifiedLine(String line, int maxWidth) {
        int lineWidth = line.length();
        StringBuilder justifiedLine = new StringBuilder(line);
        //for (int i = 0; i < maxWidth - lineWidth; i++)
        //    justifiedLine.append(" ");
        return justifiedLine.toString();
    }
}

Note that I also commented out the space padding for the last line of each paragraph (in getLeftJustifiedLine) and made the methods static.



Answer 9:

The first part of this presentation contains a dynamic programming algorithm for text justification.