I have the following input string:
Lorem ipsum dolor sit amet consectetur adipiscing elit sed doeiusmod tempor incididunt ut Duis aute irure dolor in reprehenderit in esse cillum dolor eu fugia ...
The splitting rules, by example:
[
"Lorem ipsum dolor", // A: Three words <6 letters
"sit amet", // B: Two words <6 letters if next word >=6 letters
"consectetur", // C: One word >=6 letters if next word >=6 letters
"adipiscing elit", // D: Two words: first >=6, second <6 letters
"sed doeiusmod", // E: Two words: first <6, second >=6 letters
"tempor", // rule C
"incididunt ut", // rule D
"Duis aute irure", // rule A
"dolor in", // rule B
"reprehenderit in", // rule D
"esse cillum", // rule E
"dolor eu fugia", // rule A
...
]
So, as you can see, each string in the array can have a minimum of one and a maximum of three words. I tried to do it as follows, but it doesn't work. How can I do it?
let s="Lorem ipsum dolor sit amet consectetur adipiscing elit sed doeiusmod tempor incididunt ut Duis aute irure dolor in reprehenderit in esse cillum dolor eu fugia";
let a=[""];
s.split(' ').map(w=> {
let line=a[a.length-1];
let n= line=="" ? 0 : line.match(/ /g).length // num of words in line
if(n<3) line+=w+' ';
n++;
if(n>=3) a[a.length-1]=line
});
console.log(a);
UPDATE
Boundary conditions: if the last word or words do not match any rule, just add them as the last array element (but two long words can never be in one string).
SUMMARY AND INTERESTING CONCLUSIONS
We got 8 nice answers to this question, and in some of them there was a discussion about self-describing (or self-explanatory) code. Self-describing code is code where a person who has not read the question can easily say exactly what the code does at first glance. Sadly, none of the answers presents such code, so this question is an example showing that self-describing code is probably a myth.
If we define words with length <6 to have size 1 and >=6 to have size 2, we can rewrite the rules to "if the next word would make the total size of the current row >= 4, start next line".
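A minimal sketch of this size-based rule, assuming words are separated by whitespace. The function name splitBySize and the variable names are illustrative and not taken from the answer.

```javascript
// Sketch: short words (<6 letters) have size 1, long words (>=6) have size 2.
// A new line starts whenever adding the next word would bring the
// current line's total size to 4 or more.
function splitBySize(text) {
  const lines = [];
  let current = []; // words collected for the current line
  let size = 0;     // total size of the current line

  for (const word of text.split(/\s+/).filter(Boolean)) {
    const wordSize = word.length < 6 ? 1 : 2;
    if (current.length > 0 && size + wordSize >= 4) {
      lines.push(current.join(' '));
      current = [];
      size = 0;
    }
    current.push(word);
    size += wordSize;
  }
  // Leftover words become the last element (the boundary condition).
  if (current.length > 0) lines.push(current.join(' '));
  return lines;
}

console.log(splitBySize('Lorem ipsum dolor sit amet consectetur'));
// → [ 'Lorem ipsum dolor', 'sit amet', 'consectetur' ]
```

Because two long words already add up to size 4, they can never share a line, which matches the boundary condition in the question.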
This sounds like a problem you would get during a job interview or on a test. The right way to approach this problem is to think about how to simplify the problem into something that we can understand and write legible code for.
We know that there are two conditions: shorter than six letters or not. We can represent each word in the string as a binary digit: 0 (shorter than 6 letters) or 1 (6 letters or more).
Turning the string of words into a string of binary digits will make it easier to process and understand:
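The conversion could look roughly like this. It is a sketch: the helper name toBinary is hypothetical, and words are assumed to be separated by whitespace.

```javascript
// Sketch: map every word to '0' (shorter than 6 letters) or '1' (6 or more)
// and join the digits into one string.
function toBinary(text) {
  return text
    .split(/\s+/)
    .filter(Boolean)
    .map(word => (word.length < 6 ? '0' : '1'))
    .join('');
}

console.log(toBinary('Lorem ipsum dolor sit amet consectetur adipiscing elit'));
// → '00000110'
```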
Next we need to simplify the rules. Each rule can be thought of as a string of binary digits (a set of words). Since some rules are more complicated than others, we treat the lookahead to the next word as part of the rule's string:
For a string of numbers remaining, whichever rule fits at the beginning will be the next set of strings. This is a pretty simple logical operation:
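One way to sketch the rules and the matching step together is shown below. The names BINARY_RULES and splitByBinaryRules, and the { pattern, take } shape, are assumptions made for illustration. Taking all remaining words when no pattern fits is also an assumption, based on the boundary condition in the question.

```javascript
// Each rule is a binary pattern (including any lookahead digit) plus the
// number of words to take when the pattern matches at the current position.
const BINARY_RULES = [
  { pattern: '000', take: 3 }, // A: three short words
  { pattern: '001', take: 2 }, // B: two short words, next one long
  { pattern: '11',  take: 1 }, // C: one long word, next one long
  { pattern: '10',  take: 2 }, // D: long then short
  { pattern: '01',  take: 2 }, // E: short then long
];

function splitByBinaryRules(text) {
  const words = text.split(/\s+/).filter(Boolean);
  const bits = words.map(w => (w.length < 6 ? '0' : '1')).join('');
  const result = [];
  let pos = 0;

  while (pos < words.length) {
    const rest = bits.slice(pos);
    const rule = BINARY_RULES.find(r => rest.startsWith(r.pattern));
    // Assumption: if nothing matches (only possible at the very end),
    // the leftover words form the last element.
    const take = rule ? rule.take : words.length - pos;
    result.push(words.slice(pos, pos + take).join(' '));
    pos += take;
  }
  return result;
}

console.log(splitByBinaryRules('sed doeiusmod tempor incididunt ut'));
// → [ 'sed doeiusmod', 'tempor', 'incididunt ut' ]
```

Every remainder of three or more digits starts with one of the five patterns, so the fallback only ever applies to a trailing "0", "1" or "00".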
YAY! We successfully converted a complex problem into something easy to understand. While this is not the shortest solution, it is very elegant, and there is still room for improvement without sacrificing readability (compared to some other solutions).
This way of conceptualizing the problem opens doors for more rules or even more complex states (0, 1, 2).
One option is to first create an array of rules, like:
Then iterate through the array of rules, finding the rule that matches, extracting the appropriate number of words to splice from the matching rule, and push to the output array:
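A possible shape for such a rules array and the loop over it is sketched below. The rule format [count, ...conditions], the 'short'/'long' labels and the name splitByRules are assumptions for illustration. A rule with no conditions matches any position.

```javascript
const isShort = word => word.length < 6;

// Assumed format: [number of words to splice, condition for word 1, word 2, ...]
const RULES = [
  [3, 'short', 'short', 'short'], // A
  [2, 'short', 'short', 'long'],  // B
  [1, 'long', 'long'],            // C
  [2, 'long', 'short'],           // D
  [2, 'short', 'long'],           // E
];

function splitByRules(text, rules = RULES) {
  const words = text.split(/\s+/).filter(Boolean);
  const output = [];

  while (words.length > 0) {
    // Find the first rule whose conditions all hold for the upcoming words.
    const rule = rules.find(([, ...conditions]) =>
      conditions.every((condition, i) =>
        i < words.length && (condition === 'short') === isShort(words[i])));
    if (!rule) throw new Error(`No rule matches at "${words[0]}"`);
    // Remove the matched words from the front and store them as one element.
    output.push(words.splice(0, rule[0]).join(' '));
  }
  return output;
}

console.log(splitByRules('adipiscing elit sed doeiusmod'));
// → [ 'adipiscing elit', 'sed doeiusmod' ]
```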
Of course, the .find assumes that every input string will always have a matching rule for each position spliced. For the additional rule that any words not matched by the previous rules just be added to the output, put [1] into the bottom of the rules array:
No tricks needed. This code traverses the array of words and checks the rules for each sequence of three. The rules are applied with as few loops and intermediate objects as possible, resulting in good performance and memory usage.
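The original code for this answer is not reproduced here. The following is only a sketch of a single index-based pass in the same spirit, with hypothetical names (splitInPlace, shortAt, longAt). It looks at up to three words at a time and never modifies the word array.

```javascript
function splitInPlace(text) {
  const words = text.split(/\s+/).filter(Boolean);
  const longAt = i => i < words.length && words[i].length >= 6;
  const shortAt = i => i < words.length && words[i].length < 6;
  const out = [];
  let i = 0;

  while (i < words.length) {
    let take;
    if (longAt(i)) {
      take = shortAt(i + 1) ? 2 : 1;   // rule D, otherwise rule C or a final long word
    } else if (longAt(i + 1)) {
      take = 2;                        // rule E
    } else if (shortAt(i + 1)) {
      take = shortAt(i + 2) ? 3 : 2;   // rule A, otherwise rule B or a trailing pair
    } else {
      take = 1;                        // a single trailing short word
    }
    out.push(words.slice(i, i + take).join(' '));
    i += take;
  }
  return out;
}

console.log(splitInPlace('Duis aute irure dolor in reprehenderit in'));
// → [ 'Duis aute irure', 'dolor in', 'reprehenderit in' ]
```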
I wrote a shorter and faster version (in terms of time complexity: I do not recompute the sum with reduce in each loop iteration) of the idea proposed in BoltKey's answer (if you want to upvote, please do it on his answer).
Main idea
- ws is the word size, which takes only two values: 1 (short word) and 2 (long word)
- s is the current line size in the loop (we iterate over each word size)
- if the word still fits, add it to the line l, and its size to the line size s
- otherwise, add the line l to the output array r and clear l and s
- at the end, add l to the result if l is not empty
You can express your rules as abbreviated regular expressions, build a real regex from them and apply it to your input:
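A sketch of how this could work, under some assumptions: S stands for a short word and L for a long word (each followed by optional non-word characters), lookaheads express the "if next word" parts, and the last three entries (SS, S, L) are catch-alls added here for leftover words at the end. The names REGEX_RULES, buildRegex and splitByRegex are illustrative.

```javascript
// Abbreviated rules: S = short word, L = long word, (?=L) = "next word is long".
const REGEX_RULES = ['SSS', 'SS(?=L)', 'L(?=L)', 'LS', 'SL', 'SS', 'S', 'L'];

// Expand the abbreviations into a real regular expression.
function buildRegex(rules) {
  const source = rules
    .join('|')
    .replace(/S/g, '\\b\\w{1,5}\\b\\W*') // a word of 1 to 5 word characters
    .replace(/L/g, '\\b\\w{6,}\\b\\W*'); // a word of 6 or more word characters
  return new RegExp(source, 'g');
}

const WORD_GROUPS = buildRegex(REGEX_RULES); // built once, reused for every input

function splitByRegex(text) {
  // Each match is one output element; trim the trailing separator.
  return (text.match(WORD_GROUPS) || []).map(chunk => chunk.trim());
}

console.log(splitByRegex('sit amet consectetur adipiscing elit'));
// → [ 'sit amet', 'consectetur', 'adipiscing elit' ]
```

In this sketch a "word" is a run of \w characters, so punctuation between words stays attached to the end of the preceding word.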
If the rules don't change, the regex construction part is only needed once.
Note that this also handles punctuation in a reasonable way.