Problem:
I'd like to be able to count the number of lines in a Google Document. For example, the script must return 6 for the following text.
There doesn't seem to be any reliable method of extracting '\n' or '\r' characters from the text though.
text.findText(/\r/g) //OR
text.findText(/\n/g)
The 2nd line of code is not supposed to work anyway, because according to the GAS documentation, 'new line characters are automatically converted to \r'.
As you noted in the comments, there is no API to retrieve the number of lines in Google Docs. This is because the document is rendered dynamically on the client side, so the server doesn't know this number.
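To illustrate the limitation, here is a minimal sketch (my own illustration, not an official API). It assumes bodyText stands in for the string you would get from DocumentApp.getActiveDocument().getBody().getText(), and the function name countHardLines is hypothetical. Counting break characters only gives the number of paragraphs the user typed, not the number of rendered lines:

```javascript
// Assumption: `bodyText` stands in for the string returned by Body.getText()
// in Apps Script. `countHardLines` is a hypothetical helper name.
function countHardLines(bodyText) {
  if (bodyText === '') return 0;
  // Split on any user-entered break: \r\n, \r or \n.
  return bodyText.split(/\r\n|\r|\n/).length;
}

// A paragraph that wraps over several rendered lines still counts as one,
// because wrapping adds no break character to the text.
const bodyText =
  'A long paragraph that wraps over several rendered lines in the editor.\n' +
  'Second paragraph.';
console.log(countHardLines(bodyText)); // 2
```

So whatever the page width is, the result stays at 2 here, which is why the server-side text alone is not enough.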
One possible solution is scraping the HTML of the Google Doc, because each line is rendered with its own div with the "kix-lineview" class. However, you will need to actually open the page in an iframe or headless browser and then scroll page by page to make the lines render before you can count the divs.
If you are still looking for the solution, how about this answer? Unfortunately, I couldn't find a built-in method for retrieving the number of lines in a Google Document. So how about this workaround?
If the end of each line can be detected, the number of lines can be retrieved. So I tried to add the end markers of each line using OCR. I think that there might be several workarounds to solve your issue. So please think of this as one of them.
In Google Documents, when a sentence is longer than the page width, it is automatically wrapped onto the next line. But this line break has no \r\n or \n. When the user enters a line break with the Enter key, the line break has \r\n or \n. Because of this, the text data retrieved from the document contains only the line breaks entered by the user. In your case, it seems that your document has line breaks after incididunt and consequat., so the number of lines doesn't come out as 6.
I thought that OCR might be usable for this situation. The flow is as follows.
At first, \r\n or \n were not added to the converted text data, so I used ocr.space, which can add the line breaks. The lines are then counted from the \n in the converted text data.
The sample script for the above flow is as follows. To use it, please get your API key at "ocr.space". When you submit your information and email address in the form, you will receive an email containing the API key. Please use it in this sample script, and please check the quota of the API. I tested this using the Free plan.
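The counting step at the end of the flow can be sketched roughly as below. This is only an illustration under assumptions: it assumes the OCR result is a JSON object with the recognized text in ParsedResults[i].ParsedText, which is how I understand the ocr.space response, and the function name countOcrLines and the sample data are hypothetical. Fetching the PDF and calling the API are left out.

```javascript
// Assumption: `ocrResponse` mimics the JSON returned by ocr.space, with the
// recognized text of each page in ParsedResults[i].ParsedText.
// `countOcrLines` is a hypothetical helper name.
function countOcrLines(ocrResponse) {
  const pages = ocrResponse.ParsedResults || [];
  let total = 0;
  for (const page of pages) {
    const text = page.ParsedText || '';
    // OCR output may end lines with \r\n, so treat \r\n, \r and \n alike,
    // and drop empty pieces such as the one after the final break.
    const lines = text
      .split(/\r\n|\r|\n/)
      .filter((line) => line.trim() !== '');
    total += lines.length;
  }
  return total;
}

// Made-up sample standing in for a real OCR response.
const sample = {
  ParsedResults: [{ ParsedText: 'line one\r\nline two\r\nline three\r\n' }],
};
console.log(countOcrLines(sample)); // 3
```

Note that this sketch ignores blank lines. If empty lines in the document should be counted as well, counting the \n characters directly may fit better.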
Sample script :
Result :
When your sentences are used, 6 is obtained as the result of the script.
Note :
The converted text data has \r\n at the end of all lines, whether or not the line in the document ends with \r\n or \n.
I tested this script on several documents. In my environment, the correct number of lines could be retrieved. But I'm not sure whether this script works in your environment. If it cannot be used in your environment, I'm sorry.