Is it “bad practice” to be sensitive to line breaks in XML documents?

Posted 2019-04-23 12:13

I'm generating some XML documents and when it comes to the address part I have fragments that look like this:

<Address>15 Sample St
Example Bay
Some Country</Address>

The XSLT that I have for converting this to XHTML has some funky recursive template to convert newline characters within strings to <br/> tags.

This is all working fine; but is it considered "bad practice" to rely on linebreaks within XML documents? If so, is it recommended that I do this instead?

<Address><Line>15 Sample St</Line>
<Line>Example Bay</Line>
<Line>Some Country</Line></Address>

It seems like it'd be really awkward to wrap every place where my text may span multiple lines in tags like that.

12 answers
狗以群分
#2 · 2019-04-23 12:45

A few people have said that CDATA blocks will let you retain line breaks. This is wrong. CDATA sections only cause markup to be treated as character data; they do not change line break processing.

<Address>15 Sample St
Example Bay
Some Country</Address>

is exactly the same as

<Address><![CDATA[15 Sample St
Example Bay
Some Country]]></Address>

The only difference is how different APIs report this.
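As a quick check of the claim above, here is a minimal sketch using Python's standard `xml.etree.ElementTree` parser (my choice of tool, not something the question uses). It parses both fragments and compares the character data the application receives.

```python
import xml.etree.ElementTree as ET

# The two fragments from the answer above, with the line breaks written as \n.
plain = "<Address>15 Sample St\nExample Bay\nSome Country</Address>"
cdata = "<Address><![CDATA[15 Sample St\nExample Bay\nSome Country]]></Address>"

# ElementTree reports CDATA content as ordinary text, so both parse alike.
plain_text = ET.fromstring(plain).text
cdata_text = ET.fromstring(cdata).text

print(plain_text == cdata_text)
print(plain_text.split("\n"))
```

Other APIs (a DOM, for instance) may expose the CDATA section as a distinct node type, which is the "how different APIs report this" difference.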

欢心
#3 · 2019-04-23 12:45

I don't see what's wrong with <Line> tags.
Apparently the visual layout of the data is important to you, important enough to keep it in your data (via line breaks in your first example). Fine. Then really keep it; don't rely on "magic" to keep it for you. Keep every bit of data that you'll need later and can't deduce perfectly from the rest of the saved data, even if it's presentation data (line breaks and other formatting). Your user (an end user or another developer) took the time to format that data to their liking. Either tell them (in the API docs or in text near the input) that you don't intend to keep it, or just keep it.
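If the wrapping feels awkward to do by hand, it can be done at generation time. A minimal sketch in Python, assuming the address arrives as one newline-separated string; the helper name `address_with_lines` is made up for illustration.

```python
import xml.etree.ElementTree as ET

def address_with_lines(text):
    """Build an <Address> element with one <Line> child per input line.

    Hypothetical helper: every line is kept as-is, including blank ones,
    so no formatting the user entered is silently dropped.
    """
    address = ET.Element("Address")
    for line in text.splitlines():
        ET.SubElement(address, "Line").text = line
    return address

elem = address_with_lines("15 Sample St\nExample Bay\nSome Country")
xml_out = ET.tostring(elem, encoding="unicode")
print(xml_out)
```

With this, the line structure is explicit markup and survives any tool that reformats whitespace in the document.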

Explosion°爆炸
#4 · 2019-04-23 12:49

I recommend either adding <br/> line breaks or using a character reference for the line break, such as &#x000D; (carriage return) or &#xA; (line feed).
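To see how a literal line break and a character reference differ after parsing, here is a small sketch with Python's standard `xml.etree.ElementTree` (my choice of parser; other conforming parsers should behave the same way, since XML 1.0 requires literal line endings to be normalized to a single line feed while character references are passed through).

```python
import xml.etree.ElementTree as ET

# A literal CRLF in the document is normalized to a single LF by the parser.
literal_crlf = ET.fromstring("<Address>15 Sample St\r\nExample Bay</Address>").text

# A line feed written as a character reference arrives as LF.
char_ref_lf = ET.fromstring("<Address>15 Sample St&#xA;Example Bay</Address>").text

# A carriage return written as a character reference is NOT normalized.
char_ref_cr = ET.fromstring("<Address>15 Sample St&#x000D;Example Bay</Address>").text

print(repr(literal_crlf))
print(repr(char_ref_lf))
print(repr(char_ref_cr))
```

So a stylesheet that splits on the line feed character would not see a break written as &#x000D; alone.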

我想做一个坏孩纸
#5 · 2019-04-23 12:51

This is probably a slightly deceptive example, since the address is somewhat non-normalized in this case. It is a reasonable trade-off, however, since address fields are difficult to normalize. If you make the line breaks carry important information, you're denormalizing and leaving the post office to interpret the meaning of each line break.

I would say that normally this is not a big problem, but in this case I think the Line tag is the most correct choice, since it explicitly shows that you don't actually interpret what the lines may mean in different cultures. (Remember that most forms for entering an address have a zip code field, etc., plus address lines 1 and 2.)

The awkwardness of the Line tag comes with XML in general, and has been much debated at Coding Horror: http://www.codinghorror.com/blog/archives/001139.html

兄弟一词,经得起流年.
#6 · 2019-04-23 12:52

The XML spec has something to say about whitespace, and about linefeeds and carriage returns in particular. So if you limit yourself to true linefeeds (#xA) you should be OK. However, many editing tools will reformat XML for "better presentation" and may not preserve your line breaks. A more robust and cleaner approach than the <Line></Line> idea would be to simply use namespaces and embed XHTML content, e.g.:

<Address xmlns="http://www.w3.org/1999/xhtml">15 Sample St<br />Example Bay<br />Some Country</Address>

No need to reinvent the wheel when it comes to standard vocabularies.
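For what it's worth, a consumer can still recover the individual lines from the embedded XHTML. A rough sketch with Python's standard `xml.etree.ElementTree`, assuming the fragment looks exactly like the one above (the variable names are mine):

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

doc = (
    '<Address xmlns="http://www.w3.org/1999/xhtml">'
    "15 Sample St<br />Example Bay<br />Some Country</Address>"
)
root = ET.fromstring(doc)

# The first line is the element's own text; each later line is the
# "tail" text that follows a namespaced <br /> child.
lines = [root.text] + [br.tail for br in root.findall(f"{{{XHTML_NS}}}br")]
print(lines)
```

Because the breaks are elements rather than whitespace, an editor that re-indents the document should not change which lines exist.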

Summer. ? 凉城
#7 · 2019-04-23 13:00

It depends on how you're reading and writing the XML.

If the XML is being generated automatically (that is, if newlines or explicit \n flags are being parsed into <br />), then there's nothing to worry about. Your input likely doesn't contain any other XML, so it's cleaner not to mess with XML at all.

If tags are being worked with manually, it's still cleaner to just have a line break, if you ask me.

The exception is if you're using DOM to get some structure out of the XML. In that case line breaks are obviously evil because they don't represent the hierarchy properly. It sounds like the hierarchy is irrelevant for your application, though, so line breaks sound sufficient.

If the XML just looks bad (especially when automatically generated), Tidy can help, although it works better with HTML than with XML.
