How can I remove invalid characters from XML but keep it well-formed?
For example, I want to remove all < and " characters from inside attribute values:
<log>
<data id="1" name="No Error" value="0" />
<data id="2" name="Error "1" between text" value="0" />
<data id="3" name="Error <2> between text" value="0" />
</log>
How can I dynamically remove the quotes surrounding "1"
and the <> surrounding 2,
so that the final output is:
<log>
<data id="1" name="No Error" value="0" />
<data id="2" name="Error 1 between text" value="0" />
<data id="3" name="Error 2 between text" value="0" />
</log>
Thanks for the support
I was thinking of the following solution:
- Read the file as text
- Find any string that starts with name= and ends with value=
- Remove all ", < and > characters from it
- Add " after name= and add " before value=
If this is correct, how can I do this in C#? The Replace method on its own will not work.
Thanks
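The steps above can be sketched in C# with a regular expression instead of a plain string Replace. This is only a minimal sketch: the names AttributeCleaner and CleanLine are made up for illustration, and it assumes that every name attribute is followed directly by value= on the same line, as in the sample log.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AttributeCleaner
{
    // Matches the text between name=" and the closing quote that sits
    // right before value= (assumption: value always follows name).
    private static readonly Regex NameText = new Regex(
        "(?<=\\bname=\")(.*?)(?=\"\\s+value=)",
        RegexOptions.Compiled);

    // Hypothetical helper: strips ", < and > from inside the name attribute
    // of a single line and leaves the rest of the line untouched.
    public static string CleanLine(string line)
    {
        return NameText.Replace(line, m =>
            m.Value.Replace("\"", "").Replace("<", "").Replace(">", ""));
    }
}
```

Calling CleanLine on each line read from the file (for example with File.ReadLines) and writing the results back would cover the "read the file as text" step. Lines without a name/value pair are returned unchanged.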
For your information, I found 2 different ways:
1-
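// Requires: using System; using System.IO; using System.Text;

// Rewrites the file in place, cleaning the attribute text found between
// startElement (e.g. "name=") and nextElement (e.g. "value=") on each line.
public static void ReplaceInvalidCharFromAttribute(string filePath, string startElement, string nextElement, string[] removeStrings)
{
    string tempFile = Path.GetTempFileName();
    using (var sr = new StreamReader(filePath))
    using (var sw = new StreamWriter(tempFile))
    {
        string line;
        while ((line = sr.ReadLine()) != null)
        {
            // A null result means the line has no matching attribute pair; keep it unchanged.
            string temp = RemoveInvalidCharFromAttribute(line, startElement, nextElement, removeStrings);
            sw.WriteLine(temp ?? line);
        }
    }
    // Replace the original file with the cleaned copy.
    File.Delete(filePath);
    File.Move(tempFile, filePath);
}

public static string RemoveInvalidCharFromAttribute(string input, string startElement, string nextElement, string[] invalidChars)
{
    int startIndex = input.IndexOf(startElement, StringComparison.Ordinal);
    if (startIndex < 0) return null;
    int start = startIndex + startElement.Length;
    // Search for nextElement only after startElement so that end >= start.
    int end = input.IndexOf(nextElement, start, StringComparison.Ordinal);
    if (end < 0) return null;

    // The span includes the attribute's own quotes; they are stripped together
    // with the invalid characters, so invalidChars is expected to contain "\"".
    StringBuilder res = new StringBuilder(input.Substring(start, end - start));
    foreach (string inv in invalidChars)
        res.Replace(inv, "");

    // Rebuild the line with the cleaned text surrounded by double quotes.
    // Splicing by index avoids touching identical text elsewhere in the line.
    return input.Substring(0, start) + "\"" + res.ToString().Trim() + "\" " + input.Substring(end);
}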
2- http://support.microsoft.com/kb/316063
So far so good. Thanks
In PHP I use the following to encode the data before it goes into the XML:
function xml_encode($string)
{
    // Escape & first so the entities added below are not escaped twice.
    $string=preg_replace("/&/", "&amp;", $string);
    $string=preg_replace("/</", "&lt;", $string);
    $string=preg_replace("/>/", "&gt;", $string);
    $string=preg_replace("/\"/", "&quot;", $string);
    $string=preg_replace("/%/", "&#37;", $string);
    return utf8_encode($string);
}
It will look like you suggest in a browser, until you actually look at the source.
At this point you would need to check for "& amp;" and hex/octal codes.
Hope that helps a little.
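For the C# side of this thread, the same "encode before it goes into the XML" idea can be sketched with SecurityElement.Escape from System.Security, which replaces <, >, ", ' and & with their XML entities. The class XmlEscapeDemo and the method BuildDataElement are hypothetical names used only for this illustration, and the element layout is assumed to match the sample log above.

```csharp
using System;
using System.Security;

public static class XmlEscapeDemo
{
    // Hypothetical helper: builds one <data> element and escapes the
    // attribute text instead of deleting the special characters.
    public static string BuildDataElement(int id, string name, string value)
    {
        return string.Format(
            "<data id=\"{0}\" name=\"{1}\" value=\"{2}\" />",
            id,
            SecurityElement.Escape(name),
            SecurityElement.Escape(value));
    }
}
```

With this approach the quotes and angle brackets are kept as entities rather than removed, so an XML parser reading the file gets the original text back. It only helps when you control the code that writes the log; an already broken file still needs a cleanup pass like the ones earlier in the thread.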