How check if a String is a Valid XML with-out Disp

2019-01-21 19:25发布

问题:

i was trying to check the validity of a string as xml using this simplexml_load_string()Docs function but it displays a lot of warning messages.

How can I check whether a string is a valid XML without suppressing (@ at the beginning) the error and displaying a warning function that expec

回答1:

Use libxml_use_internal_errors() to suppress all XML errors, and libxml_get_errors() to iterate over them afterwards.

Simple XML loading string

libxml_use_internal_errors(true);

$doc = simplexml_load_string($xmlstr);
$xml = explode("\n", $xmlstr);

if (!$doc) {
    $errors = libxml_get_errors();

    foreach ($errors as $error) {
        echo display_xml_error($error, $xml);
    }

    libxml_clear_errors();
}


回答2:

From the documentation:

Dealing with XML errors when loading documents is a very simple task. Using the libxml functionality it is possible to suppress all XML errors when loading the document and then iterate over the errors.

The libXMLError object, returned by libxml_get_errors(), contains several properties including the message, line and column (position) of the error.

libxml_use_internal_errors(true);
$sxe = simplexml_load_string("<?xml version='1.0'><broken><xml></broken>");
if (!$sxe) {
    echo "Failed loading XML\n";
    foreach(libxml_get_errors() as $error) {
        echo "\t", $error->message;
    }
}

Reference: libxml_use_internal_errors



回答3:

My version like this:

//validate only XML. HTML will be ignored.

function isValidXml($content)
{
    $content = trim($content);
    if (empty($content)) {
        return false;
    }
    //html go to hell!
    if (stripos($content, '<!DOCTYPE html>') !== false) {
        return false;
    }

    libxml_use_internal_errors(true);
    simplexml_load_string($content);
    $errors = libxml_get_errors();          
    libxml_clear_errors();  

    return empty($errors);
}

Tests:

//false
var_dump(isValidXml('<!DOCTYPE html><html><body></body></html>'));
//true
var_dump(isValidXml('<?xml version="1.0" standalone="yes"?><root></root>'));
//false
var_dump(isValidXml(null));
//false
var_dump(isValidXml(1));
//false
var_dump(isValidXml(false));
//false
var_dump(isValidXml('asdasds'));


回答4:

try this one

//check if xml is valid document
public function _isValidXML($xml) {
    $doc = @simplexml_load_string($xml);
    if ($doc) {
        return true; //this is valid
    } else {
        return false; //this is not valid
    }
}


回答5:

Here a small piece of class I wrote a while ago:

/**
 * Class XmlParser
 * @author Francesco Casula <fra.casula@gmail.com>
 */
class XmlParser
{
    /**
     * @param string $xmlFilename Path to the XML file
     * @param string $version 1.0
     * @param string $encoding utf-8
     * @return bool
     */
    public function isXMLFileValid($xmlFilename, $version = '1.0', $encoding = 'utf-8')
    {
        $xmlContent = file_get_contents($xmlFilename);
        return $this->isXMLContentValid($xmlContent, $version, $encoding);
    }

    /**
     * @param string $xmlContent A well-formed XML string
     * @param string $version 1.0
     * @param string $encoding utf-8
     * @return bool
     */
    public function isXMLContentValid($xmlContent, $version = '1.0', $encoding = 'utf-8')
    {
        if (trim($xmlContent) == '') {
            return false;
        }

        libxml_use_internal_errors(true);

        $doc = new DOMDocument($version, $encoding);
        $doc->loadXML($xmlContent);

        $errors = libxml_get_errors();
        libxml_clear_errors();

        return empty($errors);
    }
}

It works fine with streams and vfsStream as well for testing purposes.



回答6:

Case

Occasionally check availability of a Google Merchant XML feed.

The feed is without DTD, so validate() won't work.

Solution

// disable forwarding those load() errors to PHP
libxml_use_internal_errors(true);
// initiate the DOMDocument and attempt to load the XML file
$dom = new \DOMDocument;
$dom->load($path_to_xml_file);
// check if the file contents are what we're expecting them to be
// `item` here is for Google Merchant, replace with what you expect
$success = $dom->getElementsByTagName('item')->length > 0;
// alternatively, just check if the file was loaded successfully
$success = null !== $dom->actualEncoding;

length above contains a number of how many products are actually listed in the file. You can use your tag names instead.

Logic

You can call getElementsByTagName() on any other tag names (item I used is for Google Merchant, your case may vary), or read other properties on the $dom object itself. The logic stays the same: instead of checking if there were errors when loading the file, I believe actually trying to manipulate it (or specifically check if it contains the values you actually need) would be more reliable.

Most important: unlike validate(), this won't require your XML to have a DTD.