Remove everything else from an image tag but keep src

Posted 2019-09-10 14:18

Question:

I am trying to remove some attributes from images, but my code removes only the attribute name and leaves the value in place.

I have an image as shown below:

<img class="aligncenter size-full wp-image-sd174" src="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg" alt="alt title" srcset="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 700w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 241w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 624w" sizes="(max-width: 700px) 100vw, 700px" height="870" width="700">

I want to remove everything except <img src="image path">.

I tried the code below, but it removes only the attribute name (e.g. srcset) and keeps the rest.

$html = '<img class="aligncenter size-full wp-image-sd174" src="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg" alt="alt title" srcset="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 700w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 241w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 624w" sizes="(max-width: 700px) 100vw, 700px" height="870" width="700">';

$one = preg_replace('#(<img.+?)srcset=(["\']?)\d*\2(.*?/?>)#i', '$1$3', $html);
$two = preg_replace('#(<img.+?)sizes=(["\']?)\d*\2(.*?/?>)#i', '$1$3', $one);

Answer 1:

Try this:

$html = preg_replace("/(<img\\s)[^>]*(src=\\S+)[^>]*(\\/?>)/i", "$1$2$3", $html);

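Rather than removing the unwanted attributes one by one, it captures the opening of the image tag, the src attribute, and the closing of the tag, and rebuilds the tag from those three pieces.

It should work for any number of <img> tags in your HTML. Note that \S+ stops at the first whitespace, so it assumes the src value contains no spaces.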
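The same extract-and-rebuild idea can be sketched with preg_replace_callback() so that quoted src values containing spaces survive and look-alike attributes such as data-src or srcset are skipped. The function name keepOnlySrc() and the sample markup are made up for illustration and are not part of the answer above. It is still a regex, so it assumes no attribute value contains a literal ">" character.

```php
<?php
// Hypothetical helper (not from the answer above): rebuild every <img> tag
// from its src attribute only.
function keepOnlySrc(string $html): string
{
    $result = preg_replace_callback(
        '/<img\b[^>]*>/i',               // one whole <img ...> tag at a time
        function (array $tag): string {
            // Whitespace is required before "src", so "data-src" is skipped;
            // "srcset" is skipped because "=" must follow "src" directly
            // (optionally after spaces). The value may be double-quoted,
            // single-quoted, or unquoted.
            $found = preg_match(
                '/\ssrc\s*=\s*("[^"]*"|\'[^\']*\'|[^\s>]+)/i',
                $tag[0],
                $m
            );

            return $found ? '<img src=' . $m[1] . '>' : '<img>';
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

echo keepOnlySrc('<p><img class="a" data-src="lazy.jpg" src="my photo.jpg" srcset="a.jpg 1x" /></p>');
// prints: <p><img src="my photo.jpg"></p>
```

Matching the whole tag first and then searching inside it keeps each pattern short, which tends to be easier to reason about than one long expression.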



Answer 2:

You can use the DOM extension to properly manipulate an HTML structure.

A regular expression might be fine for very simple cases, but it won't be a complete solution, regardless of how sophisticated it looks.


Stripping all <img> attributes except src:

$html = '<img class="aligncenter size-full wp-image-sd174" src="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg" alt="alt title" srcset="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 700w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 241w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 624w" sizes="(max-width: 700px) 100vw, 700px" height="870" width="700">';

echo stripImageAttributes($html);

Output:

<img src="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg">

Definition of stripImageAttributes():

(It's designed to process HTML fragments, not complete documents.)

/** 
 * @param string $html
 * @return string 
 */ 
function stripImageAttributes($html)
{
    // init document
    $doc = new DOMDocument();
    $doc->loadHTML('<!doctype html><html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head><body>' . $html . '</body></html>');

    // init xpath
    $xpath = new DOMXPath($doc);

    // process images
    $body = $xpath->query('/html/body')->item(0);

    foreach ($xpath->query('.//img', $body) as $image) {
        $toRemove = [];

        foreach ($image->attributes as $attr) {
            if ('src' !== $attr->name) {
                $toRemove[] = $attr;
            }
        }

        if ($toRemove) {
            foreach ($toRemove as $attr) {
                $image->removeAttribute($attr->name);
            }
        }
    }

    // convert the document back to an HTML string
    $html = '';
    foreach ($body->childNodes as $node) {
        $html .= $doc->saveHTML($node);
    }

    return $html;
}
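With real-world markup, loadHTML() tends to emit warnings for tags the underlying parser does not recognise (HTML5 elements such as <figure>, for example). A minimal variant of the same DOM approach that collects those warnings internally instead of printing them might look like this. The function name stripImageAttributesQuietly() and the sample fragment are assumptions made for illustration, not part of the answer above.

```php
<?php
// Hypothetical variant of the function above: same idea, but parser
// warnings are collected by libxml instead of being emitted.
function stripImageAttributesQuietly(string $html): string
{
    // Buffer libxml errors internally and remember the previous setting.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML(
        '<!doctype html><html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head><body>'
        . $html . '</body></html>'
    );

    // Discard the buffered errors and restore the caller's setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $image) {
        // Copy the attribute list first so that removing attributes
        // does not disturb the iteration.
        foreach (iterator_to_array($image->attributes, false) as $attr) {
            if ($attr->name !== 'src') {
                $image->removeAttributeNode($attr);
            }
        }
    }

    // Serialise only the children of <body>, i.e. the original fragment.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $node) {
        $out .= $doc->saveHTML($node);
    }

    return $out;
}

echo stripImageAttributesQuietly(
    '<figure><img class="a" src="one.jpg" alt="x > y"><img src="two.jpg" width="10"></figure>'
);
```

Both images in the sample should come back with only their src attribute, including the one whose alt text contains a ">" character, which is the kind of input a regex-based approach has trouble with.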


Answer 3:

I would suggest the following approach.

Since attributes are separated by spaces, you can split the tag with a simple explode() call, iterate over the pieces to find the one you need, and build a clean image tag from it. This assumes a single <img> tag as input and a src value that contains no spaces.

function cleanImage($html) {
    $output = '';
    $image_components = explode(' ',$html);
    foreach($image_components as $component) {
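        // keep only the piece that starts with src=
        if (substr($component, 0, 4) === 'src=') {
            // drop a trailing "/" or ">" in case src is the last attribute
            $output = '<img ' . rtrim($component, '/>') . '>';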
            break;
        }
    }
    return $output;
}


$html = '<img class="aligncenter size-full wp-image-sd174" src="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg" alt="alt title" srcset="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 700w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 241w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 624w" sizes="(max-width: 700px) 100vw, 700px" height="870" width="700">';

$image = cleanImage($html);