SED Match/Replace URL and Update Serialized Array

2019-08-13 02:45发布

问题:

Below is an example snippet from a sql dump file. This specific row contains a meta_value of a Wordpress PHP serialized array. During database restores in dev., test., and qc. environments I'm using sed to replace URLs with the respective environment sub-domain.

INSERT INTO `wp_postmeta`
    (`meta_id`,
     `post_id`,
     `meta_key`,
     `meta_value`)
VALUES
    (527,
     1951,
     'ut_parallax_image',
     'a:4:{
          s:17:\"background-image\";
          s:33:\"http://example.com/background.jpg\";
          s:23:\"mobile-background-image\";
          s:37:\"www.example.com/mobile-background.jpg\";
      }')
;

However, I need to extend this to correct the string length in the serialized arrays after replace.

sed -r -e "s/:\/\/(www\.)?${domain}/:\/\/\1${1}\.${domain}/g" "/vagrant/repositories/apache/$domain/_sql/$(basename "$file")" > "/vagrant/repositories/apache/$domain/_sql/$1.$(basename "$file")"

The result should look like this for dev.:

INSERT INTO `wp_postmeta`
    (`meta_id`,
     `post_id`,
     `meta_key`,
     `meta_value`)
VALUES
    (527,
     1951,
     'ut_parallax_image',
     'a:4:{
          s:17:\"background-image\";
          s:37:\"http://dev.example.com/background.jpg\";
          s:23:\"mobile-background-image\";
          s:41:\"www.dev.example.com/mobile-background.jpg\";
      }')
;

I'd prefer to not introduce any dependancies other than sed.

回答1:

Your algorithm involves arithmetic. That makes sed a poor choice. Consider awk instead.

Consider this input file:

$ cat inputfile
  something...
  s:33:\"http://example.com/background.jpg\";
  s:37:\"www.example.com/mobile-background.jpg\";
  s:33:\"http://www.example.com/background.jpg\";
  more lines...

I believe that this does what you want:

$ awk -F'"' '/:\/\/(www[.])?example.com/ {sub("example.com", "dev.example.com"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1' inputfile 
  something...
  s:37:\"http://dev.example.com/background.jpg\";
  s:37:\"www.example.com/mobile-background.jpg\";
  s:41:\"http://www.dev.example.com/background.jpg\";
  more lines...


回答2:

WP-CLI handles serialized PHP arrays during a search-replace http://wp-cli.org/commands/search-replace/. I wanted to try a native shell solution, but having WP-CLI was worth the extra overhead in the end.



回答3:

Thanks @John1024. @Fabio and @Seth, I not sure for perfomance, but these code work and without wp-cli:

localdomain=mylittlewordpress.local
maindomain=strongwordpress.site.ru
cat dump.sql | sed 's/;s:/;\ns:/g' | awk -F'"' '/s:.+'$maindomain'/ {sub("'$maindomain'", "'$localdomain'"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1' | sed ':a;N;$!ba;s/;\ns:/;s:/g' | sed "s/$maindomain/$localdomain/g" | mysql -u$USER -p$PASS $DBNAME

PHP serialized string exploded by ';s:' to multiline string and awk processed all lines by @John1024 solution.

cat dump.sql | sed 's/;s:/;\ns:/g'

Redirect output to awk

awk -F'"' '/^s:.+'$maindomain'/ {sub("'$maindomain'", "'$localdomain'"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1'

After all lines processed, multiline implode to one line (as then exists in original dump.sql). Thanks @Zsolt https://stackoverflow.com/a/1252191

sed ':a;N;$!ba;s/;\ns:/;s:/g'

Addition sed replacement need for any other strings in wordpress database.

sed "s/$maindomain/$localdomain/g"

And load into main server DB

... | mysql -u$USER -p$PASS $DBNAME


回答4:

Here is a sample text file you asked for (it's a database export).

Original (https://www.example.com) :

LOCK TABLES `wp_options` WRITE;
INSERT INTO `wp_options` VALUES (1,'siteurl','https://www.example.com','yes'),(18508,'optionsframework','a:48:{s:4:\"logo\";s:75:\"https://www.example.com/wp-content/uploads/2014/04/logo_imbrique_small3.png\";s:7:\"favicon\";s:62:\"https://www.example.com/wp-content/uploads/2017/04/favicon.ico\";}','yes')
/*!40000 ALTER TABLE `wp_options` ENABLE KEYS */;
UNLOCK TABLES;

Result needed (http://example.localhost) :

LOCK TABLES `wp_options` WRITE;
INSERT INTO `wp_options` VALUES (1,'siteurl','http://example.localhost','yes'),(18508,'optionsframework','a:48:{s:4:\"logo\";s:76:\"http://example.localhost/wp-content/uploads/2014/04/logo_imbrique_small3.png\";s:7:\"favicon\";s:64:\"https://example.localhost/wp-content/uploads/2017/04/favicon.ico\";}','yes');
/*!40000 ALTER TABLE `wp_options` ENABLE KEYS */;
UNLOCK TABLES;

As you can see :

  • there is multiple occurence on the same line
  • escape characters aren't counted in length number (eg: "/")
  • some occurence aren't preceded by "s:" length number (no need to replace, it can be done after awk with a simple sed)

Thanks in advance !