Convert relative URL to absolute URL

Input:

Base URL: www.example.com/1/2/index.php
Relative URL: ../../index.php

Output:

Absolute URL: www.example.com/index.php

It would be perfect if it could be done using sed.

As I understand it, the regex should delete one somefolder/ for every ../ in the URL.

Tags: regex bash url

4 answers

Deceive 欺骗

Reply #2 · 2020-02-15 04:12

If your only requirement is to turn .. into "up one level" then this is a possible solution. It doesn't use regular expressions or sed, or a JVM for that matter ;)

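#!/bin/bash

domain="www.example.com"
origin="1/2/3/4/index.php"
rel="../../index.php"

awk -v rel="$rel" -v origin="$origin" -v file="$(basename "$rel")" -v dom="$domain" '
BEGIN {
    # Count the ".." components in the relative URL.
    n = split(rel, a, "/")
    for (i = 1; i <= n; ++i) {
        if (a[i] == "..") ++c
    }
    # Keep the directories of the origin that remain after going up c levels
    # (the last element of b is the file name, so it is skipped as well).
    abs = dom
    m = split(origin, b, "/")
    for (i = 1; i < m - c; ++i) {
        abs = abs "/" b[i]
    }
    print abs "/" file
}'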

An alternative approach to using awk, credit to Edward for mentioning realpath -m:

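#!/bin/bash

rel="../../index.php"
origin="www.example.com/1/2/index.php"

directory=$(dirname "$origin")
# -m resolves the path even though it does not exist on disk
fullpath=$(realpath -m "$directory/$rel")
# strip the working directory that realpath prepended
echo "${fullpath#"$(pwd -P)"/}"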


家丑人穷心不美

Reply #3 · 2020-02-15 04:21

You can't use a single regular expression for this, because regular expressions can't count.

You should use a real programming language instead. Even Java can do this easily.
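The counting does not need a single regular expression if the path is processed one segment at a time. As a rough sketch in plain bash (the function name resolve_url is made up for this illustration, and it assumes the base URL has no scheme and ends in a file name, as in the question):

```bash
#!/bin/bash

# resolve_url BASE REL  (hypothetical helper, not from the original thread)
# Resolves REL against BASE by walking the path segments one at a time.
resolve_url() {
  local base=$1 rel=$2 seg
  local -a parts=() stack=()

  # Directory of the base URL: everything before the last "/".
  IFS=/ read -r -a stack <<< "${base%/*}"
  IFS=/ read -r -a parts <<< "$rel"

  for seg in "${parts[@]}"; do
    case $seg in
      ''|.) ;;   # empty or "." segment: nothing to do
      ..)
        # Go up one level, but never above the host (stack[0]).
        if (( ${#stack[@]} > 1 )); then
          unset 'stack[${#stack[@]}-1]'
        fi ;;
      *) stack+=("$seg") ;;
    esac
  done

  # Join the remaining segments with "/".
  local IFS=/
  echo "${stack[*]}"
}

resolve_url 'www.example.com/1/2/index.php' '../../index.php'
```

The array acts as a stack: each ".." pops one directory, every other segment is pushed, so the "counting" happens implicitly.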


时光不老，我们不散

Reply #4 · 2020-02-15 04:27

realpath is a quick but slightly hacky way to do what you want.
(Actually, I'm surprised that it doesn't deal properly with URLs; it treats them as plain old filesystem paths.)

~$ realpath -m http://www.example.com/1/2/../../index.php
/home/username/http:/www.example.com/index.php

The -m (for "missing") says to resolve the path even if components of it don't actually exist on the filesystem.
So you'll still have to strip off the actual filesystem part of that (which will just be $(pwd)). And note that the slash-slash for the protocol was also canonicalized to a single slash. So you might be better off leaving the "http://" off of your input and just prepending it to your output instead.
See man 1 realpath for the full story, or info coreutils 'realpath invocation' for a more verbose version, if you have the info system installed.
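Putting those two caveats together might look like the sketch below. It assumes GNU coreutils realpath; the function name normalize_url is made up here, and instead of stripping $(pwd) by hand it uses the --relative-to option, which is a substitution for the manual stripping described above.

```bash
#!/bin/bash

# normalize_url URL  (hypothetical helper; assumes GNU realpath)
normalize_url() {
  local url=$1 scheme='' resolved

  # Split off "scheme://" so realpath cannot collapse the double slash.
  if [[ $url == *://* ]]; then
    scheme=${url%%://*}://
    url=${url#*://}
  fi

  # -m: resolve even though the path does not exist on disk.
  # --relative-to: print the result relative to the working directory,
  # which removes the filesystem prefix realpath would otherwise add.
  resolved=$(realpath -m --relative-to="$PWD" "$url")

  echo "${scheme}${resolved}"
}

normalize_url 'http://www.example.com/1/2/../../index.php'
```

This is still a filesystem tool applied to URLs, so a relative part that climbs above the host would leave "../" in the output.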


smile是对你的礼貌

Reply #5 · 2020-02-15 04:28

Using sed inside bash

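#!/bin/bash

base_url='www.example.com/1/2/index.php'
rel_url='../../index.php'

# Join both URLs with ";" and drop the file name of the base URL.
str="${base_url};${rel_url}"
str=$(echo "$str" | sed -E 's#/[^/]*;#/#')
# Remove one "folder/../" pair per pass until nothing changes any more.
prev=''
while [ "$str" != "$prev" ]
do
  prev=$str
  str=$(echo "$str" | sed -E 's#[^/]+/\.\./##')
done
abs_url=$str

echo "$abs_url"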

Output:

www.example.com/index.php
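The shell loop can also be moved into sed itself with a label and the t (branch if a substitution succeeded) command, which gets closer to the "just sed" wish in the question. A minimal sketch, assuming the base URL ends in a file name and the relative part never climbs above the host (join_url is a made-up name):

```bash
#!/bin/bash

# join_url BASE REL  (hypothetical helper, not from the original answer)
join_url() {
  printf '%s;%s\n' "$1" "$2" |
    sed -E \
      -e 's#/[^/;]*;#/#' \
      -e ':again' \
      -e 's#[^/]+/\.\./##' \
      -e 't again'
}

join_url 'www.example.com/1/2/index.php' '../../index.php'
```

The first expression replaces the base file name and the ";" separator with "/"; the labelled part then removes one "folder/../" pair and jumps back to :again for as long as a substitution succeeds.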

