I'd like to download some image galleries in bulk. The images are offered for free, with no permission needed. For the life of me, I cannot get it to work. This is what I have so far. The $pattern that gets printed is the whole HTML line, not just the image link. Are there any pointers you can give me? The loop is set to run only once for testing purposes; eventually it will go through all the pages, which are numbered sequentially.
# Variables
$i=1 # Webpage Counter
$j=1 # Image Counter
$rootDir = "http://website.com/sport/galleries/"
$saveDir = "C:\Users\user\Desktop\"
$webpagetxt = "C:\Users\user\Desktop\page.txt"
$links = "C:\Users\user\Desktop\links.txt"
$regex = "http://website.com/galleries/[0-9]*/[^\.]*.JPG"
# Create folder to download to
#New-Item -Name SiouxSportsGalleries -ItemType directory
# Start Web Client
$client = New-Object System.Net.WebClient
# Main loop to get image links and download
For($i=10; $i -le 10; $i++){
# Download source code of the web page.
$url = $rootDir+$i+'.htm'
$webclient = new-object System.Net.WebClient
$webpage = $webclient.DownloadString($url)
$webpage > "$webpagetxt"
# Parse web page and find image link.
$pattern = Get-Content $webpagetxt | Select-String -pattern $regex -Allmatches
echo "This is the link" $pattern
#$pattern > $links
}
You need to extract the value that was matched. Select-String returns objects, and when you echo one, what happens is $pattern.ToString(). .ToString() returns the whole line, not the match value. To get only the links, read the Value of each entry in the Matches property of those objects.

Btw, instead of saving the webpage and reopening it with Get-Content, you can simply split the string on line breaks to get an array (if that was the only reason you saved it). :-)

EDIT: To download the images, you could extend this with another foreach loop over the extracted links.
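The original snippets for this answer are not preserved here, so the following is only a minimal sketch of the idea, not the answerer's exact code. The helper names Get-ImageLink and Save-ImageLink, the sample HTML, and the 1.jpg, 2.jpg file naming are assumptions made for illustration. The regex is the one from the question with its dots escaped.

```powershell
# Hypothetical helper: return only the matched URLs from a page's source text.
function Get-ImageLink {
    param(
        [string]$PageText,
        [string]$Pattern
    )
    # Split on line breaks instead of saving to disk and reading back with Get-Content.
    $PageText -split '\r?\n' |
        Select-String -Pattern $Pattern -AllMatches |
        ForEach-Object { $_.Matches } |   # MatchInfo -> individual regex matches
        ForEach-Object { $_.Value }       # match -> matched text only
}

# Hypothetical helper: download each link into $SaveDir as 1.jpg, 2.jpg, ...
# (the naming scheme is an assumption, not taken from the question).
function Save-ImageLink {
    param(
        [string[]]$Link,
        [string]$SaveDir,
        [int]$StartIndex = 1
    )
    $client = New-Object System.Net.WebClient
    $j = $StartIndex
    foreach ($l in $Link) {
        $target = Join-Path $SaveDir "$j.jpg"
        $client.DownloadFile($l, $target)
        $j++
    }
    $client.Dispose()
}

# Made-up page source standing in for $webclient.DownloadString($url).
$webpage = @'
<a href="http://website.com/galleries/12/first.JPG"><img src="thumb1.jpg"></a>
<p>No image on this line</p>
<a href="http://website.com/galleries/12/second.JPG"><img src="thumb2.jpg"></a>
'@

$regex = 'http://website\.com/galleries/[0-9]*/[^\.]*\.JPG'

$links = Get-ImageLink -PageText $webpage -Pattern $regex
$links

# Uncomment to actually download (needs network access and a real site):
# Save-ImageLink -Link $links -SaveDir 'C:\Users\user\Desktop'
```

With the sample text above, $links holds the two gallery URLs and nothing else from the surrounding HTML. Inside the question's For loop, the same two calls would run once per page, passing $webpage straight from DownloadString.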
Select-String returns an object with properties. Send it to Get-Member to see what goodies you have. You'll want to check out the Matches property, e.g. $pattern.Matches. Check out example 9 in the documentation.
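As a rough illustration of that inspection step, the sketch below runs Select-String on a single made-up line (the sample HTML is an assumption) and compares what the object shows as text with what its Matches property holds.

```powershell
# Made-up input line; in the question this would come from the downloaded page.
$line = 'see <a href="http://website.com/galleries/7/photo.JPG">photo</a>'
$regex = 'http://website\.com/galleries/[0-9]*/[^\.]*\.JPG'

$pattern = $line | Select-String -Pattern $regex -AllMatches

# List the properties of the returned object (Line, LineNumber, Matches, ...).
$pattern | Get-Member -MemberType Property

# The Line property is the full input line, which is what echo ends up showing.
$pattern.Line

# The Matches property holds the regex matches; Value is the matched text.
$pattern.Matches[0].Value
```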