If a webpage has many photos and you want to download all of them, downloading them manually is not a good option because it can take several minutes. You need a script that can scrape the images from the page. In this post, I have written a simple Python script that downloads images from a webpage and saves them to your local machine. Some web servers reject scraping requests that do not include a user agent, so I have included one in this code that most web servers should accept. If a web server still rejects your request, you can find other user-agent strings on the internet and try them.
The following code saves the images in the 'photos' folder with names of the form 'my-photo-*'. You can modify it if you want different names. The code uses the Python modules bs4, urllib, and requests for scraping and saving images. It asks bs4 to use the html5lib parser, so that package needs to be installed as well.
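The idea of trying several user agents can be sketched as a small loop. This is only a rough sketch under my own assumptions: the function name `open_with_fallback` and its `opener` parameter are hypothetical and are not part of the script in this post. It relies on `urllib.request.urlopen` raising `urllib.error.HTTPError` when a server answers with an error status such as 403.

```python
import urllib.error
import urllib.request


def open_with_fallback(url, user_agents, opener=urllib.request.urlopen):
    """Try each user agent in turn; return the first successful response or None.

    `opener` is a hypothetical hook (defaults to urlopen) so the loop can be
    tried out without touching the network.
    """
    for agent in user_agents:
        request = urllib.request.Request(url, None, {'User-Agent': agent})
        try:
            return opener(request)
        except urllib.error.HTTPError as err:
            # The server answered with an error status (for example 403).
            print('Rejected with status', err.code, 'for user agent:', agent)
    return None
```

You would pass it the page URL and a list of user-agent strings that you have collected. If it returns a response, that response can be handed to BeautifulSoup in the same way as the result of `urllib.request.urlopen`.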
from bs4 import BeautifulSoup
import os
import urllib.request
import requests


def save_image_file(ilink, filename, headers=None):
    """
    Download and save the image file from a URL.
    """
    # send the same headers (user agent) that were used for the page request
    response = requests.get(ilink, headers=headers)
    if response.status_code == 200:
        with open(filename, 'wb') as f:
            f.write(response.content)
    else:
        print("Bad response code for the link:", ilink)


def read_url_data(link, headers):
    """
    Read the URL and create a BeautifulSoup object.
    """
    request = urllib.request.Request(link, None, headers)
    response = urllib.request.urlopen(request)
    return BeautifulSoup(response, 'html5lib')


if __name__ == "__main__":
    # variables
    user_agent = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36 OPR/38.0.2220.41'
    headers = {'User-Agent': user_agent}
    url = 'https://cs.tellustheanswer.com/matplotlib-python-code-to-plot-bar-charts-with-error-bars/'

    # read the URL
    soup = read_url_data(url, headers)

    # make sure the output folder exists before writing into it
    os.makedirs('photos', exist_ok=True)

    # check all img tags and download images
    photo_name = 'my-photo-'
    i = 1  # counter used in the file names
    for tag in soup.find_all('img'):  # get all img tags
        picurl = tag.get('src')  # None if the tag has no src attribute
        if picurl:
            ext = '.' + picurl.split('.')[-1]  # capture the photo extension
            filename = 'photos/' + photo_name + str(i) + ext
            print('Downloading.....', picurl)
            save_image_file(picurl, filename, headers)
            i += 1
        else:
            print('BAD TAG', tag)
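The loop above takes the extension from whatever follows the last dot in `src` and passes `src` to the downloader unchanged. That works for the page used here, but it can go wrong when `src` is a relative link or carries a query string such as `?w=300`. Here is a minimal sketch of one way to handle those cases with the standard `urllib.parse` and `os.path` modules. The helper name `resolve_image` and the `default_ext` fallback of '.jpg' are my own assumptions for illustration, not part of the script.

```python
import os
from urllib.parse import urljoin, urlparse


def resolve_image(page_url, src, index, prefix='my-photo-', default_ext='.jpg'):
    """Return (absolute_url, filename) for an img src found on page_url.

    default_ext is an assumed fallback for URLs that have no file extension.
    """
    # urljoin resolves relative and protocol-relative links against the page URL
    absolute_url = urljoin(page_url, src)
    # the path component excludes the query string and fragment
    path = urlparse(absolute_url).path
    ext = os.path.splitext(path)[1] or default_ext
    return absolute_url, prefix + str(index) + ext


print(resolve_image('https://example.com/blog/post/', '../img/plot.png?w=300', 1))
# ('https://example.com/blog/img/plot.png?w=300', 'my-photo-1.png')
```

In the download loop you could call this helper with the page URL, the `src` value, and the counter, then prepend 'photos/' to the returned file name before saving.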
Post your comments to let me know if this code works for you.