{
    "componentChunkName": "component---src-templates-blog-post-js",
    "path": "/python-scrap-data-from-url/",
    "result": {"data":{"site":{"siteMetadata":{"title":"CrewCode Solutions"}},"markdownRemark":{"id":"55d25ca7-2fd8-5d83-8882-53b080593b25","excerpt":"In this article i will explain you how you can scrap html data from any url like amazon, flipkart using BeautifulSoup Step 1: Install necessary python library…","html":"<p>In this article i will explain you how you can scrap html data from any url like amazon, flipkart using BeautifulSoup</p>\n<h4>Step 1: Install necessary python library which gonna help in scrapping data</h4>\n<ul>\n<li>BeautifulSoup: Our primary module that contain method to access webpage over HTTP</li>\n</ul>\n<pre><code class=\"language-py\">pip install bs4\n</code></pre>\n<ul>\n<li>Requests: It sends the http request to the url flawlessly</li>\n</ul>\n<pre><code class=\"language-py\">pip install requests\n</code></pre>\n<h4>Step 2: Add this code where you want to scrap the data using BeautifulSoup</h4>\n<ul>\n<li>\n<p>We make http request using requests library</p>\n<pre><code class=\"language-py\">r = requests.get(url, timeout=10)\n</code></pre>\n</li>\n<li>\n<p>Then we parse the content in the form of html using BeautifulSoup you can change the parser type to lxml as well</p>\n<pre><code class=\"language-py\">soup = BeautifulSoup(r.content, 'html.parser')\n</code></pre>\n</li>\n<li>\n<p>Then we extract data using tags</p>\n<pre><code class=\"language-py\">images = soup.select('img')\n</code></pre>\n</li>\n</ul>\n<pre><code class=\"language-py\">import requests\nfrom bs4 import BeautifulSoup\n\nurl=''\nr = \"\"\n\ntry:\n    r = requests.get(url, timeout=10)\n    print(r.content)\n\n    #you can change the parser type from html to lxml\n    soup = BeautifulSoup(r.content, 'html.parser')\n    list = []\n    images_list = []\n    images = soup.select('img')\n\n    # Extracting all the images from the url\n    for image in images:\n        src = image.get('src')\n        alt = image.get('alt')\n        images_list.append({\"src\": src, \"alt\": alt})\n\n    for image in images_list:\n        print(image)\n\n    # Extracting html tag and al the children html element and saving those html element in our local directory\n    for tag in soup.select('html'):\n       list.append(str(tag))\n    list2= (', '.join(list))\n    with open('scrap.html', 'w',encoding='UTF-8') as f:\n       f.write(list2)\n    r.raise_for_status()\nexcept requests.exceptions.HTTPError as errh:\n    print (\"Http Error:\",errh)\nexcept requests.exceptions.ConnectionError as errc:\n    print (\"Error Connecting:\",errc)\nexcept requests.exceptions.Timeout as errt:\n    print (\"Timeout Error:\",errt)\nexcept requests.exceptions.RequestException as err:\n    print (\"OOps: Something Else\",err)\n</code></pre>","fields":{"slug":"/python-scrap-data-from-url/"},"frontmatter":{"title":"Python how to scrap data from any url","date":"January 01, 2023","description":"In this article i will explain you step by step how you can scrap data from any url using BeautifulSoup","bannerimage":"https://crew-code-images.s3.us-east-1.amazonaws.com/blog_images/python.jpg"}},"previous":{"fields":{"slug":"/python-opencv-tutorial-croping-resizing-facedetection-captureimagefromvideo/"},"frontmatter":{"title":"Python tutorial, reading and croping image, face detection, capturing image from video","date":"December 31, 2022","bannerimage":"https://crew-code-images.s3.us-east-1.amazonaws.com/blog_images/python.jpg"}},"next":{"fields":{"slug":"/python-tutorial-converting-speech-to-text/"},"frontmatter":{"title":"Python tutorial converting speech to text using python script","date":"January 08, 2023","bannerimage":"https://crew-code-images.s3.us-east-1.amazonaws.com/blog_images/python.jpg"}}},"pageContext":{"id":"55d25ca7-2fd8-5d83-8882-53b080593b25","previousPostId":"2196f990-5637-5dd3-9a77-6eae13f365a9","nextPostId":"1af3684c-e658-5671-9a2a-561a3e565085"}},
    "staticQueryHashes": ["3860684146"]}