python - Check if the URL link is correct


I want to open many URLs: I open one URL, find all the links on that page, and then open those as well or download images from them. So first I wanted to check whether each URL is valid, and I used this if statement:

    if not urlparse.urlparse(link).netloc:
        return 'broken url'

But I noticed that some values slip past this check. I came across a website where links looked like //b.thumbs.redditmedia.com/7pTYj4rOii6CkkEC.jpg, and opening one raised ValueError: unknown url type: //b.thumbs.redditmedia.com/7pTYj4rOii6CkkEC.jpg, yet my if statement did not catch it. How can I check more precisely whether a URL will work?
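The behaviour described above can be reproduced without any network access. The following is a minimal sketch, assuming Python 3's `urllib.parse` (the Python 2 `urlparse` module splits URLs the same way); `looks_absolute` is a hypothetical helper name, not part of any library. It shows that a link starting with `//` has a network location but no scheme, which is why a check on `netloc` alone lets it through.

```python
from urllib.parse import urlparse


def looks_absolute(link):
    """Return True only if the link has both a scheme and a network location."""
    parts = urlparse(link)
    return bool(parts.scheme) and bool(parts.netloc)


link = "//b.thumbs.redditmedia.com/7pTYj4rOii6CkkEC.jpg"
parts = urlparse(link)

print(repr(parts.scheme))    # '' (no scheme, so the URL cannot be opened as-is)
print(parts.netloc)          # b.thumbs.redditmedia.com (so a netloc-only check passes)
print(looks_absolute(link))  # False
```

Checking the scheme as well as the network location is one possible way to catch such links before trying to open them.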

If you are not particular about which library to use, you could do the following:

    import re
    import urllib2  # Python 2 standard library

    def is_fully_alive(url, live_check=False):
        try:
            # Reject URLs that have no network location (host)
            if not urllib2.urlparse.urlparse(url).netloc:
                return False

            website = urllib2.urlopen(url)
            html = website.read()

            if website.code != 200:
                return False

            # Get all the links in the page
            for link in re.findall('"((http|ftp)s?://.*?)"', html):
                url = link[0]

                if not urllib2.urlparse.urlparse(url).netloc:
                    return False

                # Optionally open every link to confirm that it responds
                if live_check:
                    website = urllib2.urlopen(url)

                    if website.code != 200:
                        print "Failed link:", url
                        return False

        except Exception, e:
            print "Error while trying to validate link:", url
            print e
            return False

        return True

Check your URL:

    >>> is_fully_alive("http://www.google.com")
    True

Check every link in the page by opening it:

    # Takes some time, depending on your network speed and the number of links in the page
    >>> is_fully_alive("http://www.google.com", True)
    True

Check an invalid URL:

    >>> is_fully_alive("//www.google.com")
    Error while trying to validate link: //www.google.com
    unknown url type: //www.google.com
    False
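A link such as //www.google.com is scheme-relative: it is meant to inherit the scheme of the page it appears on. When crawling, one option is to resolve each link against the page URL before validating it. The sketch below assumes Python 3's `urllib.parse`; `resolve_link` is a hypothetical helper, and `http://example.com/gallery/` stands in for whatever page the links were found on.

```python
from urllib.parse import urljoin, urlparse


def resolve_link(page_url, link):
    """Resolve a link found on page_url; return an absolute http(s) URL or None."""
    absolute = urljoin(page_url, link)
    parts = urlparse(absolute)
    # Keep only links that can actually be fetched over HTTP(S)
    if parts.scheme in ("http", "https") and parts.netloc:
        return absolute
    return None


page = "http://example.com/gallery/"  # hypothetical page URL

print(resolve_link(page, "//b.thumbs.redditmedia.com/7pTYj4rOii6CkkEC.jpg"))
# http://b.thumbs.redditmedia.com/7pTYj4rOii6CkkEC.jpg
print(resolve_link(page, "/about"))
# http://example.com/about
print(resolve_link(page, "mailto:someone@example.com"))
# None
```

The resolved URL can then be passed to a checker such as is_fully_alive instead of the raw link text.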
