regex - Perl: extract a URI from a web page
I need some help with a Perl script.
I am using the LWP library to fetch a web page. Now I need to extract a URL from the HTML.
I only need the first URI that ends with "1500_.jpg".
I tried URI::Find and it worked well: I got all the URIs ending in "1500_.jpg", but then I realized that they do not come out in the order in which they appear on the page.
My code is:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use URI::Find;
    use LWP::Simple;

    my $url  = 'example.com';
    my $html = get $url;
    my %uris = ();
    my $finder = URI::Find->new(\&callback);
    my $found  = $finder->find(\$html);
    my @uris  = %uris;
    my @match = grep(/1500_.jpg$/, @uris);
    foreach my $uri (@match) {
        print "$uri\n";
    }
    exit();

    sub callback {
        my ($uri_url, $uri) = @_;
        $uris{$uri}++;
        return "--- Ersetzt durch XXXxx ---";
    }
How can I extract only the first URI on the page that ends with "1500_.jpg"?
Can anyone help me?
I would do it like this:
    #!/usr/bin/env perl
    use 5.012;
    use warnings;
    use LWP::Simple;
    use HTML::Query;

    my $url   = 'http://example.com/url';
    my $html  = get $url;
    my $query = HTML::Query->new(text => $html);
    my @urls  = map { $_->attr('href') } $query->query('a[href]')->get_elements;
    @urls = grep { $_ =~ qr/1500_\.jpg$/ } @urls;
    use Data::Dumper;
    print Dumper(\@urls);
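Since the question asks only for the first matching link, it may be enough to stop at the first hit instead of filtering the whole list. The following is a minimal sketch, assuming the links have already been collected into an array in page order (an array keeps insertion order, whereas the keys of a hash do not). The sample URLs and the helper name first_1500_jpg are made up for illustration; first comes from the core List::Util module.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw(first);

# Hypothetical sample data standing in for the links collected from the page,
# kept in the order in which they appear in the HTML.
my @urls = (
    'http://example.com/img/a_500_.jpg',
    'http://example.com/img/b_1500_.jpg',
    'http://example.com/img/c_1500_.jpg',
);

# Return the first URL that ends in "1500_.jpg", or undef if none does.
# The dot is escaped so that it matches a literal "." only.
sub first_1500_jpg {
    my @candidates = @_;
    return first { /1500_\.jpg\z/ } @candidates;
}

my $first = first_1500_jpg(@urls);
print defined $first ? "$first\n" : "no match\n";
```

With the sample list above this prints http://example.com/img/b_1500_.jpg. When sticking with URI::Find, the same idea would apply if the callback pushed each URI onto an array rather than using it as a hash key.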