ruby on rails - Tried web scraping with XPath, Nokogiri, Mechanize


I am trying to parse some information from a secure web site and am having trouble getting it to work.

If I can get the first value, then I can adapt the code to get the rest.

For this example, the Entity Type field should return CARRIER.

source:

  http://safer.fmcsa.dot.gov/query.asp?searchtype=ANY&query_type=queryCarrierSnapshot&query_param=MC_MX&query_string=733709
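The source URL is a base path plus four query parameters, so it may be easier to build it from a hash than to paste the whole string into each attempt. The following is a minimal sketch using only Ruby's standard `uri` library; the helper name `build_snapshot_url` and its keyword argument are hypothetical, chosen for illustration, and the parameter names and values are copied from the URL above.

```ruby
require 'uri'

# Hypothetical helper: builds the carrier snapshot URL for one lookup value.
# The parameter names and defaults mirror the source URL in the question.
def build_snapshot_url(query_string, query_param: 'MC_MX')
  params = {
    'searchtype'   => 'ANY',
    'query_type'   => 'queryCarrierSnapshot',
    'query_param'  => query_param,
    'query_string' => query_string
  }

  uri = URI('http://safer.fmcsa.dot.gov/query.asp')
  # encode_www_form joins the pairs with "&" and escapes values where needed.
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

puts build_snapshot_url('733709')
```

Building the URL this way avoids stray spaces or `&amp;` entities creeping into the query string when it is copied between snippets.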

Mechanize w/ Hpricot

  require 'rubygems'
  require 'mechanize'
  require 'hpricot'

  agent = Mechanize.new
  page = agent.get('http://safer.fmcsa.dot.gov/query.asp?searchtype=ANY&query_type=queryCarrierSnapshot&query_param=MC_MX&query_string=733709')
  @response = page.content
  doc = Hpricot(@response)
  a = (doc/"/html/body/p/table/tbody/tr[2]/td/table/tbody/tr[2]/td/center[1]/table/tbody/tr[2]/td")[0].innerHTML
  puts a

Nokogiri

  require 'nokogiri'
  require 'open-uri'

  # URI.open comes from open-uri; plain open(url) no longer works on Ruby 3.0+
  doc = Nokogiri::HTML(URI.open("http://safer.fmcsa.dot.gov/query.asp?searchtype=ANY&query_type=queryCarrierSnapshot&query_param=MC_MX&query_string=733709"))
  ebit = doc.at("/html/body/p/table/tbody/tr[2]/td/table/tbody/tr[2]/td/center[1]/table/tbody/tr[2]/td").text
  puts ebit

It looks like the value cells all share the same CSS class, so it is possible to search by that class instead of a full XPath. This works for me:

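  require 'nokogiri'
  require 'open-uri'

  # URI.open comes from open-uri; plain open(url) no longer works on Ruby 3.0+
  doc = Nokogiri::HTML(URI.open("http://safer.fmcsa.dot.gov/query.asp?searchtype=ANY&query_type=queryCarrierSnapshot&query_param=MC_MX&query_string=733709"))

  # Get the entity type field (first element with the "queryfield" class)
  ebit = doc.at('.queryfield').text

  # Get rid of all whitespace, including non-breaking spaces.
  # Non-bang methods are used because gsub! and strip! return nil when nothing changes.
  ebit = ebit.gsub("\u00A0", "").strip
  puts ebit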
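The whitespace cleanup step can be tried on its own, without hitting the site. The sketch below uses plain Ruby with no gems; `clean_cell` is a hypothetical helper name, and the sample string is made up to imitate a table cell padded with non-breaking spaces. It relies on two facts about Ruby strings: `String#strip` does not treat U+00A0 as whitespace, and `gsub!` returns `nil` when it makes no substitution, so chaining another call onto it can raise `NoMethodError`.

```ruby
# Hypothetical helper: removes non-breaking spaces (U+00A0), then trims
# ordinary leading/trailing whitespace. Non-bang methods always return a
# String, so the chain is safe even when the input is already clean.
def clean_cell(text)
  text.gsub("\u00A0", '').strip
end

# Made-up sample imitating a scraped cell padded with &nbsp; characters.
raw = "\u00A0CARRIER \u00A0\n"

puts clean_cell(raw).inspect

# For comparison: strip alone leaves the non-breaking spaces in place.
puts raw.strip.inspect
```

The same helper could be applied to each element returned by a class-based search if more than one field is needed.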
