I want to copy specific content from a website using Ruby/Rails. The content I need is inside a marquee HTML tag, split across several divs. How can I access this content from Ruby? More precisely, I want to use some kind of Ruby GUI (preferably Shoes). How do I do it?
This isn't really a Rails question. It's something you'd do using Ruby, then possibly display using Rails, or Sinatra or Padrino - pick your poison.
There are several different HTTP clients you can use:
- Open-URI comes with Ruby and is the easiest to use.
- Net::HTTP comes with Ruby and is the standard toolbox, but it's lower-level, so you'd have to do more work.
- HTTPClient and Typhoeus+Hydra are capable of threading and have both high-level and low-level interfaces.
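As a minimal sketch of the Open-URI route, assuming a reasonably recent Ruby where `URI.open` is available: the method name `fetch_html` and the URL in the comment are placeholders, not anything from the question.

```ruby
require 'open-uri'

# Hypothetical helper: download a page and return its body as a String.
# URI.open follows redirects and raises OpenURI::HTTPError on 4xx/5xx responses.
def fetch_html(url)
  URI.open(url) { |io| io.read }
end

# Example call (placeholder URL):
#   html = fetch_html('http://example.com/')
```

The returned string is what you would hand to an HTML parser next.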
I recommend using Nokogiri to parse the returned HTML. It's very full-featured and robust.
If you need to navigate through login screens or fill in forms before you get to the page you need to parse, then I'd recommend looking at Mechanize. It relies on Nokogiri internally so you can ask it for a Nokogiri document and parse away once Mechanize retrieves the desired URL.
If you need to deal with dynamic HTML (content built by JavaScript after the page loads), then look into the various Watir tools. They drive real web browsers and then let you access the content as seen by the browser.
Once you have the content or data you want, you can "repurpose" it into text inside a Rails page.
If I understand correctly, you want a GUI for a website scraper. If that's so, you might have to build one yourself.
The easiest way to scrape a website is with the Nokogiri or Mechanize gems. Basically, you fetch the page (Mechanize does this itself; Nokogiri needs an HTTP client such as Open-URI to hand it the HTML) and then use XPath or CSS selectors to select the text out of the DOM.
https://github.com/sparklemotion/nokogiri
https://github.com/sparklemotion/mechanize (for the documentation)