
Ruby Scraping

  • Writer: Christina Williams
  • Jul 28, 2020
  • 2 min read

One thing I would remind anyone who is scraping is not to forget about the arrays in some of the HTML. I got stuck at one point: I could not figure out what I was scraping, and my output included extra information that sat on the same line as the info I wanted to capture. Then I realized that if I broke the results down and numbered each item in the array, I would be able to capture only what I needed.


For example:



acceptance_rate = school_xml.css("div.search-result-fact")[0].text # first matching element
cost = school_xml.css("div.search-result-fact")[1].text # second matching element


The first line above scrapes the acceptance rate for each of the top 25 colleges for theatre arts, and the second line scrapes the yearly cost. These items appeared next to each other on the same line of the page, and the HTML code around them looks identical. If you do not include the array index for each item on that line (and don't forget to start with zero), your output will include everything that shares the same HTML code. At first, I was not adding the [0] and [1], and I was getting the acceptance rate and cost together on each line of output. After I put the array numbers in, I got only the data I wanted on each line.


I used the Nokogiri gem for the scrape. To find the HTML, I opened the Google Chrome developer tools (View > Developer > Developer Tools), highlighted the information I wanted to scrape, right-clicked, and chose Inspect. I then copied the code that was left highlighted and placed it in my project code.


Example of what this looks like:

![](https://www.google.com/search?biw=1327&bih=596&tbm=isch&sa=1&ei=WRtPXP-CK8_i_AaQkYKIBQ&q=view+developer+developer+tools&oq=view+developer&gs_l=img.1.2.0i24l10.3549.10491..15561...2.0..0.95.770.12......1....1..gws-wiz-img.xHVvgMIUUhg#imgrc=v9vZrl9KwdL57M:http://)

My final project is available to view on GitHub here:

https://github.com/ChristinaXT/Niche_TS

 
 
 
