| Path: | README |
| Last Update: | Sat Feb 05 20:54:33 +0000 2011 |
An easy library to do the heavy lifting between you and Craigslist's posting database. Given a URL, libcraigscrape will follow links, scrape fields, and make ruby-sense out of the raw HTML from Craigslist's servers.
For more information, head to the craigslist monitoring help section of our website.
libcraigscrape was primarily developed to support the included craigwatch script. See that script for examples of libcraigscrape in action and, hopefully, to serve an immediate craigscraping need.
Install via RubyGems:
sudo gem install libcraigscrape
On the ‘miami.craigslist.org’ site, print every post since Sep 10 that matches the query "search/sss?query=apple":
require 'rubygems'
require 'libcraigscrape'
require 'time' # provides Time.parse
require 'pp'
miami_cl = CraigScrape.new 'us/fl/miami'
# Pretty-print each post dated on or after the given time
miami_cl.posts_since(Time.parse('Sep 10'), 'search/sss?query=apple').each do |post|
  pp post
end
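Because 'Sep 10' carries no year, Time.parse fills the year in from the current date, so early in a year the same string names a date that is still in the future. The sketch below is one way to guard against that. `most_recent` is a hypothetical helper, not part of libcraigscrape, and it assumes the string always includes a month and day.

```ruby
require 'time'

# Hypothetical helper (not part of libcraigscrape): parse a year-less date
# string, and if the result lies after `now`, re-parse it against the
# previous year so the cutoff always falls in the past.
def most_recent(date_string, now = Time.now)
  parsed = Time.parse(date_string, now)
  return parsed unless parsed > now

  Time.parse(date_string, Time.local(now.year - 1))
end

# Example with a fixed "now" of Feb 5, 2011:
cutoff = most_recent('Sep 10', Time.local(2011, 2, 5))
puts cutoff.strftime('%Y-%m-%d') # prints "2010-09-10"
```

The returned Time could then be passed as the first argument to posts_since in place of Time.parse('Sep 10').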
On the ‘miami.craigslist.org’ site, print the first 225 posts under the ‘apa’ category:
require 'rubygems'
require 'libcraigscrape'
require 'pp'
i = 1 # running count of posts seen so far
CraigScrape.new('us/fl/miami').each_post('apa') do |post|
  break if i > 225 # stop once 225 posts have been printed
  i += 1
  pp post
end
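The counter-and-break pattern above can be wrapped in a small helper. The following is only a sketch: `first_posts` and `FakeScraper` are hypothetical names, not part of libcraigscrape, and the stand-in class exists only so the example runs without network access. The one thing it assumes about the real object is that each_post takes a listing path and yields one post at a time, as in the example above.

```ruby
# Hypothetical helper (not part of libcraigscrape): collect at most `limit`
# posts from any object that yields posts through each_post(listing_path).
def first_posts(scraper, listing_path, limit)
  posts = []
  return posts if limit <= 0

  scraper.each_post(listing_path) do |post|
    posts << post
    break if posts.length >= limit # leave each_post as soon as we have enough
  end
  posts
end

# Stand-in for a CraigScrape instance so the sketch runs offline.
class FakeScraper
  def each_post(_listing_path)
    1.upto(1000) { |n| yield "post #{n}" }
  end
end

sample = first_posts(FakeScraper.new, 'apa', 3)
puts sample.inspect # prints ["post 1", "post 2", "post 3"]
```

Breaking out of the block, rather than filtering afterwards, matters here because each further post may cost another page fetch.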
In Florida, with the exception of the ‘miami.craigslist.org’ and ‘keys.craigslist.org’ sites, output each post in the ‘crg’ category and each post matching the search ‘artist needed’:
require 'rubygems'
require 'libcraigscrape'
require 'pp'
# A leading '-' removes that site from the set
non_sfl_sites = CraigScrape.new('us/fl', '- us/fl/miami', '- us/fl/keys')
non_sfl_sites.each_post('crg', 'search/sss?query=artist+needed') do |post|
  pp post
end
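To make the include/exclude idea concrete, here is a toy model of how specifiers such as 'us/fl' and '- us/fl/miami' could combine. It is not the library's resolver: `known_sites`, `resolve_sites`, and the site paths other than miami and keys are all made up for illustration, and the real matching rules may differ.

```ruby
# Toy model only. The site list is hypothetical; libcraigscrape discovers
# the real one from Craigslist itself.
def known_sites
  %w[us/fl/miami us/fl/keys us/fl/tampa us/fl/orlando us/ga/atlanta]
end

# Hypothetical resolver: plain specifiers add every site at or below that
# path, specifiers starting with '-' take matching sites away again.
def resolve_sites(*specifiers)
  excluded, included = specifiers.partition { |s| s.start_with?('-') }
  matching = lambda do |prefix|
    known_sites.select { |path| path == prefix || path.start_with?("#{prefix}/") }
  end
  keep = included.flat_map { |s| matching.call(s.strip) }
  drop = excluded.flat_map { |s| matching.call(s.sub(/\A-\s*/, '').strip) }
  (keep - drop).uniq
end

puts resolve_sites('us/fl', '- us/fl/miami', '- us/fl/keys').inspect
# prints ["us/fl/tampa", "us/fl/orlando"]
```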
This grabs the full details of the specific post at miami.craigslist.org/mdc/sys/1140808860.html:
require 'rubygems'
require 'libcraigscrape'
post = CraigScrape::Posting.new 'http://miami.craigslist.org/mdc/sys/1140808860.html'
puts "(%s) %s:\n %s" % [ post.post_time.strftime('%b %d'), post.title, post.contents_as_plain ]
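To see what that format string produces without hitting the network, the same line can be run against a stand-in object. This is a sketch only: `summarize` is a hypothetical helper, and the Struct merely mimics the three accessors used above (post_time, title, contents_as_plain) with made-up values.

```ruby
# Hypothetical helper: build the same summary string as the example above
# for any object exposing post_time, title and contents_as_plain.
def summarize(post)
  "(%s) %s:\n %s" % [post.post_time.strftime('%b %d'), post.title, post.contents_as_plain]
end

# Stand-in for a CraigScrape::Posting, with invented sample data.
stub = Struct.new(:post_time, :title, :contents_as_plain)
             .new(Time.local(2009, 4, 27), 'Laptop for sale', 'Barely used.')

puts summarize(stub)
# prints:
# (Apr 27) Laptop for sale:
#  Barely used.
```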
This grabs the post summaries from the single listing page at miami.craigslist.org/search/sss?query=laptop:
require 'rubygems'
require 'libcraigscrape'
listing = CraigScrape::Listings.new 'http://miami.craigslist.org/search/sss?query=laptop'
puts 'Found %d posts for the search "laptop" on this page' % listing.posts.length
See COPYING