crawler 0.1.0 : Python Package Index
crawler 0.1.0
python crawler.
Latest Version: 0.1.2
=====
## Example
=====
from crawler.crawler import Crawler
mycrawler = Crawler()
seeds = ['http://www.example.com/']  # list of seed URLs
mycrawler.add_seeds(seeds)
url_patterns = [r'^(.+example\.com)(.+)']  # list of regular expressions for the URLs the crawler will work on
mycrawler.start(url_patterns)  # start crawling

#################
data files
#################
Three database (Berkeley DB) files will be generated:
queue.db
webpage.db
duplcheck.db
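
To see which URLs a pattern like the one in url_patterns would select, the list can be tried out with the standard re module alone, without running the crawler. This is only a minimal sketch: the helper matches_any and the sample URLs are hypothetical, the pattern's closing quote is assumed, and it assumes each pattern is applied with re.match, which the package description does not confirm.

```python
import re

# Pattern from the example above, written as a raw string so that `\.` is a
# literal dot. The closing quote is an assumption.
url_patterns = [r'^(.+example\.com)(.+)']


def matches_any(url, patterns):
    """Return True if the URL matches at least one pattern (hypothetical helper)."""
    return any(re.match(pattern, url) for pattern in patterns)


# Sample URLs chosen for illustration only.
candidates = [
    'http://www.example.com/',
    'http://www.example.com/about',
    'http://www.example.com',        # nothing after the host, so (.+) cannot match
    'http://www.example.org/about',  # different domain
]

selected = [url for url in candidates if matches_any(url, url_patterns)]
print(selected)
```

Under these assumptions only the first two URLs are selected. The trailing (.+) requires at least one character after example.com, so a seed written without the final slash would not match its own pattern.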