Metadata-Version: 2.0
Name: grab
Version: 0.4.13
Summary: Site Scraping Framework
Home-page: http://github.com/lorien/grab
Author: Grigory Petukhov
Author-email: lorien@lorien.name
License: BSD
Keywords: pycurl multicurl curl network parsing grabbing scraping lxml xpath
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Web Environment
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Topic :: Utilities

====
Grab
====

.. image:: https://travis-ci.org/lorien/grab.png
    :target: https://travis-ci.org/lorien/grab


Grab is a python site scraping framework. Grab provides powerful interface to two libraries:
lxml and pycurl. There are two ways how to use Grab:
1) Use Grab to configure network requests and to process fetched documents. In this way you
should manually control flow of you program.
2) Use Grab::Spider to buld asynchronous site scrapers. This is how scrapy works.

Example of Grab usage::

    from grab import Grab

    g = Grab()
    g.go('https://github.com/login')
    g.set_input('login', 'lorien')
    g.set_input('password', '***')
    g.submit()
    for elem in g.doc.select('//ul[@id="repo_listing"]/li/a'):
        print '%s: %s' % (elem.text(), elem.attr('href'))


Example of Grab::Spider usage::

    from grab.spider import Spider, Task
    import logging

    class ExampleSpider(Spider):
        def task_generator(self):
            for lang in ('python', 'ruby', 'perl'):
                url = 'https://www.google.com/search?q=%s' % lang
                yield Task('search', url=url)

        def task_search(self, grab, task):
            print grab.doc.select('//div[@class="s"]//cite').text()


    logging.basicConfig(level=logging.DEBUG)
    bot = ExampleSpider()
    bot.run()


Installation
============

Pip is recommended way to install Grab and its dependencies::

    $ pip install lxml
    $ pip install pycurl
    $ pip install grab


Documentation
=============

Russian docs: http://docs.grablib.org
English docs in progress.

Discussion group (Russian or English): http://groups.google.com/group/python-grab/


Contribution
============

If you found a bug or if you want new feature please create new issue on github:

* https://github.com/lorien/grab/issues


