Metadata-Version: 2.3
Name: spidar
Version: 0.0.1
Summary: Web Scraping library
Project-URL: Download, https://github.com/shari-ful/spidar.git
Project-URL: Homepage, https://github.com/shari-ful/spidar.git
Author-email: Shariful Alam <2ashariful@gmail.com>
License: MIT License
License-File: LICENSE
Keywords: HTML,XML,parse,soup
Requires-Python: >=3.6.0
Requires-Dist: soupsieve>1.2
Provides-Extra: cchardet
Requires-Dist: cchardet; extra == 'cchardet'
Provides-Extra: chardet
Requires-Dist: chardet; extra == 'chardet'
Provides-Extra: charset-normalizer
Requires-Dist: charset-normalizer; extra == 'charset-normalizer'
Provides-Extra: html5lib
Requires-Dist: html5lib; extra == 'html5lib'
Provides-Extra: lxml
Requires-Dist: lxml; extra == 'lxml'
Description-Content-Type: text/markdown

Spidar is a library that makes it easy to scrape information
from web pages. It sits atop an HTML or XML parser, providing Pythonic
idioms for iterating, searching, and modifying the parse tree.
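
To make "sits atop an HTML or XML parser" concrete, here is a minimal sketch that uses only Python's standard-library `html.parser` module, which is presumably what the `'html.parser'` argument in the quick start selects. `EventLogger` is a hypothetical name for illustration and is not part of Spidar. It records the flat stream of start, end, and data events that a bare parser emits; a tree-building layer turns such events into a navigable tree.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Hypothetical helper: record the raw events emitted by the stdlib parser."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Called once for every opening tag, e.g. <p> or <b>.
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        # Called once for every closing tag, e.g. </p>.
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Called for the text found between tags.
        self.events.append(("data", data))


logger = EventLogger()
logger.feed("<p>Some<b>bad<i>HTML</i></b></p>")
logger.close()

for kind, value in logger.events:
    print(kind, value)
```

The parser alone reports events in document order and keeps no record of which tag contains which. Tracking that nesting is the bookkeeping that a parse-tree library takes off your hands.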

# Quick start

```
>>> from spidar import Spidar
>>> html_content = '<html><body><p>Some<b>bad<i>HTML</i></b></p></body></html>'
>>> sp = Spidar(html_content, 'html.parser')
>>> print(sp.prettify())
<html>
    <body>
        <p>
            Some
            <b>
                bad
                <i>
                    HTML
                </i>
            </b>
        </p>
    </body>
</html>
>>> sp.find(text="bad")
'bad'
>>> sp.i
<i>HTML</i>
#
>>> sp = Spidar("<p>Some<b>bad<i>HTML")
#
>>> print(sp.prettify())
<html>
    <body>
        <p>
            Some
            <b>
                bad
                <i>
                    HTML
                </i>
            </b>
        </p>
    </body>
</html>
#
>>> sp = Spidar("<tag1>Some<tag2/>bad<tag3>XML", "xml")
#
>>> print(sp.prettify())
<?xml version="1.0" encoding="utf-8"?>
<tag1>
    Some
    <tag2/>
    bad
    <tag3>
        XML
    </tag3>
</tag1>
```


# Running the unit tests

Spidar's unit tests can be discovered and run with pytest:

```
$ pytest
```

