Metadata-Version: 2.1
Name: tsingspider
Version: 1.4.5
Summary: A spider library of several data sources
Home-page: https://github.com/TsingJyujing/DataSpider
Author: Tsing Jyujing
Author-email: nigel434@gmail.com
License: UNKNOWN
Description: # DataSpider
        
        ![Upload Python Package](https://github.com/TsingJyujing/DataSpider/workflows/Upload%20Python%20Package/badge.svg)
        
        A spider framework that ships with several built-in spiders.
        
        ## Install
        
        ```bash
        pip install --upgrade tsingspider
        ```
        
        ## Features
        
        - Lightweight: no browser simulator has to be started, so it uses few resources
            - However, not every website can be downloaded this way
        - Lazy: nothing is downloaded until you actually use the data
        - Useful utilities
            - HLS download support
            - Cookie import from Firefox
            - Proxy support
            - Magnet link generation from torrent data
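        
        As an illustration of the magnet-link utility listed above, here is a minimal standalone sketch of the general technique: hash the bencoded `info` dictionary of a torrent with SHA-1 and embed the digest in a `magnet:` URI. It does not use or reproduce tsingspider's own implementation; `bdecode`, `magnet_from_torrent`, and the sample torrent bytes are hypothetical names and data made up for this example.
        
        ```python
        import hashlib
        from typing import Any, Tuple
        
        
        def bdecode(data: bytes, pos: int = 0) -> Tuple[Any, int]:
            """Decode one bencoded value starting at pos; return (value, next_pos)."""
            lead = data[pos:pos + 1]
            if lead == b"i":  # integer: i<digits>e
                end = data.index(b"e", pos)
                return int(data[pos + 1:end]), end + 1
            if lead in (b"l", b"d"):  # list or dict, both terminated by "e"
                items = []
                pos += 1
                while data[pos:pos + 1] != b"e":
                    item, pos = bdecode(data, pos)
                    items.append(item)
                if lead == b"l":
                    return items, pos + 1
                # A dict is stored as alternating keys and values.
                return dict(zip(items[0::2], items[1::2])), pos + 1
            # byte string: <length>:<bytes>
            colon = data.index(b":", pos)
            length = int(data[pos:colon])
            start = colon + 1
            return data[start:start + length], start + length
        
        
        def magnet_from_torrent(torrent: bytes) -> str:
            """Build a magnet URI from the raw bytes of a .torrent file."""
            if torrent[:1] != b"d":
                raise ValueError("a torrent file must be a bencoded dictionary")
            pos = 1
            while torrent[pos:pos + 1] != b"e":
                key, pos = bdecode(torrent, pos)
                value_start = pos
                _, pos = bdecode(torrent, pos)
                if key == b"info":
                    # Hash the original bytes of the info dict, not a re-encoding.
                    digest = hashlib.sha1(torrent[value_start:pos]).hexdigest()
                    return "magnet:?xt=urn:btih:" + digest
            raise ValueError("no info dictionary found")
        
        
        # Hypothetical single-file torrent, hand-written for illustration only.
        sample = (
            b"d8:announce14:http://tracker4:infod6:lengthi12e4:name8:demo.txt"
            b"12:piece lengthi16384e6:pieces20:" + bytes(20) + b"ee"
        )
        magnet = magnet_from_torrent(sample)
        ```
        
        Hashing the original byte span of `info` avoids depending on the torrent having been encoded with sorted keys, which a decode-then-re-encode approach would silently assume.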
        
        ## Write Your Own Spider
        
        To define a resource, subclass `LazySoup` or `LazyContent`.
        `LazyContent` is for binary data, which covers essentially any kind of resource.
        `LazySoup` is for XML-format resources and is widely used for downloading web pages.
        
        For example:
        
        ```python
        from tsing_spider.util import LazySoup, LazyContent
        
        class YourOwnSpider(LazySoup):
            def __init__(self, url: str):
                super().__init__(url)
        
            @property
            def some_info(self) -> str:
                """
                Extract information from self.soup.
                The data is downloaded the first time it is used.
                """
                # Placeholder: replace with your own extraction logic on self.soup.
                raise NotImplementedError
        ```
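        
        The "downloaded the first time it is used" behaviour mentioned in the docstring can be illustrated with a small standalone sketch. This is not the library's code: `LazyResource`, `fake_fetch`, and `fetch_count` are hypothetical names, and the downloader is injected so that the example runs without network access.
        
        ```python
        from typing import Callable, Optional
        
        
        class LazyResource:
            """Hypothetical stand-in for a lazy resource; not part of tsingspider."""
        
            def __init__(self, url: str, fetch: Callable[[str], bytes]):
                self.url = url
                self._fetch = fetch  # injected downloader, e.g. an HTTP GET function
                self._content: Optional[bytes] = None
                self.fetch_count = 0  # how many times the downloader was called
        
            @property
            def content(self) -> bytes:
                # Download only on first access, then reuse the cached bytes.
                if self._content is None:
                    self._content = self._fetch(self.url)
                    self.fetch_count += 1
                return self._content
        
        
        def fake_fetch(url: str) -> bytes:
            # Stand-in for a real download so the sketch needs no network.
            return f"<html><title>{url}</title></html>".encode("utf-8")
        
        
        # Creating the object triggers no download; reading .content does.
        page = LazyResource("https://example.com", fake_fetch)
        ```
        
        Reading `page.content` repeatedly calls `fake_fetch` only once, so a spider can expose many properties cheaply and pay for a download only when one of them is actually read.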
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
