Metadata-Version: 2.1
Name: cleanX
Version: 0.1.4
Summary: Python library for cleaning data in large datasets of Xrays
Home-page: https://github.com/drcandacemakedamoore/cleanX
Author: doctormakeda@gmail.com
Author-email: doctormakeda@gmail.com
Maintainer: doctormakeda@gmail.com
Maintainer-email: doctormakeda@gmail.com
License: MIT
Description: <p align="center">
        <img style="width: 30%; height: 30%" src="https://github.com/drcandacemakedamoore/cleanX/blob/main/test/cleanXpic.png">
        </p>
        
        # cleanX
        
        CleanX <a href="https://doi.org/10.5281/zenodo.4725904"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.4725904.svg" alt="(DOI)"></a> <a href="https://github.com/drcandacemakedamoore/cleanX/blob/master/LICENSE"><img alt="License: GPL-3" src="https://img.shields.io/github/license/drcandacemakedamoore/cleanX"></a>
        is an open source Python library
        for exploring, cleaning and augmenting large datasets of X-rays stored as JPEG files.
        Some functions only work when the file extension (.jpg or .jpeg) is in lower case.
        (JPEG files can be extracted from DICOM files.)
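        
        If a dataset contains files such as `scan.JPG`, the extensions can be lower-cased before running cleanX. The following is a minimal sketch using only the Python standard library; the function name `lowercase_jpeg_extensions` is hypothetical and is not part of the cleanX API.
        
        ```python
        from pathlib import Path
        
        
        def lowercase_jpeg_extensions(folder):
            """Rename JPEG files in `folder` so their extension is lower case.
        
            Returns the new paths of the files that were renamed.
            """
            renamed = []
            # sorted() materializes the listing before any file is renamed.
            for path in sorted(Path(folder).iterdir()):
                suffix = path.suffix
                is_jpeg = suffix.lower() in {".jpg", ".jpeg"}
                if not path.is_file() or not is_jpeg or suffix == suffix.lower():
                    continue
                target = path.with_suffix(suffix.lower())
                # Never overwrite a different file that already has the target name.
                # (On case-insensitive file systems the target is the same file.)
                if target.exists() and not target.samefile(path):
                    continue
                path.rename(target)
                renamed.append(target)
            return renamed
        ```
        
        Calling `lowercase_jpeg_extensions("my_xrays")` on a hypothetical folder `my_xrays` would rename `scan.JPG` to `scan.jpg` and leave non-JPEG files untouched.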
        
        
        ### The latest official release:
        
        <a href="https://pypi.org/project/cleanX/"><img alt="PyPI" src="https://img.shields.io/pypi/v/cleanX"></a>
        
        
        primary author: Candace Makeda H. Moore
        
        other authors + contributors: Oleg Sivokon, Andrew Murphy
        
        ## Continuous Integration (CI) status
        
        ![ci workflow](https://github.com/drcandacemakedamoore/cleanX/actions/workflows/on-commit.yml/badge.svg)
        ![ci workflow](https://github.com/drcandacemakedamoore/cleanX/actions/workflows/on-tag.yml/badge.svg)
        
        
        ## Requirements
        
        - a [Python](https://www.python.org/downloads/) installation (3.7, 3.8 or 3.9)
        - the ability to create virtual environments (recommended, not strictly necessary)
        - tesseract-ocr, matplotlib, pandas, pillow and/or opencv
        - optionally, SimpleITK or pydicom for DICOM to JPEG conversion
        - Anaconda is now supported, but not required
        
        
        ## Documentation
        
        Online documentation at https://drcandacemakedamoore.github.io/cleanX/
        
        We encourage you to build up-to-date documentation locally with the following commands:
        
        ``` sh
        python setup.py apidoc
        python setup.py build_sphinx
        ```
        
        The documentation will be generated in the `./build/sphinx/html` directory. Documentation is generated
        automatically as new functions are added.
        
        ## Installation
        - setting up a virtual environment is desirable, but not strictly necessary
        - if you use one, activate the environment before installing
        
        ### Anaconda Installation
        
        - install with conda using the command below
        
                conda install -c doctormakeda -c conda-forge cleanx
        
        You need to specify both channels because some cleanX
        dependencies exist in both the Anaconda main channel and
        conda-forge.
        
        ### pip installation
        - install with pip using the command below
        
                pip install cleanX
            
            
        
        ## About using this library
        If you use the library, please credit me and my collaborators. You are only free to use this library according to the license. We hope that if you use the library you will open source your entire code base and send us your modifications. If you have a legitimate reason to use my library without open-sourcing your code base or following other conditions of the license, you can get in touch with me by email (doctormakeda@gmail.com), and I can issue you a separate license.
        
        We are adding new functions all the time. Many unit tests are available in the test folder. Test coverage is currently partial. The library includes many functions. Some newly added functions allow for rapid automated data augmentation (in ways that are realistic for X-rays). Other functions are for cleaning datasets, including ones that:
        
        
        - get JPEG and CSV files out of DICOM files
        - run on dataframes to make sure there is no image leakage
        - run on a dataframe to look for demographic or other biases in patients
        - crop off excessive black frames (run this on single images, one at a time)
        - run on a list to make a prototype tiny X-ray that others can be compared to
        - run on image files inside a folder to check whether they are "clean"
        - take a dataframe with image names and return plotted (visualized) images
        - run to make a dataframe of pictures in a folder (assuming they all have the same 'label'/diagnosis)
        - normalize images in terms of pixel values (multiple methods)
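        
        To illustrate what checking for image leakage means, here is a minimal standard-library sketch that looks for byte-identical files shared between a training set and a test set. It is not the cleanX implementation: the names `file_digest` and `find_leaked_images` are hypothetical, and the sketch takes plain lists of file paths rather than dataframes.
        
        ```python
        import hashlib
        from collections import defaultdict
        from pathlib import Path
        
        
        def file_digest(path):
            """Return the SHA-256 hex digest of a file's bytes."""
            return hashlib.sha256(Path(path).read_bytes()).hexdigest()
        
        
        def find_leaked_images(train_paths, test_paths):
            """Return (train_path, test_path) pairs whose files are byte-identical."""
            # Index the training files by digest so each test file is one lookup.
            train_by_digest = defaultdict(list)
            for train_path in train_paths:
                train_by_digest[file_digest(train_path)].append(train_path)
        
            leaks = []
            for test_path in test_paths:
                for train_path in train_by_digest.get(file_digest(test_path), []):
                    leaks.append((train_path, test_path))
            return leaks
        ```
        
        An empty result means that no test file is an exact copy of a training file. This approach only catches exact copies; a re-encoded or resized duplicate has different bytes and would need an image-level comparison instead.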
        
        All important functions are documented in the online documentation.
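        
        As an illustration of the black-frame cropping idea mentioned in the list of functions, here is a minimal sketch in pure Python. It is not the cleanX implementation: the function name `crop_black_frame` and the `threshold` parameter are hypothetical, and the image is assumed to be a grayscale picture given as a list of equal-length rows of integers (in practice you would more likely work on a Pillow or OpenCV image).
        
        ```python
        def crop_black_frame(pixels, threshold=0):
            """Remove border rows and columns whose values are all <= threshold.
        
            `pixels` is a list of equal-length rows of grayscale values.
            Returns a new list of rows; an all-black image yields [].
            """
            # Indices of rows that contain at least one non-black pixel.
            rows = [i for i, row in enumerate(pixels)
                    if any(value > threshold for value in row)]
            if not rows:
                return []
            # Indices of columns that contain at least one non-black pixel.
            cols = [j for j in range(len(pixels[0]))
                    if any(row[j] > threshold for row in pixels)]
            top, bottom = rows[0], rows[-1]
            left, right = cols[0], cols[-1]
            # Keep the bounding box of the non-black content, interior included.
            return [row[left:right + 1] for row in pixels[top:bottom + 1]]
        ```
        
        A threshold slightly above zero can help with JPEG images, where compression often turns a pure black border into very dark gray.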
Platform: UNKNOWN
Description-Content-Type: text/markdown
