Metadata-Version: 2.1
Name: mtsb
Version: 0.0.2
Summary: Python library that collects tweets about movies, performs a sentiment analysis and correlates it with the boxoffice result of the week after the movie release.
Home-page: https://github.com/federicodeservi/mtsb
Author: Federico De Servi, Alessandro Pontini
Author-email: federico@federicodeservi.com, a.pontini1@campus.unimib.it
License: MIT
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Natural Language :: English
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Intended Audience :: Developers
Requires-Python: >=3.5
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: beautifulsoup4
Requires-Dist: imdbpy
Requires-Dist: lxml
Requires-Dist: tweepy
Requires-Dist: kafka
Requires-Dist: pymongo
Requires-Dist: pprint
Requires-Dist: contractions
Requires-Dist: inflect
Requires-Dist: nltk
Requires-Dist: google-cloud-language
Requires-Dist: html5lib
Requires-Dist: mime

# MTSB

MTSB (Movie Tweet Sentiment Boxoffice) is a Python module that collects tweets about movies, performs sentiment analysis on them, and correlates the result with each movie's box office gross for the week after its release.

## Features

* Collects tweets about movies
* Creates hashtags for each movie
* Performs sentiment analysis on those tweets using Google's API or TextBlob and returns a mean score
* Gets box office data from Box Office Mojo
* Correlates the sentiment scores with the box office data

## Requirements

* Python >= 3.5 (it might work on older versions, but this has not been tested)
* The package has only been tested on Linux, with the following Docker Compose environment: https://gitlab.com/aletundo/data-management-lab
* All module dependencies are installed with the package, but you will also need:
    * A correctly set-up nltk module: https://www.nltk.org/install.html
    * To have run "nltk.download()" at least once
    * API keys for tweet collection: https://developer.twitter.com/en.html
    * If you plan on using Google's API, API keys for the Google Natural Language service: https://cloud.google.com/natural-language/docs/setup
* The following services must also be installed (tested on Linux):
    * JupyterLab
    * MongoDB
    * NiFi
    * Kafka

## Installation

To install MTSB, run:

```
pip install mtsb
```

## Docs

* tweet_collector()

Collects tweets about movies. It lets you choose between movies released in 2019 and movies releasing in 2020. It then creates a list of hashtags based on each movie's name and top actors and uses it to collect tweets from Twitter.

```
import mtsb

mtsb.tweet_collector()
```
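
The hashtag step described above could look roughly like the sketch below. It is only an illustration: `make_hashtags` is a hypothetical helper, not part of the mtsb API, and the rule it uses (capitalize each word, drop everything that is not a letter or digit) is an assumption about how such hashtags might be built.

```python
import re


def make_hashtags(title, actors):
    """Build a de-duplicated list of hashtags from a movie title and actor names.

    Hypothetical helper for illustration; mtsb may build hashtags differently.
    """
    def to_tag(text):
        # Split on anything that is not a letter or digit, then capitalize
        # the first character of each piece: "Knives Out" -> "#KnivesOut".
        words = re.findall(r"[A-Za-z0-9]+", text)
        joined = "".join(word[:1].upper() + word[1:] for word in words)
        return "#" + joined if joined else None

    tags = []
    for name in [title] + list(actors):
        tag = to_tag(name)
        # Skip empty results and keep only the first occurrence of each tag.
        if tag and tag not in tags:
            tags.append(tag)
    return tags


hashtags = make_hashtags("Knives Out", ["Daniel Craig", "Ana de Armas"])
print(hashtags)
```

A list like this could then be passed to whatever Twitter client you use as the set of terms to track.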

* sentiment()

Performs sentiment analysis on the collected tweets using Google's API or TextBlob and returns a weighted geometric average of score and magnitude.

```
import mtsb

mtsb.sentiment()
```
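
For readers unfamiliar with the term, a weighted geometric average of positive values `x_i` with weights `w_i` is `exp(sum(w_i * ln(x_i)) / sum(w_i))`. The sketch below implements that generic formula only. The function name, the example inputs, and the weights are assumptions for illustration. How mtsb weights score and magnitude, and how it handles negative scores, is not specified here.

```python
import math


def weighted_geometric_mean(values, weights):
    """Return exp(sum(w_i * ln(x_i)) / sum(w_i)) for strictly positive values."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(v <= 0 for v in values):
        raise ValueError("values must be strictly positive")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Average the logarithms with the given weights, then exponentiate.
    log_sum = sum(w * math.log(v) for v, w in zip(values, weights))
    return math.exp(log_sum / total_weight)


# Hypothetical inputs: a score already mapped to a positive range, and a
# magnitude, with the score weighted twice as heavily (an arbitrary choice).
combined = weighted_geometric_mean([0.8, 2.0], [2.0, 1.0])
print(combined)
```

Because the geometric mean is only defined for positive numbers, any signed sentiment score would have to be shifted or otherwise transformed before a formula like this can be applied.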

* sentiment_boxoffice_all()

Creates a dataframe with the following info for each movie:

* Movie title
* Weighted geometric average of score and magnitude (from sentiment())
* Gross boxoffice for the week after the movie release

```
import mtsb

mtsb.sentiment_boxoffice_all()
```

* spearman_corr(df)

Computes the Spearman correlation between the sentiment score and the box office gross, using the df returned by sentiment_boxoffice_all().

```
mtsb.spearman_corr(df)
```
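
As background, Spearman's coefficient is the Pearson correlation computed on the ranks of the two variables. The sketch below shows that computation in plain Python. It does not reproduce the internals of `spearman_corr`, the function names are hypothetical, and the numbers are made up for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the run while the next sorted value equals the current one.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up sentiment scores and first-week grosses (in millions) for four movies.
sentiment_scores = [0.2, 0.5, 0.9, 0.4]
first_week_gross = [10.0, 30.0, 80.0, 20.0]
print(spearman(sentiment_scores, first_week_gross))
```

Because it works on ranks, the coefficient only measures whether the two variables move in the same order, not whether their relationship is linear.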

## Links

* PyPI: https://pypi.org/project/mtsb/

## Acknowledgements

Useful Python libraries used:
* [imdbpy library](https://github.com/alberanid/imdbpy/ "imdbpy library title")
* [nltk library](https://github.com/nltk/nltk "nltk library title")
* [beautifulSoup library](https://pypi.org/project/beautifulsoup4/ "beautifulSoup library title")

## Licence

MIT licensed. See the bundled [LICENSE](https://github.com/federicodeservi/mtsb-analyzer/blob/master/LICENSE "LICENSE title") file for more details. 


