Metadata-Version: 2.1
Name: mrputils
Version: 0.7.0
Summary: This is a util module to help with movie revenue prediction
Home-page: https://github.com/scienclick/pde_cap_mrp_zagros/tree/main/mrputils
Author: Amir Shamsa
Author-email: amirshamsa@gmail.com
License: Apache License 2.0
Platform: UNKNOWN
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: scikit-learn
Requires-Dist: nltk

# Movie Revenue Prediction 🎬

Mission: Given the characteristics of a movie (director, actors, budget…), predict the revenue it will generate.

Dataset: IMDb (link).

#### 🚩Data
- 350k+ movies
- Multiple countries and languages
- Data fetched from www.themoviedb.org



Technology: TensorFlow, Streamlit, Python, NLP

#### 🚩 Zagros PDE Team 🌄
[Amir Shamsa](https://www.eureka.slb.com/CNP.cfm?uid=amir-20111016a)            [Syed Aaquib Hussain](https://www.eureka.slb.com/CNP.cfm?uid=syed-20160505)
[Mehdi Paydayesh](https://www.eureka.slb.com/CNP.cfm?uid=mehdi-20120402)        [Abdurraouf Aljaber](https://eur.delve.office.com/?u=9c7ac147-2739-4a06-899d-ff302ba9de0a&v=work)

#### 🚩Learn more
[Link to the detailed documentation](https://slb001-my.sharepoint.com/:p:/g/personal/mpaydayesh_slb_com/Ec0pxL9AxSJOpv8jozuAhxYBrP4yZTm9_R2MNUCdyu8uvw?e=mNZBBv)

[Link to the final presentation](https://slb001-my.sharepoint.com/:p:/g/personal/mpaydayesh_slb_com/EdngxR73sKtFvvKSnq7EI4gBcvekNsW04VWBla1r_g9GTA?e=0pD2A0)

[Trello project management](https://trello.com/invite/b/YjawKwro/ATTI7c94ea9cf3b681ea13ca96182052b4ccCD950991/project-management)

💖This has been a cool project 😆 in this bootcamp!


## Requirements

The major libraries used in these projects are:
1. numpy
2. pandas
3. seaborn
4. matplotlib
5. missingno
6. random
7. re
8. nltk
9. sklearn
10. tensorflow
11. xgboost
12. lightgbm


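Shared constants and settings:

```python
import random
import warnings

import pandas as pd

#region constants
rand_state = 100
RANDOMSEED = 100
DISPLAY_WIDTH = 400
DISPLAYMAX_COLUMNS = 25
#endregion

#region settings
random.seed(RANDOMSEED)
pd.set_option('display.width', DISPLAY_WIDTH)
pd.set_option('display.max_columns', DISPLAYMAX_COLUMNS)
# Filters added later are matched first, so 'once' takes precedence over 'ignore'.
warnings.filterwarnings('ignore')
warnings.filterwarnings(action='once')
#endregion
```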



## File structure

**Part 0: importing libraries**

**Part 1: define functions (methods)**

**Part 2: define processing functions (methods)**

**Part 3: QCs**

**Part 4: defining the features and targets**

**Part 5: making the pipeline**

**Part 6: cross validation and bagging regressor**

**Part 7: model selection**

**Part 8: grid search and hyperparameter testing**

**Part 9: TPOT testing**

**Part 10: stacking**

**Part 11: model performance and learning curve**

**Part 12: movie revenue prediction**

**Part 13: model B - creating a model to find similar movies using KNN**

**Part 14: model C - creating a model to predict the movie popularity**

**Part 15: model D - scraping new movie data for testing the model**
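
To give a feel for what Parts 5 and 6 involve, here is a minimal sketch of a scikit-learn preprocessing pipeline evaluated with cross-validation around a bagging regressor. It is not the project's actual code: the synthetic data, the column names (`budget`, `runtime`, `original_language`), the choice of decision trees as the base estimator, and every hyperparameter are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import BaggingRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeRegressor

RANDOMSEED = 100

# Synthetic stand-in data; the column names are hypothetical.
rng = np.random.default_rng(RANDOMSEED)
n_movies = 200
movies = pd.DataFrame({
    "budget": rng.uniform(1e6, 2e8, n_movies),
    "runtime": rng.uniform(80, 180, n_movies),
    "original_language": rng.choice(["en", "fr", "es"], n_movies),
})
# Assumed toy relationship: revenue grows with budget, plus noise.
revenue = 2.5 * movies["budget"] + rng.normal(0, 1e7, n_movies)

numeric_features = ["budget", "runtime"]
categorical_features = ["original_language"]

# Impute and scale numeric columns; one-hot encode categorical columns.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric_features),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_features),
])

# Full pipeline: preprocessing followed by a bagged ensemble of trees.
model = Pipeline([
    ("preprocess", preprocess),
    ("regressor", BaggingRegressor(
        DecisionTreeRegressor(),
        n_estimators=20,
        random_state=RANDOMSEED,
    )),
])

# 5-fold cross-validation; preprocessing is refit inside each fold.
cv = KFold(n_splits=5, shuffle=True, random_state=RANDOMSEED)
scores = cross_val_score(model, movies, revenue, cv=cv, scoring="r2")
print(f"Mean R^2 over {cv.get_n_splits()} folds: {scores.mean():.3f}")
```

Keeping the preprocessing inside the pipeline means each fold fits the imputer, scaler, and encoder on its own training split only, which avoids leaking validation data into the model.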


