Metadata-Version: 2.1
Name: socialvec
Version: 0.1.7.1
Summary: SocialVec is a framework of Social Embeddings for eliciting social world knowledge from social networks.
Home-page: https://github.com/nirlotan/socialvec
Author: Nir Lotan
Author-email: nir.lotan@gmail.com
License: MIT license
Keywords: socialvec
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.8
Description-Content-Type: text/x-rst
License-File: LICENSE
License-File: AUTHORS.rst
Requires-Dist: click >=8.1.3
Requires-Dist: setuptools >=60.2.0
Requires-Dist: pandas >=1.5.0
Requires-Dist: numpy >=1.23.3
Requires-Dist: fastparquet >=0.8.3
Requires-Dist: pyarrow
Requires-Dist: wget >=3.2
Requires-Dist: yaspin
Requires-Dist: PyYAML
Requires-Dist: gensim >4
Requires-Dist: scikit-learn
Requires-Dist: tensorflow

=========
SocialVec
=========

.. image:: https://img.shields.io/pypi/v/socialvec.svg
   :target: https://pypi.python.org/pypi/socialvec

The **SocialVec** package provides pre-trained embeddings for approximately 200,000 popular Twitter accounts. **SocialVec** is a framework for learning social entity embeddings; the embeddings are learned from a large-scale Twitter dataset of 1.3 million users and the accounts they follow.

* Free software: MIT license

What are SocialVec Embeddings?
==============================

**SocialVec embeddings** are low-dimensional vector representations of popular Twitter accounts, trained on co-occurrence patterns observed in the Twitter social network. Accounts that are frequently followed by the same users are considered socially related and receive similar vectors, much as word embeddings assign similar vectors to words that appear in similar contexts.
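
The co-follow signal can be illustrated with a small counting sketch. This is not the SGNS training procedure itself, only the raw statistic that such training draws on. The ``follows`` data and the ``co_follow_counts`` helper are hypothetical and are not part of the package.

.. code-block:: python

    from collections import Counter
    from itertools import combinations

    # Hypothetical toy data: each user maps to the accounts they follow.
    follows = {
        "user_a": ["nasa", "spacex", "natgeo"],
        "user_b": ["nasa", "spacex"],
        "user_c": ["nba", "espn"],
        "user_d": ["nasa", "natgeo", "espn"],
    }

    def co_follow_counts(follows):
        """Count how many users follow both accounts of each unordered pair."""
        counts = Counter()
        for accounts in follows.values():
            # sorted(set(...)) gives every pair a canonical order and drops duplicates.
            for pair in combinations(sorted(set(accounts)), 2):
                counts[pair] += 1
        return counts

    counts = co_follow_counts(follows)
    for pair, n in counts.most_common(3):
        print(pair, n)

Pairs that many users follow together end up with high counts, and an embedding model trained on the same follow lists would be expected to place such accounts close to each other.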

Package Features
================

This package includes the following features:

- **Access to pre-trained SocialVec embeddings:**

  - Pre-trained embeddings for approximately 200,000 popular Twitter accounts.
  - Embeddings are 100-dimensional, trained using the Skip-gram model with negative sampling (SGNS).

- **Entity similarity computation:**

  - Calculate cosine similarity between SocialVec embeddings to assess social similarity between entities.
  - Enables tasks like:

    - Identifying similar entities (e.g., universities similar to UC Berkeley).
    - Recommending Twitter accounts based on existing followings.
    - Assessing the political leaning of news sources.

- **Entity analogy exploration:**

  - Experiment with relational arithmetic on SocialVec embeddings to explore entity analogies, similar to word analogies.
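
As a minimal sketch of the similarity feature, the snippet below computes cosine similarity and a nearest-neighbor ranking over toy vectors. The vectors, the account names used as keys, and the ``most_similar`` helper are illustrative assumptions; they do not show the package's own API, and the real embeddings are 100-dimensional.

.. code-block:: python

    import math

    # Hypothetical 3-dimensional toy vectors (the real embeddings have 100 dimensions).
    embeddings = {
        "UCBerkeley": [0.9, 0.1, 0.0],
        "Stanford": [0.8, 0.2, 0.1],
        "MIT": [0.7, 0.3, 0.0],
        "ESPN": [0.0, 0.1, 0.9],
    }

    def cosine_similarity(u, v):
        """Cosine of the angle between two equal-length vectors."""
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    def most_similar(name, embeddings, topn=3):
        """Rank all other entities by cosine similarity to `name`."""
        query = embeddings[name]
        scored = [
            (other, cosine_similarity(query, vec))
            for other, vec in embeddings.items()
            if other != name
        ]
        return sorted(scored, key=lambda item: item[1], reverse=True)[:topn]

    neighbors = most_similar("UCBerkeley", embeddings, topn=2)
    print(neighbors)

Cosine similarity ignores vector length and compares direction only, which is why it is the usual choice for comparing embeddings.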

Potential Applications
======================

The **SocialVec** package can be used for a wide range of tasks, including:

- **Recommendation systems:** Recommending Twitter accounts or other content based on user social affinity captured by the embeddings.
- **Social analysis:** Investigating social trends and relationships between entities on Twitter.
- **Bias detection:** Identifying potential biases in social media content or user behavior based on social context.
- **Inferring personal traits:** Predicting user characteristics like age, gender, or political leaning based on their social connections on Twitter.
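
One simple way to approach trait inference is to represent a user as the average of the embeddings of the accounts they follow, and then compare that vector with per-group centroids. The sketch below assumes this averaging scheme and a nearest-centroid rule; it is a stand-in for illustration, not the package's ``SocialVecClassifier``, and all names and vectors are hypothetical.

.. code-block:: python

    import math

    # Hypothetical 2-dimensional account vectors.
    account_vectors = {
        "acct_1": [1.0, 0.0],
        "acct_2": [0.8, 0.2],
        "acct_3": [0.0, 1.0],
        "acct_4": [0.2, 0.8],
    }

    def cosine_similarity(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

    def user_vector(followed, account_vectors):
        """Average the vectors of the followed accounts that have an embedding."""
        known = [account_vectors[a] for a in followed if a in account_vectors]
        if not known:
            raise ValueError("none of the followed accounts has an embedding")
        dim = len(known[0])
        return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]

    def nearest_label(vector, centroids):
        """Return the label whose centroid is most similar to `vector`."""
        return max(centroids, key=lambda label: cosine_similarity(vector, centroids[label]))

    # Centroids built from hypothetical labeled users.
    centroids = {
        "group_x": user_vector(["acct_1", "acct_2"], account_vectors),
        "group_y": user_vector(["acct_3", "acct_4"], account_vectors),
    }

    # Accounts without an embedding ("unknown_acct") are skipped.
    new_user = user_vector(["acct_1", "acct_4", "unknown_acct"], account_vectors)
    predicted = nearest_label(new_user, centroids)
    print(predicted)

In practice a trained classifier over the averaged vectors would usually replace the nearest-centroid rule; the averaging step is what turns a follow list into a fixed-size input.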

Examples
========

Here are some practical examples of what you can do with **SocialVec**:

- **Finding similar entities:** Retrieve universities similar to UC Berkeley based on the cosine similarity of their SocialVec embeddings.
- **Recommending Twitter accounts:** Suggest accounts similar to those followed by a specific user, leveraging social context captured in the embeddings.
- **Assessing political leaning:** Determine the political bias of news sources by comparing their similarity to embeddings of politically polarized accounts (e.g., accounts of prominent politicians).
- **Exploring entity analogies:** Complete analogies like *"X-Factor : Simon Cowell :: The Voice : ?"* using vector arithmetic on SocialVec embeddings.
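
Two of the examples above, analogy completion and leaning assessment, can be sketched with plain vector arithmetic. The toy vectors, the placeholder entity names, and the ``complete_analogy`` and ``leaning_score`` helpers are assumptions made for illustration; they are not functions of the package, and the scoring rule is only one plausible way to compare an outlet with two reference groups.

.. code-block:: python

    import math

    def cosine_similarity(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

    def complete_analogy(a, b, c, embeddings):
        """Solve a : b :: c : ? as the entity nearest to vec(b) - vec(a) + vec(c)."""
        target = [
            vb - va + vc
            for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
        ]
        # The three query entities are excluded from the candidates.
        candidates = [name for name in embeddings if name not in {a, b, c}]
        return max(candidates, key=lambda name: cosine_similarity(target, embeddings[name]))

    def leaning_score(name, group_a, group_b, embeddings):
        """Mean similarity to group_a minus mean similarity to group_b."""
        def mean_sim(group):
            sims = [cosine_similarity(embeddings[name], embeddings[g]) for g in group]
            return sum(sims) / len(sims)
        return mean_sim(group_a) - mean_sim(group_b)

    # Hypothetical entities: two shows, their judges, and an unrelated account.
    shows = {
        "show_a": [1.0, 0.0, 0.0],
        "judge_a": [1.0, 1.0, 0.0],
        "show_b": [0.0, 0.0, 1.0],
        "judge_b": [0.0, 1.0, 1.0],
        "other": [1.0, 0.0, 1.0],
    }
    answer = complete_analogy("show_a", "judge_a", "show_b", shows)

    # Hypothetical reference politicians (groups a and b) and two news outlets.
    politics = {
        "pol_a1": [1.0, 0.1], "pol_a2": [0.9, 0.2],
        "pol_b1": [0.1, 1.0], "pol_b2": [0.2, 0.9],
        "outlet_x": [0.8, 0.3], "outlet_y": [0.2, 0.7],
    }
    score_x = leaning_score("outlet_x", ["pol_a1", "pol_a2"], ["pol_b1", "pol_b2"], politics)
    score_y = leaning_score("outlet_y", ["pol_a1", "pol_a2"], ["pol_b1", "pol_b2"], politics)
    print(answer, score_x > 0, score_y > 0)

A positive score means the outlet sits closer to group a than to group b, and a negative score means the opposite; the magnitude is only meaningful relative to other outlets scored the same way.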

Advantages of SocialVec
=======================

- **Captures social world knowledge:** Unlike embeddings derived from factual knowledge bases like Wikipedia or Wikidata, SocialVec embeddings reflect relationships between entities based on social media interactions.
- **Wider coverage:** SocialVec represents a broader range of entities, as many Twitter accounts do not have corresponding Wikipedia pages.

Notes
=====

This README covers the pre-trained embeddings provided by the package. Specific implementation details and additional functionality will be defined as part of the package's development.

Credits
=======

This package was created with Cookiecutter_ and the `audreyr/cookiecutter-pypackage`_ project template.

.. _Cookiecutter: https://github.com/audreyr/cookiecutter
.. _`audreyr/cookiecutter-pypackage`: https://github.com/audreyr/cookiecutter-pypackage


History
=======

0.1.0 (2022-09-29)
------------------

* First release on PyPI.

0.1.1 (2022-09-29)
------------------

* Include config.yaml in the distribution.

0.1.2 (2022-10-02)
------------------

* Rearrange config.yaml
* Support multiple versions of the SocialVec model
* Fix bug when searching for similarity using username

0.1.3 (2022-10-14)
------------------

* Initial version of SocialVecClassifier

0.1.4 (2022-11-08)
------------------

* Updates to SocialVecClassifier

0.1.5 (2023-11-07)
------------------

* Update a dedicated model for the SocialVecClassifier (2020c)

0.1.6 (2023-11-09)
------------------

* Modify requirements to support more up-to-date Python versions

0.1.7 (2024-10-22)
------------------
* Add the option to load the model to RAM in case there is no write permission to the package folder (which

0.1.7.1 (2024-10-22)
--------------------

* Add PyPI documentation
