Metadata-Version: 2.1
Name: s3-scheduler
Version: 0.0.2
Summary: Use S3 as a scheduler mechanism using Lambda
Home-page: https://github.com/efi-mk/s3-scheduler
Author: Efi Merdler-Kravitz
Author-email: efi.merdler@gmail.com
License: UNKNOWN
Description: ## Basic Usage
        
        ### Installation
        
        `pip install s3-scheduler`
        
        ### Setting up a recurring flow
        
        The library uses AWS builtin capability to run every 1 minute. The configuration depends on your framework. For example for Zappa use
        
        `s3_events.py`
        ```python
        
        scheduler = Scheduler(...)
        def check_scheduled_events():
            scheduler.handle()
        ```
        
        `zappa_settings.json`
        ```json
        {
          "production": {
            "events": [
              {
                "function": "utils.s3_events.check_scheduled_events",
                "expression": "rate(1 minute)"
              }
            ]
          }
        }
        ```
        
        ### Scheduling
        
        ```
        import boto3
        from s3_scheduler.scheduler import Scheduler
        from s3_scheduler.utils import nowut
        s3_resource = boto3.resource("s3")
        s3_client = boto3.client("s3")
        scheduler = Scheduler("bucket", "/path", s3_resource, s3_client)
        time = nowut() + timedelta(minutes=10)
        upload_to = scheduler.schedule(time, "s3-bucket", "s3_files-important", "content")
        ```
        During initialization the scheduler requires the bucket and folder in which to keep the actual scheduling details. Remember, each event is a separate file, therefore there is a need to save them somewhere. When to schedule is a simple `datetime` object.
        
        ### Stopping
        
        ```python
        scheduler = Scheduler("bucket", "/path", s3_resource, s3_client)
        time = nowut() + timedelta(minutes=10)
        key = scheduler.schedule(time, "s3-bucket", "s3_files-important", "content")
        scheduler.stop(key)
        ```
        
        In case you want to cancel the schedule event before it occurs
        
        # Using S3 as a scheduler
        
        S3 is a powerful tool and it can be used for more than elastic persistent layer. You can read more about it on [hackernoon.com](https://hackernoon.com/s3-the-best-of-2-worlds-92576f23c000)
        
        In the following post I’m going to demonstrate how to use S3 as a scheduling mechanism to execute various tasks.
        
        ## Overview
        
        ![Simple S3 flow](https://cdn-images-1.medium.com/max/2000/1*8_iclxSZ_B6M--uGXkNp0Q.png)
        
        S3 alongside a Lambda function creates a simple event base flow, e.g. attach a Lambda to S3 PUT event, create a new file and the Lambda function is called. In order to create a schedule event all you have to do is to write the file you want to act upon on the designated time, however AWS only enables you to create recurring [events using cron or rate expression](https://docs.aws.amazon.com/lambda/latest/dg/tutorial-scheduled-events-schedule-expressions.html), what happens when you want to schedule a one time event? You are stuck.
        
        The S3-Scheduler library enables you to do just that, it uses S3 as a scheduling mechanism that enables you to schedule one time event. 
        
        ### How it works
        
        ![](https://cdn-images-1.medium.com/max/2000/1*9ZApM13Gq9OyobtKmdlZkg.png)
        
        Each event is a separate file, behind the scenes the library uses the recurring mechanism to wake up every 1 minute, scan for the relevant files using S3’s [filter capabilities](https://boto3.readthedocs.io/en/latest/guide/collections.html) and if the scheduled time had passed move the file to the relevant bucket + key.
        
        The library, in order to function properly has to know the answer to three questions:
        
        1. The content to save.
        2. Where to save it (bucket + key) → will trigger the appropriate Lambda function.
        3. When to move it to the appropriate bucket.
        
        ### Encoding details
        
        The content to save is left unchanged, points 2 and 3 mentioned above are encoded in the key’s name and use | as a separator between the parts, for example to copy the relevant content on the 5th of August to a bucket called s3-bucket and a folder named s3_important_files the scheduler will produce the following file `2018–08–05|s3-bucket|s3_files-important` . By keeping the meta data outside the actual content we achieve couple of benefits:
        
        * Speed up the process, no need to read the entire content in order to decide when and where to copy. 
        * It allows the content to be binary, not only text based.
        * By using S3 filter capabilities it reduces the cost to fetch the correct files.
        * Easier debugging, just view the file name in order to understand when and where to copy.
        
        ## Fin
        
        Scheduling in the AWS serverless world is a bit tricky, right now AWS only provides CRON like capabilities, this post demonstrated a technique that can be used to create a more robust scheduling capability. 
        
        
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown
