Metadata-Version: 2.1
Name: s3_scheduler
Version: 0.0.1
Summary: Use S3 as a scheduler mechanism using Lambda
Home-page: https://github.com/pypa/sampleproject
Author: Efi Merdler-Kravitz
Author-email: efi.merdler@gmail.com
License: UNKNOWN
Description: ## Basic Usage
        
        ### Installation
        
        `pip install s3-scheduler`
        
        ### Setting up a recurring flow
        
        The library relies on AWS's built-in scheduled events to wake up every minute. How you configure this depends on your framework. For example, with Zappa use:
        
        <script src="https://gist.github.com/efi-mk/0d3a21556f7b47423b18cca74e3bc3d1.js"></script>
        
        ### Scheduling
        
        <script src="https://gist.github.com/efi-mk/539440cb03b23b05d2ea86df530fe580.js"></script>
        During initialization, the scheduler requires the bucket and folder in which to keep the scheduling details. Each event is stored as a separate file, so the files need a place to live until they are due. The time to schedule is a plain `datetime` object.
        
        ### Stopping
        
        <script src="https://gist.github.com/efi-mk/d0e6ed2dd80988c83cb194b08f968a3b.js"></script>
        
        Use this to cancel a scheduled event before it occurs.
        
        # Using S3 as a scheduler
        
        S3 is a powerful tool, and it can be used for more than an elastic persistence layer. You can read more about it on [hackernoon.com](https://hackernoon.com/s3-the-best-of-2-worlds-92576f23c000).
        
        In this post I’m going to demonstrate how to use S3 as a scheduling mechanism to execute various tasks.
        
        ## Overview
        
        ![Simple S3 flow](https://cdn-images-1.medium.com/max/2000/1*8_iclxSZ_B6M--uGXkNp0Q.png)*Simple S3 flow*
        
        S3 alongside a Lambda function creates a simple event-based flow: attach a Lambda function to the S3 PUT event, create a new file, and the function is called. To create a scheduled event, all you have to do is write the file you want to act upon at the designated time. However, AWS only lets you create recurring [events using cron or rate expressions](https://docs.aws.amazon.com/lambda/latest/dg/tutorial-scheduled-events-schedule-expressions.html). What happens when you want to schedule a one-time event? You are stuck.
        
        The S3-Scheduler library enables you to do just that: it uses S3 as a scheduling mechanism that lets you schedule a one-time event.
        
        ### How it works
        
        ![](https://cdn-images-1.medium.com/max/2000/1*9ZApM13Gq9OyobtKmdlZkg.png)
        
        Each event is a separate file. Behind the scenes, the library uses the recurring mechanism to wake up every minute, scans for the relevant files using S3’s [filter capabilities](https://boto3.readthedocs.io/en/latest/guide/collections.html), and, if the scheduled time has passed, moves the file to the relevant bucket and key.
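        
        The selection step of one wake-up cycle can be sketched roughly as follows. This is a minimal illustration, not the library's code: `PendingEvent` and `due_events` are hypothetical names, and the in-memory list stands in for an S3 listing. A real implementation would list the objects with boto3 and then copy and delete each due file.
        
        ```python
        from dataclasses import dataclass
        from datetime import datetime, timezone
        from typing import Iterable, List
        
        
        @dataclass(frozen=True)
        class PendingEvent:
            """Hypothetical record for one scheduled file (not the library's type)."""
            source_key: str      # where the file waits in the scheduler folder
            due: datetime        # when the file should be released
            target_bucket: str   # bucket that has the Lambda trigger attached
            target_key: str      # key to write inside the target bucket
        
        
        def due_events(pending: Iterable[PendingEvent], now: datetime) -> List[PendingEvent]:
            """Return the events whose scheduled time has passed, oldest first."""
            ready = [event for event in pending if event.due <= now]
            return sorted(ready, key=lambda event: event.due)
        
        
        # Example wake-up: only the first event is due at noon on 5 August 2018.
        now = datetime(2018, 8, 5, 12, 0, tzinfo=timezone.utc)
        pending = [
            PendingEvent("a", datetime(2018, 8, 5, 11, 59, tzinfo=timezone.utc), "s3-bucket", "in/a"),
            PendingEvent("b", datetime(2018, 8, 5, 12, 1, tzinfo=timezone.utc), "s3-bucket", "in/b"),
        ]
        ready = due_events(pending, now)
        ```
        
        Because the check only runs once per minute, an event is released on the first wake-up after its due time, so it can fire up to about a minute late.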
        
        To function properly, the library needs to know three things:
        
        1. The content to save.
        2. Where to save it (bucket + key) → will trigger the appropriate Lambda function.
        3. When to move it to the appropriate bucket.
        
        ### Encoding details
        
        The content to save is left unchanged. Points 2 and 3 above are encoded in the key’s name, with `|` as a separator between the parts. For example, to copy the relevant content on the 5th of August to a bucket called `s3-bucket` and a folder named `s3_important_files`, the scheduler will produce the following file name: `2018-08-05|s3-bucket|s3_important_files`. Keeping the metadata outside the actual content brings several benefits:
        
        * It speeds up the process: there is no need to read the entire content to decide when and where to copy it.
        * It allows the content to be binary, not only text based.
        * By using S3 filter capabilities, it reduces the cost of fetching the correct files.
        * It makes debugging easier: just view the file name to understand when and where the content will be copied.
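        
        The naming scheme described above could be encoded and decoded along these lines. This is a sketch under assumptions, not the library's API: `encode_key` and `decode_key` are hypothetical helpers, and the date-only `%Y-%m-%d` format simply mirrors the example file name. A scheduler with minute resolution would need a finer timestamp.
        
        ```python
        from datetime import datetime
        from typing import Tuple
        
        SEPARATOR = "|"
        DATE_FORMAT = "%Y-%m-%d"  # assumption: matches the date-only example above
        
        
        def encode_key(when: datetime, bucket: str, folder: str) -> str:
            """Build a file name that carries the due date and the destination."""
            for part in (bucket, folder):
                if SEPARATOR in part:
                    # A separator inside a part would make the name ambiguous.
                    raise ValueError(f"{part!r} must not contain {SEPARATOR!r}")
            return SEPARATOR.join([when.strftime(DATE_FORMAT), bucket, folder])
        
        
        def decode_key(name: str) -> Tuple[datetime, str, str]:
            """Recover (due date, bucket, folder) from a name built by encode_key."""
            date_part, bucket, folder = name.split(SEPARATOR)
            return datetime.strptime(date_part, DATE_FORMAT), bucket, folder
        
        
        name = encode_key(datetime(2018, 8, 5), "s3-bucket", "s3_important_files")
        # name == "2018-08-05|s3-bucket|s3_important_files"
        when, bucket, folder = decode_key(name)
        ```
        
        Since the date is the leading part of the name, a prefix filter (for example boto3's `objects.filter(Prefix=...)` with the scheduler folder plus a date) could narrow a listing to a single day without reading any file content.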
        
        ## Fin
        
        Scheduling in the AWS serverless world is a bit tricky: right now AWS only provides cron-like capabilities. This post demonstrated a technique that uses S3 to add one-time scheduling on top of them.
        
        
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown
