version | comment |
---|---|
2021/11/30 | Added this tracking table |
2018/06/18 | Created |
This is the design document for PLFM-4930.
Since Synapse hosts files uploaded by users, we must scan user-uploaded content to mitigate the spread of malware. ClamAV is the recommended open-source antivirus solution.
We are using a fork of an existing GitHub project created by Upside Travel.
How it works
A brief overview of the system is listed below (diagram created using https://cloudcraft.co/):
Once a file is tagged as INFECTED, an S3 bucket policy prevents the file from being downloaded. Users attempting to download it will get an HTTP 403 Access Denied from Amazon S3.
There is currently nothing checking the Dead Letter SQS queue. Its purpose is to catch SNS messages that could not be delivered to the Scanner Lambda, so that undelivered messages do not fail silently.
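For illustration, a bucket policy with this effect could look like the sketch below, built as a Python dictionary. The tag key av-status is an assumption based on the defaults of the upstream bucket-antivirus-function project, the function name and bucket name are placeholders, and the policy actually deployed may differ.

```python
import json


def infected_deny_policy(bucket_name, tag_key="av-status", tag_value="INFECTED"):
    """Build an S3 bucket policy that denies downloads of infected objects.

    tag_key and tag_value are assumptions (believed to be the upstream
    defaults); adjust them to whatever the deployed scanner writes.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyDownloadOfInfectedObjects",
                "Effect": "Deny",
                "Principal": "*",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                # Applies only to objects whose existing tag matches.
                "Condition": {
                    "StringEquals": {f"s3:ExistingObjectTag/{tag_key}": tag_value}
                },
            }
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a placeholder bucket name.
    print(json.dumps(infected_deny_policy("example-bucket"), indent=2))
```

Because the statement is an explicit Deny, it takes precedence over any Allow that would otherwise permit the download, which is what produces the 403 response.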
Building/installing the scanner
- Clone the repository:
  git clone https://github.com/Sage-Bionetworks/bucket-antivirus-function
  On Windows, make sure you have cloned it with Unix-style line endings (LF) instead of Windows-style line endings (CRLF). You can do this by changing your git config or by running the dos2unix utility on the folder.
- Download Docker Community Edition: https://www.docker.com/community-edition#/download
- On Mac/Linux, run the following command inside your cloned bucket-antivirus-function folder:
  make
- On Windows, use Command Prompt inside the bucket-antivirus-function folder (in File Explorer, hold down the Shift key, right-click, then select "Open command window here") and run the command below. It is the same command that make runs, except that pwd does not work on Windows and must be replaced with %CD%.
  docker run --rm -ti -v %CD%:/opt/app amazonlinux:latest /bin/bash -c "cd /opt/app && ./build_lambda.sh"
  If you did not check out the git repository with Unix-style line endings on Windows, you may get the error ": No such file or directory".
- The build should have created a file called /build/lambda.zip
- Upload the zip file to an S3 bucket that you own. This is how the CloudFormation stack will access your built lambda.zip
- Create a CloudFormation stack using virus-scan-cloudformation.json, located in the git repository.
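The ": No such file or directory" error on Windows is typically caused by CRLF line endings in the build script. As a hedged alternative to dos2unix, the following minimal Python sketch detects and fixes them. The helper names has_crlf and to_lf are hypothetical and not part of the repository.

```python
from pathlib import Path


def has_crlf(path):
    """Return True if the file contains Windows-style CRLF line endings."""
    return b"\r\n" in Path(path).read_bytes()


def to_lf(path):
    """Rewrite the file in place with Unix-style LF line endings."""
    p = Path(path)
    p.write_bytes(p.read_bytes().replace(b"\r\n", b"\n"))


if __name__ == "__main__":
    # build_lambda.sh is the script invoked by the docker command;
    # run this from inside the cloned bucket-antivirus-function folder.
    script = Path("build_lambda.sh")
    if script.exists() and has_crlf(script):
        print(f"{script} has CRLF line endings; converting to LF")
        to_lf(script)
```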
Triggering the scanner manually:
Normally, files are scanned automatically on upload. However, it may be beneficial to rescan files when new virus definitions are added. To trigger the Scanner Lambda manually, a JSON message that mimics the S3 Event Notification JSON format must be written to the Scanner Trigger SNS. Below is a stripped-down JSON example that contains all the information the scanner needs:
{ "Records": [ { "eventSource": "aws:s3", "eventName": "ObjectCreated:Put", "s3": { "bucket": { "name": "test.scan.bucket.sagebase.org" }, "object": { "key": "eicar.com" } } } ] }
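A message of this shape could be built and published from Python roughly as follows. This is a sketch under assumptions: the function names are hypothetical, sns_client is expected to be a boto3 SNS client (for example boto3.client("sns")), and the topic ARN depends on the deployed stack.

```python
import json


def build_scan_message(bucket, key):
    """Return a JSON string with the minimal S3 event fields the scanner reads."""
    event = {
        "Records": [
            {
                "eventSource": "aws:s3",
                "eventName": "ObjectCreated:Put",
                "s3": {"bucket": {"name": bucket}, "object": {"key": key}},
            }
        ]
    }
    return json.dumps(event)


def trigger_scan(sns_client, topic_arn, bucket, key):
    """Publish a scan request to the Scanner Trigger SNS topic.

    sns_client: assumed to be a boto3 SNS client.
    topic_arn: ARN of the trigger topic (deployment specific).
    """
    return sns_client.publish(
        TopicArn=topic_arn,
        Message=build_scan_message(bucket, key),
    )
```

Generating the message with json.dumps rather than writing it by hand avoids syntax slips such as trailing commas, which strict JSON parsers reject.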
Limitations
- We can only scan files up to a certain size (25MB). Scanning very large files is very time- and resource-consuming (if we scan every file uploaded), and most files containing viruses are small.
- The Synapse production bucket is set up to send notifications only for multipart uploads (see PLFM-7065). We do not scan simple uploads: Synapse always uses multipart uploads, and since each part is uploaded as a simple upload, scanning those would create a lot of overhead.
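The size limit can be pictured as a simple guard, sketched below. The names are hypothetical, "25MB" is read here as 25 MiB (an assumption), and where the real scanner enforces the limit is not specified in this document.

```python
# Assumption: the 25MB limit is interpreted as 25 MiB.
MAX_SCAN_BYTES = 25 * 1024 * 1024


def within_scan_limit(size_bytes, max_bytes=MAX_SCAN_BYTES):
    """Return True if an object of this size is small enough to be scanned."""
    return 0 <= size_bytes <= max_bytes


if __name__ == "__main__":
    for size in (1024, MAX_SCAN_BYTES, MAX_SCAN_BYTES + 1):
        verdict = "scan" if within_scan_limit(size) else "skip"
        print(f"{size} bytes: {verdict}")
```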
Deployment
The lambda function is built by a Jenkins job (TODO put reference) that builds the zip package and uploads it to Artifactory. The stack builder creates a dedicated stack for the lambda: it downloads the zip artifact and uploads it to an S3 bucket so that the function can reference it. Additionally, each bucket that needs scanning is configured to send a notification after an upload to the SNS topic that triggers the function.
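The per-bucket notification setup can be illustrated with the sketch below. It is not the stack builder's actual implementation, which is not shown here. The function names are hypothetical, s3_client is assumed to be a boto3 S3 client, and the event type reflects the multipart-only limitation described earlier.

```python
def scan_notification_config(topic_arn):
    """Build a notification configuration that reports completed multipart uploads."""
    return {
        "TopicConfigurations": [
            {
                "TopicArn": topic_arn,
                # Only completed multipart uploads trigger a notification.
                "Events": ["s3:ObjectCreated:CompleteMultipartUpload"],
            }
        ]
    }


def configure_bucket(s3_client, bucket, topic_arn):
    """Apply the configuration to a bucket.

    s3_client: assumed to be a boto3 S3 client.
    Note: this call replaces the bucket's existing notification configuration.
    """
    return s3_client.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=scan_notification_config(topic_arn),
    )
```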