version | comment |
---|---|
2021/11/30 | Added this tracking table |
2018/06/18 | Created |
This is the design document for PLFM-4930.
Since Synapse hosts files uploaded by users, we must scan user-uploaded content to mitigate the spread of malware. ClamAV is the recommended open-source antivirus solution.
We are using a fork of an existing GitHub project created by Upside Travel.
A brief overview of the system is listed below (diagram created using https://cloudcraft.co/):
Once a file is tagged as INFECTED, an S3 bucket policy prevents the file from being downloaded. Users attempting to download it will get an HTTP 403 Access Denied response from Amazon S3.
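As a rough illustration of how such a policy can be expressed, the sketch below builds a bucket policy that denies `s3:GetObject` on any object carrying an INFECTED tag. The tag key `av-status` is the upstream project's default and is an assumption here, as is the helper name `infected_deny_policy`; the policy actually deployed for Synapse may differ.

```python
import json


def infected_deny_policy(bucket: str, tag_key: str = "av-status") -> dict:
    """Build a bucket policy that denies downloads of objects tagged INFECTED.

    tag_key is an assumption (the upstream default); adjust it to match
    whatever tag the scanner actually writes.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Principal": "*",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
                # s3:ExistingObjectTag/<key> matches tags already on the object
                "Condition": {
                    "StringEquals": {f"s3:ExistingObjectTag/{tag_key}": "INFECTED"}
                },
            }
        ],
    }


# Print the policy as JSON so it can be inspected or pasted into a template.
print(json.dumps(infected_deny_policy("test.scan.bucket.sagebase.org"), indent=2))
```

Because the statement is an explicit Deny, it takes precedence over any Allow that would otherwise grant access to the object.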
Nothing currently monitors the Dead Letter SQS queue. Its purpose is to catch SNS messages that could not be delivered to the Scanner Lambda, so that undelivered messages do not fail silently.
git clone https://github.com/Sage-Bionetworks/bucket-antivirus-function
On Mac/Linux, run the following command in your cloned bucket-antivirus-function folder:
make
On Windows, run the following from Command Prompt while inside the bucket-antivirus-function folder (in File Explorer, hold down the Shift key, right-click, and select "Open command window here"). This is the same command that make runs, except that pwd does not work on Windows and must be replaced with %CD%.
docker run --rm -ti -v %CD%:/opt/app amazonlinux:latest /bin/bash -c "cd /opt/app && ./build_lambda.sh"
If you did not check out the git repository with Unix-style line endings on Windows, you may get the error ": No such file or directory".
/build/lambda.zip
lambda.zip
virus-scan-cloudformation.json
located in the git repository.
Normally, files are scanned automatically on upload. However, it may be useful to rescan files when new virus definitions are added. To trigger the Scanner Lambda manually, write a JSON message that mimics the S3 Event Notification format to the Scanner Trigger SNS topic. Below is a stripped-down example containing all the information the scanner needs:
{ "Records": [ { "eventSource": "aws:s3", "eventName": "ObjectCreated:Put", "s3": { "bucket": { "name": "test.scan.bucket.sagebase.org" }, "object": { "key": "eicar.com" } } } ] }
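A minimal sketch of how such a message could be built and published with boto3 follows. The function names and the `topic_arn` parameter are hypothetical, and the ARN of the Scanner Trigger SNS topic is not given in this document, so it must be supplied by the caller. Real S3 notifications URL-encode the object key; whether keys with special characters need the same encoding for this fork has not been verified here.

```python
import json


def build_rescan_message(bucket: str, key: str) -> str:
    """Return a minimal S3-event-style JSON string for one object."""
    event = {
        "Records": [
            {
                "eventSource": "aws:s3",
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "bucket": {"name": bucket},
                    "object": {"key": key},
                },
            }
        ]
    }
    return json.dumps(event)


def publish_rescan(topic_arn: str, bucket: str, key: str) -> str:
    """Publish the rescan message to the given SNS topic; return the message ID.

    topic_arn is an assumption: pass the ARN of the Scanner Trigger SNS topic.
    """
    import boto3  # imported lazily so the builder above works without AWS access

    sns = boto3.client("sns")
    response = sns.publish(
        TopicArn=topic_arn,
        Message=build_rescan_message(bucket, key),
    )
    return response["MessageId"]
```

Building the message with `json.dumps` rather than by hand avoids syntax slips such as trailing commas, which strict JSON parsers reject.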
The lambda function is built by a Jenkins job that creates a zip package containing the Python code for the lambda(s) and uploads it to our JFrog repository as a generic artifact. The stack builder creates a dedicated stack for the lambda(s): it downloads the zip artifact and uploads it to an S3 bucket so that the function(s) can reference it. In addition, each bucket that needs scanning is configured to send a notification, after a multipart upload completes, to the SNS topic that triggers the scanner function.
This section documents common error messages, what they mean, and what action to take (if any).
These errors occur if the file is deleted from S3 before the antivirus is able to scan it. This can happen, for example, if a file is uploaded as part of an integration test and then deleted before the antivirus reads the queue message. (This is particularly common in Bridge Server.) These messages can be ignored and, if necessary, suppressed from the error alarm.
This is an error writing the AV status to the S3 file. If the scanner says the file is clean, this is not an issue. If the scanner says the file is infected, you will need to manually quarantine the file.
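This document does not spell out what manual quarantine involves. One possible reading, given that the bucket policy blocks downloads of files tagged INFECTED, is to apply that tag by hand. The sketch below assumes the tag key is `av-status` (the upstream default); the function names are hypothetical. Because `put_object_tagging` replaces the object's whole tag set, the existing tags are read first and merged.

```python
def with_tag(tag_set: list, key: str, value: str) -> list:
    """Return a copy of an S3 TagSet with `key` set to `value`, keeping other tags."""
    kept = [tag for tag in tag_set if tag["Key"] != key]
    return kept + [{"Key": key, "Value": value}]


def quarantine(bucket: str, object_key: str, tag_key: str = "av-status") -> None:
    """Tag one object as INFECTED. tag_key is an assumed default, not confirmed here."""
    import boto3  # imported lazily so with_tag can be used without AWS access

    s3 = boto3.client("s3")
    current = s3.get_object_tagging(Bucket=bucket, Key=object_key)["TagSet"]
    s3.put_object_tagging(
        Bucket=bucket,
        Key=object_key,
        Tagging={"TagSet": with_tag(current, tag_key, "INFECTED")},
    )
```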
This is an error in the antivirus where it occasionally mistakes a binary file for a text file and tries to parse it as text. This causes the antivirus to fail to scan the file. The failure appears to be deterministic: running the file through the antivirus again causes it to fail again with the same error message.
This has been reported to bucket-antivirus-function. See:
https://github.com/bluesentry/bucket-antivirus-function/issues/142
https://github.com/bluesentry/bucket-antivirus-function/issues/183
https://github.com/bluesentry/bucket-antivirus-function/pull/188