
This document was created for review by the Synapse engineering team in response to SWC-6374. There is no guarantee that this information is up to date, nor is it prescriptive for future security concerns.

Findings

A recent penetration test performed by NCC has found a vulnerability in our Cross-Origin Resource Sharing (CORS) configurations for synapse.org (Portal services) and repo-prod.prod.sagebase.org (Repo services). The report describes how an XSS attack could be orchestrated to exfiltrate user credentials from an arbitrary subdomain of *.synapse.org. The scenario is unlikely, but high impact; a successful attack would fully compromise the Synapse account of each affected user.

Our Portal services' CORS configuration enables us to retrieve and re-use authentication credentials across all *.synapse.org sites, so users don’t have to authenticate multiple times. NCC recommends that we restrict this functionality to a whitelist of known subdomains of synapse.org. Because this functionality is only provided on *.synapse.org, our whitelist would not need to contain any origins that are not subdomains of synapse.org.

To follow this recommendation, we must estimate the level of effort for, and then carry out, the following:

  1. Regularly or programmatically compile a list of valid synapse.org subdomains.

  2. Add functionality to the portal to utilize this list of subdomains to restrict valid .synapse.org origins.
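As a rough illustration of step 2, the sketch below shows one way an exact-match allowlist could decide which CORS headers to return. It is written in Python for brevity and is not the Portal's actual filter; the function name `cors_headers` and the contents of `CREDENTIALED_ORIGINS` are assumptions made for this example.

```python
from typing import Dict, FrozenSet, Optional

# Hypothetical allowlist for illustration; the real list would be
# compiled as described in step 1.
CREDENTIALED_ORIGINS: FrozenSet[str] = frozenset({
    "https://www.synapse.org",
    "https://staging.synapse.org",
})


def cors_headers(origin: Optional[str],
                 allowlist: FrozenSet[str] = CREDENTIALED_ORIGINS) -> Dict[str, str]:
    """Return CORS response headers for a request with the given Origin."""
    if origin is not None and origin in allowlist:
        # Exact match only: echo the origin back and allow credentials.
        return {
            "Access-Control-Allow-Origin": origin,
            "Access-Control-Allow-Credentials": "true",
        }
    # Every other origin keeps non-credentialed access.
    return {"Access-Control-Allow-Origin": "*"}
```

With this approach, an unlisted subdomain such as `https://unknown.synapse.org` would receive only the wildcard header, so a browser would not expose credentialed responses to it.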

The domain for the repository services (repo-<stack>.prod.sagebase.org) was also specified in the report, but the services are not vulnerable to the same attack, and specific findings were not provided. There is a related misconfiguration that does not expose us to any vulnerability, but we should address it to reduce the risk of vulnerabilities introduced by future changes.

Both of our applications allow any origin to handle the responses of non-credentialed requests. Taking similar action against non-credentialed requests would impose a much greater technical burden. We should seek further correspondence with NCC to determine if the above would be sufficient action to resolve the issue.

Background

This section summarizes the purpose of CORS and the HTTP response headers relevant to the penetration test findings, and outlines the threat model that a “proper” CORS configuration is supposed to protect against.

Relevant CORS headers

By default, browsers restrict JavaScript code from accessing the response of an HTTP request to a different origin than the open window or frame. For certain types of requests, browsers send a “preflight” request to determine if the request would be allowed before it is sent. The server can set values in certain headers to loosen these restrictions. This document will focus only on the headers referenced in the penetration test and issue for the sake of brevity.

The Access-Control-Allow-Origin response header is used by the browser to determine if the JavaScript environment should be able to read the response of a cross-origin request. The value may be "*", an Origin (e.g. https://www.synapse.org), or null.

The Access-Control-Allow-Credentials response header is used by the browser to determine if a response can be exposed to the JavaScript that initiated the request in cases where the request includes credentials automatically attached by the browser (e.g. cookies). The only meaningful value is "true"; otherwise the header is omitted. Note that the browser will always reject responses to requests with credentials if Access-Control-Allow-Origin is "*" (reference). Additionally, some modern browsers currently block, or plan to block, third-party cookies by default.
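The interaction between these two headers can be summarized with a small model of the browser's check. This is a simplified sketch that ignores preflight requests and the other CORS headers; the function name `response_readable` and its parameters are illustrative, not part of any browser API.

```python
from typing import Optional


def response_readable(request_origin: str,
                      allow_origin: Optional[str],
                      allow_credentials: Optional[str],
                      with_credentials: bool) -> bool:
    """Model whether JavaScript may read a cross-origin response."""
    if allow_origin is None:
        # No Access-Control-Allow-Origin header: the response is blocked.
        return False
    if with_credentials:
        # The wildcard is never accepted for credentialed requests; the
        # header must echo the exact origin and credentials must be allowed.
        return allow_origin == request_origin and allow_credentials == "true"
    return allow_origin == "*" or allow_origin == request_origin
```

For example, a response with `Access-Control-Allow-Origin: *` and `Access-Control-Allow-Credentials: true` is readable for a non-credentialed request but blocked for a credentialed one.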

There are also additional response headers that can be used to restrict which request headers and methods are permitted, and which headers can be exposed in JavaScript.

Aside: CORS does not guarantee that a request will not be sent. Certain requests may trigger a preflight check that will cause a browser to not send a request, but this does not apply for all types of requests. For this reason, a strict CORS configuration does not eliminate the risk of CSRF attacks. A more appropriate solution for this threat model is to reject requests on the server based on the origin.
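A minimal sketch of the server-side check described in this aside follows. It assumes that only state-changing methods need protection and that requests without an Origin header (typically non-browser clients) are allowed through; both are policy assumptions for illustration, and `TRUSTED_ORIGINS` is a hypothetical list.

```python
from typing import FrozenSet, Optional

SAFE_METHODS: FrozenSet[str] = frozenset({"GET", "HEAD", "OPTIONS"})
# Hypothetical list of origins allowed to make state-changing requests.
TRUSTED_ORIGINS: FrozenSet[str] = frozenset({"https://www.synapse.org"})


def should_reject(method: str,
                  origin: Optional[str],
                  trusted: FrozenSet[str] = TRUSTED_ORIGINS) -> bool:
    """Decide on the server whether to refuse a request based on its Origin."""
    if method.upper() in SAFE_METHODS:
        # Assumption: safe methods have no side effects worth protecting.
        return False
    if origin is None:
        # Assumption: requests without an Origin header come from
        # non-browser clients and are handled by other controls.
        return False
    return origin not in trusted
```

Unlike a CORS header, this check runs before the request is processed, so an untrusted page cannot trigger the side effect even if the browser sends the request.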

Current behavior of affected web applications

The penetration test reported this vulnerability across two different web applications that have server-side components: the repository app hosted on repo-<stack>.prod.sagebase.org, and the servlet for the web portal hosted on <stack>.synapse.org.

Service | Domain | Access-Control-Allow-Origin | Access-Control-Allow-Credentials | Implications
Synapse-Repository-Services | repo-prod.prod.sagebase.org, repo-staging.prod.sagebase.org | * | true, but these services cannot pass cross-origin credentials because Access-Control-Allow-Origin is always "*" | Any origin can access response data. Cross-origin sites cannot access response data sent with credentials.
SynapseWebClient | www.synapse.org, staging.synapse.org | *, but if the origin ends with .synapse.org, returns the origin. | true iff the origin ends with .synapse.org. Otherwise, unspecified. | Any origin can access response data. Only *.synapse.org origins can access response data sent with credentials.

Portal

Code: https://github.com/Sage-Bionetworks/SynapseWebClient/blob/develop/src/main/java/org/sagebionetworks/web/server/servlet/filter/CORSFilter.java

For requests originating outside of .synapse.org, Access-Control-Allow-Origin is *. This is appropriate because some portal services provide content specifically designed to be rendered on other sites (e.g. snippets on social media).

At the time of writing, the portal has served content to 863 unique origin values in the past 30 days. Full data omitted, methods shown below.

 Portal Unique Origins

In the Synapse production AWS account, CloudWatch Logs was used to determine the number of unique origin values that made requests to the portal servers.

To gather this data, three log groups were used:

/aws/elasticbeanstalk/portal-prod-435-0/var/log/httpd/access_log

/aws/elasticbeanstalk/portal-prod-436-0/var/log/httpd/access_log

/aws/elasticbeanstalk/portal-prod-437-0/var/log/httpd/access_log

These groups encompass the current production stack, as well as the previous two stacks.

The following CloudWatch Logs Insights query was used to determine the unique origins that used the Portal service in the last 30 days (note that the time filter was applied via the CloudWatch UI):

parse @message '* - * [*] "* * *" * * "*" "*"' as host, identity, dateTimeString, httpVerb, url, protocol, statusCode, bytes, referer, useragent
| parse referer '*://*/' as protocol2, domain
| fields concat(protocol2, "://", domain) as origin
| stats count() as count by origin
| sort count desc

This returned 863 unique results. Here are the top 5:

Origin | Count
https://www.synapse.org | 12,291,758
<empty> | 2,483,980
https://staging.synapse.org | 92,497
https://adknowledgeportal.synapse.org | 10,304
https://accounts.google.com | 4,716
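Note that the query derives each origin from the Referer field of the access log rather than from an Origin header. For readers who want to reproduce the same counting offline, here is a rough Python equivalent. It assumes Apache combined-format log lines; the regular expression and the helper names `origin_of` and `count_origins` are illustrative.

```python
import re
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import urlsplit

# Assumed layout (Apache combined log format):
# host ident user [time] "verb url protocol" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]*)\] '
    r'"(?P<verb>\S+) (?P<url>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<useragent>[^"]*)"'
)


def origin_of(referer: str) -> Optional[str]:
    """Reduce a Referer URL to scheme://host, or None if it has no host."""
    parts = urlsplit(referer)
    if not parts.scheme or not parts.netloc:
        return None
    return f"{parts.scheme}://{parts.netloc}"


def count_origins(lines: Iterable[str]) -> Counter:
    """Count requests per origin, grouping missing referers as '<empty>'."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        counts[origin_of(match.group("referer")) or "<empty>"] += 1
    return counts
```

Calling `count_origins(lines).most_common(5)` on the raw log lines would give a top-five list in the same shape as the table above.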

The value for Access-Control-Allow-Credentials changes based on the origin. If the origin ends with .synapse.org, then we set the header to true and set the value of Access-Control-Allow-Origin to match the request origin, so that the browser permits credentialed requests.
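The behavior described above can be modeled in a few lines. This is a Python sketch based only on the description in this document, not a transcription of CORSFilter.java; the function name `current_portal_cors_headers` is made up for the example.

```python
from typing import Dict, Optional


def current_portal_cors_headers(origin: Optional[str]) -> Dict[str, str]:
    """Model the Portal's suffix-based CORS decision as described above."""
    if origin is not None and origin.endswith(".synapse.org"):
        # Any origin with this suffix is echoed back with credentials
        # allowed, whether or not it is a known site.
        return {
            "Access-Control-Allow-Origin": origin,
            "Access-Control-Allow-Credentials": "true",
        }
    return {"Access-Control-Allow-Origin": "*"}
```

Under this model, a made-up origin such as `https://anything.synapse.org` is treated exactly like `https://www.synapse.org`, which is the gap NCC's recommendation is meant to close.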

The only known scenario that depends on this configuration is that we persist authentication state across *.synapse.org domains. When logging in to Synapse on a .synapse.org site, a cookie is stored that contains the access token. The cookie is HttpOnly, so to get this access token into JavaScript, the browser will call the Portal service GET /Portal/sessioncookie with the cookie. The service responds with the access token in the response body, which can be stored in-memory in JavaScript. Because of our current CORS configuration, this functionality only works from .synapse.org domains. The proposed change from NCC does not compromise this behavior.

No changes would be required with respect to the dev environment because the Portal dev instance (portal-dev.dev.sagebase.org) is not configured to provide a shared authentication session.

As shown in the section below, 22 unique .synapse.org origins use the service that provides credentials in a cross-origin context. To follow NCC’s recommendation, we must be able to compile this list of valid subdomains and pass it to the portal, which would then allow credentials only for those subdomains.

 Portal requests with *.synapse.org origin

First, we’ll derive the list of all *.synapse.org origins that have used the Portal service in the last 30 days:

parse @message '* - * [*] "* * *" * * "*" "*"' as host, identity, dateTimeString, httpVerb, url, protocol, statusCode, bytes, referer, useragent
| parse referer '*://*/' as protocol2, domain
| fields concat(protocol2, "://", domain) as origin
| stats count() as count by origin
| filter origin like /\.synapse\.org/
| sort count desc

origin | count
https://www.synapse.org | 12299325
https://staging.synapse.org | 92349
https://adknowledgeportal.synapse.org | 10333
https://signin.synapse.org | 2903
https://psychencode.synapse.org | 1488
https://nf.synapse.org | 1251
https://portal-prod-435-0.synapse.org | 999
http://www.synapse.org | 543
https://portal-prod-436-0.synapse.org | 494
https://arkportal.synapse.org | 443
https://bsmn.synapse.org | 408
https://staging.accounts.sagebionetworks.synapse.org | 407
https://www.cancercomplexity.synapse.org | 397
https://www.synapse.org . | 390
https://stopadportal.synapse.org | 348
https://cancercomplexity.synapse.org | 224
https://dhealth.synapse.org | 216
https://synapse-prod.synapse.org | 157
https://user-guides.synapse.org | 130
https://tst.synapse.org | 87
https://status.synapse.org | 87
https://rest-docs.synapse.org | 76
http://user-guides.synapse.org | 62
https://portal-prod-437-0.synapse.org | 40
https://origin.synapse.org | 34
https://staging.arkportal.synapse.org | 27
http://synapse-prod.synapse.org | 18
https://python-docs.synapse.org | 14
http://python-docs.synapse.org | 12
https://staging.nf.synapse.org | 7
https://r-docs.synapse.org | 7
https://shinypro.synapse.org | 3
http://tst.synapse.org | 2
http://staging.synapse.org | 1
https://alzdrugtool.synapse.org | 1
http://staging-ran.synapse.org | 1
http://ran.synapse.org | 1
https://covidrecoverycorpsresearcher.synapse.org | 1
https://staging-signin.synapse.org | 1

Only a subset of these actually use credentials. We’ll see which origins called GET /Portal/sessioncookie to infer which origins rely on the current CORS configuration.

parse @message '* - * [*] "* * *" * * "*" "*"' as host, identity, dateTimeString, httpVerb, url, protocol, statusCode, bytes, referer, useragent
| parse referer '*://*/' as protocol2, domain
| fields concat(protocol2, "://", domain) as origin
| filter url like "/Portal/sessioncookie"
| filter origin like /\.synapse\.org/
| stats count() as count by origin
| sort count desc

Repo

Code: https://github.com/Sage-Bionetworks/Synapse-Repository-Services/blob/develop/services/authutil/src/main/java/org/sagebionetworks/authutil/SimpleCORSFilter.java

Today, the Access-Control-Allow-Origin header for the repository services is always *. This means that JavaScript code running on any origin may view the responses in cross-site requests to the repository services.

Restricting the Access-Control-Allow-Origin header to a whitelist would have substantial technical cost. Within a recent 30-day window, we have allowed requests from 46 different origins:

 Table of all Origin Requests from 2022-12-25 to 2023-01-24

ORIGIN | Number of Requests, past 30 days
<empty> | 68,165,544
https://www.synapse.org | 14,502,342
https://psychencode.synapse.org | 3,131,519
https://adknowledgeportal.synapse.org | 828,395
https://staging.synapse.org | 70,991
https://nf.synapse.org | 46,339
http://localhost:6060 | 24,957
https://arkportal.synapse.org | 20,155
http://127.0.0.1:8888 | 17,700
https://signin.synapse.org | 13,188
https://www.cancercomplexity.synapse.org | 9,509
https://dhealth.synapse.org | 7,421
https://bsmn.synapse.org | 5,694
https://cancercomplexity.synapse.org | 5,112
http://localhost:3000 | 2,566
https://staging.arkportal.synapse.org | 2,507
https://sage-bionetworks.github.io | 2,123
https://staging.adknowledgeportal.synapse.org | 1,283
null | 790
https://tst.synapse.org | 701
https://stopadportal.synapse.org | 593
https://staging.accounts.sagebionetworks.synapse.org | 436
https://d2ludihrr6kxy3.cloudfront.net | 337
http://127.0.0.1:3000 | 247
https://staging.nf.synapse.org | 216
https://d9t3oxh59s1dm.cloudfront.net | 100
https://agora.adknowledgeportal.org | 82
https://portal-dev.dev.sagebase.org | 38
https://staging.cancercomplexity.synapse.org | 27
https://staging.studies.mobiletoolbox.org | 27
https://portal-prod-432-2.synapse.org | 20
https://portal-prod-436-0.synapse.org | 20
https://synapse-prod.synapse.org | 20
http://localhost:8080 | 14
https://34.230.158.104 | 10
https://research.sagebridge.org | 10
http://127.0.0.1:3001 | 8
https://52.203.200.37 | 5
https://agora-develop.adknowledgeportal.org | 5
https://psychencode-synapse-org.translate.goog | 4
https://d2urqeqifglv0s.cloudfront.net | 3
http://agora.adknowledgeportal.org | 2
https://agora-staging.adknowledgeportal.org | 2
https://tnt-ui-dot-amp-pd-data-coordination.uc.r.appspot.com | 2
http://localhost | 1
http://www.synapse.org | 1

Data captured from running the following query in the data warehouse on 2023-01-24:

SELECT ORIGIN, COUNT(*) AS NUMBER_OF_RECORDS
FROM ACCESS_RECORD
WHERE TIMESTAMP BETWEEN unix_timestamp(curdate())*1000 - (30*24*60*60*1000) AND unix_timestamp(curdate())*1000
GROUP BY ORIGIN;
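The WHERE clause works in epoch milliseconds, with 30*24*60*60*1000 = 2,592,000,000 ms for the 30-day window. As a sanity check on the date range quoted for the table, the sketch below recomputes the bounds in Python. It assumes the database session runs in UTC, which this document does not state; `window_bounds_ms` is an illustrative helper name.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Tuple


def window_bounds_ms(end_day: date, days: int = 30) -> Tuple[int, int]:
    """Return (start, end) in epoch milliseconds for a window ending at
    midnight UTC on end_day. UTC is an assumption about the session."""
    end = datetime(end_day.year, end_day.month, end_day.day, tzinfo=timezone.utc)
    start = end - timedelta(days=days)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


start_ms, end_ms = window_bounds_ms(date(2023, 1, 24))
start_day = datetime.fromtimestamp(start_ms / 1000, tz=timezone.utc).date()
print(start_day, end_ms - start_ms)
```

Under the UTC assumption, running the query on 2023-01-24 gives a window starting on 2022-12-25, consistent with the table title.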

This is a wide range of origins that includes multiple applications, multiple environments, CDNs, and applications developed by other teams at Sage, so there would likely be significant technical complexity in maintaining this list.

Today, the value of Access-Control-Allow-Credentials is irrelevant because Access-Control-Allow-Origin is always "*" rather than a specific origin. For this reason, we should change the repo to stop setting Access-Control-Allow-Credentials.

Additionally, we should investigate removing functionality related to credential cookies, because it’s not clear if they are used by clients today. See https://github.com/Sage-Bionetworks/Synapse-Repository-Services/blob/develop/services/authutil/src/main/java/org/sagebionetworks/authutil/CookieSessionTokenFilter.java.
