Authentication & Authorization


Goal:

Provide authentication / authorization for users of the Sage Platform. 

Authentication: Verify a user's identity.
Authorization: Allow a user, or an application invoked by a user, to access data in the Platform.

The Platform comprises:

- Addama registry (running on Google App Engine, or "GAE")

- Addama feed service (on GAE)

- Addama file repository service (running on Amazon Elastic Compute Cloud, or "EC2")

- Addama Java Content Repository (JCR) service (on EC2)

- UI html files (on EC2)

- a Google Group

- shared Google Docs

- file repository, hosted at Sage, accessible via Secure CoPy (SCP)

Requirements and Design Constraints:

- Single sign-on.  Once a user has signed on to the platform, they don't have to sign in to any of the components separately.

- One-stop user administration:  Adding or removing a user in one place will apply to all components.

- "Security at all layers":   No component can be part of the platform unless it adheres to (one of) our authentication mechanism(s).  (Note, we could implement several mechanisms "under the hood", if the systems we are integrating require it.)

- Platform will have "arm's length" integration with Google Apps and Google Groups.  (I.e. the rest of the system (Addama, Sage file repository) must work if Google tools are omitted.)

- We want to have full control over the UI (hence a custom approach using GWT instead of Google Sites) but link to the relevant Google Docs and Google Groups, and use the Google Docs UI and Google Groups UI when folks are interacting with those resources.

Analysis

There are just four components that need to perform authentication. (The others delegate authentication to the registry.)  They are listed here, along with authentication options:

Addama registry GAE application (Google account, Google Apps account, OpenID federated authentication)

Google Apps (Google Apps account, SAML delegated authentication)

Google Group (Google account, Google Apps account)

Sage SSH server (standard unix login)

If the SSH server were eliminated (by migrating the hosted files to an Addama repository service) then a common denominator *might* be Google Apps account authentication, which in turn might be delegated to an external identity provider.
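
The common-denominator argument can be checked mechanically. The sketch below (Python) encodes the four components and their authentication options as listed above and intersects them. The dictionary keys and the function name are illustrative only.

```python
# Authentication options per component, as listed in the analysis above.
OPTIONS = {
    "Addama registry (GAE)": {"Google account", "Google Apps account", "OpenID"},
    "Google Apps": {"Google Apps account", "SAML"},
    "Google Group": {"Google account", "Google Apps account"},
    "Sage SSH server": {"unix login"},
}


def common_options(components):
    """Return the authentication options shared by every named component."""
    option_sets = [OPTIONS[name] for name in components]
    return set.intersection(*option_sets)


all_components = list(OPTIONS)
without_ssh = [name for name in OPTIONS if name != "Sage SSH server"]

print(common_options(all_components))  # empty: no option spans all four
print(common_options(without_ssh))     # only "Google Apps account" remains
```

With the SSH server included there is no shared option at all, which is why the design below proposes migrating the local file repository.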

Design

This is a possible design approach, contingent upon the answers to certain open questions (listed below):

- Restrict the Addama registry application to be hosted on a proprietary domain, configured to authenticate via "Google Apps for your domain".

- Configure Google Apps on our domain to delegate authentication via SAML.

- Configure Google Groups on our domain to delegate authentication via SAML.

- Migrate local file repository to Addama service.

- Employ Atlassian Crowd as the administration console for user authentication.

Open Questions

- Are Atlassian's Crowd pricing, license models, and hosting options acceptable for our purposes?  Do they prohibit integrating with NextBio?

(Note: Atlassian doesn't host Crowd, rather we download and host it ourselves. It's an Apache Tomcat application, with a variety of choices for databases.)

- What other SAML or OpenID identity provider (IdP) tools (providing UIs and/or aggregating other IdPs) are there?

- If we delegate user authentication on our domain using SAML (e.g., as described here http://code.google.com/googleapps/domain/sso/saml_reference_implementation.html), then if we also have a Google App Engine (GAE) application configured to authenticate via "Google Apps for your domain", will the authentication of the GAE app also be delegated via SAML?

- When Google Apps authentication is so delegated, will authentication for Google Groups on our domain also be so delegated?  (Can delegated users be added to a Google Apps group?)

- When using SAML-based authentication, are new users only to be created in the 3rd party Identity Provider, or do they somehow have to be created in our Google Apps domain as well?

- Can Google Apps and Google Groups use OpenID (instead of SAML) for authentication?

- When using SAML-based authentication, are new users only created in the 3rd party Identity Provider, or do they somehow have to be created in Google Apps as well?

The answer should be "the former", but see
http://confluence.atlassian.com/display/CROWD/Configuring+the+Google+Apps+Connector
which says: "If a user exists in Crowd but not in Google Apps, then the user will not be able to log in to Google Apps." That suggests users have to be created in both places.
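
If the connector really does not create accounts automatically, then the users who can log in to Google Apps are the overlap between the two user lists. A minimal Python sketch of that check follows. The helper name and the user names are made up for illustration.

```python
def login_gaps(crowd_users, google_apps_users):
    """Split Crowd users into those who can and cannot log in to Google Apps.

    Assumes, per the Crowd connector documentation, that a user must exist
    in both places before a login via Crowd succeeds.
    """
    crowd = set(crowd_users)
    apps = set(google_apps_users)
    return {
        "can_log_in": sorted(crowd & apps),
        "missing_in_google_apps": sorted(crowd - apps),
    }


# Hypothetical user names, for illustration only.
report = login_gaps(["alice", "bob", "carol"], ["alice", "carol", "dave"])
print(report["missing_in_google_apps"])
```

The "missing" list is what a provisioning step would have to create in the Google Apps domain.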

- Do we want to use Google Apps to view content we host elsewhere, or will Google Apps be the only place that docs are stored in this 'sprint'?

- Can "Google Group" membership be managed by an external authentication mechanism? (If not, then the Google Apps Provisioning API can create accounts for those users in our domain.  A back-up alternative might be to use Gmail plus a group alias, rather than Google Groups, for threaded discussions.)

- If we are doing "arm's length" integration with Google Apps, then what other providers should we plan for?

- Do we need 'audit logs', e.g. to show when users were added/removed and by whom?

Experiment to address key questions

- Set-up Crowd trial edition (where would it run?)
- Change Google Apps demo domain to authenticate against Crowd
- Change/deploy GAE app, authenticating via Google Apps
- Add user to Crowd
- Try to access Google Apps via this user (e.g. make a document)
- Try to log in to GAE app via this user
 (If not, can the GAE OpenID option work with Crowd, or can we bypass UserService and use some sort of OpenID connector to reach Crowd?)
- Try to add user to a group in Google Apps
 (If not, can we use Gmail, or use the Provisioning API to create an account?)

Experiment execution

 Set Up Crowd

- Added 140.107.149.214 deflaux to C:/windows/system32/drivers/etc/hosts on my laptop.  I can now PuTTY/SSH into 'deflaux', which is a Linux box in Nicole's office.

- Followed http://confluence.atlassian.com/display/CROWD/Installing+Crowd+and+CrowdID

- downloaded zip file

- downloaded and installed WinSCP; connected to deflaux:22 using SCP protocol.

- unzipped zip file and copied contents to /usr/local/tomcat on deflaux

- per the instructions, created the directory /var/crowd-home and edited .../crowd-webapp/WEB-INF/classes/crowd-init.properties accordingly.
- ran sudo ./start_crowd.sh

but got

"The BASEDIR environment variable is not defined correctly. This environment variable is needed to run this program"

- Googled around for a solution.  Found

http://codingexplorer.wordpress.com/2009/01/12/the-basedir-environment-variable-is-not-defined-correctly-this-environment-variable-is-needed-to-run-this-program/

- Did a bunch of trial-and-error and ended up trying

chmod -R 777 *

which seemed to do the trick:  Instead of getting an error message I got:

Using CATALINA_BASE:   /usr/local/tomcat/apache-tomcat
Using CATALINA_HOME:   /usr/local/tomcat/apache-tomcat
Using CATALINA_TMPDIR: /usr/local/tomcat/apache-tomcat/temp
Using JRE_HOME:        /usr
Using CLASSPATH:       /usr/local/tomcat/apache-tomcat/bin/bootstrap.jar
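
A likely explanation (an assumption, not confirmed on deflaux) is that the shell scripts lost their execute bit when the zip was unpacked on Windows and copied over, since Tomcat's startup scripts report the BASEDIR error when they cannot execute their helper scripts. If so, a narrower fix than chmod -R 777 is to find the scripts that lack the execute bit and change only those. A Python sketch with illustrative function names:

```python
import os
import stat


def non_executable_scripts(root):
    """Return paths of *.sh files under root that the owner cannot execute."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".sh"):
                continue
            path = os.path.join(dirpath, name)
            mode = os.stat(path).st_mode
            if not mode & stat.S_IXUSR:
                missing.append(path)
    return sorted(missing)


def make_executable(paths):
    """Add execute permission for user, group and others; leave other bits alone."""
    for path in paths:
        mode = os.stat(path).st_mode
        os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

For this install the call would be make_executable(non_executable_scripts("/usr/local/tomcat")), which leaves read and write permissions as they were.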

- Instructions say to go to http://localhost:8095/crowd.  I tried going to http://deflaux:8095/crowd and http://140.107.149.214:8095/crowd, but neither worked.  Used 'sudo ps' and 'sudo lsof -i :8095' to show that the server is indeed running.

- Nicole poked a hole in the firewall on the box.  Now I can go to http://deflaux:8095/ and see the web page.  Woo hoo.
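
A quick way to tell "server not running" from "port blocked" is to attempt a TCP connection from the client machine. A small Python sketch follows. The function name is illustrative, and the host and port in the note below are the ones from this experiment.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

If port_is_reachable("deflaux", 8095) returns False from the laptop while lsof on the server shows a listener, the problem is between the two machines (firewall or binding) rather than in Crowd itself.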

- Per the instructions at

http://confluence.atlassian.com/display/CROWD/Setting+Crowd+to+Run+Automatically+and+Use+an+Unprivileged+System+User+on+UNIX

I created the file /usr/local/tomcat/crowd.init.d, then from /etc/init.d

sudo ln /usr/local/tomcat/crowd.init.d crowd

Ran the web-based set up wizard:  Got a 30 day license key and chose to use the 'embedded' database.

Change Google Apps demo domain to authenticate against Crowd

Following:

http://confluence.atlassian.com/display/CROWD/Configuring+the+Google+Apps+Connector

Note, this says "you will need the Premier, Education, or Partners edition of Google Apps." so I may not be able to use 42stories.com.  I'll see how far I get before I'm stuck.

- Since we're using JDK 1.6, I followed the instructions which said to put the following two jars in <Crowd-Install>/crowd-webapp/WEB-INF/lib (<Crowd-Install>=/usr/local/tomcat)

1. xml-security-1.4.2.jar
2. commons-logging-1.1.1.jar
 

Notes:

Q: What's the cumulative file size on the Sage SSH server?

A: About 2GB, considering the files in the directory /data/incoming on sage.fhcrc.org

Google Apps provides two APIs to help with authentication:

1. SAML Single Sign-On (SSO) Service: would allow *us* to create and maintain users and groups outside of Google.

http://code.google.com/googleapps/domain/sso/saml_reference_implementation.html

2. Google Apps Provisioning API: would allow us to programmatically create Google users and groups in our private domain.  This would streamline adding users to Google Apps.  If we used it as a total solution, then the non-Google applications (e.g. Addama) would have to go to Google for authentication, which violates the "arm's length" integration requirement.

3. OpenID sounds like an alternative to SAML:

http://www.google.com/support/forum/p/apps-apis/thread?tid=33a3707bd2ea7904&hl=en

In the case of OpenID, the user may have a Google Account, a Google Apps Account, or an account from any other domain that provides OpenID federated login.

Integration of GAE with OpenID:

http://code.google.com/appengine/docs/java/users/overview.html

4. At times like this, faced with a moral dilemma, I ask myself, "What Would Atlassian Do?" (WWAD)

4.1 Seraph is a very simple, pluggable J2EE web application security framework developed by Atlassian and used in its products.

http://confluence.atlassian.com/display/DEV/Single+Sign-on+Integration+with+JIRA+and+Confluence

4.2 Crowd is Atlassian's single sign-on (SSO) application for as many users, web applications
and directory servers as you need, all through a single web interface.
http://www.atlassian.com/software/crowd/

Crowd centralises identity management, allowing you to take users from different directories
and manage them in one place. Multiple user directories can be centrally managed via Crowd's
administration console.

Crowd's OpenID authentication server, CrowdID, talks with websites and applications using
OpenID. It expands Crowd's SSO capabilities to applications outside your organisation's firewall.

http://confluence.atlassian.com/display/CROWD/Configuring+the+Google+Apps+Connector
To enable single sign-on in Google Apps, you will need the Premier, Education, or Partners edition of Google Apps.
The Crowd Google Apps connector does not support the automatic adding of users. If a user exists
in Crowd but not in Google Apps, then the user will not be able to log in to Google Apps.

To add an application (e.g. a GAE app like Addama registry):

http://confluence.atlassian.com/display/CROWDDEV/Application+Integration+Overview

Licensing and hosting Crowd:

- Crowd is not hosted by Atlassian.  We have to run it ourselves.  It runs on Windows, Linux or Mac and uses an Apache Tomcat app server:

http://confluence.atlassian.com/display/CROWD/Installing+Crowd+and+CrowdID

- Pricing:  This is a little confusing but it seems to say that it's $10 for up to 10 users then $600/$1200 for up to 100 users (academic/commercial)

http://www.atlassian.com/software/crowd/pricing.jsp

 Open source alternatives to Crowd:

http://code.google.com/googleapps/domain/open_source_projects.html#sso

- Addama authentication is via Servlet filters using GAE User Service OR a Google API-key.

- Addama services

- Sage SSH/SCP server authenticates using standard unix log-in.

- Addama handles authentication via Servlet Filters; the servlet config xml file shows what's in place.

- Addama white list: "user x can get these services, or anything under the branch."
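
The white-list rule ("these services, or anything under the branch") amounts to a path-prefix check. The sketch below is Python rather than Addama's Java, and it is not the actual WhiteListFilter code. The URIs, user name and function names are invented for illustration.

```python
def is_under_branch(requested, allowed):
    """True if requested equals allowed or lies below it in the URI tree."""
    requested = requested.rstrip("/")
    allowed = allowed.rstrip("/")
    return requested == allowed or requested.startswith(allowed + "/")


def is_authorized(user, requested, white_list):
    """Check a user's white-list entries against the requested URI."""
    return any(is_under_branch(requested, entry)
               for entry in white_list.get(user, ()))


# Hypothetical white list: user -> permitted service branches.
WHITE_LIST = {"user_x": ["/addama/repositories/demo"]}

print(is_authorized("user_x", "/addama/repositories/demo/files/a.txt", WHITE_LIST))
print(is_authorized("user_x", "/addama/repositories/demo2", WHITE_LIST))
```

Matching on the entry plus a trailing "/" rather than on a bare string prefix keeps a sibling such as .../demo2 from slipping through an entry for .../demo.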

Nicole's "an area for testing" is a "google apps for your domain" domain

 http://www.google.com/a/sagebionetworks.com is a "test domain for Google Apps"

What's the difference between a "google account" and a "google apps account"?
 A: the latter is newer and ultimately should subsume the former.

Does Google Apps support OpenID?

A: Only as an "Identity provider" (of the Google Apps ID) not as a service provider seeking authentication.

http://code.google.com/googleapps/domain/sso/openid_reference_implementation.html

3 ways to authenticate GAE
- google accounts
- google-apps account (on proprietary domain associated with Google)
- OpenID

Ours is a Google Apps Premier (= "business"?) account.

Notes on Addama Registry Filters:

org.systemsbiology.addama.coresvcs.gae.filters.StaticContentFilter
  I don't think this has anything to do with authentication, rather it's a cache for static content.  

  Note:  You can't even get this far without being authenticated.
  Note: The white list (below) *authorizes*, and doesn't apply to static content.

 org.systemsbiology.addama.coresvcs.gae.filters.UserServiceFilter
  If logged-in Google Acct OR valid API Key, then allow, else deny.

 org.systemsbiology.addama.coresvcs.gae.filters.WhiteListFilter
  If the user is an Admin or is in a 'white list' for the requested resource, then allow, else deny.

 org.systemsbiology.addama.coresvcs.gae.filters.DirectLinkFilter
  Seems to handle a specific kind of request called a 'direct link' request.
  (This MIGHT be a method for retrieving large files.)

 org.systemsbiology.addama.coresvcs.gae.filters.AdminOnlyFilter
  Filter out any requests NOT from an admin.
  Applied only for addama/memcache/*

 org.systemsbiology.addama.coresvcs.gae.filters.ProxiesFilter
  Seems to forward certain requests (in particular, non-registry requests) to GAE's "URLFetchService".
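
Read together, the authentication and authorization filters suggest a decision sequence along these lines. The sketch is Python and models the logic only. The Request fields, the leading "/" on the memcache path and the order of the checks are assumptions drawn from the notes above, not from the Addama source. The static-content, direct-link and proxy filters are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical stand-in for the facts the filters look at."""
    uri: str
    logged_in: bool = False
    valid_api_key: bool = False
    is_admin: bool = False
    white_listed_uris: set = field(default_factory=set)


def user_service_filter(req):
    # Authentication: a logged-in Google account OR a valid API key.
    return req.logged_in or req.valid_api_key


def admin_only_filter(req):
    # Applies only under addama/memcache/; other URIs pass through.
    if req.uri.startswith("/addama/memcache/"):
        return req.is_admin
    return True


def white_list_filter(req):
    # Authorization: admins pass; others need a white-list entry.
    return req.is_admin or req.uri in req.white_listed_uris


def decide(req):
    """Run the filters in sequence; the first refusal denies the request."""
    for check in (user_service_filter, admin_only_filter, white_list_filter):
        if not check(req):
            return "deny"
    return "allow"
```

For example, decide(Request("/addama/x", logged_in=True)) gives "deny" because the user is authenticated but not white-listed, which is the authentication/authorization split described in the goal.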

Q: What does <security-constraint> in the GAE web.xml file mean?
A: From http://code.google.com/appengine/docs/java/users/overview.html:
If you have pages that the user should not be able to access unless signed in, you can establish a security constraint for those pages in the deployment descriptor (the web.xml or app.yaml file). If a user accesses a URL with a security constraint and the user is not signed in, App Engine redirects the user to the sign-in page automatically (for Google Accounts or Google Apps authentication) or to the page at /_ah/login_required (for OpenID authentication), then directs the user back to the URL after signing in or registering successfully.

A security constraint can also require that the user be a registered administrator for the application. This makes it easy to build administrator-only sections of the site, without having to implement a separate authorization mechanism.
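
The behaviour described in that passage can be summarised as a small decision function. This is a Python sketch of the documented behaviour, not App Engine code. The return labels are invented, and the outcome for a signed-in non-administrator on an admin-only URL (a refusal) is an assumption, since the quoted text does not spell it out.

```python
def constrained_access(signed_in, is_admin, auth_mode, admin_required=False):
    """Describe what happens when a user requests a URL with a security constraint.

    auth_mode is "google" (Google Accounts or Google Apps) or "openid".
    """
    if not signed_in:
        if auth_mode == "openid":
            # OpenID: the user is sent to the page at /_ah/login_required.
            return "redirect:/_ah/login_required"
        # Google Accounts / Google Apps: redirected to the sign-in page.
        return "redirect:sign-in page"
    if admin_required and not is_admin:
        # Assumption: the quoted text does not say what the user sees here.
        return "refuse"
    return "serve"
```

In both redirect cases the quoted documentation says the user is sent back to the original URL after signing in.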


 
