Adding Custom Agents to Synapse
Would you like to add a domain-specific chatbot that can help your users discover the most relevant datasets in your portal?
Would you like to add a domain-specific data curation assistant to your data ingress pipeline?
These are just a few examples of why a Sager might want to develop and deploy a custom Synapse agent. This document will help you understand how the Synapse chat framework works, and more importantly how to extend it to deploy custom agents into production.
What is the Synapse Chat Framework?
Synapse chat is built on a framework of web APIs, workers, and UI elements that enable a user to safely and securely interact with production Synapse data within a “chat” interface. The core of this framework is a Synapse worker that will forward a user’s prompt to an AWS bedrock agent by calling the agent’s invoke_agent method. To understand how this all works we need a quick review of bedrock agents.
Bedrock Agent
Most of the LLMs available today have been trained on vast amounts of general knowledge. A significant amount of a model’s training data is from the public domain. However, the training data is unlikely to include any private data, such as private Synapse metadata/data. This means that, out-of-the-box, these LLMs cannot help users navigate and understand private Synapse metadata/data. A common mechanism to bridge this knowledge gap is to allow the LLM to access private data during a chat session in a process called Retrieval-Augmented Generation (RAG). Basically, the LLM can “augment” its responses to user requests by gathering private data during a chat session.
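As a rough illustration of the retrieve-then-augment idea, here is a toy sketch in Python. It is not how Synapse or Bedrock implement retrieval: the PRIVATE_DOCS records, the keyword-overlap scoring, and the prompt layout are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import re

# Hypothetical private records; a real system would fetch these from an API.
PRIVATE_DOCS = {
    "syn001": "RNA-seq counts for the 2023 mouse cohort",
    "syn002": "Clinical metadata for the pilot imaging study",
}


def tokenize(text: str) -> set[str]:
    """Lower-case the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, docs: dict[str, str], limit: int = 1) -> list[str]:
    """Return the IDs of the documents sharing the most tokens with the question."""
    q = tokenize(question)
    ranked = sorted(docs, key=lambda k: len(q & tokenize(docs[k])), reverse=True)
    # Drop documents that share no tokens at all with the question.
    return [k for k in ranked[:limit] if q & tokenize(docs[k])]


def augment(question: str, docs: dict[str, str]) -> str:
    """Build a prompt that places the retrieved private data before the question."""
    context = "\n".join(f"[{k}] {docs[k]}" for k in retrieve(question, docs))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

The augmented prompt, rather than the bare question, is what would be sent to the model, so the model can answer from data it was never trained on.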
With a RAG system the LLM is not acting as a knowledge source. Instead, the LLM’s role is one of “orchestration”. Basically, we will be providing the LLM with a “toolkit” that it can use to do “things”. For example, we might provide an LLM with a toolkit that can do all of the following:
Gather private data
Transform data
Make data changes
Query help docs
Delegate tasks to other, specialized, models
Ultimately, the LLM will be responsible for “orchestrating” how it uses the provided toolkit to respond to a user’s prompts. Hopefully, it is clear that “retrieval” is just one facet of what is possible!
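The toolkit idea can be sketched as a registry of named functions plus a dispatcher. This is only a toy model: the tool names and the hard-coded plan are hypothetical, and in a real agent the LLM chooses each tool call at run time while Bedrock performs the dispatch.

```python
from typing import Callable


# Hypothetical tools standing in for the toolkit described above.
def search_datasets(query: str) -> str:
    return f"search results for '{query}'"


def fetch_entity_metadata(entity_id: str) -> str:
    return f"metadata for {entity_id}"


# The toolkit maps a tool name to the function that implements it.
TOOLKIT: dict[str, Callable[[str], str]] = {
    "search_datasets": search_datasets,
    "fetch_entity_metadata": fetch_entity_metadata,
}


def run_tool(tool_name: str, argument: str) -> str:
    """Execute a single tool call, rejecting tools that are not in the toolkit."""
    if tool_name not in TOOLKIT:
        return f"unknown tool: {tool_name}"
    return TOOLKIT[tool_name](argument)


def orchestrate(plan: list[tuple[str, str]]) -> list[str]:
    """Run a sequence of (tool name, argument) steps and collect the results."""
    return [run_tool(name, arg) for name, arg in plan]
```

Because the model can only act through the registry, the toolkit also acts as a boundary: anything not registered cannot be executed.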
Amazon’s Bedrock Agent feature allows us to set up an LLM orchestration toolkit. Since agents are serverless, they are simple to define and deploy. In the next section, we will cover how the Synapse chat framework helps add common tools to your own custom agents. Later, we will cover how to deploy your custom agents into production and test them.
Synapse Chat Framework
At this point you should have a basic understanding of a bedrock agent and how it allows us to create an LLM orchestration toolkit to solve real user problems. The Synapse chat framework provides both a chat UI and infrastructure for executing an orchestration toolkit. More importantly, the framework also provides a mechanism for the LLM to securely interact with private metadata/data in Synapse on the user’s behalf.
If you have used the existing chat UI in Synapse you are likely already familiar with the basic concepts. When the user provides a prompt, the UI shows the “thoughts” of the LLM as it attempts to find a solution. These thoughts provide valuable insights into the LLM orchestration process. You might see thoughts that mention “running a search” or “fetch entity metadata”. Both are examples of the LLM orchestration process identifying and executing tools from its provided toolkit.
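For readers curious where such “thoughts” could come from, here is a hedged sketch. When invoke_agent is called with tracing enabled, Bedrock emits trace events alongside the response chunks, and an orchestration trace can carry a rationale text. That the Synapse UI derives its thoughts from exactly this field is an assumption, and the sample events in the usage note are made up.

```python
def extract_thoughts(events) -> list[str]:
    """Collect rationale texts from Bedrock agent trace events.

    Each event is expected to be a dict shaped like the items of the
    invoke_agent response stream. Events without a rationale are skipped.
    """
    thoughts = []
    for event in events:
        trace = event.get("trace", {}).get("trace", {})
        rationale = trace.get("orchestrationTrace", {}).get("rationale", {})
        text = rationale.get("text")
        if text:
            thoughts.append(text)
    return thoughts
```

For example, passing a list that mixes one trace event carrying the rationale "I should run a search" with ordinary chunk events would yield a one-element list containing that text.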
Chat Worker
In the introduction, we mentioned that the core of the Synapse chat framework is an asynchronous worker that takes a single user prompt and calls invoke_agent on the appropriate bedrock agent. It might be helpful to read about how to submit such a job in the API docs. However, before we can start a job, we must first start a new agent session. An agent session defines which agent is to be used for a chat session. If we do not specify an agent, then Synapse will use the “baseline” agent. We will cover more about the baseline agent later. A key concept of a session is the sessionId, which is the unique identifier that the bedrock agent uses to isolate a user’s conversation. The session contains the full history of a conversation and is used to provide the LLM’s context window with the relevant information. Note: When an agent fetches data from Synapse, that data will automatically be added to the session and forwarded to the LLM by bedrock.
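To make the worker’s central call more concrete, here is a minimal sketch of calling invoke_agent with boto3 and reassembling the streamed reply. It is not the Synapse worker’s actual code: the helper names are hypothetical, the agent alias ID must be supplied by the caller, and boto3 (a third-party package) is imported lazily so that the pure helpers work without it.

```python
import uuid


def build_invoke_kwargs(agent_id, agent_alias_id, prompt, session_id=None):
    """Assemble the arguments for invoke_agent.

    Reusing the same sessionId across calls keeps them in one conversation;
    a fresh UUID starts a new, isolated conversation.
    """
    return {
        "agentId": agent_id,
        "agentAliasId": agent_alias_id,
        "sessionId": session_id or str(uuid.uuid4()),
        "inputText": prompt,
    }


def collect_completion(events) -> str:
    """Join the response chunks of an invoke_agent event stream into one string."""
    parts = [event["chunk"]["bytes"] for event in events if "chunk" in event]
    # Decode once at the end so a multi-byte character split across chunks survives.
    return b"".join(parts).decode("utf-8")


def ask_agent(agent_id, agent_alias_id, prompt, session_id=None) -> str:
    """Send one prompt to a Bedrock agent (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the helpers above need no AWS setup

    client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")
    response = client.invoke_agent(
        **build_invoke_kwargs(agent_id, agent_alias_id, prompt, session_id)
    )
    return collect_completion(response["completion"])
```

Splitting the request construction and the stream handling into pure functions makes it possible to check both without touching AWS.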
A request to start a new chat session includes two important parameters:
agentAccessLevel - This parameter is used to tell the framework what level of access to Synapse the agent should have during the session.
agentRegistrationId - Defines which agent should be used for the session. This is how we will tell Synapse to use our custom agents for chat sessions in production.
(See: CreateAgentSessionRequest for more information).
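A request body carrying these two parameters might be built as in the sketch below. Only the two parameter names come from this document. The endpoint path (/agent/session), the HTTP method, and the example access level value PUBLICLY_ACCESSIBLE are assumptions that should be checked against the CreateAgentSessionRequest documentation. The request is built but not sent.

```python
import json
import urllib.request

# Assumption: the session endpoint lives next to the registration endpoint.
SESSION_URL = "https://repo-prod.prod.sagebase.org/repo/v1/agent/session"


def build_session_request(access_token, agent_access_level, agent_registration_id=None):
    """Build (but do not send) a request that starts a new agent session."""
    body = {"agentAccessLevel": agent_access_level}
    # Omitting the registration ID leaves the choice of agent to Synapse.
    if agent_registration_id is not None:
        body["agentRegistrationId"] = agent_registration_id
    return urllib.request.Request(
        SESSION_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Calling build_session_request(token, "PUBLICLY_ACCESSIBLE", "9") and passing the result to urllib.request.urlopen would submit it, assuming the endpoint and value are correct.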
If we start a new chat session without providing a value for the agent registration ID, then the default “baseline” agent will be used. The baseline agent is a general-use agent with basic prompting for all of the currently supported tools for accessing Synapse data. For the details on the current baseline agent see its CloudFormation template: bedrock_agent_template.json
In the next section we will show how a Sager can create and use their own agents with a “hello world” example.
Creating a Custom Agent
We will be using CloudFormation to handle all of the details of creating our hello-world bedrock agent.
Here is the template:
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "A simple 'hello world' bedrock agent.",
  "Parameters": {
    "agentName": {
      "Description": "Provide a unique name for this bedrock agent.",
      "Type": "String",
      "AllowedPattern": "^([0-9a-zA-Z][_-]?){1,100}$"
    }
  },
  "Resources": {
    "bedrockAgent": {
      "Type": "AWS::Bedrock::Agent",
      "Properties": {
        "AgentName": {
          "Ref": "agentName"
        },
        "AgentResourceRoleArn": "arn:aws:iam::050451359079:role/bedrock-agent-role-bedrockAgentRole-uVdCv8WImmcJ",
        "AutoPrepare": true,
        "Description": "A simple 'hello world' bedrock agent that will reply with 'world' when given a 'hello'",
        "FoundationModel": "anthropic.claude-3-sonnet-20240229-v1:0",
        "IdleSessionTTLInSeconds": 3600,
        "Instruction": "You are a helpful test agent that when greeted with: 'hello' will always response with: 'world'",
        "SkipResourceInUseCheckOnDelete": true
      }
    }
  }
}
With under 30 lines of JSON we have all of the information needed to define our agent. You can read more about all of the fields in the Amazon documentation for AWS::Bedrock::Agent, but for our example agent we will focus on the Instruction field. The instructions represent our basic prompt engineering that defines our agent:
You are a helpful test agent that when greeted with: 'hello' will always response with: 'world'
With these instructions, our new agent should do exactly that.
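One detail of the template worth checking before deploying is the AllowedPattern on the agentName parameter, since CloudFormation will reject names that do not match it. The sketch below applies the same pattern locally. The helper name is hypothetical, and Python’s regex engine is assumed to treat this particular pattern the same way CloudFormation does.

```python
import re

# Copied verbatim from the template's AllowedPattern for agentName.
AGENT_NAME_PATTERN = re.compile(r"^([0-9a-zA-Z][_-]?){1,100}$")


def is_valid_agent_name(name: str) -> bool:
    """Return True when the name satisfies the template's AllowedPattern.

    The pattern requires each '_' or '-' to follow a letter or digit, so a
    name cannot start with a separator or contain two separators in a row.
    """
    return AGENT_NAME_PATTERN.fullmatch(name) is not None
```

For instance, my-hello-world and john_agent pass, while -bad and a--b are rejected.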
1. Create a new file containing the template above, with a name like: my-hello-world.json
2. Log in with JumpCloud: https://console.jumpcloud.com/login
3. Select: aws-sso-organization
4. Select: org-sagebase-synapsellm-prod. Note: if you do not see this account in your JumpCloud list, please contact IT and ask to be added.
5. Select: LlmDeveloper
6. From the top right corner, ensure you are in us-east-1 (N. Virginia)
7. Navigate to CloudFormation by entering it into the search bar at the top left:
8. Select “Create Stack”, then “With new resources (standard)”
9. The first page of the wizard allows us to upload our ‘my-hello-world.json’ template file. Click the “Choose File” button, upload the template you created in step 1, and then click “next”:
10. On the next page we must provide two unique names: one for the CloudFormation stack, and the other for our agent. I used my name in both fields:
11. On the next page of the wizard we can accept all of the default options and select “next”:
12. On the final page of the wizard we will also accept all of the default values and select “submit”:
13. CloudFormation should start the process of creating the new stack that defines your hello-world agent. After a minute or two you should see something like this:
14. Let’s now go find our new agent in Bedrock by typing “bedrock” into the search box at the top left, then selecting “Amazon Bedrock”:
15. From the navigation panel on the left find the section named “Agents”:
16. You should be able to find your newly created agent.
17. Finally, if you click the link on your agent you should see your new agent’s details. Note: The agent’s ID is circled, as we are going to need it in a future step:
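The console wizard above could, in principle, be replaced by a scripted deployment. The sketch below uses boto3’s CloudFormation create_stack call with the same template and agentName parameter. It assumes that boto3 is installed and that your local AWS credentials map to the same account and role used in the console steps, which this guide does not cover. The function names are hypothetical.

```python
from pathlib import Path


def build_create_stack_kwargs(stack_name: str, agent_name: str, template_body: str) -> dict:
    """Assemble the create_stack arguments that mirror the wizard's inputs."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        # The template declares a single parameter named agentName.
        "Parameters": [{"ParameterKey": "agentName", "ParameterValue": agent_name}],
    }


def create_agent_stack(stack_name: str, agent_name: str,
                       template_path: str = "my-hello-world.json") -> str:
    """Create the stack and return its ID (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the helper above has no dependency

    template_body = Path(template_path).read_text(encoding="utf-8")
    cloudformation = boto3.client("cloudformation", region_name="us-east-1")
    response = cloudformation.create_stack(
        **build_create_stack_kwargs(stack_name, agent_name, template_body)
    )
    return response["StackId"]
```

The agent ID needed for registration would still have to be looked up afterwards, for example in the Bedrock console as described in the last step.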
Congratulations! If you made it this far you have successfully created your own hello-world bedrock agent. In the next section we will cover how to test our new agent in Synapse.
Registering Our New Agent
Before we can use our newly created hello-world agent we must register it with Synapse. For this section you will need to have a Synapse personal access token with both “view” and “modify” for a user that belongs to the Sage Bionetworks team. If your user does not belong to this team you will first need to request to join the team. If you do not already have a personal access token for your account you will need to create one with both “view” and “modify” by following the instructions from the Synapse help docs: Personal Access Tokens.
You will also need to have curl installed on your local machine and available from the command line. You can verify that you have curl installed by running curl --version from your favorite command prompt.
1. Create a token file that contains your personal access token with the following template:
Authorization: Bearer <paste your personal access token here>
2. Open a command prompt with access to curl and set up the following:
curl -X PUT 'https://repo-prod.prod.sagebase.org/repo/v1/agent/registration' \
  -H @C:/Users/John/.synapse/non-admin-token.txt \
  -H "Content-Type: application/json; charset=utf8" \
  --data-raw '{"awsAgentId":"I6DE8ZTR25"}'
3. You will need to replace:
C:/Users/John/.synapse/non-admin-token.txt
with the path to the token file you created in step 1. You will also need to replace the awsAgentId with your own agent ID from step 17 of the previous section.
4. Once the curl command is set up correctly, press enter to execute it. If successful you should receive a response similar to:
5. Make sure that the resulting awsAgentId matches your agent’s ID, and record the resulting agentRegistrationId.
6. Now that we have an agent registration ID for our new agent we can test our agent in production using the following URL (replace the '9' with the agentRegistrationId you received in step 5):
Congratulations! At this point you have successfully created your new hello-world agent and successfully registered it with Synapse. Finally, by providing your agent registration ID in the UI, you should have successfully started a new conversation with your agent in production Synapse!
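For those who prefer a script to curl, the registration call from this section can be sketched with Python’s standard library. The URL, the PUT method, the token file format, and the awsAgentId field are taken from the curl command above. The function names are hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request

REGISTRATION_URL = "https://repo-prod.prod.sagebase.org/repo/v1/agent/registration"


def parse_token_file(text: str) -> tuple[str, str]:
    """Split a token file of the form 'Authorization: Bearer <token>' into (name, value)."""
    name, _, value = text.strip().partition(":")
    return name.strip(), value.strip()


def build_registration_request(token_file_text: str, aws_agent_id: str):
    """Build (but do not send) the PUT request that registers a Bedrock agent."""
    header_name, header_value = parse_token_file(token_file_text)
    return urllib.request.Request(
        REGISTRATION_URL,
        data=json.dumps({"awsAgentId": aws_agent_id}).encode("utf-8"),
        headers={
            header_name: header_value,
            "Content-Type": "application/json; charset=utf8",
        },
        method="PUT",
    )
```

Passing the returned request to urllib.request.urlopen would perform the same call as the curl command, and the JSON response should contain the agentRegistrationId to record.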
While our hello-world agent is extremely simple, hopefully you are now equipped to create and test more sophisticated agents. Once your agents are ready to be used by end users, we should be able to configure the Synapse UI and portals to use your agent IDs in the correct context.