...
Executive Summary
New stuff that might be worth investigating
Worth investigating on a limited basis
https://github.com/features/copilot - Continue passive investigation as part of my 60-day free trial while working on normal sprint tasks. Seems promising for enumerating test cases and useful as a sophisticated auto-complete to reduce copy-pasting.
https://openai.com/product/gpt-4 (ChatGPT) - Good at describing copy-pasted code. Might be useful for unit testing simple functions, but inadequate for anything complex. Inability to follow links to code limits its usefulness.
Investigation blocked
https://aws.amazon.com/codeguru/ - CodeGuru attempts to connect to the wrong fork of our repo. Blocked waiting for AWS Support to respond.
Not worth further investigation
https://www.diffblue.com/products/ - Might be valuable for unit testing simple functions, but inadequate for anything complex, and ChatGPT does a better job. Almost certainly not worth the paid tier. However, DiffBlue has the advantage in that it runs as an IDE plugin. Might be worth looking into again in 2025 as technology improves.
https://platform.openai.com/docs/guides/code - Inadequate. Maybe try again in 2024 or later when AI research has improved. Update: GPT-4 was released recently and might be worth looking into.
https://codeball.ai/ - Ran into issues setting up Codeball, and the devs did not respond to requests for support. Appears to be a dead project.
https://www.codiga.io/ - Requires org-level GitHub permissions to install, even if I’m only installing on repositories that I have admin access over. Codiga is being shut down and its future is undetermined.
Code Generation
OpenAI
OpenAI is best known for their product ChatGPT. However, they have models that can be used for code generation. See https://platform.openai.com/docs/guides/code . Initial investigations into OpenAI were underwhelming. The better of the two models (code-davinci-002) was frequently under load and unavailable, so I did my investigations with the lesser of the two models (code-cushman-001, described as “almost as capable as code-davinci-002, but slightly faster”).
...
Conclusion: OpenAI isn’t ready for code generation, at least not as is.
...
GPT-4 / ChatGPT
GPT-4 is currently only available as ChatGPT through https://chat.openai.com/chat
First, I copy-pasted the method at https://github.com/DwayneJengSage/BridgeServer2/blob/develop/src/main/java/org/sagebionetworks/bridge/services/UploadService.java#L522 into ChatGPT. ChatGPT responded with
This method is a part of some application or service that is responsible for managing uploads of health data records, which are related to a study. The method fetches the view that includes both the upload and the record (if they exist) for a given upload ID, and optionally gets the timeline metadata and the adherence records, if they exist.
The method takes as input parameters the appId and uploadId, which are required, and the studyId, which is only required if we are fetching adherence. It also takes two boolean flags, fetchTimeline and fetchAdherence, which are used to determine whether to include the timeline metadata and the adherence records in the view, respectively.
The method first checks that the appId and uploadId are not null, and that the studyId is not blank if fetchAdherence is true. It then gets the upload and the record for the given uploadId, and checks that they are not null and that they belong to the same appId. If neither the upload nor the record exist, it throws an EntityNotFoundException.
The method then gets the healthCode from the upload or the record, and uses it to get the userId. It checks that the caller has permission to read the uploads for the given userId and studyId.
If fetchTimeline or fetchAdherence is true, the method gets the instanceGuid from the upload or the record metadata, and uses it to fetch the timeline metadata and/or the adherence records, if they exist. If the timeline metadata is associated with a different appId, the method logs an error and moves on. If the adherence records are associated with a different appId, it throws an EntityNotFoundException.
Finally, the method returns the UploadViewEx3 object that contains the upload, the record, the timeline metadata (if fetched), the adherence records (if fetched), the healthCode, and the userId.
This is an accurate summary of the code.
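To make the summary easier to turn into test cases, the up-front checks it describes can be sketched in isolation. This is only an illustration built from ChatGPT's description, not the actual UploadService code: the class name, the NotFoundException type, the use of IllegalArgumentException for the blank studyId case, and the Optional-based parameters are all hypothetical stand-ins.

```java
import java.util.Optional;

// Hypothetical sketch of the precondition checks described in the summary above.
// None of these names come from BridgeServer2.
final class UploadViewChecksSketch {
    /** Stand-in for the EntityNotFoundException mentioned in the summary. */
    static final class NotFoundException extends RuntimeException {
        NotFoundException(String message) {
            super(message);
        }
    }

    /**
     * uploadAppId and recordAppId are empty when the entity does not exist,
     * otherwise they hold the app ID that the entity belongs to.
     */
    static void check(String appId, String studyId, String uploadId, boolean fetchAdherence,
            Optional<String> uploadAppId, Optional<String> recordAppId) {
        // appId and uploadId are always required.
        if (appId == null) {
            throw new NullPointerException("appId is required");
        }
        if (uploadId == null) {
            throw new NullPointerException("uploadId is required");
        }
        // studyId is only required when fetching adherence.
        if (fetchAdherence && (studyId == null || studyId.trim().isEmpty())) {
            throw new IllegalArgumentException("studyId is required when fetching adherence");
        }
        // At least one of upload or record must exist.
        if (!uploadAppId.isPresent() && !recordAppId.isPresent()) {
            throw new NotFoundException("neither upload nor record exists");
        }
        // Whatever exists must belong to the caller's app.
        if (uploadAppId.isPresent() && !appId.equals(uploadAppId.get())) {
            throw new NotFoundException("upload belongs to a different app");
        }
        if (recordAppId.isPresent() && !appId.equals(recordAppId.get())) {
            throw new NotFoundException("record belongs to a different app");
        }
    }
}
```

Each branch in this sketch corresponds to at least one test case that a complete test suite for the real method would need, in addition to the permission, timeline, and adherence paths that the sketch leaves out.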
I then prompted ChatGPT: “Write unit tests for that method.” ChatGPT responded with
Code Block | ||
---|---|---|
| ||
public class UploadViewEx3Test {
    private UploadViewEx3 uploadView;
    private String appId = "test-app-id";
    private String studyId = "test-study-id";
    private String uploadId = "test-upload-id";
    private boolean fetchTimeline = true;
    private boolean fetchAdherence = true;
    private UploadDao uploadDao;
    private HealthDataEx3Service healthDataEx3Service;
    private AccountService accountService;
    private Schedule2Service schedule2Service;
    private AdherenceService adherenceService;

    @Before
    public void setUp() {
        uploadDao = mock(UploadDao.class);
        healthDataEx3Service = mock(HealthDataEx3Service.class);
        accountService = mock(AccountService.class);
        schedule2Service = mock(Schedule2Service.class);
        adherenceService = mock(AdherenceService.class);
        uploadView = new UploadViewEx3();
        uploadView.setId(uploadId);
    }

    @Test(expected = NullPointerException.class)
    public void testGetUploadViewForExporter3WithNullAppId() {
        UploadViewEx3 uploadViewEx3 = new UploadViewEx3();
        uploadViewEx3.getUploadViewForExporter3(null, studyId, uploadId, fetchTimeline, fetchAdherence);
    }

    @Test(expected = NullPointerException.class)
    public void testGetUploadViewForExporter3WithNullUploadId() {
        UploadViewEx3 uploadViewEx3 = new UploadViewEx3();
        uploadViewEx3.getUploadViewForExporter3(appId, studyId, null, fetchTimeline, fetchAdherence);
    }

    @Test(expected = BadRequestException.class)
    public void testGetUploadViewForExporter3WithFetchAdherenceTrueAndBlankStudyId() {
        UploadViewEx3 uploadViewEx3 = new UploadViewEx3();
        uploadViewEx3.getUploadViewForExporter3(appId, "", uploadId, fetchTimeline, true);
    }

    @Test(expected = EntityNotFoundException.class)
    public void testGetUploadViewForExporter3WithNoUploadOrRecord() {
        when(uploadDao.getUploadNoThrow(uploadId)).thenReturn(null);
        when(healthDataEx3Service.getRecord(uploadId, false)).thenReturn(Optional.empty());
        uploadView.getUploadViewForExporter3(appId, studyId, uploadId, fetchTimeline, fetchAdherence);
    }

    @Test(expected = EntityNotFoundException.class)
    public void testGetUploadViewForExporter3WithUploadFromDifferentApp() {
        Upload upload = mock(Upload.class);
        when(upload.getAppId()).thenReturn("different-app-id");
        when(uploadDao.getUploadNoThrow(uploadId)).thenReturn(upload);
        uploadView.getUploadViewForExporter3(appId, studyId, uploadId, fetchTimeline, fetchAdherence);
    }

    @Test(expected = EntityNotFoundException.class)
    public void testGetUploadViewForExporter3WithRecordFromDifferentApp() {
        HealthDataRecordEx3 record = mock(HealthDataRecordEx3.class);
        when(record.getAppId()).thenReturn("different-app-id");
        when(healthDataEx3Service.getRecord(uploadId, false)).thenReturn(Optional.of(record));
        uploadView.getUploadViewForExporter3(appId, studyId, uploadId, fetchTimeline, fetchAdherence);
    }

    @Test
    public void testGetUploadViewForExporter3WithUploadOnly() {
        Upload upload = mock(Upload.class);
        when(upload.getAppId()).thenReturn(appId);
        when(upload.getHealthCode()).thenReturn("test-health-code");
        when(uploadDao.getUploadNoThrow(uploadId)).thenReturn(upload); |
Note that the test is incomplete. I prompted ChatGPT with “Please continue.” ChatGPT wrote one more unit test
Code Block | ||
---|---|---|
| ||
@Test
public void testGetUploadViewForExporter3_RecordExistsButNoUpload() {
String appId = "test-app";
String studyId = "test-study";
String uploadId = "test-upload";
String healthCode = "test-health-code";
String userId = "test-user-id";
HealthDataRecordEx3 record = new DynamoHealthDataRecordEx3();
record.setId(uploadId);
record.setAppId(appId);
record.setHealthCode(healthCode);
when(uploadDao.getUploadNoThrow(uploadId)).thenReturn(null);
when(healthDataEx3Service.getRecord(uploadId, false)).thenReturn(Optional.of(record));
when(accountService.getAccountId(appId, "healthcode:" + healthCode)).thenReturn(Optional.of(userId));
doNothing().when(CAN_READ_UPLOADS).checkAndThrow(studyId, userId);
UploadViewEx3 result = service.getUploadViewForExporter3(appId, studyId, uploadId, false, false);
assertNotNull(result);
assertEquals(uploadId, result.getId());
assertEquals(healthCode, result.getHealthCode());
assertNull(result.getUpload());
assertSame(record, result.getRecord());
assertEquals(userId, result.getUserId());
assertNull(result.getTimelineMetadata());
assertNull(result.getAdherenceRecords());
verify(uploadDao).getUploadNoThrow(uploadId);
verify(healthDataEx3Service).getRecord(uploadId, false);
verify(accountService).getAccountId(appId, "healthcode:" + healthCode);
verify(CAN_READ_UPLOADS).checkAndThrow(studyId, userId);
} |
Then added some descriptive text
In this test, we simulate the case where a record exists but an upload does not. We create a HealthDataRecordEx3 object with the same ID as the upload ID and set its appId and healthCode properties. We then mock the UploadDao and HealthDataEx3Service to return null for the upload and the created record for the record. We also mock the AccountService to return the userId for the given appId and healthCode. We then invoke the method with fetchTimeline and fetchAdherence set to false, and assert that the returned UploadViewEx3 object has the expected properties. Finally, we verify that the appropriate methods were called on the mocked objects.
These two tests should cover the majority of the possible scenarios in the getUploadViewForExporter3 method. However, depending on the implementation details of the various services that the method depends on, there may be other edge cases that should be tested as well.
Note that attempts to get ChatGPT to regenerate testGetUploadViewForExporter3WithUploadOnly() caused ChatGPT to get confused, and I was unable to get what ChatGPT originally intended (if it intended anything at all) with that test.
The tests that ChatGPT did write needed quite a bit of massaging. After that, the tests compile, pass, and are readable and logical. However, they cover only 51% of lines and 38% of branches (even worse than DiffBlue), and they miss some important test cases.
I also tried the prompt: “Enumerate test cases for method getUploadViewForExporter3 in class UploadService at https://github.com/DwayneJengSage/BridgeServer2/blob/develop/src/main/java/org/sagebionetworks/bridge/services/UploadService.java”. ChatGPT replied with
Here are some test cases for the getUploadViewForExporter3 method in the UploadService class:
1. Test case where the studyId parameter is null. Expect an exception to be thrown.
2. Test case where the studyId parameter is an empty string. Expect an exception to be thrown.
3. Test case where the healthCode parameter is null. Expect an exception to be thrown.
4. Test case where the healthCode parameter is an empty string. Expect an exception to be thrown.
5. Test case where the synapseUserId parameter is null. Expect an exception to be thrown.
6. Test case where the synapseUserId parameter is an empty string. Expect an exception to be thrown.
7. Test case where the numFiles parameter is negative. Expect an exception to be thrown.
8. Test case where the numBytes parameter is negative. Expect an exception to be thrown.
9. Test case where the numRecords parameter is negative. Expect an exception to be thrown.
10. Test case where the uploadId parameter is null. Expect an exception to be thrown.
11. Test case where the uploadId parameter is an empty string. Expect an exception to be thrown.
12. Test case where the uploadId parameter is not a valid UUID. Expect an exception to be thrown.
13. Test case where the numFiles, numBytes, and numRecords parameters are all greater than 0, and the uploadId parameter is a valid UUID. Expect an UploadView object to be returned with all the fields populated appropriately.
* testGetUploadViewForExporter3WithUploadOnlyAndNoData: test the case where the healthCode parameter is not null, studyId is null, startDate and endDate are null, and the exporter3 parameter is UploadValidationStrictness.UPLOAD_ONLY. This test should ensure that no records are returned in the response.
* testGetUploadViewForExporter3WithStrictValidationAndNoData: test the case where the healthCode parameter is not null, studyId is null, startDate and endDate are null, and the exporter3 parameter is UploadValidationStrictness.STRICT_VALIDATION. This test should ensure that no records are returned in the response.
* testGetUploadViewForExporter3WithUploadOnlyAndRecords: test the case where the healthCode parameter is not null, studyId is null, startDate and endDate are null, and the exporter3 parameter is UploadValidationStrictness.UPLOAD_ONLY. This test should ensure that records are returned in the response.
* testGetUploadViewForExporter3WithStrictValidationAndRecords: test the case where the healthCode parameter is not null, studyId is null, startDate and endDate are null, and the exporter3 parameter is UploadValidationStrictness.STRICT_VALIDATION. This test should ensure that records are returned in the response.
* testGetUploadViewForExporter3WithStudyAndNoData: test the case where the healthCode parameter is null, studyId is not null, startDate and endDate are null, and the exporter3 parameter is UploadValidationStrictness.UPLOAD_ONLY. This test should ensure that no records are returned in the response.
* testGetUploadViewForExporter3WithStudyAndRecords: test the case where the healthCode parameter is null, studyId is not null, startDate and endDate are null, and the exporter3 parameter is UploadValidationStrictness.UPLOAD_ONLY. This test should ensure that records are returned in the response.
* testGetUploadViewForExporter3WithDatesAndNoData: test the case where the healthCode parameter is null, studyId is null, startDate and endDate are not null, and the exporter3 parameter is UploadValidationStrictness.UPLOAD_ONLY. This test should ensure that no records are returned in the response.
* testGetUploadViewForExporter3WithDatesAndRecords: test the case where the healthCode parameter is null, studyId is null, startDate and endDate are not null, and the exporter3 parameter is UploadValidationStrictness.UPLOAD_ONLY. This test should ensure that records are returned in the response.
Note that the output refers to parameters not present in the linked code, which makes this list of test cases not particularly useful.
Next, I tried the following prompt: “Summarize the pull request https://github.com/Sage-Bionetworks/BridgeServer2/pull/649 in a page of text, with the intended audience of other software developers.” ChatGPT responded with
This pull request adds functionality to the BridgeServer2 project that allows a user to upload large files in chunks to the server. The goal of this feature is to reduce the risk of uploading a large file in a single request, which could cause issues with network timeouts, file size limitations, or server resource constraints.
The changes made in this pull request include modifications to the UploadService class, as well as the addition of a ChunkedFileHelper class that handles the chunking and reassembly of files. The UploadService now has a new method called "uploadChunk" that accepts a chunk of a file, and a separate method called "completeChunkedFileUpload" that reassembles the chunks into a single file once all the chunks have been uploaded.
Several new unit tests were added to test this new functionality, including tests to ensure that files are correctly split into chunks, that the correct error messages are returned when necessary, and that the reassembly of chunks produces the expected result.
Overall, this pull request adds a valuable feature to BridgeServer2 that improves the reliability and robustness of file uploads, especially for larger files.
This summary is completely wrong. Additional prompts asking for the title and description of the linked pull request confirm that ChatGPT is unable to follow links.
Conclusion:
Better at generating unit tests than DiffBlue. Might be good for generating tests for simple code, but fails at anything complex.
Good at describing a copy-pasted piece of code. Unable to follow links (to either code or pull requests) and describe the linked code, let alone enumerate or write unit tests for it. This limits ChatGPT’s ability to work with large codebases.
GitHub Copilot
GitHub Copilot was similar to OpenAI in that both wrote code that was repetitive, didn’t compile, and needed a lot of editing before it would compile and run. The one advantage Copilot has over OpenAI is that Copilot can run as an IDE plugin.
In terms of cost, GitHub Copilot costs $10/mo, which is less than 0.2% of an entry-level Associate Engineer’s salary at Sage.
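As a quick sanity check on that percentage, the arithmetic can be worked backwards. This page does not state the actual salary, so the sketch below only computes the break-even point implied by the $10/mo price and the 0.2% figure; the class and method names are made up for illustration.

```java
// Hypothetical helper: works out the salary at which $10/mo is exactly 0.2%.
final class CopilotCostSketch {
    static final long MONTHLY_COST_DOLLARS = 10;

    /** Twelve months at the listed monthly price. */
    static long annualCostDollars() {
        return MONTHLY_COST_DOLLARS * 12;
    }

    /** Salary at which the annual cost equals 0.2% (2 per 1000), using integer math. */
    static long breakEvenSalaryDollars() {
        return annualCostDollars() * 1000 / 2;
    }

    public static void main(String[] args) {
        System.out.println("Annual cost: $" + annualCostDollars());
        System.out.println("Break-even salary: $" + breakEvenSalaryDollars());
    }
}
```

In other words, Copilot costs $120 per year, and the "less than 0.2%" claim holds for any annual salary above $60,000.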
I tried Copilot specifically for unit test generation. For reference, the code under test was https://github.com/DwayneJengSage/BridgeServer2/blob/develop/src/main/java/org/sagebionetworks/bridge/services/UploadService.java#L522
I went to UploadServiceTest.java and gave the following prompt
Code Block | ||
---|---|---|
| ||
// Unit test for UploadService.getUploadViewForExporter3() with timeline and adherence.
@Test
public void testGetUploadViewForExporter3() { |
Copilot filled this in with the following
Code Block | ||
---|---|---|
| ||
// Unit test for UploadService.getUploadViewForExporter3() with timeline and adherence.
@Test
public void testGetUploadViewForExporter3() {
// Mock upload.
Upload upload = Upload.create();
upload.setAppId(TEST_APP_ID);
upload.setUploadId(UPLOAD_ID_1);
upload.setHealthCode(HEALTH_CODE);
upload.setStudyId(TEST_STUDY_ID);
upload.setCreatedOn(CREATED_ON);
upload.setCompletedOn(CREATED_ON);
upload.setMetadata(METADATA);
upload.setRecordId(RECORD_ID);
when(mockUploadDao.getUploadNoThrow(UPLOAD_ID_1)).thenReturn(upload);
// Mock record.
HealthDataRecordEx3 healthDataRecord = HealthDataRecordEx3.create();
healthDataRecord.setAppId(TEST_APP_ID);
healthDataRecord.setUploadId(UPLOAD_ID_1);
healthDataRecord.setHealthCode(HEALTH_CODE);
healthDataRecord.setStudyId(TEST_STUDY_ID);
healthDataRecord.setCreatedOn(CREATED_ON);
healthDataRecord.setMetadata(METADATA);
when(mockHealthDataEx3Service.getRecord(UPLOAD_ID_1, false)).thenReturn(Optional.of(
healthDataRecord));
// Mock timeline.
TimelineMetadata timelineMetadata = new TimelineMetadata();
timelineMetadata.setAppVersion(1);
timelineMetadata.setPhoneInfo("phone info");
timelineMetadata.setTimeZone("America/Los_Angeles");
timelineMetadata.setUploadDate(CREATED_ON);
timelineMetadata.setUploadId(UPLOAD_ID_1);
timelineMetadata.setUploadMetadata(METADATA);
when(mockTimelineService.getTimelineMetadata(UPLOAD_ID_1)).thenReturn(Optional.of(
timelineMetadata));
// Mock adherence.
AdherenceRecord adherenceRecord = new AdherenceRecord();
adherenceRecord.setAppVersion(1);
adherenceRecord.setPhoneInfo("phone info");
adherenceRecord.setTimeZone("America/Los_Angeles");
adherenceRecord.setUploadDate(CREATED_ON);
adherenceRecord.setUploadId(UPLOAD_ID_1);
adherenceRecord.setUploadMetadata(METADATA);
when(mockAdherenceService.getAdherenceRecord(UPLOAD_ID_1)).thenReturn(Optional.of(
adherenceRecord));
// Execute and validate.
UploadView uploadView = svc.getUploadViewForExporter3(TEST_APP_ID, null, UPLOAD_ID_1, true,
true);
assertEquals(uploadView.getUploadId(), UPLOAD_ID_1);
assertEquals(uploadView.getHealthCode(), HEALTH_CODE);
assertEquals(uploadView.getStudyId(), TEST_STUDY_ID);
} |
This looks really nice, except that most of the setters on Upload, Record, Timeline, and Adherence don’t actually exist, nor do the getters on UploadView. Also, by the logic of the code under test, this wouldn’t actually exercise the code under test.
At Nick Grosenbacher’s suggestion, I attempted to use Copilot to enumerate test cases. I gave the following prompt
Code Block | ||
---|---|---|
| ||
// Generate test cases for UploadService.getUploadViewForExporter3().
// test case: |
Copilot auto-completed it to
Code Block | ||
---|---|---|
| ||
// test case: upload, record, expected exception |
And then repeated it a bunch of times.
Okay, let’s try
Code Block | ||
---|---|---|
| ||
// test case 2: |
Copilot auto-completed it to the same thing
Code Block | ||
---|---|---|
| ||
// test case 2: upload, record, expected exception |
Okay, maybe my mistake was asking Copilot to enumerate test cases in a file separate from where my code is. Let’s try putting this prompt in a comment immediately after the method I’m trying to test
Code Block | ||
---|---|---|
| ||
// Generate test cases for getUploadViewForExporter3().
// test case: |
Copilot auto-completed it with
Code Block | ||
---|---|---|
| ||
// test case: getUploadViewForExporter3() with upload
// test case: getUploadViewForExporter3() with record
// test case: getUploadViewForExporter3() with upload and record
// test case: getUploadViewForExporter3() with upload and record, but upload is from a different app
// test case: getUploadViewForExporter3() with upload and record, but record is from a different app
// test case: getUploadViewForExporter3() with upload and record, but upload and record are from different apps
// (repeats the last line ad infinitum) |
Okay, it’s nowhere near exhaustive. There are a lot of important cases that it’s missing, but it’s a good start.
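One way to see how much Copilot's list leaves out is to enumerate the input combinations mechanically. The sketch below is an assumption about which dimensions matter (whether the upload and the record are missing, in the same app, or in a different app, crossed with the two fetch flags); the enum and class names are hypothetical and not part of BridgeServer2.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: cross upload state, record state, and the two boolean flags.
final class TestCaseEnumerationSketch {
    enum EntityState { MISSING, SAME_APP, DIFFERENT_APP }

    static List<String> enumerateCases() {
        List<String> cases = new ArrayList<>();
        boolean[] flags = {false, true};
        for (EntityState upload : EntityState.values()) {
            for (EntityState record : EntityState.values()) {
                for (boolean fetchTimeline : flags) {
                    for (boolean fetchAdherence : flags) {
                        cases.add(String.format(
                                "upload=%s, record=%s, fetchTimeline=%b, fetchAdherence=%b",
                                upload, record, fetchTimeline, fetchAdherence));
                    }
                }
            }
        }
        return cases;
    }

    public static void main(String[] args) {
        // Print one candidate test case per line.
        enumerateCases().forEach(System.out::println);
    }
}
```

Even this simple cross product yields 3 x 3 x 2 x 2 = 36 combinations, compared with the six distinct cases Copilot suggested, and it still ignores null arguments, a blank studyId, permission checks, and timeline or adherence data from a different app.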
Conclusion: Copilot is worth continuing to passively investigate while working on normal sprint work. It seems okay for enumerating test cases, but it’s incomplete, and I wouldn’t yet trust it to generate code for me.
Update: Copilot is good as a more sophisticated auto-complete that can auto-complete entire blocks of code for you, which cuts down on repeated copy-pasting.
DiffBlue Cover
See https://www.diffblue.com/products/ for details. DiffBlue Cover is a tool that analyzes code and generates unit tests. It is available as an IntelliJ plugin.
Some limitations of DiffBlue Cover
...
Code Block | ||
---|---|---|
| ||
package org.sagebionetworks.bridge.services;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNull;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.mockito.Mockito.any;
import static org.mockito.Mockito.anyBoolean;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import com.amazonaws.services.s3.AmazonS3;
import com.fasterxml.jackson.databind.node.JsonNodeFactory;
import com.fasterxml.jackson.databind.node.ObjectNode;
import java.util.Optional;
import org.junit.jupiter.api.Disabled;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.InjectMocks;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;
import org.mockito.junit.jupiter.MockitoSettings;
import org.mockito.quality.Strictness;
import org.sagebionetworks.bridge.config.BridgeConfig;
import org.sagebionetworks.bridge.dao.UploadDao;
import org.sagebionetworks.bridge.dao.UploadDedupeDao;
import org.sagebionetworks.bridge.dynamodb.DynamoHealthDataRecordEx3;
import org.sagebionetworks.bridge.dynamodb.DynamoUpload2;
import org.sagebionetworks.bridge.exceptions.BadRequestException;
import org.sagebionetworks.bridge.exceptions.EntityNotFoundException;
import org.sagebionetworks.bridge.models.BridgeEntity;
import org.sagebionetworks.bridge.models.upload.UploadViewEx3;

@ExtendWith(MockitoExtension.class)
@MockitoSettings(strictness = Strictness.LENIENT)
class UploadServiceDiffBlueTest {
    @Mock
    private AccountService accountService;
    @Mock
    private AdherenceService adherenceService;
    @Mock
    private AmazonS3 amazonS3;
    @Mock
    private AppService appService;
    @Mock
    private BridgeConfig bridgeConfig;
    @Mock
    private Exporter3Service exporter3Service;
    @Mock
    private HealthDataEx3Service healthDataEx3Service;
    @Mock
    private HealthDataService healthDataService;
    @Mock
    private Schedule2Service schedule2Service;
    @Mock
    private UploadDao uploadDao;
    @Mock
    private UploadDedupeDao uploadDedupeDao;
    @InjectMocks
    private UploadService uploadService;
    @Mock
    private UploadValidationService uploadValidationService;

    /**
     * Method under test: {@link UploadService#getUploadViewForExporter3(String, String, String, boolean, boolean)}
     */
    @Test
    void testGetUploadViewForExporter3() {
        when(healthDataEx3Service.getRecord((String) any(), anyBoolean()))
                .thenReturn(Optional.of(new DynamoHealthDataRecordEx3()));
        when(uploadDao.getUploadNoThrow((String) any())).thenReturn(new DynamoUpload2());
        assertThrows(EntityNotFoundException.class,
                () -> uploadService.getUploadViewForExporter3("42", "42", "42", true, true));
        verify(uploadDao).getUploadNoThrow((String) any());
    }

    /**
     * Method under test: {@link UploadService#getUploadViewForExporter3(String, String, String, boolean, boolean)}
     */
    @Test
    void testGetUploadViewForExporter32() {
        when(healthDataEx3Service.getRecord((String) any(), anyBoolean()))
                .thenReturn(Optional.of(new DynamoHealthDataRecordEx3()));
        when(uploadDao.getUploadNoThrow((String) any())).thenThrow(new RuntimeException());
        assertThrows(RuntimeException.class,
                () -> uploadService.getUploadViewForExporter3("42", "42", "42", true, true));
        verify(uploadDao).getUploadNoThrow((String) any());
    }

    /**
     * Method under test: {@link UploadService#getUploadViewForExporter3(String, String, String, boolean, boolean)}
     */
    @Test
    void testGetUploadViewForExporter33() {
        when(healthDataEx3Service.getRecord((String) any(), anyBoolean()))
                .thenReturn(Optional.of(new DynamoHealthDataRecordEx3()));
        when(uploadDao.getUploadNoThrow((String) any())).thenReturn(null);
        assertThrows(EntityNotFoundException.class,
                () -> uploadService.getUploadViewForExporter3("42", "42", "42", true, true));
        verify(healthDataEx3Service).getRecord((String) any(), anyBoolean());
        verify(uploadDao).getUploadNoThrow((String) any());
    }

    /**
     * Method under test: {@link UploadService#getUploadViewForExporter3(String, String, String, boolean, boolean)}
     */
    @Test
    void testGetUploadViewForExporter34() {
        when(healthDataEx3Service.getRecord((String) any(), anyBoolean()))
                .thenReturn(Optional.of(new DynamoHealthDataRecordEx3()));
        DynamoUpload2 dynamoUpload2 = mock(DynamoUpload2.class);
        when(dynamoUpload2.getAppId()).thenReturn("42");
        when(uploadDao.getUploadNoThrow((String) any())).thenReturn(dynamoUpload2);
        assertThrows(EntityNotFoundException.class,
                () -> uploadService.getUploadViewForExporter3("42", "42", "42", true, true));
        verify(healthDataEx3Service).getRecord((String) any(), anyBoolean());
        verify(uploadDao).getUploadNoThrow((String) any());
        verify(dynamoUpload2).getAppId();
    }

    /**
     * Method under test: {@link UploadService#getUploadViewForExporter3(String, String, String, boolean, boolean)}
     */
    @Test
    void testGetUploadViewForExporter35() {
        when(healthDataEx3Service.getRecord((String) any(), anyBoolean()))
                .thenReturn(Optional.of(mock(DynamoHealthDataRecordEx3.class)));
        DynamoUpload2 dynamoUpload2 = mock(DynamoUpload2.class);
        when(dynamoUpload2.getAppId()).thenThrow(new RuntimeException());
        when(uploadDao.getUploadNoThrow((String) any())).thenReturn(dynamoUpload2);
        assertThrows(RuntimeException.class,
                () -> uploadService.getUploadViewForExporter3("42", "42", "42", true, true));
        verify(uploadDao).getUploadNoThrow((String) any());
        verify(dynamoUpload2).getAppId();
    }

    /**
     * Method under test: {@link UploadService#getUploadViewForExporter3(String, String, String, boolean, boolean)}
     */
    @Test
    @Disabled("TODO: Complete this test")
    void testGetUploadViewForExporter36() {
        // TODO: Complete this test.
        // Reason: R013 No inputs found that don't throw a trivial exception.
        // Diffblue Cover tried to run the arrange/act section, but the method under
        // test threw
        // java.lang.NullPointerException
        // at org.sagebionetworks.bridge.services.UploadService.getUploadViewForExporter3(UploadService.java:549)
        // See https://diff.blue/R013 to resolve this issue.
        when(healthDataEx3Service.getRecord((String) any(), anyBoolean())).thenReturn(null);
        when(uploadDao.getUploadNoThrow((String) any())).thenReturn(mock(DynamoUpload2.class));
        new RuntimeException();
        uploadService.getUploadViewForExporter3("42", "42", "42", true, true);
    }

    /**
     * Method under test: {@link UploadService#getUploadViewForExporter3(String, String, String, boolean, boolean)}
     */
    @Test
    void testGetUploadViewForExporter37() {
        when(accountService.getAccountId((String) any(), (String) any())).thenReturn(Optional.of("foo"));
        when(healthDataEx3Service.getRecord((String) any(), anyBoolean())).thenReturn(Optional.empty());
        DynamoUpload2 dynamoUpload2 = mock(DynamoUpload2.class);
        when(dynamoUpload2.getMetadata()).thenReturn(new ObjectNode(JsonNodeFactory.withExactBigDecimals(true)));
        when(dynamoUpload2.getHealthCode()).thenReturn("Health Code");
        when(dynamoUpload2.getAppId()).thenReturn("42");
        when(uploadDao.getUploadNoThrow((String) any())).thenReturn(dynamoUpload2);
        new RuntimeException();
        UploadViewEx3 actualUploadViewForExporter3 = uploadService.getUploadViewForExporter3("42", "42", "42", true,
                true);
        assertEquals("foo", actualUploadViewForExporter3.getUserId());
        assertNull(actualUploadViewForExporter3.getRecord());
        assertEquals("42", actualUploadViewForExporter3.getId());
        assertEquals("Health Code", actualUploadViewForExporter3.getHealthCode());
        verify(accountService).getAccountId((String) any(), (String) any());
        verify(healthDataEx3Service).getRecord((String) any(), anyBoolean());
        verify(uploadDao).getUploadNoThrow((String) any());
        verify(dynamoUpload2).getMetadata();
        verify(dynamoUpload2).getAppId();
        verify(dynamoUpload2).getHealthCode();
    }

    /**
     * Method under test: {@link UploadService#getUploadViewForExporter3(String, String, String, boolean, boolean)}
     */
    @Test
    void testGetUploadViewForExporter38() {
        when(accountService.getAccountId((String) any(), (String) any())).thenReturn(Optional.of("foo"));
        when(healthDataEx3Service.getRecord((String) any(), anyBoolean())).thenReturn(Optional.empty());
        DynamoUpload2 dynamoUpload2 = mock(DynamoUpload2.class);
        when(dynamoUpload2.getMetadata()).thenThrow(new EntityNotFoundException(BridgeEntity.class));
        when(dynamoUpload2.getHealthCode()).thenThrow(new EntityNotFoundException(BridgeEntity.class));
        when(dynamoUpload2.getAppId()).thenReturn("42");
        when(uploadDao.getUploadNoThrow((String) any())).thenReturn(dynamoUpload2);
        new RuntimeException();
        assertThrows(EntityNotFoundException.class,
                () -> uploadService.getUploadViewForExporter3("42", "42", "42", true, true));
        verify(healthDataEx3Service).getRecord((String) any(), anyBoolean());
        verify(uploadDao).getUploadNoThrow((String) any());
        verify(dynamoUpload2).getAppId();
        verify(dynamoUpload2).getHealthCode();
    }

    /**
     * Method under test: {@link UploadService#getUploadViewForExporter3(String, String, String, boolean, boolean)}
     */
    @Test
    void testGetUploadViewForExporter39() {
        when(accountService.getAccountId((String) any(), (String) any())).thenReturn(Optional.of("42"));
        when(healthDataEx3Service.getRecord((String) any(), anyBoolean())).thenReturn(Optional.empty());
        when(uploadDao.getUploadNoThrow((String) any())).thenReturn(mock(DynamoUpload2.class));
        new RuntimeException();
        assertThrows(BadRequestException.class,
                () -> uploadService.getUploadViewForExporter3("42", "", "42", true, true));
    }

    /**
     * Method under test: {@link UploadService#getUploadViewForExporter3(String, String, String, boolean, boolean)}
     */
    @Test
    void testGetUploadViewForExporter310() {
        when(accountService.getAccountId((String) any(), (String) any())).thenReturn(Optional.of("42"));
        when(healthDataEx3Service.getRecord((String) any(), anyBoolean())).thenReturn(Optional.empty());
        DynamoUpload2 dynamoUpload2 = mock(DynamoUpload2.class);
        when(dynamoUpload2.getMetadata()).thenReturn(new ObjectNode(JsonNodeFactory.withExactBigDecimals(true)));
        when(dynamoUpload2.getHealthCode()).thenReturn("Health Code");
        when(dynamoUpload2.getAppId()).thenReturn("42");
when(uploadDao.getUploadNoThrow((String) any())).thenReturn(dynamoUpload2); new RuntimeException(); UploadViewEx3 actualUploadViewForExporter3 = uploadService.getUploadViewForExporter3("42", "42", "42", true, false); assertEquals("42", actualUploadViewForExporter3.getUserId()); assertNull(actualUploadViewForExporter3.getRecord()); assertEquals("42", actualUploadViewForExporter3.getId()); assertEquals("Health Code", actualUploadViewForExporter3.getHealthCode()); verify(accountService).getAccountId((String) any(), (String) any()); verify(healthDataEx3Service).getRecord((String) any(), anyBoolean()); verify(uploadDao).getUploadNoThrow((String) any()); verify(dynamoUpload2).getMetadata(); verify(dynamoUpload2).getAppId(); verify(dynamoUpload2).getHealthCode(); } /** * Method under test: {@link UploadService#getUploadViewForExporter3(String, String, String, boolean, boolean)} */ @Test @Disabled("TODO: Complete this test") void testGetUploadViewForExporter311() { // TODO: Complete this test. // Reason: R013 No inputs found that don't throw a trivial exception. // Diffblue Cover tried to run the arrange/act section, but the method under // test threw // java.lang.NullPointerException // at org.sagebionetworks.bridge.services.UploadService.getUploadViewForExporter3(UploadService.java:574) // See https://diff.blue/R013 to resolve this issue. 
when(accountService.getAccountId((String) any(), (String) any())).thenReturn(null); when(healthDataEx3Service.getRecord((String) any(), anyBoolean())).thenReturn(Optional.empty()); when(uploadDao.getUploadNoThrow((String) any())).thenReturn(mock(DynamoUpload2.class)); new RuntimeException(); uploadService.getUploadViewForExporter3("42", "42", "42", true, false); } } |
DiffBlue at least wrote tests that not only compile, but also succeed! The test coverage is poor though, with only 55% line coverage and 42% branch coverage, and most of the lines it failed to cover were important cases that needed to be tested.
Also, the tests are almost unreadable. I wouldn’t accept this level of quality from a paid intern. I’m not even sure I would accept this level of quality for free.
Conclusion: ChatGPT does a better job. However, DiffBlue is more convenient to use as an IDE plugin. Might be worth investigating again in a year or two as technology improves.
Code Reviews
AI seems to have trouble generating code, but it might be able to help with code reviews.
Personally, I think AI-assisted code review shouldn’t replace human-led code review. However, I see two pieces of value in it:
catching things that human reviewers might miss and
catching things early so that developers can start fixing issues in their code instead of waiting for teammates’ availability.
I’m not sure how we could measure it, but I conjecture that if we point an AI at our code and it catches issues at all (other than inconsequential issues or stylistic nitpicks), then it’s providing value. Whether that value is worth what we’re being charged for it remains to be seen, but AWS CodeGuru charges $30/mo for each 100k lines of code, which is less than 0.5% of an entry-level Associate Engineer’s salary at Sage.
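To put the pricing comparison above in concrete terms, here is a rough sketch of the arithmetic. Only the $30/mo per 100k lines figure and the 0.5% threshold come from the paragraph above; the class and method names are hypothetical, and the sketch assumes the price scales linearly with whole 100k-line blocks, which I have not checked against AWS’s actual billing rules.

```java
public class CodeGuruCostSketch {
    // Price quoted in the text above: $30 per month per 100k lines of code.
    static final double MONTHLY_PRICE_PER_100K_LINES = 30.0;

    /** Yearly cost in dollars for the given number of 100k-line blocks (assumed linear). */
    static double annualCost(int blocksOf100kLines) {
        return MONTHLY_PRICE_PER_100K_LINES * 12 * blocksOf100kLines;
    }

    /** Salary at which the yearly cost equals the given fraction of that salary. */
    static double breakEvenSalary(double annualCost, double fraction) {
        return annualCost / fraction;
    }

    public static void main(String[] args) {
        double cost = annualCost(1);
        System.out.printf("Annual cost: $%.2f%n", cost);
        System.out.printf("0.5%% break-even salary: $%.2f%n", breakEvenSalary(cost, 0.005));
    }
}
```

For a single 100k-line block this works out to $360 per year, so the “less than 0.5%” comparison holds for any annual salary above $72,000. Under the linear assumption, a larger codebase raises that threshold proportionally.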
Leads include
Codeball
Codeball runs in your GitHub repository as a GitHub action. However, I ran into permissions issues while setting it up and was not able to resolve them on my own. It looks like Codeball support is done through Discord, and the Codeball devs aren’t very active there. As a result, I am currently blocked and unable to evaluate Codeball any further.
In addition to Discord, I also tried creating a GitHub issue for support: https://github.com/sturdy-dev/codeball-action/issues/186
Update: Codeball appears to be a dead project. There hasn’t been any activity on their GitHub or their Discord for several months.
Codiga
Codiga runs as a GitHub app. Installing the Codiga app requires org-level permissions, even if you’re only installing for repositories that you have admin permissions on. This is currently waiting on IT to respond to the installation request.
AWS CodeGuru
See https://aws.amazon.com/codeguru/
There’s an issue where CodeGuru tries to associate with my fork at https://github.com/DwayneJengSage/BridgeServer2 instead of the root fork at https://github.com/Sage-Bionetworks/BridgeServer2, which means I’m unable to point CodeGuru at our pull requests.
I filed an AWS support request: https://support.console.aws.amazon.com/support/home?region=us-east-1#/case/?displayId=12290789771&language=en