
Update Synapse tables for integer types

I tried to download a Synapse table as a Pandas data frame and then upload it back to Synapse.

The following example succeeded as expected:

results = syn.tableQuery("SELECT * FROM syn3458936 WHERE recordId IN ('cd0e7472-e975-4344-8214-d766aea417bf')")
df = results.asDataFrame()
table = syn.store(Table('syn3458936', df, etag=results.etag))

However, if the query also includes one particular row that has a missing integer value in the "heartAgeDataBloodGlucose" column, the upload fails:

results = syn.tableQuery("SELECT * FROM syn3458936 WHERE recordId IN ('cd0e7472-e975-4344-8214-d766aea417bf', 'f1b94621-9b31-40cd-a645-392aa690f7ab')")
df = results.asDataFrame()
table = syn.store(Table('syn3458936', df, etag=results.etag))

Synapse rejects the upload with an error message like "Value at 9,9 was not a valid INTEGER. For input string: "100.0"".

The reason is that the Pandas integer type does not support NaN. When an integer column has a missing value, its type falls back to float, so a value such as 100 is stored as 100.0. The string representation of that float value is then incompatible with the Synapse table's INTEGER type.
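The type fallback can be reproduced without Synapse. The sketch below is only an illustration: the two-row data frame is a hypothetical stand-in for the query result, and it assumes the data frame ends up serialized as CSV text before upload. It also shows one possible workaround, the nullable "Int64" dtype, which requires pandas 0.24 or later and may not be available in the environment where this issue was reported.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the query result; only the column name
# "heartAgeDataBloodGlucose" comes from the report above.
df = pd.DataFrame({
    "recordId": ["a", "b"],
    "heartAgeDataBloodGlucose": [100, np.nan],
})

# The missing value forces the whole column to float64,
# so 100 is written out as "100.0".
float_dtype = str(df["heartAgeDataBloodGlucose"].dtype)
csv_before = df.to_csv(index=False)

# Possible workaround (assumes pandas >= 0.24): the nullable integer
# dtype keeps whole numbers as integers and writes missing values
# as empty cells.
fixed = df.copy()
fixed["heartAgeDataBloodGlucose"] = fixed["heartAgeDataBloodGlucose"].astype("Int64")
csv_after = fixed.to_csv(index=False)

print(float_dtype)
print(csv_before)
print(csv_after)
```

Whether the converted data frame is accepted on upload depends on how the client serializes it, so this is a starting point for a fix rather than a confirmed one.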

Assignee: Ziming Dong

Reporter: Bruce Hoff

Validator: Bruce Hoff

Affects versions: py-1.3

Priority: Critical