cannot make cache directory on Windows 10

Description

Identified by user through the help forum here:

https://www.synapse.org/#!Synapse:syn5637528/discussion/threadId=6334

Loading the package fails because the Synapse cache directory cannot be created. My suspicion is that this line (https://github.com/Sage-Bionetworks/synapsePythonClient/blob/master/synapseclient/cache.py#L30) is not safe for Windows, as tildes are reserved for other uses (https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file?redirectedfrom=MSDN#short-vs-long-names). Instead, we should use `os.path.expanduser` to get the current user's home directory (https://docs.python.org/3.5/library/os.path.html). (But note that this is already done here:
https://github.com/Sage-Bionetworks/synapsePythonClient/blob/master/synapseclient/cache.py#L80)
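
For reference, a minimal sketch of what tilde expansion with `os.path.expanduser` looks like. The helper name `resolve_cache_root` and the default `~/.synapseCache` value are assumptions made for illustration; this is not a copy of the code in cache.py.

```python
import os


def resolve_cache_root(configured_path="~/.synapseCache"):
    """Hypothetical helper: expand a leading '~' to the user's home directory.

    os.path.expanduser only rewrites a leading '~'; a path without one is
    returned unchanged (apart from the normalization applied below).
    """
    expanded = os.path.expanduser(configured_path)
    # normpath collapses redundant separators and '.' components.
    return os.path.normpath(expanded)
```

Which environment variable `expanduser` consults for the home directory depends on the platform and the Python version, so the result is only as good as the value the process inherits.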

I will file this issue for the Python client as well and link it here.

Reminder to follow up in the discussion forum when a fix is known, thanks!

Environment

None

Activity

Ziming Dong
December 6, 2019, 11:34 PM

There is nothing wrong with Unicode characters in the home directory path in Python 3.5 on Windows.

Ziming Dong
December 6, 2019, 11:37 PM

This is likely an encoding error when transferring Unicode from Python to R in PythonEmbedInR

Bruce Hoff
December 8, 2019, 3:25 PM

The user reports:

I have found that my Chinese directory name results in this problem. I tried to load the package with library() in another account with an English directory name, and it succeeded.
Thank you for your long-time guidance! I hope no other errors occur.
And I have a modest proposal: I wish Chinese directory names could be recognized. I think it could help many people like me. Thanks again!

Unfortunately they did not perform the requested experiment. I am downgrading the issue from Blocker since the user has a workaround.

Bruce Hoff
December 8, 2019, 3:28 PM


> This is likely an encoding error when transferring Unicode from Python to R in PythonEmbedInR
Thanks for looking into this. What doesn't make sense is that making synapser work does not involve sending the string from Python to R. It would help if you could reproduce the user's problem: install R on the Windows machine, install the 'synapser' package, and then execute 'library(synapser)'. Do you see the same error?

Jordan Kiang
June 16, 2020, 11:33 AM

The issue derives from an odd misbehavior of R concerning the HOME directory on Windows when the username contains non-ASCII characters.

If no HOME environment variable is set (it is not typically set on Windows, which uses USERPROFILE instead for most similar uses), then R will make a HOME variable available in its session as described here: https://cran.r-project.org/bin/windows/base/rw-FAQ.html#What-are-HOME-and-working-directories_003f

This variable does not come from the Windows environment and unfortunately can be invalid on Windows when the username has non-ASCII character(s). For example, in the repro below the username is “中文”.

Sys.getenv('HOME')
[1] "C:\Users\??\Documents"
grepl('??', Sys.getenv('HOME'), fixed=TRUE)
[1] TRUE
Sys.getenv('USERPROFILE')
[1] "C:\Users\<U+4E2D><U+6587>"

Note in the above that this isn’t just a display encoding issue in the console: the variable contains actual '?' characters where the non-ASCII characters should be. The USERPROFILE environment variable, on the other hand, comes from Windows, contains the Unicode code points, and works properly.

The invalid HOME variable set by R is then propagated via synapser to the Python synapseclient, which uses it when it tries to create the synapseCache directory and fails as a consequence. The path itself is invalid; the problem is not with os.makedirs, so precreating the cache directory doesn’t help. A workaround on an individual basis is to explicitly set a HOME environment variable in Windows, e.g. to %USERPROFILE%\Documents, after which R won’t try to create its own HOME variable in the session.

I have asked a user here to confirm that this workaround fixes the issue for them. Once that is confirmed, a workaround fix in synapser on Windows could be to set the CACHE_ROOT_DIR based on USERPROFILE rather than letting it use the potentially invalid HOME directory.
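
A rough sketch of that fallback idea, written in Python for illustration only. The helper names `choose_home_dir` and `cache_root` are hypothetical, and the check assumes that a '?' in HOME means R has mangled the path; it does not account for extended-length paths that start with `\\?\`. It uses `ntpath` so that the Windows path joining can be exercised on any platform.

```python
import ntpath


def choose_home_dir(env):
    """Hypothetical helper: pick a usable home directory from an env mapping.

    '?' is a reserved character in Windows file names, so a HOME value that
    contains it cannot refer to a real directory. In that case prefer
    USERPROFILE, which comes from Windows itself.
    """
    home = env.get("HOME", "")
    user_profile = env.get("USERPROFILE", "")
    if home and "?" not in home:
        return home
    if user_profile:
        return user_profile
    # Nothing better is available; return HOME as-is and let the caller fail.
    return home


def cache_root(env, dirname=".synapseCache"):
    """Build the cache path under the chosen home directory (illustrative)."""
    return ntpath.join(choose_home_dir(env), dirname)
```

With the values from the repro above (HOME containing '??' and USERPROFILE containing the real username), this would select the USERPROFILE path. Whether synapser should do this unconditionally on Windows or only when HOME looks invalid is a design choice left open here.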

Notably, I also had issues installing other libraries (for example renv) with such a username, so this seems to be a poorly supported case. Below are some similar issues I ran across while researching this.

Similar issues:

https://www.reddit.com/r/rstats/comments/635k6m/rstudio_nonascii_windows_username_workaround/

Assignee

Jordan Kiang

Reporter

Kenneth Daily

Labels

Validator

Kenneth Daily

Development Area

Synapse Core Infrastructure

Release Version History

None

Sprint

None

Priority

Major