Tips for Unriddling Encoding in SAS Visual Analytics 6.3
In preparation for my joint paper with Tricia at SAS® Global Forum 2014, “SAS Admins Need a Dashboard Too”, our SAS admin support team, Ken Aanderud and Paul Homes, had been busy setting up our new SAS Visual Analytics 6.3 environments, which we are very excited about! As we got underway loading tables to explore and analyze data, we came across an error that prevented a table from being loaded into memory:
ERROR: Some character data was lost during transcoding in the dataset LIB.TABLE. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding.
The log message indicated Cross Environment Data Access (CEDA) was being used:
NOTE: Data file LIB.TABLE.DATA is in a format that is native to another host, or the file encoding does not match the session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce performance.
So we had an encoding issue… how to unriddle it?
Being Base SAS programmers, we turned to our trusty PROC CONTENTS, which reports the descriptor portion of a SAS dataset, including the table's encoding. The output showed the table was encoded in latin1, while the default session encoding for the SAS Visual Analytics 6.3 installation was utf8, so our next step was to convert the table to utf8.
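A minimal sketch of that check is shown here. The libref l1 and the path match the conversion code later in this post; MYTABLE is a hypothetical table name used only for illustration.

```sas
/* Assumption: ./data.latin1 holds the source tables; MYTABLE is a hypothetical table name */
libname l1 './data.latin1';

/* The Encoding attribute in the output shows how the table is encoded */
proc contents data=l1.mytable;
run;

/* Write the encoding of the current SAS session to the log */
proc options option=encoding;
run;
```

If the two values differ, SAS has to transcode the data when it reads the table, which is where CEDA comes in.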
The following code did the conversion. The tip is to use the CVP (character variable padding) engine on the LIBNAME statement, which expands character variable lengths so that the data is not truncated when it is transcoded:
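/* Read the latin1 tables through the CVP engine, which pads character lengths */
libname l1 cvp './data.latin1' inencoding=latin1;
/* Write the copies as utf8 */
libname u8 './data' outencoding=utf8;
proc copy in=l1 out=u8 noclone;  /* noclone: take the encoding from the output library */
run;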
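To confirm the result, one option is to run PROC CONTENTS against the copy. This is only a sketch: it reuses the u8 path from the code above, and MYTABLE is again a hypothetical table name. Because CVP pads character variables (by a factor of 1.5 unless told otherwise), expect the character lengths in the copy to be larger than in the source.

```sas
/* Assumption: ./data is the output library from the conversion; MYTABLE is a hypothetical table name */
libname u8 './data';

/* Encoding should now be reported as utf-8, and character lengths reflect the CVP padding */
proc contents data=u8.mytable;
run;
```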
If you’d like to know more about the CVP engine and encoding, some useful references are:
- SAS® 9.4 National Language Support (NLS): Reference Guide, Second Edition, section “Avoiding Character Data Truncation By Using the CVP Engine”
- SUGI28 Paper 281-28 Multi-Lingual Computing with the SAS® 9.1 Unicode Server, Stephen Beatrous, SAS Institute Cary, NC http://www2.sas.com/proceedings/sugi28/281-28.pdf
Have you come across any encoding issues that you’ve unriddled? If so, how? Please share in the comments below.