SAS Administration: Fetch Big Data Faster Across ODBC
On my current “big data” project, I’m connecting SAS to PostgreSQL using the ODBC access engine. At first I was seeing pretty bad performance – specifically I wasn’t able to view the ~2.2 billion sample records I created. After adding 2 simple lines to the odbc.ini configuration file I was able to fly through the data.
By default, ODBC data sources (at least in Linux) may not have the fetch parameter defined in the odbc.ini configuration file depending on how the ODBC connection was originally setup. This can severely impact how efficient you can read high volumes of data because the ODBC connection will try to access the entire table at once. Setting the fetch parameter tells the ODBC driver to retrieve a certain number of records at a time in memory, rather than all at once. This is similar to setting the READBUFF parameter as I previously mentioned.
Enable Fetching with ODBC Sources
Find your odbc.ini configuration file. Mine is located at /etc/odbc.ini. Add the following 2 lines to your specific DSN you wish to enable fetching.
In the end, this is what my odbc.ini configuration looks like today:
Also note, here is an error that can potentially be fixed using this configuration tweak: http://support.sas.com/kb/41/369.html
Let me know your ideas!