From cURL to Proc HTTP
Sending HTTP requests is vital functionality that helps you connect to more services and integrate other tools.
Most API documentation provides example calls in cURL. cURL stands for Client for URLs and is a command line tool to call URLs and receive a response. It has been around for almost 25 years and is available on basically any shell.
Now the question becomes how to translate a cURL command to the language of your choosing. For many languages this is available through tools like Postman.
SAS has a great way to call APIs - Proc HTTP, which has been around for quite some time (it was introduced back in SAS 9.2). But I’m not aware of any tool that can directly translate cURL to Proc HTTP and that’s a shame.
Of course you could argue why not use the x command or %sysexec to directly run the cURL command. My main reason is that in most SAS environments the execution of system commands is not allowed and for good reasons. The other question might be why not use Proc Python and to that I have two thoughts:
Not everybody will have access to a Python runtime
I wanted to learn more about Proc HTTP and find it to be very powerful
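/* Example of an x command using curl */
x 'curl https://api.coindesk.com/v1/bpi/currentprice.json';
/* Example of %sysexec running curl - it is a macro statement, so no data step is needed */
/* Block comments are used here because the macro processor would execute %sysexec inside a * comment */
%sysexec curl https://api.coindesk.com/v1/bpi/currentprice.json;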
So I thought why shouldn’t I build that? And I will document my journey in this post.
Digging into cURL
To get started I went to the cURL manual page to get a general lay of the land of what awaits me. If you take a look there are a lot of options (well over 200) that can be specified, but the basic structure is very easy:
curl -option1 value1 --option2 value2 … URL
First you have of course the command curl itself, then you can have a bunch of options & their associated value and then finally you have a URL.
cURL also supports a lot of different protocols, everything from HTTP/HTTPS to TELNET and SMTP.
Setting a scope
If I didn’t limit the scope I would have to spend an insane amount of time trying to implement and test all of the different features that cURL provides. So I have to create some restrictions to make this project manageable.
The first restriction concerns the protocol, and this one is easy: all I am currently interested in is http/https, so I will only translate these types of commands. This already reduces the number of protocols from 17 to 1, which is great. It also means I will not be supporting the CONNECT HTTP request method.
Next I will only support a subset of all of the available options. Some options like -l are automatically dropped because they are not supported by the http/https protocol. We can eliminate:
2 options that are only available for FTP/SFTP
14 options that are only available for FTP
1 option that is only available for FTP/POP3
2 options that are only available for IMAP/POP3/SMTP
3 options that are only available for SMTP
1 option that is only available for FTP/IMAP/POP3/SMTP
1 option that is only available for TELNET
This list is based on the protocols in brackets included in the manual page of curl.
I will further subset this list based on common use cases that I have encountered. So I will have to give a notification to users if I do not support an option.
Output handling is also something I will not be doing to completion - I want to handle JSON responses directly but for everything else only offer a basic outline.
Making a Proc HTTP request
The SAS documentation for Proc HTTP comes with a lot of examples - which I will make use of here as well - and explanations; it is a great resource. So let's take the cURL command we had earlier and put it into Proc HTTP:
proc http
url='https://api.coindesk.com/v1/bpi/currentprice.json';
run;
That is pretty straightforward: we basically just move the URL from the cURL command into the procedure option url. Note that the URL needs to be inside quotes, which is actually also a best practice for cURL. Of course everything is enclosed in a proc http and a run;.
Running this command is a bit different compared to cURL, though. This Proc HTTP will just give us a note in the log saying 200 OK - a reference to the HTTP status code that was returned - but we do not see the actual content of the response.
For us to handle the output we have to store it in a file and then read it. As a first step we are going to use a very simplistic approach: the infile statement in a data step:
* Create a temporary file;
filename outResp temp;
* Write the response to our filename reference;
proc http
url='https://api.coindesk.com/v1/bpi/currentprice.json'
out=outResp;
run;
* Read in the data of the response;
data work.outResp;
infile outResp;
input message $;
run;
* Deassign the filename reference;
filename outResp clear;
This will not get us the complete response (we will handle that later) but we can at least take a peek. Before we dive deeper into this let us first take a look at error handling.
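As a side note on why the response is cut off: list input with a bare $ reads only the first 8 characters of the first space-delimited word on each line. The following is a minimal sketch of reading whole lines instead. To keep it runnable without a network call, it writes a made-up JSON text to a temporary file as a stand-in for the response; the names demoResp and message and the sample content are assumptions for illustration only.

```sas
/* Stand-in for a response: write a small, made-up JSON text to a temporary file */
filename demoResp temp;
data _null_;
  file demoResp;
  put '{"chartName":"Bitcoin","disclaimer":"sample text only"}';
run;

/* Read each line in full instead of only the first 8 characters */
data work.demoResp;
  infile demoResp lrecl=32767 truncover;
  input message $char32767.;
run;

/* Deassign the filename reference */
filename demoResp clear;
```

The truncover option keeps lines that are shorter than the informat width instead of moving on to the next line.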
HTTP Status Code handling
Whenever you run Proc HTTP two macro variables are created:
SYS_PROCHTTP_STATUS_CODE - which contains the actual HTTP Status Code response from the called site/API - in our previous example this was 200
SYS_PROCHTTP_STATUS_PHRASE - which contains the translation of the HTTP Status Code into a more human understandable form - for 200 that would be OK
Now we also know how the log message was generated. In general we can group HTTP status codes into two classes:
Good responses - any code below 400
Bad responses - any code equal to or greater than 400
Unfortunately SAS will always write the response as a note no matter the value - try this as an example:
proc http
url='https://www.davidweik.org/testing';
run;
The log will tell us NOTE: 404 Not Found - this should be an error, so let's add some status code post-processing to enhance the log and make it easier to use:
* HTTP Status Handling;
data _null_;
if &SYS_PROCHTTP_STATUS_CODE. lt 400 then do;
put 'NOTE: The request was most likely successful.';
put "NOTE: The HTTP Status Code is &SYS_PROCHTTP_STATUS_CODE..";
put "NOTE: That means: &SYS_PROCHTTP_STATUS_PHRASE..";
end;
else do;
put 'ERROR: The request was most likely not successful.';
put "ERROR: The HTTP Status Code is &SYS_PROCHTTP_STATUS_CODE..";
put "ERROR: That means: &SYS_PROCHTTP_STATUS_PHRASE..";
end;
run;
The messages are a bit on the conservative side but they are already much better in my opinion.
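If the same check is needed after several requests, it could be wrapped in a small macro. This is only a sketch: the macro name checkHttpStatus and the flag httpRequestOk are hypothetical names that do not appear in the converter, and the flag is simply one possible way to let later code react to a failed call.

```sas
/* Hypothetical helper: repeats the status check from above and sets a flag */
%macro checkHttpStatus;
  /* httpRequestOk is an assumed flag for downstream code: 1 = good, 0 = bad */
  %global httpRequestOk;
  %let httpRequestOk = 0;
  /* Nothing to check if the status macro variable does not exist */
  %if not %symexist(SYS_PROCHTTP_STATUS_CODE) %then %do;
    %put WARNING: SYS_PROCHTTP_STATUS_CODE is not defined, so there is nothing to check.;
    %return;
  %end;
  %if &SYS_PROCHTTP_STATUS_CODE. lt 400 %then %do;
    %let httpRequestOk = 1;
    %put NOTE: The request was most likely successful (&SYS_PROCHTTP_STATUS_CODE. &SYS_PROCHTTP_STATUS_PHRASE.).;
  %end;
  %else %put ERROR: The request was most likely not successful (&SYS_PROCHTTP_STATUS_CODE. &SYS_PROCHTTP_STATUS_PHRASE.).;
%mend checkHttpStatus;
```

After any Proc HTTP step a call to %checkHttpStatus; would then write the note or error, and httpRequestOk could be used to skip the response handling when the call failed.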
Baseline Parser
Now that we have created a baseline we can go ahead and start writing a parser for cURL commands. My first idea was to count the spaces in the curl command and split up the string based on that:
%let cURLString = curl https://api.coindesk.com/v1/bpi/currentprice.json;
data work.curlString;
length curlComponent $32000.;
arguments = countw("&cURLString.", ' ');
do i = 1 to arguments;
curlComponent = scan("&cURLString.", i, ' ');
output;
end;
drop arguments i;
run;
This will give us the following table:
curlComponent |
---|
curl |
https://api.coindesk.com/v1/bpi/currentprice.json |
Next we will have to start parsing each component as they come in and give them tags - the following tags come to mind:
Options - identifiable by two or a single dash (-)
Option value - The next string after an option
URL - identifiable by starting with http(s)://
Method - by convention written as capitalized strings - the GET method can be omitted (it is the default), and the same is true for Proc HTTP
We should also remove the curl line from the output. Multiline cURL commands are written with a \ which should be removed as well:
data work.curlString;
length curlComponent $32000. tag $32.;
arguments = countw("&cURLString.", ' ');
do i = 1 to arguments;
* Reset Tag value;
tag = '';
curlComponent = compress(scan("&cURLString.", i, ' '));
* Tag assignment;
if curlComponent in ('-G', '--get', 'GET', 'POST', 'HEAD', 'PUT', 'DELETE') then tag = 'method';
else if prxmatch('/https{0,1}:\/\//', curlComponent) ne 0 then tag = 'url';
else if prxmatch('/^--{0,1}/', curlComponent) ne 0 then tag = 'options';
else tag = 'value';
* Remove meaningless content;
if curlComponent = 'curl' then;
else if curlComponent = '\' then;
else if curlComponent in ('-X', '--request') then;
else output;
end;
drop arguments i;
run;
While writing the code for this I realized that I can also drop the -X & --request option as their value will be handled by the row that has the tag method assigned. Now with our basic parser in place we have to think about the output that our user can use.
Generating Baseline Output
The goal is for the user to be able to copy paste the code and then run it. For this I want the output to appear in the SAS Results. My first instinct was to go with Proc Print but then I would have to deal with a column heading so Proc Report looks like a cleaner choice.
But first let's write the code that writes the code for the output. I will be using data steps in which we write SAS code to a table:
* Start of the Proc HTTP;
data work.procHTTPCodeStart;
length outputCode $32000.;
outputCode = 'proc http';
run;
* Procedure options for Proc HTTP;
data work.procHTTPCode;
length outputCode $32000.;
set work.curlString;
if tag = 'url' then outputCode = "url='" || trim(curlComponent) || "';";
run;
* End of the Proc HTTP;
data work.procHTTPCodeEnd;
length outputCode $32000.;
outputCode = 'run;';
run;
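* Combine the three code tables into the table that gets printed;
data work.allCode;
set work.procHTTPCodeStart work.procHTTPCode(where=(outputCode ne '')) work.procHTTPCodeEnd;
keep outputCode;
run;
* Print the code;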
ods listing close;
ods html5;
proc report data=work.allCode noheader;
run;
title;
ods html5 close;
This produces the following output:
proc http
url='https://api.coindesk.com/v1/bpi/currentprice.json';
run;
This isn’t the prettiest code but by adding some titles we can make the user aware of the auto formatting capabilities of SAS Studio:
title1 'cURL to Proc HTTP conversion';
title2 'Just copy and paste the code below';
title3 'Hit CTRL+SHIFT+B or use the Format code button to make it look nice';
After all of this we now have a basic functioning prototype. We will build on top of it to translate more cURL options into Proc HTTP and to add our error code handling to the output.
Adding more cURL Option Parsing
Let’s start off by actually supporting more methods than just GET:
if tag = 'method' then outputCode = "method='" || trim(curlComponent) || "'";
There are two debug options in cURL: --trace (all information) and -v/--verbose, which has a lot less information. Debugging in Proc HTTP is done by setting a debug level. This is its own statement in Proc HTTP, so we will have to improve our parser to record whether a line is a procedural option or its own statement:
contentType = 'option';
if curlComponent in ('-v', '--verbose', '--trace') then contentType = 'statement';
Now we have introduced a filter to distinguish between procedural options and procedural statements. Let's add the output generator and implement a where condition to only select the relevant output based on our new filter:
* Procedure statements for Proc HTTP;
data work.procHTTPStmnt;
length outputCode $32000.;
set work.curlString(where=((contentType='statement')));
if curlComponent = '-v' then outputCode = 'debug level = 1;';
else if curlComponent = '--verbose' then outputCode = 'debug level = 2;';
else if curlComponent = '--trace' then outputCode = 'debug level = 3;';
run;
I have chosen to map the different cURL options to the 3 different Proc HTTP debugging levels. I think using the highest level of debugging for --trace is logical - distinguishing between -v and --verbose is a bit arbitrary but it does introduce more control for the end user. To find out more about the debug levels please refer to the SAS documentation.
Now we have one more topic: if you specify the trace option in cURL it is followed by a file name. Currently that would print an empty line to the Results - that isn’t too bad but I would like to improve this. So we will have to head back to our parser and filter out the row after a trace. My initial thought is to introduce a variable that is just a flip for outputting or not:
if curlComponent in ('--trace') then skipFlip = 1;
else if skipFlip = 1 and curlComponent not in ('--trace') then skipFlip = 0;
This removes the empty lines nicely from our output. While looking at the output I also went ahead and added some empty lines to separate blocks of code so that now the following cURL command:
curl --trace ./test/texting.txt https://api.coindesk.com/v1/bpi/currentprice.json
Produces this output:
This is already looking great in my opinion. I have three more things in mind I want to handle because they are very common with cURL:
Using a Proxy
Adding Headers
Sending Data
Let's go in that order (yes, they are in order of complexity). Adding proxy support is straightforward: there is the -x/--proxy option followed by the proxy URL - this needs to be moved into the procedural option proxyhost in Proc HTTP. The parser is already pretty well set up for this; the issue is that we have an additional URL in our cURL command that would be tagged as a URL, and that would create multiple url options. So we will introduce a check for proxy URLs and if we find one we will overwrite the tag to say proxy instead:
else if curlComponent in ('-x', '--proxy') then proxyURL = 1;
if proxyURL = 1 and tag = 'url' then do; tag = 'proxy'; proxyURL = 0; end;
And adding it to the output:
else if tag = 'proxy' then outputCode = "proxyhost='" || trim(curlComponent) || "'";
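One thing to keep in mind: cURL proxy URLs often carry a port, and Proc HTTP has a separate proxyport option next to proxyhost. As a minimal sketch, assuming proxyhost should receive only the host name (an assumption worth checking against the SAS documentation for your release), the proxy URL could be split like this. The proxy address and the names demoProxy, hostPort, hostPart and portPart are made up for illustration and are not part of the converter:

```sas
data work.demoProxy;
  length proxyURL hostPort hostPart $200 portPart $5 outputCode $300;
  /* Hypothetical proxy URL as it might follow -x/--proxy */
  proxyURL = 'http://proxy.example.com:8080';
  /* Drop a leading scheme such as http:// if there is one */
  hostPort = prxchange('s/^[a-z][a-z0-9+.-]*:\/\///i', 1, trim(proxyURL));
  /* Split the remainder into host and port at the colon */
  hostPart = scan(hostPort, 1, ':');
  portPart = scan(hostPort, 2, ':');
  /* Build the procedural options, adding proxyport only if a port was given */
  outputCode = "proxyhost='" || trim(hostPart) || "'";
  if portPart ne '' then outputCode = catx(' ', outputCode, 'proxyport=' || trim(portPart));
  keep proxyURL outputCode;
run;
```

For the sample URL this builds proxyhost='proxy.example.com' proxyport=8080. Proxy URLs that contain user names, passwords or IPv6 addresses would need more care than this split provides.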
One down, next up is adding headers. Here I foresee quite some trouble because this is the default structure for headers in a cURL command:
curl -H "X-First-Name: David" https://example.com
The option itself is easy, it is -H/--header, but you can already see that our current parser would break up the header values into multiple rows. Headers in Proc HTTP are a procedural statement and I consider it best practice to only have one and then just stack the headers after each other, but because headers in cURL can occur all over the place it is easier to just create multiple headers statements (which is supported). Finally, the cURL header key and value are separated by a colon while in SAS it is an equal sign, so that is one more thing to keep in mind. We will use the following cURL command as our test case:
curl 'http://httpbin.org/headers' --header 'First-Header: Hello World' --header 'Second-Header: Hello David'
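To make the splitting problem concrete, here is a small sketch that splits a shortened version of this test command twice: once with the plain scan call used so far, and once with the q modifier of countw and scan, which ignores delimiters inside quoted substrings. The q modifier is shown only as a possible alternative, not as what the converter does; the names demoSplit and mode are made up for this illustration.

```sas
data work.demoSplit;
  length mode $6 curlComponent $100;
  /* Shortened version of the test command with a single header */
  curlString = "curl 'http://httpbin.org/headers' --header 'First-Header: Hello World'";
  /* Plain split: every space starts a new component */
  mode = 'plain';
  do i = 1 to countw(curlString, ' ');
    curlComponent = scan(curlString, i, ' ');
    output;
  end;
  /* Split with the q modifier: spaces inside quotes are not treated as delimiters */
  mode = 'quoted';
  do i = 1 to countw(curlString, ' ', 'q');
    curlComponent = scan(curlString, i, ' ', 'q');
    output;
  end;
  keep mode curlComponent;
run;
```

The plain split should break the header into three rows ('First-Header:, Hello and World'), while the rows with mode quoted should keep 'First-Header: Hello World' together. The rest of this post sticks with the plain split and stitches the pieces back together during output generation.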
Assigning headers a tag is straightforward. I made the following additions to the parser:
else if curlComponent in ('-H', '--header') then headerFlip = 1;
if headerFlip = 1 and tag ne '' then tag = 'header';
else if headerFlip = 1 and tag eq '' then;
This now means we will have each component of the header tagged as such. In the output generation we have to put everything back together. For this we need to retain values across rows as long as the end of a header segment isn’t reached. The start and end of a header are indicated by a quote (single or double). At the start of the header we add our structure for the headers statement in Proc HTTP. Then we concatenate the values of subsequent rows until we find a quote at the end of a string, at which point we close out our headers statement and output the code to the table:
* Handle headers;
if substr(curlComponent, 1, 1) in ("'", '"') then do;
retainFlip = 1;
tmpHeader = "headers '" || trim(dequote(tranwrd(curlComponent, ':', ''))) || "' = '";
end;
else if prxmatch('/"$/', trim(curlComponent)) ne 0 or prxmatch("/'$/", trim(curlComponent)) ne 0 then do;
retainFlip = 0;
outputCode = catx(' ', trim(tmpHeader), left(reverse(dequote(left(reverse(curlComponent)))))) || "';";
end;
else if tag = 'header' then do;
tmpHeader = catx(' ', trim(tmpHeader), trim(curlComponent));
end;
else retainFlip = 0;
A really horrible line in this code is the following:
left(reverse(dequote(left(reverse(curlComponent)))))
This is just to get rid of the quote at the end of the string. I’m pretty certain that there is a better way of doing this, but the dequote function didn’t work on the non-reversed string so I got creative - if you have a better way of dealing with this please let me know. Let's do a quick Result check:
Two done, one final one to go - sending data. For sending data there are a lot of cURL options:
-F/--form: this represents form data with the pattern key = value; this can be passed directly into the procedural option in
--data-raw: this can be any type of input; we have only one way of handling this, by passing it directly to in
--data-binary: there is no way of passing binary content directly. It can be done through a file - we just need to notify the user that the file needs to be assigned to a filename
--data-urlencode: this can be directly passed into in
-T/--upload-file: this can be passed as a filename into in; the user needs a notification similar to --data-binary
-d/--data: this is a very problematic one as it is unspecific and could be any of the above - the user needs to be notified about possible issues with files
Luckily the in option handles all of these use cases very well. But similar to the headers we will have to deal with the splitting of the content, and we need a way to replace double quotes in the string in general - this will lead to weird strings but that can be handled. For replacing double quotes with single quotes I found this almost perfect answer in the SAS Community - just a bit of work and voilà:
%let cURLString = %sysfunc(translate(%superq(cURLString),%str(%'),%str(%")));
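The argument order of translate is easy to get backwards: the replacement character comes before the character being replaced. A minimal data step sketch of the same replacement, with a made-up cURL string and the illustrative names demoQuotes, before and after:

```sas
data work.demoQuotes;
  length before after $100;
  /* Made-up command containing double quotes */
  before = 'curl -d "name=David" https://example.com';
  /* translate(source, to, from): every double quote becomes a single quote */
  after = translate(before, "'", '"');
run;
```

The macro version above does the same thing, with %str(%') and %str(%") protecting the unmatched quote characters from the macro processor.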
We need to adjust the parser:
else if curlComponent in ('-F', '--form', '--data-raw', '--data-urlencode', '--data-binary', '-T', '--upload-file', '-d', '--data') then dataFlip = 1;
if dataFlip = 1 and tag ne '' then tag = 'data';
else if dataFlip = 1 and tag eq '' then;
And to warn the user about the filename the following was added:
* Data Warning to remind user of changing to filename;
if curlComponent in ('--data-binary', '-T', '--upload-file', '-d', '--data') then do;
put 'WARNING: You are sending data that might use a file as an input.';
put 'WARNING: In SAS you need to change this to a filename and use the file reference in the in option.';
end;
Now moving on to the Output generation:
* Handle data;
if tag = 'data' and retainFlip in (., 0) then do;
retainFlip = 1;
tmpData = trim(curlComponent);
end;
else if tag ne 'data' then do;
retainFlip = 0;
end;
else tmpData = catx(' ', trim(tmpData), trim(curlComponent));
* Write the in option once the data segment has ended or the last row is reached;
* This assumes retainFlip and tmpData are retained and eof comes from end= on the set statement;
if (retainFlip = 0 or eof) and tmpData ne '' then do;
outputCode = 'in="' || trim(substr(trim(tmpData), 2, length(trim(tmpData)) - 2)) || '"';
output;
* Reset so the same data is not written again on later rows;
tmpData = '';
end;
Some additional comforts
When you call a SAS Viya API from a SAS session on the same SAS Viya host then you can make use of the special procedural option called oauth_bearer = sas_services. That makes code a bit neater and transportable between environments. I will not be removing any authentication headers when this option is set just adding this option.
* End of Procedure options for Proc HTTP;
data work.procHTTPOptEnd;
length outputCode $32000.;
* Check for SAS authentication;
if &sasAuthentication. then do;
outputCode = 'oauth_bearer = sas_services';
output;
end;
outputCode = 'out=outResp;';
output;
run;
Finally let's add some output options so that the user can directly work with the results - the following four options come to mind:
ignore - well it should still be possible to ignore the response
print to log - helpful if you are unsure of the response, it will just be printed to the log
save to table - saving the response to a table - users should be able to specify the table
create json lib - create a library with the json engine - here I will reuse the outResp
* Handle the response;
data work._outputOption;
length outputCode $32000.;
if &outputOption. = 0 then output;
else if &outputOption. = 1 then do;
output;
outputCode = '* Print response to Log;';
output;
outputCode = 'data _null_;';
output;
outputCode = 'infile outResp;';
output;
outputCode = 'input;';
output;
outputCode = 'put _infile_;';
output;
outputCode = 'run;';
output;
end;
else if &outputOption. = 2 then do;
output;
outputCode = '* Save response to a Table;';
output;
* Double quotes so the table name macro variable resolves while the code is generated;
outputCode = "data &tableName.;";
output;
outputCode = 'infile outResp;';
output;
outputCode = 'input;';
output;
outputCode = 'responseLine = _infile_;';
output;
outputCode = 'run;';
output;
end;
else if &outputOption. = 3 then do;
output;
outputCode = '* Create a JSON library from response;';
output;
outputCode = 'libname outResp json;';
output;
end;
run;
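Option 3 only assigns the library, so here is a hedged sketch of what a user could do next. To keep it runnable without a network call it uses a made-up JSON text in a temporary file instead of a real response; the names demoJson and demoRoot and the sample content are assumptions for illustration.

```sas
/* Stand-in for a response: a small, made-up JSON text */
filename demoJson temp;
data _null_;
  file demoJson;
  put '{"chartName":"Bitcoin","time":{"updated":"sample"}}';
run;

/* Without an explicit fileref the JSON engine uses the fileref named like the libref */
libname demoJson json;

/* Top-level values are placed in the ROOT table - copy it to work for later use */
data work.demoRoot;
  set demoJson.root;
run;

/* Clean up the library and the file reference */
libname demoJson clear;
filename demoJson clear;
```

Nested objects such as time end up in their own tables inside the library, so running proc datasets on the library is a quick way to see what the engine created.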
And that is it - a cURL to Proc HTTP converter. A bit over 300 lines of code that generate Proc HTTP and helpful code to deal with errors and results.
Conclusion
This was quite the journey and it took me way longer than I had expected. I hope this blog post is somewhat coherent. I learned a lot of new things about the cURL command and a lot about parsing and handling special cases in SAS.
From a SAS code perspective the main nuggets I learned are around replacing quotes inside of a macro variable through this snippet:
%sysfunc(translate(%superq(cURLString),%str(%'),%str(%")));
This is super cool and will for sure come in handy in the future. The second is a reminder of just how powerful and flexible SAS is as a tool: if I can think of it, I can probably build it. And lastly I had a ton of fun exploring the possibilities of Proc HTTP and comparing it to cURL.
Finally to make it accessible I have published my code, which should work on 9.4M3+ and SAS Viya in this GitHub repository: Convert cURL to Proc HTTP.sas
And I will also be creating a custom step and publish it on the SAS Software GitHub Custom Step repository.
Additional resources and special shout-outs
There have also been quite a number of SAS blogs, papers and questions around this topic. I’ll try to give credit but sometimes it is hard to keep track of everything. That is why I want to first give a shout-out to three special resources that helped me a lot:
How to translate your cURL command into SAS code - one of the inspirations for this whole post
REST Just Got Easy with SAS and PROC HTTP - helped me to better work out how to send data
The ABCs of the PROC HTTP - a great introduction to Proc HTTP