An Application Programming Interface (API) is a set of rules which enable programs to communicate with one another effectively. Developers of many websites incorporate APIs into their servers in order to enable clients to request data using a set of rules that create a specific URL. The generated URL sends a request to the server via
GET or a similar HTTP verb, and if the request is valid, the data is returned in a response – often packaged in JSON format.
When making a request to an API for a dataset, there are often limits to the maximum number of data points which may be returned in a single response. In a situation such as this, it is wise to automate requests to the API to minimize the manual work of entering repeated changes to the request URL. In my case, I was looking to build a dataset of historical trading data for arguably the hottest and most volatile assets in the world right now – Cryptocurrencies.
With over 1800 different cryptocurrencies and newcomers arriving daily and hourly, and price swings can push into the double digits, it is a monumental task to keep on top of price movements and relationships without some kind of computational help. For this reason, it’s a good call to create a database to stockpile as much detailed information as possible. There are a number of resources to obtain historical cryptocurrency data, however, for the purposes of this application I chose to use CryptoCompare for their broad range of public API services.
Programs can be written in almost any language to automate requesting data from APIs. In order to create this iterative cryptocurrency processor, some level of analytical knowledge is required to both communicate with the API and process the response data into a single location for further processing and analysis. Alteryx is a global leader in self-service data analytics, with a wide range of drag-and-drop tools, providing an easy to use platform for those with any level of programming experience to process and analyse data.
The first step in interacting with a web API is understanding the set of rules which determine which URLs are valid, making it easier to make a request. To do this, look for the API documentation section of the website and find the section for the data you wish to retrieve.
In this tutorial, we’re looking to extract trading data at the highest possible detail - a minutely basis. Shown below, Histominute are a list of valid parameters for this request and information on their default values.
It’s relatively straightforward to ascertain from examples the general structure which the request URL must use. The base URL string is as follows: https://min-api.cryptocompare.com/data/histominute?
https://min-api.cryptocompare.com/data/ is the root of the URL and
histominute? marks the category of data the query is requesting. The remainder of the URL string are the specifications of any parameters to be included. The strings relevant for this application are as follows:
Note that Unix Epoch timestamp is the number of seconds elapsed since January 1st 1970, midnight GMT. To find the current time in Unix, use:
Datetimediff(Datetimenow(), 1970-01-01, “Seconds”)
Each string is joined to the end of the URL to create something like:
The Response to the URL query above is 1440 minutes (24 hours) of trading data in JSON format.
Alteryx has several built-in tools which enable the user to communicate with web APIs and utilise the resulting data. In order to create a workflow which uses a request URL to download data we need to use 2 tools:
Formula toolto concatenate the segments of the URL together.
Download toolto connect to the formula-generated URL.
These tools output the JSON data in a single field named “DownloadData” or similar.
Now that we have a JSON file containing the desired data, we would like to transform the data from JSON format into something more practicable so that it can be used for analysis or visualisation. The first step is to parse the JSON data into individual rows for each data value, with one column containing the label of the value and one containing the value corresponding to the label. To do this, simply add a
JSON parse tool to the output of the Download tool and select the
DownloadData JSON field.
Next, add a
Text to Columns tool to the output of the
JSON Parse and select the
JSON_Name field as the field to parse. Ensure that the tool is configured to split the field into 3 columns using the full stop
. as the delimiter. The tool will parse
Data.0.time into 3 columns containing
time. A filter can then be used to remove all records which don’t contain
Data in the first parsed field.
Since all the data measures and dimensions are in a single column, we need to use a cross-tab tool to assign columns to each measure, grouping the data by cryptocurrency ticker, fiat ticker, iteration number and the number assigned to each data point. Ensure that the new column headers are set to the third JSON name column and the values for new columns is set to the JSON value string field.
After using a
select tool to remove unwanted columns, a
filter tool to remove non-data rows and a
sort tool to order the data by time and trading pair, we are left with a clean dataset of only relevant information ready for storage.
Now that we’ve got this table with 1440 minutes of data, you might notice that we still have 6 more days of data to retrieve. In order to obtain all 7 days of data, we need to make 6 further requests to the API and we could do that very efficiently using an iterative macro. The iterative macro does not need to conduct the transformation steps as described in the previous section since these are only required for the final output – instead, it will contain the minimum possible operations:
formulatool to generate the request URL.
numeric up/downuser input tool.
macro tool must also contain a macro input, as well as an output tool. Once the tools are added to the workflow, the file type is locked to macro causing lightning and “Q” icons to appear on top of the tools. In order to specifically set the macro to an iterative macro,click on the workflow sheet and look to the container on the left – under “Type” select “Iterative Macro” from the dropdown menu.
The first tool after the macro input is a
select tool, which allows us to force fields to be string values for concatenation. The next two formulas determine the toTs parameter and concatenate all the fields together to create the download URL for each iteration:
As the iteration number starts at 0, the first
toTs parameter value is equal to
TimeNow which was set at the initialisation of the workflow. As the macro iterates, the parameter will be reduced by 86400 seconds each loop. Putting all the elements together creates a workflow which should look something like the image below:
Iterative macros require two outputs in order to function properly; the
R Output is the results output which the macro will use once the maximum number of iteration has been reached. The
I Output is the iteration output, which is used to feed the required columns back to the input for the next iteration. The function of each output tool is configured using the Interface Designer accessible through the “View” menu.
Once the workflow is created, save it as a
.yxmc file in a designated macro folder. If you wish to hold your custom macros in a new folder, you can set paths to look for macros in Alteryx user settings.
Now that we have an iterative macro to automate the API requests, we can incorporate all the sections so far into a single workflow which outputs all the specified cryptocurrency data into a CSV file saved locally. The workflow shown below uses an
input data tool which means the user can manually enter cryptocurrency and fiat ticker symbols to output weekly data for any selection of trading pairs. This workflow has demonstrated this functionality by requesting both BTC/USD and ETH/USD trading pairs which outputs a total of 20,174 data records.
Just a quick note before we finish this tutorial...
Formulas and functions can be added to the workflow before the data output tool in order to create automatic calculations which could be considered useful in analysis or visualisation later. In this example below, a
formula tool was included to calculate the percentage change in price over each minute.
Hopefully this tutorial provided a good basis to understand how iterative macros work in Alteryx!