Querying The Adobe Analytics 2.0 Reporting API
A guide to requesting data from the Adobe Analytics 2.0 Reporting API, key differences from the 1.4 API, and example requests in Python
In the first post of this series, we detailed the steps required to create an integration in Adobe I/O:
Adobe I/O Integrations For Adobe Analytics 2.0 APIs
In post two, we used the details given by that integration to generate an auth token:
Adobe Analytics 2.0 API Authentication
And so, with our auth token and details in hand, we can finally interact with and retrieve data from the Adobe Analytics 2.0 APIs.
In this post we'll mostly be focusing on the /reports
endpoint - the single point to which we provide our reporting request, and from which the resulting data is returned. For those familiar with reporting using the Adobe Analytics 1.4 Reporting API, you'll be aware that the accepted method of retrieving data was to first queue your report by posting your request to:
https://api.omniture.com/admin/1.4/rest/?method=Report.Queue
Then subsequently, post the report ID given in that response to another endpoint:
https://api.omniture.com/admin/1.4/rest/?method=Report.Get
If enough time had passed for Adobe to process your request, the data would be returned. If not, we spammed the endpoint until it was ready.
One of the big differences in the 2.0 API is that we now only need a single request to retrieve data - it's given to us directly in the initial response. For those migrating their applications from 1.4 APIs, the limitations this introduces will present significant challenges; more on this later.
The final variable we need to authenticate with the /reports
endpoint is our globalCompanyId. Once we've found it, it can be hardcoded into our scripts - if you followed the final instruction in post two of the series you'll already know it. If not, the following Python (again using the super useful requests library) should be enough:
import requests

# token and apiKey are the access token and API key from the earlier posts in this series
discoveryHeader = {
'Authorization': 'Bearer ' + token,
'x-api-key': apiKey,
}
discoveryEndpoint = 'https://analytics.adobe.io/discovery/me'
discoveryResponse = requests.get(discoveryEndpoint, headers=discoveryHeader)
responseJSON = discoveryResponse.json()
# First company of the first IMS org in the response
globalCompanyID = responseJSON['imsOrgs'][0]['companies'][0]['globalCompanyId']
Take the value in the globalCompanyID
variable, write it down somewhere, and you don't have to run through this extra step each time your script executes.
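If the account behind the integration belongs to more than one organisation or company, taking index [0] may not pick the one we want. A minimal sketch for listing every candidate, assuming the /discovery/me response has the imsOrgs, companies and globalCompanyId nesting used above. The function name list_global_company_ids and the sample value are made up for illustration:

```python
def list_global_company_ids(discovery_json):
    """Return every globalCompanyId found in a /discovery/me response."""
    ids = []
    for org in discovery_json.get('imsOrgs', []):
        for company in org.get('companies', []):
            ids.append(company['globalCompanyId'])
    return ids


# Hypothetical response, trimmed to the keys used in this post
sample = {'imsOrgs': [{'companies': [{'globalCompanyId': 'examplecompany0'}]}]}
print(list_global_company_ids(sample))
```

With the real response we would pass discoveryResponse.json() instead of the sample dictionary.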
Authenticating with the API is now just a matter of a well-formatted header, as in this example:
header = {
'Authorization': 'Bearer ' + token,
'x-api-key': apiKey,
'x-gw-ims-org-id': orgID,
'Accept': 'application/json',
'x-proxy-global-company-id': globalCompanyID,
'Content-Type': 'application/json'
}
token
here being the access token generated in our previous step. With the header built we turn to the body of our request, where we define specifically the data we want returned. The bare minimum we need to provide is a report suite ID, date range, metric and dimension, which will look something like this:
body = {
'rsid': 'yourReportSuiteID',
'globalFilters': [{
'type': 'dateRange',
'dateRange': '2020-02-01T00:00:00/2020-02-05T23:59:59.999'
}],
'metricContainer': {
'metrics': [
{'id': 'metrics/pageviews'},
{'id': 'metrics/visits'}
]},
'dimension': 'variables/daterangeday',
'settings': {
'dimensionSort': 'asc'
}}
This simple request returns Page Views and Visits, by Day, from 1 Feb 2020 to 5 Feb 2020. For the sake of readability, a setting has also been included to ensure our response data is ordered by date, from oldest to newest.
Note the very specific format of date range we've used...
2020-02-01T00:00:00/2020-02-05T23:59:59.999
That is:
yyyy-MM-dd'T'HH:mm:ss/yyyy-MM-dd'T'HH:mm:ss
(DateTime from) / (DateTime to)
The API will return an error if we don't feed it the date range in this format, down to the minute. Seconds are optional in the format, but essential in our 'Date To' value: 23:59 (HH:mm) alone will be read as 23:59:00 (HH:mm:ss), and so would not include any hits in the final minute of the day, from 23:59:00 to 00:00:00.
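Typing these strings by hand is an easy way to lose that final minute. A small sketch of a helper that builds the range from two Python date objects, always ending at 23:59:59.999 on the last day. The name build_date_range is our own, not part of any Adobe library:

```python
from datetime import date


def build_date_range(first_day, last_day):
    """Build a dateRange string covering first_day to last_day inclusive."""
    if last_day < first_day:
        raise ValueError('last_day must not be before first_day')
    # Start at midnight, end at the last millisecond of the final day
    return f'{first_day.isoformat()}T00:00:00/{last_day.isoformat()}T23:59:59.999'


print(build_date_range(date(2020, 2, 1), date(2020, 2, 5)))
# 2020-02-01T00:00:00/2020-02-05T23:59:59.999
```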
Our request contains two metrics - Page Views and Visits. These are represented by
'metricContainer': {
'metrics': [
{'id': 'metrics/pageviews'},
{'id': 'metrics/visits'}
]}
Standard metrics in Adobe Analytics all follow a fairly generic ID format such as metrics/visitors
, metrics/event10
, metrics/evar7instances
. A full list of standard metrics can be returned from the /metrics
API endpoint, and calculated metrics from the /calculatedmetrics
endpoint, both of which we'll discuss later in this post.
Our request contains just a single dimension - Date, represented by
'dimension': 'variables/daterangeday'
Again a fairly standardised format - variables/daterangemonth
, variables/evar1
, variables/prop7
are some other examples. The /dimensions
API endpoint will return a list of useable variables, which we'll go into later.
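Since the body always has the same shape, it can be convenient to assemble it from a list of metric IDs and a dimension ID. A rough sketch, assuming only the fields shown in the example request above. build_report_body is a hypothetical helper name:

```python
def build_report_body(rsid, date_range, metric_ids, dimension):
    """Assemble a minimal /reports request body from its four ingredients."""
    return {
        'rsid': rsid,
        'globalFilters': [{'type': 'dateRange', 'dateRange': date_range}],
        # One {'id': ...} entry per metric, in the order we want them returned
        'metricContainer': {'metrics': [{'id': m} for m in metric_ids]},
        'dimension': dimension,
        'settings': {'dimensionSort': 'asc'},
    }


body = build_report_body(
    'yourReportSuiteID',
    '2020-02-01T00:00:00/2020-02-05T23:59:59.999',
    ['metrics/pageviews', 'metrics/visits'],
    'variables/daterangeday',
)
print(body['metricContainer'])
```

This produces the same dictionary as the hand-written example, which makes it easier to swap metrics and dimensions in and out.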
On to the request. Remember the globalCompanyID
from earlier? We're using it again - the full endpoint we send the request to is given by:
endpoint = 'https://analytics.adobe.io/api/' + globalCompanyID + '/reports'
And so (still using the Python requests library), our call to the API looks like this:
r = requests.post(endpoint, json = body, headers = header)
Viewing the JSON response - r.json()
then shows us full results of our query:
{
'totalPages': 1,
'firstPage': True,
'lastPage': True,
'numberOfElements': 5,
'number': 0,
'totalElements': 5,
'columns': {
'dimension': {
'id': 'variables/daterangeday',
'type': 'time'
},
'columnIds': ['500a0d35-2b1d-3b58-8452-71ffa46ffea8', '9eea17bd-ba96-4567-9f78-d2e37adc9556']
},
'rows': [{
'itemId': '1200101',
'value': 'Feb 1, 2020',
'data': [1322810.0, 301492.0]
}, {
'itemId': '1200102',
'value': 'Feb 2, 2020',
'data': [1403411.0, 329818.0]
}, {
'itemId': '1200103',
'value': 'Feb 3, 2020',
'data': [1059812.0, 230631.0]
}, {
'itemId': '1200104',
'value': 'Feb 4, 2020',
'data': [1002329.0, 224023.0]
}, {
'itemId': '1200105',
'value': 'Feb 5, 2020',
'data': [962907.0, 217958.0]
}
],
'summaryData': {
'filteredTotals': [5751269.0, 1286380.0],
'totals': [5751269.0, 1286380.0]
}
}
And there we have it, one very simple API request and response.
Our metrics per individual dimension item are given in the rows
array, with that dimension item - Feb 1, 2020 for example, shown in value
, and associated metrics in the data
array, ordered the same way they were defined in the request - [Page Views, Visits]. Metric totals, independent of dimension, are given in summaryData
.
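Because the data array carries no labels, it helps to pair each number with the metric it belongs to. A minimal sketch, assuming a response shaped like the one above. rows_to_records and the metric labels are our own choices:

```python
def rows_to_records(report_json, metric_names):
    """Turn each row into a dict keyed by the metric names we requested."""
    records = []
    for row in report_json['rows']:
        record = {'value': row['value']}
        # data is ordered the same way the metrics were defined in the request
        record.update(zip(metric_names, row['data']))
        records.append(record)
    return records


# One row copied from the response above
sample = {'rows': [{'itemId': '1200101', 'value': 'Feb 1, 2020',
                    'data': [1322810.0, 301492.0]}]}
records = rows_to_records(sample, ['pageviews', 'visits'])
print(records)
```

In a real script we would pass r.json() in place of sample.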
It's also worth pointing out the totalPages
, firstPage
and lastPage
attributes. The /reports
API response defaults to a limit of 50 items in rows
. If there are more items than this limit, totalPages
will be greater than 1, and we'll need to make further requests to the API to return the rest of our data. Within settings
in the body of our request, we can set the page number to return:
'settings': {'dimensionSort': 'asc','page':'1'}
Pages are indexed at 0, so 'page':'1'
actually represents the second page of our results.
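Looping over pages until lastPage is true might look something like the sketch below. To keep it independent of any particular HTTP setup, the function takes a caller-supplied post_report callable (a hypothetical name) that accepts a request body and returns the parsed JSON response. The page number is passed as a string, mirroring the settings example above:

```python
import copy


def fetch_all_rows(post_report, body):
    """Collect rows from every page of a /reports response."""
    rows = []
    page = 0
    while True:
        # Work on a copy so the caller's body is left untouched
        paged_body = copy.deepcopy(body)
        paged_body.setdefault('settings', {})['page'] = str(page)
        response = post_report(paged_body)
        rows.extend(response['rows'])
        if response['lastPage']:
            return rows
        page += 1
```

With the requests library, post_report could be something like lambda b: requests.post(endpoint, json=b, headers=header).json(), using the endpoint and header built earlier.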
An alternative to pagination is to set a higher limit in our initial request, again in settings
:
'settings': {'dimensionSort': 'asc','limit':'10000'}
For larger queries however, best practice is to split the request into multiple smaller requests, setting a sensible limit and then returning data one page at a time. For Analytics 2.0 APIs, the timeout is set at 60 seconds, whereas 1.4 APIs, being asynchronous, effectively had no timeout.
This means there's a good chance that a heavyweight query (for example, Unique Visitors by Minute over the past 3 years) will take longer than 60 seconds to process, in which case the API will respond with a 504 Gateway Timeout error.
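One way to split a heavyweight query is by date, requesting a few days at a time and stitching the results together afterwards. A sketch of the splitting step only, with split_days as a hypothetical helper and the chunk size as something to tune for your own data volumes:

```python
from datetime import date, timedelta


def split_days(first_day, last_day, chunk_days):
    """Split an inclusive day range into consecutive (start, end) chunks."""
    if chunk_days < 1:
        raise ValueError('chunk_days must be at least 1')
    chunks = []
    start = first_day
    while start <= last_day:
        # Never run past the overall end of the range
        end = min(start + timedelta(days=chunk_days - 1), last_day)
        chunks.append((start, end))
        start = end + timedelta(days=1)
    return chunks


for start, end in split_days(date(2020, 1, 1), date(2020, 1, 10), 4):
    # Each line is a dateRange value for one smaller request
    print(f'{start.isoformat()}T00:00:00/{end.isoformat()}T23:59:59.999')
```

Each printed string can be dropped into the dateRange filter of its own /reports request.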
The /dimensions
, /metrics
& /calculatedmetrics
API Endpoints
Using the same request header and base URL as above we can return a full list of available dimensions, metrics and calculated metrics for use in our API queries. Dimensions and standard metrics are given per report suite - we specify which one in the request. Our request to /dimensions
for example:
dimensionsEndpoint = 'https://analytics.adobe.io/api/myGlobalCompanyID/dimensions?rsid=myReportSuite'
r2 = requests.get(dimensionsEndpoint, headers = header)
Where we obviously replace myGlobalCompanyID
and myReportSuite
with actual values. r2.json()
then contains a long list of all our dimensions - IDs, titles and a bunch of other metadata. One such example:
{
'id': 'variables/browser',
'title': 'Browser',
'name': 'Browser',
'type': 'string',
'category': 'Audience',
'support': ['oberon', 'dataWarehouse'],
'pathable': False,
'segmentable': True,
'reportable': ['oberon'],
'supportsDataGovernance': True
}
For now we're only interested in the id
attribute - this is the value we specify in the body of our /reports
API query.
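To avoid scrolling through the full list, we can build a quick lookup from title to ID. A minimal sketch, assuming the response is a list of objects like the one above, each with id and title keys. ids_by_title is our own name:

```python
def ids_by_title(items):
    """Map each item's title to its id, e.g. 'Browser' -> 'variables/browser'."""
    return {item['title']: item['id'] for item in items}


# Trimmed copy of the example dimension above
sample = [{'id': 'variables/browser', 'title': 'Browser'}]
lookup = ids_by_title(sample)
print(lookup['Browser'])
# variables/browser
```

Note that if two items share a title, the later one wins in this sketch.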
The same request to the /metrics
endpoint then looks like this:
metricsEndpoint = 'https://analytics.adobe.io/api/myGlobalCompanyID/metrics?rsid=myReportSuite'
r3 = requests.get(metricsEndpoint, headers = header)
And r3.json()
shows data in a similar manner:
{
'id': 'metrics/checkouts',
'title': 'Checkouts',
'name': 'Checkouts',
'type': 'int',
'category': 'Conversion',
'support': ['oberon','dataWarehouse'],
'allocation': True,
'precision': 0,
'calculated': False,
'segmentable': True,
'supportsDataGovernance': True,
'polarity': 'positive'
}
The /calculatedmetrics
endpoint is slightly different - report suite ID is an optional parameter here, but if included can be a comma separated list of report suite IDs, using the rsids
querystring parameter, instead of rsid
. If omitted, calculated metrics from all report suites are returned.
calcMetricsEndpoint = 'https://analytics.adobe.io/api/myGlobalCompanyID/calculatedmetrics?includeType=shared&rsids=optionalReportSuite,optionalReportSuite2'
r4 = requests.get(calcMetricsEndpoint, headers = header)
includeType
is required on this request (otherwise nothing is returned), and can be set to all
or shared
to view all possible calculated metrics (created by anyone in your organisation), or only calculated metrics shared with the service account linked to our Adobe I/O integration. This is listed as the Technical account email in our integration in the Adobe I/O console, and will look something like
9fa11e0f-d3cb-4d7e-b28e-f07f4530c324@techacct.adobe.com
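Because rsids is optional, building the URL by hand can get fiddly. A small sketch using the standard library's urlencode, assuming only the includeType and rsids parameters described above. component_endpoint is a hypothetical helper name:

```python
from urllib.parse import urlencode


def component_endpoint(global_company_id, resource, include_type='shared', rsids=None):
    """Build a URL for a component endpoint such as calculatedmetrics."""
    params = {'includeType': include_type}
    if rsids:
        # Report suites go in as one comma separated value
        params['rsids'] = ','.join(rsids)
    # safe=',' keeps the commas readable instead of encoding them as %2C
    query = urlencode(params, safe=',')
    return f'https://analytics.adobe.io/api/{global_company_id}/{resource}?{query}'


print(component_endpoint('myGlobalCompanyID', 'calculatedmetrics',
                         rsids=['optionalReportSuite', 'optionalReportSuite2']))
```

Leaving rsids out simply drops that parameter from the query string.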
An individual calculated metric from this response looks like:
{
'id': 'cm1624_5aa1c148e390c23c99385156',
'name': 'Orders / Visit',
'description': 'Average orders per visit',
'rsid': 'testrsid',
'owner': {'id': 200026049},
'polarity': 'positive',
'precision': 1,
'type': 'decimal'
}
Notice the different format of ID here when compared to a standard metric - the initial metrics/...
is not required when using a calculated metric in your /reports
request.
While it's useful knowing how to interact with these API endpoints programmatically, in practice it's often simpler to use Adobe's Swagger UI, found at https://adobedocs.github.io/analytics-2.0-apis. For calculated metrics, the ID can also be pulled directly from Analytics - it's found at the end of the URL in the Calculated Metric Builder:
...#/components/calculatedMetrics/edit/cm1624_5aa1c148e390c23c99385156
Using Segments in Queries, and the /segments
Endpoint
Using our earlier /reports
query as a basis, we can set global segments by adding an item to globalFilters:
body = {
'rsid': 'yourReportSuiteID',
'globalFilters': [{
'type': 'dateRange',
'dateRange': '2020-02-01T00:00:00/2020-02-05T23:59:59.999'
}, {
'type': 'segment',
'segmentId': '93adb41be4b0a2a125bf38c4'
}],
'metricContainer': {
'metrics': [{
'id': 'metrics/pageviews'
}]},
'dimension': 'variables/daterangeday',
'settings': {
'dimensionSort': 'asc'
}}
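If we already have a working request body, the segment can be added programmatically rather than by editing the dictionary by hand. A minimal sketch, assuming the globalFilters structure shown above. add_segment is our own helper name:

```python
import copy


def add_segment(body, segment_id):
    """Return a copy of a /reports body with a segment added to globalFilters."""
    new_body = copy.deepcopy(body)
    new_body.setdefault('globalFilters', []).append(
        {'type': 'segment', 'segmentId': segment_id}
    )
    return new_body


base = {
    'rsid': 'yourReportSuiteID',
    'globalFilters': [{
        'type': 'dateRange',
        'dateRange': '2020-02-01T00:00:00/2020-02-05T23:59:59.999'
    }],
}
segmented = add_segment(base, '93adb41be4b0a2a125bf38c4')
print(segmented['globalFilters'])
```

Returning a copy means the same base body can be reused with different segments.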
Segment IDs can be returned from the /segments
endpoint, which behaves in a similar manner to /calculatedmetrics
- optional report suites and the required includeType
parameter:
segmentsEndpoint = 'https://analytics.adobe.io/api/myGlobalCompanyID/segments?includeType=shared&rsids=optionalReportSuite,optionalReportSuite2'
r5 = requests.get(segmentsEndpoint, headers = header)
But again, in practice it's likely to be easier to use the Swagger UI, or just to grab the ID from the Segment Builder URL in the Analytics UI:
...#/components/segments/edit/93adb41be4b0a2a125bf38c4
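If you're copying a lot of these URLs around, pulling the ID off the end is a one-liner. A sketch that assumes the ID is always the final path segment, as in the segment and calculated metric URLs shown in this post. id_from_builder_url is a made-up name, and the function does not handle any trailing query string:

```python
def id_from_builder_url(url):
    """Return the last path segment of a builder URL, which holds the component ID."""
    return url.rstrip('/').rsplit('/', 1)[-1]


print(id_from_builder_url('...#/components/segments/edit/93adb41be4b0a2a125bf38c4'))
# 93adb41be4b0a2a125bf38c4
```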
That's probably enough to digest for just one post. The next post in the series will focus on querying the API for multiple dimension breakdowns, for example Visits per Day per Device. The changes introduced in the 2.0 Reporting API make this a truly painful task in comparison to 1.4 requests, so prepare to have your day thoroughly ruined.
Part 1 - Adobe I/O Integrations For Adobe Analytics 2.0 APIs
Part 2 - Adobe Analytics 2.0 API Authentication
Part 3 - Querying The Adobe Analytics 2.0 Reporting API