Querying The Adobe Analytics 2.0 Reporting API

A guide to requesting data from the Adobe Analytics 2.0 Reporting API, key differences from the 1.4 API and example requests in Python

In the first post of this series, we detailed the steps required to create an integration in Adobe I/O:
Adobe I/O Integrations For Adobe Analytics 2.0 APIs

In post two, we used the details given by that integration to generate an auth token:
Adobe Analytics 2.0 API Authentication

And so, with our auth token and details in hand, we can finally interact with and retrieve data from the Adobe Analytics 2.0 APIs.

In this post we'll mostly be focusing on the /reports endpoint - the single point to which we send our reporting request, and from which the resulting data is returned. Those familiar with reporting using the Adobe Analytics 1.4 Reporting API will be aware that the accepted method of retrieving data was to first queue your report by posting your request to:

https://api.omniture.com/admin/1.4/rest/?method=Report.Queue

Then subsequently, post the report ID given in that response to another endpoint:

https://api.omniture.com/admin/1.4/rest/?method=Report.Get

If enough time had passed for Adobe to process your request, the data would be returned. If not, we spammed the endpoint until it was ready.

One of the big differences in the 2.0 API is that we now only need a single request to retrieve data - it's given to us directly in the initial response. For those migrating their applications from 1.4 APIs, the limitations this introduces will present significant challenges. More on this later.

The final variable we need to authenticate with the /reports endpoint is our globalCompanyId. Once we've found it, it can be hardcoded into our scripts - if you followed the final instruction in post two of the series you'll already know it. If not, the following Python (again using the super useful requests library) should be enough:

import requests

# token and apiKey come from the authentication step in post two
discoveryHeader = {
    'Authorization': 'Bearer ' + token,
    'x-api-key': apiKey,
}

discoveryEndpoint = 'https://analytics.adobe.io/discovery/me'

discoveryResponse = requests.get(discoveryEndpoint, headers = discoveryHeader)

responseJSON = discoveryResponse.json()

# Take the first company of the first organisation in the response
globalCompanyID = responseJSON['imsOrgs'][0]['companies'][0]['globalCompanyId']

Take the value in the globalCompanyID variable, write it down somewhere, and you won't have to run through this extra step each time your script executes.
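The lookup above takes the first company of the first organisation. If your account belongs to more than one, it may be safer to list everything the discovery response offers before hardcoding a value. A minimal sketch, assuming the response has the imsOrgs / companies / globalCompanyId shape indexed above. The list_company_ids helper, the imsOrgId key and the sample values are assumptions for illustration, not something confirmed in this series:

```python
def list_company_ids(discovery_json):
    """Return every (imsOrgId, globalCompanyId) pair in a discovery response."""
    pairs = []
    for org in discovery_json.get('imsOrgs', []):
        for company in org.get('companies', []):
            # imsOrgId is an assumed key; .get() returns None if it is absent
            pairs.append((org.get('imsOrgId'), company['globalCompanyId']))
    return pairs


# Hypothetical response, shaped like the structure indexed above
sample = {
    'imsOrgs': [{
        'imsOrgId': 'EXAMPLEORG@AdobeOrg',
        'companies': [
            {'globalCompanyId': 'examplecompany1'},
            {'globalCompanyId': 'examplecompany2'},
        ],
    }]
}

for org_id, company_id in list_company_ids(sample):
    print(org_id, company_id)
```

With a real response, list_company_ids(discoveryResponse.json()) would show whether the [0][0] shortcut is picking the company you expect.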

Authenticating with the API is now just a matter of a well-formatted header, as in this example:

header = {
    'Authorization': 'Bearer ' + token,
    'x-api-key': apiKey,
    'x-gw-ims-org-id': orgID,
    'Accept': 'application/json',
    'x-proxy-global-company-id': globalCompanyID,
    'Content-Type': 'application/json'
}

Here, token is the access token generated in post two. With the header built we turn to the body of our request, where we define specifically the data we want returned. The bare minimum we need to provide is a report suite ID, date range, metric and dimension, which will look something like this:

body = {
    'rsid': 'yourReportSuiteID',
    'globalFilters': [{
        'type': 'dateRange',
        'dateRange': '2020-02-01T00:00:00/2020-02-05T23:59:59.999'
    }],
    'metricContainer': {
        'metrics': [
            {'id': 'metrics/pageviews'},
            {'id': 'metrics/visits'}
        ]
    },
    'dimension': 'variables/daterangeday',
    'settings': {
        'dimensionSort': 'asc'
    }
}

This simple request returns Page Views and Visits, by Day, from 1st Feb 2020 to 5th Feb 2020. For the sake of readability, a setting has also been included to ensure our response data is ordered by date, from oldest to newest.

Note the very specific format of date range we've used...

2020-02-01T00:00:00/2020-02-05T23:59:59.999

That is:

yyyy-MM-dd'T'HH:mm:ss/yyyy-MM-dd'T'HH:mm:ss

(DateTime from) / (DateTime to)

The API will return an error if we don't feed it the date range in this format, down to the minute. Seconds are optional, but essential in our 'Date To' value - as 23:59 (HH:mm) alone will be read as 23:59:00 (HH:mm:ss) and so would not include any hits from 23:59:00 to 00:00:00.
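Typing these strings by hand is an easy way to introduce mistakes, so it may be worth generating them from date objects instead. A small sketch using only the standard library; build_date_range is a hypothetical helper name, and it always runs from midnight on the first day to 23:59:59.999 on the last, as in the example request:

```python
from datetime import date, datetime, time


def build_date_range(start, end):
    """Format two dates as 'from/to', covering the whole of the end date."""
    date_from = datetime.combine(start, time.min)
    # 23:59:59.999 so hits in the final minute of the end date are included
    date_to = datetime.combine(end, time(23, 59, 59, 999000))
    return (date_from.isoformat(timespec='seconds') + '/'
            + date_to.isoformat(timespec='milliseconds'))


print(build_date_range(date(2020, 2, 1), date(2020, 2, 5)))
# 2020-02-01T00:00:00/2020-02-05T23:59:59.999
```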

Our request contains two metrics - Page Views and Visits. These are represented by:

'metricContainer': {
    'metrics': [
        {'id': 'metrics/pageviews'},
        {'id': 'metrics/visits'}
    ]
}

Standard metrics in Adobe Analytics all follow a fairly generic ID format such as metrics/visitors, metrics/event10, metrics/evar7instances. A full list of standard metrics can be returned from the /metrics API endpoint, and calculated metrics from the /calculatedmetrics endpoint, both of which we'll discuss later in this post.

Our request contains just a single dimension - Date, represented by:

'dimension': 'variables/daterangeday'

Again a fairly standardised format - variables/daterangemonth, variables/evar1, variables/prop7 are some other examples. The /dimensions API endpoint will return a list of useable variables, which we'll go into later.
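Since the shape of the body stays the same from one simple report to the next, it could be wrapped in a small function. This is only a sketch of the minimal request described above (report suite, date range, metrics, one dimension); build_report_body and its parameter names are made up for illustration:

```python
def build_report_body(rsid, date_range, metric_ids, dimension_id):
    """Build a minimal /reports request body, sorted by dimension ascending."""
    return {
        'rsid': rsid,
        'globalFilters': [{'type': 'dateRange', 'dateRange': date_range}],
        # One {'id': ...} entry per metric, in the order given
        'metricContainer': {'metrics': [{'id': m} for m in metric_ids]},
        'dimension': dimension_id,
        'settings': {'dimensionSort': 'asc'},
    }


body = build_report_body(
    'yourReportSuiteID',
    '2020-02-01T00:00:00/2020-02-05T23:59:59.999',
    ['metrics/pageviews', 'metrics/visits'],
    'variables/daterangeday',
)
print(body)
```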

On to the request. Remember the globalCompanyID from earlier? We're using it again - the full endpoint we send the request to is given by:

endpoint = 'https://analytics.adobe.io/api/' + globalCompanyID + '/reports'

And so (still using the Python requests library), our call to the API looks like this:

r = requests.post(endpoint, json = body, headers = header)

Viewing the JSON response with r.json() then shows us the full results of our query:

{
    'totalPages': 1,
    'firstPage': True,
    'lastPage': True,
    'numberOfElements': 5,
    'number': 0,
    'totalElements': 5,
    'columns': {
        'dimension': {
            'id': 'variables/daterangeday',
            'type': 'time'
        },
        'columnIds': ['500a0d35-2b1d-3b58-8452-71ffa46ffea8', '9eea17bd-ba96-4567-9f78-d2e37adc9556']
    },
    'rows': [{
        'itemId': '1200101',
        'value': 'Feb 1, 2020',
        'data': [1322810.0, 301492.0]
    }, {
        'itemId': '1200102',
        'value': 'Feb 2, 2020',
        'data': [1403411.0, 329818.0]
    }, {
        'itemId': '1200103',
        'value': 'Feb 3, 2020',
        'data': [1059812.0, 230631.0]
    }, {
        'itemId': '1200104',
        'value': 'Feb 4, 2020',
        'data': [1002329.0, 224023.0]
    }, {
        'itemId': '1200105',
        'value': 'Feb 5, 2020',
        'data': [962907.0, 217958.0]
    }],
    'summaryData': {
        'filteredTotals': [5751269.0, 1286380.0],
        'totals': [5751269.0, 1286380.0]
    }
}

And there we have it, one very simple API request and response.

Our metrics per individual dimension item are given in the rows array, with the dimension item itself (Feb 1, 2020 for example) shown in value, and the associated metrics in the data array, ordered the same way they were defined in the request - [Page Views, Visits]. Metric totals, independent of dimension, are given in summaryData.
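Because data is just a positional list, it can help to pair each number with a metric name before doing anything else with it. A minimal sketch, assuming the response shape shown above; rows_to_records is a hypothetical helper, and the metric names are whatever labels you choose, passed in the same order as the metrics in the request:

```python
def rows_to_records(report_json, metric_names):
    """Flatten a /reports response into one dict per dimension item."""
    records = []
    for row in report_json['rows']:
        record = {'value': row['value']}
        # data is ordered the same way as the metrics in the request
        record.update(zip(metric_names, row['data']))
        records.append(record)
    return records


# First two rows of the response shown above
sample = {'rows': [
    {'itemId': '1200101', 'value': 'Feb 1, 2020', 'data': [1322810.0, 301492.0]},
    {'itemId': '1200102', 'value': 'Feb 2, 2020', 'data': [1403411.0, 329818.0]},
]}

for record in rows_to_records(sample, ['pageviews', 'visits']):
    print(record)
```

A list of dicts like this drops straight into csv.DictWriter or a pandas DataFrame if that's where the data is heading.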

It's also worth pointing out the totalPages, firstPage and lastPage attributes. The /reports API response defaults to a limit of 50 items in rows. If this limit is reached then totalPages will be greater than 1, and we'll need to make further requests to the API to return the rest of our data. Within settings in the body of our request, we can set the page number to return:

'settings': {'dimensionSort': 'asc','page':'1'}

Pages are zero-indexed, so 'page':'1' actually represents the second page of our results.
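Walking through the pages can be left to a loop that keeps requesting until the response reports lastPage. The following is a sketch rather than a tested client. fetch_all_rows and max_pages are hypothetical names, and the post argument is expected to be requests.post (or anything that behaves like it), which keeps the function easy to try out without real credentials:

```python
import copy


def fetch_all_rows(post, endpoint, body, header, max_pages=1000):
    """Collect rows from every page of a /reports response.

    `post` is any callable that behaves like requests.post.
    """
    rows = []
    for page in range(max_pages):
        paged_body = copy.deepcopy(body)  # leave the caller's body untouched
        # Page number as a string, matching the settings example above
        paged_body.setdefault('settings', {})['page'] = str(page)
        r = post(endpoint, json=paged_body, headers=header)
        r.raise_for_status()  # stop on any 4xx/5xx response
        data = r.json()
        rows.extend(data['rows'])
        if data['lastPage']:
            return rows
    raise RuntimeError('Gave up after ' + str(max_pages) + ' pages')


# Usage, with the endpoint, body and header built earlier:
# all_rows = fetch_all_rows(requests.post, endpoint, body, header)
```

The max_pages cap is only there as a safety net, so that an unexpected response can't leave the loop running forever.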

An alternative to pagination is to set a higher limit in our initial request, again in settings:

'settings': {'dimensionSort': 'asc','limit':'10000'}

For larger queries, however, best practice is to split the request into multiple smaller requests, setting a sensible limit and then returning data one page at a time. For Analytics 2.0 APIs the timeout is set at 60 seconds, whereas the 1.4 APIs, being asynchronous, effectively had no timeout.

This means there's a good chance that a heavyweight query (Unique Visitors by Minute over the past 3 years, for example) will take longer than 60 seconds to process, in which case the API will respond with a 504 Gateway Timeout error.
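One way of splitting a heavyweight query is by date, sending one request per chunk of days. A minimal sketch of the chunking step only; split_date_range and days_per_chunk are hypothetical names, and the right chunk size will depend on the query:

```python
from datetime import date, timedelta


def split_date_range(start, end, days_per_chunk):
    """Yield (chunk_start, chunk_end) date pairs covering start..end inclusive."""
    if days_per_chunk < 1:
        raise ValueError('days_per_chunk must be at least 1')
    chunk_start = start
    while chunk_start <= end:
        # The last chunk is cut short so it never runs past the end date
        chunk_end = min(chunk_start + timedelta(days=days_per_chunk - 1), end)
        yield chunk_start, chunk_end
        chunk_start = chunk_end + timedelta(days=1)


for chunk_start, chunk_end in split_date_range(date(2020, 2, 1), date(2020, 2, 5), 2):
    print(chunk_start, chunk_end)
```

Rows from each chunk can simply be appended together when the dimension is time-based. Totals are a different matter: deduplicated metrics such as Unique Visitors can't be summed across chunks, since the same visitor may appear in more than one.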


The /dimensions, /metrics & /calculatedmetrics API Endpoints

Using the same request header and base URL as above we can return a full list of available dimensions, metrics and calculated metrics for use in our API queries. Dimensions and standard metrics are given per report suite - we specify which one in the request. Our request to /dimensions for example:

dimensionsEndpoint = 'https://analytics.adobe.io/api/myGlobalCompanyID/dimensions?rsid=myReportSuite'

r2 = requests.get(dimensionsEndpoint, headers = header)

Where we obviously replace myGlobalCompanyID and myReportSuite with actual values. r2.json() then contains a long list of all our dimensions - IDs, titles and a bunch of other metadata. One such example:

{
    'id': 'variables/browser',
    'title': 'Browser',
    'name': 'Browser',
    'type': 'string',
    'category': 'Audience',
    'support': ['oberon', 'dataWarehouse'],
    'pathable': False,
    'segmentable': True,
    'reportable': ['oberon'],
    'supportsDataGovernance': True
}

For now we're only interested in the id attribute - this is the value we specify in the body of our /reports API query.
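With a long list to wade through, a quick search by title is often the fastest way to find the id you need. A small sketch, assuming each item carries the id and title attributes shown above; find_ids is a hypothetical helper and the sample list is a made-up, trimmed-down stand-in for the real response:

```python
def find_ids(items, search):
    """Return {id: title} for items whose title contains the search text."""
    search = search.lower()
    return {item['id']: item['title']
            for item in items
            if search in item['title'].lower()}


# Hypothetical, trimmed-down stand-in for the /dimensions response
sample = [
    {'id': 'variables/browser', 'title': 'Browser'},
    {'id': 'variables/browsertype', 'title': 'Browser Type'},
    {'id': 'variables/daterangeday', 'title': 'Day'},
]

print(find_ids(sample, 'browser'))
```

Against the live API this would be called as find_ids(r2.json(), 'browser').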

The same request to the /metrics endpoint then looks like this:

metricsEndpoint = 'https://analytics.adobe.io/api/myGlobalCompanyID/metrics?rsid=myReportSuite'

r3 = requests.get(metricsEndpoint, headers = header)

And r3.json() shows data in a similar manner:

{
    'id': 'metrics/checkouts',
    'title': 'Checkouts',
    'name': 'Checkouts',
    'type': 'int',
    'category': 'Conversion',
    'support': ['oberon', 'dataWarehouse'],
    'allocation': True,
    'precision': 0,
    'calculated': False,
    'segmentable': True,
    'supportsDataGovernance': True,
    'polarity': 'positive'
}

The /calculatedmetrics endpoint is slightly different - report suite ID is an optional parameter here. If included, it can be a comma-separated list of report suite IDs, passed in the rsids querystring parameter instead of rsid. If omitted, calculated metrics from all report suites are returned.

calcMetricsEndpoint = 'https://analytics.adobe.io/api/myGlobalCompanyID/calculatedmetrics?includeType=shared&rsids=optionalReportSuite,optionalReportSuite2'

r4 = requests.get(calcMetricsEndpoint, headers = header)
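Rather than typing the query string by hand, the URL above could be assembled from its parts. A rough sketch using only the standard library; build_endpoint and its parameter names are made up for illustration:

```python
from urllib.parse import urlencode

BASE_URL = 'https://analytics.adobe.io/api/'


def build_endpoint(global_company_id, resource, include_type=None, rsids=None):
    """Build an endpoint URL, adding includeType and rsids only when given."""
    params = {}
    if include_type:
        params['includeType'] = include_type
    if rsids:
        params['rsids'] = ','.join(rsids)
    url = BASE_URL + global_company_id + '/' + resource
    if params:
        # safe=',' keeps the commas between report suite IDs unescaped
        url += '?' + urlencode(params, safe=',')
    return url


print(build_endpoint('myGlobalCompanyID', 'calculatedmetrics',
                     include_type='shared',
                     rsids=['optionalReportSuite', 'optionalReportSuite2']))
```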

includeType is required on this request (otherwise nothing is returned). It can be set to all, to view every calculated metric created by anyone in your organisation, or to shared, to view only the calculated metrics shared with the service account linked to our Adobe I/O integration. This account is listed as the Technical account email in our integration in the Adobe I/O console, and will look something like:

9fa11e0f-d3cb-4d7e-b28e-f07f4530c324@techacct.adobe.com

An individual calculated metric from this response looks like:

{
    'id': 'cm1624_5aa1c148e390c23c99385156',
    'name': 'Orders / Visit',
    'description': 'Average orders per visit',
    'rsid': 'testrsid',
    'owner': {'id': 200026049},
    'polarity': 'positive',
    'precision': 1,
    'type': 'decimal'
}

Notice the different format of ID here when compared to a standard metric - the initial metrics/... is not required when using a calculated metric in your /reports request.
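Putting the two ID formats side by side, a metrics list mixing standard and calculated metrics might be built like this. A sketch only; build_metrics is a hypothetical helper, and the calculated metric ID is the example from the response above:

```python
def build_metrics(standard_names, calculated_ids):
    """Build the 'metrics' list for a /reports request body."""
    # Standard metrics take the metrics/ prefix
    entries = [{'id': 'metrics/' + name} for name in standard_names]
    # Calculated metric IDs are passed through exactly as returned
    entries += [{'id': cm_id} for cm_id in calculated_ids]
    return entries


metrics = build_metrics(['pageviews', 'visits'],
                        ['cm1624_5aa1c148e390c23c99385156'])
print(metrics)
```

The result slots into the request body as 'metricContainer': {'metrics': metrics}.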

While it's useful knowing how to interact with these API endpoints programmatically, in practice it's often simpler to use Adobe's Swagger UI, found at https://adobedocs.github.io/analytics-2.0-apis. For calculated metrics, the ID can also be pulled directly from Analytics - it's found at the end of the URL in the Calculated Metric Builder:

...#/components/calculatedMetrics/edit/cm1624_5aa1c148e390c23c99385156

Using Segments in Queries, and the /segments Endpoint

Using our earlier /reports query as a basis, we can set global segments by adding an item to globalFilters:

body = {
    'rsid': 'yourReportSuiteID',
    'globalFilters': [{
        'type': 'dateRange',
        'dateRange': '2020-02-01T00:00:00/2020-02-05T23:59:59.999'
    }, {
        'type': 'segment',
        'segmentId': '93adb41be4b0a2a125bf38c4'
    }],
    'metricContainer': {
        'metrics': [{
            'id': 'metrics/pageviews'
        }]
    },
    'dimension': 'variables/daterangeday',
    'settings': {
        'dimensionSort': 'asc'
    }
}
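If the same base query is going to be run against several segments, the extra filter can be appended in code rather than retyped. A minimal sketch; with_segment is a hypothetical helper, and it works on a copy so the base body can be reused:

```python
import copy


def with_segment(body, segment_id):
    """Return a copy of a /reports body with a segment added to globalFilters."""
    new_body = copy.deepcopy(body)  # the original body is left unchanged
    new_body.setdefault('globalFilters', []).append(
        {'type': 'segment', 'segmentId': segment_id})
    return new_body


base_body = {
    'rsid': 'yourReportSuiteID',
    'globalFilters': [{
        'type': 'dateRange',
        'dateRange': '2020-02-01T00:00:00/2020-02-05T23:59:59.999',
    }],
}

segmented = with_segment(base_body, '93adb41be4b0a2a125bf38c4')
print(segmented['globalFilters'])
```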

Segment IDs can be returned from the /segments endpoint, which behaves in a similar manner to /calculatedmetrics, with optional report suites and the required includeType parameter:

segmentsEndpoint = 'https://analytics.adobe.io/api/myGlobalCompanyID/segments?includeType=shared&rsids=optionalReportSuite,optionalReportSuite2'

r5 = requests.get(segmentsEndpoint, headers = header)

But again, in practice it's likely to be easier to use the Swagger UI, or to just grab the ID from the Segment Builder URL in the Analytics UI:

...#/components/segments/edit/93adb41be4b0a2a125bf38c4

That's probably enough to digest for just one post. The next post in the series will focus on querying the API for multiple dimension breakdowns, for example Visits per Day per Device. The changes introduced in the 2.0 Reporting API make this a truly painful task in comparison to 1.4 requests, so prepare to have your day thoroughly ruined.


Part 1 - Adobe I/O Integrations For Adobe Analytics 2.0 APIs

Part 2 - Adobe Analytics 2.0 API Authentication 

Part 3 - Querying The Adobe Analytics 2.0 Reporting API
