Manifest reference

The manifest.yml file is written in human-readable YAML. It is divided into seven sections (Info, Auth, API, User Select, Tasks, Transformations, and Batch), each described in detail below.

Info Section

The Info section defines basic information about the connector.

Example:

Info:
    name:        Sendgrid
    author:      John Smith
    description: Sendgrid description
    avatar:      'http://cloud-elements.com/wp-content/uploads/2015/03/sendgrid-logo-01.png'
    avatarBig:   'http://cloud-elements.com/wp-content/uploads/2015/03/sendgrid-logo-01.png'
    timezone:    UTC
    profile:     a-field-from-settings
    popup:
        width:  1090
        height: 750
    category:
        - email delivery
    metrics:
        - sessions
        - id: revenue
          name: Revenue
          dataType: DailyLatestSum
          intervals: [Today, Yesterday, Last7Days]
          format: '0.00 USD'

Reference:

Property name    Type    Description
name    string    [required] Connector name. Used on the listing page.
description    string    [required] Connector description. Used on the listing page.
avatar    string    [required] Absolute path to the avatar image (square, 140x140). Used on the connected data sources page in the Databox WebApp.
avatarBig    string    [required] Absolute path to the data source image (366x272). Used on the available data sources page in the Databox WebApp.
timezone    string    Set only if the API returns all responses in one timezone. For example: UTC
profile    string    A field from settings (access credentials) that is used as the connection name, later visible in "Data manager" in Sandbox. If blank, the connection is unnamed.
popup    list    Used for the OAuth1 and OAuth2 authentication popup. Default size is 550x450. If the popup is too small or too big, set a custom window size with the width and height parameters. See the example above.
category[]    list    [required] List of category names. When imported to Databox, they are used for filtering by category name.
metrics[]    list    [required] List of metric keys. When imported to Databox, they are used to select metrics from the data source in Databox Designer.
metrics[].id    string    Must be unique within the connector. The key may contain only letters (a-z, A-Z), numbers (0-9), and underscores (_).
metrics[].name    string    Human-readable metric key name. For example: Page Views.
metrics[].dataType    string    How the data is presented when pushed to Databox. The default data type is DailyLatestSum.
metrics[].intervals    list    List of possible intervals. When imported to Databox, they are used to select intervals from the data source in Databox Designer. If not provided, all intervals are selectable in Databox Designer.
metrics[].format    string    Metric key format. Example: 0.00 for a number with two decimal digits.
metrics[].description    string    Metric key description. Example: The session index for a user. Each session from a unique user will get its own incremental index starting from 1 for the first session. Subsequent sessions do not change previous session indices. For example, if a user has 4 sessions to the website, sessionCount for that user will have 4 distinct values of '1' through '4'.

Metric intervals

Listed below are all possible intervals in Databox.

Interval name      Short name
Last 24 hours      Last24Hours
Today              Today
Yesterday          Yesterday
Last 7 days        Last7Days
Last 14 days       Last14Days
Last 30 days       Last30Days
Last 28 days       Last28Days
Last 90 days       Last90Days
Last 180 days      Last180Days
Week-to-date       WTD
Month-to-date      MTD
Quarter-to-date    QTD
Year-to-date       YTD
This week          ThisWeek
This month         ThisMonth
This quarter       ThisQuarter
This year          ThisYear
Last week          LastWeek
Last month         LastMonth
Last quarter       LastQuarter
Last year          LastYear
Last 12 months     Year
All time           AllTime

Metric data types

Data type    Description (example for the Last 7 days interval)
Diff    Sums every pushed value. Used for pushing timestamps when some event or action occurred.
DailyLatestSum    Sums the most recent value for each day (depends on granularity).
OverallTotal    Used when pushing current (total) data. Only the most recent entry is displayed, regardless of the selected interval.

Metric formats

The format defines how the data is displayed by default. If omitted, it defaults to auto.

Format    Description
PrefixCurrency    Prefixes the value with the pushed unit. Ex. $100
0.00    Displays the value with two decimal places.
0.00◊    Displays the value as a percentage; the pushed value is already multiplied by 100. Ex. 100◊ => 100%
0.00%    Displays the value as a percentage; the pushed value is multiplied by 100. Ex. 0.6% => 60%
Duration    Converts pushed time in seconds to days, hours, ...
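
As an illustration of how data types and formats combine, the fragment below sketches two metric entries for the metrics list in the Info section. The metric ids and names, and the pairing of each data type with a format, are hypothetical and chosen only for illustration; the property names and values are the ones documented above.

```yaml
Info:
    # Other Info properties are omitted from this sketch.
    metrics:
        # Hypothetical metric: a current total, shown with two decimals.
        - id: open_tickets
          name: Open Tickets
          dataType: OverallTotal
          format: '0.00'
        # Hypothetical metric: a time pushed in seconds, shown as a duration.
        - id: total_reply_time
          name: Total Reply Time
          dataType: DailyLatestSum
          format: Duration
```

Quoting '0.00' keeps YAML from reading the format as a number.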

Metric transforms

Additional flags for inverting, disabling, ... the values of metric keys.

Flag    Value    Description
changeInvert    true    Changes the way comparisons are made. Ex. Rank is better if it's a lower value.
disabled    true    Disables the metric for selection in Databox.
noReset    true    The values aren't reset when the interval changes. Ex. Twitter Followers: if you had 300 followers last month and you change the interval to this month, the value should still be 300 if you haven't gained any followers.
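
This reference does not show where the flags are set. One plausible placement, assumed here and not confirmed by the text above, is next to the other properties of a metric entry. The metric ids and names in this sketch are hypothetical.

```yaml
Info:
    # Other Info properties are omitted from this sketch.
    metrics:
        # Assumption: transform flags sit alongside id and name.
        - id: search_rank
          name: Search Rank
          changeInvert: true   # a lower rank counts as an improvement
        - id: followers
          name: Followers
          noReset: true        # keep the value when the interval changes
```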

Auth Section

The Auth section defines the authentication protocol and provides all the property fields needed to connect to a third-party API. You can use parameters for sensitive information such as API keys or credentials. The example below uses client_id and client_secret for OAuth2 authentication.

Example:

Auth:
    provider: oauth2
    parameters:
        client_id:              '<YOUR CLIENT ID>'
        client_secret:          '<YOUR CLIENT SECRET>'
        grant_type:             'authorization_code'
        authorization_endpoint: 'https://www.dropbox.com/1/oauth2/authorize'
        access_token_endpoint:  'https://api.dropbox.com/1/oauth2/token'
        refresh_token_endpoint: 'https://api.dropbox.com/1/oauth2/token'

If you need to execute an API call right after successful authentication, set the onSuccess parameter.

Example:

...
Auth:
    provider: oauth2
    parameters:
        client_id:     '<YOUR CLIENT ID>'
        client_secret: '<YOUR CLIENT SECRET>'
        ...
    onSuccess:
      operation: GetUserId
      transform: "{userId:id}"
...

The GetUserId API call is executed, and the transformation {userId:id} is applied to the response before it is saved to settings.

When using the scope parameter, you sometimes need a separator other than the default space ' '. You can set a different separator with the scope_glue parameter.

Example:

Auth:
    ...
    scope:
      - example_scope1
      - example_scope2
    ...

This config will produce scope=example_scope1 example_scope2, but if you set scope_glue like this:

Auth:
    ...
    scope_glue: ','
    scope:
      - example_scope1
      - example_scope2
    ...

scope will be scope=example_scope1,example_scope2.

Reference:

Property name    Type    Description
provider    string    [required] The authentication protocol. Supported values: oauth1, oauth2, basic.
parameters    list    [required] The parameters used in the authentication process.
    Required parameters for the oauth1 provider:
     - consumer_key
     - consumer_secret
     - authorization_endpoint
     - request_token_endpoint
     - access_token_endpoint
    Optional parameters for the oauth1 provider:
     - scope
     - scope_glue

    Required parameters for the oauth2 provider:
     - client_id
     - client_secret
     - grant_type
     - authorization_endpoint
     - access_token_endpoint
    Optional parameters for the oauth2 provider:
     - scope
     - scope_glue
     - approval_prompt
     - refresh_token_endpoint

    Required parameters for the basic provider:
     - items[], where each item can have the parameters name, type, value, note, and label. You can also validate a specific field against a regular expression with the regex parameter, and use regexMessage for the error message shown when the input does not match the pattern.
     - [required] Items with the names username and password.
onSuccess    task    Describes an API call that is made right after successful authentication. You can set the operation and transform fields. The response is saved to settings.

OAuth2 examples

Auth:
    provider: oauth2
    parameters:
        client_id:              '<YOUR CLIENT ID>'
        client_secret:          '<YOUR CLIENT SECRET>'
        grant_type:             'authorization_code'
        authorization_endpoint: 'https://www.dropbox.com/1/oauth2/authorize'
        access_token_endpoint:  'https://api.dropbox.com/1/oauth2/token'
        refresh_token_endpoint: 'https://api.dropbox.com/1/oauth2/token'
        scope:                  'read_directory'

Auth:
    provider: oauth2
    parameters:
        client_id:              '<YOUR CLIENT ID>'
        client_secret:          '<YOUR CLIENT SECRET>'
        grant_type:             'authorization_code'
        authorization_endpoint: 'https://www.facebook.com/dialog/oauth'
        access_token_endpoint:  'https://graph.facebook.com/oauth/access_token'
        refresh_token_endpoint: 'https://graph.facebook.com/oauth/access_token'
        scope:
            - manage_pages
            - read_insights

OAuth1 example

Auth:
    provider: oauth1
    parameters:
        consumer_key:           '<YOUR CONSUMER KEY>'
        consumer_secret:        '<YOUR CONSUMER SECRET>'
        authorization_endpoint: 'https://api.twitter.com/oauth/authorize'
        request_token_endpoint: 'https://api.twitter.com/oauth/request_token'
        access_token_endpoint:  'https://api.twitter.com/oauth/access_token'

Basic authentication example

Auth:
    provider: basic
    parameters:
        - name:  profile
          label: Name
          note:  You can find your API Secret in your Mixpanel account under Project settings > Management.
        - name:  username
          label: API Secret
          type:  password
        - name:  password
          value: X
          type:  hidden

This example shows Mixpanel authentication. The API requires only the API secret (passed as the username in Basic Auth). Because Basic Auth also requires a password, the password field is hidden in the connect popup with type: hidden.

Combining basic authentication with OAuth

Sometimes you need additional information before starting the OAuth authorization flow. The example below is for the HubSpot connector. The Hub ID is needed first, because it must be provided as portalId in the OAuth2 authorization flow. For this reason, the Auth section contains a list of two providers: a popup with the hubId field opens first, and its value is then passed to the OAuth2 authorization through the extra_params parameter.

Auth:
    - provider: basic
      parameters:
        - name:  'hubId'
          label: 'Hub ID'
          note:  'You can find your Hub ID in the top right corner in your HubSpot portal.'
          regex: '^[0-9]+$'
          regexMessage: 'Invalid Hub ID.'
    - provider: oauth2
      parameters:
        client_id:     '<YOUR CLIENT ID>'
        client_secret: '<YOUR CLIENT SECRET>'
        grant_type:    'authorization_code'
        authorization_endpoint: 'https://app.hubspot.com/auth/authenticate'
        access_token_endpoint:  'https://api.hubapi.com/auth/v1/token'
        refresh_token_endpoint: 'https://api.hubapi.com/auth/v1/refresh'
        extra_params:
            portalId: '{hubId}'

API Section

API:
    baseUrl: https://api.github.com/
    operations:
        Default:
            parameters:
                page:
                    type: string
                    location: query
                    default: 1
                per_page:
                    type: string
                    location: query
                    default: 100
        GetUserRepos:
            extends: Default
            httpMethod: GET
            uri: user/repos
        GetContributors:
            httpMethod: GET
            uri: repos/{repo.name}/contributors

The GitHub API URL is entered as baseUrl, followed by operations:

Default defines the parameters page and per_page. These parameters are set for every request whose operation references Default with the extends keyword.

GetUserRepos is the arbitrary name given to the GET API call user/repos. It is good practice to keep the name close to the actual API call name. It inherits the Default parameters through the extends keyword, and the URI user/repos is called.

GetContributors is another API call. Its URI uses the parameter {repo.name}, which is set during the initial connect and is substituted with the chosen repo name.

User Select Section

If you need to let the user interactively choose a value needed in later API calls, you can define user selection rules. The example below is for Google Analytics, where the user must select the view for which reports will be retrieved.

Example:

UserSelect:
    - id: accountId
      name: Account selection
      description: These accounts are listed from Google Analytics. Please select one.
      operation: GetAccounts
      transform: "items[].{id: id, name: name}"
    - id: webPropertyId
      name: Select property
      description: Please select web property
      operation: GetWebProperties
      transform: "items[].{id: id, name: name}"
    - id: propertyId
      name: Select view
      description: Please select view profile
      operation: GetListOfProfiles
      transform: "items[].{id: id, name: name}"

Each step of user selection needs to return a response in format:

[
  {
    "id": <some value>,
    "name": <some value>
  }
]

The id parameter for the user selected item will be saved. In the example above, we'll save 3 parameters:

  • accountId - in step 1
  • webPropertyId - in step 2
  • propertyId - in step 3

You can access the values of these parameters in the API section. The user selection is completed when no further steps are defined.
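
As a sketch of how a saved value might be used, the fragment below defines two of the operations referenced in the UserSelect example. It assumes that a saved parameter is substituted into a URI with the same {name} placeholder syntax that the API section uses for {repo.name}. The baseUrl and URIs are assumptions based on the Google Analytics Management API and are not taken from this reference.

```yaml
API:
    # Assumed base URL for the Google Analytics Management API.
    baseUrl: https://www.googleapis.com/analytics/v3/
    operations:
        GetAccounts:
            httpMethod: GET
            uri: management/accounts
        GetWebProperties:
            httpMethod: GET
            # {accountId} is the id saved in step 1 of the user selection.
            uri: management/accounts/{accountId}/webproperties
```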

Tasks Section

Tasks retrieve data from API endpoints. You can define as many tasks as you like. Each task is uniquely identified by its name. Later, in the Transformations section, you can use the responses from tasks by referencing them by this name.

A task usually returns more data than is needed, so the data can be filtered and cleaned up before use. JMESPath is used for this purpose, and its syntax may be used in the transform field. It is good practice to filter out all the data that is not needed, which makes development and debugging easier later on.

Basic task example:

Tasks:
    - name: LastCustomers
      operation: GetAllCustomers
      transform: 'items[0:5]'

Example for task with pagination:

Tasks:
    - name: GetConversations
      operation: GetConversations
      transform: 'items[].{status: status, createdAt: createdAt, closedAt: closedAt, tags: tags}'
      parameters:
        page: 1
      paginate:
        property: page
        max_pages: 5
        stop-if-empty: items
        array_merge: true

Task reference:

Parameter    Description
name    Task name
operation    Operation to call (defined in the API section)
transform    JMESPath expression that transforms the response
parameters    Request parameters to set. Parameters are defined in the operation (API section)
paginate    Paginates through requests. Pagination parameters:
  property - the parameter to paginate on; it must be defined in parameters. (See example)
  increase - sets by how much the parameter is increased with each page (default: 1)
  max_pages - limits the number of pages (how many pages to pull)
  array_merge - if true, all responses will be merged into one response. Otherwise you'll get an array of responses. (Default:
  stop-if-null - set the field in API response. If response field is null (for ex. next_page), the pagination will stop before hitting max_pages limit.
  stop-if-empty - same as stop-if-null, but this one checks if field in response is an empty array.
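
The pagination example above does not use increase or stop-if-null. A minimal sketch of an offset-based variant could look like the fragment below. The task and operation names and the response fields (orders, next_page) are hypothetical, and the offset sequence in the comments assumes that increase is added to the parameter once per page, as described above.

```yaml
Tasks:
    - name: GetOrders            # hypothetical task name
      operation: GetOrders       # hypothetical operation from the API section
      transform: 'orders[].{id: id, total: total}'
      parameters:
        offset: 0
      paginate:
        property: offset         # request parameter that changes per page
        increase: 50             # assumed sequence: 0, 50, 100, ...
        max_pages: 10            # never pull more than 10 pages
        stop-if-null: next_page  # stop early when next_page is null
        array_merge: true
```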

Transformations Section

Transformations take the responses from tasks and apply custom transformations to create a new response (which is pushed to Databox). Currently, only PHP is supported. Start creating transformations by saving the PHP code for your connector.

<?php

namespace Databox\MyConnector;

class Transform
{
    public function getContributors($response)
    {
        // Your code

        return $push;
    }
}

Manifest example:

Transformations:
  - name: GetContributors
    from: GetContributors
    use: getContributors
  - name: GetBranches
    from: GetBranches
    use:
      - metrickey: branches
  - name: GetFollowers

First, the transformation is named with the name field. Then the task is linked to it with the from directive. Finally, the PHP function that transforms the data is named with the use keyword; in this case, the function name is getContributors.

The second transformation, GetBranches, is even simpler: it doesn't need any PHP code. The output of the task GetBranches is pushed directly as the metric key named branches. Because the output of the task is just a simple number, this shortcut can be taken to omit any PHP code.

The third transformation has only the name field. In this case, Databox uses the task GetFollowers and looks for a function that is also named GetFollowers.

Basically, we can do anything with the data that the native API didn't provide, or just push the data we get. The data is then pushed to Databox.

Only the name field is required. If you don't specify the from or use field, it falls back to the value of the name field.
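
To make the fallback concrete, the fragment below writes out the long form that the name-only entry for GetFollowers is assumed to expand to. This is a sketch based on the description above, not a confirmed equivalence.

```yaml
Transformations:
  # Assumed to behave the same as the short entry "- name: GetFollowers".
  - name: GetFollowers
    from: GetFollowers   # task whose response is used
    use: GetFollowers    # PHP function that transforms the response
```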

Batch Section

Batches define which tasks and transformations run on initial connect (id: default) and which run hourly (id: the metric key name). In the example below, the default batch starts upon initial connect and runs two tasks/transformations. After it, two hourly batches are defined. Their ids are set to metric keys used by the connector (Info section).

There are some reserved batch names (ids):
- default: runs on initial connect and once per day
- history: runs only once on initial connect
- nightly: runs once per day at midnight

Batch:
    ## Default batch
    - id: default
      runs:
        - use: GetBranches
        - use: GetCommitsByBranchId

    ## Batch for each metric defined in Info Section
    - id: branches
      runs:
        - use: GetBranches
    - id: commits
      runs:
        - use: GetBranches
        - use: GetCommitsByBranchId

Other batches (such as branches and commits in the example) must match the connector's metric keys. The fetcher gets all metric keys selected in Databox Designer and matches them with batches. That way, the fetcher runs only the transformations for the selected metric keys. If two batches share the same transformation, it is executed only once.
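
The example above uses only the default reserved id. A minimal sketch of the other two reserved ids might look like the fragment below. GetCommitsHistory is a hypothetical transformation name; the schedule comments repeat the list of reserved batch names.

```yaml
Batch:
    # Reserved id: runs only once, on initial connect.
    - id: history
      runs:
        - use: GetCommitsHistory   # hypothetical transformation
    # Reserved id: runs once per day at midnight.
    - id: nightly
      runs:
        - use: GetBranches
```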