Release Status: Open Beta
Availability: Free
Status Page: GitLab Status Page
Default Historical Sync: 1 year
Whitelist Tables/Columns: Unsupported/Unsupported
Default Replication Frequency: 30 minutes
Full Table Endpoints: 1
Incremental Endpoints: 5
Destination Incompatibilities: None

Connecting GitLab

Connecting Stitch to GitLab is a four-step process:

  1. Create a GitLab access token
  2. Add GitLab as a Stitch data source
  3. Define the Historical Sync
  4. Define the Replication Frequency

Prerequisites

Verify that you have access to any projects you want to replicate data from. Stitch is only able to access the same projects as the person who creates the integration.

Creating a GitLab Access Token

  1. Sign into your GitLab account.
  2. Click the user menu (your icon) > Settings.
  3. Click the Access Tokens tab.
  4. In the Name field, enter Stitch. This will allow you to easily identify what application is using the token.
  5. In the Scopes section, check the api box. This will allow Stitch to access your API and replicate your GitLab data.
  6. Click Create Personal Access Token.
  7. The new Access Token will display at the top of the page. Copy the token before navigating away from the page - GitLab won’t display it again.
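
Before pasting the token into Stitch, it can be worth confirming that it authenticates. The sketch below only builds the request and does not send it. It assumes GitLab's PRIVATE-TOKEN request header and the /projects endpoint under the API URL used later in this guide; the function name and the placeholder token are hypothetical.

```python
from urllib.request import Request


def build_token_check(api_url: str, token: str) -> Request:
    """Build (but do not send) a GET request that lists accessible projects."""
    # GitLab reads personal access tokens from the PRIVATE-TOKEN header.
    url = api_url.rstrip("/") + "/projects"
    return Request(url, headers={"PRIVATE-TOKEN": token})


# Placeholder token for illustration only; substitute the one you copied.
request = build_token_check("https://gitlab.com/api/v3", "YOUR_TOKEN")
print(request.full_url)
```

Passing the request to urllib.request.urlopen should return a JSON list of projects if the token is valid. An authentication error would suggest that the token was copied incorrectly or is missing the api scope.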

Add GitLab as a Stitch Data Source

  1. On the Stitch Dashboard page, click the Add an Integration button.
  2. Click the GitLab icon.

  3. Enter a name for the integration. This is the name that will display on the Stitch Dashboard for the integration; it’ll also be used to create the schema in your data warehouse.

    For example, the name “Stitch GitLab” would create a schema called stitch_gitlab in the data warehouse. This schema is where all the tables for this integration will be stored.

  4. In the API URL field, enter https://gitlab.com/api/v3
  5. In the Private Token field, paste the Personal Access Token you created in the previous section.
  6. In the Projects to Track field, enter the projects you want to track separated by spaces.

    For example: in an organization named stitch, there are two projects to track: stitch-data and stitch-docs. To track them, you’d enter them like this: stitch/stitch-data stitch/stitch-docs
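
The naming and project-list conventions in the steps above can be sketched as follows. This is an illustration under assumptions, not Stitch's actual implementation: the normalization rule in schema_name is a guess that happens to reproduce the “Stitch GitLab” example, and all three function names are hypothetical. The URL-encoding step reflects how the GitLab API accepts a namespace/project path as a project ID.

```python
import re
from urllib.parse import quote


def schema_name(integration_name: str) -> str:
    """Approximate the schema name derived from an integration name."""
    # Assumption: lowercase, with runs of non-alphanumerics replaced by "_".
    return re.sub(r"[^a-z0-9]+", "_", integration_name.lower()).strip("_")


def parse_projects(field_value: str) -> list[str]:
    """Split the space-separated Projects to Track value into project paths."""
    return field_value.split()


def api_project_id(project_path: str) -> str:
    """URL-encode a namespace/project path for use as a GitLab API project ID."""
    return quote(project_path, safe="")


projects = parse_projects("stitch/stitch-data stitch/stitch-docs")
print(schema_name("Stitch GitLab"))           # stitch_gitlab
print([api_project_id(p) for p in projects])  # ['stitch%2Fstitch-data', 'stitch%2Fstitch-docs']
```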

Defining the Historical Sync

The Sync Historical Data setting will define the starting date for your GitLab integration. This means that data equal to or newer than this date will be replicated to your data warehouse.

Change this setting if you want to sync data from further back than the GitLab integration’s default of 1 year. For a detailed look at historical syncs, check out the Syncing Historical SaaS Data article.
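
As a rough illustration of the “equal to or newer than” rule, the sketch below computes a start date one year back and checks whether a record falls inside the historical sync window. The function names and the sample dates are hypothetical, and how Stitch computes the default start date internally is an assumption.

```python
from datetime import date


def default_start_date(today: date) -> date:
    """Return the date one year before `today` (the 1-year default)."""
    try:
        return today.replace(year=today.year - 1)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return today.replace(year=today.year - 1, day=28)


def in_historical_sync(record_date: date, start_date: date) -> bool:
    """Data equal to or newer than the start date is replicated."""
    return record_date >= start_date


start = default_start_date(date(2017, 6, 15))
print(start)                                         # 2016-06-15
print(in_historical_sync(date(2016, 6, 15), start))  # True
print(in_historical_sync(date(2016, 6, 14), start))  # False
```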

Define the Replication Frequency

The Replication Frequency controls how often Stitch will attempt to replicate data from your GitLab integration. By default the frequency is set to 30 minutes, but you can change it to better suit your needs.

Before setting the Replication Frequency, note that:

  • The more often GitLab is set to replicate, the higher the number of replicated rows.
  • The number of rows in the source may not equal the number of rows replicated by Stitch. Tables that use Full Table Replication will result in a higher number of replicated rows, because every row in the table is replicated again on each run.

  • If you’re using a data warehouse that doesn’t natively support nested structures, you’ll see a higher number of replicated rows due to the de-nesting Stitch performs.

To help prevent overages, we recommend setting the Replication Frequency to something less frequent - like 6 hours instead of 30 minutes. For tips on reducing your row count, check out the Reducing Your Row Count section of our Billing Guide.
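
To make the effect of the frequency concrete, here is a small arithmetic sketch. It assumes a hypothetical 200-row table that uses Full Table Replication and that every scheduled run completes; the function names are illustrative only.

```python
def runs_per_day(frequency_minutes: int) -> int:
    """Number of replication runs scheduled in 24 hours."""
    return (24 * 60) // frequency_minutes


def full_table_rows_per_day(table_rows: int, frequency_minutes: int) -> int:
    """Rows replicated per day when every run re-replicates the whole table."""
    return table_rows * runs_per_day(frequency_minutes)


print(runs_per_day(30))                   # 48
print(runs_per_day(360))                  # 4
print(full_table_rows_per_day(200, 30))   # 9600
print(full_table_rows_per_day(200, 360))  # 800
```

In this example, moving from 30 minutes to 6 hours cuts the replicated rows for that table by a factor of 12.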

After selecting a Replication Frequency, click Save Integration.

GitLab’s Initial Sync

After you finish setting up GitLab, you might see its Sync Status show as Pending on either the Stitch Dashboard or the Integration Details page.

For a new integration, a Pending status indicates that Stitch is in the process of scheduling the initial sync for the integration. This may take some time to complete.


GitLab Schema

Stitch's GitLab integration includes these tables:


branches

Replication Method: Incremental
Primary Key: project_id : name
Contains Nested Structures?: No

The branches table contains high-level info about repository branches in your projects.

Table Info & Attributes

branches Attributes

  • project_id

  • name

  • merged

  • protected

  • developers_can_push

  • developers_can_merge

  • commit_id
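
Because the primary key is the combination of project_id and name, the same branch name can appear once per project. The sketch below shows one way a composite key could be used to update existing rows rather than duplicate them. The in-memory dictionary stands in for a destination table, and the upsert function is hypothetical; how your data warehouse applies keys is an assumption here.

```python
def upsert(table: dict, row: dict) -> None:
    """Insert the row, or overwrite the existing row with the same key."""
    # The composite key is the (project_id, name) pair.
    key = (row["project_id"], row["name"])
    table[key] = row


branches = {}
upsert(branches, {"project_id": 1, "name": "master", "merged": False})
upsert(branches, {"project_id": 2, "name": "master", "merged": False})
upsert(branches, {"project_id": 1, "name": "master", "merged": True})

print(len(branches))                    # 2
print(branches[(1, "master")]["merged"])  # True
```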

commits

Replication Method: Incremental
Primary Key: id
Contains Nested Structures?: No

The commits table contains info about repository commits in a project.

Table Info & Attributes

commits Attributes

  • Commit ID (id)

  • project_id

  • short_id

  • title

  • author_name

  • author_email

  • committer_name

  • committer_email

  • created_at

  • message

  • allow_failure

issues

Replication Method: Incremental
Primary Key: id
Contains Nested Structures?: Yes

The issues table contains info about issues contained within projects.

Table Info & Attributes

issues & Nested Structures

This table contains nested structures. If you use a data warehouse that doesn't natively support nested structures, some of the attributes listed below may be in a subtable.

These items are marked with a *

issues Attributes

  • Issue ID (id)

  • project_id

  • milestone_id

  • author_id

  • assignee_id

  • description

  • state

  • iid

  • labels *

  • title

  • updated_at

  • created_at

  • subscribed

  • user_notes_count

  • due_date

  • web_url

  • confidential
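
To illustrate why de-nesting raises the row count, the sketch below splits one issue into a parent row plus one subtable row per label. The subtable column names (issue_id, position, value) and the function name are hypothetical placeholders; the names Stitch actually generates are not shown here.

```python
def denest_labels(issue: dict) -> tuple[dict, list[dict]]:
    """Split an issue into a parent row and one subtable row per label."""
    # The parent row keeps every attribute except the nested list.
    parent = {key: value for key, value in issue.items() if key != "labels"}
    # Each label becomes its own row that points back to the parent issue.
    label_rows = [
        {"issue_id": issue["id"], "position": index, "value": label}
        for index, label in enumerate(issue.get("labels", []))
    ]
    return parent, label_rows


issue = {"id": 7, "title": "Fix login", "labels": ["bug", "ui", "urgent"]}
parent, label_rows = denest_labels(issue)

print(parent)               # {'id': 7, 'title': 'Fix login'}
print(1 + len(label_rows))  # 4
```

One source record with three labels becomes four replicated rows in this sketch.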

milestones

Replication Method: Incremental
Primary Key: id
Contains Nested Structures?: No

The milestones table contains info about project milestones.

Table Info & Attributes

milestones Attributes

  • Milestone ID (id)

  • iid

  • project_id

  • title

  • description

  • due_date

  • start_date

  • state

  • updated_at

  • created_at
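
Incremental replication means only new or changed rows are picked up on each run. A minimal sketch of that idea follows, assuming updated_at acts as the bookmark column and that rows at or after the bookmark are selected; both are assumptions for illustration, and the function name is hypothetical.

```python
from datetime import datetime


def incremental_batch(rows: list[dict], bookmark: datetime) -> tuple[list[dict], datetime]:
    """Return rows updated at or after the bookmark, plus the next bookmark."""
    changed = [row for row in rows if row["updated_at"] >= bookmark]
    # The next run starts from the newest timestamp seen in this run.
    next_bookmark = max((row["updated_at"] for row in changed), default=bookmark)
    return changed, next_bookmark


milestones = [
    {"id": 1, "updated_at": datetime(2017, 1, 1)},
    {"id": 2, "updated_at": datetime(2017, 3, 1)},
    {"id": 3, "updated_at": datetime(2017, 5, 1)},
]
changed, next_bookmark = incremental_batch(milestones, datetime(2017, 2, 1))

print([row["id"] for row in changed])  # [2, 3]
print(next_bookmark)                   # 2017-05-01 00:00:00
```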

projects

Replication Method: Incremental
Primary Key: id
Contains Nested Structures?: Yes

The projects table contains info about specific projects.

Table Info & Attributes

projects & Nested Structures

This table contains nested structures. If you use a data warehouse that doesn't natively support nested structures, some of the attributes listed below may be in a subtable.

These items are marked with a *

projects Attributes

  • Project ID (id)

  • approvals_before_merge

  • archived

  • avatar_url

  • builds_enabled

  • container_registry_enabled

  • created_at

  • creator_id

  • default_branch

  • description

  • forks_count

  • http_url_to_repo

  • issues_enabled

  • last_activity_at

  • lfs_enabled

  • merge_requests_enabled

  • name

  • name_with_namespace

  • namespace__id

  • namespace__kind

  • namespace__name

  • namespace__path

  • only_allow_merge_if_all_discussions_are_resolved

  • only_allow_merge_if_build_succeeds

  • open_issues_count

  • owner_id

  • path

  • path_with_namespace

  • permissions__group_access__access_level

  • permissions__group_access__notification_level

  • permissions__project_access__access_level

  • permissions__project_access__notification_level

  • public

  • public_builds

  • request_access_enabled

  • shared_runners_enabled

  • shared_with_groups *

  • snippets_enabled

  • ssh_url_to_repo

  • star_count

  • tag_list *

  • visibility_level

  • web_url

  • wiki_enabled
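
The double-underscore attribute names above (for example, namespace__id) suggest that nested objects are flattened into columns. The sketch below shows one way such names could be produced; the function and the sample record are hypothetical, and arrays are left untouched because they are handled separately as nested structures.

```python
def flatten(record: dict, prefix: str = "") -> dict:
    """Flatten nested objects into columns joined by double underscores."""
    flat = {}
    for key, value in record.items():
        column = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, extending the column prefix.
            flat.update(flatten(value, prefix=f"{column}__"))
        else:
            flat[column] = value
    return flat


project = {
    "id": 1,
    "namespace": {"id": 2, "kind": "group"},
    "permissions": {"group_access": {"access_level": 30}},
    "tag_list": ["docs"],
}
print(flatten(project))
```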

users

Replication Method: Full Table
Primary Key: id
Contains Nested Structures?: No

The users table contains info about the users in your GitLab account.

Table Info & Attributes

users Attributes

  • User ID (id)

  • username

  • name

  • state

  • avatar_url

  • web_url


