Introduction

With more than 15 years of our own source code residing in GitClear's database, our company has a lot riding on the security of the code processed by GitClear. When it comes to something this important, your DevOps team needs transparency and details to make an informed decision about the security posture of a prospective partner. That's why we err on the side of disclosure when it comes to security. Here are the domains across which we work to keep your code secure throughout its lifetime on GitClear:

  1. Security Overview
  2. Source Code Protection
  3. Product Security
  4. Network & Server Security
  5. Additional Security Details & Policies
  6. ISO 27001 (2022 edition) and SOC2 Certification
  7. Reporting an Issue

Security Overview

  • All connections to and from GitClear servers are protected by 256-bit strength HTTPS
  • Servers reside in 24/7 security-staffed data center requiring key card and biometric identifier to access
  • Password login disabled across all servers, SSH keypairs required
  • SSH access requires passing through a VPN firewall that logs and reports all new logins to the DevOps team
  • Code analysis does not require persisting code line content on our servers
  • We don't clone your repo; access is made available through secure provider commit APIs
  • Source code content is one-way hashed to make long-range code analysis possible without code line text
  • Data is purged from disconnected repos, and all data is expeditiously removed upon service cancellation

Source Code Protection

All source code is accessed via secure APIs provided by our git hosts: GitHub, GitLab, Bitbucket and Azure DevOps. These APIs are accessed over HTTPS using ephemeral access tokens that we refer to below as "Provider Tokens." Git is not accessed directly, so our servers never receive a copy of your git repo; we only process code content received via specific provider API responses (e.g., to get the diff between two commits).

Provider Tokens

Provider Tokens are conditionally granted to GitClear upon the customer completing an OAuth2 login from GitClear to their git provider. We request the minimal set of permissions possible (varies by git provider) to be able to receive commit data pertaining to the repos that the customer selected for import during the account onboarding process.

As is the default for OAuth2-based connections, GitClear never requests nor receives the customer's git provider password. All Provider Tokens are SSL encrypted between the git provider and GitClear's app servers, and then encrypted to the database using an encryption key that is unavailable to the database server.

The Provider Token can be invalidated at any time through several paths. On GitClear, you can visit "Settings" -> "Repo Providers" -> "Revoke Access", or "Settings" -> "Account" -> "Close Account," either of which will delete your Provider Token(s). Your git provider (GitHub, GitLab, Bitbucket or Azure DevOps) will also provide a means to invalidate the token from their side. After the token has been invalidated, all GitClear access to your data is permanently severed unless you move to re-establish the connection at a later point.

🔎 On the Waydev Credentials Breach reported by ZDNet and others

Many GitClear customers have arrived following the large-scale breach of Waydev in 2020 and are justifiably curious about how GitClear's security would have fared under a similar attack. It's alleged that Waydev's leak exposed more than 7.5m customer records to the dark web, affirming how critical it is to consider how one's git provider credentials are being handled.

Provider Tokens were the vector used by hackers. While the hackers' methods have not yet been disclosed by Waydev (now a year later, this detail will probably never be known), it's possible to reconstruct the basic attack design through third-party reporting on the event from ZDNet and BlueOptima.

According to builtwith.com, Waydev was built with a PHP web server. PHP allows developers to decide between many possible options in how they will sanitize input queries made to their database. In the case of Waydev, the SQL injection prevention mechanism they chose to integrate with PHP failed to do its job. In the words of Waydev's CEO, a "blind SQL injection vulnerability" allowed the hackers to "gain access to the database," where the Provider Tokens were lifted from.

The glaring failure in the case of such a breach is the SQL injection protection. Protecting against SQL injection is no trivial task--entire libraries and frameworks exist for the specific purpose of sanitizing user input prior to feeding it to the database. In the case of GitClear, we rely upon Ruby on Rails, one of the most popular and well-tested web frameworks used across the web, to sanitize our SQL queries. Ruby on Rails' main defense against SQL injection is that the developer is never writing "raw" SQL queries -- between "user input" and "database query" are several layers of systems that control for unpredictable user input (including use of whitelisted parameters by default).
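To make the general principle concrete, here is a minimal Python sketch using the standard-library sqlite3 module. It is not GitClear's Rails code; the table, column names, and token values are hypothetical and exist only for illustration. It contrasts a query built by string interpolation with a parameterized query that passes user input as data.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT, token TEXT)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("alice", "token-a"), ("bob", "token-b")],
    )
    return conn


def find_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is spliced directly into the SQL text.
    query = f"SELECT name, token FROM accounts WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized: the driver passes the input as data, never as SQL.
    return conn.execute(
        "SELECT name, token FROM accounts WHERE name = ?", (name,)
    ).fetchall()


conn = make_db()
malicious = "nobody' OR '1'='1"
leaked = find_unsafe(conn, malicious)   # the injected OR clause matches every row
blocked = find_safe(conn, malicious)    # no account has that literal name
```

With the interpolated query, the crafted input rewrites the WHERE clause and every token is returned; with the parameterized query, the same input is treated as an ordinary (non-matching) name.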

The secondary failure was a system design that apparently allowed unencrypted tokens to be obtained directly from the database. While it is possible that the attackers could have used unrevealed tactics to gain SSH access to Waydev servers, the simpler explanation would be that the tokens had been stored without being encrypted at rest, or they were encrypted in such a way that the decryption credentials could be obtained through the database. This is why GitClear specifically stores each Provider Token with a row-based initialization vector and an encrypted token--these must be combined with a decryption key which is not accessible to the database server--let alone the database itself.

No company can ever assume they've accounted for every possible attack vector. Several systems must work in coordination. An ongoing effort must be made to encrypt wherever possible, and automate notifications when unusual access patterns emerge. We hope that being as transparent as possible in our security design will compel our customers to reach out to us if they have ideas on how we can improve upon it.

Git Access

We do not clone your repo or utilize other forms of direct git access. GitClear works through secure provider APIs to analyze git commit details. This approach prevents GitClear from ever needing to possess a full copy of the repo's contents [1].

Code Storage

GitClear is engineered to work without storing code content. When we analyze a commit, we create a non-decryptable (one-way transform) MD5 representation of your code content. This lets us continue to identify connections between unique code lines without needing to persist those lines' actual code content.
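As a rough illustration of the idea, the following Python sketch fingerprints code lines with MD5 and finds lines shared by two commits without keeping their text. The function names and the whitespace-trimming step are assumptions made for this example, not a description of GitClear's actual pipeline.

```python
import hashlib


def line_fingerprint(line: str) -> str:
    """Return a one-way MD5 digest of a code line.

    Trimming surrounding whitespace first is an assumption of this sketch.
    """
    return hashlib.md5(line.strip().encode("utf-8")).hexdigest()


def fingerprint_commit(lines: list[str]) -> set[str]:
    """Fingerprint every non-blank line in a commit's diff."""
    return {line_fingerprint(line) for line in lines if line.strip()}


commit_a = ["def total(items):", "    return sum(items)"]
commit_b = ["    return sum(items)", "print(total([1, 2]))"]

# Only digests are compared; the line text itself need not be stored.
shared = fingerprint_commit(commit_a) & fingerprint_commit(commit_b)
```

Here `shared` holds a single digest, the one for the line that appears in both commits, which is enough to link the two occurrences.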

Many teams wish to have their code line content temporarily cached so that they can review specific commit details, e.g., in the Commit Activity Browser. For these customers we provide the option to control the length of time that code line content will be cached through "Settings" -> "Commit Processing." Current options are "No cached code," "Cache up to two weeks" (the default selection) or "Cache up to three months."

Cached code content is encrypted in the database using an encryption key that is unavailable to the database server. After the caching window passes, code line content is flushed, leaving the connection graph between the code lines without possessing the code content itself.
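The flush behavior can be sketched roughly as follows. This is a simplified Python model under stated assumptions: the `CodeLine` record, the `flush_expired` function, the placeholder fingerprints, and the 90-day approximation of "three months" are all hypothetical, and the encryption of cached content is omitted for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Retention windows mirroring the three settings; 90 days is an
# approximation of "three months" chosen for this sketch.
RETENTION = {
    "No cached code": timedelta(0),
    "Cache up to two weeks": timedelta(days=14),
    "Cache up to three months": timedelta(days=90),
}


@dataclass
class CodeLine:
    fingerprint: str            # one-way hash, kept after the flush
    content: Optional[str]      # cached text, removed after the window
    cached_at: datetime


def flush_expired(lines: list[CodeLine], setting: str, now: datetime) -> int:
    """Drop cached text older than the window; return how many were flushed."""
    window = RETENTION[setting]
    flushed = 0
    for line in lines:
        if line.content is not None and now - line.cached_at >= window:
            line.content = None
            flushed += 1
    return flushed


now = datetime(2024, 1, 31)
lines = [
    CodeLine("9f2c", "return sum(items)", datetime(2024, 1, 1)),   # 30 days old
    CodeLine("a41b", "print(total)", datetime(2024, 1, 25)),       # 6 days old
]
count = flush_expired(lines, "Cache up to two weeks", now)
```

After the call, the 30-day-old line keeps its fingerprint but no longer has any text, while the 6-day-old line is still cached.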

All data in disconnected git repos is purged from our database within one business day.

Enterprise Edition

Customers using GitClear Enterprise receive all the protections described above, applied in the context of their own cloud or data center.

Unless the customer has enabled exception tracking for debugging (an option during product installation), no code data is transmitted to GitClear's servers at any time when using Enterprise Edition. If the customer enables exception tracking (recommended but not necessary), an exception will trigger a small amount of data, usually less than 1 KB, to be sent to GitClear's secure error tracking server. This data helps GitClear developers diagnose system errors without needing access to the Enterprise customer's application, database, or logs.

Product Security

There are several steps we take to help customers precisely manage access to sensitive data in GitClear.

OAuth2 Login & Two-Factor Authentication (2FA)

Access to GitClear is made available through an OAuth2 login at the customer's git provider. We strongly recommend that customers enable two-factor authentication at their git source provider. Setting up 2FA at the git provider ensures the same security requirement is applied when logging into GitClear as when logging into the git provider (since they are one and the same).

GitClear never requests nor receives the customer's password during the login process. Access from GitClear to the customer's repos can be revoked at any time through the git provider's settings.

Permissions

Upon creating a GitClear account, only the customer themselves will be able to see the data they have imported initially. To collaborate with their team, the admin invites select users to GitClear (typically via an email invite) at a particular access level (called a "role"). User roles available to admins on GitClear include:

  • Contributor. No access to GitClear. This role exists to allow the admin to control which contributors' commit activity will be shown to team members with a higher-level role.
  • Developer. Can view aggregated team reports and their own commit stats (such as their past year of Diff Delta). Cannot view the commit stats of other individual team members or any reports that list names of their team members. When viewing the list of other Developers present in their team, alphabetic sorting will be the only available sort option.
  • Manager or Lead Developer. Can view aggregated team reports, as well as individual developer reports for any member of their team. Cannot view stats comparing across teams or comparing their team to industry averages.
  • Executive or Director. Can view aggregated team reports, individual developer reports, and reports that compare team stats to industry averages. Cannot view stats for members of the company who were not included in their team by the admin.
  • Admin. Controls who gets put on which team. By default, the admin is given access to an "All Contributors" team that contains all contributors in all the repos that were selected for import.

As a general philosophy, we direct user attention toward learning and discovering instead of comparing, though we do make available selected comparative reports as have been requested by executive customers.

Password and Credential Storage

Access to GitClear's production site is made available only through OAuth2 login at a git provider. This allows us to avoid storing passwords, and protects our customers from password-reset attacks.

Safeguards used to protect git access tokens are discussed above, in the "Provider Tokens" section.

On Enterprise installations, we allow login through SAML or email/password. In these cases, passwords are salted and hashed at rest using the bcrypt algorithm recommended by security experts.

Uptime

Our systems have uptime of 99% or higher, and we proactively post status updates for production incidents to our Twitter account.

Credit Cards and Payments

Customer payment details (i.e., credit cards) are stored in a secure vault on Stripe. Credit card details are submitted to Stripe directly, without passing through GitClear servers, by using a Stripe iframe that's embedded in the payment form hosted by GitClear.

Stripe's infrastructure for receiving, storing, decrypting, and transmitting card numbers runs in separate hosting infrastructure, and doesn't share credentials with Stripe's primary services.

Network & Server Security

There are multiple layers of protection that keep your data secure at the Network & Server level. Our approach includes physical security at our colo facility, encryption of all data in transit (via HTTPS), and industry-standard protections including firewalls, network vulnerability scanning, network security monitoring, and intrusion detection systems.

Data Hosting & Physical Security

GitClear hosts its infrastructure at a data center operated by Digital Fortress in Seattle. Per Digital Fortress documentation, security measures on premises include:

  • 24/7 on-site security staff
  • 24/7 key card access
  • 24/7 camera surveillance and 90 day video retention
  • CCTV with closed circuit monitoring in two NOCs
  • Access with biometric entry to facility

GitClear's database server has no public IP address and is only available through a private network within our server rack at Digital Fortress.

SSH Access

SSH access to GitClear's private internal network first requires VPN login to get past a firewall layer and receive an IP endpoint through which server login can be attempted. All VPN logins are logged, emailed, and flagged for review by GitClear DevOps team members.

After establishing VPN access, GitClear servers additionally require a private SSH key to access; password login is disabled on all servers. Access to individual servers is logged and recorded to a searchable cloud-based logging system that allows logins to be reviewed and audited at any time.

Login as root is disabled across all GitClear servers.

System Logging

We collect and upload system logs used to audit SSH access and otherwise monitor system access and functionality. All application logging is filtered for 10+ sensitive parameters, including all permutations of Provider Tokens sent by git providers and the payment tokens delivered by Stripe.
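A parameter filter of this kind can be sketched as below. This Python example is illustrative only: the key names in `SENSITIVE_KEYS` are hypothetical (the actual list of filtered parameters is not published here), and `filter_params` is a name invented for this sketch.

```python
# Hypothetical parameter names; the real list of filtered keys is not
# given in this document.
SENSITIVE_KEYS = {"access_token", "refresh_token", "oauth_token", "stripe_token"}
FILTERED = "[FILTERED]"


def filter_params(params: dict) -> dict:
    """Return a copy of params with sensitive values masked, recursively."""
    clean = {}
    for key, value in params.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = FILTERED
        elif isinstance(value, dict):
            # Nested payloads (e.g. a payment sub-object) are filtered too.
            clean[key] = filter_params(value)
        else:
            clean[key] = value
    return clean


request = {
    "repo": "acme/api",
    "access_token": "gho_example",
    "card": {"stripe_token": "tok_example"},
}
safe = filter_params(request)  # this masked copy is what would be logged
```

The original `request` is left untouched; only the masked copy would be handed to the logger, so token values never reach log storage.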

External Services

The following external services are used by GitClear to provide our SaaS version at https://www.gitclear.com:

  • AppSignal. Used to monitor performance and to capture selected system logs for security auditing.
  • Amazon S3. Used for storing repo avatars and database backups. Database backups are stored encrypted in a private bucket of an S3 account that requires two-factor authentication and is limited to DevOps personnel with specific access clearance. Database backups do not contain the decryption keys needed to view Provider Tokens and other sensitive data that is encrypted at rest (then encrypted again as part of the database backup process).
  • Google Analytics. Google Analytics is used on GitClear's SaaS site https://www.gitclear.com to gauge which content pages are resonating with prospective customers. GitClear parameterizes Google Analytics not to store a customer's IP address, and does not include remarketing tags.

Additional Security Details & Policies

GitClear has worked with industry-leading security groups such as NCC Group to audit and fortify our security systems. This section describes security steps we've taken that didn't fit into the sections above.

NCC Group Audit

In 2020, GitClear contracted NCC Group, a global expert in cyber security, to perform a security review of the entire security infrastructure, including threat prevention, mitigation, and recovery. They chose the industry-standard NIST Risk Management Framework to assess GitClear's security posture across more than 100 different dimensions.

In the report summary (available upon request), our security measures are described by NCC Group as "above average as compared to the assessments regularly completed by NCC Group." A sampling of the security measures on which GitClear was recognized with a perfect score included "Identify: Business Environment (ID.BE)," "Protect: Access Control (PR.AC)," "Respond: Response Planning (RS.RP)," and "Respond: Analysis (RS.AN)."

We were found to have a security measure "Not in place" on 6% of the 106 security dimensions analyzed in this comprehensive report. The 6% "Not in place" were:

  • Voluntary information sharing occurs with external stakeholders to achieve broader cybersecurity situational awareness. We regularly read & review security updates from numerous popular developer sources such as Hacker News. We are also subscribed to updates from GitHub when our project dependencies are found to have a security consideration warranting upgrade.
  • Antivirus software is installed. We use Linux servers secured behind a firewall. These servers don't download and execute code from arbitrary web locations.
  • Configuration change control processes in place. We utilize Ansible to ensure we possess a versioned history of the decisions that shaped our server configuration. The NIST assessment seems to pertain specifically to possessing business documentation of configuration control, which would afford no benefit at our current company size.
  • Third-party stakeholders (e.g., suppliers, customers, partners) know their roles & responsibilities. We have not created an explicit policy for this yet as it's more convenient to communicate with our small set of shareholders directly. Small team benefits.
  • Security awareness training program available for all users. Developer code is audited by PR, and security conversations are a mainstay of our Slack communication; we also have a suite of documentation that describes known security considerations pertaining to specific areas of implementation.
  • Organizational communication and data flows are mapped. Once our company size reaches 100 we expect to have someone on staff to handle this.

Google Vendor Self-Assessment

GitClear has submitted and passed the Google Vendor Self-Assessment Quiz, which measures almost 100 different dimensions of application security. A copy of our completed VSAQ is available upon request.

Data Access Policy

Our policy is that no employee shall access customer data without the explicit permission granted via Settings -> "Allow GitClear employees to access data."

Pentest and Threat Scanning

We partner with a security services vendor to perform annual threat scanning of the GitClear website. Most recently, the WAS Web Application Report was used to assess the susceptibility of GitClear to attack over thousands of sustained access requests spanning several hours. The report found no significant security vulnerabilities in GitClear's infrastructure.

ISO 27001 (2022 edition) and SOC2 Certification

As of Q1 2024, GitClear has addressed the requisite controls to pass the international gold standard, an ISO 27001 external audit. We are currently progressing through the internal audit, and we expect to move on to the external ISO 27001 audit by Q2 2024.

We further have reached agreement with a service provider and with external auditors to undertake a SOC2 audit before the end of 2024.

Upon request, our auditors can produce a written progress letter to testify to our commitment to gaining these industry certifications in the near future. As the rest of this page attests, GitClear has been security-focused since its inception, which expedites the accreditation process.

Reporting an Issue

If you discover a security vulnerability, please email us at [email protected]. We guarantee a response within one business day. If you would like to send an encrypted message, please encrypt it using our public PGP key:

-----BEGIN PGP PUBLIC KEY BLOCK----- mQENBF4fv4EBCADXdJp6Js9cqplvShgFGzNAK8PtYfJhjtw+9vTBu+MR97w4x0Fl Q9VEbHSdF/OBLNDFszOeCEQ3SeHqoSOfsJMS8PO/ROP3SrUvRYWrIcOYUkR0sxg8 JChzT58S87fjGiZ1fuAyk+LKtb859JUzqGlDX2e2+baW6lPxXnv2Avl0cbLiv+a6 lVRmQzXBAsHgRWV3VoB3HLw+niKVy8UHH9Jgsc/Uyu9GPb7lh4FBFCVvgLES7mK9 0pq1rtHx9bCj9W+E9JLfvDpTvag0ldpQpcqKuYZMZcvmBRDA8SdTMkEXmUQUiRH6 pDUZxLX2sSm5B78SW4sqF/Xuhs+UFs/drtdRABEBAAG0IEJpbGwgSGFyZGluZyA8 YmlsbEBnaXRjbGVhci5jb20+iQFUBBMBCAA+FiEEoHNj2HR+aFrMpQEjMXVZ5QIa fBYFAl4fv4ECGwMFCQPCZwAFCwkIBwIGFQoJCAsCBBYCAwECHgECF4AACgkQMXVZ 5QIafBbUUQf/a4ODpt9kNgpIWVtTVKGSHlTctrcEjfLqi/1bFienvhrW2omvq3Pr qpYToUZ9qK1TfZcvnjh+0F5IWeiNhIl13kmII+19vbjri5iDNQ0up3tpMeu+hJX5 3RHT/3WwT3F6gI6f+r3iFlyy5xq+q6t7TitRPguuGQFm1VXabiTfNZxhLhPJQz6q 2GXR6fYgToc52CtIDMozNF3vmV7qeV4k5IOXmcN4yL51bf3muOGTYstmG4hZmJg4 NyCYOL5/0TxLMIPH9D1K4j3/0BdXBO9+9LdYD4HVLFG0PSGki/jrEdoEhWFJU9tO ffnep82IuRVTJb1R+vSnbYZBB/buWHKCqLkBDQReH7+BAQgA5xOyz6J1SrYFim5X ZXxblJfT4gwVhdfttugQ+p/7HpP9KlhEZhN+sVULR2A2nslFxMRtuk4Mtx1sWS/5 CyQmPFAHdITaig955pqighl5iv3ztDVgjdaUyveU/S3WTowDjKm/SP6H/PBFFL9G miHJcA6KTLDU4eqRT//68LG4L/Yvc2jVF1NTNRKs0ZpzRNbqWoSp3g4u3lrrgbVg 0LTbHx7tKh+eKYI/roHaKZ5cqlD5yYLB2rrEzmvXPK9szujZh+7OQ25K9ObsRsMB p/EMNnjEAAKXpwqG6t7zmJHca1Nhz+V1FVdJWCQp9cEsaQ8i3FeXF6Nj1SH3+SJP bya2LwARAQABiQE8BBgBCAAmFiEEoHNj2HR+aFrMpQEjMXVZ5QIafBYFAl4fv4EC GwwFCQPCZwAACgkQMXVZ5QIafBYTKAf+OYDxAvJ6fmAAuYakXqBs5I7ZjFJq7NnV iXXXv2G1lkd5FrbeExeE5Q41l4UaNMEYphjXEconoMGza2DJABZm94RihKNI09Gd v8W45RgfC7N05HSudLQYsHmLEdA6h+wPBKivEU72nM98cQKqi6byu6mbFgwCfxOg dHdKt9G4E4+2HkDJc4BYcAHTqgsaOdOhHaqtdQCkDErxSwNMQXOEIPrHPkWs2zeZ eRpzLx72FQ2J9zfJ46UWg6CmX7OE2vYByQGSVYKlDca0/HVhiIAM/sfLAZFXPVLO bQvE6UxikoQQZ/NF5ljcxkkm3qZm3P1/F7X5axZCxFze+iaPAwbHsg== =21ks

We are open to discussing payment to security testers who can present us with significant security vulnerabilities that we can verify exist in our product. Though we are still a relatively small company, we aim to make payouts in the $500-1000 range for valid issues brought to our attention. We are also open to publicly recognizing contributions made by security researchers who responsibly disclose improvement opportunities to us.

Footnotes

[1] This is to say, we never make a copy of the git contents of a repo. That said, it could be theoretically possible for the entire contents of a repo to end up in a single commit, in an edge case such as the repo's first commit.