Rusty compares providers

Pluralsight (nee GitPrime) vs GitClear vs Pinpoint vs Code Climate

Comparing top-tier alternatives for code development KPIs and metrics by the data sources they use

Published January 2, 2020

Whether you're an engineering manager, CEO, or developer, 2020 is a great time to consider the benefits that code metrics could offer your team. The "developer metrics" space (also called "engineering intelligence") has been thriving since 2016, when GitPrime and GitClear began development. Since then, GitPrime was purchased for $170m and companies like Pinpoint have raised $15m in VC capital. These gaudy numbers reflect the growing popularity of using data to drive up engineering throughput.
The rise of these solutions makes sense when you consider the economics. A 100-member engineering team in the U.S. will cost millions annually -- let's estimate $20m. That's a lot of money to spend blindly. For less than 1% of that amount, any of these four developer metric processors can impart actionable ideas to optimize that huge resource investment. Managers regularly report gains of 10-20% in throughput when adopting these tools. A little data can go a long way toward recognizing the circumstances that maximize team output.
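To put rough numbers on that claim, here's a back-of-the-envelope sketch using the illustrative figures above (a $20m annual team cost and the low end of the reported 10-20% gains); the per-seat price is an assumed ballpark, not a quote from any vendor's pricing page.

```python
# Back-of-the-envelope ROI sketch. All figures are illustrative estimates:
# the $20m team cost and 10% gain come from the paragraph above; the $40
# per-developer monthly price is an assumed ballpark, not a vendor quote.
team_size = 100
annual_team_cost = 20_000_000          # USD
tool_cost_per_dev_month = 40           # USD, assumed
throughput_gain = 0.10                 # low end of the reported 10-20% range

annual_tool_cost = tool_cost_per_dev_month * team_size * 12
value_of_gain = annual_team_cost * throughput_gain

print(f"Annual tool cost: ${annual_tool_cost:,} "
      f"({annual_tool_cost / annual_team_cost:.1%} of team cost)")
print(f"Value of a {throughput_gain:.0%} throughput gain: ${value_of_gain:,.0f}")
# Annual tool cost: $48,000 (0.2% of team cost)
# Value of a 10% throughput gain: $2,000,000
```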
Since "price" is the most common starting point that newcomers want to use when orienting themselves in this ecosystem, we'll begin with an apples-to-apples comparison of the cost of the four top-tier code metric providers [1]. We will then investigate the three data sources that sit quietly behind the ongoing explosion of developer performance data, noting the pros and cons of each.

Pricing Comparison

As of January 2020, here are the prices you'd pay to use each provider:
[Figure: Pricing comparison of each code data processor's cheapest plan. Source: publicly visible pricing pages.]
As this table illustrates, the differences in "cost per developer" among the top-tier providers are pretty negligible. Unless you're a small team (in which case GitClear has a distinct pricing advantage), price ought not to be the deciding factor in which product you select.

What lurks beneath 100+ metrics? Three data sources

If pricing doesn't reveal a clear favorite, what does?
Beneath the hundreds of names the four data providers attach to their metrics, there lurk only three sources from which the data is pulled. Those sources are: lines of code, issues => pull requests, and commits.
A fast and informative way to compare Pluralsight alternatives is to consider how each transforms these data sources into the metrics it deems "important." Below are the major metrics touted by the providers, grouped by data source (a short code sketch restating this grouping follows the list):
  • Lines of Code. Popular metrics derived from Lines of Code include "Impact" aka "Line Impact", "Code Churn," and the inverse of Code Churn, "Efficiency."
  • Issues => Pull Requests. "Cycle time," "Issue Throughput," and "Lead Time" are common names for metrics derived from this data source.
  • Commits. "Active Days" (aka "Coding Days"), "Time to Open," and "Big/Risky" work designation come from here.
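As promised above, here is the same grouping restated as a tiny lookup table. It's purely a reader's aid built from the bullets above, not any provider's internal schema.

```python
# The three data sources, mapped to the marketed metric names each commonly
# powers (restating the grouping above; not any provider's internal schema).
METRIC_SOURCES = {
    "lines_of_code": ["Impact / Line Impact", "Code Churn", "Efficiency"],
    "issues_and_pull_requests": ["Cycle Time", "Issue Throughput", "Lead Time"],
    "commits": ["Active Days / Coding Days", "Time to Open", "Big/Risky work"],
}

def source_of(metric_name: str) -> str:
    """Return the data source a marketed metric name traces back to."""
    for source, metrics in METRIC_SOURCES.items():
        if any(metric_name.lower() in m.lower() for m in metrics):
            return source
    return "unknown"

print(source_of("Code Churn"))  # -> lines_of_code
print(source_of("Lead Time"))   # -> issues_and_pull_requests
```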
We can (and do) quibble elsewhere about whether these three data sources are reliable enough to drive career-impacting choices. For now, we'll make use of this simplified grouping to ask how the data can be noisy or polluted, and how the providers compare across the data sources they use.

Metrics Mapped to Data Sources

To compile the following data, our team pored over hundreds of help pages, glossaries, and blog posts at each data processor. We stitched all this data together into a concept we're calling "Metric Maps." Metric Maps simply name the metric that a provider advertises, and map it back to the data source from whence it came.
To choose which metrics to feature in these Metric Maps [2], we leaned heavily on blog posts by the CEOs of GitPrime/Pluralsight, GitClear, and Pinpoint, which signal the metrics each provider considers most important.

Data Source: Lines of Code

Among the three data sources, "Lines of Code" (henceforth LoC) is the most signal-rich. This takeaway becomes apparent as one probes the shortcomings of the other choices. But tapping into LoC requires serious precision. It isn't a data source that gives up its insights without a fight.
The care needed to extract reliable takeaways from LoC is reflected in the lengthy list of "Data Quality Factors" that sprawls down the right side of the Metric Map. We conclude that there are at least nine desirable factors a data processor should incorporate when processing lines of code. They are shown below:
[Figure: Metrics that derive from Lines of Code. Interactive version available here.]
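To make the idea of a "Data Quality Factor" concrete, here is a minimal sketch of two such filters a LoC processor might apply before counting lines: skipping whitespace-only lines and excluding generated or vendored files. The file patterns and example diff are our illustrative assumptions, not a reconstruction of any provider's documented pipeline.

```python
import re

# Illustrative noise filters a LoC processor might apply before counting lines.
# These two factors (whitespace-only changes, generated/vendored files) are
# assumptions for the sketch, not a specific provider's documented pipeline.
GENERATED_PATH_PATTERNS = [r"package-lock\.json$", r"\.min\.js$", r"^vendor/", r"^dist/"]

def is_generated(path: str) -> bool:
    return any(re.search(p, path) for p in GENERATED_PATH_PATTERNS)

def meaningful_lines(changed_lines: list[str]) -> int:
    """Count changed lines that aren't blank or whitespace-only."""
    return sum(1 for line in changed_lines if line.strip())

def count_loc(diff: dict[str, list[str]]) -> int:
    """diff maps file path -> list of changed lines (added or modified)."""
    return sum(
        meaningful_lines(lines)
        for path, lines in diff.items()
        if not is_generated(path)
    )

example_diff = {
    "src/billing.py": ["def invoice(total):", "    return total * 1.2", ""],
    "package-lock.json": ["\"lodash\": \"4.17.15\","] * 500,  # excluded as generated
}
print(count_loc(example_diff))  # -> 2
```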
By the looks of it, the years of work GitClear has dedicated to rinsing noise from LoC pay off in how broadly it utilizes LoC throughout its metrics. The GitClear LoC processing engine allows it to deliver insights spanning from "tech debt" to "most productive hours of the week" (not to mention a cohort report) that data processors with less precise LoC handling struggle to provide.
After GitClear, Pluralsight is the clear runner-up in terms of productive effort spent sanitizing lines of code. Their help page listing the specific factors they consider shows that they check many of the boxes necessary to interpret lines reliably.

Data Sources: Issues and Pull Requests

After Lines of Code, the next most promising data sources to be mined for engineering insights are "Issues" and "Pull Requests." We group these terms together as a data source because, collectively, they describe the process by which Product Managers get their ideas implemented. When these data sources are at their best, they're measuring collaboration.
All four providers offer the basic ability to track how many issues each developer resolves. All encourage teams to submit small pull requests. All but Pinpoint measure how eagerly developers review the pull requests assigned to them. Here are all the metrics this data source informs:
[Figure: Metrics that derive from issue tracking (Jira) and pull requests. Interactive version available here.]
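To ground what "Cycle Time" typically means in these products, here is a minimal sketch of one common definition: elapsed time from when work starts on an issue to when its pull request merges. Providers draw the start and end boundaries differently, and the records below are invented for illustration.

```python
from datetime import datetime
from statistics import median

# Minimal cycle time sketch: issue work start -> pull request merge.
# Boundary definitions vary by provider; these timestamps are invented.
work_items = [
    {"issue": "ENG-101", "started": "2019-12-02T09:00", "pr_merged": "2019-12-03T16:30"},
    {"issue": "ENG-102", "started": "2019-12-02T10:00", "pr_merged": "2019-12-10T11:00"},
    {"issue": "ENG-103", "started": "2019-12-05T14:00", "pr_merged": "2019-12-06T09:15"},
]

def cycle_time_hours(item: dict) -> float:
    start = datetime.fromisoformat(item["started"])
    end = datetime.fromisoformat(item["pr_merged"])
    return (end - start).total_seconds() / 3600

hours = [cycle_time_hours(i) for i in work_items]
print(f"median cycle time: {median(hours):.1f}h")  # ENG-102 skews the mean, not the median
```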
There is no shortage of metrics purporting to deliver wisdom from issue/PR data. Pinpoint single-handedly delivers nearly 15 distinct metrics that tap Jira as the data source. The struggle with these metrics is how to normalize the data within -- let alone between -- the teams and regions that use it.
This is why "Issues resolved" was recently featured as one of the four worst developer metrics. It's inevitable that the magnitude of work associated with an issue will vary by more than 10x -- go look at the time required by your 10 most recently resolved issues if you doubt it. Absent a calibration mechanism (a la story points, which can at least normalize within a team), any metrics that rely on this data source will be polluted by variance.
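Here is a small invented example of that variance problem, and of how story-point weighting (where a team uses it) changes the picture:

```python
# Invented example: raw issue counts vs. story-point-weighted throughput.
# Illustrates how calibration narrows the >10x variance in issue size.
resolved = {
    "dev_a": [1, 1, 1, 1, 1, 1, 1, 1],   # eight small tickets (1 point each)
    "dev_b": [8, 13],                     # two large tickets
}

for dev, points in resolved.items():
    print(f"{dev}: {len(points)} issues resolved, {sum(points)} story points")
# dev_a: 8 issues resolved, 8 story points
# dev_b: 2 issues resolved, 21 story points
# Counting issues alone says dev_a did 4x more; weighting by points flips the picture.
```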
On the plus side, issue => pull request metrics are indisputably valuable for reducing the delay in providing or addressing pull request comments. All of the data processors have some way of spotlighting developers whose collaborative spirit could be improved.

Data Source: Commits

We're not going to spend too many words on this section. Commits are a dicey data source, and any metric that is informed by them should be viewed skeptically.
The problem with commits as a metric is explained in greater depth in the previously mentioned worst developer metrics blog post. The short version of the problem is: commits have no intrinsic signal. They don't correspond to any particular truth. They serve only to measure how often a developer prefers to save their work. Thus, using commits as a data source is at best misleading.
At worst, using commits as a metric can become a wedge that drives a team of developers apart. Since commits are the most trivially "gameable" [3] of the three data sources, a dev team that knows it is being measured by commits can reasonably wonder whether colleagues will change their commit patterns to excel on this most synthetic of measurements.
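An invented illustration of how trivially the commit count can be inflated: the same net change, recorded as one commit or as a dozen, looks identical in the code but very different to any metric that counts commits.

```python
# Invented illustration: the same day of work, saved two different ways.
# The net diff is identical; only the commit count changes.
one_commit   = [{"message": "Add invoice export", "net_lines_changed": 120}]
many_commits = [{"message": f"wip {i}", "net_lines_changed": 10} for i in range(12)]

for history in (one_commit, many_commits):
    commits = len(history)
    net = sum(c["net_lines_changed"] for c in history)
    print(f"{commits:2d} commits, {net} net lines changed")
# ->  1 commits, 120 net lines changed
# -> 12 commits, 120 net lines changed
```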
[Figure: Metrics that derive from commits. Interactive version available here.]
For these reasons, we're not showing the Metric Map for commits by default. You're welcome to click the link above if you're itching to know what providers purport to extract from how often committers choose to save their work.

Top Code Metrics: In the CEOs' own words

We used blog posts by each provider's CEO as one data point to interpret which metrics each provider deems "most important." More detail on the metric names we highlight, and on our methodology, is available here.
The CEO of GitPrime (now at Pluralsight) writes, "In our experience, we’ve found the following five developer metrics are essential for all software managers," listing lead time, churn, impact, active days, and efficiency. Money quote: "We suggest focusing on these particular metrics because you can’t track everything, and not every measurement is a key metric."
Meanwhile, the CEO of GitClear writes that they rely on a "single, reliable metric" called Line Impact to drive all other metrics. Money quote: "Spreading data across many metrics makes it impossible to look at data across secondary dimensions. For example, if you want to know who's the most prolific test writer, you can rank all developers' Line Impact on the secondary dimension of 'test code written.' If you want to compare code velocity, you simply combine the primary metric with 'time passed.' None of this is any use if you don't have a reliable metric serving as your primary axis."
At Pinpoint, the CEO calls out five metrics: backlog change, throughput, cycle time, workload balance, and defect ratio. Money quote: "Imagine software engineering as a pipeline. Ideas come in, and what goes out is (hopefully) quality software to solve a problem or create opportunities for the business. Framed this way, what business leaders want to know about engineering is: how much do we get through the pipeline, how quickly do we do it, how good is what comes out?"
Finally, Code Climate's CEO hasn't proffered an opinion, but their Head of Product Marketing writes (and illustrates) that their visualizations are driven by coding days, pushes per day, time to open, rework, pull request throughput, and cycle time. Money quote: "[Surfacing Issues] is the category in which the two analytics tools differ most. Velocity, with PR-related metrics at the core of the product, does a better job drawing attention (inside and outside of the app) to actual artifacts of work that could be stuck or problematic. GitPrime, with mostly people-focused metrics, draws attention to contributors who could be stuck or problematic."
Here's how these CEOs' preferred metrics map back to data source:
[Figure: Summary of data sources underlying CEO-preferred metrics, shown overall and by provider. Read more about methodology.]
It looks like a relatively even split among the data sources utilized. "Issues" edges out "Lines of Code" and "Commits" since it is the data source underlying all five metrics highlighted by Pinpoint's CEO.

Footnotes

[1] If interpreted broadly, there are perhaps 10 different products that could be said to deliver "developer metrics" or "engineering intelligence." Selecting which of these are "top tier" is admittedly a judgment call. The sources informing our judgment were a combination of A) popular sentiment and user adoption, B) available documentation, and C) business transparency. Waydev, in particular, might be knocking on the door of being credibly called a top-tier provider, if not for shenanigans like plagiarizing GitPrime's 'How Impact Works' page in their corresponding help document.
[2] The general format of the metric maps is that, if we could find any documented reference where a provider mentions a metric, we'd include their logo under the metric's name. Likewise, for the "Data Quality Factors" side of the map, if we could find any blog or help page where the data processor mentions that they incorporate a particular factor, we added their logo with a link to the page where the factor was mentioned (click through to the "interactive" version of the metric map to get access to the backlinks). Because our research process relies on the data processor to report which data quality factors they consider, we might erroneously be withholding credit for a data quality consideration if we couldn't find documentation about it. The view of the editorial board is that this is a feature: if your product supports amazing functionality, it's incumbent on you to report that in a Google-able way. Note that we apply this standard to ourselves as well. We've avoided including the GitClear logo except where we have a page we can link to substantiate the logo. If you represent one of these products and have additional links to show that you recognize a particular metric or data quality factor, email hello -at- gitclear.com and we'll get the metric map updated accordingly. We take seriously the opportunity to showcase the strengths and weaknesses of each product.
[3] That is, it's trivial for any developer to double their commit count by changing their personal preferences such that they dedicate a few minutes per day to gaming this synthetic metric.