Pluralsight Flow (née GitPrime) vs GitClear vs Pinpoint vs Code Climate

Comparing top-tier alternatives for code development KPIs and metrics by the data sources they use

Updated February 8, 2021

Whether you're an engineering manager, CEO, or developer, 2021 is a great time to consider the benefits that code metrics could offer your team. The "developer metrics" space (alternatively called "engineering intelligence") has been thriving since 2016, when GitPrime and GitClear began development. Since then, Pluralsight purchased GitPrime for $170m and companies like Pinpoint have raised $15m in VC capital. These gaudy numbers reflect the growing popularity of using data to drive up engineering throughput.
The rise of these solutions makes sense when you consider the economics. A 100-member engineering team in the U.S. will cost millions annually -- let's estimate $20m. That's a lot of money to spend blindly. For less than 1% of that amount, any of these four developer metric processors can impart actionable ideas to optimize that huge resource investment. Managers regularly report gains of 10-20% in throughput when adopting these tools. A little data can go a long way toward recognizing the circumstances that maximize team output.
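To make that arithmetic concrete, here's a minimal back-of-the-envelope sketch in Python. The team cost, tool price, and throughput gain are the illustrative figures from the paragraph above, not quotes from any particular provider:

```python
# Back-of-the-envelope ROI, using the illustrative figures above.
team_size = 100
cost_per_dev = 200_000                               # assumed fully loaded annual cost per developer
annual_team_cost = team_size * cost_per_dev          # $20m, as estimated above
tool_cost = 0.01 * annual_team_cost                  # "less than 1%" of team spend = $200k/year
throughput_gain = 0.10                               # conservative end of the reported 10-20% range

value_of_gain = throughput_gain * annual_team_cost   # $2,000,000 of extra output
print(f"Tool spend: ${tool_cost:,.0f}")
print(f"Value of a {throughput_gain:.0%} throughput gain: ${value_of_gain:,.0f}")
print(f"Return multiple: {value_of_gain / tool_cost:.0f}x")
```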
Since "price" is the most common starting point that newcomers want to use when orienting themselves in this ecosystem, we'll begin with an apples-to-apples comparison of the cost of the four top-tier code metric providers [1]. We will then investigate the three data sources that sit quietly behind the ongoing explosion of developer performance data, noting the pros and cons of each.

Pricing Comparison

As of September 2020, here are the prices you'd pay to use each provider:
Pricing comparison of each code data processor's cheapest plan. Source: publicly visible pricing pages.
As this table illustrates, the differences in "cost per developer" among the top-tier providers are substantial. GitClear offers lower costs, especially for small teams that want to get started without extended (and expensive) hand-holding from a sales team.

What lurks beneath 100+ metrics? Three data sources

If pricing alone doesn't single out a clear favorite, what does?
Beneath the hundreds of names the four data providers attach to their metrics, there lurk only three sources from which the data is pulled. Those sources are: lines of code, issues => pull requests, and commits.
A fast and informative way to compare Pluralsight alternatives is to consider how they transform these data sources into the metrics they consider "important." Below are the major metrics touted by Pluralsight alternatives, grouped by data source:
  • Lines of Code. Popular metrics derived from Lines of Code include "Impact" aka "Diff Delta", "Code Churn," and the inverse of Code Churn, "Efficiency."
  • Issues => Pull Requests. "Cycle time," "Issue Throughput," and "Lead Time" are common names for metrics derived from this data source.
  • Commits. "Active Days" (aka "Coding Days"), "Time to Open," and "Big/Risky" work designation come from here.
We can (and do) quibble elsewhere about whether these three data sources are reliable enough to drive career-impacting choices. For now, we'll make use of this simplified grouping to ask how the data can be noisy or polluted, and how the providers compare across the data sources they use.

Metrics Mapped to Data Sources

To compile the following data, our team pored over hundreds of help pages, glossaries, and blog posts at each data processor. We stitched all this data together into a concept we're calling "Metric Maps" [2]. A Metric Map simply names the metric that a provider advertises, and maps it back to the data source it came from.
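For readers who prefer code to prose, here is a minimal sketch of the idea behind a Metric Map, expressed as a plain Python dictionary. The structure is our illustration (not any provider's schema), and the entries are drawn from the metric names listed earlier:

```python
# A Metric Map, reduced to its essence: each advertised metric points back to
# the data source it is derived from. (The real maps also record which provider
# documents each metric, with a link to that documentation.)
metric_map = {
    "Impact / Diff Delta": "Lines of Code",
    "Code Churn":          "Lines of Code",
    "Efficiency":          "Lines of Code",
    "Cycle Time":          "Issues => Pull Requests",
    "Issue Throughput":    "Issues => Pull Requests",
    "Lead Time":           "Issues => Pull Requests",
    "Active Days":         "Commits",
    "Time to Open":        "Commits",
}

# Group metrics by their underlying data source, as the maps below do.
by_source = {}
for metric, source in metric_map.items():
    by_source.setdefault(source, []).append(metric)
print(by_source)
```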

Data Source: Lines of Code

Among the three data sources, "Lines of Code" (henceforth LoC) is the most signal-rich. This takeaway becomes apparent as one probes the shortcomings of the other choices. But tapping into LoC requires serious precision. It isn't a metric that gives up its insights without a good fight.
The care needed to extract reliable takeaways from LoC is reflected by the lengthy list of "Data Quality Factors" that sprawl down the right side of the Metric Map. We count at least nine desirable factors that a data processor should incorporate when processing lines of code, shown in the map below:
Metrics that derive from Lines of Code. Interactive version available here.
After GitClear, Pluralsight is the clear runner-up in terms of effort spent sanitizing lines of code. Its help page listing the specific factors it considers shows that it checks many of the boxes necessary to interpret lines reliably.
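To appreciate why that sanitization matters, consider how crude the raw data source is. The sketch below (our illustration, not any provider's pipeline) tallies raw lines added and deleted per author straight from git log; every data quality factor a processor accounts for is a refinement layered on top of output this blunt:

```python
import subprocess
from collections import defaultdict

def raw_loc_by_author(repo_path="."):
    """Tally raw lines added/deleted per author email from `git log --numstat`.

    This is the unrefined data source: it counts moved code, generated files,
    and whitespace churn exactly the same as hand-written logic, which is why
    serious processors apply many correction factors before reporting anything.
    """
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--numstat", "--pretty=format:@@%ae"],
        capture_output=True, text=True, check=True,
    ).stdout

    totals = defaultdict(lambda: {"added": 0, "deleted": 0})
    author = None
    for line in out.splitlines():
        if line.startswith("@@"):
            author = line[2:]              # commit header carries the author email
        elif line.strip():
            added, deleted, _path = line.split("\t", 2)
            if added != "-":               # "-" marks binary files in numstat output
                totals[author]["added"] += int(added)
                totals[author]["deleted"] += int(deleted)
    return dict(totals)

if __name__ == "__main__":
    for author, counts in raw_loc_by_author().items():
        print(f"{author}: +{counts['added']} / -{counts['deleted']}")
```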

Data Sources: Issues and Pull Requests

After Lines of Code, the next most promising data sources to be mined for engineering insights are "Issues" and "Pull Requests." We group these terms together as a data source because, collectively, they describe the process by which Product Managers get their ideas implemented. When these data sources are at their best, they're measuring collaboration.
All four providers offer the basic ability to track how many issues are resolved per developer. All encourage teams to submit small pull requests. All measure how eagerly developers review pull requests that are assigned to them. Here are the metrics this data source informs:
Metrics that derive from issue tracking (Jira) and pull requests. Interactive version available here.
There is no shortage of metrics purporting to deliver wisdom from issue/PR data. The struggle with these metrics is how to normalize the data within -- let alone between -- the teams and regions that use it.
This is why "Issues resolved" was recently featured as one of the four worst developer metrics. It's inevitable that the magnitude of work associated with an issue will vary by more than 10x -- go look at the time required by your 10 most recently resolved issues if you doubt it. Absent a calibration mechanism (a la story points, which can at least normalize within a team), any metrics that rely on this data source will be polluted by variance.
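To illustrate both the metric and the variance problem, here is a minimal sketch that computes per-issue cycle time and then normalizes it by story points. The issue records and field names are hypothetical, for illustration only; in practice this data would come from a tracker like Jira:

```python
from datetime import datetime
from statistics import median

# Hypothetical issue records for illustration; field names are assumptions.
issues = [
    {"key": "APP-101", "started": "2021-01-04", "resolved": "2021-01-05", "points": 1},
    {"key": "APP-102", "started": "2021-01-04", "resolved": "2021-01-15", "points": 8},
    {"key": "APP-103", "started": "2021-01-06", "resolved": "2021-01-08", "points": 3},
]

def cycle_days(issue):
    fmt = "%Y-%m-%d"
    return (datetime.strptime(issue["resolved"], fmt)
            - datetime.strptime(issue["started"], fmt)).days

raw_cycle_times = [cycle_days(i) for i in issues]                 # uncalibrated
per_story_point = [cycle_days(i) / i["points"] for i in issues]   # calibrated within the team

print("Median cycle time (days):", median(raw_cycle_times))
print("Median days per story point:", median(per_story_point))
# Raw cycle times above span 1 to 11 days -- the >10x variance described in the
# text -- while days-per-point varies far less once issue size is accounted for.
```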
On the plus side, issue => pull request metrics add indisputable value in reducing the delay to provide or address pull request comments. All of the data processors have some means of spotlighting developers whose collaborative spirit could be improved.

Data Source: Commits

We're not going to spend too many words on this section. Commits are a dicey data source, and any metric that is informed by them should be viewed skeptically.
The problem with commits as a metric is explained in greater depth in the previously mentioned worst developer metrics blog post. The short version of the problem is: commits have no intrinsic signal. They don't correspond to any particular truth. They serve only to measure how often a developer prefers to save their work. Thus, using commits as a data source is at best misleading.
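The sketch below shows how little processing it takes to produce a commit-derived metric like "Active Days" -- a few lines against git log. This is our illustration, not any provider's implementation, but it hints at why these numbers carry so little intrinsic signal:

```python
import subprocess
from collections import defaultdict

def active_days(repo_path="."):
    """Count distinct days with at least one commit, per author email.

    Note how shallow this is: committing a trivial change each morning, or
    batching a week of work into one commit, changes the number without
    changing the work -- the "gameability" problem described in the text.
    """
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--pretty=format:%ae %ad", "--date=short"],
        capture_output=True, text=True, check=True,
    ).stdout

    days_by_author = defaultdict(set)
    for line in out.splitlines():
        author, day = line.rsplit(" ", 1)
        days_by_author[author].add(day)
    return {author: len(days) for author, days in days_by_author.items()}

if __name__ == "__main__":
    print(active_days())
```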
At worst, using commits as a metric can become a wedge that drives a team of developers apart. Since commits are the most trivially "gameable" [3] of the three data sources, a dev team that knows it is being measured by commits can reasonably wonder whether colleagues will change their commit patterns to excel on this most synthetic of measurements.
Metrics that derive from commits. Interactive version available here.
For these reasons, we're not showing the Metric Map for commits by default. You're welcome to click the link above if you're itching to know what providers purport to extract from how often committers choose to save their work.

Closing thoughts

GitClear, Pluralsight Flow, and Code Climate Velocity collectively tap three data sources to help you understand what's happening on a development team. Picking a provider with a thoughtful approach to its data sources is essential if you want to rely on the resulting metrics to drive decision-making. The evidence reviewed here suggests that lines of code are the most signal-rich data source, given the normalization challenges that hinder the reliability of issues and pull requests.
If you'd like to learn more about how GitClear calculates Diff Delta, or the average range of daily Diff Delta, click here.

Footnotes

[1] If interpreted broadly, there are perhaps 10 different products that could be said to deliver "developer metrics" or "engineering intelligence." Selecting which of these are "top tier" is admittedly a judgement call. The sources informing our judgement were a combination of A) popular sentiment / user adoption, B) documentation available, and C) business transparency. Waydev, in particular, might be knocking on the door of being credibly called a top-tier provider, if not for shenanigans like plagiarizing Pluralsight Flow's 'How Impact Works' page in their corresponding help document.
[2] The general format of the metric maps is that, if we could find any documented reference where a provider mentions a metric, we'd include their logo under the metric's name. Likewise, for the "Data Quality Factors" side of the map, if we could find any blog or help page where the data processor mentions that they incorporate a particular factor, we added their logo with a link to the page where the factor was mentioned (click through to the "interactive" version of the metric map to get access to the backlinks). Because our research process relies on the data processor to report which data quality factors they consider, we might erroneously be withholding credit for a data quality consideration if we couldn't find the documentation about it. The view of the editorial board is that this is a feature: if your product supports amazing functionality, it's incumbent on you to report that in a Google-able way. Note that we apply this standard to ourselves as well. We've avoided including the GitClear logo except where we have a page we can link to substantiate the logo. If you represent one of these products and have additional links to show that you recognize a particular metric or data quality factor, email hello -at- gitclear.com and we'll get the metric map updated accordingly. We take seriously the opportunity to showcase the strengths and weaknesses of each product.
[3] That is, it's trivial for any developer to double their commit count by changing their personal preferences such that they dedicate a few minutes per day to gaming this synthetic metric.