Saturday, December 30, 2017

Pricing Cryptocurrencies Is Basically Impossible

This is NOT investment advice. This is a web developer, with no background in economics or finance, thinking out loud.

With the price of bitcoin skyrocketing, and other cryptocurrencies following it up, there's a pretty obvious question: what are cryptocurrencies actually worth?

Back in the day, Rick Falkvinge argued that bitcoin's killer app was international money transfers, and, therefore, that the total value of all bitcoin in circulation should be equal to the total value of the international money transfer market, which he estimated at 1% of total money in circulation. I liked this argument and found it compelling enough to buy a few bitcoins at $14 (which I then sold at around $60, because at the time I didn't take it seriously). But there are multiple problems with this argument.

The first is that there are many other cryptocurrencies in circulation, and bitcoin's value would have to be determined not only relative to its use case, but also relative to competing cryptocurrencies which were similarly well-suited for that use case. The second problem is that Falkvinge compared money currently in circulation with the total number of bitcoin expected to ever exist. The third problem is that this is basically a streetlight effect. There's no compelling argument that I'm aware of which makes the case that international money transfers are the only use case for Bitcoin specifically or cryptocurrencies in general. Logically, even if Falkvinge's argument is correct — which I don't know for sure — since there are probably other use cases, the total value of all cryptocurrency in circulation should be greater than the total value of the international money transfer market.

And again, that's the value of cryptocurrency in general, not bitcoin in particular.

Another, equally inconclusive way to estimate bitcoin's value is to look at how the market behaves when there are no compelling reasons to suspect that influential pump-and-dump scams are underway. In a YouTube video of a presentation given at an academic conference on blockchains at Berkeley in September, economist Neil Gandal charts the impact of two such pricing scams; bitcoin's subsequent surge in price also raises serious pump-and-dump questions. If you look at the time period free of pump-and-dump scams, however, you see a range from around $300 to $500 per bitcoin. A person could argue that this is the "true" price of bitcoin, but removing fraudulent influences from an analysis does not magically grant omniscience to the remaining market participants, especially in the context of an emerging technology with untested potential, so this argument is still very flawed.

A further compounding factor in any attempt to come up with a plausible price for cryptocurrencies is that the prices of cryptocurrencies in general go up whenever the price of bitcoin goes up. You might believe, for example, that blockchain applications hold sufficient potential to justify at least some optimism re: cryptocurrencies, and that Ethereum is a much more sophisticated platform for building these applications than Bitcoin. But you don't see the price of Ether go up when the price of bitcoin goes down. The market reflects surges in optimism and skepticism re: the general topic far more than relative evaluations of the different instances of the concept. And it's an apples to oranges comparison anyway; not only do the tokens exist in different quantities, but also, bitcoin has an absolute limit on the number of coins produced (unless the network changes its mind about that), while Ethereum plans to always release new Ether.

With all this, it's essentially impossible to invest responsibly in cryptocurrency, since the uncertainty is so high that there's no way to determine any plausible "correct" price for the stuff. Falkvinge's argument for a million-dollar bitcoin value was basically this: take 1% to 10% of the (loosely estimated) total money currently in supply worldwide, which he says is $60 trillion, and match it against the roughly 6 million bitcoins that will ever actually be in circulation. But this is apples to oranges again: a projected number in the future (bitcoin in circulation) vs. a number (very loosely) based on today's economy. And even if this math were flawless, coinmarketcap.com estimates that the current dollar value of cryptocurrency in circulation is $568,661,188,172.
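Spelled out, the arithmetic behind that million-dollar figure runs roughly like this (my reconstruction of his numbers, not a quote from him):

1% of $60 trillion = $600 billion; $600 billion across 6 million coins = $100,000 per bitcoin
10% of $60 trillion = $6 trillion; $6 trillion across 6 million coins = $1,000,000 per bitcoin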

So, using a modified version of Falkvinge's argument, if the total value of cryptocurrency should ultimately be 1% of $60 trillion, well, hooray, the future is already here. One percent of $60 trillion is $600,000,000,000, which is essentially how much money is in cryptocurrency right now — although that's including bitcoin prices which arguably appear to be artificially inflated via a scam involving Tether. Falkvinge's argument included an enormous fudge factor: basically, if bitcoin is ever used for anything above and beyond international money transfer, then its total share of the money supply could go up from 1% to as high as 10%. Although Falkvinge was by far the most disciplined thinker in bitcoin at the time, this is still not a super rigorous approach. He basically pulled that order of magnitude fudge factor straight out of his ass.

So it could be that we've reached the useful limit of cryptocurrency in circulation, and it could be that we're only one-tenth of the way there. It could be that we have the "correct" amount of money in the sector, but it's not allocated to the "correct" coins. Plenty of other interpretations exist, and might be true instead. This is just the degree of certainty we get from the only analysis I've seen that ever even tried to do any math; more mainstream commentators seem to go with "we don't know yet" and call it a day, and that does seem to me to be a more realistic approach. If the only math you can do involves made-up numbers, it's better to admit you don't have enough information to make a precise estimate.

The best lens I've seen for guessing at the values of cryptocurrencies comes from Twitter snark. A popular trend in snarking about bitcoin in particular is to say that it's like a 1990s dot-com stock, except without a company attached. This is supposed to be a diss, but it's easily the most flattering thing you can say about cryptocurrencies. If I could have bought Rails stock back in 2005, I would have. To me, the uncertainty in Internet stocks has always been the companies attached; the underlying open source technologies are much easier to evaluate. From that point of view, bitcoin looks like Java: it sucks, but people are going to build a lot of stuff with it. And Ethereum reminds me of Smalltalk: it's trying to build too much of a world around itself to ever really become what it wants to be, but it's still an impressive leap forward.

This is all very much not investment advice. But I am totally qualified to give you advice on how to talk about tech without sounding like a jackass, even if I haven't always followed that advice myself. So here's some advice in that vein: if you're a dev and you're curious about cryptocurrencies, read the whitepapers and look at the code. The dot-com crash was an overdue market correction that cleared out a bunch of trash companies and let the real leaders emerge, but plenty of complete idiots made a mint during the boom, and plenty of great technologies tanked in the crash. The only certainty in any of this is that randomness will play a part. But if you know which cryptocurrencies are better designed than others, it'll at least be a useful piece of the picture. It might make all the difference in the world.

Saturday, August 5, 2017

Drive Refactors with a Git Pre-Push Hook

When I'm writing code, if I see something I want to fix, I write a quick FIXME comment. When I'm working on my own projects, this is no big deal. But I don't like adding FIXMEs to code at work, or in open source repos. And really, I prefer not to keep them hanging around in my own repos, either. So I set up a git pre-push hook. It looks like this:

 ᕕ(ᐛ)ᕗ cat .git/hooks/pre-push
#!/bin/bash

if ag fixme . -i --ignore tmp/ --ignore log/ --ignore node_modules/ --ignore vendor/; then
  echo
  echo 'not pushing! clean up your fixmes'
  echo
  exit 1
else
  exit 0
fi

This code uses ag to make sure there are no FIXMEs in my code (the -i makes the search case-insensitive), skipping the tmp, log, and dependency directories, since those aren't my problem. If any FIXMEs turn up, the hook returns a Unix failure code, which interrupts the push; if none do, it returns a success code and the push proceeds.

In other words, if I tell git to push code which contains a FIXME, git effectively says no.

This worked well for a while. But soon, at work, somebody else added a FIXME to a repo. So I commented out my hook for a few days or weeks, and then, when I had a spare second, I fixed their FIXME. But soon enough, somebody else added a FIXME, and it was harder to fix. So I commented out the hook, and forgot all about it.

Eventually, though, I had to do some pretty complicated work, and I generated a lot of FIXMEs in the process. As an emergency hack, I uncommented the hook, did my main work, tried to push my branch, and got automatically chastised for trying to push a FIXME. So I went in, fixed them, and pushed the branch.

This is my routine now. If I'm dealing with a big enough chunk of work that I can't take a break to fix a FIXME, I just uncomment the hook for the period of time that I'm working on that topic branch. It's a really good way to stay on task while also putting together a to-do list for refactoring. You basically can't finish the branch until you go through the FIXME to-do list, but you're 100% free to ignore that to-do list, and focus on the primary task, until it's time to push the branch. So you can easily build refactoring into every topic branch without getting distracted (which is the primary risk of refactoring as you go).

It's probably pretty easy to modify this hook so that it only looks at unstaged changes, so I might do that later on, but it works pretty well as a habit already.
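I haven't written that version, but here's a rough sketch of one way it could look: scanning only the files changed on the current branch relative to its upstream, which isn't quite the same thing as unstaged changes, but is in the same spirit (other people's FIXMEs elsewhere in the repo can't block your push). Treat this as a guess at the shape of the thing, not battle-tested code:

#!/bin/bash

# hypothetical variation: only scan files this branch actually changed
# (assumes the branch has an upstream configured, so @{u} resolves)
base=$(git merge-base @{u} HEAD)

# --diff-filter=d skips deleted files, which ag couldn't open anyway
changed_files=$(git diff --name-only --diff-filter=d "$base" HEAD)

if [ -n "$changed_files" ] && echo "$changed_files" | xargs ag -i fixme; then
  echo
  echo 'not pushing! clean up your fixmes'
  echo
  exit 1
else
  exit 0
fi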

Sunday, June 4, 2017

Reddit Users Are A Subset Of Reddit Users

An interesting blog post on view counting at Reddit reminded me of an old post I wrote almost ten years ago. Reddit engineer Krishnan Chandra wrote:

Reddit has many visitors that consume content without voting or commenting. We wanted to build a system that could capture this activity by counting the number of views a post received.

He goes on to describe the scaling and accuracy problems in this apparently simple goal. But he never addresses a few really weird assumptions the Reddit team appears to have made: that every Reddit user has an account and is logged in; that every Reddit user has only one account; and that every account is only ever used by one person.

As I wrote back in the ancient days of 2008:

Assuming a one-to-one mapping between users and people utterly disregards practical experience and common sense. One human could represent themselves to your system with several logins. Several humans could represent themselves as one login. Both these things could happen. Both these things probably will happen.

In other words, "users" aren't real. Accounts and logins are real. The people who use those accounts and logins are also real. But the concept of "users" is a social fiction, and that includes the one-to-one mapping "users" implies between accounts and people. In some corners of the vast Reddit universe, creating more than one "user" per account is as common as it is in World of Warcraft.

As I say, this is all from a blog post back in 2008. More recently, in 2016, I decided that the best way to use Reddit (and Hacker News as well) is without logging in. In other words, the best use case is not to be a "user":

reading Reddit without logging in is much, much more pleasant than being a "user" of the site in the official sense. The same is true of Hacker News; I don't get a lot out of logging in and "participating" in the way that is officially expected and encouraged. Like most people who use Hacker News, I prefer to glance at the list of stories without logging in, briefly view one or two links, and then snark about it on Twitter...

neither of these sites seems to acknowledge that using the sites without being a "user" is a use case which exists. In the terms of both sites, a "user" seems to be somebody who has a username and has logged in. But in my opinion, both of these sites are more useful if you're not a "user." So the use case where you're not a "user" is the optimal use case.


On the Reddit blog, discussing all the technical aspects of view counting, Chandra said the team's goal was to make it easy for people to understand how much activity their posts were driving, and to include more passive activity like viewing without commenting. But there's this obvious logical disconnect here, where Reddit excludes from this analysis any user who isn't logged in. You could even make the case that they'd be better served just pulling the info from Google Analytics, although that'd be naive.

First, given the scale involved, there are likely to be enough accuracy issues that a custom solution would be a good idea in either case. Second, Reddit's view-counting effort made no attempt to identify when two or more accounts belonged to the same human being, or when two or more human beings shared the same login. An off-the-shelf analytics solution could probably provide insight into that, but not definitive answers.

My favorite explanation for all this is that it's just a simple institutional blind spot: people who work at Reddit are probably so used to thinking that people who don't log in "don't count" that it seemed perfectly reasonable to literally not count them. The concept of "users" is such a useful social fiction that the company's employees probably just genuinely forgot it was a social fiction at all (or never realized it in the first place). However, if your goal is to provide view counting with accuracy, skimming over these questions makes your goal impossible to achieve.

Monday, January 16, 2017

JS Testing In Rails Apps: The Problem Space

I've got to do several things before my new book, Modern Front-End Development with Ruby on Rails and Puppies, is finished. It's currently only in version 0.2! One of the biggest things is to delve into JS testing in serious depth. The "with Puppies" part of my book's title doesn't add any extra challenges to that task, but the "with Ruby on Rails" part totally does.

The Rails culture has very high standards and expectations when it comes to the sophistication and specificity of its testing options. Say you've got a big Rails app which has presenters, decorators, and serializers. That's three different categories of view model objects — and standard methods of testing these objects exist both for RSpec and for classic TDD syntax (i.e., the assert style of Test::Unit and minitest). So you can go out and find, for example, specific, opinionated guidance on how to write tests for presenters using minitest. And I've obviously defined a 3x2 grid here, with 3 categories of view model object, and 2 categories of testing strategy. All six slots in the grid represent a strain of thinking within the Rails culture that lots of people are working on in detail. If you want to research any one of those six slots, there's a ton of prior art.
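To make one slot of that grid concrete, here's a rough sketch of a presenter tested with minitest. The class and names are invented for illustration, not taken from any particular codebase or guide, but the shape is representative:

# app/presenters/order_presenter.rb (hypothetical)
class OrderPresenter
  def initialize(order)
    @order = order
  end

  # formatting logic lives here, not in the model and not in the view
  def display_total
    format("$%.2f", @order.total_cents / 100.0)
  end
end

# test/presenters/order_presenter_test.rb
require "minitest/autorun"
require "ostruct"

class OrderPresenterTest < Minitest::Test
  def test_display_total_formats_cents_as_dollars
    presenter = OrderPresenter.new(OpenStruct.new(total_cents: 1999))
    assert_equal "$19.99", presenter.display_total
  end
end

The example is trivial on purpose: because the presenter is a plain object that's cheap to instantiate, the test needs no database and no framework, and that's exactly the kind of tradeoff the prior art in each of those six slots spends its time debating.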

And that's just for view model objects. What about objects which move business logic out of the framework? Those can go in lib, or app/services, or app/interactors, or several other places. Want to test those quickly and well? There are projects which give you several different ways to do it. If you see open source development as a form of research, then you have a huge body of ongoing research here. All these strategies for separating business logic from the framework, and all the different ways of testing the objects which implement them, are ongoing research projects into the best way to structure and design an application, and/or secure it, and/or prove its correctness (depending on how you see the purpose of testing in the first place, which is itself an area of ongoing "research," in this sense).
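As a similarly hypothetical sketch (invented names again), here's the kind of plain Ruby object that paragraph is talking about, with an RSpec spec that requires only the object itself and therefore runs without booting Rails:

# app/services/price_calculator.rb (hypothetical)
class PriceCalculator
  def initialize(base_cents:, tax_rate:)
    @base_cents = base_cents
    @tax_rate = tax_rate
  end

  # plain Ruby: no ActiveRecord, no Rails, nothing to stub
  def total_cents
    (@base_cents * (1 + @tax_rate)).round
  end
end

# spec/services/price_calculator_spec.rb
require_relative "../../app/services/price_calculator"

RSpec.describe PriceCalculator do
  it "adds tax to the base price" do
    calculator = PriceCalculator.new(base_cents: 1000, tax_rate: 0.1)
    expect(calculator.total_cents).to eq(1100)
  end
end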

The common structure of Rails apps creates a shared vocabulary, not just for application logic, but for testing as well. You can have subtle, specific discussions about how to compartmentalize, organize, and structure responsibilities and processes across Rails apps, even when those apps do very different things.

Compare that to JavaScript. How many JavaScript applications share a common structure? Not many.

The size of the Ruby community also helps here, in comparison to JavaScript. "The JavaScript community" basically means the same thing as "every programmer on the planet who ever works on or near the web." In fact, it's larger than that; if you want to script Adobe applications, for example, you use JavaScript. So for Ruby, you have a small community, where the overwhelming majority of programmers work within the same application structure, and where virtually everybody agrees that testing is important. For JavaScript, you have a group of people which is far too large to ever function as an actual community. Most applications have relatively idiosyncratic structures, and many of them change radically from month to month. What kind of consensus can emerge there?

The good news is that there are plenty of JS programmers who value testing, and plenty of JS test frameworks, to enable that. But the bad news is that many of these testing systems never get much further than unit testing. There's no Capybara of JavaScript, for example — not really — which is pretty crazy, because front-end coding is the type of work where you would expect to find a Capybara being developed.

Say you've got a Rails app with a React front-end. Maybe you're using Rails controllers for the regular web stuff and Grape for the API stuff that your React code consumes. So you can put your Grape specs in spec/requests and write simpler specs, and keep your controller specs in the more usual spec/controllers, and deal with the unfortunate downsides, i.e., the slowness, the intricate connection(s) to the framework, et cetera. But when you want to test your JS, you don't get these kinds of fine-grained distinctions for free. You can use moxios to mock your axios requests, but that's kind of as sophisticated as it gets. You have unit testing, and you have mocks, and there you go. It's like that scene in The Blues Brothers, when they ask "what kind of music do you usually have here?" and the bartender says, "we got both kinds of music, country and western!"
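For contrast, the spec/requests side of that distinction looks roughly like this in Rails. The endpoint, model, and field names here are invented for illustration:

# spec/requests/api/orders_spec.rb (hypothetical endpoint)
require "rails_helper"

RSpec.describe "GET /api/orders/:id" do
  it "returns the order as JSON" do
    order = Order.create!(total_cents: 1999)

    get "/api/orders/#{order.id}"

    expect(response).to have_http_status(:ok)
    expect(JSON.parse(response.body)["total_cents"]).to eq(1999)
  end
end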

With JS, there are many ways to do mocks and stubs, and there are many ways to do unit testing, but there's nothing as purpose-specific as spec/requests vs spec/controllers. You're also flying blind somewhat when it comes to distinctions as fine and precise as when to use a presenter, when to use a decorator, and when to use a serializer (which is basically just a presenter for APIs).

One of the benefits of something like React is that it's an improvement just to get a sub-community (or subculture) of JS devs who all use roughly the same application structure, because that common vocabulary permits the subculture to develop more sophisticated and specific testing libraries. As far as that goes, Angular and Vue and other frameworks have a similar positive influence. However, this is especially a strength of React in my opinion, and in fact, I need to update my book to reflect that. For some reason, when people talk about React, they talk a lot about FRP and virtual DOMs, but when you get past the hand-waving, the actual day-to-day work of React is very object-oriented. There are a lot of tradeoffs to consider with OOP, but one great positive is that OOP systems present very testable surfaces. It's really easy to see where you start testing, when you're dealing with objects.

Elm has this "common application structure" advantage as well, but with Elm, you have types, and types relieve a lot of the pressure on tests. That's not to say that you don't need tests when you have types. But since the compiler is guaranteeing you certain behavior, you don't need tests to secure that behavior. People use tests to design systems, to secure systems (and prevent regressions), to document systems, and for a few other purposes. Elm's types make it easy to get most of your security cheaply. Elm also pushes you towards better designs, in subtler ways that I can't perfectly articulate just yet.

However, in both these cases, this isn't a solution, but rather a reason to be optimistic that the cacophony which has prevented solutions from arising will soon diminish, at least somewhat.

I'll have more to say about this in the future; for now, this is just an overview of how the problem space for JS testing in Rails apps differs quite a bit from Ruby testing in Rails apps.