Elmord's Magic Valley

Software, lingüística e rock'n'roll. Às vezes em Português, sometimes in English.

Some impressions about Go

2020-12-13 10:06 +0000. Tags: comp, prog, golang, in-english

A couple of months ago, I decided to rewrite the implementation of Fenius from scratch… in Go. I’ve also been working on a web project in Go at work. In this post, I write some of my reflections about the language.

First, a bit of a disclaimer. This post may end up sounding too negative; yet I chose to write the implementation of Fenius in Go for a reason, and I don’t regret this decision. Therefore, despite all the complaints I have about the language, I still think it’s a useful tool to have in my toolbox. With that in mind, here we go.

Also, a bit of context for those who don’t follow this blog regularly: Fenius is a programming language I am designing and playing with in my free time. The goal is to mix elements of functional and object-oriented programming and Lisp-style macros in a non-Lisp syntax, among other things. In its current incarnation, the language is implemented as an interpreter written in Go.

Why did I choose Go for this project?

I’ve been curious about Go for a long time, but had never taken the time to play with it. I don’t have much patience for following tutorials, so for me the most effective way of learning a new programming language is to pick some project and try to code it in the language, and learn things as I go.

I realized Fenius would be a good match for Go for a bunch of reasons:

Compared to higher-level languages (such as Common Lisp):

Compared to lower-level languages (such as C):

In summary, I see Go basically as a garbage-collected, memory-safe language with a small runtime, somewhat above C in abstraction level, but not much above. This can be either good or bad (or sometimes one and sometimes the other), depending on the requirements of your project.

(Another reason for using Go, which is unrelated to any of the features of the language itself, is that Go is used for a bunch of things where I work, so learning it would be useful for me professionally. And indeed, the experience I acquired working on the Fenius interpreter has been hugely useful at work so far.)

With all that said, Go does leave something to be desired in many respects, could be better designed in others, and just plain annoys me in others. Let’s investigate those.

Do repeat yourself

Go bills itself as a simple programming language, and simple it is. However, one thing it made me reflect about is that there is more than one way to go about simplicity. Scheme, for instance, also aims at being a simple programming language; and yet Scheme is far more expressive than Go. Now, “expressiveness” is a vague concept, so let’s try to make this more concrete. What I’m after here is an idea that might be called abstraction power: the ability to abstract repeating patterns in the code into reusable entities. Go leaves a lot to be desired in this department. Whereas Scheme is a simple language that gives you a basic set of building blocks from which you can build higher-level abstractions, Go is a simple language that pretty much forces everything to stay at the simple level. Whereas Scheme is simple but open-ended, “open-ended” is about the last word I would use to describe Go.

The thing is, Go is this way by design: whether or not you like this (and I don’t), it is an intentional rather than accidental part of the design of the language. And it does have some benefits: because there are fewer (no) ways to extend the language, it’s also easier to exclude certain behaviors when analyzing what a piece of code does. For example, recently at work, while trying to figure out how GORM works, someone wondered if GORM closed database connections automatically when the database handler went out of scope, and I was able to say I didn’t think that was possible, simply because there is no mechanism in Go that could be used to achieve that.1 Likewise, if you have something of the form someStruct.SomeField, you can be sure all this will do is read a memory location, not run arbitrary code. Of course, this has a flip side: anyone accessing someStruct.SomeField really depends on the struct having this field; it cannot be replaced by a property method in the future. You either have to live with that, or write an accessor method and always use that instead of accessing the field directly in the rest of the program, just like in plain ol’ Java.
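To make the field-versus-accessor trade-off concrete, here is a minimal sketch. The Account type and its members are hypothetical names invented for illustration. A direct field read is guaranteed to be just a memory read, while a method call may run any code and can change its implementation later without breaking callers.

```go
package main

import "fmt"

// Account is a hypothetical type used only for illustration.
type Account struct {
    Balance int // exported field: callers read the memory location directly
    cents   int // unexported field, only reachable through the accessor
}

// Cents is an accessor method. Its body could later compute the value
// instead of reading a field, and callers would not need to change.
func (a Account) Cents() int { return a.cents }

func main() {
    acc := Account{Balance: 10, cents: 1000}
    fmt.Println(acc.Balance) // plain field read, no code runs
    fmt.Println(acc.Cents()) // method call, free to run arbitrary code
}
```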

The while problem

Go’s if has a two-clause version which allows you to initialize some variables and use them in the if condition. This works particularly well with the Go strategy of signaling errors and the results of some builtin constructs by returning multiple values. One common example is the “comma ok” idiom: the statement value, ok := someMap[key] sets value to the value of the key in the map (or a zero value if the key is not present), and ok to a boolean indicating whether the key was present in the map. Combined with the two-clause if form, this allows you to write:

if value, ok := someMap[key]; ok {
    fmt.Printf("Key is present and has value %v\n", value)
} else {
    fmt.Printf("Key is not present\n")
}

where ok is set in the first clause and used as a condition in the second. Likewise, switch also has a two-clause form.
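For completeness, a small sketch of the two-clause switch. The classify function is a hypothetical example: the first clause initializes n, and the second clause is the value being switched on.

```go
package main

import "fmt"

// classify is a hypothetical helper showing switch with an init clause.
func classify(s string) string {
    switch n := len(s); n {
    case 0:
        return "empty"
    case 1:
        return "single byte"
    default:
        return fmt.Sprintf("%d bytes", n)
    }
}

func main() {
    fmt.Println(classify(""))
    fmt.Println(classify("go"))
}
```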

Given that, you might expect there would be an analogous construct for loops. In fact, even in C and similar languages, one can write things like:

int ch;
while (ch = getchar(), ch != EOF) {
    putchar(ch);
}

where the while condition assigns a variable and uses it in the loop condition. Surely Go can do the same thing, right?

Alas, it can’t. The problem is that Go painted itself into a corner by merging the traditional functions of while into the for construct. Basically, Go has a three-clause for which is equivalent to C’s for:

for i := 0; i < 10; i++ {
    fmt.Println(i)
}

and a one-clause for which is equivalent to C’s while:

for someExpressionProducingABoolean() {
    fmt.Println("still true")
}

but because of this, the language designers are reluctant to add a two-clause version, since it could easily be confused with the three-clause version – just type an extra ; at the end, and you have an empty third clause, which changes the behavior of the first clause to run just once before the loop rather than before every iteration. This could be easily avoided by having a separate while keyword in the language for the one- and two-clause versions, but I very much doubt this will ever be introduced, even in Go 2.

The issue comes up again every now and then. The solution usually offered is to use the three-clause for and repeat the same code in the first and third clauses, i.e., the equivalent of doing:

for (ch = getchar(); ch != EOF; ch = getchar()) {
    putchar(ch);
}

i.e., “do repeat yourself”, or using a zero-clause for (which is equivalent to a while (true)) and an explicit break to get out of the loop. Incidentally, in the thread above, one of the Go team members replies that this kind of thing is not possible in other C-like languages either, but as we saw above, C actually can handle this situation because C has the comma operator and assignment is an expression, not a statement, which allows you to write stuff like while (ch = getchar(), ch != EOF), whereas Go neither has the comma operator, nor does it have assignment as an expression. I’m not arguing that it should have these, but rather that the lack of these elements makes a two-clause while more desirable in Go than it is in C.

Iteration, but not for you

There are many operations that are only available for builtin types, and you cannot implement them for your custom types. Consider, for example, iteration: Go has an iteration construct that looks like this:

for key, value := range iterableThing {
    fmt.Printf("The key is %v, and the value is %v", key, value)
}

but it only works for arrays/slices, maps, strings and channels; you cannot make your own types iterable. Given that Go has interfaces, it would be easy for the language to include a standard Iterator interface which user-defined types could implement; but that’s not the case. (On second thought, that’s actually not possible because Go does not have generics, and the return type of the iterator depends on the thing being iterated over.) If you want to write any sort of custom iteration, you will have to make do with regular function calls, a problem that is aggravated by the lack of a two-clause while (as seen above), which might allow you to test if there are more elements and get the next element at the same time.
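A minimal sketch of what custom iteration with regular function calls can look like. The countdown type and its Next method are hypothetical, and the (value, ok) return shape is just one possible convention, borrowed from the “comma ok” idiom.

```go
package main

import "fmt"

// countdown is a hypothetical iterator over n, n-1, ..., 1.
type countdown struct{ n int }

// Next returns the next value and true, or 0 and false when exhausted.
func (c *countdown) Next() (int, bool) {
    if c.n <= 0 {
        return 0, false
    }
    v := c.n
    c.n--
    return v, true
}

// collect drains the iterator with a zero-clause for and a break,
// since range cannot be used on countdown.
func collect(c *countdown) []int {
    var out []int
    for {
        v, ok := c.Next()
        if !ok {
            break
        }
        out = append(out, v)
    }
    return out
}

func main() {
    fmt.Println(collect(&countdown{n: 3})) // prints [3 2 1]
}
```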

Generics, but not for you

This is one of the most frequent complaints people have about Go. Go has no form of parametric polymorphism (a.k.a. generics): there is no way, for example, to define a function that works on lists of X for any type X, or to define a new type “binary tree” parameterized by the type of the elements you want to store in the tree.

If you are defining a new container data type and you want to be able to store elements of any type inside it, one option is to define the container’s elements as having type interface{}, i.e., the empty interface, which is satisfied by every type. This is roughly equivalent to using Object in Java. By doing this, you give up any static type safety when dealing with the container’s elements, and you have to cast the elements back to their original type when extracting them from the container, so basically you are left with a dynamically-typed language except with more boilerplate. The alternative, of course, is to repeat yourself and just write multiple versions of the functions and data structures you need, specialized for the types you happen to need.
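As an illustration of the interface{} approach, here is a sketch of a hypothetical Stack container. Nothing in the types prevents mixing element types, and every read needs a type assertion to get the original type back.

```go
package main

import "fmt"

// Stack is a hypothetical container holding values of any type.
type Stack struct {
    items []interface{}
}

// Push accepts anything, because every type satisfies interface{}.
func (s *Stack) Push(v interface{}) { s.items = append(s.items, v) }

// Pop removes and returns the top element; ok is false when empty.
func (s *Stack) Pop() (v interface{}, ok bool) {
    if len(s.items) == 0 {
        return nil, false
    }
    v = s.items[len(s.items)-1]
    s.items = s.items[:len(s.items)-1]
    return v, true
}

func main() {
    s := &Stack{}
    s.Push(42)
    s.Push("oops") // compiles fine: nothing says this is a stack of ints

    v, _ := s.Pop()
    if n, ok := v.(int); ok { // type assertion checked at run time
        fmt.Println("int:", n)
    } else {
        fmt.Printf("not an int: %v\n", v)
    }
}
```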

Another option, seriously offered as an alternative by the language designers, is to write a code generator to generate specialized versions of the functions and data structures you need. No, Go does not have macros; what this entails is actually writing a program yourself that spits out a .go file with the content you want. Besides being much more work and being harder to maintain (although there are projects around that can do this for you; you just have to make sure to run the damn program every time you make changes to the original struct), it does not really help distributing libraries containing generic types.

Now, the funny thing is that the builtin types (arrays, slices, maps and channels) are type-parametric, and there is a number of builtin functions in Go, such as append and copy, that are generic as well, so, once again, Go has this feature, because it’s useful, but it’s only available for the builtin types and functions. This special-casing of builtin types is one of the most annoying aspects of Go’s design to me.

Now, unlike some fervent Go proponents, the language designers themselves do recognize the lack of generics as a problem, and have done so for a long time; they have just been unsure how best to add them to Go’s design and afraid of adding in a bad design and then being stuck with supporting it forever, since Go makes strong guarantees about backwards compatibility, which is all perfectly reasonable. It looks like Go 2 will likely come with support for generics; we just don’t really know when that will happen.

Error handling

This is another classic of Go complaints, and with reason – it’s the other main problem that is serious enough to be recognized by the language designers, and may get better in Go 2. Until that happens, though, we are stuck with the Go 1 style of error handling.

In Go, errors are typically reported by returning multiple values. For example, a function like os.Open returns two values: an open file handle (which may be nil if an error occurred and the file could not be opened), and an error value indicating which error, if any, has happened. Typical use looks like this:

func doSomethingWithFile() int {
    file, err := os.Open("/etc/passwd")
    if err != nil {
        log.Panicf("Error opening file: %v", err)
    }
    // ... do something with file ...
    return 42
}

or you can make your function return an error value itself, so you can pass it on for the caller to handle:

func doSomethingWithFile() (int, error) {
    file, err := os.Open("/etc/passwd")
    if err != nil {
        return 0, err
    }
    // ... do something with file ...
    return 42, nil
}

There are many problems with this approach. The most obvious one is that this quickly becomes a repetitive pile of if err != nil { return nil, err } after anything that may return an error, which distracts from the actual business logic. There is no way to abstract this repetition away, since you can’t pass the result of a multiple-values function as an argument to another function without assigning it to variables first, and a subfunction would not be able to return from the main function anyway. Macros could help here, but Go does not have them.

The second problem is that you don’t return either a value or an error (as you would do with Rust’s Result type, which is either an Ok(some_result) or an Err(some_error)); you return both a value and an error, which means you still have to return a value even when there is no value to return. For reference types, you can return nil; for other types, you typically return the zero value of that type (e.g., 0 for integers, "" for strings, a struct with zero-valued fields, etc.) The zero value is often a perfectly valid value that can occur in non-error situations as well, so if you make a mistake in handling the error, you may end up silently passing a zero value as a valid value to the rest of the program, rather than blowing up like an exception would.
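A small sketch of how a zero value can slip through as if it were valid. parsePort is a hypothetical helper that makes the mistake deliberately; strconv.Atoi returns 0 together with a non-nil error when the input is not a number.

```go
package main

import (
    "fmt"
    "strconv"
)

// parsePort is a hypothetical helper that forgets to check the error.
func parsePort(s string) int {
    port, _ := strconv.Atoi(s) // error discarded: port is 0 on failure
    return port
}

func main() {
    fmt.Println(parsePort("8080"))
    fmt.Println(parsePort("eighty")) // 0, indistinguishable from parsing "0"
}
```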

This is partly mitigated by the fact that in Go it is an error to declare a variable and not use it, so you are forced to do something with the err you just created – unless an err already exists in the same scope, in which case your value, err := foo() will just reuse the existing err and no error will be generated if you don’t do anything with it. Moreover, functions that only have side-effects but don’t return anything other than an error (or do return some other value but the value is rarely used) are not protected by this. Perhaps the most common examples are the fmt.Print* functions, which return the number of bytes written and an error value, but I’ve never seen this error value handled – it would become an utter mess if you were to do the if err != nil { ... } rigmarole after every print, so no one does, and print errors just get silently ignored by the vast majority of programs.
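The err reuse case can be sketched as follows. The sum function is hypothetical; the point is that the second := only declares y and reuses err, so the compiler has nothing to complain about even though the second error is never checked.

```go
package main

import (
    "fmt"
    "strconv"
)

// sum is a hypothetical function with a forgotten error check.
func sum(a, b string) int {
    x, err := strconv.Atoi(a)
    if err != nil {
        return -1
    }
    // err already exists in this scope, so := declares only y and
    // assigns to the existing err. This compiles without complaint.
    y, err := strconv.Atoi(b)
    return x + y // y is 0 if b was not a number
}

func main() {
    fmt.Println(sum("1", "2"))
    fmt.Println(sum("1", "x")) // the parse error for "x" is silently dropped
}
```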

The third problem is that a function returning an error type does not really tell you anything about which errors it can return. This is also a problem with exceptions in most languages, but Go’s approach to error values feels even more unstructured. Consider for example the net package: it has a zillion little functions, most of which can return an error; almost none of them document which errors they can return. At least in POSIX C (which uses an even more primitive error value system, typically returning -1 and setting a global errno variable to the appropriate error code), you have manpages listing every possible error you can get from the function. In Go, I suppose the usual strategy is to find out the errors you care about and handle these, and pass the ones you don’t recognize up to the caller. That’s basically the strategy of exceptions, except done manually by you, with a lot of repetitive clutter in the code.

To be fair, the situation can be somewhat ameliorated through strategic use of the panic/recover mechanism, which is like exceptions except you’re not supposed to use them like exceptions. panic is usually used for fatal situations that mean the program cannot proceed. For situations that are supposed to be recoverable, you’re supposed to use error values. Now, what counts as recoverable or not depends on the circumstances. In general, you don’t want to call panic from a library (unless you hit an assertion violation or some other indicator of a bug), because you want library users to be able to handle the errors produced by your library. But in application code, where you know which situations are going to be handled and which are not, you can often use panic more liberally to abort on situations where execution cannot proceed and reduce the set of possible error values you pass up to the caller to only those the caller is expected to handle. Then you can use recover as a catch-all for long-running programs, to log the error and keep running (the Gin web framework, for instance, will automatically catch panics, log them and return a 500 to the client by default). I don’t know if this is considered idiomatic Go or not, but I do know that it makes code cleaner in my experience.

There is also precedent for using panic for non-local exits in the standard library: the encoding/json package uses panic to jump out of recursive calls when encountering an error, and then recover to turn the panic into a regular error value for users of the library.
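The general shape of this pattern can be sketched as below. This is not the encoding/json code; Node, sumError, walk and Sum are hypothetical names. The recursive helper panics with a private wrapper type, and the public entry point recovers it and returns a regular error, re-panicking on anything it does not recognize.

```go
package main

import (
    "errors"
    "fmt"
)

// Node is a hypothetical tree of integers.
type Node struct {
    Value    int
    Children []*Node
}

// sumError wraps errors raised by walk, so that recover does not
// swallow unrelated panics such as genuine bugs.
type sumError struct{ err error }

// walk panics instead of threading an error through every recursive call.
func walk(n *Node) int {
    if n.Value < 0 {
        panic(sumError{errors.New("negative value in tree")})
    }
    total := n.Value
    for _, c := range n.Children {
        total += walk(c)
    }
    return total
}

// Sum is the public entry point: it turns the internal panic back
// into an ordinary error value.
func Sum(root *Node) (total int, err error) {
    defer func() {
        if r := recover(); r != nil {
            se, ok := r.(sumError)
            if !ok {
                panic(r) // not ours: let it propagate
            }
            err = se.err
        }
    }()
    return walk(root), nil
}

func main() {
    good := &Node{1, []*Node{{2, nil}, {3, nil}}}
    bad := &Node{1, []*Node{{-2, nil}}}
    fmt.Println(Sum(good))
    fmt.Println(Sum(bad))
}
```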

No inheritance

Go has no inheritance; instead, it emphasizes the use of interfaces and composition. This is an interesting design choice, but it does cause problems sometimes. So far I have been in two situations where having something akin to an “abstract struct” from which I could inherit would have made my code simpler.

The first situation was in the Fenius interpreter: the abstract syntax tree (AST) generated by the parser has 8 different types of nodes, each of which is a struct type, some of which have subfields that are AST nodes themselves (for example, an AstBlock contains a list of AST nodes representing statements inside the block). To handle this, I define an AST interface which every node type implements. Now, one thing that every AST node has in common is a Location field. But an interface cannot require a satisfying type to have specific fields, only specific methods. Therefore, if I want the location of an AST node to be accessible regardless of its type, the only option I have is to add a Location() method to the interface (which I actually call Loc(), because I cannot have a field and a method with the same name), and implement it for each node type, so I have 8 identical method definitions of the form:

func (ast AstIdentifier) Loc() Location { return ast.Location }

in the code, one for each node type.
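Put together, the arrangement looks roughly like the sketch below. The names AST, AstIdentifier, AstBlock, Location and Loc come from the text above, but the field layouts (Line, Column, Name, Statements) and the describe helper are assumptions made for illustration, and only two of the eight node types are shown.

```go
package main

import "fmt"

// Location's fields are assumed; the text does not say what it contains.
type Location struct{ Line, Column int }

// AST is the interface every node type implements.
type AST interface {
    Loc() Location
}

type AstIdentifier struct {
    Location Location
    Name     string
}

type AstBlock struct {
    Location   Location
    Statements []AST
}

// One identical method per node type, because the interface cannot
// require a field, only a method.
func (ast AstIdentifier) Loc() Location { return ast.Location }
func (ast AstBlock) Loc() Location      { return ast.Location }

// describe is a hypothetical consumer that works on any node type.
func describe(node AST) string {
    loc := node.Loc()
    return fmt.Sprintf("%d:%d", loc.Line, loc.Column)
}

func main() {
    id := AstIdentifier{Location{1, 5}, "x"}
    block := AstBlock{Location{1, 1}, []AST{id}}
    fmt.Println(describe(id), describe(block))
}
```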

The second situation was in the web project at work, where I implemented a simple validation package to validate incoming JSON request bodies. In this package, I define a bunch of types representing different types of fields, such as String, Integer, Boolean, etc. Usage looks like this:

var FormValidator = validation.Map{
    "name": validation.String{Required: true, MaxLength: 50},
    "age": validation.Integer{Required: false, MinValue: 0},
}

All of these types have in common a boolean Required field. But again, since there is no inheritance, given a validator type there is no generic way for me to access the Required field. The only way is to implement a method returning the field for every validator type (or to use reflection and give up type safety).

Now, Go has an interesting feature, which is that you can embed other types in a struct, and you can even access the fields and methods of the embedded struct without naming it explicitly, so in principle I could do something like:

type BaseValidator struct {
    Required bool
}

type String struct {
    BaseValidator
    MaxLength int
}

and now if I instantiate a String struct s, I can even write s.Required without naming the embedded struct! This could solve my problem, except that when initializing the struct, I cannot write just String{Required: true}: I have to write String{BaseValidator: BaseValidator{Required: true}}, which ruins my pretty Map definition.
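Here is a runnable sketch of the embedding variant. BaseValidator, String, Integer and their fields come from the text; the IsRequired method and the Validator interface are additions of mine, to show that a method defined on the embedded struct is promoted and gives generic access to Required. The verbose literals are the price.

```go
package main

import "fmt"

type BaseValidator struct {
    Required bool
}

// IsRequired is a hypothetical accessor; it is promoted to every
// struct that embeds BaseValidator.
func (b BaseValidator) IsRequired() bool { return b.Required }

// Validator is a hypothetical interface satisfied via the promoted method.
type Validator interface {
    IsRequired() bool
}

type String struct {
    BaseValidator
    MaxLength int
}

type Integer struct {
    BaseValidator
    MinValue int
}

func main() {
    // String{Required: true} does not compile: promoted fields cannot
    // be used as keys in a composite literal.
    validators := map[string]Validator{
        "name": String{BaseValidator: BaseValidator{Required: true}, MaxLength: 50},
        "age":  Integer{BaseValidator: BaseValidator{Required: false}, MinValue: 0},
    }
    for _, field := range []string{"name", "age"} {
        fmt.Println(field, validators[field].IsRequired())
    }
}
```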

Another thing that could solve my problem is writing a constructor function for the String type, but since Go does not have keyword arguments, that does not look pretty either. The only solution that looks pretty in the client code is to repeat myself in the package code.

No love for unfinished programs

In Go, it is a compilation error to define a variable and not use it, or to import a package and not use it. I do think it’s worthwhile to ensure that these things are not present in finished code (the one that goes to code review and gets deployed); that’s why we have linters. But requiring it during development is a pain in the ass. Say you are debugging a piece of code. Comment out something to see what happens? Code does not compile because a variable is not in use. Or you add some debug prints, run the code, see what happens, comment out the debug print, run again… code does not compile because you import fmt and don’t use it. These seemingly minor but frequent annoyances break your flow during development.
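One common workaround, not part of the discussion above, is the blank identifier: assigning to _ counts as a use. The sketch below uses hypothetical names (compute, scratch) and keeps both the fmt import and a temporarily unused variable alive while the debug print is commented out.

```go
package main

import "fmt"

// Referencing any fmt function at package level keeps the import "used".
var _ = fmt.Sprint

// compute is a hypothetical function being debugged.
func compute() int {
    result := 6 * 7
    scratch := result * 2 // only needed by the debug print below
    _ = scratch           // silences "declared and not used"
    // fmt.Println("debug:", scratch)
    return result
}

func main() {
    // The builtin println writes to stderr and needs no import.
    println(compute())
}
```

It works, but it is one more thing to remember to clean up before the code goes to review.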

There are lots of interesting invariants that are useful to ensure are respected in finished programs, but which will be violated at various points during development, between the time you check out the repository and the time you have a new version ready to be deployed. It is my long-standing position that running unfinished programs is a useful thing to be able to do; this is a topic I might revisit in a future blog post. It is okay when a language rejects an incomplete program for technical reasons (e.g., the implementation cannot ensure run-time safety for code that calls non-existent functions, or calls a function with the wrong argument types). What annoys me is when a language goes out of its way to stop you from running code that it would otherwise be perfectly capable of running. Java’s checked exceptions and Go’s unused variable/import checks fall into this category. This could be easily solved by having a compiler switch to disable those checks during development, but alas, no.

At the same time, a struct constructor with missing fields is not an error, not even a warning, so if you forget a field, or add a new field to the struct and forget to update some place that constructs it, you get no help from the language; not even golint will tell you about it. (Yes, there are useful use cases for omitting struct fields, but I would expect at least a linter option to detect this.)
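A short sketch of the missing-field case. Config and newConfig are hypothetical; imagine Timeout was added to the struct after newConfig was written.

```go
package main

import "fmt"

// Config is a hypothetical struct; Timeout is the newly added field.
type Config struct {
    Host    string
    Port    int
    Timeout int
}

// newConfig was written before Timeout existed and never updated.
func newConfig() Config {
    // Timeout is omitted: no error, no warning, it is simply 0.
    return Config{Host: "localhost", Port: 8080}
}

func main() {
    cfg := newConfig()
    fmt.Println(cfg.Host, cfg.Port, cfg.Timeout)
}
```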

One-letter identifiers are the norm

And this is enshrined in the Go Code Review Comments page from the Go wiki:

Variable names in Go should be short rather than long. This is especially true for local variables with limited scope. Prefer c to lineCount. Prefer i to sliceIndex.

Prefer c to lineCount? Why? It is general wisdom that code is read more often than it’s written, so it pays off to use descriptive variable names. It may be super clear for you, today, that c is a line count, but what about people new to the code base, or your future self 6 months from now? Is there any clarity gained by using c instead of lineCount? Is the code simpler?

As for i instead of sliceIndex… well, sure, since sliceIndex says very little about the slice’s purpose anyway. Depending on the context, there may be a better name than both i and sliceIndex to give to this variable. But I do grant that i may be an okay name for a slice index in a simple loop (although slice indexes don’t really appear that much anyway, since you can iterate over the values directly).

Testing

The only good thing I can say about Go’s testing infrastructure is that it exists; that’s about it. It is afflicted by Go’s obsession with single-letter identifiers (it defines types such as testing.T for tests, testing.B for benchmarks, testing.M for main test context). It provides no assert functions; you’re supposed to write an explicit if and call t.Error or t.Fatal (or one of their variants) to indicate test failures. (There is a popular library called Testify that provides asserts and also shows diffs between expected and found values.)

Despite doing very little, it also does too much. For instance, it caches test results by default (!). You can disable this behavior: “The idiomatic way to disable test caching explicitly”, I quote, “is to use -count=1.” (!!) It also runs tests from different packages in parallel by default, which makes for all sorts of fun if your tests use a database – the main one being spending a day figuring out why your tests don’t work, since this fact is not particularly prominent in documentation, i.e., it is not something you are likely to find out unless you are specifically looking for it. (You can disable parallelism, or use one of various third-party packages with different solutions to tests involving databases.)

The attitude

This one is very subjective, and not related to the language itself, but it just rubs me wrong when I see the Go designers speaking of Go as if it truly stood out from the crowd. Even when recognizing other languages, they seem to want to position Go as a pioneer in a great new wave of programming languages. From the Go FAQ:

We were not alone in our concerns. After many years with a pretty quiet landscape for programming languages, Go was among the first of several new languages—Rust, Elixir, Swift, and more—that have made programming language development an active, almost mainstream field again.

The Go project started by the end of 2007 and went public in 2009. Was the programming language landscape really that silent in the preceding years? Without doing any research other than checking the dates on Wikipedia, I can think of D (2001), Groovy (2003), Scala (2004), F# (2005), Vala (2006), Clojure (2007), and Nim (2008). So no, we were not in any kind of programming language dark ages before Go came along inaugurating a great renaissance of programming languages.

Recently I watched a video in which Rob Pike speaks of the fact that Go numeric constants work like abstract numbers without a specific type, so you can use 1 in a place expecting an int or a byte or a float64 without relying on type conversion rules, as a “relatively novel idea”. Guys, Haskell has had this at least since 1990. These ideas are not new, you have just been oblivious to the rest of the world.

Of course, Go does bring its own share of new ideas, and new ways to combine existing ideas. It just annoys me when they see themselves as doing something truly exceptional and out of the ordinary.

So why do I keep using this language?

And yet, despite all of the above, I still think Go was a good choice for implementing the Fenius interpreter, and I still think it’s a good choice in a variety of situations. So I think it’s appropriate to finish this post with some counterpoints to the above. Why do I keep using Go, despite all of the above problems?

First of all, it gets the job done. It is often the case that practical considerations, often having more to do with a language’s runtime and environment than with the language itself, lead to the choice of a given language for a job. For example, PHP has a terrible language design, but it’s super easy to deploy, so it makes sense to choose PHP for some tasks in some circumstances, even though there are plenty of better languages available. As for Go, regardless of any of the problems mentioned before, it does give me a lightweight memory-safe garbage-collected runtime, native self-contained executables, and does not try to hide the operating system from me. These characteristics make Go a good choice for my particular use case. (I should also note that, despite the above comparison with PHP, a lot of thought has been put into Go’s design, even if I disagree with many of the design choices.)

Second, in almost every respect in which Go is bad, C is even worse. So if you come to Go with a perspective of “I want something like C but less annoying”, Go actually delivers. And I would rather program in Go than in C++, even though C++ does not have many of the problems mentioned above, because the problems it does have are even worse. When I think from this perspective, I’m actually glad Go exists in this world, because it means I have fewer reasons to write C or C++.2

In fact, when you realize that Go came about as a reaction to C++, the relentless obsession with (a certain kind of) simplicity makes a lot more sense. C++ is a behemoth of a language, and it gets bigger with every new standard. Go is not only a very simple language, it makes it hard to build complex abstractions on top of it, almost like a safeguard against C++-ish complexity creeping in. One can argue the reaction was too exaggerated, but I can understand where they are coming from.

There is a final bit of ambivalent praise I want to give Go, related to the above. I think Go embodies the Unix philosophy in a way no other recently designed language that I know of does. This is not an unambiguously good thing, mind you; it brings to my mind the worse is better concept, an interesting view of Unix and C by someone from outside of that tradition (and an essay with a fascinating story in itself). But Go had key Unix people among its designers – Ken Thompson, the inventor of Unix himself; and Rob Pike, who worked on Plan 9 –, and it shows. For good and for bad, Go is exactly the kind of language you would expect Unix people to come up with if they sat down to design a higher-level successor to C. And notwithstanding all my misgivings about the language, I can respect that.

_____

1 Recently I learned it is possible to set a finalizer on an object, but they are not deterministic or related to scoping. I do find it a bit surprising that Go has finalizers, though.

2 If I did not need garbage collection, Rust would be a good option for this project as well. But as I mentioned in the beginning, I do need a garbage collector because Fenius is garbage-collected. If I were to implement it in a non-garbage-collected language, I would have to write a garbage collector for Fenius myself, whereas with Go or other garbage-collected languages, I can get away with relying on the host language’s garbage collector. I think of Rust and Go as complementary rather than in opposition, but that’s maybe a topic for another post.

Copyright © 2010-2021 Vítor De Araújo
The content of this blog, unless otherwise specified, may be used under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.
