Elmord's Magic Valley

Software, lingüística e rock'n'roll. Às vezes em Português, sometimes in English.

Adventures with Fenius and Common Lisp

2023-01-22 00:05 +0000. Tags: comp, prog, pldesign, fenius, lisp, in-english

I started playing with Fenius (my hobby, vaporware programming language) again. As usual when I pick up this project again after a year or two of hiatus, I decided to restart the whole thing from scratch. I currently have a working parser and a very very simple interpreter that is capable of running a factorial program. A great success, if you ask me.

This time, though, instead of doing it in Go, I decided to give Common Lisp a try. It was good to play a bit with Go, as I had wanted to become more familiar with that language for a long time, and I came out of the experience with a better idea of what the language feels like and what are its strong and weak points. But Common Lisp is so much more my type of thing. I like writing individual functions and testing and experimenting with them as I go, rather than writing one whole file and then running it. I like running code even before it’s complete, while some functions may still be missing or incomplete, to see if the parts that are finished work as expected, and to modify the code according to these partial results. Common Lisp is made for this style of development, and it’s honestly the only language I have ever used where this kind of thing is not an afterthought, but really a deeply ingrained part of the language. (I think Smalltalk and Clojure are similar in this respect, but I have not used them.) Go is very much the opposite of this; as I discussed in my previous Go post, the language is definitely not conceived with the idea that running an incomplete program is a useful thing to do.

Common Lisp macros, and the ability to run code at compile time, also opens up some interesting ways to structure code. One thing I’m thinking about is to write a macro to pattern-match on AST nodes, which would make writing the interpreter more convenient than writing lots of field access and conditional logic to parse language constructs. But I still have quite a long way to go before I can report on how that works out.

What kind of language I’m trying to build?

This is a question I’ve been asking myself a lot lately. I’ve come to realize that I want many different, sometimes conflicting things from a new language. For example, I would like to be able to use it to write low-level things such as language runtimes/VMs, where having control of memory allocation would be useful, but I would also like to not care about memory management most of the time. I would also like to have some kind of static type system, but to be able to ignore types when I wish to.

In the long term, this means that I might end up developing multiple programming languages along the way focusing on different features, or maybe even two (or more) distinct but interoperating programming languages. Cross-language interoperability is a long-standing interest of mine, in fact. Or I might end up finding a sweet spot in the programming language design space that satisfies all my goals, but I have no idea what that would be like yet.

In the short term, this means I need to choose which aspects to focus on first, and try to build a basic prototype of that. For now, I plan to focus on the higher-level side of things (dynamically-typed, garbage-collected). It is surprisingly easier to design a useful dynamic programming language than a useful static one, especially if you already have a dynamic runtime to piggy-back on (Common Lisp in my case). Designing a good static type system is pretty hard. For now, the focus should be on getting something with about the same complexity as R7RS-small Scheme, without the continuations.

F-expressions

One big difference between Scheme/Lisp and Fenius, however, is the syntax. Fenius currently uses the syntax I described in The Lispless Lisp. This is a more “C-like” syntax, with curly braces, infix operators, the conventional f(x,y) function call syntax, etc., but like Lisp S-expressions, this syntax can be parsed into an abstract syntax tree without knowing anything about the semantics of specific language constructs. I’ve been calling this syntax “F-expressions” (Fenius expressions) lately, but maybe I’ll come up with a different name in the future.

If you are not familiar with Lisp and S-expressions, think of YAML. YAML allows you to represent elements such as strings, lists and dictionaries in an easy-to-read (sorta) way. Different programs use YAML for representing all kinds of data, such as configuration files, API schemas, actions to run, etc., but the same YAML library can be used to parse or generate those files without having to know anything about the specific purpose of the file. In this way, you can easily write scripts that consume or produce YAML for these programs without having to implement parsing logic specific for each situation. F-expressions are the same, except that they are optimized for representing code: instead of focusing on representing lists and dictionaries, you have syntax for representing things like function calls and code blocks. This means you can manipulate Fenius source code with about the same ease you can manipulate YAML.

(Lisp’s S-expressions work much the same way, except they use lists (delimited by parentheses) as the main data structure for representing nested data.)

Fenius syntax is more complex than Lisp-style atoms and lists, but it still has a very small number of elements (8 to be precise: constants, identifiers, phrases, blocks, lists, tuples, calls and indexes). This constrains the syntax of the language a bit: all language constructs have to fit into these elements. But the syntax is flexible enough to accomodate a lot of conventional language constructs (see the linked post). Let’s see how that will work out.

One limitation of this syntax is that in constructions like if/else, the else has to appear in the same line as the closing brace of the then-block, i.e.:

if x > 0 {
    print("foo")
} else {
    print("bar")
}

Something like:

if x > 0 {
    print("foo")
}
else {
    print("bar")
}

doesn’t work, because the else would be interpreted as the beginning of a new command. This is also one reason why so far I have preferred to use braces instead of indentation for defining blocks: with braces it’s easier to tell where one command like if/else or try/except ends through the placement of the keyword in the same line as the closing brace vs. in the following line. One possibility that occurs to me now is to use a half-indentation for continuation commands, i.e.:

if x > 0:
    print("foo")
  else:
    print("bar")

but this seems a bit cursed error-prone. Another advantage of the braces is that they are more REPL-friendly: it’s easier for the REPL to know when a block is finished and can be executed. By contrast, the Python REPL for example uses blank lines to determine when the input is finished, which can cause problems when copy-pasting code from a file. Copy-pasting from the REPL into a file is also easier, as you can just paste the code anywhere and tell your text editor to reindent the whole code. (Unlike the Python REPL, which uses ... as an indicator that it’s waiting for more input, the Fenius REPL just prints four spaces, which makes it much easier to copy multi-line code typed in the REPL into a file.)

Versioning

Fenius (considered as a successor of Hel) is a project that I have started from scratch and abandoned multiple times in the past. Every time I pick it up again, I generally give it a version number above the previous incarnation: the first incarnation was Hel 0.1, the second one (which was a completely different codebase) was Hel 0.2, then Fenius 0.3, then Fenius 0.4.

This numbering scheme is annoying in a variety of ways. For one, it suggests a continuity/progression that does not really exist. For another, it suggests a progression towards a mythical version 1.0. Given that this is a hobby project, and of a very exploratory nature, it’s not even clear what version 1.0 would be. It’s very easy for even widely used, mature projects to be stuck in 0.x land forever; imagine a hobby project that I work on and off, and sometimes rewrite from scratch in a different language just for the hell of it.

To avoid these problems, I decided to adopt a CalVer-inspired versioning scheme for now: the current version is Fenius 2023.a.0. In this scheme, the three components are year, series, micro.

The year is simply the year of the release. It uses the 4-digit year to make it very clear that it is a year and not just a large major version.

The series is a letter, and essentially indicates the current “incarnation” of Fenius. If I decide to redo the whole thing from scratch, I might label the new version 2023.b.0. I might also bump the version to 2023.b.0 simply to indicate that enough changes have accumulated in the 2023.a series that it deserves to be bumped to a new series; but even if I don’t, it will eventually become 2024.a.0 if I keep working on the same series into the next year, so there is no need to think too much about when to bump the series, as it rolls over automatically every year anyway.

The reason to use a letter instead of a number here is to make it even less suggestive of a sequential progression between series; 2023.b might be a continuation of 2023.a, or it might be a completely separate thing. In fact it’s not unconceivable that I might work on both series at the same time.

The micro is a number that is incremented for each new release in the same series. A micro bump in a given series does imply a sequential continuity, but it does not imply anything in terms of compatibility with previous versions. Anything may break at any time.

Do I recommend this versioning scheme for general use? Definitely not. But for a hobby project that nothing depends on, this scheme makes version numbers both more meaningful and less stressful for me. It’s amazing how much meaning we put in those little numbers and how much we agonize over them; I don’t need any of that in my free time.

(But what if Fenius becomes a widely-used project that people depend on? Well, if and when this happens, I can switch to a more conventional versioning scheme. That time is certainly not anywhere near, though.)

Implementation strategies

My initial plan is to make a rudimentary AST interpreter, and then eventually have a go at a bytecode interpreter. Native code compilation is a long-term goal, but it probably makes more sense to flesh out the language first using an interpreter, which is generally easier to change, and only later on to make an attempt at a serious compiler, possibly written in the language itself (and bootstrapped with the interpreter).

Common Lisp opens up some new implementation strategies as well. Instead of writing a native code compiler directly, one possibility is to emit Lisp code and call SBCL’s own compiler to generate native code. SBCL can generate pretty good native code, especially when given type declarations, and one of Fenius’ goals is to eventually have an ergonomic syntax for type declarations, so this might be interesting to try out, even if I end up eventually writing my own native code compiler.

This also opens up the possibility of using SBCL as a runtime platform (in much the same way as languages like Clojure run on top of the JVM), and thus integrating into the Common Lisp ecosystem (allowing Fenius code to call Common Lisp and vice-versa). On the one hand, this gives us access to lots of existing Common Lisp libraries, and saves some implementation work. On the other hand, this puts some pressure on Fenius to stick to doing things the same way as Common Lisp for the sake of compatibility (e.g., using the same string format, the same object system, etc.). I’m not sure this is what I want, but might be an interesting experiment along the way. I would also like to become more familiar with SBCL’s internals as well.

EOF

That’s it for now, folks! I don’t know if this project is going anywhere, but I’m enjoying the ride. Stay tuned!

Comentários / Comments (1)

The Real Chad Incarnate, 2023-01-27 18:15:11 +0000 #

https://thumbs.gfycat.com/DesertedActiveJaguar-size_restricted.gif

Just putting the shit show back together, babe! I do hope you get the reference. If not, well... You're wuss!

Six thumbs up for the "f-expressions" thingy


Deixe um comentário / Leave a comment

Main menu

Posts recentes

Comentários recentes

Tags

em-portugues (213) comp (144) prog (70) in-english (58) life (48) pldesign (36) unix (36) lang (32) random (28) about (28) mind (26) lisp (24) mundane (22) fenius (21) web (20) ramble (18) img (13) hel (12) rant (12) scheme (10) privacy (10) freedom (8) copyright (7) academia (7) music (7) lash (7) esperanto (7) bash (7) shell (6) home (6) mestrado (6) android (5) misc (5) emacs (5) conlang (5) politics (4) latex (4) etymology (4) book (4) editor (4) worldly (4) php (4) kbd (3) security (3) tour-de-scheme (3) c (3) film (3) wrong (3) network (3) cook (2) poem (2) wm (2) treta (2) comic (2) philosophy (2) lows (2) llvm (2) physics (2) golang (1) perl (1) old-chinese (1) x11 (1) en-esperanto (1) audio (1) translation (1) kindle (1) german (1) pointless (1)

Elsewhere

Quod vide


Copyright © 2010-2023 Vítor De Araújo
O conteúdo deste blog, a menos que de outra forma especificado, pode ser utilizado segundo os termos da licença Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.

Powered by Blognir.