[This post is part of a series on Ruby semantics.]
In the third installment of this series, we are going to have a look at one of Ruby’s most prominent features: blocks.
A block is a piece of code that can be invoked with arguments and produce a value. Blocks can be written like this:
{|arg1, arg2, ...| body}
Or like this:
do |arg1, arg2, ...| body end
These forms are equivalent, except for precedence: f g { block } is interpreted as f(g { block }) (the block is passed to g), while f g do block end is interpreted as f(g) { block } (the block is passed to f). The |arguments| can be omitted if the block takes no arguments.
My impression is that the do syntax is preferred for multi-line blocks.
Blocks are kind of like anonymous functions, but they are not really first-class: a bare block like { puts 42 } on its own is a syntax error, and {} is interpreted as an empty dictionary (hash in Ruby terminology). The only place a block can appear is at the end of a method call, like f(x, y) { puts 42 } or f x, y do puts 42 end. This will make the block available to the method, which can use it in a number of ways.
Within the method, yield(arg1, arg2, ...) will invoke the block with the given arguments; whatever the block returns is the result of the yield call. The number of passed arguments generally does not have to match the number of arguments expected by the block: extra arguments are ignored, and missing arguments are assigned nil. (The only exception seems to be keyword arguments declared by the block without a default value; these will raise an error if not passed.)
def one_two_three
  yield 1
  yield 2
  yield 3
end

one_two_three {|x| puts x*10}  # prints 10, 20, 30

# We can also use the value produced by the block within the method.
def map_one_two_three
  [yield(1), yield(2), yield(3)]
end

map_one_two_three {|x| x*10}  # => [10, 20, 30]
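To make the loose argument matching concrete, here is a small made-up example (give_two is not a standard method, just an illustration):

def give_two
  yield 1, 2
end

give_two {|a| p a }         # extra argument ignored: prints 1
give_two {|a, b, c| p c }   # missing argument is nil: prints nil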
The method does not have to declare that it accepts a block: you can pass a block to any method; it’s just that some will do something useful with it and some won’t. Within the method, you can test if it was called with a block by calling the block_given? predicate. Many methods from the Ruby standard library can be called with or without a block and adapt their behavior accordingly.
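Before looking at standard library examples, here is a minimal sketch of block_given? in action (twice is a made-up method, not from the standard library):

def twice
  if block_given?
    [yield, yield]      # invoke the block twice and collect the results
  else
    "no block given"    # fall back to some default behavior
  end
end

twice { 21 }  # => [21, 21]
twice         # => "no block given"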
For example, open("somefile") returns a file object, but open("somefile") {|f| ...} opens the file, passes the file object as an argument to the block, and closes the file when the block finishes (analogous to using with in Python). Another example is the Array constructor: Array.new with no arguments returns an empty array; Array.new(5) returns a 5-element array initialized to nil; Array.new(5) {|i| i*i} returns a 5-element array, calling the block with each array index to initialize the corresponding array position, in this case resulting in [0, 1, 4, 9, 16].
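Spelled out as code, the three forms look like this:

Array.new                # => []
Array.new(5)             # => [nil, nil, nil, nil, nil]
Array.new(5) {|i| i*i}   # => [0, 1, 4, 9, 16]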
Yet another example is the times method of the Integer class. With a block, it calls the block n times (where n is the integer), passing an iteration counter to the block as an argument:
irb(main):024:0> 5.times {|i| puts "Hello number #{i}!" }
Hello number 0!
Hello number 1!
Hello number 2!
Hello number 3!
Hello number 4!
=> 5
If you don’t need the iteration counter, you can just pass a block taking no arguments (and now we can see why Ruby allows block arguments not to match exactly with the values they are invoked with):
irb(main):025:0> 5.times { puts "Hello!" }
Hello!
Hello!
Hello!
Hello!
Hello!
=> 5
And finally, if you don’t pass it any block, it returns an Enumerator instance, which supports a bunch of methods, such as map or sum:
irb(main):035:0> 5.times.map {|x| x*x}
=> [0, 1, 4, 9, 16]
irb(main):036:0> 5.times.sum  # 0 + 1 + 2 + 3 + 4
=> 10
Another way a method can use a block is by declaring an &argument in its argument list: in this case, the block will be reified into a Proc object and will be available as a regular object to the method:
# This is equivalent to the `yield` version.
def one_two_three(&block)
  block.call(1)
  block.call(2)
  block.call(3)
end
Conversely, if you have a Proc object and you want to pass it to a method expecting a block, you can use the & syntax in the method call:
# Make a Proc out of a block...
tenfold = proc {|x| puts x*10}

# ...and pass it to a procedure expecting a block.
# This works with either version of one_two_three.
one_two_three(&tenfold)  # prints 10, 20, 30
In the above example, we also see another way we can turn a block into a Proc object: by passing it to the builtin proc method.
Blocks can see the local variables that were defined at the time the block was created. Assignments to such variables modify the variable outside the block. Assignment to any other variable creates a local variable visible within the block and any nested blocks, but not outside.
x = 1
1.times {
  puts x  # this is the outer x
  x = 2   # this is still the outer x
  y = 3   # this is local to the block
  1.times {
    puts y  # this is the outer y
    y = 4   # this is still the outer y
  }
  puts y  # prints 4
}
puts x  # prints 2
puts y  # error: undefined variable
An exception to this rule is block parameters: a block parameter is always a fresh variable, even if a local variable with the same name already exists. (Before Ruby 1.9, this was not the case: a block parameter with the same name as a local variable would overwrite the variable when the block was called.)
You can explicitly ask for fresh variables to be created by declaring them in the parameter list after a semicolon:
x = 1
y = 2
1.times {|i; x, y|  # i is the block argument; x and y are fresh variables
  x = 10
  y = 20
  puts x  # prints 10
  puts y  # prints 20
}
puts x  # prints 1
puts y  # prints 2
Within the block, next can be used to return from the block back to the yield that invoked it. If an argument is passed to next, it will become the value returned by the yield call:
def do_stuff
  result = yield 1
  puts "The result is #{result}"
end

do_stuff {
  puts "I'm here"
  next 42
  puts "This line will never run"
}
# prints:
# I'm here
# The result is 42
This construct is analogous to continue in a Python loop. For example:
5.times {|i|
  if i%2 == 0
    next  # skip even numbers
  end
  puts i
}
# prints:
# 1
# 3
Although it is more idiomatic to use the postfixed if in this case:
5.times {|i|
  next if i%2 == 0  # skip even numbers
  puts i
}
break within the block can be used to return from the method that called the block. Again, if an argument is passed to break, it becomes the return value of the method. For example:
def do_stuff
  result = yield 1
  puts "The result is #{result}"
end

x = do_stuff {
  puts "I'm here"
  break 42
  puts "This line will never run"
}
In this code, do_stuff invokes the block, which prints I'm here and causes do_stuff to return 42 immediately. Nothing else is printed; the “The result is ...” line won’t run. The return value (42) is assigned to x.
redo jumps back to the beginning of the block. It accepts no arguments. I’m sure this is useful in some circumstance, though my creativity partly fails me here, and I partly fail to see why this would be useful only within blocks. But now you know it exists.
return within a block returns from the method the block is in. For example:
def do_stuff
  result = yield 1
  puts "The result is #{result}"
end

def foo
  do_stuff {
    puts "I'm here"
    return 42  # returns from `foo`
    puts "This line will never run"
  }
  puts "And neither will this"
end

foo  # prints "I'm here" and returns 42
lambda is similar to proc: it takes a block and returns a Proc object. Unlike proc, return within the lambda returns from the block itself, not the enclosing method. The shortcut syntax -> x, y { body } is equivalent to lambda {|x, y| body }.
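A minimal sketch of the return difference (the method names are made up):

def try_proc
  p = proc { return 1 }
  p.call
  2  # never reached: the proc's return exits try_proc itself
end

def try_lambda
  l = lambda { return 1 }
  l.call
  2  # reached: the lambda's return only exits the lambda
end

try_proc    # => 1
try_lambda  # => 2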
There are many equivalent ways of calling a Proc object:
irb(main):001:0> p = proc {|x| x+1}
=> #<Proc:0x00007f9c4b7b1698 (irb):1>
irb(main):002:0> p.call(5)
=> 6
irb(main):003:0> p.(5)
=> 6
irb(main):004:0> p[5]
=> 6
If a block declares no arguments, the names _1, _2, …, _9 can be used to refer to arguments by number:
irb(main):014:0> [1,2,3,4,5].map { _1 * _1 }
=> [1, 4, 9, 16, 25]
If such a block is turned into a lambda, the resulting procedure will require as many arguments as the highest argument number used:
irb(main):021:0> lambda { _9 }
=> #<Proc:0x00007f9c4b79c518 (irb):21 (lambda)>
irb(main):022:0> lambda { _9 }.call(1)
(irb):22:in `block in <top (required)>': wrong number of arguments (given 1, expected 9) (ArgumentError)
If a block using return in its body is reified into a Proc object using proc, and the Proc object escapes the method it was created in and is invoked afterwards, the return will cause a LocalJumpError:
def m
  p = proc { return 42 }
  # If we called p.call() here, it would cause `m` to return 42.
  # But instead, we will return `p` to the caller...
  p
end

p = m
# ...and call it here, after `m` has already returned!
p.call()  # error: in `block in m': unexpected return (LocalJumpError)
That’s all for today, folks. There is still plenty to cover: classes, modules, mixins, the singleton class, eval and metaprogramming shenanigans. I plan to write about these Real Soon Now™.
[This post is part of a series on Ruby semantics.]
I’m still trying to wrap my head around all the intricacies of variable/name scope in Ruby. These notes are part of my attempt to figure it all out, so take it with a grain of salt, and feel free to send corrections and additions in the comments.
As I explained in the previous post, the focus of these notes is not on how to use the language, but rather on how it works. This post in particular will deal with a lot of corner cases, which are helpful to figure out what the interpreter is doing. Let’s go!
Ruby has a bunch of different types of variables and variable-like things, distinguished by their initial characters:
Local variables begin with a lowercase ASCII letter, an underscore, or a non-ASCII character (i.e., a Unicode codepoint above 127). Any non-ASCII character can be used in an identifier, even things like the zero width space (U+200B). Local variables are visible in the scope they were defined in and nested scopes, kinda (more on that later).
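For instance (assuming a UTF-8 source file), these are perfectly valid local variables:

π = 3.14159
λ = lambda {|x| x * 2 }
λ.call(π)  # => 6.28318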
Constants begin with an uppercase ASCII character. Constants belong to the class or module they are defined in (which is Object at the top-level). They cannot be defined or redefined from within methods, but they can be redefined outside of methods (with a warning).
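Concretely:

X = 1
X = 2      # warning: already initialized constant X

def m
  Y = 3    # SyntaxError: dynamic constant assignment
end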
Instance variables begin with @, like @foo. They belong to the current object (i.e., self).
Class variables begin with @@, like @@foo. They belong to the class they are defined in and are shared with all of its subclasses (if a subclass mutates the variable, the superclass will reflect the mutation).

These are not the same as class instance variables, which are not a distinct variable type, but are simply the instance variables of the class object. Remember, classes are objects too (instances of Class), and therefore have their own instance variables as well, which are distinct from the instance variables of the instances of the class. Class instance variables are not shared with subclasses, because each subclass is a distinct object, with its own (class) instance variables.
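A small sketch of the distinction (the class names are made up):

class Animal
  @@count = 0          # class variable, shared with subclasses
  @kind = "generic"    # class instance variable, owned by the Animal object itself

  def self.bump; @@count += 1; end
  def self.count; @@count; end
  def self.kind; @kind; end
end

class Dog < Animal; end

Dog.bump
Animal.count  # => 1: @@count is shared across the hierarchy
Animal.kind   # => "generic"
Dog.kind      # => nil: @kind belongs to Animal, not to Dog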
Class variables cannot be accessed from the top-level: unlike constants, they don’t implicitly refer to Object’s class variables in that case. I’m not sure why this inconsistency exists, but it might be because class variables are shared with the subclasses, and therefore defining a class variable on Object by accident would affect almost every class in Ruby, whereas a constant with the same name can be defined in a subclass with no issues.
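Trying it at the top-level fails immediately (error message as produced by a recent Ruby):

@@oops = 1  # RuntimeError: class variable access from toplevel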
Finally, global variables begin with $, like $foo, and are visible across the whole program.
Unlike Python, there is no per-file global scope. Global variables ($foo) are true program-wide globals. Constants, instance variables and class variables are properties of various objects: when you define one of those, you are effectively mutating the class/module/instance they were defined in, and the effects will be visible in other places where these objects are used. You can define local variables at the top-level, but they won’t be visible inside any class or method definition, nor is there any concept of importing the variables defined in a different file: when you require another file, you will be able to see the effects of running that file (such as defining constants, instance variables and class variables, which, again, are object mutation rather than what you would think of as variable definition in Python or Scheme), but local variables defined at the file top-level won’t be visible outside it.
The allowed names for local variables and constants are also allowed method names. Because Ruby does not require parentheses in a method call, and also allows the receiver to be omitted (self.f() can be written as f(), which can be written as just f), a bare identifier like foo could be either a method name or a variable/constant name. How does Ruby distinguish those?
First, if the parentheses are used (foo()), or if there are arguments after the identifier, with or without parentheses (foo 42), then foo is unambiguously interpreted as a method name.
If there are neither parentheses nor arguments, and the identifier begins with a lowercase ASCII letter or an underscore, it will be interpreted as a local variable if there has been a variable assignment to that identifier within the lexical scope of the reference. So in foo = 42; foo, the second foo is a local variable. This disambiguation happens at parse time, and is based on the textual appearance of an assignment in the scope of the reference, regardless of whether the assignment is actually executed at runtime. So, for example:
def foo "I'm a method" end if false foo = "I'm a local variable" end p foo # Prints nil!
When Ruby sees the assignment to foo in the code, it creates a local variable for it, even if the assignment does not run. The variable is initialized with nil.
Note that foo() here would still invoke the method, even though there is a local variable with the same name. You might ask: what if I have a local variable whose value is a function (e.g., a lambda)? How do I call it? In this case, you have to invoke foo.call():
def foo "I'm a method" end foo = lambda { "I'm a lambda" } p foo() # "I'm a method" p foo # #<Proc:...> p foo.call() # "I'm a lambda"
This is similar to how in Common Lisp, there are distinct namespaces for functions and variables, and you need to use (funcall foo) to call a function stored in a variable. However, because the parentheses are not mandatory in Ruby, it has to do some extra work to guess what you want when it sees a bare identifier.
What about constants with the same name as methods? In this case, the rules are different: Ruby treats an uppercase-initial identifier as a constant unless there are parentheses or arguments:
def A "I'm a method" end A # error: uninitialized constant A A() # "I'm a method"
Previously, I said that local variables are visible in the scope they were defined in and nested scopes. That’s not quite true, though, because a lot of syntactic constructs start a clean slate on local variables. For example, local variables defined outside a class declaration are not visible inside it:
x = 1

class Foo
  x  # error: undefined local variable or method `x' for Foo:Class (NameError)
end
The same applies to module and def:
class Foo
  x = 1
  def m
    x
  end
end

Foo.new.m  # error: in `m': undefined local variable or method `x' for #<Foo:...> (NameError)
Neither will the variable be accessible via Foo.x, Foo::x, or anything else. It will be visible for code that runs within the class declaration, though:
class Foo
  x = 1
  puts x  # this is fine
  A = x   # and so is this: it initializes the constant `A` with 1
end
Even though Ruby allows multiple declarations of the same class, and each subsequent declaration modifies the existing class rather than defining a new one, local variables declared within one class declaration will not be visible to subsequent declarations of the same class:
class Foo
  x = 1
end

class Foo
  puts x  # error: in `<class:Foo>': undefined local variable or method `x' for Foo:Class (NameError)
end
But note that constants work fine in this case:
class Foo
  A = 1
end

class Foo
  puts A  # prints 1
end
This is because constants are a property of the class object, so a constant declaration mutates the class object and therefore its effect is persistent, whereas local variables only exist within the lexical/textual scope where they were declared.
Speaking of which, constant scope resolution is the one thing I’m having the hardest time figuring out. It does mostly what you would expect in normal situations, but it does so by quite strange means. What seems to be going on is that Ruby uses lexical scope to determine the dynamic resolution order of the constant. Let me show what I mean.
Classes can be nested, and you can use the constants of the outer class in the inner one:
class A
  X = 1
  class B
    def m
      X
    end
  end
end

puts A::B.new.m  # prints 1
You can do this even if the constant definition is not textually within the same class declaration as the method definition:
class A
  X = 1
end

class A
  class B
    def m
      X
    end
  end
end

puts A::B.new.m  # still prints 1
But if you define the method directly in A::B without syntactically nesting it within A, then it doesn’t work:
class A
  X = 1
end

class A::B
  def m
    X
  end
end

puts A::B.new.m  # error: in `m': uninitialized constant A::B::X (NameError)
This resolution is dynamic, though. Let’s go back to our previous example:
class A
  X = 1
  class B
    def m
      X
    end
  end
end

puts A::B.new.m  # still prints 1
The method is getting the constant defined in A. Let’s now add a constant X to B:
class A::B
  X = 2
end
And now if we call the method:
A::B.new.m # prints 2!
Now method m refers to a constant that did not exist at the time it was defined. In other words, it searches for X at runtime in all classes the method was textually nested in. (Remember that if you define m directly in A::B without textually nesting it in both classes, it only looks up in B.)
What about inheritance? Let’s define some classes:
class One
  X = 1
end

class Two
  X = 2
end

class A < One
  X = 10
  class B < Two
    X = 20
    def m
      X
    end
  end
end

puts A::B.new.m  # prints 20
Now let’s go about removing constants and seeing what happens:
irb(main):022:0> A::B.send(:remove_const, :X)
=> 20
irb(main):023:0> A::B.new.m
=> 10
It prefers the constant of the outer class over the one from the inheritance chain. Let’s remove that one as well:
irb(main):024:0> A.send(:remove_const, :X)
=> 10
irb(main):025:0> A::B.new.m
=> 2
Ok, after exhausting the outer class chain, it falls back to the inheritance chain. What if we remove it from the superclass as well?
irb(main):026:0> Two.send(:remove_const, :X)
=> 2
irb(main):027:0> A::B.new.m
(irb):16:in `m': uninitialized constant A::B::X (NameError)
So it doesn’t try the inheritance chain of the outer class.
One last check: what if you redefine a constant in a subclass but do not redefine the method?
class A
  X = 10
  class B
    X = 20
    def m
      X
    end
  end
end

class C < A::B
  X = 30
end

puts C.new.m  # prints 20
So it looks up based on where the method is defined, not the class it’s called from.
In summary, when Ruby sees a reference to a constant, it tries to find it: first in the classes and modules the reference is textually nested in, from innermost to outermost; then in the inheritance chain of the innermost enclosing class; and if all of that fails, it raises a NameError.
Accessing an undefined local variable raises an “undefined local variable or method” error. (Because of the ambiguity between variables and method names mentioned before, the error message mentions both cases here.) Similarly, accessing an undefined constant is an error.
Accessing an uninitialized global variable produces nil. If you run the code with warnings enabled (ruby -w), you will also get a warning about it.
Accessing an uninitialized instance variable produces nil and no warning. There used to be one, but it was removed in Ruby 3.0.
Finally, accessing an uninitialized class variable raises an error (just like locals and constants, but unlike instance variables).
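Summing up in code (each line annotated with the result it would produce on its own; the names are arbitrary):

no_such_local    # NameError: undefined local variable or method
NoSuchConstant   # NameError: uninitialized constant
@no_such_ivar    # => nil, silently
$no_such_global  # => nil (warns under ruby -w)

class Foo
  def self.peek; @@no_such_cvar; end
end
Foo.peek         # NameError: uninitialized class variable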
That’s all for today, folks. I did not even get to blocks in this post, but they’ll have to wait for a post of their own. Stay tuned!
[This post is part of a series on Ruby semantics.]
I’ve been studying Ruby recently for a job opportunity. The job did not pan out in the end, and therefore I’ll probably not continue with my Ruby studies, but I want to write down some things I learned before I forget them.
The focus of these notes is not on how to use the language, but rather on how it works, i.e., the language semantics. This may end up making the language seem weirder than it actually is in practice, because a lot of the examples will be dealing with corner cases. I will be writing this from a Python (and sometimes Lisp) perspective.
Functions and methods, though superficially similar, work very differently in Python and Ruby. In both languages, x.f(a) means “call method f of object x with argument a”, but it works quite differently behind the scenes:

In Python, x.f evaluates to a bound method object, and then that function-like object is called with a as an argument.

In Ruby, x.f(a) sends the message f with argument a to object x. There is no intermediate bound method object.

In Ruby, x.f on its own is equivalent to x.f(), i.e., send the message f with no arguments to object x. In general, parentheses can be omitted from method calls if there is no ambiguity.
f() on its own is equivalent to self.f(). Whereas in Python, self is an argument of the function that implements a method and has to be defined explicitly, in Ruby self is a keyword that refers to the current object and is always available.
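For example (a made-up class just to show the implicit receiver):

class Greeter
  def name
    "world"
  end

  def greet
    puts "Hello, #{name}!"  # same as self.name: the receiver defaults to self
  end
end

Greeter.new.greet  # prints "Hello, world!"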
Likewise, def always defines a method. Whereas in Python def defines a function in the local scope, in Ruby def defines a method in the current class. So, for example:
class Foo
  def g
    def h
      42
    end
  end
end
This defines a class Foo with a method g, which, when called, defines method h in class Foo. So, afterwards:
irb(main):008:0> x = Foo.new
=> #<Foo:0x00007f72a83dd3b8>
irb(main):009:0> x.h  # Method does not exist yet.
(irb):9:in `<main>': undefined method `h' for #<Foo:0x00007f72a83dd3b8> (NoMethodError)
        from /usr/lib/ruby/gems/3.1.0/gems/irb-1.4.1/exe/irb:11:in `<top (required)>'
        from /usr/bin/irb:25:in `load'
        from /usr/bin/irb:25:in `<main>'
irb(main):010:0> x.g  # When g is called, h is defined.
=> :h
irb(main):011:0> x.h  # Now h exists.
=> 42
(In this sense, Python’s def is more like Scheme’s define, whereas Ruby’s def is more like Common Lisp’s defun or defmethod.)
If a new Foo object is instantiated now, it will have access to method h already, since the def h defined it in the class Foo, not in the instance x:
irb(main):012:0> y = Foo.new
=> #<Foo:0x00007f72a842a3c0>
irb(main):013:0> y.h
=> 42
What if you use def at the top-level, outside of a class? Well, in that case, self refers to the main object, which is an instance of Object (the base class of most Ruby classes). So a method defined at the top-level is a method of Object! For example, let’s define a hello method with no arguments (again, the parentheses around the arguments can be omitted):
def hello
  puts "Hello, world!"
end
And now we can call it:
irb(main):019:0> hello
Hello, world!
=> nil
But since the method was defined as a method of Object, won’t it be available on every object?
irb(main):020:0> 4.hello
(irb):20:in `<main>': private method `hello' called for 4:Integer (NoMethodError)
        from /usr/lib/ruby/gems/3.1.0/gems/irb-1.4.1/exe/irb:11:in `<top (required)>'
        from /usr/bin/irb:25:in `load'
        from /usr/bin/irb:25:in `<main>'
Note that the call fails not because the method is not defined, but because the method is private. We can override the access control by using the send method to send the message explicitly to the object:
And there we go. The code at the top-level effectively runs as if it were inside a:
class Object
  private

  <... your code goes here ...>
end
Note how code that looks superficially like Python and seems to work the same way is actually doing so by very different means. For example, consider a piece of code like:
def multiply(x, y)
  x * y
end

class DeepThought
  def compute_answer()
    multiply(6, 7)
  end
end

puts DeepThought.new().compute_answer()  # prints 42
The method compute_answer uses the multiply method defined at the top-level. In Python, the equivalent code works by searching for multiply in the current environment, finding it at the global scope, and calling the function bound to it. In Ruby, this works by defining multiply as a method of Object, and because DeepThought inherits from Object by default, it has multiply as a method. We could have written self.multiply(6, 7) and we would get the same result.
This means you can easily clobber someone else’s method definitions
if you define a method at the top-level. I guess it’s okay to do that if
you’re writing a standalone script that won’t be used as part of
something bigger, but if you’re writing a library, or a piece of a
program consisting of multiple files, you probably want to wrap all your
method definitions within a class
or module
definition. I plan to talk about those in a future blog post.
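As a teaser, a minimal sketch of that approach (MyLib is a made-up name; modules proper are a topic for that future post):

module MyLib
  def self.multiply(x, y)
    x * y
  end
end

MyLib.multiply(6, 7)  # => 42, without adding anything to Object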
See you next time!
Over the last couple of months (but mainly over the last four weeks or so), I’ve been working on the Fenius interpreter, refactoring it and adding features. The latest significant feature was the ability to import Common Lisp packages, and support for keyword arguments in a Common-Lisp-compatible way, i.e., f(x, y=z) ends up invoking (f x :y z), i.e., f with three arguments: x, the keyword :y, and z. Although this can lead to weird results if keyword arguments are passed where positional arguments are expected or vice-versa (a keyword like :y may end up being interpreted as a regular positional value rather than as the key of the next argument), the semantics is exactly the same as in Common Lisp, which means we can call Common Lisp functions from Fenius (and vice-versa) transparently. Coupled with the ability to import Common Lisp packages, this means that we can write some useful pieces of code even though Fenius still doesn’t have much in its standard library. For example, this little script accepts HTTP requests and responds with a message and the parsed data from the request headers (yes, I know that it’s not even close to fully supporting the HTTP standard, but this is just a demonstration of what can be done):
# Import the Common Lisp standard functions, as well as SBCL's socket library.
let lisp = importLispPackage("COMMON-LISP")
let sockets = importLispPackage("SB-BSD-SOCKETS")

# We need a few Common Lisp keywords (think of it as constants)
# to pass to the socket library.
let STREAM = getLispValue("KEYWORD", "STREAM")
let TCP = getLispValue("KEYWORD", "TCP")

# Import an internal function from the Fenius interpreter.
# This should be exposed in the Fenius standard library, but we don't have much
# of a standard library yet.
let makePort = getLispFunction("FENIUS", "MAKE-PORT")

# Add a `split` method to the builtin `Str` class.
# This syntax is provisional (as is most of the language anyway).
# `@key start=0` defines a keyword argument `start` with default value 0.
method (self: Str).split(separator, @key start=0) = {
    if start > self.charCount() {
        []
    } else {
        let position = lisp.search(separator, self, start2=start)
        let end = (if position == [] then self.charCount() else position)
        lisp.cons(
            lisp.subseq(self, start, end),
            self.split(separator, start=end+separator.charCount()),
        )
    }
}

# Listen to TCP port 8000 and wait for requests.
let main() = {
    let socket = sockets.makeInetSocket(STREAM, TCP)
    sockets.socketBind(socket, (0,0,0,0), 8000)
    sockets.socketListen(socket, 10)
    serveRequests(socket)
}

# Process one request and call itself recursively to loop.
let serveRequests(socket) = {
    print("Accepting connections...")
    let client = sockets.socketAccept(socket)
    print("Client: ", client)
    let clientStream = sockets.socketMakeStream(client, input=true, output=true)
    let clientPort = makePort(stream=clientStream, path="<client>")
    let request = parseRequest(clientPort)
    clientPort.print("HTTP/1.0 200 OK")
    clientPort.print("")
    clientPort.print("Hello from Fenius!")
    clientPort.print(request.repr())
    lisp.close(clientStream)
    sockets.socketClose(client)
    serveRequests(socket)
}

# Remove the "\r" from HTTP headers. We don't have "\r" syntax yet, so we call
# Common Lisp's `(code-char 13)` to get us a \r character (ASCII value 13).
let strip(text) = lisp.remove(lisp.codeChar(13), text)

# Define a structure to contain data about an HTTP request.
# `@key` defines the constructor as taking keyword (rather than positional) arguments.
record HttpRequest(@key method, path, headers)

# Read an HTTP request from the client socket and return an HttpRequest value.
let parseRequest(port) = {
    let firstLine = strip(port.readLine()).split(" ")
    let method = firstLine[0]
    let path = firstLine[1]
    let protocolVersion = firstLine[2]
    let headers = parseHeaders(port)
    HttpRequest(method=method, path=path, headers=headers)
}

# Parse the headers of an HTTP request.
let parseHeaders(port) = {
    let line = strip(port.readLine())
    if line == "" {
        []
    } else {
        let items = line.split(": ")  # todo: split only once
        let key = items[0]
        let value = items[1]
        lisp.cons((key, value), parseHeaders(port))
    }
}

main()
Having reached this stage, it’s easier for me to just start trying to use the language to write small programs and get an idea of what is missing, what works well and what doesn’t, and so on.
One open question going forward is how much I should lean on Common Lisp compatibility. In one direction, I might go all-in into compatibility and integration into the Common Lisp ecosystem. This would give Fenius easy access to a whole lot of existing libraries, but on the other hand would limit how much we can deviate from Common Lisp semantics, and the language might end up being not much more than a skin over Common Lisp, albeit with a cleaner standard library. That might actually be a useful thing in itself, considering the success of ReasonML (which is basically a skin over OCaml).
In the opposite direction, I might try to not rely on Common Lisp too much, which means having to write more libraries instead of using existing ones, but also opens up the way for a future standalone Fenius implementation.
I quit my job about 6 months ago. My plan was to relax a bit and work on Fenius (among other things), but I’ve only been able to really start working on it regularly over the last month. I’ve been mostly recovering from burnout, and only recently have started to get back my motivation to sit down and code things. I’ve also been reading stuff on Old Chinese (and watching a lot of great videos from Nathan Hill’s channel), and re-reading some Le Guin books, as well as visiting and hosting friends and family.
I would like to go on with this sabbatical of sorts, but unfortunately money is finite, my apartment rental contract ends by the end of July, and the feudal lord wants to raise the rent by over 40%, which means I will have to (1) get a job in the upcoming months, and (2) probably move out of Lisbon. I’m thinking of trying to find some kind of part-time job, or go freelancing, so I have extra time and braincells to work on my personal projects. We will see how this plays out.
That’s all for now, folks! See you next time with more thoughts on Fenius and other shenanigans.
I started playing with Fenius (my hobby, vaporware programming language) again. As usual when I pick up this project again after a year or two of hiatus, I decided to restart the whole thing from scratch. I currently have a working parser and a very very simple interpreter that is capable of running a factorial program. A great success, if you ask me.
This time, though, instead of doing it in Go, I decided to give Common Lisp a try. It was good to play a bit with Go, as I had wanted to become more familiar with that language for a long time, and I came out of the experience with a better idea of what the language feels like and what are its strong and weak points. But Common Lisp is so much more my type of thing. I like writing individual functions and testing and experimenting with them as I go, rather than writing one whole file and then running it. I like running code even before it’s complete, while some functions may still be missing or incomplete, to see if the parts that are finished work as expected, and to modify the code according to these partial results. Common Lisp is made for this style of development, and it’s honestly the only language I have ever used where this kind of thing is not an afterthought, but really a deeply ingrained part of the language. (I think Smalltalk and Clojure are similar in this respect, but I have not used them.) Go is very much the opposite of this; as I discussed in my previous Go post, the language is definitely not conceived with the idea that running an incomplete program is a useful thing to do.
Common Lisp macros, and the ability to run code at compile time, also opens up some interesting ways to structure code. One thing I’m thinking about is to write a macro to pattern-match on AST nodes, which would make writing the interpreter more convenient than writing lots of field access and conditional logic to parse language constructs. But I still have quite a long way to go before I can report on how that works out.
This is a question I’ve been asking myself a lot lately. I’ve come to realize that I want many different, sometimes conflicting things from a new language. For example, I would like to be able to use it to write low-level things such as language runtimes/VMs, where having control of memory allocation would be useful, but I would also like to not care about memory management most of the time. I would also like to have some kind of static type system, but to be able to ignore types when I wish to.
In the long term, this means that I might end up developing multiple programming languages along the way focusing on different features, or maybe even two (or more) distinct but interoperating programming languages. Cross-language interoperability is a long-standing interest of mine, in fact. Or I might end up finding a sweet spot in the programming language design space that satisfies all my goals, but I have no idea what that would be like yet.
In the short term, this means I need to choose which aspects to focus on first, and try to build a basic prototype of that. For now, I plan to focus on the higher-level side of things (dynamically-typed, garbage-collected). It is surprisingly easier to design a useful dynamic programming language than a useful static one, especially if you already have a dynamic runtime to piggy-back on (Common Lisp in my case). Designing a good static type system is pretty hard. For now, the focus should be on getting something with about the same complexity as R7RS-small Scheme, without the continuations.
One big difference between Scheme/Lisp and Fenius, however, is the syntax. Fenius currently uses the syntax I described in The Lispless Lisp. This is a more “C-like” syntax, with curly braces, infix operators, the conventional f(x,y) function call syntax, etc., but like Lisp S-expressions, this syntax can be parsed into an abstract syntax tree without knowing anything about the semantics of specific language constructs. I’ve been calling this syntax “F-expressions” (Fenius expressions) lately, but maybe I’ll come up with a different name in the future.
If you are not familiar with Lisp and S-expressions, think of YAML. YAML allows you to represent elements such as strings, lists and dictionaries in an easy-to-read (sorta) way. Different programs use YAML for representing all kinds of data, such as configuration files, API schemas, actions to run, etc., but the same YAML library can be used to parse or generate those files without having to know anything about the specific purpose of the file. In this way, you can easily write scripts that consume or produce YAML for these programs without having to implement parsing logic specific for each situation. F-expressions are the same, except that they are optimized for representing code: instead of focusing on representing lists and dictionaries, you have syntax for representing things like function calls and code blocks. This means you can manipulate Fenius source code with about the same ease you can manipulate YAML.
(Lisp’s S-expressions work much the same way, except they use lists (delimited by parentheses) as the main data structure for representing nested data.)
Fenius syntax is more complex than Lisp-style atoms and lists, but it still has a very small number of elements (8 to be precise: constants, identifiers, phrases, blocks, lists, tuples, calls and indexes). This constrains the syntax of the language a bit: all language constructs have to fit into these elements. But the syntax is flexible enough to accomodate a lot of conventional language constructs (see the linked post). Let’s see how that will work out.
One limitation of this syntax is that in constructions like if/else, the else has to appear in the same line as the closing brace of the then-block, i.e.:
if x > 0 {
    print("foo")
} else {
    print("bar")
}
Something like:
if x > 0 {
    print("foo")
}
else {
    print("bar")
}
doesn’t work, because the else would be interpreted as the beginning of a new command. This is also one reason why so far I have preferred to use braces instead of indentation for defining blocks: with braces it’s easier to tell where one command like if/else or try/except ends, through the placement of the keyword in the same line as the closing brace vs. in the following line. One possibility that occurs to me now is to use a half-indentation for continuation commands, i.e.:
if x > 0:
    print("foo")
  else:
    print("bar")
but this seems a bit cursed (read: error-prone). Another advantage of the braces is that they are more REPL-friendly: it’s easier for the REPL to know when a block is finished and can be executed. By contrast, the Python REPL for example uses blank lines to determine when the input is finished, which can cause problems when copy-pasting code from a file. Copy-pasting from the REPL into a file is also easier, as you can just paste the code anywhere and tell your text editor to reindent the whole code. (Unlike the Python REPL, which uses ... as an indicator that it’s waiting for more input, the Fenius REPL just prints four spaces, which makes it much easier to copy multi-line code typed in the REPL into a file.)
Fenius (considered as a successor of Hel) is a project that I have started from scratch and abandoned multiple times in the past. Every time I pick it up again, I generally give it a version number above the previous incarnation: the first incarnation was Hel 0.1, the second one (which was a completely different codebase) was Hel 0.2, then Fenius 0.3, then Fenius 0.4.
This numbering scheme is annoying in a variety of ways. For one, it suggests a continuity/progression that does not really exist. For another, it suggests a progression towards a mythical version 1.0. Given that this is a hobby project, and of a very exploratory nature, it’s not even clear what version 1.0 would be. It’s very easy for even widely used, mature projects to be stuck in 0.x land forever; imagine a hobby project that I work on and off, and sometimes rewrite from scratch in a different language just for the hell of it.
To avoid these problems, I decided to adopt a CalVer-inspired versioning scheme for now: the current version is Fenius 2023.a.0. In this scheme, the three components are year, series, micro.
The year is simply the year of the release. It uses the 4-digit year to make it very clear that it is a year and not just a large major version.
The series is a letter, and essentially indicates the current “incarnation” of Fenius. If I decide to redo the whole thing from scratch, I might label the new version 2023.b.0. I might also bump the version to 2023.b.0 simply to indicate that enough changes have accumulated in the 2023.a series that it deserves to be bumped to a new series; but even if I don’t, it will eventually become 2024.a.0 if I keep working on the same series into the next year, so there is no need to think too much about when to bump the series, as it rolls over automatically every year anyway.
The reason to use a letter instead of a number here is to make it even less suggestive of a sequential progression between series; 2023.b might be a continuation of 2023.a, or it might be a completely separate thing. In fact it’s not unconceivable that I might work on both series at the same time.
The micro is a number that is incremented for each new release in the same series. A micro bump in a given series does imply a sequential continuity, but it does not imply anything in terms of compatibility with previous versions. Anything may break at any time.
Do I recommend this versioning scheme for general use? Definitely not. But for a hobby project that nothing depends on, this scheme makes version numbers both more meaningful and less stressful for me. It’s amazing how much meaning we put in those little numbers and how much we agonize over them; I don’t need any of that in my free time.
(But what if Fenius becomes a widely-used project that people depend on? Well, if and when this happens, I can switch to a more conventional versioning scheme. That time is certainly not anywhere near, though.)
My initial plan is to make a rudimentary AST interpreter, and then eventually have a go at a bytecode interpreter. Native code compilation is a long-term goal, but it probably makes more sense to flesh out the language first using an interpreter, which is generally easier to change, and only later on to make an attempt at a serious compiler, possibly written in the language itself (and bootstrapped with the interpreter).
Common Lisp opens up some new implementation strategies as well. Instead of writing a native code compiler directly, one possibility is to emit Lisp code and call SBCL’s own compiler to generate native code. SBCL can generate pretty good native code, especially when given type declarations, and one of Fenius’ goals is to eventually have an ergonomic syntax for type declarations, so this might be interesting to try out, even if I end up eventually writing my own native code compiler.
This also opens up the possibility of using SBCL as a runtime platform (in much the same way as languages like Clojure run on top of the JVM), and thus integrating into the Common Lisp ecosystem (allowing Fenius code to call Common Lisp and vice-versa). On the one hand, this gives us access to lots of existing Common Lisp libraries, and saves some implementation work. On the other hand, this puts some pressure on Fenius to stick to doing things the same way as Common Lisp for the sake of compatibility (e.g., using the same string format, the same object system, etc.). I’m not sure this is what I want, but might be an interesting experiment along the way. I would also like to become more familiar with SBCL’s internals as well.
That’s it for now, folks! I don’t know if this project is going anywhere, but I’m enjoying the ride. Stay tuned!
A while ago I wrote a couple of posts about an idea I had about how to mix static and dynamic typing and the problems with that idea. I've recently thought about a crazy solution to this problem, probably too crazy to implement in practice, but I want to write it down before it flees my mind.
Just to recapitulate, the original idea was to have reified type variables in the code, so that a generic function like:
let foo[T](x: T) = ...
would actually receive T as a value, though one that would be passed automatically by default if not explicitly specified by the programmer, i.e., when calling foo(5), the compiler would have enough information to actually call foo[Int](5) under the hood without the programmer having to spell it out.
The problem is how to handle heterogeneous data structures, such as lists of arbitrary objects. For example, when deserializing a JSON object like [1, "foo", true] into a List[T], there is no value we can give for T that carries enough information to decode any element of the list.
The solution I had proposed in the previous post was to have a Dynamic type which encapsulates the type information and the value, so you would use a List[Dynamic] here. The problem is that every value of the list has to be wrapped in a Dynamic container, i.e., the list becomes [Dynamic(1), Dynamic("foo"), Dynamic(true)].
But there is a more unconventional possibility hanging around here. First, the problem here is typing a heterogeneous sequence of elements as a list. But there is another sequence type that lends itself nicely for this purpose: the tuple. So although [1, "foo", true] can't be given a type, (1, "foo", true) can be given the type Tuple[Int, Str, Bool]. The problem is that, even if the Tuple type parameters are variables, the quantity of elements is fixed statically, i.e., it doesn't work for typing an arbitrarily long list deserialized from JSON input, for instance. But what if I give this value the type Tuple[*Ts], where * is the splice operator (turns a list into multiple arguments), and Ts is, well, a list of types? This list can be given an actual type: List[Type]. So now we have these interdependent dynamic types floating around, and to know the type of the_tuple[i], the type stored at Ts[i] has to be consulted.
I'm not sure how this would work in practice, though, especially when constructing this list. Though maybe in a functional setting, it might work. Our deserialization function would look like (in pseudo-code):
let parse_list(input: Str): Tuple[*Ts] = {
    if input == "" {
        ()  # Returns a Tuple[], and Ts is implicitly [].
    } elif let (value, rest) = parse_integer(input) {
        (value, *parse_list(rest))
        # If parse_list(rest) is of type Tuple[*Ts],
        # (value, *parse_list(rest)) is of type Tuple[Int, *Ts].
    }
    ...
}
For dictionaries, things might be more complicated; the dictionary type is typically along the lines of Dict[KeyType, ValueType], and we are back to the same problem we had with lists. But just as heterogeneous lists map to tuples, we could perhaps map heterogeneous dictionaries to… anonymous records! So instead of having a dictionary {"a": 1, "b": true} of type Dict[Str, T], we would instead have a record (a=1, b=true) of type Record[a: Int, b: Bool]. And just as a dynamic list maps to Tuple[*Ts], a dynamic dictionary maps to Record[**Ts], where Ts is a dictionary of type Dict[Str, Type], mapping each record key to a type.
Could this work? Probably. Would it be practical or efficient? I'm not so sure. Would it be better than the alternative of just having a dynamic container, or even specialized types for dynamic collections? Probably not. But it sure as hell is an interesting idea.
In the previous post, I discussed an idea I had for handling dynamic typing in a primarily statically-typed language. In this post, I intend to first, describe the idea a little better, and second, explain what are the problems with it.
The basic idea is: type parameters of generic functions are reified as implicit run-time values, automatically filled in from the types of the arguments; parameters declared without a type implicitly become generic; and method lookups on generically-typed values consult these type values at run-time.
For example, consider a function signature like:
let f[A, B](arg1: Int, arg2: A, arg3: B, arg4): Bool = ...
This declares a function f with two explicit type parameters A and B, and four regular value parameters arg1 to arg4. arg1 is declared with a concrete Int type. arg2 and arg3 are declared as having types passed in as type parameters. arg4 does not have an explicit type, so in effect it behaves as if the function had an extra type parameter C, and arg4 has type C.
When the function is called, the type arguments don't have to be passed explicitly; rather, they will be automatically provided by the types of the expressions used as arguments. So, if I call f(42, "hello", 1.0, True), the compiler will implicitly pass the types Str and Float for A and B, as well as Bool for the implicit type parameter C.
In the body of f, whenever the parameters with generic types are used, the corresponding type parameters can be consulted at run-time to find the appropriate methods to call. For example, if arg2.foo() is called, a lookup for the method foo inside A will happen at run-time. This lookup might fail, in which case we would get an exception.
This all looks quite beautiful.
The problem is when you introduce generic data structures into the picture. Let's consider a generic list type List[T], where T is a type parameter. Now suppose you have a list like [42, "hello", 1.0, True] (which you might have obtained from deserializing a JSON file, for instance). What type can T be? The problem is that, unlike the case for functions, there is one type variable for multiple elements. If all type information must be encoded in the value of the type parameter, there is no way to handle a heterogeneous list like this.
Having a union type here (say, List[Int|Str|Float|Bool]) will not help us, because union types require some way to distinguish which element of the union a given value belongs to, but the premise was for all type information to be carried by the type parameter so you could avoid encoding the type information into the value.
For a different example, suppose you want to have a list of objects satisfying an interface, e.g., List[JSONSerializable]. Different elements of the list may have different types, and therefore different implementations of the interface, and you would need type information with each individual element to be able to know at run-time where to find the interface implementation for each element.
Could this be worked around? One way would be to have a Dynamic type, whose implementation would be roughly:
record Dynamic(
    T: Type,
    value: T,
)
The Dynamic type contains a value and its type. Note that the type is not declared as a type parameter of Dynamic: it is a member of Dynamic. The implication is that a value like Dynamic(Int, 5) is not of type Dynamic[Int], but simply Dynamic: there is a single Dynamic type container which can hold values of any type and carries all information about the value's type within itself. (I believe this is an existential type, but I honestly don't know enough type theory to be sure.)
Now our heterogeneous list can simply be a List[Dynamic]. The problem is that to use this list, you have to wrap your values into Dynamic records, and unwrap them to use the values. Could it happen implicitly? I'm not really sure. Suppose you have a List[Dynamic] and you want to pass it to a function expecting a List[Int]. We would like this to work, if we want static and dynamic code to run along seamlessly. But this is not really possible, because the elements of a List[Dynamic] and a List[Int] have different representations. You would have to produce a new list of integers from the original one, unwrapping every element of the original list out of its Dynamic container. The same would happen if you wanted to pass a List[Int] to a function expecting a List[Dynamic].
[Addendum (2020-05-31): On the other hand, if I had an ahead-of-time statically-typed compiled programming language that allowed me to toss around types like this, including allowing user-defined records like Dynamic, that would be really cool.]
That's all I have for today, folks. In a future post, I intend to explore how interfaces work in a variety of different languages.
Hello, fellow readers! In this post, I will try to write down some ideas that have been haunting me about types, methods and namespaces in Fenius.
I should perhaps start with the disclaimer that nothing has really happened in Fenius development since last year. I started rewriting the implementation in Common Lisp recently, but I only got to the parser so far, and the code is still not public. I have no haste in this; life is already complicated enough without one extra thing to feel guilty about finishing, and the world does not have a pressing need for a new programming language either. But I do keep thinking about it, so I expect to keep posting ideas about programming language design here more or less regularly.
A year ago, I pondered whether to choose noun-centric OO (methods belong to classes, as in most mainstream OO languages) or verb-centric OO (methods are independent entities grouped under generic functions, as in Common Lisp). I ended up choosing noun-centric OO, mostly because classes provide a namespace grouping related methods, so different classes can define methods with the same name without stepping on each other.
This choice has a number of problems, though; it interacts badly with other features I would like to have in Fenius. Consider the following example:
Suppose I have a bunch of classes that I want to be able to serialize to JSON. Some of these classes may be implemented by me, so I can add a to_json()
method to them, but others come from third-party code that I cannot change. Even if the language allows me to add new methods to existing classes, I would rather not add a to_json()
method to those classes because they might, in the future, decide to implement their own to_json()
method, possibly in a different way, and I would be unintentionally overriding the library method which others might depend on.
What I really want is to be able to declare an interface of my own, and implement it in whatever way I want for any class (much like a typeclass in Haskell, or a trait in Rust):
from third_party import Foo

interface JSONSerializable {
    let to_json()
}

implement JSONSerializable for Foo {
    let to_json() = { ... }
}
In this way, the interface serves as a namespace for to_json(), so that even if Foo implements its own to_json() in the future, it would be distinct from the one I defined in my interface.
The problem is: if I have an object x of type Foo and I call x.to_json(), which to_json() is called?
One way to decide that would be by the declared type of x: if it's declared as Foo, it calls Foo's to_json(), and JSONSerializable's to_json() is not even visible. If it's declared as JSONSerializable, then the interface's method is called. The problem is that Fenius is supposed to be a dynamically-typed language: the declared (static) type of an object should not affect its dynamic behavior. A reference to an object, no matter how it was obtained, should be enough to access all of the object's methods.
One way to conciliate things would be to make it so that the interface wraps the implementing object. By this I mean that, if you have an object x of type Foo, you can call JSONSerializable(x) to get another object, of type JSONSerializable, that wraps the original x and provides the interface's methods.
Moreover, function type declarations can be given the following semantics: if a function f is declared as receiving a parameter x: SomeType, and it's called with an argument v, x will be bound to the result of SomeType.accept(v). For interfaces, the accept method returns an interface wrapper for the given object, if the object belongs to a class implementing the interface. Other classes can define accept in any way they want to implement arbitrary casts. The default implementation for class.accept(v) would be to return v intact if it belongs to class, and raise an exception if it doesn't.
Another option is to actually go for static typing, but in a way that still allows dynamic code to co-exist more or less transparently with it.
In this approach, which methods are visible in a given dot expression x.method is determined by the static type of x. One way to see this is that x can have multiple methods, possibly with the same name, and the static type of x acts like a lens filtering a specific subset of those methods.
What happens, then, when you don't declare the type of the variable/parameter? One solution would be to implicitly consider those as having the basic Object type, but that would make dynamic code extremely annoying to use. For instance, if x has type Object, you cannot call x+1 because + is not defined for Object.
Another, more interesting solution is to consider any untyped function parameter as a generic. So, if f(x) is declared without a type for x, this is implicitly equivalent to declaring it as f(x: A), for a type variable A. If this were a purely static solution, this would not solve anything: you still cannot call addition on a generic value. But what if, instead, A is passed as a concrete value, implicitly, to the function? Then our f(x: A) is underlyingly basically f(x: A, A: Type), with A being a type value packaging the known information about A. When I call, for instance, f(5), under the hood the function is called like f(5, Int), where Int packages all there is to know about the Int type, including which methods it supports. Then if f's body calls x+1, this type value can be consulted dynamically to look up a + method.
Has this been done before? Probably. I still have to do research on this. One potential problem with this is how the underlying interface of generic vs. non-generic functions (in a very different sense of 'generic function' from CLOS!) may differ. This is a problem for functions taking functions as arguments: if your function expects an Int -> Int function as argument and I give it an A -> Int function instead, that should work, but underlyingly an A -> Int takes an extra argument (the A type itself). This is left as an exercise for my future self.
One very interesting aspect of this solution is that it's basically the opposite of typical gradual typing implementations: instead of adding static types to a fundamentally dynamic language, this adds dynamic powers to a fundamentally static system. All the gradual typing attempts I have seen so far try to add types to a pre-existing dynamic language, which makes an approach like this one less palatable since one wants to be able to give types to code written in a mostly dynamic style, including standard library functions. But if one is designing a language from scratch, one can design it in a more static-types-friendly way, which would make this approach more feasible.
I wonder if better performance can be achieved in this scenario, since in theory the static parts of the code can happily do their stuff without ever worrying about dynamic code. I also wonder if boxing/unboxing of values when passing them between the dynamic and static parts of the code can be avoided as well, since all the extra typing information can be passed in the type parameter instead. Said research, as always, will require more and abundant funding.
Fenius now has syntax for functional record updates! Records now have a with(field=value, …) method, which allows creating a new record from an existing one with only a few fields changed. For example, if you have a record:
fenius> record Person(name, age)
<class `Person`>
fenius> let p = Person("Hildur", 22)
Person("Hildur", 22)
You can now write:
fenius> p.with(age=23)
Person("Hildur", 23)
to obtain a record just like p but with a different value for the age field. The update is functional in the sense that the p is not mutated; a new record is created instead. This is similar to the with() method in dictionaries.
Another new trick is that now records can have custom printers. Printing is now performed by calling the repr_to_port(port) method, which can be overridden by any class. Fenius doesn't yet have much of an I/O facility, but we can cheat a bit by importing the functions from the host Scheme implementation:
fenius> record Point(x, y)
<class `Point`>
fenius> import chezscheme

# Define a new printing function for points.
fenius> method Point.repr_to_port(port) = {
    chezscheme.fprintf(port, "<~a, ~a>", this.x, this.y)
}

# Now points print like this:
fenius> Point(1, 2)
<1, 2>
A native I/O API is coming Real Soon Now™.
I'm a bit tired today, so the post will be short.
ready? go!
In Scheme, it is conventional for procedures returning booleans to have names ending in ? (e.g., string?, even?), and for procedures which mutate their arguments to have names ending in ! (e.g., set-car!, reverse!). This convention has also been adopted by other languages, such as Ruby, Clojure and Elixir.
I like this convention, and I've been thinking of using it in Fenius too. The problem is that ? and ! are currently operator characters. ? does not pose much of a problem: I don't use it for anything right now. !, however, is a bit of a problem: it is part of the != (not-equals) operator. So if you write a!=b, it would be ambiguous whether the ! should be interpreted as part of an identifier a!, or as part of the operator !=. So my options are:
Consider ! exclusively as an identifier character. This would mean changing != to something else, like /= (as in Haskell or Common Lisp) or <> (as in Pascal) or ~= (as in Lua). But != is familiar to a lot of people, and I'm not sure it's a good idea to change it.
Consider ! exclusively as an operator character. Then the ! convention is gone and has to be replaced by something else. Currently, when you import a Scheme module into Fenius, the names are converted such that names ending in ? are prefixed with is (e.g., even? becomes isEven), and names ending in ! are prefixed with do (e.g., reverse! becomes doReverse). [Update: I will be switching to snake_case rather than camelCase soon.]
Consider ! as an identifier character if it immediately follows other identifier characters, but as an operator character otherwise (in much the same way as digits are considered part of an identifier if they immediately follow other identifier characters, but as numbers if not). So you would have to write a != b with spaces instead. I don't like this very much.
What do you think? Which of these you like best? Do you have other ideas? Feel free to comment.
In other news, I started to make available a precompiled Fenius binary (amd64 Linux), which you can try out without having to install Chez Scheme first. You should be aware that the interpreter is very brittle at this stage, and most error messages are in terms of the underlying implementation rather than something meaningful for the end user, so use it at your own peril. But it does print stack traces in terms of the Fenius code, so it's not all hopeless.