Elmord's Magic Valley

Computers, languages, and computer languages. Às vezes em Português, sometimes in English.

Switching to the i3 window manager

2020-07-19 16:00 +0100. Tags: comp, unix, wm, emacs, in-english

After almost 3 years using EXWM as my window manager, I decided to give i3 a try. And after two weeks using it, I have to say I'm definitely sticking with it.

I had actually tried i3 years ago, but I had never used a tiling window manager before, and for some reason it didn't click for me at the time. This time, after a long time using EXWM (which I picked up more easily at the time since all the commands were the regular Emacs window/buffer commands I already knew), i3 was quite easy to pick up. So here I am.

Why not EXWM?

EXWM has a lot going for it, mainly from the fact of running in Emacs, and therefore benefitting from the general powers that all things built on Emacs have: it's eminently hackable and customizable (and you can generally see the results of your hacks without even restarting it), and can be integrated in your Emacs workflow in various ways (I gave some examples in my previous EXWM post).

However, it also has some drawbacks. EXWM does not really do much in the way of managing windows: essentially, EXWM just turns all your windows into Emacs buffers, and the window management proper (splitting, deciding which Emacs window will display a new X window, etc.) is left to the built-in window management Emacs uses for its own windows. This can of course be customized to death, but it is not particularly great for large numbers of windows, in my opinion.

Another problem with EXWM is that if Emacs hangs for any reason (e.g., waiting for TRAMP to open a remote SSH file, or syntax highlighting choking on an overly long line in a JSON file or a Python shell), your whole graphical session freezes, because EXWM does not get an opportunity to react to X events while Emacs is hung doing other stuff. This will happen more or less often depending on the kinds of tasks you do with Emacs. (Also, if you have to kill Emacs for any reason, you kill your entire graphical session, though this can be avoided by starting Emacs like this in your .xsession:

until emacs; do :; done

an idea I wish I had had earlier.)

Finally, EXWM is glitchy. Those glitches don't manifest too often, and it's hard to separate the glitches that come naturally with it from the ones caused by my own hacks, but the fact is that I got tired of the glitchiness and hanging, and also I was lured by i3's tabs support, so I decided to switch.

First steps into i3

The first time you start i3, it presents you with a dialog asking whether you want to use Alt or Win (the 'Windows' key, a.k.a. Super) as the modifier key for i3 shortcuts. I recommend choosing Super here, since it will avoid conflicts with shortcuts from applications. i3 will then generate a config file at ~/.config/i3/config.

The generated config file contains all the default keybindings; there are no extra keybindings other than those listed in this file. This is good because you can peruse the config file to have a general idea of what keybindings exist and how their corresponding commands are expressed. That being said, the i3 User's Guide is quite good as well, and you should at least skim over it to get an idea of i3's abilities.

One peculiar thing about the standard keybindings is the use of Super+j/k/l/; to move to the window to the left, down, up, and right, respectively. That's shifted one key to the right of the traditional h/j/k/l movement commands used by Vim and some other programs. The documentation justifies this as being easier to type without moving away from the home row (and also, Super+h is used to set horizontal window splitting), but I ended up changing this to Super+h/j/k/l simply for the convenience of having bindings similar to what other applications use (and then moving horizontal splitting to Super+b, right beside Super+v for vertical splitting).

Unlike EXWM or some other window managers, the i3 config file is not a full-fledged programming language, so it's not as flexible as those other WMs. However, i3 has a trick up its sleeve: the i3-msg program, which allows sending commands to a running i3. Thanks to i3-msg, you can do tasks that require more of a programming language (e.g., conditional execution) by writing small shell scripts. For example, I have a script called jump-to-terminal.sh, which is just:

#!/bin/bash
i3-msg '[class="terminal"] move container to workspace current, focus' | grep -q true || x-terminal-emulator

i.e., try to find an open terminal window and move it to the current workspace; if the operation does not succeed (because there is no open terminal window), open a new terminal. I can then bind this script to a shortcut in the i3 config file. (I've actually changed the script later to not move the window to the current workspace, but it shows how you can string together multiple i3 commands that apply to the same window.)
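For comparison, here is a rough Python sketch of the same idea. It assumes that i3-msg prints a JSON array of result objects with a "success" field (which is what the grep for true in the shell version relies on); the function names are made up for this illustration.

```python
import json
import subprocess


def any_command_succeeded(reply):
    """Return True if any entry of an i3-msg JSON reply reports success.

    This mirrors the `grep -q true` check of the shell version, but parses
    the reply instead of searching the raw text.
    """
    try:
        results = json.loads(reply)
    except json.JSONDecodeError:
        return False
    if not isinstance(results, list):
        return False
    return any(isinstance(r, dict) and r.get("success") is True for r in results)


def jump_to_terminal():
    """Focus an existing terminal window, or start a new terminal."""
    # Same i3 command as in the shell script: one criteria block, two commands.
    command = '[class="terminal"] move container to workspace current, focus'
    done = subprocess.run(["i3-msg", command], capture_output=True, text=True)
    if not any_command_succeeded(done.stdout):
        subprocess.Popen(["x-terminal-emulator"])
```

Calling jump_to_terminal() at the end of the script gives roughly the behaviour of the one-liner. The payoff of the longer version is that further conditions (for example, different actions depending on which command failed) become easy to add.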

Containers, tabs, and more

i3 uses the concept of containers to organize windows: windows themselves are containers (containing the actual X11 window), but the whole workspace is itself a container that can be split into multiple subcontainers that can be arbitrarily nested. Containers can either use a split layout (windows are tiled horizontally or vertically within the container), or a tabbed layout (i3 shows a tab bar at the top of the container, and each contained window is a tab), or a stacked layout (which is the same as the tabbed layout but the tab titles are placed in separate lines rather than side-by-side). You can switch the layout of the current container with the shortcuts Super+w (tabbed), Super+s (stacked), and Super+e (split, toggling between horizontal and vertical tiling).

(Note that what i3 calls "horizontal split container" is a split container with horizontal tiling orientation, i.e., windows are laid out side-by-side. This can be confusing if you expect "horizontal split" to mean that the splitting line will be horizontal. This is the same terminology that Emacs uses for window splitting, but the opposite of Vim.)

Containers can be arbitrarily nested, and you can have different layouts in each subcontainer. For example, you could have your workspace divided into two horizontally-tiled containers, and have a tabbed layout in one of the subcontainers. Note that because of this, it's important to know which container you have selected when you use the layout-changing commands. The colors of the borders tell you that, but it takes a while to get used to paying attention to it. i3 comes with a pre-defined binding Super+a to select the parent of the current container, but not one to select a child; I have found it useful to bind Super+z to focus child for this purpose.

Unnesting containers

The commands Super+v and Super+h (Super+b in my modified keymap) select a tiling orientation for new windows opened in the current container. (Again, the border colors tell you which mode is active.) It implicitly turns the current container into a nested container, so that new windows will become siblings of the current window. It is very easy to create nested containers by accident in this way, especially when you are just starting with i3. Those show up like i3: H[emacs] in the window title (i.e., a horizontally-tiled container containing just an emacs window), and you can even get into multiple levels of nested containers with a single window inside. In these situations, it is useful to have a command to move the current window back to its parent container. Surprisingly, i3 does not have a built-in command for that, but it is possible to concoct one from existing commands (based on this StackExchange answer):

# Move container to parent
bindsym $mod+Shift+a mark subwindow; focus parent; focus parent; mark parent; [con_mark="subwindow"] focus; move window to mark parent; [con_mark="subwindow"] focus; unmark

What this does is use i3 marks (which are like Vim marks, allowing you to assign labels to windows) to mark the current window and its parent's parent, and then move the window inside its parent's parent (i.e., it becomes a sibling of its current parent).
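To make the intended result concrete, here is a toy Python model of a container tree and of the "move to the parent's parent" operation. It only models the outcome described above, not how i3 actually implements marks or moves; the Con class, the position chosen for the moved window, and the removal of the emptied container are assumptions of this sketch.

```python
class Con:
    """A toy container: either a window (has a name) or a split (has children)."""

    def __init__(self, name=None, layout="H"):
        self.name = name        # set for windows only
        self.layout = layout    # "H" or "V" tiling, as in the H[emacs] title
        self.children = []
        self.parent = None

    def add(self, child, index=None):
        """Attach child to this container, optionally at a given position."""
        child.parent = self
        if index is None:
            self.children.append(child)
        else:
            self.children.insert(index, child)
        return child

    def title(self):
        """Render the tree in the style of the i3: H[emacs] window titles."""
        if self.name is not None:
            return self.name
        inner = " ".join(child.title() for child in self.children)
        return f"{self.layout}[{inner}]"


def move_to_grandparent(window):
    """Make window a sibling of its current parent; drop the parent if emptied."""
    parent = window.parent
    grandparent = parent.parent if parent is not None else None
    if grandparent is None:
        return False            # nothing above the parent in this toy tree
    position = grandparent.children.index(parent)
    parent.children.remove(window)
    grandparent.add(window, position + 1)
    if not parent.children:
        grandparent.children.remove(parent)
    return True


# A workspace with an accidentally nested container holding a single window.
workspace = Con(layout="H")
nested = workspace.add(Con(layout="H"))
emacs = nested.add(Con("emacs"))
workspace.add(Con("firefox"))

before = workspace.title()
move_to_grandparent(emacs)
after = workspace.title()
print(before, "->", after)   # H[H[emacs] firefox] -> H[emacs firefox]
```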

The status bar

In EXWM, I had recently implemented a hack to display desktop notifications in the Emacs echo area. I hate desktop notifications appearing on the top of what I'm doing (especially when I'm coding), and I had most of them disabled for this reason until recently, but Slack notifications are useful to see at work. With this hack, I could finally have non-obtrusive desktop notifications. I was only going to switch to i3 if I could find a way to have similar functionality in it.

i3 does not exactly have an echo line, but it does have a desktop bar which shows your workspaces to the left, tray icons to the right, and the output of a status command in the middle. The status command can be any command you want, and the status line shows the last line of output the command has printed so far, so the command can keep updating it. i3bar actually supports two output formats: a plain-text one in which every line is displayed as-is in the status line, and a JSON-based format which allows specifying colors, separators and other features in the output.

This means that you can write a script to listen for D-Bus desktop notifications and print them as they come, together with whatever else you want in the status line (such as a clock and battery status), and blanking them after a while, or when a 'close notification' message is received. I have done just that, and it works like a charm. (It requires python3-pydbus to be installed.) The only problem with this is that the content of the status line is aligned to the right (because it is meant to be used for a clock and stuff like that), and there is no way to make it aligned to the left, so I actually pad the message to be shown with spaces to a length that happens to fit my monitor. It is sub-optimal, but it works well enough.
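The output side of such a script can be sketched in Python as follows. This is not the actual script: it leaves out the D-Bus listening entirely and only shows how status updates could be written in i3bar's JSON format, with the notification padded with spaces as described above. The function names and the padding width of 60 are assumptions for illustration.

```python
import json


def status_blocks(clock, notification=None, pad_to=60):
    """Build the blocks for one status-line update.

    The notification, if any, comes first and is padded with trailing spaces,
    which pushes it towards the left of the right-aligned status area.
    """
    blocks = []
    if notification:
        blocks.append({"full_text": notification.ljust(pad_to)})
    blocks.append({"full_text": clock})
    return blocks


def protocol_lines(updates):
    """Yield the lines of an i3bar JSON stream for a sequence of updates."""
    yield json.dumps({"version": 1})    # header line
    yield "["                           # opens the never-ending array
    for index, blocks in enumerate(updates):
        prefix = "," if index else ""   # array elements are comma-separated
        yield prefix + json.dumps(blocks)


# Three updates: plain clock, clock plus a notification, notification blanked.
for line in protocol_lines([
    status_blocks("16:00"),
    status_blocks("16:00", notification="Slack: new message"),
    status_blocks("16:01"),
]):
    print(line, flush=True)
```

In a real setup the updates would come from a loop that waits for notifications and clock ticks, and the script would be named in the status_command setting of the bar section of the i3 config.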

Conclusion

I'm pretty happy with the switch to i3. Although I've lost the deep integration with Emacs, it has actually been an improvement even for my Emacs usage, since i3 tabs supplement Emacs's lack of tabs better than any tabbing package I have seen for Emacs. (Having tabs for all programs, including things like Evince, is really nice.) If you are interested in tiling window managers and are willing to spend a few days getting used to it, I definitely recommend it.


Type parameters and dynamic types

2020-05-30 17:27 +0100. Tags: comp, prog, pldesign, fenius, in-english

In the previous post, I discussed an idea I had for handling dynamic typing in a primarily statically-typed language. In this post, I intend first to describe the idea a little better, and second to explain what the problems with it are.

The idea

The basic idea is:

For example, consider a function signature like:

let f[A, B](arg1: Int, arg2: A, arg3: B, arg4): Bool = ...

This declares a function f with two explicit type parameters A and B, and four regular value parameters arg1 to arg4. arg1 is declared with a concrete Int type. arg2 and arg3 are declared as having types passed in as type parameters. arg4 does not have an explicit type, so in effect it behaves as if the function had an extra type parameter C, and arg4 has type C.

When the function is called, the type arguments don't have to be passed explicitly; rather, they will be automatically provided by the types of the expressions used as arguments. So, if I call f(42, "hello", 1.0, True), the compiler will implicitly pass the types Str and Float for A and B, as well as Bool for the implicit type parameter C.

In the body of f, whenever the parameters with generic types are used, the corresponding type parameters can be consulted at run-time to find the appropriate methods to call. For example, if arg2.foo() is called, a lookup for the method foo inside A will happen at run-time. This lookup might fail, in which case we would get an exception.
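As a rough illustration, this is what the compiled form of f might look like if written by hand in Python, with the type arguments made explicit. The TypeInfo class, the method table, and the foo method on Str are all invented for this sketch; the body of f in the signature above is elided, so the body here is just an example.

```python
class TypeInfo:
    """Run-time description of a type: a name and a table of methods."""

    def __init__(self, name, methods=None):
        self.name = name
        self.methods = dict(methods or {})

    def lookup(self, method_name):
        """Find a method by name, raising an exception if the type lacks it."""
        try:
            return self.methods[method_name]
        except KeyError:
            message = f"type {self.name} has no method {method_name!r}"
            raise LookupError(message) from None


Int = TypeInfo("Int")
Str = TypeInfo("Str", {"foo": lambda s: len(s) > 0})   # hypothetical foo method
Float = TypeInfo("Float")
Bool = TypeInfo("Bool")


def f(A, B, C, arg1, arg2, arg3, arg4):
    """Hand-written stand-in for f[A, B](arg1: Int, arg2: A, arg3: B, arg4).

    C is the implicit type parameter introduced for the untyped arg4.
    """
    # arg2.foo() becomes a run-time lookup of foo inside A.
    return A.lookup("foo")(arg2)


# The call f(42, "hello", 1.0, True), with the types the compiler would pass.
result = f(Str, Float, Bool, 42, "hello", 1.0, True)
print(result)   # True

# If A has no foo, the lookup fails at run time with an exception.
try:
    f(Float, Float, Bool, 42, 2.5, 1.0, True)
except LookupError as error:
    print(error)   # type Float has no method 'foo'
```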

This all looks quite beautiful.

The problem

The problem is when you introduce generic data structures into the picture. Let's consider a generic list type List[T], where T is a type parameter. Now suppose you have a list like [42, "hello", 1.0, True] (which you might have obtained from deserializing a JSON file, for instance). What type can T be? The problem is that, unlike the case for functions, there is one type variable for multiple elements. If all type information must be encoded in the value of the type parameter, there is no way to handle a heterogeneous list like this.

Having a union type here (say, List[Int|Str|Float|Bool]) will not help us, because union types require some way to distinguish which element of the union a given value belongs to, but the premise was for all type information to be carried by the type parameter so you could avoid encoding the type information into the value.

For a different example, consider you want to have a list of objects satisfying an interface, e.g., List[JSONSerializable]. Different elements of the list may have different types, and therefore different implementations of the interface, and you would need type information with each individual element to be able to know at run-time where to find the interface implementation for each element.

Could this be worked around? One way would be to have a Dynamic type, whose implementation would be roughly:

record Dynamic(
    T: Type,
    value: T,
)

The Dynamic type contains a value and its type. Note that the type is not declared as a type parameter of Dynamic: it is a member of Dynamic. The implication is that a value like Dynamic(Int, 5) is not of type Dynamic[Int], but simply Dynamic: there is a single Dynamic type container which can hold values of any type and carries all information about the value's type within itself. (I believe this is an existential type, but I honestly don't know enough type theory to be sure.)

Now our heterogeneous list can simply be a List[Dynamic]. The problem is that to use this list, you have to wrap your values into Dynamic records, and unwrap them to use the values. Could it happen implicitly? I'm not really sure. Suppose you have a List[Dynamic] and you want to pass it to a function expecting a List[Int]. We would like this to work, if we want static and dynamic code to run along seamlessly. But this is not really possible, because the elements of a List[Dynamic] and a List[Int] have different representations. You would have to produce a new list of integers from the original one, unwrapping every element of the original list out of its Dynamic container. The same would happen if you wanted to pass a List[Int] to a function expecting a List[Dynamic].
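The wrapping and unwrapping cost can be sketched in Python. Note that Python cheats here: its values already carry their types, so wrap() can simply ask for type(value), whereas in the language discussed above the compiler would have to supply T. The helper names wrap and to_int_list are made up for this example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Dynamic:
    T: type      # the type is a field of the record, not a type parameter
    value: Any   # by convention in this sketch, value is an instance of T


def wrap(value):
    """Package a value together with its type."""
    return Dynamic(type(value), value)


def to_int_list(items):
    """Build a new list of ints from a list of Dynamic, checking each element."""
    result = []
    for item in items:
        if item.T is not int:
            raise TypeError(f"expected int, got {item.T.__name__}")
        result.append(item.value)
    return result


# The heterogeneous list from the example, as a list of Dynamic records.
heterogeneous = [wrap(v) for v in (42, "hello", 1.0, True)]
print([d.T.__name__ for d in heterogeneous])   # ['int', 'str', 'float', 'bool']

# Passing a list of Dynamic where a list of ints is expected means copying it.
ints = to_int_list([wrap(1), wrap(2), wrap(3)])
print(ints)   # [1, 2, 3]
```

The conversion is linear in the length of the list and produces a new list, which is the cost discussed above: the two representations cannot simply be shared.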

All of this may be workable, but it is a different experience from regular gradual typing where you expect this sort of mixing and matching of static and dynamic code to just work.

[Addendum (2020-05-31): On the other hand, if I had an ahead-of-time statically-typed compiled programming language that allowed me to toss around types like this, including allowing user-defined records like Dynamic, that would be really cool.]

EOF

That's all I have for today, folks. In a future post, I intend to explore how interfaces work in a variety of different languages.


Types and Fenius

2020-05-19 21:35 +0100. Tags: comp, prog, pldesign, fenius, in-english

Hello, fellow readers! In this post, I will try to write down some ideas that have been haunting me about types, methods and namespaces in Fenius.

I should perhaps start with the disclaimer that nothing has really happened in Fenius development since last year. I started rewriting the implementation in Common Lisp recently, but I only got to the parser so far, and the code is still not public. I have no haste in this; life is already complicated enough without one extra thing to feel guilty about finishing, and the world does not have a pressing need for a new programming language either. But I do keep thinking about it, so I expect to keep posting ideas about programming language design here more or less regularly.

So, namespaces

A year ago, I pondered whether to choose noun-centric OO (methods belong to classes, as in most mainstream OO languages) or verb-centric OO (methods are independent entities grouped under generic functions, as in Common Lisp). I ended up choosing noun-centric OO, mostly because classes provide a namespace grouping related methods, so:

This choice has a number of problems, though; it interacts badly with other features I would like to have in Fenius. Consider the following example:

Suppose I have a bunch of classes that I want to be able to serialize to JSON. Some of these classes may be implemented by me, so I can add a to_json() method to them, but others come from third-party code that I cannot change. Even if the language allows me to add new methods to existing classes, I would rather not add a to_json() method to those classes because they might, in the future, decide to implement their own to_json() method, possibly in a different way, and I would be unintentionally overriding the library method which others might depend on.

What I really want is to be able to declare an interface of my own, and implement it in whatever way I want for any class (much like a typeclass in Haskell, or a trait in Rust):

from third_party import Foo

interface JSONSerializable {
    let to_json()
}

implement JSONSerializable for Foo {
    let to_json() = {
         ...
    }
}

In this way, the interface serves as a namespace for to_json(), so that even if Foo implements its own to_json() in the future, it would be distinct from the one I defined in my interface.

The problem is: if I have an object x of type Foo and I call x.to_json(), which to_json() is called?

One way to decide that would be by the declared type of x: if it's declared as Foo, it calls Foo's to_json(), and JSONSerializable's to_json() is not even visible. If it's declared as JSONSerializable, then the interface's method is called. The problem is that Fenius is supposed to be a dynamically-typed language: the declared (static) type of an object should not affect its dynamic behavior. A reference to an object, no matter how it was obtained, should be enough to access all of the object's methods.

Solution 1: Interface wrappers

One way to reconcile things would be to make it so that the interface wraps the implementing object. By this I mean that, if you have an object x of type Foo, you can call JSONSerializable(x) to get another object, of type JSONSerializable, that wraps the original x, and provides the interface's methods.

Moreover, function type declarations can be given the following semantics: if a function f is declared as receiving a parameter x: SomeType, and it's called with an argument v, x will be bound to the result of SomeType.accept(v). For interfaces, the accept method returns an interface wrapper for the given object, if the object belongs to a class implementing the interface. Other classes can define accept in any way they want to implement arbitrary casts. The default implementation for class.accept(v) would be to return v intact if it belongs to class, and raise an exception if it doesn't.
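A minimal Python sketch of this wrapper idea follows. The registry of implementations, the implement() helper, and the save() function are assumptions made for the example; Foo stands in for the third-party class, and its own to_json() plays the role of a method the library might add later.

```python
import json


class Foo:
    """Stands in for the third-party class."""

    def __init__(self, n):
        self.n = n

    def to_json(self):
        # Hypothetical method added by the library itself at some later point.
        return "library version"


class JSONSerializable:
    """Interface wrapper: holds an object and exposes the interface's methods."""

    _implementations = {}   # maps a class to its to_json implementation

    def __init__(self, wrapped):
        try:
            self._impl = self._implementations[type(wrapped)]
        except KeyError:
            name = type(wrapped).__name__
            raise TypeError(f"{name} does not implement JSONSerializable") from None
        self._wrapped = wrapped

    @classmethod
    def implement(cls, klass, to_json):
        """Register an implementation of the interface for klass."""
        cls._implementations[klass] = to_json

    @classmethod
    def accept(cls, value):
        """Return value as a wrapper, wrapping it unless it already is one."""
        if isinstance(value, cls):
            return value
        return cls(value)

    def to_json(self):
        return self._impl(self._wrapped)


# The equivalent of: implement JSONSerializable for Foo { ... }
JSONSerializable.implement(Foo, lambda foo: json.dumps({"n": foo.n}))


def save(x):
    # What declaring the parameter as x: JSONSerializable would do implicitly.
    x = JSONSerializable.accept(x)
    return x.to_json()


foo = Foo(5)
print(foo.to_json())   # library version
print(save(foo))       # {"n": 5}
```

The two to_json() methods never clash: which one is called depends on whether the caller holds the bare object or the wrapper.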

Solution 2: Static typing with dynamic characteristics

Another option is to actually go for static typing, but in a way that still allows dynamic code to co-exist more or less transparently with it.

In this approach, which methods are visible in a given dot expression x.method is determined by the static type of x. One way to see this is that x can have multiple methods, possibly with the same name, and the static type of x acts like a lens filtering a specific subset of those methods.

What happens, then, when you don't declare the type of the variable/parameter? One solution would be to implicitly consider those as having the basic Object type, but that would make dynamic code extremely annoying to use. For instance, if x has type Object, you cannot call x+1 because + is not defined for Object.

Another, more interesting solution, is to consider any untyped function parameter as a generic. So, if f(x) is declared without a type for x, this is implicitly equivalent to declaring it as f(x: A), for a type variable A. If this were a purely static solution, this would not solve anything: you still cannot call addition on a generic value. But what if, instead, A is passed as a concrete value, implicitly, to the function? Then our f(x: A) is underlyingly basically f(x: A, A: Type), with A being a type value packaging the known information about A. When I call, for instance, f(5), under the hood the function is called like f(5, Int), where Int packages all there is to know about the Int type, including which methods it supports. Then if f's body calls x+1, this type value can be consulted dynamically to look up a + method.

Has this been done before? Probably. I still have to do research on this. One potential problem with this is how the underlying interface of generic vs. non-generic functions (in a very different sense of 'generic function' from CLOS!) may differ. This is a problem for functions taking functions as arguments: if your function expects an Int -> Int function as argument and I give it an A -> Int function instead, that should work, but underlyingly an A -> Int takes an extra argument (the A type itself). This is left as an exercise for my future self.
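The hidden type argument and the resulting mismatch can be made visible with a small Python sketch. The TypeInfo class and the specialize() adapter are inventions of this example; the adapter is just one conceivable workaround, not something the post proposes.

```python
class TypeInfo:
    """A type value packaging what is known about a type, including its methods."""

    def __init__(self, name, methods):
        self.name = name
        self.methods = methods


Int = TypeInfo("Int", {"+": lambda a, b: a + b})


def f(x, A):
    """f(x) with an untyped x, written out as f(x: A, A: Type)."""
    return A.methods["+"](x, 1)   # x+1, looked up through the type value


def apply_to_five(g):
    """A static function expecting an Int -> Int function: it passes one argument."""
    return g(5)


def specialize(generic, type_value):
    """Fix the hidden type argument, yielding an ordinary one-argument function."""
    return lambda x: generic(x, type_value)


print(f(5, Int))   # 6

try:
    apply_to_five(f)   # f still expects its hidden type argument
except TypeError as error:
    print("mismatch:", error)

print(apply_to_five(specialize(f, Int)))   # 6
```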

Gradual typing in reverse

One very interesting aspect of this solution is that it's basically the opposite of typical gradual typing implementations: instead of adding static types to a fundamentally dynamic language, this adds dynamic powers to a fundamentally static system. All the gradual typing attempts I have seen so far try to add types to a pre-existing dynamic language, which makes an approach like this one less palatable since one wants to be able to give types to code written in a mostly dynamic style, including standard library functions. But if one is designing a language from scratch, one can design it in a more static-types-friendly way, which would make this approach more feasible.

I wonder if better performance can be achieved in this scenario, since in theory the static parts of the code can happily do their stuff without ever worrying about dynamic code. I also wonder if boxing/unboxing of values when passing them between the dynamic and static parts of the code can be avoided as well, since all the extra typing information can be passed in the type parameter instead. Said research, as always, will require more and abundant funding.


Irish nouniness

2020-04-10 23:21 +0100. Tags: lang, in-english

At work, when we make changes to some code, we make a pull request and send the link for others to review the changes, usually accompanied by a message such as "Can you review?", "Please review", or other variations depending on how inspired we are feeling on each particular occasion. A couple of weeks ago, I considered sending the message in Irish just for fun. I barely know any Irish, though, so I resorted to Google Translate. Upon entering the supposedly simple sentence "Can you review?" into it, however, I was confronted with the following translation:

An féidir leat athbhreithniú a dhéanamh?

And I was twice perplexed. At first, I was surprised because I got more words than I was expecting to get back. But the more I stared at this sentence, the more perplexed I got, for a different reason: I couldn't identify the verbs in it, at least not in the form and places I expected to see them.

Let's go through this sentence in order. 'An' is, among other things, an interrogative particle. Usually the verb is the first thing in the sentence in Irish, but the interrogative particle (and other question words), if present, comes before the verb. The next word, 'féidir', however, is not a verb: it is a noun meaning 'ability, possibility'. You can see where this is going: instead of using a verb like can, we phrase the question in terms of the ability to do something. So, no verb involved here. But whose ability?

The next word is 'leat', which is an inflected preposition: it is the preposition 'le' ('with') in the second person singular (you).1 This is actually part of a possessive construction in Irish: Irish does not have a verb to have; one way (among many) to express possession is to say that the possessed thing is 'with' the possessor. So, 'you have the ability' translates as 'is féidir leat', meaning literally '(there) is ability with you'. When asking a question, though, the question word 'an' is added and the verb 'is' can be omitted. So, 'an féidir leat?' = 'is there ability with you?' = 'can you?'.

The next word is 'athbhreithniú'. This is a noun meaning 'review', the verbal noun corresponding to the verb 'athbhreithnigh' meaning 'to review'. A verbal noun is a noun describing the action denoted by a verb. This is similar to a gerund in English: for the verb to sing, there is the corresponding action of singing. There is an important difference between an English-style gerund and a Celtic-style verbal noun, though: while a gerund can take an object, as in "I like singing that song", a verbal noun cannot: instead, you have to phrase it as something akin to "I like the singing of that song", introducing the object with a genitive/possessive construction. To give another example, the verbal noun for the verb construct would be less like constructing ("constructing houses takes time") and more like construction ("the construction of houses takes time"). Irish does not have gerunds or infinitives: it does everything with verbal nouns.

The next word is 'a', and I don't really know what this little word is doing here, though I have some hunches.2 I will ignore it for now.

Finally, the last word is 'dhéanamh'. That's the verbal noun of 'déan', meaning to do. So the sentence is actually phrased in terms of doing a review, rather than reviewing directly.3 But even to do is presented as a verbal noun here, so it's actually the doing of a review.

So the full sentence:

An féidir leat athbhreithniú a dhéanamh?

if translated literally, would be:

Is there possibility with you of the doing of a review?

And since the verb 'is' is omitted after the question particle, there is no verb (other than verbal nouns) in this sentence.

Nouns everywhere

Irish reliance on verbal nouns means that sentences tend to look much more nouny in Irish than in the average Indo-European language. But that's only one element of it. Irish uses nouns in many other constructions where we tend to expect verbs in European languages. States of mind seem to be particularly prone to be rendered as nouns rather than verbs in Irish. For instance (and with the help of Google Translate, so take these translations with a grain of salt):

Some of these have verbal equivalents, but the noun form seems to be preferred.

Conclusion

The conclusion is that Irish, and Celtic languages in general, seem to love nouns far more than their non-Celtic Indo-European cousins. In the grand scheme of the world's languages, this may not be too weird: some languages are known to lean towards the verby side of things, so it should not come as too great a surprise that some lean towards nouns. Nevertheless, in the context of Indo-European languages, the Celtic family really seems to stand out in its penchant for nouniness.

_____

1 If you know Portuguese or Spanish, this is similar to how these languages have a special form of the preposition com/con (with) combined with the personal pronouns: comigo/conmigo (with me), contigo (with you), etc. The difference is that in Irish, every preposition has inflected forms for each person.

2 If I understand correctly, what we have here is a cleft sentence construction. A cleft sentence is when you turn a sentence like I wrote the book into it was me who wrote the book. Basically you bring a bit of the sentence into focus (in this case, me) by moving everything else into a subclause. English does this for emphasis (in our example, to emphasize it was me and not anyone else). Some languages, like Portuguese, regularly do this for questions: como é que eu chego no aeroporto? (lit. how is it that I arrive in the airport?). And in Irish this seems to be mandatory in a variety of situations I don't understand very well at all.

3 After I wrote this, I realized that that's because the original sentence lacks an object, which actually sounds weird even in English ("can you review what?"), so Translate probably rendered it as "doing a review" as a way to get rid of the missing object. If I enter "Can you review it?" instead, Translate gives "An féidir leat é a athbhreithniú?", without resorting to "dhéanamh".


#255

2020-03-19 22:53 +0000. Tags: about, life, mind, ramble, in-english

It's been a while since I last posted here. Many things have happened in the meantime, including a global pandemic. But also a move to Lisbon, a new job, and meeting lots of great people. Not a bad beginning of year, on balance.

Life has not been quite the piece of cake either. Every person has their problems, ghosts and flaws that we have to learn to overcome and to live with – those things are not mutually exclusive, really. Turns out you can practice both self-compassion and self-improvement.

This blog has been around for 8 years now. I'm glad it has lasted so long, and I hope it lasts yet longer. My presence in various social networks has waxed and waned over the years, but this blog (and the website as a whole) has been my permanent corner on the Internet. A consistent bit of life (and a good one at that) across changes in residence (six, if I count correctly), jobs, moving from academia to industry, and now moving from one country to another. Readership has also been relatively consistent – small but consistent – and I'm glad to have you fellow readers to share stuff with.

I hope to start posting more frequently here again. I'll be happy if I succeed in posting at least once or twice per month, but I won't promise any numbers.

As the first post of 2020, I'll finish with a late new year, but timely equinox resolution:

To not live ruled by fear.

I wish everyone a happy equinox, and may the sun shine for us in complicated times.


Chez Scheme vs. SBCL: a comparison

2019-11-14 11:06 -0300. Tags: comp, prog, lisp, scheme, in-english

Back at the beginning of the year, when I started working on what would become Fenius (which I haven't worked on for a good while now; sorry about that), I was divided between two languages/platforms for writing the implementation: Chez Scheme (a Scheme implementation) and SBCL (a Common Lisp implementation). I ended up choosing Chez Scheme, since I like Scheme better. After using Chez for a few months now, however, I've been thinking about SBCL again. In this post, I ponder the two options.

Debugging and interactive development

The main reason I've been considering a switch is this: my experience with interactive development with Chez has been less than awesome. The stack traces are uninformative: they don't show the line of code corresponding to each frame (rather, they show the line of code of the entire function, and only after you ask to enter debug mode, inspect the raise continuation, and print the stack frames), and you can't always inspect the values of parameters/local variables. The recommended way to debug seems to be to trace the functions you want to inspect; this will print each call to the function (with arguments) and the return value of each call. But you must do it before executing the function; it won't help you interpret the stack trace of an exception after the fact.

The interaction between Chez and Geiser (an Emacs mode for interactive development with Scheme) often breaks down too: sometimes, trying to tab-complete an identifier will hang Emacs. From my investigation, it seems that what happens is that the Chez process will enter the debugger, but Geiser is unaware of that and keeps waiting for the normal > prompt to appear. Once that happens, it's pretty much stuck forever (you can't tab-complete anymore) until you restart Chez. There is probably a solution to this; I just don't know what it is.

As I have mentioned before, Chez has no concept of running the REPL from inside a module (library in Scheme terminology), which means you can't call the private functions of a module from the REPL. The solution is… not to use modules, or to export everything, or to split the code so you can load the module code without the module-wrapping form.

By contrast, SBCL works with SLIME, the Superior Lisp Interaction Mode for Emacs. SLIME lets you navigate the stack trace, see the values of local variables by pressing TAB on a frame, and press v to jump to the code corresponding to a stack frame (right to the corresponding expression, not just the line), among other features. Common Lisp is more committed to interactive development than Scheme in general, so this point is a clear win for SBCL.

(To be fair, Guile Scheme has pretty decent support for interactive development. However, Guile + Geiser cannot do to stack traces what SBCL + SLIME can.)

Execution speed

In my experience, SBCL and Chez are both pretty fast – not at the "as fast as hand-crafted C" level, but pretty much as fast as I could desire for a dynamically-typed, garbage-collected, safe language. In their default settings, Chez actually often beats SBCL, but SBCL by default generates more debugger-friendly code. With all optimizations enabled, Chez and SBCL seem to be pretty much the same in terms of performance.

One advantage SBCL has is that you can add type annotations to code to make it faster. Be careful with your optimization settings, though: if you compile with (optimize (safety 0)), "[a]ll declarations are believed without assertions", i.e., the compiler will generate code that assumes your declared types are correct, and will produce undefined behavior (a.k.a. nasal demons) if they are not.

Startup time and executable size

This one is complicated. On my machine, Chez compiles a "hello world" program to an 837-byte .so file, which takes about 125ms to run – a small but noticeable startup time. A standalone binary compiled with chez-exe weighs in at 2.7MB and takes 83ms – just barely noticeable.

As for SBCL, a "hello world" program compiles to a 228-byte .fasl file, which runs in 15ms, which is really good. The problem is if the file loads libraries. For instance, if I add this to the beginning of the "hello world":

(require 'asdf)        ;; to be able to load ASDF packages
(require 'cl-ppcre)    ;; a popular regex library

…now the script takes 422ms to run, which is very noticeable.
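
For reference, one rough way to take measurements like these is to average the wall-clock time over several runs. The sketch below is only an illustration, not necessarily how the numbers in this post were obtained: it assumes GNU date (for the %N nanoseconds format), and the function name avg_runtime_ms and the commands in the usage comments are made up.

```shell
#!/bin/bash
# Average wall-clock time (in whole milliseconds) of running a command
# a given number of times. Assumes GNU date, whose %N prints nanoseconds.
avg_runtime_ms() {
    local runs=$1
    shift
    local start end i=0
    start=$(date +%s%N)
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's output; we only care about the timing.
        "$@" >/dev/null 2>&1
        i=$((i + 1))
    done
    end=$(date +%s%N)
    # Nanoseconds -> milliseconds, averaged over all runs.
    echo $(( (end - start) / runs / 1000000 ))
}

# Hypothetical usage; substitute your own programs:
# avg_runtime_ms 10 ./hello-standalone
# avg_runtime_ms 10 ./hello-with-libraries
```

Averaging over several runs smooths out disk-cache effects, which matter a lot for numbers in the tens of milliseconds.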

SBCL can also generate standalone executables, which are actually dumps of the whole running SBCL image: you can load all the libraries you want and generate an executable with all of them preloaded. If we do that, we're back to the excellent 15ms startup time – but the executable is 45MB, because it contains a full-fledged SBCL in it (plus libraries). It's a bit of a bummer if you intend to create multiple independent command-line utilities, for example. Also, I guess it's easier to convince people to download a 2.7MB file than a 45MB one when you want them to try out your fancy new application, though that may not be that much of a concern these days. (The binary compresses down to 12MB with gzip, and 7.6MB with xz.)

Another worry I have is memory consumption (which is a concern in cheap VPSes such as the one running this blog, for instance): running a 45MB binary will use at least 45MB of RAM, right? Well, not necessarily. When you run an executable, the system does not really load all of the executable's contents into memory: it maps the code (and other) sections of the executable into memory, but they will actually only be loaded from the disk to RAM as the memory pages are touched by the process. This means that most of those 45MB might never actually take up RAM.

In fact, using GNU time (not the shell time builtin, the one in /usr/bin, package time on Debian) to measure maximum memory usage, the SBCL program uses 19MB of RAM, while the Chez program uses 27MB. So the 45MB SBCL binary is actually more memory-friendly than the 2.7MB Chez one. Who'd guess?
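
The measurement above can be sketched as follows, assuming GNU time is installed at /usr/bin/time. The %M format directive reports the maximum resident set size in kilobytes; the function name max_rss_kb and the binary names in the usage comments are hypothetical.

```shell
#!/bin/bash
# Print the peak resident set size (in kB) of a command, using GNU time.
# Assumes GNU time lives at /usr/bin/time (package "time" on Debian).
max_rss_kb() {
    local out
    out=$(mktemp) || return 1
    # -f '%M' selects maximum RSS in kilobytes; -o writes the measurement
    # to a file so it does not mix with the command's own stderr.
    # The command's stdout is discarded.
    /usr/bin/time -f '%M' -o "$out" "$@" >/dev/null
    cat "$out"
    rm -f "$out"
}

# Hypothetical usage; substitute your own binaries:
# max_rss_kb ./hello-sbcl
# max_rss_kb ./hello-chez
```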

Available libraries

Common Lisp definitely has the edge here, with over a thousand libraries (of varying quality) readily available via Quicklisp. There is no central repository or catalog of Chez (or Scheme) libraries, and there are not many Chez libraries that I'm aware of (although I wish I had learned about Thunderchez earlier).

[Addendum (2019-11-16): @Caonima67521344 on Twitter points out there is the Akku package manager for Chez and other Schemes.]

The language itself

This one is a matter of personal taste, but I just like Scheme better than Common Lisp. I like having a single namespace for functions and variables (which is funny considering I was a big fan of Common Lisp's two namespaces back in the day), and not having to say funcall to call a function stored in a variable. I like false being distinct from the empty list, and for cdr of the empty list to be an error rather than nil. I like Scheme's binding-based modules better than Common Lisp's symbol-based packages (although Chez modules are annoying to use, as I mentioned before; Guile is better in this regard). Common Lisp's case insensitivity by default plus all standard symbols being internally uppercase is a bit annoying too. Scheme has generally more consistent names for things as well. I used to dislike hygienic macros back in the day, but nowadays, having syntax-case available to break hygiene when necessary, I prefer hygienic macros as well.

And yet… Common Lisp and Scheme aren't that different. Most of those things don't have a huge impact on the way I code. (Well, macros are very different, but anyway.) One thing that does have an impact is using named let and recursion in Scheme vs. loops in Common Lisp: named let (similar to Clojure's loop/recur) is one of my favorite Scheme features, and I use it all the time. However, it is not difficult to implement named let as a macro in Common Lisp, and if you only care about tail-recursive named let (i.e., Clojure's loop/recur), it's not difficult to implement an efficient version of it in Common Lisp as a macro. Another big difference is call/cc (first class continuations) in Scheme, but I pretty much never use call/cc in my code, except possibly as escape continuations (which are equivalent to Common Lisp's return).

On the flip side, Common Lisp has CLOS (the Common Lisp Object System) in all its glory, with generic functions and class redefinition powers and much more. Guile has GOOPS, which provides many of the same features, but I'm not aware of a good equivalent for Chez.

Conclusion

As is usually the case when comparing programming languages/platforms, none of the options is an absolute winner in all regards. Still, for my use case and for the way I like to program, SBCL looks like a compelling option. I'll have to try it for a while and see how it goes, and tell you all about it in a future post.


A bunch of things I learned while fighting androids

2019-11-10 23:06 -0300. Tags: comp, android, in-english

I recently had to bypass Android's Factory Reset Protection again, this time for a Samsung Galaxy J4. The procedure in the end turned out to be relatively simple (find a way to get to a browser from the initial screen, download a pair of APKs, finish the Google account login with a random Google account, uninstall the APKs). However, due to the circumstances I was operating in, I spent a lot of time figuring out how to share an internet connection from my laptop with a second Android phone so I could share it with the J4 using the second phone as a wi-fi hotspot. This post documents what I learned.

Bypassing FRP on the Samsung Galaxy J4

There are dozens of videos on YouTube explaining how to do it. I will summarize the information here.

Part 1: Getting to a browser

Part 2: Installing a bunch of APKs

That's it.

Internet sharing shenanigans

Sharing your Android phone's internet connection with your computer is pretty easy: you just enable USB tethering on the phone, and everything magically works (at worst, you have to call dhclient YOUR-USB-INTERFACE on Linux if you don't have NetworkManager running). Doing the opposite, i.e., sharing your (Linux) computer connection with the phone, has to be done manually. Here is how it goes (I'm assuming a rooted phone; mine runs LineageOS 14.1 (Android 7)):

Now, to share the internet connection:

We still have to set DNS. Android does not seem to have a resolv.conf file; I found multiple ways you can set DNS (using 1.0.0.1 and 1.1.1.1 as the DNS servers in the examples):

The last one is the only one that worked for me – and it requires two DNS servers as arguments.

By now, you should have a working internet connection on your phone (you can try it in the browser, for example).

If you want to share it with other devices via wi-fi, you can now enable Wi-Fi Hotspot on the phone. However, there is another weird thing here: for some reason, my phone would reject all DNS queries coming from the other devices. The 'solution' I ended up using was to redirect all requests to port 53 (DNS) coming from other devices to the DNS server I wanted to use:

iptables -t nat -A PREROUTING -p tcp --dport 53 -j DNAT --to-destination 1.0.0.1:53
iptables -t nat -A PREROUTING -p udp --dport 53 -j DNAT --to-destination 1.0.0.1:53

This will skip the Android builtin DNS server entirely, and send DNS requests directly to the desired DNS server.


Random remarks on Old Chinese type-A/type-B syllables

2019-09-25 22:06 -0300. Tags: lang, old-chinese, in-english

Every now and then Academia.edu throws an interesting paper suggestion into my inbox. Today I got a paper titled A Hypothesis on the origin of Old Chinese pharyngealization, by Laurent Sagart and William Baxter (2016). [Note: the linked paper is a draft, not the final published article. I don't know if there is any difference in the content between the draft and the final version.]

This post contains some observations and impressions about the paper. I should note as a disclaimer that I'm not a specialist in Old Chinese at all; I'm just this random person on the internet who has read a bunch of papers and watched the videos of Baxter & Sagart's 2007 workshop I mentioned in a previous post. This post should be seen as my personal notes from trying to understand and think about this subject.

I have to say that despite being a huge fan of Baxter & Sagart's work, this paper did not convince me. In fact, it actually weakened my previous belief in the B&S pharyngeal hypothesis a bit. Anyway, here we go.

[P.S.: By the end of the post, I get re-convinced about the pharyngeal hypothesis. This post ended up very rambly.]

* * *

Old Chinese is traditionally reconstructed as having two types of syllables, called type A and type B. Type B syllables are characterized by having a /-j-/ medial in Middle Chinese; type A ones are characterized by the lack of such palatal medial. Traditional reconstructions (e.g., Karlgren's) reconstruct this /-j-/ back into Old Chinese. More recent reconstructions have put this in doubt, though.

These are some known facts about type A and type B syllables:

These are some post-Karlgren theories about the type A/B distinction:

Back in 2007 (at the B&S workshop), B&S notated type A syllables by doubling the initial, as a way to indicate them without committing to any particular realization (pharyngeal or otherwise), or to whether this was a feature of the initial or of the whole syllable. Norman considered pharyngealization to be a feature of the whole syllable, rather than the initial. B&S seems to have shifted towards considering it a feature of the initial; the paper under consideration here explicitly argues for pharyngealization to be a feature of the initial, coming from previous /Cʕ/ (consonant + pharyngeal) clusters. (The paper argues that "type-A and type-B syllables seem to rhyme with each other freely in Old Chinese poetry, which would be unexpected if pharyngealization was a feature of the rhyme as well as the onset". That is a good point, though it might just be that pharyngealization was not considered relevant in rhyming.)

In that system, every consonant has a plain and a pharyngealized version. At first sight, this looks a bit crazy, but that's not much different from, say, Irish having a palatalized and a velarized version of every consonant. There are some suspicious combinations, though; in particular, the pharyngealized glottal stop /ʔˤ/ does not look very convincing. It would not seem so problematic to me if pharyngealization were a feature of the syllable as a whole, since it would still be clearly articulated in the vowel; as a feature of the initial, it does not seem very likely.

The paper argues that these pharyngealized consonants come from pre-Old-Chinese /Cʕ/ clusters. The paper says these are clusters with a "pharyngeal fricative". One thing I just learned is that the /ʕ/ symbol can be used for either a pharyngeal fricative or an approximant; I only knew the approximant usage.

The paper further argues that these clusters come from previous CVʕV- syllables, i.e., the development was of the form:

nuʕup > nʕup > nˤup

It argues then that the corresponding long/short distinction in Lushai comes from the loss of the middle /ʕ/ and subsequent fusion of the identical vowels.

This is motivated by parallel developments in Austronesian and Austroasiatic. Proto-Austronesian seems to have had a constraint against single-syllable words: whenever a single-syllable CVC root would appear by itself (without affixes), it would surface as CV(ʔ)VC instead, with a long vowel 'interrupted' by a glottal stop, as a way to enforce the two-syllable constraint. The paper proposes the same constraint was present in a language ancestral to Proto-Sino-Tibetan. (It does not explicitly claim that this ancestral language would be the parent of Proto-Sino-Tibetan and Austronesian and Austroasiatic.) Syllables of the CV(ʔ)VC or CV(ʕ)VC type would then lead to pharyngealized, type A syllables in Old Chinese on the one hand, and long vowels after loss of the mid-consonant in Lushai.

I'm not very convinced by this idea. For one, there is no direct evidence for the /ʕ/ phoneme in Sino-Tibetan as far as I know. Of course, this was the same argument used against the laryngeal hypothesis in Proto-Indo-European during Saussure's lifetime, until Hittite was discovered, which did partially preserve a laryngeal phoneme. The same could be true of the posited /ʕ/ phoneme. I'm not sure the case for /ʕ/ in Proto-Sino-Tibetan is as strong, though.

We are trying to account for a length distinction on one side, and type A/B on the other. Paper footnote 4 says: "Starostin accounted for the correlation by reconstructing a parallel length distinction for Old Chinese long vowels in type A and short vowels in type B. While this reconstruction makes sense of the apparent correlation with Lushai, there is no direct Chinese evidence for it, and it does not help explain the Hàn-time sound changes described above, which affected type A and type B differently."

The first thing to note is that there is no direct Chinese evidence for pharyngealized consonants either. Now for the Hàn-time sound changes referenced. I quote:

Inspired by the treatment in Norman (1994), Baxter and Sagart (2014) assign pharyngealization to OC type-A words, and absence of pharyngealization to type-B words. The main argument for reconstructing pharyngealization is a set of sound changes that occurred during the Hàn period (206 BCE – 220 CE), which affected type-A syllables and type-B syllables differently: original high vowels remain high in type-B syllables, but are lowered in type-A syllables; and original low vowels, which are raised in certain environments in type-B syllables, remain low in type-A syllables. Also, initial consonants often underwent palatalization in type-B syllables, but escaped such palatalization in type-A syllables. Reconstructing pharyngealization in the onset of type-A syllables seems to provide a plausible explanation for these differences, more so than any of the alternative proposals.[1]

Let's summarize the first part:

If we interpret A/B type as length, we would say that long vowels are lowered, and short vowels are raised. Would it be too crazy to consider length itself as the influencing factor? Vulgar Latin has also shifted vowels based on length; however, Vulgar Latin shows the opposite development: it is the short vowels that get lowered. Moreover, type A (i.e., long) prevents palatalization. Here we could perhaps make a better parallel to Vulgar Latin, where short /e/, /o/ become diphthongized /je/, /we/ (< /wo/) in Spanish. However, type B syllables palatalize regardless of the vowel, so the parallel breaks down again. Maybe B&S is right about pharyngealization after all.

The above quote has a footnote:

Moreover, at least one Hàn-dynasty commentator describes the difference between a type-A syllable and a type-B syllable by stating that the type-A syllable is “inside and deep” (nèi ér shēn 內而深), while the type-B syllable is “outside and shallow” (wài ér qiǎn 外而淺); “inside and deep” seems a natural way to describe the retraction of the tongue root that characterizes pharyngealization. See Baxter & Sagart (2014:72–73).

At the same time, Wikipedia has the following quote:

Pulleyblank initially proposed that type B syllables had longer vowels.[89] Later, citing cognates in other Sino-Tibetan languages, Starostin and Zhengzhang independently proposed long vowels for type A and short vowels for type B.[90][91][92] The latter proposal might explain the description in some Eastern Han commentaries of type A and B syllables as huǎnqì 緩氣 'slow breath' and jíqì 急氣 'fast breath' respectively.[93]

It is hard to make sense of these ancient quotes. It also makes one contemplate how much information, small clues and indirect evidence is out there for reconstructing Old Chinese, and wonder how much an amateur like me can hope to grasp about this subject.

The paper finishes with a discussion of the correlation between Lushai length and Chinese A/B type. The whole argument of the paper hinges on there being such a correlation, so they decided to check how much evidence there is for the correlation. After filtering candidates to avoid problematic cases, they get to 43 comparanda in Proto-Kuki-Chin and Old Chinese, and present the following table:

                  PKC long   PKC short
Chinese type A        6          6
Chinese type B        5         26

They conclude that the correlation is statistically significant. One thing stands out to me, though: although PKC short and Chinese type B seem to strongly correlate, there does not seem to be a strong correlation at all between PKC long and Chinese type A, which is a bit disturbing. While this may be an effect of the small sample and the fact that there are more short (32) than long (11) words in the sample, and more type B (31) than type A (12), there may also be something meaningful going on here.
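
A quick way to see this asymmetry is to turn the counts from the table above into row and column percentages. The sketch below does only that arithmetic; it does not reproduce whatever significance test the paper used, and the variable and function names are made up for illustration.

```shell
#!/bin/bash
# Counts from the table: rows are Chinese type A/B,
# columns are PKC long/short.
a_long=6; a_short=6
b_long=5; b_short=26

# Print "num/den = pct%", with the percentage rounded to a whole number.
share() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%d/%d = %.0f%%\n", n, d, 100 * n / d }'
}

echo "PKC short that are type B: $(share "$b_short" $((a_short + b_short)))"
echo "PKC long that are type A:  $(share "$a_long" $((a_long + b_long)))"
echo "Type B that are PKC short: $(share "$b_short" $((b_long + b_short)))"
echo "Type A that are PKC long:  $(share "$a_long" $((a_long + a_short)))"
```

About four in five PKC short words are type B (and vice versa), whereas PKC long words and type A words split roughly evenly, which is the imbalance discussed here.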

Let's interpret this table:

One way to interpret this is that there is a feature in Proto-Sino-Tibetan (PST) whose presence triggers type B in Old Chinese and short vowels in PKC, but whose absence does not influence the syllable's type. This would make type B the marked element again, which is unsatisfying, and would also make short vowels the marked element, which is even less satisfying.

The correlation may be less direct. For example, it might be that PST had both length and pharyngealization (or something that yields length and pharyngealization as reflexes), but only long syllables could be pharyngealized. Then short PST syllables would yield PKC short and Chinese type B, but long syllables could get either type A or B. However, this would imply that long PST syllables don't always yield long PKC syllables. It might just be so, or it might be that the PST feature that enabled length and pharyngealization (or whatever was type A/B) distinctions was a third one, say, only syllables of a certain kind could carry those distinctions. The absence of that feature would yield type B and PKC short, but its presence enabled syllables to go either way.

Conclusions

The main aim of the paper is to show that long vowels interrupted by a pharyngeal element were the origin of both Old Chinese type A syllables (argued to have pharyngealized initials) and Lushai long vowels (after loss of the pharyngeal element). The fact that the correlation only seems to appear between short vowels and type B, but not long vowels and type A, suggests that long vowels and type A do not share a common origin, only perhaps a common enabling environment (i.e., they can occur in the same environments, but are distinct features, in Proto-Sino-Tibetan). In my opinion, this undermines the motivation for reconstructing /CVʕV-/ roots for Sino-Tibetan.

Pharyngealization still seems a compelling explanation for the phenomena observed with type A syllables. However, it is not clear to me there is any good reason to consider it a feature of the initial (as the article proposes) rather than the whole syllable (as in Norman's original proposal).


From Thunderbird to Liferea as a feed reader

2019-09-20 18:04 -0300. Tags: comp, unix, mundane, in-english

I've recently switched from Thunderbird to Liferea as my RSS feed reader. Thunderbird was randomly failing to update feeds at times*, and I thought it might be a good idea to use separate programs for e-mail and RSS for a change, so I went for Liferea. (I considered Elfeed too, but Elfeed does not support folders, only tags. In principle, tags can do everything folders can and more; the problem is that Elfeed cannot show a pane with all tags and the number of unread articles with each tag, the way Thunderbird or Liferea (or your average mail client) can do with folders.)

Liferea is pretty good, although I miss some shortcuts from Thunderbird, and sometimes shortcuts don't work (because focus is on some random widget). Here are some tips and tricks.

Importing feeds from Thunderbird to Liferea

Thunderbird can export the feed list in OPML format (right click on the feed folder, click Subscribe…, then Export). You can then import that into Liferea (Subscriptions > Import Feed List). No surprises here.

The tray icon

Liferea comes with a number of plugins (Tools > Plugins). By default, it comes with the Tray Icon (GNOME Classic) plugin enabled, which, unsurprisingly, creates a tray icon for Liferea. The problem with this for me is that whenever the window is 'minimized', Liferea hides the window entirely; you can only bring it back by clicking on the tray icon. I believe the idea is so that the window does not appear in the taskbar and the tray, but this interacts badly with EXWM, where switching workspaces or replacing Liferea with another buffer in the same Emacs 'window' counts as minimizing it, and after that it disappears from the EXWM buffer list. The solution I used is to disable the tray icon plugin.

Playing media

Liferea has a Media Player plugin to play media attachments/enclosures (such as in podcast feeds). To use it on Debian, you must have the gir1.2-gstreamer-1.0 package installed (it is a 'Recommends' dependency, not a mandatory one).

Alternatively, you can set Liferea to run an arbitrary command to open a media enclosure; the command will receive the enclosure URL as an argument. You can use VLC for that. The good thing about it is that VLC will start playing the stream immediately; you don't have to wait for it to download completely before playing it. The bad thing is that once it finishes playing the stream, the stream is gone; if you play it again, it will start downloading again. Maybe there is a way to configure this in VLC, but the solution I ended up using was to write a small script to start the download, wait a bit, and start VLC on the partially downloaded file. This way, the file will be fully downloaded and can be replayed (and moved elsewhere if you want to preserve it), but you don't have to wait for the download to finish.

#!/bin/bash
# download-and-play-media.sh

# Save file in a temporary place.
file="/tmp/$(date "+%Y%m%d-%H%M%S").media"
# Start download in a terminal so we can see the progress.
x-terminal-emulator -e wget "$1" -O "$file" &
# Wait for the file to be non-empty (i.e., for the download to start).
until [[ -s "$file" ]]; do
    sleep 1
done
# Wait a bit for the file to fill.
sleep 2
# Play it.
vlc "$file"

Miscellaneous tips

Caveats

So far I had two UI-related problems with Liferea:

Conclusion

Overall, I'm pretty satisfied with Liferea. There are a few problems, but so far I like it better than Thunderbird for feed reading.

Update (2020-03-23): After a few months using Liferea, I have to say that Thunderbird is better to use from the keyboard. Liferea is way too sensitive to which invisible thing has focus at a given moment. Were it not for Thunderbird not handling hundreds of feeds well, I think I would switch back.

Update (2020-07-10): I ended up switching to Elfeed.

_____

* I suspect the problem was that Thunderbird was trying to DNS-resolve the domains for a huge number (perhaps all) of feeds at the same time, and some of the requests were being dropped by the network. I did not do a very deep investigation, though.


Emacs performance, profiling, and garbage collection

2019-09-13 00:13 -0300. Tags: comp, emacs, in-english

This week I finally got around to upgrading my system and my Emacs packages, including EXWM. Everything went fine, except for one problem: every time I loaded an XKB keymap, EXWM would hang for 10–20 seconds, with CPU usage going up. I opened an issue on the EXWM repository, but I decided to investigate a bit more.

After learning the basic commands for profiling Emacs Lisp code, I started the profiler (M-x profiler-start), loaded a new keymap, and generated a report (M-x profiler-report). It turned out that 73% of the CPU time during the hangup was spent on garbage collection. I tried the profiler again, now starting it in cpu+mem mode rather than the standard cpu mode. From the memory report, I learned that Emacs/EXWM was allocating around 500MB of memory during the keyboard loading (!), apparently handling X MapNotify events.

I did not go far enough to discover why so much memory was being allocated. What I did discover though is that Emacs has a couple of variables that control the behavior of the garbage collector.

gc-cons-threshold determines how many bytes can be allocated without triggering a garbage collection. The default value is 800000 (i.e., ~800kB). For testing, I set it to 100000000 (i.e., ~100MB). After doing that, the keyboard loading freeze fell from 10–20s to about 2–3s. Not only that, but after setting it near the top of my init.el, Emacs startup time fell by about half.

Now, I've seen people warn that if you set gc-cons-threshold too high, Emacs will garbage collect less often, but each garbage collection will take longer, so it may cause some lag during usage, whereas the default setting will cause more frequent, but less noticeable garbage collections (unless you run code causing an unusually large number of allocations, as in this case with EXWM). However, I have been using it set to 100MB for a couple of days now, and I haven't noticed any lag; I just got a faster startup and less EXWM hangup. It may well depend on your Emacs usage patterns; you may try different values for this setting and see how it works for you.

Another recommendation I have seen elsewhere is to set gc-cons-threshold high and then set an idle timer to run garbage-collect, so Emacs would run it when idle rather than when you're using it, or setting a hook so it would run when unfocused. I did not try that, and I suspect it wouldn't work for my use case: since Emacs runs my window manager, I'm pretty much always using it, and it's never unfocused anyway. Yet another recommendation is to bind gc-cons-threshold temporarily around the allocation-intensive code (that comes from the variable's own documentation), or to set it high on startup and back to the original value after startup is finished. Those don't work easily for the XKB situation, since Emacs does not know when an XKB keymap change will happen (unless I wrote some Elisp to raise gc-cons-threshold, call XKB, and set it back after a while, which is more complicated than necessary).


Copyright © 2010-2023 Vítor De Araújo
The content of this blog, unless otherwise specified, may be used under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.

Powered by Blognir.