Elmord's Magic Valley

Posts tagged: `comp`

Impressions on R7RS-small: libraries, records, exceptions

2017-10-03 15:10 -0300. Tags: comp, prog, lisp, scheme, in-english

In the last post, I wrote a little bit about the historical context in which R7RS came about. In this post, I will comment on my impressions about specific features of the R7RS-small language.

First of all, I'd like to note that if you are going to read the R7RS-small report, you should also read the unofficial errata. As I read the document I spotted a few other errors not mentioned in the errata, but unfortunately I did not keep notes as I was reading. I'm not sure why a, um, revised version of the report is not published with the known errors corrected (a Revised^7.01 Report?), but alas, it isn't, so that's something to keep in mind.

In this post, I will talk mainly about the differences between R5RS and R7RS-small, since R7RS-small is more of an incremental extension of R5RS than of R6RS. This is not intended as a complete or exhaustive description of each feature; for that, consult the report.

Libraries/modules

R7RS introduced the concept of libraries (what some other systems call modules; I suppose R6RS and R7RS chose the name "library" to avoid conflict with the concept of modules in existing implementations). Library names are lists of symbols and non-negative integers, such as (scheme base), or (srfi 69). A library has a set of imports, a set of exported symbols, and code that constitutes the library. A library definition looks like this:

(define-library (foo bar)
  (import (scheme base)
          (scheme write))
  (export hello)
  (begin
    (define (hello)
      (display "Hello, world!\n"))))

Instead of putting the library code directly in the define-library form (inside a begin clause), it is also possible to include code from a different file with the (include "filename") directive (or include-ci for parsing the file case-insensitively; standard Scheme was case-insensitive until R5RS (inclusive)). This makes it easier to package R5RS code as an R7RS library, by including the older code within a library declaration. It's also a way to avoid writing all library code with two levels of indentation.

Imports can be specified or modified in a number of ways:

One can import a whole library: (import (scheme base))
Or just some identifiers from it: (import (only (some lib) someid1 someid2))
Or all but some identifiers: (import (except (some lib) unwanted-id1 unwanted-id2))
Or rename some identifiers: (import (rename (some lib) (original1 new1) (original2 new2)))
Or add a prefix to all identifiers: (import (prefix (some lib) somelib/))

These forms can be combined (e.g., you can import only some identifiers and add a prefix to them).

Exports just list all identifiers to be exported, but you can also write (rename internal-name exported-name) to export identifiers with a different name than they have within the library body.
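
As a minimal sketch of how these pieces combine: the library name (example greeting) and the identifiers make-greeting, greeting and g/ below are made up for illustration, and whether a define-library form may share a file with the program that imports it depends on the implementation.

```scheme
;; Hypothetical library: the internal procedure make-greeting is
;; exported under the external name greeting.
(define-library (example greeting)
  (import (scheme base))
  (export (rename make-greeting greeting))
  (begin
    (define (make-greeting name)
      (string-append "Hello, " name "!"))))

;; Program: import only `greeting` from the library and add the
;; prefix g/ to it, combining the `only` and `prefix` import forms.
(import (scheme base)
        (scheme write)
        (prefix (only (example greeting) greeting) g/))

(display (g/greeting "world"))   ; prints Hello, world!
(newline)
```

Inside the library the procedure is known only as make-greeting; the importing program sees it only as g/greeting.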

Unlike R6RS, all library code directly embedded in the define-library form must be written within a begin clause. At first I found this kinda weird, but it has an interesting consequence: the library definition sublanguage does not have to know anything about the rest of the programming language. There is only a limited number of kinds of subforms that can appear within define-library, and parsing it does not require knowing about the values of any identifiers. This means that define-library can be more easily processed as data. One can imagine useful tools which read library definitions from files and, say, compute the dependencies of a program, among other possibilities.
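
To illustrate the "library definitions as data" point, here is a rough sketch of such a tool. The procedure names import-set-library and library-dependencies are my own, and the sketch only looks at top-level import declarations (it does not descend into cond-expand or include-library-declarations).

```scheme
(import (scheme base) (scheme write))

;; Strip import modifiers (only, except, prefix, rename) to find the
;; underlying library name. Library name components are identifiers or
;; integers, so a modifier is recognizable by its second element being
;; a list.
(define (import-set-library set)
  (if (and (memq (car set) '(only except prefix rename))
           (pair? (cdr set))
           (pair? (cadr set)))
      (import-set-library (cadr set))
      set))

;; Collect the library names imported by a define-library datum.
(define (library-dependencies form)
  (let loop ((decls (cddr form)) (deps '()))
    (cond ((null? decls) (reverse deps))
          ((and (pair? (car decls)) (eq? (caar decls) 'import))
           (loop (cdr decls)
                 (append (reverse (map import-set-library (cdar decls)))
                         deps)))
          (else (loop (cdr decls) deps)))))

;; A library definition held as plain quoted data; nothing in it is
;; evaluated.
(define example
  '(define-library (foo bar)
     (import (scheme base)
             (prefix (only (scheme write) display) w/))
     (export hello)
     (begin
       (define (hello)
         (w/display "Hello, world!\n")))))

(write (library-dependencies example))   ; ((scheme base) (scheme write))
(newline)
```

Note that the tool never needs to know what display or define mean; it only walks the fixed set of define-library subforms.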

In fact, R7RS does not classify define-library or its subforms as syntax forms, i.e., they are something apart from Scheme expressions. This also resolves a problem that would occur if define-library were an expression. The report specifies that the initial environment of a program is empty. So, how would I use import declarations before importing the library where import declaration syntax is defined? Of course one way around this would be to make (scheme base) available by default rather than start with the empty environment. But the solution adopted by R7RS means we don't have to import (scheme base) if we don't want to (for example, if we want to import (scheme r5rs) instead to package R5RS code as an R7RS library). (The report does define for convenience some procedures and syntax forms with the same name as corresponding library subforms, e.g., include.)

R7RS also standardized cond-expand (extended from SRFI 0). cond-expand is a mechanism somewhat like ifdefs in C for conditionally including code depending on whether the implementation defines specific feature symbols, or whether some library is available. This makes it possible to provide different implementations of a procedure (or a whole library) depending on the current implementation. One way we could use it is to write shims, or compatibility layer libraries to provide a uniform interface for features that are implemented differently by various implementations. For example, in Common Lisp, many implementations support threads, but they provide different interfaces. Bordeaux Threads is a library which provides a uniform API and maps it to the corresponding functions in each implementation it supports. Now we can do similar things in R7RS for those features that are supported everywhere but in incompatible ways (e.g., for networking).
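
A small sketch of what this looks like inside a library definition. The library name (example platform) and the exported names are hypothetical; full-unicode is one of the standard feature identifiers listed in the report, and (library ...) tests whether a library is available. What gets printed depends on the implementation.

```scheme
;; Hypothetical library whose contents depend on the implementation.
(define-library (example platform)
  (import (scheme base))
  (export unicode-support have-srfi-69)
  ;; Choose a definition based on a feature identifier.
  (cond-expand
    (full-unicode
     (begin (define unicode-support 'full)))
    (else
     (begin (define unicode-support 'partial))))
  ;; Choose a definition based on whether a library is available.
  (cond-expand
    ((library (srfi 69))
     (begin (define have-srfi-69 #t)))
    (else
     (begin (define have-srfi-69 #f)))))

(import (scheme base) (scheme write) (example platform))

(write (list unicode-support have-srfi-69))
(newline)
```

A real shim would put implementation-specific imports and definitions in each branch instead of these placeholder constants.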

Libraries and cond-expand are by far the most important addition in R7RS relative to R5RS. Even if we did not have any of the other features, we could package them as libraries and provide implementation-specific code for them via cond-expand.

Missing things

The report does not specify a mapping between library names and file names. I realize that it would be kinda hard to make everyone agree on this, but it is somewhat of a hurdle in distributing programs organized into libraries. Some implementations, such as Chibi, will look up a library named (foo bar) in a file named foo/bar.sld (where .sld stands for Scheme library definition), whereas CHICKEN will look it up at foo.bar.*. There is a project of a portable package manager for R7RS called Snow, which I think takes care of mapping packaged library files to implementation-specific names, but I haven't taken the time to check it out yet.

R7RS takes the excellent step of specifying that library names whose first component is the symbol srfi are reserved for implementing SRFIs, but then fails to specify how to name a specific SRFI. In practice, the few implementations I checked all agree on using (srfi n) as the name of the library implementing SRFI number n (i.e., I can write (import (srfi 69)) and remain reasonably portable), so this may turn out not to be a problem in practice.

Records

R7RS incorporates the define-record-type form from SRFI 9, for defining new record/struct types. It is a somewhat verbose form, which requires specifying the record constructor arguments and the names for each field accessor/getter and (optional) mutator/setter, but it's basically the least common denominator that any implementation which has some form of records can easily support. It looks like this:

(define-record-type <person> (make-person name age) person?
  (name person-name person-name-set!)
  (age person-age person-age-set!))

Here:

<person> is the record type name. R7RS-small does not specify any particular use for this name, but R7RS-large or some other SRFI may specify uses for it (e.g., to allow querying information about the record type).
(make-person name age) defines the constructor procedure and order of arguments for this record.
person? defines the name of a predicate which will answer true when given records created with make-person, and false for other Scheme values.
The clause (name person-name person-name-set!) declares a field called name, associated with a getter procedure person-name to extract the name field from a given person record, and a setter procedure person-name-set! for mutating the name field in a person record. The next clause does the same for the field age.
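
For illustration, a short usage sketch of the record type defined above (the sample data is made up):

```scheme
(import (scheme base) (scheme write))

(define-record-type <person> (make-person name age) person?
  (name person-name person-name-set!)
  (age person-age person-age-set!))

(define p (make-person "Alice" 30))   ; construct a record
(person-age-set! p 31)                ; mutate the age field

;; The predicate holds only for records made by make-person.
(write (list (person? p) (person? "Alice") (person-name p) (person-age p)))
(newline)
;; prints (#t #f "Alice" 31)
```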

R5RS did not have any way to define new types disjoint from existing types. R6RS provides a more complex records facility, including both a syntactic and a procedural layer allowing reflection, but I don't know it well enough to comment. (I have read some comments on problems in the interaction between syntactically- and procedurally-defined records, but I don't know the nature of the problems or how serious they are.)

Missing things

Reflection would be nice, or at least a way to convert a record into a vector or something like this (though I realize this might displease some people), but we could make libraries for that. Another thing that would be nice is for records to have a standard printed representation which could be printed out and read back again, but I realize there is a slightly complicated interaction here with the module system (the printed representation should be tagged with the record type in a way that will work regardless of which module it is read back in), and this might not even be desirable for implementation-internal types which happen to be defined in terms of define-record-type.

Exceptions

R7RS incorporates the exception handling mechanisms from R6RS, but not the condition types. Any value can be raised in an exception. The raise procedure raises a value as an exception object, or condition, to be caught by an exception handler. The guard form can be used to install an exception handler to be active during the evaluation of its body. The guard form names a variable to hold the captured condition, a sequence of cond-like clauses to determine what action to take given the condition, and a body to be executed. It looks like this:

(guard (err
        ((file-error? err) (display "Error opening file!\n"))
        ((read-error? err) (display "Error reading file!\n"))
        (else (display "Some other error\n")))
  (call-with-input-file "/etc/passwd"
    (lambda (file)
      (read-line file))))

If an else clause is not provided and no other clause of the guard form matches, the exception propagates up the stack until some handler catches it. If an exception is raised and caught by a guard clause, the value returned by the guard form is whatever is returned by the body of that clause.

Besides raise, R7RS also defines a procedure (error message irritants...), which raises an error object (satisfying the error-object? predicate) encapsulating an error message and a sequence of objects somehow related to the error (called "irritants"). It also defines the procedures error-object-message and error-object-irritants to extract the components of the error object.
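
A minimal sketch of error together with guard; safe-div is a hypothetical helper written just for this example:

```scheme
(import (scheme base) (scheme write))

;; Divide a by b, turning an error object into a list describing it.
(define (safe-div a b)
  (guard (e ((error-object? e)
             (list 'failed
                   (error-object-message e)
                   (error-object-irritants e))))
    (if (= b 0)
        (error "division by zero" a b)   ; a and b are the irritants
        (/ a b))))

(write (safe-div 6 3))   ; 2
(newline)
(write (safe-div 6 0))   ; (failed "division by zero" (6 0))
(newline)
```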

R7RS does not define specific object types to represent errors; it only says that objects satisfying a given predicate must be raised in some circumstances. An implementation might define a record type for that, or just use lists where the first element represents the error type, or whatever is appropriate for that implementation.

At first I did not think exceptions were that important in the grand scheme of things (heh), since you can implement them on top of continuations. (And indeed, exceptions in R6RS are in a separate library rather than the base language, although this does not mean much in R6RS because, if I understand correctly, all libraries are mandatory for R6RS implementations.) However, I then realized that until R5RS (inclusive), there was no standard way to signal an error in Scheme code, and perhaps more importantly, no standard way of catching errors. If portable libraries are to become more prominent, we will need a standard way of signalling and catching errors across code from different projects, so exceptions are a good add-on.

Besides raise, R7RS also defines raise-continuable, which raises an exception but, if the exception handler returns, returns to the point where the exception was raised rather than exiting from the handler form. [Correction: this is how raise-continuable interacts with with-exception-handler, not guard. I'm still figuring out how guard acts with respect to continuable exceptions.] On top of this, something like Common Lisp's restarts can be implemented.
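
A small sketch of raise-continuable with with-exception-handler (the condition object need-a-number and the substitute value 10 are arbitrary choices for the example):

```scheme
(import (scheme base) (scheme write))

(define result
  (with-exception-handler
    ;; The handler returns a substitute value...
    (lambda (condition) 10)
    ;; ...which becomes the value of the raise-continuable call,
    ;; so the addition carries on as if nothing had happened.
    (lambda ()
      (+ 1 (raise-continuable 'need-a-number)))))

(write result)   ; 11
(newline)
```

With plain raise instead of raise-continuable, it would be an error for the handler to return.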

One side effect of having guard in the language is that now you can do control flow escapes without using call-with-current-continuation (call/cc for short). In theory this could be more efficient than capturing the fully general continuation just to escape from it once; in practice, some implementations may rely on call/cc to implement guard (the example implementation provided in the report does), so this performance advantage may not materialize. But just having a construct to express a one-shot escape is already great, because it allows expressing this intent in the program, and potentially allows implementations to emit more efficient code when a full continuation is not required.
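
As a sketch of such a one-shot escape, here is an early exit from for-each written with guard and raise instead of call/cc. The procedure find-first and the found tag are made up for this example; the tag is there so that unrelated exceptions are not swallowed.

```scheme
(import (scheme base) (scheme write))

;; Return the first element of lst satisfying pred, or #f if none does.
(define (find-first pred lst)
  (guard (hit ((and (pair? hit) (eq? (car hit) 'found))
               (cdr hit)))
    (for-each (lambda (x)
                (when (pred x)
                  (raise (cons 'found x))))   ; escape from the loop
              lst)
    #f))

(write (find-first even? '(1 3 4 5 6)))   ; 4
(newline)
(write (find-first even? '(1 3 5)))       ; #f
(newline)
```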

I was wondering if one could implement unwind-protect (a.k.a. try/finally) in terms of guard, and so avoid dynamic-wind for error-handling situations. Alas, I don't think this is possible in general, because the presence of raise-continuable means an error handler may execute even though control may still return to the guard body. I wish to write more about continuations in a future post.

Conclusion

Libraries (plus cond-expand), records and exceptions are the most important additions in R7RS-small relative to R5RS, and they are all a great step towards enabling code reuse and portability across implementations, while not constraining Scheme implementors unnecessarily. I am particularly happy about libraries and cond-expand, because this means we can start writing compatibility libraries to bridge differences between implementations without having to rely on a standardization process.

I have some other comments to make on I/O, bytevectors, and other parts of the standard library, but they can wait for a future post.


R5RS, R6RS, R7RS

2017-10-01 22:11 -0300. Tags: comp, prog, lisp, scheme, in-english

Over the last few days I have skimmed over R7RS, the Revised⁷ Report on [the Algorithmic Language] Scheme. I thought I'd write up some of my impressions about it, but I decided first to write a bit about the history and the context in which R7RS came about and the differing opinions in the Scheme community about R6RS and R7RS. In a future post, I intend to write up about my impressions of specific features of the standard itself.

The Scheme language was first described in a document named the "Report on the Algorithmic Language Scheme". Afterwards, a second version, called the "Revised Report on the Algorithmic Language Scheme", came out. The following version of the standard was called the "Revised Revised Report …", or "Revised² Report …" for short. Subsequent versions have kept this naming tradition, and the abbreviation RnRS (for some n) is used to refer to each version of the standard.

Up to (and including) R5RS, all versions of the standard were ratified only by unanimous approval of the Scheme Steering Committee. As a result, each iteration of the standard was a conservative extension of the previous version. R5RS defines a very small language: the whole document is just 50 pages. The defined language is powerful and elegant, but it lacks many functions that are typically expected from the standard library of a modern language and necessary for many practical applications. As a result, each Scheme implementation extended the standard in various ways to provide those features, but they did so in incompatible ways with each other, which made it difficult to write programs portable across implementations.

To amend this situation a bit, the Scheme community came up with the Scheme Requests for Implementation (SRFI) process. SRFIs are somewhat like RFCs (or vaguely like Python's PEPs): they are a way to propose new individual features that can be adopted by the various implementations, in a way orthogonal to the RnRS standardization process. A large number of SRFIs have been proposed and approved, and some are more or less widely supported by the various implementations.

R6RS attempted to address the portability problem by defining a larger language than the previous reports. As part of this effort, the Steering Committee broke with the tradition of requiring unanimous approval for ratification, instead requiring a 60% majority of approval votes. R6RS defines a much larger language than R5RS. The report was split into a 90-page report on the language plus a 71-page report on standard libraries (plus non-normative appendices and a rationale document). The report was ratified with 67 yes votes (65.7%) and 35 no votes (34.3%).

The new report caused mixed feelings in the community. Some people welcomed the new standard, which defined a larger and more useful language than the minimalistic R5RS. Others felt that the report specified too much, reinvented features in ways incompatible with existing SRFIs, and set some things in stone prematurely, among other issues.

In response to this divide, the Scheme Steering Committee decided to split the standard into a small language, more in line with the minimalistic R5RS tradition, and a large language, intended to provide, well, a larger language standardizing a larger number of useful features. The R7RS-small report was completed in 2013. The R7RS-large process is still ongoing, being developed in a more incremental way rather than as one big thing to be designed at once.

I think that the R6RS/R7RS divide in part reflects not only differing views on the nature of the Scheme language, but also differing views on the role of the RnRS standards, the Steering Committee, and the SRFI process. In a discussion I read recently, a person was arguing that R6RS was a more useful standard to them, because for most practical applications they needed hashtables, which R6RS standardized but R7RS did not. My first thought was "why should hashtables be included in the standard, if they are already provided by SRFI 69?". This person probably does not consider SRFIs to be enough to standardize a feature; if something is to be portable across implementations, it should go in the RnRS standard. In my (current) view, the RnRS standard should be kept small, and SRFIs are the place to propose non-essential extensions to the language. My view may be colored by the fact that I started using Scheme "for real" with CHICKEN, an implementation which not only supports a large number of SRFIs, but embraces SRFIs as the way various features are provided. For example, whereas many implementations provide SRFI 69 alongside their own hashtable functions, CHICKEN provides SRFI 69 as the one way of using hashtables. So, CHICKEN users may be more used to regarding SRFIs as a natural place to get language extensions from, whereas users of some other implementations may see SRFIs as something more abstract and less directly useful.

I have already expressed my view on Scheme's minimalism here, so it's probably no surprise that I like R7RS better than R6RS. I don't necessarily think R6RS is a bad language per se (and I still have to stop and read the whole R6RS report some day), I just have a preference for the standardized RnRS language to be kept small. (I'm okay with a larger standard a la R7RS-large, as long as it remains separate from the small language standard, or at least that the components of the large language remain separate and optional.) I also don't like every feature of R7RS-small, but overall I'm pretty satisfied with it.


On Scheme's minimalism

2017-09-14 19:34 -0300. Tags: comp, prog, lisp, scheme, pldesign, ramble, in-english

[This post started as a toot, but grew slightly larger than 500 characters.]

I just realized something about Scheme.

There are dozens, maybe hundreds, of Scheme implementations out there. It's probably one of the languages with the largest number of implementations. People write Schemes for fun, and/or to learn more about language implementations, or whatever. The thing is, if Scheme did not exist, those people would probably still be writing small Lisps, they would just not be Scheme. The fact that Scheme is so minimal means that the jump from implementing an ad-hoc small Lisp to implementing Scheme is not that much (continuations notwithstanding). So even though Scheme is so minimal that almost everything beyond the basics is different in each implementation, if there were not Scheme, those Lisps would probably still exist and not have even that core in common. From this perspective, Scheme's minimalism is its strength, and possibly one of the reasons it's still relevant today and not some forgotten Lisp dialect from the 1970s. It's also maybe one of the reasons R6RS, which departed from the minimalist philosophy, was so contentious.

Plus, that core is pretty powerful and well-designed. It spares each Lisp implementor from part of the work of designing a new language, by providing a solid basis (lexical scoping, proper closures, hygienic macros, etc.) from which to grow a Lisp. I'm not one hundred percent sold on the idea of first class continuations and multiple values as part of this core*, and I'm not arguing that every new Lisp created should be based on Scheme, but even if you are going to depart from that core, the core itself is a good starting point to depart from.

[* Though much of the async/coroutine stuff that is appearing in modern languages can be implemented on top of continuations, so maybe their placement in that core is warranted.]


Some thoughts on Twitter and Mastodon

2017-06-25 01:32 -0300. Tags: comp, web, privacy, freedom, life, mind, in-english

Since the last post I've been using Mastodon as my primary microblogging platform for posting, but I was still regularly reading and retweeting stuff on Twitter. A while ago Twitter started reordering tweets in my timeline despite my having disabled that option, just as I said could eventually happen (except much earlier than I expected). The option is still there and is still disabled, it's just being ignored.

Twitter brought me much rejoicing during the years I used it. I follow a lot of cool people there and I've had lots of nice interactions there. I found myself asking if I should accept some abuse from Twitter to keep interacting with those people, and I've been shocked at myself for even asking myself that. I've been using Twitter less and less as of late. (I'd like to be able to say I did it out of principles, but to be completely truthful I find the non-chronological timeline utterly annoying, and that has had as much to do with my leaving as principles.)

Although I switched to Mastodon as my Twitter replacement, Mastodon is not really "another Twitter". Having 500 rather than 140 characters to write initially felt like learning to talk again. Early on when I started using Mastodon, I was going to reply to a person's toot (that's what posts are called in Mastodon) with a short, not-really-one-full-sentence line that is the norm in Twitter. I wrote it down and was like "no, this kind of grunting a half-thought is not going to cut it here". It felt like Twitter's 140 character limit not only limited the kinds of things you could say, but also imposed/favored a "140-character mindset" of not finishing lines of thought or thinking with much depth. As I went on using Mastodon, I found myself writing thoughts I wouldn't have even tried to write in Twitter.

I still open up Twitter once in a while. Today I opened the mobile version in my desktop browser and noticed that the mobile version still shows a chronological timeline, still doesn't pollute the timeline with liked-but-not-retweeted tweets, and is much faster and cleaner than the desktop version. (I still have to un-fix the navigation bar via CSS, but I already had to do that in the desktop version anyway.) It's tempting to start using Twitter again through the mobile version, while it doesn't catch up with the new "features". I know I shouldn't, though. Even if the mobile version never caught up with the misfeatures (I suppose it eventually will, probably in short time), Twitter has already shown they're willing to throw stuff down their users' throats in the name of – what? I'm not even sure. Maybe they want to make Twitter more Facebook-like to attract Facebook users, even if that means alienating the people who used Twitter exactly because it was not like Facebook?

The funny thing is Twitter could simply provide some options for users to control their experience ("(don't) show tweets liked by your followers", "(don't) show tweets you liked to your followers", "(don't) reorder tweets" (the last one is already there, it just doesn't work)). This way they could cater to whatever new audience they have in mind and keep the users who liked how Twitter used to work. They just don't care to. I'm not really sure what the motivations and goals behind Twitter's actions are. For a really long time before the last changes it had been showing the "you might like" box (even if you clicked the "show me less like this" option (the only way to dismiss it) every time) and the "you might like to follow" box (even if you dismissed that too, and even though it also showed undismissable follow suggestions on the right pane anyway). I used to open Twitter pretty much every day, so it didn't really make sense as a user retention strategy. Maybe they want to incentivize people to do specific things on Twitter, e.g., throw in more data about themselves? (Yeah, there was the "add your birthday to your profile" periodic thing too.)

Meh.


Why the new like-based Twitter timeline is terrible

2017-05-31 21:56 -0300. Tags: comp, web, privacy, in-english

Recently, tweets which people I follow 'liked' (but not retweeted) started showing up in my Twitter timeline. Twitter had been showing the "you might like" box with such tweets for quite a long time, but they were separate from normal tweets, and you could dismiss the box (it would come back again after a while, though). Now those 'liked' tweets are showing up intermingled with the normal tweets, and there is no option to disable this.

Now, Twitter has been working hard on its timeline algorithms lately, and, at least initially, the liked tweets it added to my timeline were indeed stuff I liked, and they constituted just a small part of total tweets. That's not the case anymore: now liked tweets seem to be about one third of all tweets I see, and a smaller proportion of them interest me. Moreover, I simply don't want to see that many tweets. If I'm seeing all tweets I used to see plus liked tweets, and liked tweets comprise about a third of all tweets I see now, then I'm seeing about 50% more tweets, and I simply don't have the patience for so much tweetering; I already limit the number of people I follow so as to keep my timeline manageable.

But even if Twitter's algorithms were perfect and showed me only things I wanted to see in an ideal quantity, showing liked-but-not-retweeted tweets would already be bad, for a number of reasons:

'Likes' (formerly 'favorites') were a way to give fake Internet points to someone without being forced to share it with your followers. Sometimes I do this just because I don't think the content will be generally of interest to my followers (e.g., linguistics-based humor). But sometimes I may do it for more serious reasons, for instance, because I think the content might be offensive or be taken badly by some of my followers (e.g., religion-based humor). And sometimes I want to signal an approval to someone's opinion which might generate controversy or discussion if shared, but I don't really want to have a discussion about it with anyone at the time (e.g., some political views).
Likes have always been public in the sense that you could open up someone's profile and see a feed of everything they liked, but you had to actively stalk someone to see that. Now people who follow you don't have to do anything in particular to see the things you liked, so you have to work on the assumption that anything you like might show up in someone else's feed, which is annoying.
You don't see every tweet your followees liked; you only see some of them, and the selection is done by Twitter's algorithms, which of course are closed. At best (if the algorithms try to maximize showing things you like), this has the potential of creating a bubble effect of feeding you more and more with the things you already like and agree with, and filtering out differing views and opinions. At worst, the algorithms might try to maximize something else (e.g., things advertisers would like you to see rather than things you would like to see).
The bubbling effect is currently probably not as bad in Twitter as it is in Facebook because you still see all tweets your followees (re)tweet; you just see more tweets on the top of those. Note that Twitter already has an option of using a more Facebook-like timeline which reorders the tweets according to what its algorithms think you might be interested in more. It is currently opt-out, but it might stop being so in the future, the same way you used to be able to dismiss the "you might like" tweets but can't anymore.

As Twitter keeps trying brave new ways of monetizing its users, it's probably going to become more problematic from a privacy perspective. Meanwhile, we now have a quite viable decentralized, free, and usable social network (and I'm already there). The sad thing is that most of the people I follow will probably not migrate from Twitter, but as Twitter keeps getting worse in matters of privacy, transparency and usability, I'm becoming more inclined to leave it, as I have done before.

Update (2017-06-13): Today my Twitter timeline showed up out of order, at least temporarily, even though the "Show me the best Tweets first" option still appears disabled. That one came quick.


elmord.org

2017-02-10 23:52 -0200. Tags: about, comp, web, em-portugues

So, folks: this blog is in the process of moving to elmord.org/blog. I'm still playing with the server configuration, but it is already up, and I decided to announce it now because, with people using it, there will be someone to tell me if something is wrong with the new server.

(Incidentally, I have also physically moved in the last few weeks, but that's a subject for another post.)

But why?

I decided to make this change for a number of reasons.

At the end of last year, my home Internet connection had become absurdly unstable (for reasons of "the landlord messed with the wiring and left it exposed to rain and sun for almost a year and I didn't know"). I used to keep some services running on my home machine, and it was impossible to use it remotely with the connection dropping every 10 minutes and the IP changing. A coworker recommended renting a VPS from Host1Plus, and so I did (for the modest sum of 8 reais a month). Now that I'm already paying for the VPS anyway, there's no reason not to host the blog there too.
The INF server is still running PHP 5.2 on Debian 5, which has not received security updates since 2012 (yes, that post is from 2015, and no, nothing has changed). As I mentioned in that post, having to program in PHP 5.2 is quite annoying, and by hosting the blog on my own server I can install a more recent PHP.
More generally, programming in PHP is quite annoying, and with my own server I don't have to program in PHP. Expect a rewrite of the blog in Scheme, Real Soon Now™.
Still on the subject of INF, the INF network has a tendency to go down whenever it rains hard, and it always rains hard in RS. Not that Host1Plus is a marvel of stability, but anyway.
Still on the subject of INF, INF disabled HTTPS around 2009~2010 for some reason, and it has stayed that way. On the new site I have HTTPS, powered by Let's Encrypt and dehydrated.
It's easier to tell people that my blog is at elmord.org than at inf.ufrgs.br/~vbuaraujo. I think.
For great justice.

The five-year plan

I don't know yet whether this migration will be permanent (it will depend on the stability and performance of the new server), so I intend to do it in two steps:

New server as a replica of the current one. Initially, both elmord.org/blog and inf.ufrgs.br/~vbuaraujo/blog will serve the same content. The pages on elmord.org will point to inf.ufrgs.br as the canonical link, i.e., search engines and the like will be instructed to keep using the inf.ufrgs.br version as the "official" version of the pages. Comments from both versions will be synchronized periodically (which can be done with rsync because the comments are text files).

Redirection to the new server. If in a few months I am sufficiently satisfied with how the new server works, it becomes the official one, the pages stop including the canonical link header pointing to the old blog, and the old blog starts redirecting to the corresponding pages on the new one. If I am not satisfied with the new server, I take it down, inf.ufrgs.br keeps working as always, and we pretend nothing happened.

EOF

That's all for now. If you find problems with the new site, please report them.

6 comentários / comments

SSH tricks

2017-01-24 19:40 -0200. Tags: comp, unix, network, em-portugues

Well, lately I have learned a few new things about SSH. In this post I report some of them.

Port forwarding

SSH has two options, -L and -R, which allow forwarding connections from a local port to a remote host and vice versa.

Local port to a remote host

Imagine you are on your local machine, called midgard, and there is a remote machine, called asgard, which is accessible via SSH. You want to access a service on port 8000 of asgard from midgard, but you want to tunnel the access through SSH (either because you want the access to be encrypted, or because port 8000 is simply not accessible remotely). You can use the command:

midgard$ ssh -L 9000:127.0.0.1:8000 fulano@asgard

The result is that TCP connections made to your local port 9000 will be tunneled through the connection with fulano@asgard to address 127.0.0.1, port 8000 on the other end. For example, if asgard has a web server listening on port 8000, you can now open a browser on midgard, point it to http://localhost:9000, and the connection will land on port 8000 of asgard, all tunneled through an SSH connection.

Note that 127.0.0.1 is the destination address from the point of view of the server. You could use another address to access other machines on the server's network. For example, if the machine vanaheim is accessible from asgard, you could run:

midgard$ ssh -L 9000:vanaheim:8000 fulano@asgard

and now all accesses to TCP port 9000 of your local machine will land on port 8000 of vanaheim, tunneled through the SSH connection with asgard.

Optionally, you can specify a "bind address" before the local port, to specify that only port 9000 of a specific network interface should listen for connections. For example, you can use:

midgard$ ssh -L localhost:9000:vanaheim:8000 fulano@asgard

to say that the port should only accept connections from the machine itself. (By default, which interfaces are used is decided by the GatewayPorts option of the SSH client, which defaults to listening only on the local interface anyway.) Alternatively, you can pass an empty bind address (i.e., :9000:vanaheim:8000, with nothing before the first :) to listen on all interfaces. This way, other machines on your local network that access port 9000 of midgard will also have their access tunneled, through asgard, to port 8000 of vanaheim. (* also works instead of the empty string, but then you have to escape the * so the shell does not try to expand it.)

Remote port to the local machine

The reverse is also possible: instructing the remote SSH server to redirect one of its ports to a machine and port accessible from your local machine. For this, the -R option is used. For example:

midgard$ ssh -R 8000:localhost:22 fulano@asgard

This causes port 8000 on asgard to be tunneled to port 22 of the local machine. Now, if someone on asgard accesses port 8000 (for example, with ssh -p 8000 beltrano@localhost), the connection will land on your local port 22 (and the person will have access to your local SSH server). You can use this if you are behind a firewall or NAT and the remote machine is reachable from the Internet but your local machine is not, and you want to give the remote machine access to some service on your local machine. (We have covered this here before, but I mention it again for completeness.)

SOCKS proxy

SSH can work as a SOCKS proxy. For this, the -D option ("dynamic forwarding") is used:

midgard$ ssh -D localhost:8000 fulano@asgard

This makes SSH listen as a SOCKS server on port 8000 of the local machine. Connections received on that port will be tunneled to asgard, which will act as a proxy. You can then point the SOCKS proxy setting of your browser or other application to localhost, port 8000.

Other useful options

-C enables compression of the connection. It is useful mainly over slow connections (on a local network, compression does not pay off much).

By default, if you use one of the port forwarding commands above, SSH sets up the forwarding and opens an ordinary shell session. If you only want the forwarding, you can use the options -N (do not execute a remote command) and -f (go to the background (fork) after asking for the password). The options can be combined into a single argument (e.g., -CNf).
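To make the option syntax concrete, here is a small Python sketch that only assembles an ssh command line from the pieces discussed above (it does not run anything). The helper names forward_spec and ssh_forward_command are made up for this illustration; the only things taken from ssh itself are the -L, -R, -D, -C, -N and -f options and the [bind_address:]port:host:hostport format.

```python
def forward_spec(port, host, hostport, bind_address=None):
    """Build the argument of -L or -R: [bind_address:]port:host:hostport."""
    spec = f"{port}:{host}:{hostport}"
    if bind_address is not None:
        # An empty bind address (":9000:...") means "listen on all interfaces".
        spec = f"{bind_address}:{spec}"
    return spec


def ssh_forward_command(destination, local=(), remote=(), dynamic=None,
                        compress=False, background=False):
    """Return the ssh argv for the requested forwardings (nothing is executed)."""
    argv = ["ssh"]
    if compress:
        argv.append("-C")
    if background:
        # -N: no remote command; -f: go to the background after authentication.
        argv += ["-N", "-f"]
    for spec in local:
        argv += ["-L", spec]
    for spec in remote:
        argv += ["-R", spec]
    if dynamic is not None:
        argv += ["-D", dynamic]
    argv.append(destination)
    return argv


cmd = ssh_forward_command(
    "fulano@asgard",
    local=[forward_spec(9000, "vanaheim", 8000, bind_address="localhost")],
    compress=True,
    background=True,
)
print(" ".join(cmd))
```

The resulting list can be handed to something like subprocess.run if you want a script to set up the tunnel for you.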

Escapes and special commands

In an SSH session, the sequence ENTER ~ is recognized as an escape prefix for accessing a series of special commands. If you type ENTER ~ ?, you will see a list of all available commands:

Supported escape sequences:
 ~.   - terminate connection (and any multiplexed sessions)
 ~B   - send a BREAK to the remote system
 ~C   - open a command line
 ~R   - request rekey
 ~V/v - decrease/increase verbosity (LogLevel)
 ~^Z  - suspend ssh
 ~#   - list forwarded connections
 ~&   - background ssh (when waiting for connections to terminate)
 ~?   - this message
 ~~   - send the escape character by typing it twice
(Note that escapes are only recognized immediately after newline.)

The ENTER ~ C command opens a prompt where you can set up and cancel port forwardings, with a syntax analogous to that of the options seen earlier:

ENTER ~ C
ssh> ?
Commands:
      -L[bind_address:]port:host:hostport    Request local forward
      -R[bind_address:]port:host:hostport    Request remote forward
      -D[bind_address:]port                  Request dynamic forward
      -KL[bind_address:]port                 Cancel local forward
      -KR[bind_address:]port                 Cancel remote forward
      -KD[bind_address:]port                 Cancel dynamic forward

Be amazed.

Remarks

The use of the forwarding options can be controlled/disabled in the server configuration. See the sshd_config(5) man page for more information.

2 comentários / comments

Guile: first impressions

2017-01-02 22:54 -0200. Tags: comp, prog, scheme, lisp, em-portugues

Until now, the only Scheme implementations I had had more extensive contact with were Chicken and, to a lesser degree, Racket. Last week I started taking a look at the manual of Guile, the GNU Project's Scheme. So far, Guile has seemed like a pretty nice Scheme. In this post, I record some of my initial impressions of Guile and things I have found interesting so far, with the caveat that I have not yet used Guile for anything in practice beyond testing half a dozen things in the REPL and writing the odd script of half a dozen lines.

Bytecode

Unlike Chicken, Guile does not generate native executables; instead, it compiles to its own bytecode. In fact, Guile's VM supports not only Scheme, but also has preliminary support for Emacs Lisp and ECMAScript (!), though I still don't know how these things integrate. In terms of performance, Guile seems to be neither here nor there, and I imagine it is comparable to other interpreted languages, such as Python. I tried doing some crude benchmarks, but the results were inconclusive and will require a deeper analysis, which I will not be doing any time soon.

Debuggability

In terms of debuggability, Guile beats Chicken handily. To begin with, Guile prints (be amazed!) a stack trace when an error occurs. Chicken does not print a stack trace for the simple fact that it does not use a call stack in the conventional way; when an error occurs, Chicken prints a "call history", i.e., a list of the last N calls, in the order in which they occurred, but without representing the nesting of the calls, which makes debugging more complicated. Besides showing a stack, Guile also shows the values of the arguments in each stacked call (something whose absence bothers me quite a bit in Python) and, when run in interactive mode, drops into a debugger that allows, among other things, inspecting the values of local variables. It is also possible to set breakpoints and the other things one expects from a debugger, but I have not looked at that part carefully.

Moreover, Guile tends to detect more errors than Chicken. For example, Chicken does not report an error if a function is declared with multiple parameters of the same name, or if a function is called with a keyword argument it does not support.

(Non-)minimalism

In Chicken there is a greater separation between a small core language and extensions, which have to be imported explicitly in programs that use them. (For example, in the guessing game program from an earlier post, it was necessary to do a (use extras) to get access to the random function.) In Guile, a much larger amount of functionality (including regular expressions and the POSIX API) is already available without any import. In this respect, Guile has a slightly more "Common-Lispy" feel than Chicken. (But not much; things like object orientation require an explicit import.)

Another sense in which Guile is non-minimalist is that there are often multiple APIs for doing the same thing. In many cases, this is because a new API was introduced (often a SRFI, which is a plus), but the old one was kept for compatibility. For example, for defining structures, Guile supports SRFI-9, the traditional Guile APIs (which predate SRFI-9), and the R6RS records API. Likewise, Guile supports dynamic scope both through fluids (the historical interface) and through parameters (SRFI-39). (Parameters are implemented in terms of fluids in Guile.)

Guile seems to be quite committed to backward compatibility, which has the very positive side that your code will probably keep working in future versions, but this comes at the cost of having multiple APIs for the same functionality hanging around.

Modules

While Chicken makes a distinction between units (which are used for separate compilation of parts of a program) and modules (which are used to isolate namespaces), in Guile a module serves both purposes. Actually, I find this distinction that Chicken makes quite annoying (and apparently there are people who want to deprecate units in Chicken 5), and I was very glad to learn that Guile (1) has a module system; (2) which is not full of fuss (or at least the fuss is optional); and (3) is easy to use.

The name of a module in Guile is a list of symbols, and a module named (foo bar) is looked up in the file load_path/foo/bar.scm. The default load path can be changed through a command-line parameter, or an environment variable, or by setting %load-path and %load-compiled-path explicitly.

I don't know what the conventional way is to write programs split across multiple files without having to install them in the load path. I imagine one way is to write a main file which sets the load path to include the program's directory and then imports the other components of the program. Another way is to include the files, but that does not create modules with separate namespaces.

Threads

Guile supports native operating system threads, unlike Chicken, which only supports "green threads" (one native thread running multiple logical threads cooperatively). Besides the usual APIs for creating threads, mutexes and all that stuff, Guile also supports a futures API, automatically maintaining a thread pool whose size is determined by default by the number of cores in the machine, a macro (parallel exp1 exp2 ...) which runs all the expressions in parallel and returns the value of each one, and a letpar, a "parallel let" which evaluates the values of all the variables in parallel. I don't know how useful this is in practice, but it sure is cool.

Object orientation

Guile comes with an object system based on generic functions and multiple dispatch à la CLOS, called GOOPS. I have not looked at GOOPS carefully yet; it seems not to have everything CLOS has (for example, before, after and around methods), but it allows redefining classes at run time (with automatic update of the existing instances of the class), and seems to have a few extra things (e.g., provisions for merging methods of the same name inherited from different modules).

One very cool thing about GOOPS compared to CLOS is that it allows transparently turning an ordinary function into a generic function. For example, you can add a method to the builtin function +:

(define-method (+ (a <string>) (b <string>))
  (string-append a b))

Having done that, you can now write (+ "a" "b"), and the result will be "ab". The interesting thing here is that define-method does not shadow the existing + with a new +: it modifies the existing +, and now all the code that used + before will start using this augmented +. Apparently this only works for overriding the non-standard behavior of the operators; if you define a method (+ (a <number>) (b <number>)) and try to add two numbers, Guile will keep using the standard addition instead of your definition, I believe because the call to + is compiled to the VM's add instruction, which will ignore the method if the arguments are numbers. (Which in a way makes the fact that + uses the user-defined method when the arguments are not numbers even more impressive, since, I suppose, they had to go out of their way to make the VM's add instruction check whether the user has added methods to + when the arguments are not numbers. But I don't know enough about Guile's implementation to really know what is going on under the hood.)

Setters

One thing I found a bit annoying in Guile compared to Chicken is that Guile does not support, out of the box, using set! on things that are not identifiers. For example, in Chicken you can write things like (set! (car x) 42) instead of (set-car! x 42); Guile, by default, does not support this. Guile has support for "procedures with setters", through a traditional API and the SRFI-17 API, and upon importing the (srfi srfi-17) module set! becomes usable with car, cdr and vector-ref, but there are a zillion other similar functions (such as hash-ref, array-ref, etc.) which have no setters defined. Nothing fatal, and nothing prevents you from defining setters for these functions, but it would be nice if there were native support for setters for all functions for which a setter makes sense, as is the case in Chicken.

Libraries

Guile seems to have far fewer libraries than Chicken, and it certainly does not have a centralized library repository like Chicken's eggs. (The documentation of guild, the interface to Guile's command-line utilities such as guild compile, mentions plans to allow installing packages from the Internet through guild in the future. I don't know how they intend to accomplish that, but, for my part, I think that more important than a centralized repository is a standardized way of packaging programs/libraries and describing dependencies in a way that allows them to be resolved automatically at installation. But I digress.)

On the other hand, Guile comes with a bunch of modules out of the box, and has bindings for Gtk and GNOME. I have not looked at them carefully yet, but they may be an interesting solution for creating applications with a graphical interface.

Unicode

In Chicken, by default, all strings are byte strings. There is a module/extension/unit/library/whatever called utf8, which reimplements various string manipulation functions to assume that strings are encoded in UTF-8 (the strings remain byte strings under the hood). Importing utf8 does not replace the standard functions, but rather redefines them, so, as far as I understand, importing utf8 in your module will not make other modules in the system that have not explicitly imported utf8 magically start working with UTF-8 strings.

In Guile, strings are natively Unicode (there is a separate type for "byte vectors", which can be used to store literal bytes not interpreted as characters). Ports (open files) have an associated encoding, and Guile automatically converts from Unicode to the port's encoding. I don't know whether this may end up being a nuisance in practice (the default encoding is determined by the locale, and a file opening mode that depends on the locale scares me a bit, but that may be trauma from Python 2's UnicodeDecodeError, which is not the same situation as Guile's because in Guile all strings are Unicode by default; and nothing prevents you from setting the encoding manually when opening a file).

Conclusion

Overall, Guile seemed to me a really nice Scheme implementation, and it has a bunch of other interesting aspects that I did not get to mention in this post (for example, that it was made to be embeddable in C programs, that the C API is documented alongside the corresponding Scheme functions, and that overall Guile's documentation is quite good). I want to see if I use it in future projects to get more practical experience with it.

2 comentários / comments

Some things I know about Clang and LLVM, part 1

2016-06-17 04:10 -0300. Tags: comp, prog, llvm, mestrado, in-english

In this post I'm going to talk about a few things about Clang and LLVM which I learned during my Master's and which might be useful to people new to Clang/LLVM.

The LLVM Project

According to its website, "The LLVM Project is a collection of modular and reusable compiler and toolchain technologies." The LLVM Project encompasses a number of sub-projects, the main ones being LLVM and Clang. Basically, LLVM is an infrastructure for code compilation, analysis and transformation. LLVM originally stood for "Low Level Virtual Machine", but it is not really a virtual machine, so nowadays "LLVM" is not considered an acronym anymore, it's just the name of the project. Clang (which is pronounced "clang", by the way, not "C-lang") is a C/C++/Objective C compiler which uses LLVM for code generation. The great things about Clang and LLVM are:

Unlike traditional compilers, they are designed as reusable components. You can use Clang and LLVM components as libraries to write your own program analysis/transformation tools, for instance, or you might extend Clang and LLVM with new passes/plugins, or use LLVM as a backend to generate machine code for your compiler. People have implemented JIT compilers (such as this) using LLVM as a basis for code generation, for example.
LLVM is designed around a well-defined intermediate language, called LLVM IR (Intermediate Representation), which is sort of an assembly-like language for an abstract machine. All code transformations at the LLVM level (apart from machine code generation) are implemented as transformations taking an LLVM IR program as input and producing a modified LLVM IR program as output. This makes it easier to add new passes, use them individually, combine them, inspect what each pass does, etc. LLVM IR has a textual representation (which you can print out or feed as input to LLVM), a bitcode representation which is more space-efficient, and an in-memory data structure representation (which is what the LLVM tools use internally).

Nowadays LLVM is quite popular as a compiler backend for various languages, such as Rust. The great thing about targeting LLVM for code generation is that it implements a large number of code optimizations. In fact, when Clang compiles a C/C++/ObjC program, it emits very naive, unoptimized code – most optimizations happen at the LLVM level. Because of this, any compiler targeting LLVM is able to use those same optimizations without having to do anything in particular (other than emitting code which LLVM is able to optimize – LLVM can't do magic, after all).

An example of a weirder project using Clang/LLVM is Emscripten, which implements an LLVM backend, the part which translates LLVM IR to machine code – except in this case the machine code is JavaScript.

Documentation, caveat, and scope of this post

The LLVM Project website has plenty of documentation (for suitable values of "plenty"), for LLVM and Clang. You should consult those for reference. The mailing lists (there are separate ones for the various projects) also have plenty of useful information (although I usually end up there by searching stuff on ~~Google~~ StartPage rather than going directly to the mailing list). As far as I can tell, people are quite helpful if you ask questions there (but I have never personally asked anything).

It's probably a good moment to warn that LLVM and Clang development moves quite fast, so it's probable some (or most) things in this post will be out of date sooner or later. So, when in doubt, consult the documentation. I will not attempt to duplicate the information in the documentation here, but rather will try to provide an overview of things I had to learn and some gotchas I found during the process. As of now, the current stable version of LLVM is 3.8, although I initially used 3.6 and later 3.7 for most of my Master's (those were the most current stable versions at the time).

Compiling LLVM and Clang

If you want to compile LLVM and Clang from source, you can find information in Getting Started with the LLVM System, Building LLVM with CMake, and Clang – Getting Started.

The main gotchas I found in the process were:

If you compile the source from the tarball (rather than the version on SVN/Git) using the standard ./configure; make; make install, it will compile a release build, not a debug build, even if you specify you want debug symbols to ./configure (and if you intend to write new passes/plugins or modify Clang/LLVM, you probably want a debug build). The solution is to compile with CMake instead of the standard Makefiles which come with the distribution. According to my notes here, the commands I used were:
```
tar -xvf llvm-3.6.1.src.tar.xz 

tar -xvf cfe-3.6.1.src.tar.xz 
mv cfe-3.6.1.src llvm-3.6.1.src/tools/clang

tar -xvf compiler-rt-3.6.1.src.tar.xz
mv compiler-rt-3.6.1.src llvm-3.6.1.src/projects/compiler-rt

mkdir build
cd build

cmake -GNinja -DCMAKE_BUILD_TYPE=Debug ../llvm-3.6.1.src
ninja     # or ninja -j1
```
ninja (package ninja-build in Debian) is a build program similar to make. You can pass cmake the argument -G"Unix Makefiles" instead of -GNinja, and then it will generate classical Makefiles instead of Ninja files and you can run make instead of ninja, but then you don't get to think of this video every time you type ninja in the terminal.
By default, ninja will run multiple compilation jobs in parallel. This is all fine and dandy, and you probably want to do that if you have a multi-core machine, except linking Clang eats a lot of memory, and with four linkers in parallel it will likely eat up all your RAM (I froze my 8GB RAM machine a couple times due to that). This only happens during linking, at the very end of the compilation process. A couple of things help here:
- I very much recommend disabling swap space (sudo swapoff -a) during compilation. This way the compilation process will die when it eats up all your memory instead of freezing your system. After compilation you can reenable swap with sudo swapon -a.
- After the compilation process dies from lack of memory (or alternatively, if you interrupt it manually after your system starts to freeze), you can restart it with ninja -j1, which tells ninja to run a single compilation job. This way, you can run most of the process in parallel, and then restart it with a single job for the memory intensive last part.
- Another thing that may help is using the gold linker, which consumes somewhat less memory than the standard ld linker. It comes with recent versions of GNU Binutils, so you probably already have it on your system under the name ld.gold, but it is not used by default. You can set up a symlink in /usr/bin from ld to ld.gold to use it (you'll probably want to rename the old ld first).

If you just want to use the LLVM/Clang infrastructure, rather than modifying it, you may not need to compile it from source; you can install your distribution's development packages for LLVM and Clang instead (e.g., llvm-3.7-dev and libclang-3.7-dev on Debian). Then you can compile your pass/plugin/whatever against those.

Interfacing with Clang/LLVM

As far as I can tell, the primary interface with LLVM is the C++ API. There are C bindings to it too, but I don't know how common it is to use them. Besides C, there are bindings for OCaml, Python and Go, as well as third-party ones for Haskell, Rust, and maybe others. I can't attest to their stability or completeness (I remember trying to compile the OCaml bindings and failing miserably, but I didn't really try hard enough).

For Clang, there are a number of interfaces, the most stable of which (as in "the one that changes the least across Clang versions") is the C LibClang. There is also the Plugin interface and LibTooling, both of which are based on C++ and provide finer-grained control over the generated AST.

LLVM IR as an interface

If you want to use LLVM from a language for which there are no bindings (and you don't want to write the bindings yourself), an alternative is to communicate with LLVM by parsing and emitting LLVM IR directly, rather than using LLVM's APIs. This is what I did for my Master's software, which I wrote in Scheme. If you intend to take LLVM IR code as input (e.g., for writing a code analysis/transformation), you will have to write an LLVM IR parser, which is somewhat annoying (LLVM IR syntax could be quite a bit more regular, if you ask me), but is not particularly hard. If you don't need to read LLVM IR code, but only emit it (for example, if you are using LLVM as a backend for a compiler), then you don't need a parser, you just need to be able to print valid LLVM IR code. The drawback of this approach rather than using a binding is that you will have an extra overhead from converting your data structures to textual LLVM IR, and then feeding it to LLVM (typically invoked as a separate program (usually the opt tool)), which will then reconstruct it as the in-memory LLVM IR representation, rather than generating the in-memory representation directly and running the LLVM routines as library calls in the same process. On the other hand, that's exactly what a traditional compiler (such as GCC) does when calling the assembler, which takes textual assembly code as input (usually piped into it), so it's not like you're necessarily going to have an unacceptable overhead from this.
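To give an idea of what "just printing valid LLVM IR" can look like, here is a minimal Python sketch (not the Scheme code from my Master's). It assumes a made-up expression format, nested tuples like ("add", "a", ("mul", "b", 2)), and a made-up helper emit_function; it only handles i32 arithmetic, and every intermediate result gets a fresh named temporary.

```python
import itertools


def emit_function(name, params, expr):
    """Emit textual LLVM IR for an i32 function whose body computes `expr`.

    `expr` is a parameter name (str), an integer literal (int), or a tuple
    (op, lhs, rhs) where op is an LLVM binary instruction such as
    "add", "sub" or "mul".
    """
    counter = itertools.count()
    body = []

    def emit(e):
        if isinstance(e, int):
            return str(e)          # integer constant operand
        if isinstance(e, str):
            return f"%{e}"         # reference to a function parameter
        op, lhs, rhs = e
        left, right = emit(lhs), emit(rhs)
        tmp = f"%t{next(counter)}"  # fresh temporary for this result
        body.append(f"  {tmp} = {op} i32 {left}, {right}")
        return tmp

    result = emit(expr)
    args = ", ".join(f"i32 %{p}" for p in params)
    lines = [f"define i32 @{name}({args}) {{", "entry:"]
    lines += body
    lines += [f"  ret i32 {result}", "}"]
    return "\n".join(lines) + "\n"


ir = emit_function("f", ["a", "b"], ("add", "a", ("mul", "b", 2)))
print(ir, end="")
```

The printed text is what you would then pipe into opt or clang -x ir -, as described below.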

If you are writing an LLVM IR transformation in this way, and you want to run it as if it were a pass during compilation of a C/C++ program, you'll have to do some tricks. If you want to run your transformation after all other LLVM IR passes, then your life is simple: you can run clang -S -emit-llvm -o - (your normal arguments) to tell Clang to generate "assembly" code rather than an executable (-S), to emit LLVM IR rather than assembly, to output to stdout rather than a file (-o -), and use your normal compilation flags and arguments. Then you can pipe the LLVM IR output into your program (or make your program call clang and read its output via a pipe), transform it as you wish, and then pipe the result back into Clang with clang -x ir - (more arguments) to finish compilation, where -x ir - tells Clang to read code in LLVM IR language from stdin, and (more arguments) will typically include -o executable-name.

If you need to take the output from Clang before any optimization passes are run, things are slightly more tricky. Even if you run Clang with -O0 some LLVM passes may still run. Worse, if you do that, Clang will not include within the LLVM IR code information needed by the optimization passes, such as type information used by type-based alias analysis (TBAA), which means that if you try to do something like clang -O0 ... | your-pass | clang -O3 ..., the result won't be as optimized as if you had directly run clang -O3 on the source, because clang -O0 will lose information which is needed by some of the optimizations performed by clang -O3. The solution is:

clang -S -emit-llvm -Xclang -disable-llvm-optzns -o - -O3 (your normal arguments)

This will make sure Clang includes all information required by optimizations, but stops Clang from invoking the optimizations themselves. Then you can feed this into clang -x ir - -O3 later and optimizations will work properly. (-Xname option passes the option to the compilation subprocess name. Note also that -x ir will apply to all inputs specified afterwards in the command line, not just the -; if you need to pass, say, an extra C file to be combined with the result of your transformation, then you have to specify -x c filename.)

As far as I know, there is no way to simply intercalate a new external pass (i.e., one implemented as an external program) into the process, like "I just want to run:

clang -O3 -lsomelibrary -o hello hello.c

but with this new pass intercalated"; if you want your "compiler+pass" to accept the same arguments as the standard compiler, you'll have to write a routine or script to do some juggling of the arguments passed to each call to the compiler, to get something like:

clang -S -emit-llvm -Xclang -disable-llvm-optzns -o - -O3 hello.c |
     your-pass |
     clang -x ir - -O3 -l somelibrary -o hello

This is another drawback of using an external program and communicating purely via the IR, rather than writing a real LLVM IR pass (which I guess you could intercalate with some -Xclang option or something, I don't really know).
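As a rough idea of what that argument juggling might look like, here is a Python sketch. The function split_clang_args is hypothetical and deliberately simplistic: it assumes that -o and the -l/-L options belong to the final step, that -O levels go to both steps, and that everything else (source files, other flags) goes to the first step. A real wrapper would need to know about many more options.

```python
import shlex


def split_clang_args(args):
    """Split compiler arguments into (first_step_argv, final_step_argv)."""
    front = ["clang", "-S", "-emit-llvm",
             "-Xclang", "-disable-llvm-optzns", "-o", "-"]
    back = ["clang", "-x", "ir", "-"]
    it = iter(args)
    for arg in it:
        if arg == "-o":
            back += ["-o", next(it)]      # output name: final step only
        elif arg in ("-l", "-L"):
            back += [arg, next(it)]       # "-l somelibrary" as two arguments
        elif arg.startswith(("-l", "-L")):
            back.append(arg)              # "-lsomelibrary" as one argument
        elif arg.startswith("-O"):
            front.append(arg)             # optimization level: both steps
            back.append(arg)
        else:
            front.append(arg)             # sources and remaining flags
    return front, back


front, back = split_clang_args(["-O3", "-lsomelibrary", "-o", "hello", "hello.c"])
print(shlex.join(front), "| your-pass |", shlex.join(back))
```

From there, the wrapper can run the first command, feed its output through the pass, and feed the result to the second command (for example with subprocess pipes).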

If you need to run specific LLVM passes on an LLVM IR program, you can use the opt tool. For example, if you want to run the reg2mem pass, you can add opt -S -reg2mem in the pipeline. You can run opt -help for a list of available passes. (-S tells opt to emit textual LLVM IR, rather than bitcode.)

End of Transmission

That's it for today. In the next post, I intend to talk a bit about the LLVM IR language itself.

4 comentários / comments

Lisp without cons cells

2016-05-28 13:14 -0300. Tags: comp, prog, pldesign, lisp, ramble, in-english

Okay, I'm gonna write this down now to distract myself for a while before I get back to Master's stuff.

In a recent post I talked about the problem of cross-process garbage collection, and suggested wrapping objects in a reference-counted container when crossing process boundaries as a possible solution, but I remarked that this would have a large overhead when passing many small objects. The prime example would be passing a linked list, as (at least naively) every node of the list would get wrapped as the elements of the list are accessed.

Now, I particularly cared about this case because the linked list (based on cons cells) is a very prominent data structure in Lisp. And although they have some nice properties (they are conceptually simple, you can insert and remove elements into the middle/end of a list by mutating the cdrs), they also are not exactly the most efficient data structure in the world: half the memory they use is just for storing the "next" pointer (which fills processor cache), whereas in a vector you just need a header of constant size (indicating the vector size and other metadata) and the rest of the memory used is all payload. Also, vectors have better locality. On the other hand, "consing" (i.e., nondestructively inserting) an element into a vector is O(n), because you have to copy the whole vector, and even destructive insertion may require a whole copy every once in a while (when you exceed the current capacity of the vector). I've been wondering for a long time: could you make a Lisp based on a data structure that is halfway between a linked list and a vector?

If we are to allow the common Lisp idioms with this new kind of list, it has to support consing and taking the tail of the list efficiently. (Another possibility is to replace the common idioms with something else. That is much more open-ended and requires more thought.)

What I've been thinking of as of late is roughly a linked list of vectors, with some bells and whistles; each vector would be a chunk of the list. Each vector/chunk would have a header containing: (1) the number of elements in the chunk; (2) a link to the next chunk; (3) an index into the next chunk. Then comes the payload. So, for example, if you have the list (w x y z), and you want to prepend the list (a b c) to it, you'd get a structure like this (the | graphically separates the header from the payload; it does not represent anything in memory):

[3 * 0 | a b c]
   |
   `->[4 * 0 | w x y z]
         |
         `-> ø
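To make the layout concrete, here is a minimal Python model of it. It is a sketch, not an implementation: the names Chunk and elements are made up for this example, the element count is taken from the length of the payload instead of being stored separately, and None plays the role of ø.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator, List, Optional

@dataclass
class Chunk:
    payload: List[Any] = field(default_factory=list)
    next: Optional["Chunk"] = None  # link to the next chunk (None stands for ø)
    index: int = 0                  # position in `next` where the list continues

    @property
    def count(self) -> int:
        # Header field (1): the number of elements in this chunk.
        return len(self.payload)

def elements(chunk: Optional[Chunk]) -> Iterator[Any]:
    """Yield the elements of the list that starts at the beginning of `chunk`."""
    start = 0
    while chunk is not None:
        yield from chunk.payload[start:]
        # Continue in the next chunk, at the index recorded in this header.
        chunk, start = chunk.next, chunk.index

# The structure from the diagram: [3 * 0 | a b c] -> [4 * 0 | w x y z] -> ø
tail = Chunk(list("wxyz"))
front = Chunk(list("abc"), tail, 0)
```

Here list(elements(front)) gives the seven elements a b c w x y z in order.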

The reason for the index is that now you can return the tail of a list lst without the first n elements by returning a vector chunk with 0 length and a pointer into lst with index n: [0 lst n | ]. If n is greater than the size of the first chunk (e.g., if you want to drop 5 elements from the (a b c w x y z) list above), we must follow the "next" pointers until we find the chunk where the desired tail begins. This is likely to be more efficient than the cons cell case, because instead of following n "next" pointers, you follow one pointer per chunk, subtracting the length of the skipped chunk from n each time. In the worst case, where there is one chunk for each element, the performance is the same as for cons cells, at least in number of pointer traversals. (We must only allow empty chunks, like the [0 lst n | ] example, at the beginning of a list, never in the middle of a chunk sequence. This ensures that the worst case is no worse than with cons cells. If we allowed empty chunks anywhere, reaching the nth element of a list could require arbitrarily many chunk traversals.)
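The tail operation described above can be sketched in Python as follows. The names Chunk, elements and drop are made up for this example, None stands for both ø and the empty list, and dropping more elements than the list has simply yields the empty list (an assumption, in the spirit of nthcdr).

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

@dataclass
class Chunk:
    payload: List[Any] = field(default_factory=list)
    next: Optional["Chunk"] = None  # None stands for ø
    index: int = 0                  # position in `next` where the list continues

def elements(chunk):
    """Yield the elements of the list that starts at the beginning of `chunk`."""
    start = 0
    while chunk is not None:
        yield from chunk.payload[start:]
        chunk, start = chunk.next, chunk.index

def drop(lst, n):
    """Return `lst` without its first n elements, sharing storage with `lst`."""
    chunk, start = lst, 0
    while chunk is not None:
        available = len(chunk.payload) - start
        if n < available:
            break                    # the tail begins inside this chunk
        n -= available               # skip the whole chunk at once
        chunk, start = chunk.next, chunk.index
    if chunk is None:
        return None                  # nothing left: the empty list
    if start + n == 0:
        return chunk                 # the tail is exactly an existing chunk
    # Otherwise build the zero-length header chunk [0 chunk (start+n) | ].
    return Chunk([], chunk, start + n)
```

Because the loop also skips over a zero-length chunk at the head of lst, taking a tail of a tail never chains empty chunks: the result always points directly into a chunk that has payload.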

One problem with this is that now (cdr lst) allocates memory (it creates a [0 lst 1 | ] chunk and returns it), unlike the cons cell case, where cdr never allocates memory (it just returns the value of the cell's "next" pointer). One possible solution is to try to make the result of cdr go on the stack rather than being heap-allocated, to reduce the overhead (the compiler could special-case cdr somehow to make it return multiple values rather than a new chunk, and build the chunk on the fly in the caller if it turns out to be necessary). Another way around this would be to return a pointer into the middle of a chunk instead of a new chunk. I see two ways of achieving this:

Change the layout of the vector such that it begins with the payload, followed by a special value that marks the end of the payload, followed by the pointer to the next chunk. Now we don't need an index into the next chunk anymore, because we can just make the "next" pointer point into the middle of the next chunk. The drawback is that now finding the end of the chunk requires traversing the whole chunk, rather than looking at the length at the header. This would still be faster than with cons cells, because you're traversing adjacent elements in memory.
Require all chunks to have a fixed size and fixed memory alignment (say, every chunk has 64 bytes and is allocated at an address that is a multiple of 64). Then you can always find the header by zeroing out the last bits of the address.
Third idea that occurred to me now: if all chunks are, say, 16-byte aligned, you can encode an index 0~15 into the lower-order bits of the pointer.
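The pointer arithmetic behind the last two ideas can be illustrated with plain integers standing in for addresses. This is just a sketch of the bit manipulation; the function names are made up for this example, and the 16-byte and 64-byte figures are the ones used as examples above.

```python
CHUNK_ALIGN = 16               # every chunk starts at a multiple of 16
INDEX_MASK = CHUNK_ALIGN - 1   # the low 4 bits of an aligned address are free

def tag(chunk_addr, index):
    """Pack an index 0..15 into the low-order bits of an aligned chunk address."""
    if chunk_addr % CHUNK_ALIGN != 0:
        raise ValueError("chunk address is not aligned")
    if not 0 <= index < CHUNK_ALIGN:
        raise ValueError("index does not fit in the low-order bits")
    return chunk_addr | index

def untag(ptr):
    """Recover (chunk address, index) from a tagged pointer."""
    return ptr & ~INDEX_MASK, ptr & INDEX_MASK

def header_of(ptr, chunk_size=64):
    """Fixed-size chunk variant: zero out the low bits to find the chunk header."""
    return ptr & ~(chunk_size - 1)
```

For instance, tag(0x1000, 5) is 0x1005, untag(0x1005) gives back (0x1000, 5), and header_of(0x1038) is 0x1000.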

All these have drawbacks. First, you need to know that the pointer you have is a pointer into a list chunk to be able to safely do the pointer arithmetic. (The fixed-size chunks case is simpler to solve: you zero out the lower bits of the pointer and see if the result points to a chunk type tag.) Also, pointers into the middle of objects complicate garbage collection (and reference counting even more so, I think). Finally, if you fix the size of chunks, some of the advantages of using chunks in the first place go away; if I allocate a 1000-element list at once, that should get me a single 1000-element chunk.

Or should it? Another problem here is that now garbage collection / reference counting can only collect whole chunks. If you choose your chunks badly, you may end up holding memory for longer than necessary. For instance, if you have a 1000-element list made out of a single 1000-element chunk, and at some point your program takes tails until it is left with only a reference to the last three elements, now you're stuck with a huge chunk most of which is unused; what is more, all the elements in it are kept from being collected too. Maybe we'd need a heuristic: if the tail you want is smaller than some threshold fraction of the chunk size, the system would return a copy of the tail rather than the tail itself. This would mess with mutability (you'd never know if the tail list you got shares storage with the original), but maybe immutable lists are the way to go anyway.
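The copy-below-a-threshold heuristic could look something like this sketch. The names Chunk and tail_of are made up for this example, the threshold of one quarter of the chunk is an arbitrary assumption, and n is assumed to fall within the given chunk.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

COPY_THRESHOLD = 0.25  # assumed value: copy tails smaller than 1/4 of the chunk

@dataclass
class Chunk:
    payload: List[Any] = field(default_factory=list)
    next: Optional["Chunk"] = None
    index: int = 0

def tail_of(chunk, n):
    """Tail of `chunk` after dropping n elements (0 <= n <= len(chunk.payload))."""
    remaining = len(chunk.payload) - n
    if remaining < COPY_THRESHOLD * len(chunk.payload):
        # Small tail: copy it, so the big chunk can be collected later.
        return Chunk(chunk.payload[n:], chunk.next, chunk.index)
    # Large tail: share storage through a zero-length header chunk.
    return Chunk([], chunk, n)
```

With a 1000-element chunk, tail_of(big, 997) returns a fresh 3-element chunk that no longer references big, while tail_of(big, 10) returns a header chunk pointing into big at index 10.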

The other problem to solve is how to make cons efficient: the classical Lisp cons adds (non-destructively) one element to the front of an existing list, and we don't want to create a new chunk per cons invocation, otherwise the chunks just degenerate into cons cells. One idea I had is to allocate chunks with at least a certain number of elements. For example, if you create a list with just a, you'd get a chunk with a few blank spaces (and enough metadata to know what is blank and what isn't; this could be an extra header element, or just a distinguished value meaning "blank"): [4 ø 0 | _ _ _ a]. Now, when you cons a new element x onto that list, cons would check if there is a blank space immediately before the a in the existing chunk, and mutate it in place: [4 ø 0 | _ _ x a]. This won't mess with the program's view of the list because so far it only had references to the already filled part of the list. The problem with this arises when multiple threads want to cons onto the same list at the same time: we must ensure only one of them gets to mutate the chunk. For example, say one thread wants to cons x onto the list (a), and another thread wants to cons y onto the same list (a). We must make sure that only one gets to mutate the chunk in place ([4 ø 0 | _ _ x a]), and that the other one fails and falls back to either copying the chunk and then mutating the copy, or creating a new chunk that points to the old one ([4 [4 ø 0 | _ _ x a] 3 | _ _ _ y]; note that the outer chunk points into the inner chunk with an index of 3, skipping the first 3 slots, including the x added by the other thread). This could have a synchronization overhead. I'm not sure if it would be significant, though, because all you need is a compare-and-swap: "try to write into this space if it is blank". You don't need a lock because you don't need to wait for anyone: if this first try fails (i.e., if the other thread got the space first), the space won't be available anymore, so you must immediately fall back to creating a new chunk rather than waiting for anything.
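Here is a rough Python sketch of this cons. Several things in it are assumptions made for the sake of the example: a list is represented as a (chunk, position) pair (or None for the empty list), blanks are marked with a distinguished BLANK value, new chunks get 4 slots, and, since Python has no compare-and-swap primitive, compare_and_swap is emulated with a tiny lock that stands in for the hardware instruction.

```python
import threading

BLANK = object()        # distinguished value meaning "blank"
MIN_CAPACITY = 4        # assumed minimum number of slots per chunk
_cas_lock = threading.Lock()  # only emulates an atomic compare-and-swap

class Chunk:
    def __init__(self, capacity=MIN_CAPACITY, next=None, index=0):
        self.slots = [BLANK] * capacity
        self.next = next      # next chunk, or None for ø
        self.index = index    # position in `next` where the list continues

def compare_and_swap(slots, i, expected, new):
    """Write `new` into slots[i] only if it still holds `expected`."""
    with _cas_lock:
        if slots[i] is expected:
            slots[i] = new
            return True
        return False

def cons(x, lst):
    """Add x to the front of lst, where lst is None or a (chunk, position) pair."""
    if lst is None:
        fresh = Chunk()
    else:
        chunk, pos = lst
        # Try to claim the blank slot just before the current head.
        if pos > 0 and compare_and_swap(chunk.slots, pos - 1, BLANK, x):
            return (chunk, pos - 1)
        # No room, or someone else took the slot: fall back to a new chunk.
        fresh = Chunk(next=chunk, index=pos)
    fresh.slots[-1] = x
    return (fresh, len(fresh.slots) - 1)

def to_list(lst):
    """Collect the elements of lst into a Python list."""
    out = []
    while lst is not None:
        chunk, pos = lst
        out.extend(chunk.slots[pos:])
        lst = (chunk.next, chunk.index) if chunk.next is not None else None
    return out
```

Consing x and then y onto the same one-element list (a) reproduces the situation above: the first cons fills the blank slot in place, and the second one gets a new chunk that points into the old one with index 3, so that it skips the x.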

A possible side-effect of all of this is that now vectors as a separate data structure may not be necessary: you just allocate an n-element list at once, and it will largely have the same performance as an n-element vector. Well, unless we make lists immutable, in which case we may still need (mutable) vectors. And lists still have some arithmetic overhead to find the position of an element (because in general we don't know that the list is a single chunk when performing an access; we have to find that out), so vectors may still be advantageous in many circumstances.

Now, back to (trying to) work.

[Update: Apparently I reinvented a half-hearted version of VLists. Also, I didn't mention it, but the Lisp Machine had a feature similar in spirit (but not in implementation) called CDR coding, which used a special tag in cons cells to mean that the rest of the list itself, rather than a pointer to it, was stored at the cdr place, thus saving one pointer and gaining locality. In the Lisp Machine, every memory object was tagged, so this special tag came more or less for free, which is generally not the case for modern architectures.]
