- General (18)
- Food (1)
- Cooking (1)
- Ruby (6)
- Rails (2)
- Svn (2)
- Linux (9)
- Git (1)
- Firefox (8)
- Porn (1)
- Freyja (1)
- Witchhammer (6)
- Music (1)
- Merb (3)
- Poetry (0)
- Bolverk (3)
- Sinatra (1)
- Discogs (1)
- Centos (1)
- Python (1)
- Whinging (2)
- Travel (2)
- Scheme (7)
- Lisp (8)
- Sicp (1)
- Rot13 (1)
- Czech (2)
- Metal (3)
- Passenger (1)
- Fun (5)
- Fractals (2)
- Plt (2)
- Clojure (1)
- Continuations (1)
- Presentation (1)
Comprehending first-class continuations
The notion of continuations as first-class values has been a tricky subject for me to understand to a comfortable level of certainty. I think this is probably true for many PLT-laymen like myself. This article represents my attempt at collecting and presenting my thoughts in a coherent manner! I'd be happy to receive corrections and comments.
I will start by defining a few key terms in my own style.
We can observe that every expression, regardless of complexity, has the ultimate goal of returning a value to some surrounding execution context. That context is known as a continuation - it represents everything that is left to compute. Therefore, every computation has an associated continuation, which specifies the place from which execution should continue once control has been returned.
As a simple example, consider the following Scheme form:
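A minimal form that fits this description is (+ n 1), with n assumed to be bound somewhere in the enclosing scope (the value 5 below is purely for illustration):

```scheme
;; Assume n is bound locally.
(define n 5)

;; The form in question:
(+ n 1)

;; The continuation of the sub-expression "1" can be pictured as an
;; anonymous function still waiting on that second argument:
(define receive (lambda (v) (+ n v)))

;; Delivering 1 to the continuation completes the computation:
(receive 1) ; same value as (+ n 1)
```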
The continuation associated with the sub-expression "1" is an anonymous function, whose body consists of a call to the + primitive procedure, whose first argument is the local variable n and whose second argument is the value being sought. With this understanding, we can say that expressions deliver values and continuations receive values.
By "current", we are referring to the continuation that would be derived from the current point in a program's execution. In other words, it comprises the current point of execution, the lexical environment and the state of the call stack (you should be familiar with the stack and the heap).
For an object to be a "first-class citizen" in a programming language, it needs a few properties. Specifically, it must support being:
- passed as a parameter to a function;
- returned from a function;
- stored in a variable or within a data structure;
- constructed at runtime.
Objects like numbers and strings are first-class in most programming languages. Functions are first-class in all functional programming languages. And in Scheme, my language of choice for this article, nearly everything can be considered first-class - including continuations as we'll see shortly. To avoid confusion, note that I am using the term "object" to refer to any entity that can be controlled within the realm of a particular programming language, rather than to the OO-sense of the word (the instance of a class).
Continuations in Scheme
In order to render the abstract notion of the current execution context (the current continuation) to the programmer, Scheme needs some way to represent it. It would have been possible to implement specific objects and syntax into the language to handle this, but it's much easier to simply represent the current continuation as a first-class function. This process of transforming something implicit into something that can be explicitly expressed is known as "reification". So we can say Scheme reifies the current continuation as a function object.
Scheme provides the function call-with-current-continuation (aliased as call/cc) to furnish continuations to the programmer. call/cc is a unary function that accepts a further unary function, f, as its argument. When invoked, Scheme will reify the current continuation, c, as a function and apply f to c. Therefore:
When c is applied to an argument, v, the existing continuation is terminated and the one represented by c is reinstated. So program flow will continue on from where the continuation was captured and v will become the overall value of the call/cc invocation.
Confusing? Take a look at this simple example:
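A small form along these lines matches the behaviour described next (the pending (+ 3 ...) is abandoned when return is applied):

```scheme
;; The continuation of the call/cc form is captured in return.
;; (return 4) jumps straight back out, discarding the pending (+ 3 ...).
(call/cc
  (lambda (return)
    (+ 3 (return 4)))) ; => 4, not 7
```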
At first glance, you may think this expression will evaluate to 7, but in fact, it will evaluate to 4. This is because we've captured the current continuation in the formal parameter return and then applied it to the number 4. This causes program flow to return instantly to the point at which the continuation was taken. Yep -- when applied, a continuation never returns! We are effectively emulating the "return" control operator that allows early-escape in many popular programming languages.
To demonstrate the concept of early-escape in a more effective manner, let me show you two versions of a small Scheme procedure - one that makes use of continuations and one that does not. The procedure has-sym? accepts as arguments an S-expression (in this case a list of symbols and/or nested lists of symbols) and a symbol. It returns true if the given symbol is present somewhere within the S-expression. First of all, take a look at the non-continuation version:
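A straightforward recursive definition might look like this (a sketch; the original listing may have differed in detail):

```scheme
;; Returns #t if sym appears anywhere within the (possibly nested)
;; S-expression sexp, #f otherwise.
(define (has-sym? sexp sym)
  (cond ((null? sexp) #f)
        ((pair? (car sexp))
         (or (has-sym? (car sexp) sym)
             (has-sym? (cdr sexp) sym)))
        ((eq? (car sexp) sym) #t)
        (else (has-sym? (cdr sexp) sym))))
```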
Whilst concise and easy to understand, this implementation (because it's defined as a recursive process) suffers from a slight hassle: even if we find a match, we need to wait until the call stack unwinds before the actual result gets back to the continuation in which the has-sym? procedure was called. If, instead, we captured the continuation of the has-sym? procedure with call/cc and then defined a local helper procedure (to perform the search) inside our function argument, we'd be able to invoke the continuation object and jump straight out of the procedure as soon as we found a match. For example:
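One possible shape for the call/cc version (a sketch along the lines just described):

```scheme
;; Capture the continuation of the has-sym? call itself; the moment a
;; match is found, (found #t) jumps straight back out, skipping any
;; pending unwinding of the search.
(define (has-sym? sexp sym)
  (call/cc
    (lambda (found)
      (letrec ((search
                 (lambda (s)
                   (cond ((null? s) #f)
                         ((pair? (car s))
                          (search (car s))
                          (search (cdr s)))
                         ((eq? (car s) sym) (found #t))
                         (else (search (cdr s)))))))
        (search sexp)))))
```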
At this point it may be worth pointing out that continuations in Scheme, whilst far more powerful, are also more conservative than the goto statement of other languages for we can only revert control back to a place we have already visited (obviously, this is a good thing!). Also note that applying a continuation, unlike lexical closures, does not reinstate the referencing environment! If you change the values of variables - they will remain changed.
Continuations as first-class citizens
So far I've tried to explain the notions of both first-class objects and continuations in Scheme, but when we combine the two and start considering continuations themselves as being of first-class status, we open up a treasure trove of cool ideas. For example, consider the following mind-bender:
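A sketch consistent with the walkthrough that follows (the exact original may have differed):

```scheme
;; factorial without recursion or any built-in looping construct.
;; let* matters: it expands into nested lets, so re-entering the
;; continuation re-binds k but leaves total's binding frame alone.
(define (factorial n)
  (let* ((total n)
         (k (call/cc (lambda (c) c))))
    (if (> n 1)
        (begin
          (set! n (- n 1))
          (set! total (* total n))
          (k k)) ; jump back, passing the continuation to itself
        total)))
```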
We can demonstrate the indefinite extent of continuations in Scheme by capturing the continuation (reified as a first-class function) in the local variable k. This definition of factorial works by capturing the continuation object, modifying the parameter n and local variable total in place, and then invoking the continuation with itself as an argument. This causes something of a local GOTO and rebinds the original continuation to the local variable k. This process continues until n reaches 1 at which point we return the total back to the user. The let* form is important here as it will expand into a series of nested let forms. Without this, we'd rebind total to the original value of n on each invocation of the continuation object. With let* we can store a running total.
As you can see, we have used our power over control flow to compute factorial without using recursion or any built-in looping construct. Of course, this implementation also suffers from two obvious downsides: 1) it's somewhat esoteric and hard to understand; and 2) it mutates local state, which makes it harder to prove that the function maintains referential transparency.
With the power of first-class continuation handling at our disposal, we have the ability to define several control flow constructs common to other programming languages, such as backtracking, try/catch exception handling and "green" threads.
Studying a topic like continuations helps us to solidify our knowledge of several concepts we may generally grow to take for granted. The idea of managing control flow is so fundamental to our field that we may never stop to think about how such an implicitly understood idea can be explained effectively.
In terms of further reading, I'd suggest the call-with-current-continuation, Continuation and Call stack articles on Wikipedia. The R5RS standard documents call/cc and other control-handling procedures that are available in Scheme. The discussion on understanding continuations at Lambda the Ultimate also helped me in writing this article.
And, finally, I think when such high-level control of continuations is given to a programmer, the words of the great Uncle Ben must be remembered:
"WITH GREAT POWER THERE MUST ALSO COME - GREAT RESPONSIBILITY!"
- Amazing Fantasy #15 (August 1962) - The first Spider-Man story.
Understanding lexical closures
Ok, let's get started...
A closure is a language feature in which a lambda expression (read: function) is bound to the referencing environment that was available when it was first defined. The function is said to close over its referencing environment, hence the name. If you are unfamiliar with the term "referencing environment" (some people call it the "lexical environment"), we can think of it as a simple dictionary (key -> value) mapping free variables to their respective values. By "free variables", I refer to bindings that have occurred in an outer scope (such as the surrounding function). In lexically scoped languages, a function's referencing environment can generally be known simply by looking at the source code of a program.
I don't expect you have any idea what that really means yet - just keep reading!
A simple way to think of a closure in practical terms is as a package that contains two things:
1) A body of code
2) The referencing environment that was active when said code was defined
So, when we invoke a closure (execute the body of code in the package), rather than creating a new referencing environment, the closure will grab its saved environment from the package and use that instead.
The following example demonstrates how we can use closures to hide helper functions that would serve no purpose outside their defining context. Sorry, I could not help but add a factorial example:
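A Clojure sketch of the idea (the letfn form and names come from the surrounding discussion; the original listing may have differed):

```clojure
;; fact-helper is visible only inside the letfn form. fact is bound
;; in the global scope, but closes over the environment in which
;; fact-helper exists.
(def fact
  (letfn [(fact-helper [n acc]
            (if (<= n 1)
              acc
              (fact-helper (dec n) (* acc n))))]
    (fn [n] (fact-helper n 1))))
```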
The function "fact-helper" is now private to the letfn expression. The named function "fact" is created in the global scope, but retains access to the referencing environment that was available when it was defined. In OOP, we generally achieve this same effect by making a class and defining private functions within it.
On the flip-side, languages that support first-class and/or anonymous functions must also ensure that a referencing environment has "unlimited extent". That is, the interpreter or compiler must guarantee that the environment will not be destroyed until the garbage collector can prove that there is no reference to it. This is important because a closure may be returned from a function and thus may outlive the scope in which it was defined. The somewhat contrived example below illustrates this point:
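A sketch of such a contrived example, assuming nothing beyond core Clojure (make-adder and add5 are illustrative names):

```clojure
;; make-adder returns a closure. The binding of x belongs to a scope
;; that has already returned, yet the closure keeps it alive.
(defn make-adder [x]
  (fn [y] (+ x y)))

;; add5 can be called long after make-adder's activation is gone.
(def add5 (make-adder 5))

(add5 3) ; => 8
```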
It is important to note that the environment that the closure reinstates will contain the same bindings, but their values may not be the same. This is especially true in imperative languages in which assignment (changing the value of a variable) is key.
Closure is a simple concept that all programmers should at least have a basic understanding of. Many modern programming languages are adding support for first-class functions, which means that this style of programming may well become more popular in mainstream languages. As more multi-paradigm languages hit the mainstream, it will become increasingly important that effective programmers can reason about their programs from multiple perspectives.
In terms of further reading, I suggest you take a look at the Wikipedia page and/or read SICP. Finally, I cannot claim to be an expert on this subject, so if you have any feedback and/or corrections, please let me know. Thanks!
Fun with fractals #2
Following on from my last article, I'd like to show you a common fractal pattern known as the Sierpinski Triangle. It was first described formally in 1915 by Waclaw Sierpinski. You've probably seen it before -- it's basically a triangle containing three smaller triangles, each of which contains three smaller triangles, and so on ad infinitum.
There are many ways to construct the Sierpinski Triangle, but one method that really grabbed my attention involved an approximation from the rows of Pascal's Triangle. It's really quite simple, but I'll briefly explain Pascal's Triangle first.
Pascal's Triangle is an ancient (discovered long before Pascal's time) pattern of numbers in which each number within the triangle is the sum of the two numbers directly above it. The first row is simply 1, and the sides of the triangle are made up of 1's. So it grows like this: [1], [1, 1], [1, 2, 1], [1, 3, 3, 1], etc. Take a look at the Wikipedia article for a proper introduction.
As it turns out, if we generate Pascal's Triangle to 64 rows (or, more generally, to any 2^n rows) and then colour all of the odd numbers in, we get a bloody good representation of the Sierpinski Triangle! I'd think that if we continued this process infinitely (or to some very large n), you'd be unable to tell the difference between the two.
Writing a program to demonstrate this cool little trick was pretty simple. It's just a matter of generating the triangle as a list of expanding lists (one for each row) and then painting each row of numbers to the window by traversing it and selecting the correct colour based on each value (dark blue for odd, light blue for even). It looks like this:
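The row-generation half can be sketched like so (the colour symbols are stand-ins for whatever the graphics library expects):

```scheme
;; Build the next row of Pascal's Triangle by summing the previous row
;; against a shifted copy of itself: (1 2 1) -> (1 3 3 1).
(define (next-row row)
  (map + (cons 0 row) (append row '(0))))

;; Generate the first n rows as a list of expanding lists.
(define (pascal n)
  (let loop ((i 1) (row '(1)) (acc '()))
    (if (> i n)
        (reverse acc)
        (loop (+ i 1) (next-row row) (cons row acc)))))

;; Odd entries become dark blue, even entries light blue; colouring the
;; odd entries in approximates the Sierpinski Triangle.
(define (row->colours row)
  (map (lambda (x) (if (odd? x) 'dark-blue 'light-blue)) row))
```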
You can take a look at the source code here: http://github.com/buntine/Fractals/blob/master/pascals-triangle.scm
Fun with fractals #1
A few weeks ago I stumbled upon this rather eccentric documentary, narrated by Arthur C. Clarke, about the discovery of the Mandelbrot set and fractal geometry in general: Fractals - The colors of infinity.
A fractal, described very simply, is a geometric pattern that is repeatable at every level of magnification. Fractals, therefore, cannot be described by our traditional models of geometry. Typically, if you inspect a fractal at differing levels of magnification, you will see identical (or very similar) copies of the whole pattern. Of course, such phenomena have always existed, but it wasn't until the advent of sufficiently powerful computers that we were able to generate fractals on a large scale of complexity.
In nature, one of the most simple examples of a fractal-like shape is a tree. Using a recursive algorithm and a simple graphics library, we are able to easily generate a fractal tree. I chose the Lisp language PLT Scheme (or Racket, as it's known these days) to write my programs. Here is my first attempt:
Not bad. Well, perhaps a little too uniform. It works by implementing a recursive pattern known (not surprisingly) as tree recursion. That is, each invocation of the function triggers two further recursive calls. Each call to the function plots a new X/Y position by translating a radian (angle * (PI / 180)) and a length into cartesian coordinates. It then calls itself again with the new X/Y as a starting position and a new angle, once for (angle - 20) and again for (angle + 20) to simulate the splitting of a branch. This process is repeated a finite number of times and the length of each branch is relative to the current depth of recursion.
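The plotting logic described above can be sketched as follows, with draw-line standing in for whichever line-drawing primitive your graphics library provides (stubbed here so the sketch is self-contained):

```scheme
;; Stub: record the segments a real graphics library would draw.
(define segments '())
(define (draw-line x1 y1 x2 y2)
  (set! segments (cons (list x1 y1 x2 y2) segments)))

(define PI 3.141592653589793)

;; Convert degrees to radians: angle * (PI / 180).
(define (deg->rad angle)
  (* angle (/ PI 180)))

;; Each call plots one branch into cartesian coordinates, then recurses
;; twice with the angle bent 20 degrees either way. Branch length is
;; relative to the remaining depth of recursion.
(define (draw-tree x y angle depth)
  (when (> depth 0)
    (let* ((len (* depth 10))
           (x2 (+ x (* len (cos (deg->rad angle)))))
           (y2 (- y (* len (sin (deg->rad angle))))))
      (draw-line x y x2 y2)
      (draw-tree x2 y2 (- angle 20) (- depth 1))
      (draw-tree x2 y2 (+ angle 20) (- depth 1)))))
```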
By simply adding some randomisation into the angles of each branch (instead of always bending 20 degrees), we start to see a slightly more realistic looking tree! See my second attempt below, which draws four trees with some randomised angles to simulate a burnt-out looking forest:
The source code for both of these programs is available on my Github account here http://github.com/buntine/Fractals/blob/master/uniform-tree.scm; and here: http://github.com/buntine/Fractals/blob/master/random-tree.scm.
Note to self: smbclient
List the available shares:
$ smbclient -U my_user -L samba_hostname
Choose the share you want, and connect:
$ smbclient -U my_user //samba_hostname/Share_Name
If you get "Not enough '\' characters", do not just keep adding them! You probably want to shuffle your arguments around (such as putting -L last).
To turn off those annoying prompts on an mget, just toggle it:
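Inside an smbclient session, the prompt command flips per-file confirmation off (recurse does the same for directory recursion):

```
smb: \> prompt
smb: \> mget *
```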
Constant outages with Passenger 2.2.11
After pushing a large Rails app live recently, I was constantly struggling to keep Passenger running. Roughly every 24 hours, Passenger would simply hang and stop handling requests. The machine (VPS running CentOS 5, 2gb RAM, 4 CPUs) had sufficient free resources and memory. Restarting Apache would instantly fix the issue and flush a bunch of "Broken pipe" errors to the error logs.
I found that killing the Passenger process with a SIGABRT signal (6) would cause the process to write a stacktrace to Apache's error log. This was mildly useful.
As it turns out, I was able to solve the problem by finding and killing a couple of mysteriously dormant Ruby processes that Passenger had left running. Why these were still running I am not sure. The Passenger team claim that the issue will be fixed in Passenger 3, but in the meantime, keeping an eye on Ruby processes may be beneficial. My (moderately heavy traffic) app has been online for about 7 days without a hitch now. Hopefully it will stay up indefinitely.
It's also worthwhile to make sure Apache's MaxClients setting is at an appropriate level (MaxClients multiplied by the average process size should fit within available memory). My Passenger seemed to hang as soon as this limit was hit. Be careful, though, as it turns out that Apache will keep spawning new clients even if you are out of memory -- I found this out the hard way.
Here are some useful discussions: http://code.google.com/p/phusion-passenger/issues/detail?id=318 and http://code.google.com/p/phusion-passenger/issues/detail?id=199.
Czech metal I NEED
I've been collecting Czech heavy metal LPs for several years now. There are just a few titles that I am missing. Some of these are so rare that only a few people even know they exist. The Sax Pijak is not so rare, but I still haven't got it.
I've seen some of them in photos and others I have been told exist by Czech nationals. The only one I have never actually seen is the Merlin "Nevada" LP (edit: Found it!). If anybody has one of these, please let me know!
- Sifon - 100 høíchù LP (self, 1991) - FOUND 01/06/10
- Ultra Metal III LP (Monitor, 1993) - FOUND 13/05/11
- TÖRR - Kladivo na čarodějnice LP (Monitor, 1993)
- Alkehol - Alkehol LP (Monitor, 1992) - FOUND 20/05/10
- Alkehol - S úsmìvem se pije líp LP (Monitor, 1993) - FOUND 13/08/10
- Arakain - Schizofrenie EP (NOT LP) (Supraphon, 1991) - FOUND 23/03/11
- Alien - Kino noc a kino den/Cesta všech EP (Best I.A, 1992)
- Distorze - Distorze / Zrcadlo (Best I.A, 1990)
- Distorze - Fuck of War EP (Private, 1991) - FOUND 03/02/11
- Toxic Trash - Same LP (Pring/Private, 1992) - FOUND 10/11/10
- Tarantula - Peklo pro všechny LP (Tony records, 1992) - FOUND 31/08/10
- Merlin - Nevada LP (Monitor, 1991) - FOUND 17/08/10
- Sebastian - Sny o Marii LP (Monitor, 1991) - FOUND 05/07/10
- Mac Beth - Mac Beth LP (Best I.A, 1992) - FOUND 02/09/10
- Sax Piják - Válka nervù LP (Sacred records, 1992) - FOUND 01/08/10
- Kabát - Děvky ty to znaj LP (Monitor, 1993) - FOUND 09/08/10
- Pathologist - Putrefactive and Cadaverous Odes About Necroticism LP (M.A.B, 1993)
- Apolon - TVÁR SEBCA / ČACHTICKÁ PANI EP (Zeras, 1992)
- Skelet - DOUTNÁ ZLO / MOR HNĚDÝCH KOŠIL (Supraphon, 1987)
- Citron - Radegast LP (ZYX Metal, 1989) - FOUND 24/09/10
- Corona - Naivka/Muzu ti lhat EP (Private/P&R, 1993)
- Calibos - Calibos LP (Globus Intl., 1992) - Does this exist??
- Trezor - V srdci tvém/Pìsti EP (Best I.A, 1991)
- Hematit - Slabý boxér / Dvojka zo správania EP (Private, 1987)
- Triumf - Žádnej soucit/Chci to mít EP (Supraphon, 1989)
- Triumf - Věčnej boj/Tisíc tváří EP (Supraphon, 1989) - FOUND 25/10/10
- Sing Sing - Sex Attraction LP (Multisonic, 1990) - FOUND 15/01/11
- Kern - Women Hunters LP (Supraphon, 1991)
- Kern - ZTRÁTY A NÁŘEZY LP (Tommü, 1993) - FOUND 10/04/11
- Caravela - Self EP (Private, 1994)
New firefox extension: Quick ROT-13/47
ROT13 (that is, ROTate by 13 places) is a very simple substitution cipher that is often used in online forums, games and chat rooms. The idea was invented by Julius Caesar over 2,000 years ago (it was ROT3), but has been popular online since the '80s.
It works by rotating each character in a string of text by thirteen places in the alphabet, wrapping around to the beginning if necessary. For example:
Hello --> Uryyb --> Hello
As demonstrated above, one of the cool things about this simple technique is that it is its own inverse. That is, applying the same filter twice will yield exactly what you started with. This, of course, makes it totally useless as a production-level encryption algorithm, but very handy for non-critical uses such as games and puzzles.
A very similar alternative is ROT47, which applies the same concept to a larger range of characters in the ASCII set (94 to be exact: 33 to 126). This allows us to include several additional characters such as exclamation marks, commas, etc.
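Both ciphers boil down to a couple of lines. Here is a JavaScript sketch of the substitution logic alone (the add-on itself also involves the Firefox extension machinery, which is omitted here):

```javascript
// ROT13: rotate alphabetic characters 13 places, leaving others alone.
function rot13(str) {
  return str.replace(/[a-zA-Z]/g, function (c) {
    const base = c <= "Z" ? 65 : 97; // 'A' or 'a'
    return String.fromCharCode(((c.charCodeAt(0) - base + 13) % 26) + base);
  });
}

// ROT47: rotate the 94 printable ASCII characters (33..126) by 47.
function rot47(str) {
  return str.replace(/[!-~]/g, function (c) {
    return String.fromCharCode(((c.charCodeAt(0) - 33 + 47) % 94) + 33);
  });
}
```

Because 13 is half of 26 (and 47 half of 94), each function is its own inverse.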
I've written a Firefox add-on, which allows you to select some text on a webpage, right-click it, and then generate the ROT13 or ROT47 substitute. From here you can re-apply the cipher and/or copy the new value to your local clipboard.
Please check it out and let me know of any problems you run into. Thanks!
Official download: https://addons.mozilla.org/en-US/firefox/addon/125128/
Or view/modify the source code: http://github.com/buntine/Quick-ROT13/
An assembler for the Bolverk machine emulator
After I finished writing the Bolverk emulator (source, implementation) in early 2009, I instantly had ideas about designing an assembly language for its native instruction set. It sounded like an interesting project.
After reading the first few hundred pages of Michael L. Scott's brilliant book "Programming Language Pragmatics" during February of this year, I felt that I had enough knowledge about writing simple compilers (most assemblers are not really considered compilers, but shhhh) in order to get started. In fact, the whole process turned out to be far more challenging and rewarding than I would ever have guessed before.
An assembler provides a level of abstraction above the native language of the machine at hand. Most basic assemblers provide a one-to-one mapping between named instructions and native commands (generally ones and zeroes). Many more modern assemblers also support more advanced features like macro expansion and control structures over basic JUMPing.
With an assembler, we can express instructions to Bolverk in at least slightly more memorable terms. This is the reason the primitive procedures defined in an assembly language are often referred to as "mnemonics".
Another great advantage of an assembler is its ability to define arbitrary abstractions that defy the architectural restrictions of its target machine. In particular, the Bolverk machine specifies that only 4 bits are used to identify an instruction. This leaves us with only sixteen (0..F) possible options. If chosen correctly, these sixteen natives will be more than enough to define useful abstractions by compounding and naming them. As an example, the PVDS mnemonic used in the example below will actually compile to 3 lines of machine code.
With the finished product, we can now write something like this:
-- Print the sum of two decimal numbers (-10 and 120)
-- followed by an exclamation mark.
VALL 1, -10
VALL 2, 120
PVDS 1, 2
PVCH '!'
Instead of the very-hard-to-remember machine code equivalent:
21f6 2278 2421 5123 33a0 34a1 d1a0 d0a1 c000
The assembly version required only four instructions as opposed to the nine required in the machine code version. It was also presented in a much more readable manner.
Once we know the following semantics:
- VALL: Load into register X the value Y
- PVDS: Print to STDOUT the value of the sum of register X and register Y
- PVCH: Print to STDOUT, the value of X as though it's an ASCII character
The assembly version becomes very easy to understand.
The source code is openly available from my git repository here: http://github.com/buntine/Bolverk-Assembler. I will add a front-end for it into the web-based implementation of Bolverk (linked above) in the coming days. I've also published the grammar and a language specification in the same git repository.
I feel that my work on this set of projects could really be beneficial in teaching the ideas behind a Von Neumann machine to beginning computer science students. If found by the right person in the right position, I think they could do wonders with it. If you are that person - please send me an email or some feedback! I'd happily donate all of my work to your organisation if it can be put to good use.