Log in

No account? Create an account

End of Line
Well, I've run out of month, if not out of languages. I guess I failed at my goal; still, the exercise was definitely worth doing. I'm definitely going to finish the one I'm currently working on, in C. Doing it continuation-machine style in C was probably a mistake in terms of complexity -- I suspect it's a good order of magnitude worse than any of the previous ones.

At this point I have the general structure complete -- I have a 'step' function that takes a (continuation, expression, environment) tuple, and produces another one of same. My continuations are structured as chains (stacks) of frames, with each frame being specialized to a particular in-progress operation -- either an application or one of the special forms (progn, cond, etc.)

The state tuples and continuation frames are mutable, and I should probably be manually memory-managing them but I'm currently just letting them become garbage. The s-expression structures are immutable (that is, I never mutate them), and are meant to be GC'ed. (When a continuation is captured or thrown to, I deep-copy it; a continuation embedded inside a sexp is immutable and gc-able.)

Since I'm off the clock now, I think I'll go ahead and finish it with a parser and a simple garbage collector.

After that I may continue to do some of the other languages I didn't get to; we'll see. There are definitely a few left I'd like to do.

Intermezzo: Small-stepping environments
I am currently working on an interpreter written in C, with the intention that it should support continuations and call/cc. I am doing this with continuation-machine semantics, which means small-step evaluation. In the process of figuring out how continuations interact with environments, I think I've discovered that my small-step evaluator in SML was broken. It did not make use of substitution; instead it used an environment. I'm convinced at least that my implementation as written is broken: On function application, it turns out to throw away the existing local environment and replace it with the environment that should exist inside the function. But it never restores the old environment when it's done evaluating the function body -- how can it? The old environment's gone!

I don't think it's possible to do this correctly without extending sexprs to hang saved environments off them (messy), or keeping a separate environment stack off to one side (probably doable.)

In the one I'm implementing now, though, I think continuations save me -- if the environment is a property of the current continuation, then as we build and pop stacks of continuations we will correct save and restore the environment. I think. I'm working through it now...

Interpreter 7: Twelf!
I am pretty excited about my Twelf Lisp interpreter. It slices, dices and even parses! I decided to do a parser because it seemed like it would be easy with logic programming (free backtracking!) I am taking slight advantage of the logic-programming features for the list case of the parser, but mostly I ended up just writing in a functional style (more on the problems that caused later.)

NotesCollapse )

The rest of it
So other than that, this was a pretty straightforward exercise. Although I wrote almost exclusively in a functional style, the one place I _did_ take advantage of backtracking -- when parsing lists -- was great, and it would be wonderful to be able to write mostly-functionally with a bit of free backtracking search thrown in. (Cue the Haskellers. Or the Racketeers I guess. Or anyone else whose favorite language has this facility. ;-)

A few general notes:

I have now used at least twice a "standard approach" for implementing first-class object-language primitives without first-class metalanguage functions, which is to represent them by stand-in tokens (Prim "foo"), which pass around in a first-class way, and then put special cases for them in the apply function. I suppose in retrospect this is the obvious approach -- in the same way that it's obvious to represent closures by pairs of function bodies and environments, without using metalanguage closures -- but it took me some thought to come up with.

Extending my demand from the other day that languages should have built-in prettyprinting and built-in logging: I extend it to demand built-in serialization and deserialization of positive types, and built-in structured logging built on that. (Insert joke about shooting hostages if my demands are not met.) [This demand is not related to working in Twelf; it is related to some other thing I was reading about the usefulness of structured logging.]

I had a good two more instances of "forgot a prime on a variable name" bugs. Whoops. I also had a few "this is hopeless and I don't understand what's going on so I will rewrite the whole thing" bugs, all while trying to do parsing in fancy ways, all of which were before I understood exactly what order the search happens in.

Overall: That was actually really fun. But it took too long... ;_;
Source code: here.

Interpreter 6: Perl
Okay, so I went ahead and wimped out and did an easy one because I'm behind. Perl's a language I already know well, and I did basically a straight-up port of the Pike version. They turn out to be very similar languages! There are some interesting points of comparison between them.

NotesCollapse )

More general notes:

Why doesn't any language seem to have a 'multimap' function, which maps over N lists in parallel? I guess the typed languages don't have it because you can't type it without at least dependent types (and I think you also need list kinds if you want it to be heterogeneous), and the non-lisp languages probably because they just don't like that kind of higher-order nonsense anyway. But why doesn't lisp have it? (I bet it does, and I just don't know where it is.) Something like this would be ideal in the functions-on-blocks-pretend-to-be-syntax languages, a-la:

for list1, list2, list3 {|x1, x2, x3|
  return {foo: x1, bar: x2, baz: x3};

It would be like a super-duper-zip.

I've mentioned this in previous posts, but I have trouble stating forcefully enough the importance of including debugging aids in language infrastructure. Logging should be built-in. Tracing (i.e. at the least being able to log, if not execute arbitrary code, on function calls/returns) should be built-in. Error signalling of some kind (exceptions or "die" or what have you) should be built-in. For the love of god, backtraces on error reporting should be built-in. A stringification operator should be built-in (who cares if it's a shitty one, for debugging purposes any unambiguous representation will be fine.)

Bugs I discovered in the previous version while writing this version: Forgot to evaluate the exp in "(define exp)". Oops. Also put a period where I needed an arrow. I think the fact that this wasn't an error leads me to a principle of language design: The more denser your syntax (and hence the higher the probability that any stupid thing will fail to be a syntax error), the more you need types for protection against stupid things that fail to be syntax errors.

Interesting implementation fact about this interpreter: Since none of my primitive functions actually require the context, I forgot to pass it to them. This means I couldn't write set. Since the setq builtin sort of obsoletes the set primitive, I just left it out...

Code: here.
Next: I don't even

I pretty much banged this one out at top speed to try to reduce my behinditude, but it is still about three days. I'm not sure if I should do a legit one next, or another fast one, but whatever it is, I'd better start it quick.

Interpreter 5: Pike
Definitely downshifting to one every two days, so 15 interpreters for the month. It's a much more feasible speed. (And I'm already a day behind it! But I have a hope of catching up this time. ;-)

NotesCollapse )

It's interesting to note that a metalanguage-level "map", even with mutation, is not sufficient by itself to create a mutable environment that can be captured in closures. We need a structure that lets a closure shadow _some_ old mappings, but still "see through" to other old mappings; thus my class Env which performs this function. I rewrote to that about halfway through implementation.

Code: here.

Interesting bugs: Converting from environment map to environment class, forgot to change how I checked presence in the mapping (for maps there is a nasty builtin implementing "does this returned zero have the magic 'not in the hash' bit set on it, or is it a real zero?"); in eq primitive, returned metalanguage truth (zero or one) instead of object language truth (empty list or anything-else -- zero is true in lisp); in minus primitive, implemented (- x y_0 ... y_n) as "x - x - y_0 - ... - y_n" instead of "x - y_0 - ... - y_n", i.e. I mapped over too much of the arg array.

Next: Maybe Scala? (or OCaml, Go, Forth...)

Day "4": SML/NJ
Welp, I think I jinxed it when I declared that it should be quick, since this one took about 3.5 days.

NotesCollapse )

I discovered a couple more bugs in my version of lisp. I remembered that "cond" is supposed to have what's called an "implicit progn", which means that each condition/expression pair is supposed to take an arbitrary number of expressions, but I only take one. Pretty minor, and I haven't bothered fixing it.

I did change my implementation of scoping to be more Lisp-like; bindings made with define now go into the global scope, which is magically-dynamic, whereas all function parameter scoping (and by extension let scoping, if I had let) is lexical. This is done because it allows for things like free mutual recursion, and redefinition of existing functions in the interactive environment.

I had one very interesting bug in my implementation: I decided to use SML lists for Lisp lists, which raises some adequacy issues. In particular, because Lisp cons is not typed, Lisp allows the tail of a cons to be a nonlist, which is of course impossible in my representation. I consider this an okay tradeoff for compactness of code. But, when I made this switch, I took out CONS but left in NIL, so I had both NIL and (LIST []) representing the same thing. So all my cases had NIL in them, but code that recursively emptied lists would bottom out at (LIST []) instead. Oops.

SML's errors are really insufferable. Honestly? They're worse than C++'s. And that's terrible. The random and inconsistent expansion of type aliases in unification failure errors means I end up having to copy-paste the errors into an editor to figure out where the problem is.

Overall experience: This one really dragged, for some reason. It ended up being twice the length of the Haskell one, I guess because of the small-step semantics (I haven't compared them.) I may be less masochistic with the next one, just to get it done.
Source code: here.

Future plans: I am considering dropping the rate to one interpreter every two days, which appears to be more realistic, and is about where I am anyway at this point. Opinions welcome in the comments.
Tomorrow: Maybe Pike for real this time. Or maybe something weirder. Forth?

Day 3: C++ template metalanguage
Oh dear lord.

For those of you who aren't aware, C++ has a non-parametric form of type polymorphism implemented with a system called "templates". As a long-time C++ programmer, I'm pretty familiar with templates; in particular, I'm aware of the fact that the template language can do Turing-complete computation at the type level, at compile time. (I've written an untyped lambda calculus evaluator and an SKI combinator calculus evaluator in it before.)

Further notesCollapse )

It's a bit like Groundhog Day, doing the same exercise in a different language each day, and finding some of the bugs from the previous day every time. Bugs unearthed this time: I was missing 'cons' yesterday. There was a bug in 'define' yesterday which included an extra layer of list around variable definitions.

I was cross with myself for implementing primitive functions as cases in 'eval' the last two days, and wanted to make them real functions instead. (In fact, thinking back on it, that means they haven't been first class! Oops.) Today I made them first-class, but ended up having to implement them as special cases in "apply" instead, because the underlying language does not have first-class functions! (Oops again.)

In Haskell yesterday, it was nice to be able to use laziness to implement recursive functions -- I defined the function in an environment that contained itself, and laziness handled the self-reference. By contrast, today I used backpatching, taking advantage of the "mutable" heap I threaded through everything.

The way I've been handling "define" is not the same semantics as a real lisp, and the semantics I've given it are kind of weird. They're lexical, but the scope they're valid in extends outward to the boundary of function bodies, or everywhere at toplevel. So you can "define" inside a progn, as expected (otherwise you could never do anything at all.) You can also define inside, e.g., an argument to a call to cons.

That's kind of acceptable, on reflection, but lisp normally treats "define" as being a dynamic setting of a name bound in a global scope, and I might switch to that in the future.

Overall experience: Painful, but awesome.
ETA: Major bugs: forgot 'typename' in a lot of places; forgot ::r_val (like ::result above) in a couple places; changed some types around and forgot that template instantiations do not consider e.g. "char[]" and "char*" to be the same type.
Source code: Here.
Amusing C++ template error messages (note the huge backtrace!): Here.
Tomorrow: Well, since "tomorrow" is today because I'm late with this one, maybe SML, since that should be really fast.

Day 2: Haskell
Day 2 language: Haskell!

Commentary!Collapse )

Find the source here. (Check that directory for yesterday's source, too, and my Git repo (in a hidden .git directory).)

Tomorrow: Pike, perhaps?

EDIT: I've added links in my profile to the intro post and the list of languages. I'm keeping the list of languages updated as I go along.

(no subject)
Writing Day 2 in Haskell. In the process, I realized that Day 1 is not actually complete -- it doesn't provide for defining recursive functions. So it's not self-hosting. Oops. (This would be why the version of 'define' that defines functions is not just lambda plus the ability to bind globals -- the latter doesn't give you recursion.)

Not sure whether I should fix Day 1 now that it's done. I will at least endeavor not to have the same bug in Day 2. :-)

As a note to myself, but also to you my readers, here are some languages I'm considering:

Huge list of languagesCollapse )

For some of the languages in that last list, I don't even know whether it's _possible_ to do a Lisp interpreter in them; for others I doubt my ability to learn them well enough in a day, or to express a Lisp interpreter concisely enough to write it that fast. I also subsumed APL under J, because I doubt my ability to produce the APL character set.

EDIT: I have gone through and noted for which languages I have interpreters/compilers installed on my system. I will update the list as I install more.