mercurial basics

So I’m teaching a class this semester, Introduction to Programming for Non-Majors. One of my goals is to familiarize the students with version control, a method of tracking changes made by one or more people to a group of files. The specific version control system we are using is called “mercurial”. Mercurial is an open source project, written in the programming language Python. It can be accessed from the command line on the mac or on a linux box.

For windows, TortoiseHg is an application that contains its own version of mercurial, python, and some modifications to windows explorer. If you are running windows, you should install TortoiseHg and look at their documentation for instructions on how to accomplish the commands in the rest of this post. The concepts are the same, only the interface changes.

A mercurial *repository* can be viewed as two components: a special file that tracks or makes changes to a specific folder, and a bunch of commands that let you interact with that file. The special file has to be inside the folder it is tracking, which means if you copy the folder from one computer to another, you copy the entire repository.
However, this process has to be done correctly, which is where the concept of CLONING or PUSHING comes in. You use the command:

hg clone [what you want to clone] [where you want to put it]

To copy a repository, and the command

hg push

To update the original folder with your changes.

To make changes to the repository, you can use the commands ADD, REMOVE, and COMMIT.

If I create a new file in the repository, or copy a file into the repository, the special file doesn’t automatically know what to do with it. The command

hg add [file you want to add]

tells mercurial to start tracking that file and make note of any changes that are made.

The command

hg remove [file you want to remove]

tells mercurial to STOP tracking a file, and ignore any changes.

hg commit -m 'some commit message' -u 'someuser'

saves the changes you have made to the special mercurial file. It creates a new version of the repository, like a snaphot in time of what all the files that are being tracked look like at the time you commit.

So the basic order of events is:

Step 1: create an online repository
Step 2: CLONE it to your hard drive
Step 3: place or create text files in the folder you created when you
cloned the repository.
Step 4: ADD those files to your repository
Step 5: COMMIT those changes
Step 6: Repeat steps 3-6 as needed
Step 7: PUSH your changes.

Repeat steps 3-7 as needed.

stupid *nix tricks

I am not a *nix guru. I am in fact, a long, long way from being a *nix guru. But I really enjoyed using a little trick I cooked up while grading today, that goes like this:

lp -d "PRINTER_NAME" */commonly_named_directory/*

See, I put every students submision under their name, then in a folder for the assignment (with my name) then their file name (which varies). So that piece of BASH there prints every file in the assignment folder from every student in the directory. How cool is that?

Okay, so it isn’t that cool. But it saved me a few minutes.

just the lexer, ma’am

So in my compilers class I’m supposed to write a recursive descent parser for the WAE grammar. To remind you, this grammar looks like this:

WAEStart ::= WAE SEMI
WAE ::= num | {+ WAE WAE} | {- WAE WAE} | {with {id WAE} WAE} | id

I’ve created this in python, using the PLY parser. That code has scoping issues, and is available on google code. I’ve also built this using treetop, with some modifications on the unary minus (a new rule, WAE::= -WAE , and that’s available from google code, bitbucket, and github.

But I also wanted to try to build both a recursive descent parser and possibly a packrat parser by hand, and maybe using a combinator, so I’m working on another version of this grammar using rparsec. I chose rparsec because I needed a lexer with simple, easy to understand, useful output. Possibly, there are a dozen rubygems that solve this problem, but rparsec is the one in “Practical Ruby Programming”, so that’s the one I used.

I also wanted to test out the pygments tool, which is a syntax highlighter that converts code for display in various formats. So, using pygments, my lexer looks like this:

require 'rubygems'

require 'rparsec'

module WAEParser
  include RParsec
  extend RParsec::Parsers

  Id = regexp(/[a-z]/)
  Num = number.token(:number)

  Reserved = Keywords.case_sensitive(%w{with exit})
  Ops = Operators.new(%w{+ - \{ \} ;})

  Lexer = longer(Id.token(:id) , Reserved.lexer) | Num |  Ops.lexer

  Lexeme = Lexer.lexeme << eof
end

The method Lexeme.parse(string) produces the array of tokens.

WAE Grammar in Treetop


So an interesting project that I undertook this semester was to use the Treetop parser to create a grammar for a simple language. The code as of this writing is available on bitbucket, version 49dfaf799f82 is the tip right now.

One of the problems that I ran into when I was writing this grammar was a pair of error messages that appeared when I attempted to split the .treetop file away from an object behavior. One error was this one, private method `eval' called for nil:NilClass (NoMethodError), which was incredibly frustrating.

This turned out to be a relatively simple fix. The parent of the expression being evaluated may need to have a method named ‘eval’ as well, that calls the eval method of the child object.

What this means is that if you have three different leaves defined like this:

rule waestart

wae ';'

end

rule wae

id / num / expression

end

rule num

"0" / [1-9] [0-9]* {

def eval(env = {})

text_value.to_i

end

}

end

then if you try to parse a number, you’ll get the error private method `eval' called for nil:NilClass (NoMethodError). What I noticed was that if deleted the top rule, I stopped getting the error… or rather, what I noticed was that if I added the top rule, I started getting the error. Because of my process, this wasn’t immediately clear.

My fix for this was to change the top rule to this:

rule waestart

statement:wae ';' {

def eval

statement.eval

end

}

end

I’m not positive why this works, but I’ve had to employ this fix for at least one other function in my parser. It looks like methods can only be called by parents, not grandparents. By binding the value of a waestart eval to the statement, we create the same situation that we would have if our start token were wae instead of waestart.

mercurial

I’m taking compilers this semester, and we’re storing our code on google code with mercurial as our version control system. I like mercurial a lot, it it easy to use, fast, flexible, and sort of fun. We’re branching and merging way more than is reasonable, considering that there are only two developers and 3 files.

But I did find one thing frustrating, and that was the merging process. If I merged two files, I sometimes had to go in and edit those conflicts by hand, rather than using a convenient tool like the FileMerge application that comes in the Mac Developers kit.

After some searching, I found out how. All of the information is on selenic, but this is the brief version.

  1. Find out what folder stores the binary. Open a terminal and enter “which hg”. If this is the /usr/local/bin/, you will need to create (unless you already have one), a file in /usr/local/etc/mercurial/ called ‘hgrc’.
  2. In some folder on your path put the following shell script, named opendiff-w
    #opendiff returns immediately, without waiting for FileMerge to exit.
    # Piping the output makes opendiff wait for FileMerge.
    opendiff "$@" | cat
  3. make sure that opendiff-w is executable. Try chmod 744 opendiff-wNow, in your 'hgrc' file, write this: [merge-tools]
    filemerge.executable = /path/to/file/opendiff-w
    filemerge.args = $local $other -ancestor $base -merge $output

That's it. The next merge that requires your attention should call FileMerge!

matrix multiplication

I never can remember exactly how to multiply matrices. So, I’m going to explain it to you, and hopefully, that will explain it to me.

One of the first questions you might have is: “What is a matrix, and why would I want to multiply it?”

When we talk about a matrix in computer science, and probably in engineering, physics, chemistry, and probably in a movie starring Keanu Reaves and an absurd quantity of latex, we’re usually talking about an array of numbers. Like this:

1010
1001
1111

That’s a 4×3 array of numbers, a two dimensional matrix. A matrix is just an m dimensional group of numbers.

Typically, in comsci, we start labeling dimensions at i. So the array in Fig. 1 has two dimensions, i and j. If it were a three dimensional array, I would add a dimension k and so on.

A one dimensional matric is called a vector.

So why do we want to multiply matrices? There are a lot of reasons, but the one that is simplest for me to keep in my head is that a point on a screen is represented by a matrix, and to move that point through a rotation, translation, or scale, you multiply it times a transformation matrix. In other words, if you are playing a video game, using an illustration program like inkscape, or rendering an animation, there’s a whole lot of matrix multiplication going on!

Because matrix multiplication is fundamental to so many problems that computers have to solve, it’s a popular problem for discussing algorithm design and models of computation. It acts as a baseline measure, much like sorting. In parallel computation this is very important, because there is no settled and accepted general model for parallel computation and because parallel computation for graphics processing is so common. Your computer almost certainly has a graphics card in it processes in parallel.

And that’s why computer scientists should know how to multiply matrices. Because otherwise, you can’t talk intelligently with other computer scientists about important things, and you won’t understand papers that discuss those problems either.

There’s a great scene in Stand and Deliver in which James Olmos recites “A negative times a negative is a positive, a positive times a positive is a positive, a negative times a positive is a positive” and all the kids chant with him. After a few cycles, he asks “Why?”

The rule for vector multiplication are: a vector times a vector is a matrix. A vector times a vector is a scalar. A matrix times a matrix is a matrix.

This is confusing. The confusion is because there are two ways to multiply vectors in linear algebra, the inner product (also called the dot product) and the outer product. So, given an vector x and a vector y, there are two ways we can multiply it together. x · y will yeild a single (scalar) value. x × y will yield a matrix value.

Given two vectors of length n, such as

[ 1 2 3 4 5 ] [ 6
7
8
9
0 ]

How do we multiply them together? If we want the dot product (x ⋅ y or xTy), we do this: (1 × 6)+(2 × 7)+(3 × 8)+(4 × 9)+(5 × 0), or 6+14+24+36+0, or [80]. A vector times a vector is a scalar.

The outer product is a little different. When we take the outer product, written as xyT or x × y, we would get a matrix:

1*6 1*7 1*8 1*9 1*0
2*6 2*7 2*8 2*9 2*0
3*6 3*7 3*8 3*9 3*0
4*6 4*7 4*8 4*9 4*0
5*6 5*7 5*8 5*9 5*0

Which brings us to the question of multiplying matrices. So, if you multiply a matrix times a matrix, you don’t get a super matrix, or a vector… you just get another matrix. How does this work?

Well, if you have a matrix with i rows and j columns, you can multiply it times a matrix with j rows and k columns to get a matrix with i rows and k columns. So to find the value of a cell i,kin that matrix, you just take the dot product of the ith row with the kth column.

Easy as pie.

Why this is so… honestly I’m not sure. The theory of linear algebra is a closed book to me. But I think I’ll remember this for qualifiers now.

consciousness

Some time ago I read Susan Blackmore’s excellent book, The Meme Machine, and while I can’t say that it changed my life, I will say that it changed the way that I understand my life. And yours, for that matter. Of course, this change is difficult to articulate and to put into effect, and its compounded by a few other books I read at the time.

This is just as I was starting my Graduate Studies, and I read Complexity. And it worked really well with Blackmore’s work, to the extent that it just sort of slotted in what I had been thinking about what she had been writing with what Stuart Kaufmann and various other people had been on about for the better part of a decade. In other words, I realized I was right about something.

And I think I’m fortunate that I was right about this at the beginning of my graduate studies rather than later, for reasons that will become apparent shortly. But first, I want to talk about the halting problem.

The Halting Problem is one of the basic principles of computer science, and it has to do with the theoretical limits of the computer you’re using to read this. Basically, the problem is this: given a web browser, and a web page, will that page ever stop loading?

This seems like a trivial problem from one point of view. For example, the W3C markup validation service can assure you that a particular page is correct, and has no open tags, and is a properly constructed web page. And you would think that therefore, the page will load in a browser.

But of course, every page has the potential to contain some arbitrary commands in PHP or Javascript or other langauges, and while those programs may be correctly constructed, they may or may not ever finish loading. It just depends on the content of the script.

It seems like it should be easy to find this out. After all, if you can scan the program to make sure that it doesn’t contain any open tags, you should be able to find out if it has an ifinite loop in it. And for trivial situations this is true, and most modern coding tools will actually warn you if you create a loop with no end conditions.

But for non-trivial cases, there is the halting problem.

The halting problem was a big deal back when Turing introduced it. Like Gödel’s incompleteness proof, it was revolutionary. People at the time assumed that the limits of human understanding and the limits of our formal inventions, like mathematics, were of the same order. And that just turns out not to be true. There are things that we just can’t do.

But to the next generation of scientists, this is just part of the way the world is. We can’t write a special tester that will look at web pages and know whether they load before actually loading them in a browser. Fine.

I had a point here, that has to do with expectations and exploration, but I need to come back to it tomorrow.

presentations

Something that seems strange to me in presentations is that I have a tendency to read off the wall, instead of off the screen in front of me. Strange habit, and one I need to break.

Anway, I gave a presentation friday to the DIMOS group that went very well. The BSR implementation that we are contemplating seems more plausible, not less, as we examine it more closely and do more work. That’s always a good sign.

Had a very long conversation with my advisor concerning the precision differences between chaotic and complex systems. This is not a particularly subtle distinction, actually, but it is an esoteric one. If you don’t specialize in either, chances are you don’t know how to define either.

Unfortunately for me, I don’t actually understand either particularly well, so my formal definitions for both are confused. Fortunately, Mark Chu-Carroll of Good Math, Bad Math is writing an excellent series of articles on this topic. Based on those, I think that the immune system, which is one of the three systems that we’re simulating for my dissertation, is chaotic: there’s periodicity in the system, it is sensitive to initial conditions… although how much is probably a big question, and the entire system depends of neighborhood overlap, that’s how information propogates.

But what is a complex system? Melanie Mitchell, in her book, Complexity: a Guided Tour and Flake, in The Computational Beauty of Nature mention some basic properties that are interesting.

  1. Homogeneity: A system composed of a large number of similar parts with relatively simple behavior.
  2. Emergence: The interactions of simple components give rise to complicated, difficult to predict behavior.
  3. Computation: A complex system transmits and processes data within and outside the system

Mitchell adds the concept of adaptation to this list, but Flake differentiates between complexity and adaptive complexity.

All of which leaves me wondering about the Venn Diagram… are all complex systems chaotic?

as many b’s as a’s






This is the first half of a proof that is the first question from the may qualifiers. Tougher to write up than I thought it would be.

Let L be the language defined by the following recursive definition:

  1. Λ and b are in L
  2. If x is in L, then axb and bxa are in L
  3. If x and y are in L, then xy is in L
  4. No other strings are in L


Give a simple non-recursive definition of L. Give a formal proof that your answer is correct.


2  Solution

Definition 1  A:
The language of all strings over
{a,b}*, where the number of bs is greater than or equal to the number of a’s, that is, where |b|≥|a|.

To be proved: L = A.

This will be proved in two steps.


  1. LA
  2. AL

Proof.LA

We proceed by structural induction. Note that for a string x, |a|x is the positive integer number of as in x.

Basis Step: Λ and b are in A. This is obvious, both are strings over {a,b}* | |b|≥|a|

Induction Hypothesis x, yA

Statement to be shown in induction step: xy, axb, bxaA, that is, applying any production rule for L to a string in A produces another string in A.

Proof of induction step:

Theorem 1  
If
xA and yA then xyA.

We know that for x, |a|x ≤ |b|x and for y, |a|y ≤ |b|y. By the rules of concatenation, xy has |a|x+|a|y, |b|x+|b|y.
By properties of integers, |a|x+|a|y ≤ |b|x+|b|y, so xyA

Theorem 2  axb, bxaA


x has |a|x ≤ |b|x by definition. axb or bxa both form new strings with the property that |a| = |a|x+1, |b|=|b|x+1. By properties of integers, |a| ≤ |b| → |a|+1 ≤ |b|+1

Therefore xy, axb, bxa are all in A, and LA, as was to be shown.







This document was translated from LATEX by
HEVEA.

Slow

It’s been slow going on the academic front lately. I spent some time designing a flyer for the Atlanta Skeptics, spend some time on MPI programming, and read a few sections of The Computational Beauty of Nature, which is an excellent book and a great introduction to complex systems.

Mostly I’ve been thinking, more philosophy than computer science. Philosophy is the pornography of the intellect, a cheap visceral thrill that ultimately doesn’t satisfy any real needs. It’s easy to think about the economy, or society, or even health care, and to come up with plans that work in your head.

But people tend to skip the step of developing realistic models that will help them understand whether those plans will work in the real world, or to at least sketch the probable points of failure.

When I was in art school, I ran into this a lot. I’d look at some completed piece and think how much better it looked in my head, before I had to work through the details.