Note to self: NAT'd SSH connections

23. April 2012, 17:21

Here's a simple way to "alias" SSH targets that are behind a NAT and would require you to ssh into a proxy on an unusual port (which of course you'd have to remember).

At our company, we have to SSH into a client server via port forwarding on their gateway, like this:

ssh gateway -l root -p 1349

Which is, of course, quite troublesome.

Fortunately, SSH (in combination with netcat) can be configured to work around this. You can even make a pretty "alias" for that connection.

Add the following lines to your .ssh/config file:

Host my_alias
ProxyCommand nc gateway 1349

Now, you can simply connect to the NAT'd server like this:

ssh root@my_alias

Done!

Dave

,

Comment

---

Writing a programming language: Implementing the Sieve of Eratosthenes on DVM01

9. April 2012, 23:04

As promised in my previous post, I'm now writing about my next program for the DVM01 - the Sieve of Eratosthenes. It's a fairly simple algorithm to find prime numbers. Read up on Wikipedia if you don't know it.

Okay, so how do we implement this in DVM01? Since we (currently) have poorno support for memory management, the first thing I thought was that we need some agreement over how to do the marking. Usually, you set up a bunch of memory first, initializing it. Better implementation don't use a list of booleans, but use every bit separately, thus saving a huge load of space. We won't do that, because it makes the code more complex. Also, we won't initialize the memory, as the DVM01 does this for us (every cell is zero at first access).

Then, "traditional" implementations do the output and calculation in the same loop: If your cell has not been marked as non-prime, output it before increasing the step size. For simplicity, I've decided to rip out the print loop and write it separately. Other than that, the implementation follows the normal rules of the algorithm.

So, how does it work?

Okay. The first step is pretty easy - It's a NOP slide initialisation to make space for some faux "registers".

START # 0: contains program counter
NOP   # 1: Points to the beginning of the data
NOP   # 2: Largest number to check
NOP   # 3: Current step size
NOP   # 4: Pointer to currently evaluated cell
NOP   # 5: Start address of outer marker loop and print loop
NOP   # 6: Start address of inner marker loop
NOP   # 7: Conditional accumulator for loop termination
NOP   # 8: Counter for printing

SET 1 100 # Marks the beginning of data section. TODO: calculate dynamically
          # Note: this has been determined by using DUMP. When the program
          # expands, this MUST be increased, or you WILL get ugly results :)

SET 2 100 # Biggest number to check for. Well, approximately, as we're not
          # very careful about when to stop (allows us some simplifications)

 

Pretty easy, huh? Well that was only the initialisation, so don't yawn too early! The next step is a nested loop: The outer loop basically increases the step size and re-sets the current-cell pointer. The inner loop walks the "data area" and marks every seen number (which is a multiple of the step-size) as non-prime. Then, only some small-god magic is required to figure out whether to terminate the loop or not.

# Begin outer marker loop (counting up the step size).

SET 3 2   # Step size, begins with 2
SET 4 @1  # Pointer to currently evaluated cell.

SET 5 @0   # Marker for outer loop

    SET 4 @1   # Start out at position zero
    ADD 4 @3   # Add current-step-size to our pointer.

    # Begin inner marker loop (do the actual marking)

    SET 6 @0   # Marker for inner loop
        ADD 4 @3   # Add current-step-size to our pointer
        SET @4 1   # Mark position as non-prime

        # Figure out if we need to re-run inner loop:
        # If pointer is bigger than position of largest number
        # to check, we abort the loop.

        SET 7  @1     # beginning-of-data
        ADD 7  @2     # add number range
        JGT @6 @7 @4  # Re-run loop if upper bound (@7) > current-ponter (@4)

    ADD 3  1 # Increase step-size

    # Figure out if we need to re-run inner loop:
    # If step-size is bigger than half of the number range,
    # we abort the outer loop.

    SET 7  @3     # 7 is now step-size
    MUL 7   2     # 7 is now step-size * 2
    JGT @5  @2 @7 # Re-run loop if number-range (@2) > step-size * 2 (@7)

Okay, now that we have the primes sorted, it's only a question of telling the user about our discovery. Not very hard:

SET 5 @0

    ADD 8 1       # add 1 to output counter
    ADD 4 1       # Evaluate next cell
    COPY 7 @4     # De-reference pointer-to-current-cell, to check it's content
    JNZ @5 @7     # Restart loop if current-cell is nonzero (it had been marked)
                  # Note that this only works because uninitialized memory is always zero
    WRITEI @8     # Output number in register 7 - it's a prime number.
    WRITEC 10     # Newline

    JGT @5 @2 @8  # Restart loop if number-range (@2) > our counter (@7)

# End program

END

Of course, you can grab a copy of all examples, plus the implementation of the VM on Github:

 

Post Scriptum

Oh by the way - I've changed the source a bit to make it run on node.js, as well. You can run it with logging enabled or disabled, like this:

# Full Parser/VM logging output:
node noderunner.js source-file.txt

# Disable Parser/VM output - only allow your program to write:
node noderunner.js -nolog source-file.txt
Dave

,

Comment

---

Writing a programming language: First programs for DVM01

6. April 2012, 17:40

I've just published an article about my first implementation of a virtual machine, on the way to learning to develop a whole programming language.

Anything that can be programmed must have a few things: Documentation, and a few examples. This being about programming, it's tradition to write a "Hello World" program first. So without further ado, let's have a go:

Hello World!

This program just writes out that well-known message to it's output, then quits:

START

WRITEC 72
WRITEC 101
WRITEC 108
WRITEC 108
WRITEC 111

WRITEC 32

WRITEC 87
WRITEC 111
WRITEC 114
WRITEC 108
WRITEC 100

WRITEC 33

END

Pretty simple, huh? The numbers are just the ASCII (and, because of Javascript's inner workings, also Unicode) representations of the letters, space and interpunction of the message. The only really important thing is this: The program MUST start with the "START" statement, and it MUST end. Why?

Intermission: VM guts

As described in the previous article, the VM doesn't have any registers - it just uses a well-defined memory section for certain types of work. As a program gets executed, the VM needs to know which statement comes next. Traditionally, this information is stored in a place called the "program counter". I didn't want to make an exception to the above rule for this important piece of information, so, by convention, the program counter lives at the first location in memory, which is the address 0.

Now - The program, after having been parsed, is also loaded into memory. To not complicate it any further, the program starts at address 0. OOOOppps! When loading the program into memory, we just overwrite the program counter! Damn. This is how the START command came into life.

If you need any space to store temporary values, you can employ a technique called "NOP slide" to reserve memory at well-known locations that you can later use. Our next code example shows what I mean.

Implementing the Fibonacci series

Some interesting "problem" that shows up in many programming courses and examples is the generation of the Fibonacci series. Simply put, it's a series of numbers, where each number is the sum of the previous two.

The series starts with the numbers 1, 1. The next value, then is the sum of both, 2. Summing up 1 and 2, we get 3. Repeat that with 2 and 3, and you get 5, and so on.

There are many ways to implement a program that outputs that series, most people prefer a recursive way (which however is not very efficient if you're not careful). Our VM doesn't support recursive function calls, or rather, we would have to write a lot of code to emulate it, so we take another route: Since the series only ever cares about the previous two numbers, it's of no use to carry all the others with us - so we don't.

Here's the implementation. I've added plenty of comments, so you should get a rough idea of how it works.


# First, initialize a nop slide for "registers". Note that
# the first one is required, as address 0 is used by the
# VM to store the program counter.

    START # 0: contains program counter
    NOP   # 1: first accumulator
    NOP   # 2: second accumulator
    NOP   # 3: helper
    NOP   # 4: loop counter
    NOP   # 5: stores loop jump address

# Initialisation: Address 4 is used as a countdown loop counter.
# Set it to a higher number to get more fibonacci numbers :)

    SET 4 30

# Initialize accumulators 1 and 2

    SET 1 1
    SET 2 1

# Be clever: store the program counter to address 5 as a loop "marker".
# This way, we don't have to worry too much about changing the program -
# the marker will adjust accordingly.

    SET 5 @0

# The loop begins here.

    SET 3 @1  # Use the helper "register", put the value of accumulator 1 in it
    ADD 3 @2  # Add value of accumulator 2 to the helper
    WRITEI @1 # Output the value in accumulator 1
    WRITEC 10 # Newline - we don't have explicit text support
    SET 1 @2  # Add content of accumulator 2 to accumulator 1
    SET 2 @3  # Set new sum of both accumulators in accumulator 2.

# Decrease iteration counter.

    SUB 4 1

# Jump to beginning of the loop, unless the counter is zero.
# This uses the clever "marker" from further up to know where
# to jump.

    JNZ @5 @4

# If the counter was zero, the jump didn't happen, and the program
# must end.

    END

Okay, that's two examples on how to program for the DVM01. I'm trying to come up with some more advanced programs next, implementing the Sieve of Eratosthenes for example, or implementing Subroutine calls. Those will be covered in another article, so stay tuned :)

Dave

,

Comment

---

Writing a programming language: Step 1 - a simple virtual machine, called DVM01

6. April 2012, 17:10

For a very long time now, I really wanted to write my own programming language. Not because I think I can do better, but just to learn what processes are involved and why things are the way they are.

Many times, I started hacking up a specification of the core features of "my" language, only to discover that such a spec is a huge project in itself: The more spec I wrote, the more details turned up that I just couldn't keep track of. In the end, I always got demotivated moved on, only to try again a few weeks or months later.

So, this time, I've decided to try more of a bottom-up approach: Before even starting on another language specification, I wanted to implement something simple, something which could then evolve with my knowledge of the domain.

 

A plan!

As that idea rolled around in my head, I finally set out to specify some very simple, assembly-like vocabulary and a concept of a virtual machine that could run it. I thought of the minimum requirements so I could later write some "useful" program with it.

After a little bit of thinking, I decided on the following set of operations:

  • READ - to get input from the user
  • WRITE - to output something
  • SET - to initialize a memory location with a value
  • JNZ - to jump to an address conditionally (here: jump if some value is non-zero)
  • All four basic math operators (+, -, *, /)

The idea emerged that my VM should have a Von-Neumann style Memory architecture (data and program share the same address space).

After some further amount of thinking, I figured that I would require some indirect access to memory as well, since it simplifies wiriting programs for that VM by a very large amount.

 

More details

After some polishing, I finally got around to write up a spec that does the above. Here are the main points:

  • No functions
  • No variables (addresses only)
  • No registers - you can use a NOP slide at the beginning of the program to free some well-defined space. Then you can use that to simulate registers
  • One address space
  • Program counter is directly modifiable (stored at address 0)
  • IO: only stdin/stdout available, very simplified: You can write (unicode) characters or integers, and read unicode characters. Internal representation is always as an integer.
  • Only datatype: integer
  • Variable-length opcodes
  • Operands can be indirect or direct. Indirect uses the address that the operand points to, direct uses operand literally (either as address or value, depending on the operation):
    • SET     80     5      # set address 80 to value 5
    • WRITEI  @80           # output integer stored at address 80
  • In other words: if a value is prefixes with an @, it is interpreted as an address, and the value at this address is used. This also works for non-addr operands, as can be seen by the following examples:
    • JZ 0    0      # always jumps to addr 0 (restart program).
    • JZ 0    @0     # Jump to addr 0 if value at that addr is 0

 

Here's a full list of all the operations that are supported.

Opcode Description
START Your program needs to start with this. It's actually just an alias for NOP. However, you really need this to make space for the program counter, which sits at address 0.
NOP Doesn't do anything.
READC addr Read character from stdin, storing it's integer (unicode) value at addr
WRITEC addr Writes integer at addr as an unicode character to stdout
WRITEI addr Writes integer at addr in decimal representation to stdout
SET addr n Sets value at given addr to value n
ADD addr n Adds value n to value at addr
SUB addr n Subtracts value n from value at addr
MUL addr n Multiplies value at addr by n
DIV addr n divides value at addr by n
JZ addr n Jumps to opcode at addr if n is zero
JNZ addr n Jumps to opcode at addr if n is not zero
JGT addr n1 n2 Jumps to opcode at addr if n1 is greater than n2.
COPY addr1 addr2 Writes content of addr2 to addr1 (Use this to de-reference pointers)
DUMP addr1 addr2 Dumps the memory area between addr1 and addr2 (Useful for debugging)
END Exits the program

 

Implementation

Of course, every good thing is not only described, but actually done. So i went out to hack up a simple Web GUI and started writing an implementation in javascript. Why javascript? I think it's just the most accessible language (for you, the reader!), since you don't need any fancy programming language installed (which you may or may not have), and you don't need any compiler / build tool knowledge either to give it a shot.

You can download the current version from Github. As you can see, I've already put up a directory structure that hints at more stuff to come. That's right! I hope to develop that virtual machine further in the future, adding features and improving, until the main goal is reached: Dave's own virtual machine :)

But for now, go grab that code and try it for yourself!

Dave

,

Comment

---

A story about Buddha. I like it.

15. January 2012, 00:08

I got the following text off /b/. Wonder where it really came from… but that’s not the matter right now, it’s the content. Read for yourself:

The Buddha visited a town called Kesputta. The inhabitants of this town were known as Kalama. When they heard that the Buddha was in their town, the Kalamas paid him a visit, and told him:

Sir there are some recluses and brahmanas who visit Kesputta. They explain and illumine only their own doctrines, and despise, condemn, and spurn others’ doctrines. Then come other recluses and brahmanas, and they, too, in their turn, explain and illuminate only their doctrines, and despise, condemn, and spurn others’ doctrines. But for us, Sir, we have always doubt and perplexity as to who among these venerable recluses and brahmanas spoke the truth, and who spoke falsehood.

Then the Buddha gave them this advice, unique in the history of religions:

Yes, Kalamas, it is proper that you have doubt, that you have perplexity, for a doubt has arisen in a matter which is doubtful. Now, look you Kalamas, do not be led by reports, or tradition, or hearsay. Be not led by the authority of religious texts, nor by mere logic or inference, nor by considering appearances, nor by the delight in speculative opinions, nor by seeming possibilities, nor by the idea “This is our teacher”. But O Kalamas, when you know for yourselves that certain things are unwholesome and wrong, and bad, then give them up … And when you know for yourselves that certain things are wholesome and good, then accept them and follow them.

Dave

,

Comment

---

« Older