Section 1 of 2

"Understanding redcode programs"

by

Skybuck Flying

1. Basis:

Redcode programs consist of redcode instructions, which conceptually look like this:

<Instruction> <Instruction A> <Instruction B>

<Instruction> <Instruction A> <Instruction B>

<Instruction> <Instruction A> <Instruction B>

<Instruction> <Instruction A> <Instruction B>

and so on...

Above is the terminology used in redcode programming.

I shall translate the above to "general purpose programming languages like C and Pascal/Delphi":

Conceptual translation:

<Instruction operation> <Data Pointer A> <Data Pointer B>

<Instruction operation> <Data Pointer A> <Data Pointer B>

<Instruction operation> <Data Pointer A> <Data Pointer B>

<Instruction operation> <Data Pointer A> <Data Pointer B>

2. Instructions are Data, Data are Instructions:

In redcode there is no difference between instructions and data.

Everything can be seen as instructions, Everything can be seen as data.

Instructions can use other instructions as "data input" or "data output";

Data input by reading other instructions and using it as data (input).

Data output by overwriting other instructions and using it as data (output).

This is also called "self-modifying code".

Conceptual translation:

Instructions and Data are all the same.

There is no separate "code section".

There is no separate "data section".

CODE = DATA

DATA = CODE

There is only one section.

"THE CODE/DATA SECTION".

This is also called the "core".
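As a rough Python sketch of this idea (the `Cell` record and the 8-cell core are assumptions made only for illustration, not part of any redcode specification), the core can be pictured as one list in which every cell is both an instruction and data:

```python
from dataclasses import dataclass


@dataclass
class Cell:
    code: str  # operation name, e.g. "DAT" or "MOV"
    a: int     # A field
    b: int     # B field


# One section only: every cell is both an instruction and data.
core = [Cell("DAT", 0, 0) for _ in range(8)]

# Treating cell 3 as data: overwrite its fields.
core[3].a = 5
core[3].b = 17

# Treating cell 3 as code: overwrite its operation, so whatever
# executes cell 3 later runs a different instruction.
core[3].code = "MOV"
```

Nothing marks cell 3 as "code" or "data"; it is whatever the instruction touching it treats it as.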

3. Basic instruction operation:

Each instruction operates on either instruction A or instruction B or both and in a certain way/order.

When I "say" instruction A this can also be thought of as data A.

When I "say" instruction B this can also be thought of as data B.

The basic instruction operation is described in these lines below:

First the instruction code follows (3 letters), then the description of what it does/how it operates:

DAT terminate process

MOV move from A to B

ADD add A to B, store result in B

SUB subtract A from B, store result in B

MUL multiply A by B, store result in B

DIV divide B by A, store result in B if A <> 0, else terminate

MOD divide B by A, store remainder in B if A <> 0, else terminate

JMP transfer execution to A

JMZ transfer execution to A if B is zero

JMN transfer execution to A if B is non-zero

DJN decrement B, if B is non-zero, transfer execution to A

SPL split off process to A

SLT skip next instruction if A is less than B

CMP same as SEQ

SEQ Skip next instruction if A is equal to B

Skybuck's alternative explanation:

SEQ Execute next instruction if A <> B else skip next instruction

SNE Skip next instruction if A is not equal to B

Skybuck's alternative explanation:

SNE Execute next instruction if A = B else skip next instruction

NOP No operation

LDP Load P-space cell A into core address B

STP Store A-number into P-space cell B

What these lines tell you is what the operation basically does and which "data pointers" (or InstructionA/InstructionB) it uses.

This is always the same, this is set in stone.

The operation always works like that no matter what else happens.
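To make the SEQ and SNE descriptions above concrete, here is a minimal Python sketch. The function names and the `pc` (current instruction position) parameter are hypothetical, and the sketch assumes the next position simply wraps around the core size:

```python
def next_pc_seq(pc, a, b, core_size):
    """SEQ: skip the next instruction when a equals b, else execute it."""
    step = 2 if a == b else 1
    return (pc + step) % core_size


def next_pc_sne(pc, a, b, core_size):
    """SNE: skip the next instruction when a differs from b, else execute it."""
    step = 2 if a != b else 1
    return (pc + step) % core_size
```

Both readings from the list (the "skip if" one and the "execute next if" one) describe the same rule: the step is either 1 or 2.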

4. Basic addressing:

The "basic addressing operator" (that's what I shall call it) is a dollar sign.

This is also known as "direct addressing"

Conceptual translation:

It's the @ operator in Pascal, pointer := @Variable;

It's the & operator in C, pointer = &Variable;

It is now time to update our conceptual view/model of a redcode program.

With this updated understanding the conceptual view now becomes:

<Instruction> <$> <Instruction A> <$> <Instruction B>

<Instruction> <$> <Instruction A> <$> <Instruction B>

<Instruction> <$> <Instruction A> <$> <Instruction B>

<Instruction> <$> <Instruction A> <$> <Instruction B>

The dollar sign is actually an "addressing mode". It's a redcode concept which is unimportant for now, but at least now you have heard of it.

For now it's sufficient to think of all instructions as using direct addressing.

In other words InstructionA tells the instruction where to fetch the data from for A.

In other words InstructionB tells the instruction where to fetch the data from for B.

Conceptual translation:

Data Pointer A tells the instruction where to fetch the data from for A.

Data Pointer B tells the instruction where to fetch the data from for B.

Time for a little conceptual example:

<Instruction 0> <$> <1> <$> <2>

<Instruction 1> <$> <A> <$> <B>

<Instruction 2> <$> <A> <$> <B>

The focus is on instruction 0, it will execute as follows:

If the operation works with A then it will fetch A from instruction 1.

If the operation works with B then it will fetch B from instruction 2.

So all the 1 does is it points to instruction 1.

So all the 2 does is it points to instruction 2.

At this point the A and B of instructions 1 and 2 are irrelevant; they only come into play later (modifiers), which will be explained in the next section.
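The direct addressing described above could be sketched in Python as follows. This assumes the usual redcode convention that a pointer is an offset relative to the instruction being executed; in the example above that instruction sits at position 0, so relative and absolute positions coincide. The function name `resolve_direct` is hypothetical:

```python
def resolve_direct(pc, offset, core_size):
    """Return the core position that a direct ($) pointer refers to.

    pc is the position of the executing instruction, offset is the
    value of its A or B pointer.
    """
    return (pc + offset) % core_size


# Instruction 0 with pointers 1 and 2 reaches instructions 1 and 2.
target_a = resolve_direct(0, 1, 8)
target_b = resolve_direct(0, 2, 8)
```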

4. Basic data fetch operation (modifiers):

Now that you understand the basic operation and basic addressing it is time to explain the basic data fetch operation.

The basic data fetch operation is specified with a modifier so it's now time to update your conceptual model of a redcode program:

<Instruction code> <Instruction modifier> <Instruction A> <Instruction B>

<Instruction code> <Instruction modifier> <Instruction A> <Instruction B>

<Instruction code> <Instruction modifier> <Instruction A> <Instruction B>

<Instruction code> <Instruction modifier> <Instruction A> <Instruction B>

The instruction modifier tells the instruction which field to fetch from the data pointers.

To keep it simple for now I will mention the easy/basic ones:

.A (fetches A from A and fetches A from B)

.B (fetches B from A and fetches B from B)

.AB (fetches A from A and fetches B from B)

.BA (fetches B from A and fetches A from B)

This might seem somewhat confusing to you ? But that's the way it more or less works.

But I shall try to explain better:

.A (fetches A from Data Pointer A and fetches A from Data Pointer B)

.B (fetches B from Data Pointer A and fetches B from Data Pointer B)

.AB (fetches A from Data Pointer A and fetches B from Data Pointer B)

.BA (fetches B from Data Pointer A and fetches A from Data Pointer B)

(Notice how Data Pointer A is the first column)

(Notice how Data Pointer B is the second column)

What's it all about you might be wondering ?!?

Well what the instruction is trying to do is... it's trying to "replace data pointer a" and "replace data pointer b" with actual data.

The "actual data" needs to come from either field A or field B from wherever the data pointers are pointing to.

So take .BA for example.

You should read it as follows:

The B in .BA means:

"Take the B field of wherever Data Pointer A is pointing towards".

The A in .BA means:

"Take the A field of wherever Data Pointer B is pointing towards".

(Other fields can be taken as well, that's what the other/new modifiers are for).

Now time for a little example:

<Instruction 0> <.BA> <$1> <$2>

<Instruction 1> <A> <B>

<Instruction 2> <A> <B>

Reading/understanding instruction 0 leads to:

Instruction 0, Data Pointer A points to instruction 1.

Instruction 0, Data Pointer B points to instruction 2.

.BA means

Take B for Data Pointer A

Take A for Data Pointer B

So:

Take instruction 1 B  for Data Pointer A.

Take instruction 2 A for Data Pointer B.

And execute.

So this data is copied internally into instruction 0 and executed as if it said:

<Instruction 0> <Instruction 1 B> <Instruction 2 A>

So if the data was:

<Instruction 0> <.BA> <$1> <$2>

<Instruction 1> <5> <17>

<Instruction 2> <9> <14>

It would execute:

<Instruction 0> <17> <9>

Seems quite complex doesn't it ? ;)
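The .BA example above can be replayed with a small Python sketch. The `FETCH` table and the `fetch` function are hypothetical names, the core is reduced to just the A and B fields, and only direct addressing is assumed:

```python
# For each basic modifier: (field taken via Data Pointer A,
#                           field taken via Data Pointer B)
FETCH = {
    "A":  ("a", "a"),
    "B":  ("b", "b"),
    "AB": ("a", "b"),
    "BA": ("b", "a"),
}


def fetch(core, pc, modifier):
    """Replace both data pointers of core[pc] with the actual data."""
    instr = core[pc]
    size = len(core)
    target_a = core[(pc + instr["a"]) % size]  # where Data Pointer A points
    target_b = core[(pc + instr["b"]) % size]  # where Data Pointer B points
    field_a, field_b = FETCH[modifier]
    return target_a[field_a], target_b[field_b]


core = [
    {"a": 1, "b": 2},    # instruction 0: $1, $2
    {"a": 5, "b": 17},   # instruction 1
    {"a": 9, "b": 14},   # instruction 2
]
print(fetch(core, 0, "BA"))  # (17, 9)
```

Swapping "BA" for "A", "B" or "AB" shows how the same two pointers deliver different data.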

But it gets even more complex. Time to go to the next section: advanced addressing modes.

5. Advanced addressing modes:

Each A and each B also have an Addressing Mode,

Called Addressing mode A for A.

Called Addressing mode B for B.

Or simply short:

AddressingA

AddressingB

Or even shorter (extension 2009 terminology):

Y

Z

It is now time to update the conceptual model:

<Instruction code> <Instruction modifier> <Addressing Mode A> <Instruction A> <Addressing Mode B> <Instruction B>

<Instruction code> <Instruction modifier> <Addressing Mode A><Instruction A> <Addressing Mode B> <Instruction B>

<Instruction code> <Instruction modifier> <Addressing Mode A><Instruction A> <Addressing Mode B> <Instruction B>

<Instruction code> <Instruction modifier> <Addressing Mode A> <Instruction A> <Addressing Mode B> <Instruction B>

There are 8 addressing modes, each has one symbol, followed by a description of how it works:

#    immediate data, (works on self) ready for data fetch.

$    direct addressing, points to data for data fetch.

@    indirect addressing via b, points to pointer located in b for data fetch.

<    indirect addressing via b, points to pointer located in b and decrements it (second b) before data fetch.

>    indirect addressing via b, points to pointer located in b and increments it (second b) after data fetch.

*    indirect addressing via a, points to pointer located in a for data fetch.

{    indirect addressing via a, points to pointer located in a and decrements it (second a) before data fetch.

}    indirect addressing via a, points to pointer located in a and increments it (second a) after data fetch.
 

I am not going to discuss each one of them...

I am just going to give one example for you to learn from, and then you should be able to figure out the rest and experiment with it in case you need to learn it:

Conceptual example:

Instruction 0  @2,  *3

Instruction 1 A, B

Instruction 2 A, B

Instruction 3 A, B

The @ sign tells instruction 0 to look at instruction 2, take the B field of instruction 2, and then go from instruction 2 to wherever that B field is pointing. That final location is used as Data Pointer A.  (Direct Pointer + Indirect Pointer added together to form final destination/offset from instruction 0)

(Alternative description):

Data Pointer A. (First Pointer + Second Pointer added together to form final destination/offset from instruction 0)

 

The * sign tells instruction 0 to look at instruction 3, take the A field of instruction 3, and then go from instruction 3 to wherever that A field is pointing. That final location is used as Data Pointer B.  (Direct Pointer + Indirect Pointer added together to form final destination/offset from instruction 0)

(Alternative description):

Data Pointer B.  (First Pointer + Second Pointer added together to form final destination/offset from instruction 0)

 

The pre-decrement and post-increment operators simply decrement or increment the indirect pointer as stated for the final destination/offset calculation.

(Alternative description):

The pre-decrement and post-increment operators simply decrement or increment the second pointer as stated for the final destination/offset calculation.
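A minimal Python sketch of the eight addressing modes, following the descriptions above. The names `resolve` and `example_core` and the field values in the 6-cell core are assumptions chosen for illustration, and the exact moment at which a real core executor applies the post-increment may differ from this simplified version:

```python
def resolve(core, pc, mode, value):
    """Return the final address for one operand; may modify the core (< > { })."""
    size = len(core)
    if mode == "#":
        return pc                          # immediate: works on self
    first = (pc + value) % size            # first (direct) pointer
    if mode == "$":
        return first
    # Which field of the first target holds the second (indirect) pointer.
    field = "b" if mode in ("@", "<", ">") else "a"
    if mode in ("<", "{"):                 # pre-decrement the second pointer
        core[first][field] = (core[first][field] - 1) % size
    final = (first + core[first][field]) % size
    if mode in (">", "}"):                 # post-increment the second pointer
        core[first][field] = (core[first][field] + 1) % size
    return final


def example_core():
    """Hypothetical 6-cell core; instruction 0 is '@2, *3'."""
    return [
        {"a": 2, "b": 3},   # instruction 0
        {"a": 0, "b": 0},   # instruction 1
        {"a": 0, "b": 2},   # instruction 2: B field points 2 further, to 4
        {"a": 2, "b": 0},   # instruction 3: A field points 2 further, to 5
        {"a": 0, "b": 0},   # instruction 4
        {"a": 0, "b": 0},   # instruction 5
    ]


core = example_core()
print(resolve(core, 0, "@", 2))  # 4
print(resolve(core, 0, "*", 3))  # 5
```

With these assumed field values, Data Pointer A ends up at instruction 4 (2 + 2) and Data Pointer B at instruction 5 (3 + 2).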

 

6. Basis of Data Values

The last thing to discuss is the data values themselves.

These are also called the A and B for short

or Value A and Value B for a longer description.

Another way of looking at redcode programs is the following conceptual model:

<Instruction code> <Instruction modifier> <Addressing Mode A> <Value A> <Addressing Mode B> <Value B>

<Instruction code> <Instruction modifier> <Addressing Mode A><Value A> <Addressing Mode B> <Value B>

<Instruction code> <Instruction modifier> <Addressing Mode A><Value A> <Addressing Mode B> <Value B>

<Instruction code> <Instruction modifier> <Addressing Mode A> <Value A> <Addressing Mode B> <Value B>

The values themselves are important to understand as well because they are closely related to the "Core Size".

The core size specifies how many instructions there can be in the core, and it influences the values.

Not only that, but the values must never be negative.

Internally positives can be subtracted though so it's possible to do the following internally:

Positive - Positive = Potentially Negative.

Redcode is designed in such a way that negative values are wrapped back to positive values.

It is important to understand how this wrapping around works to be able to write warriors most effectively.

The basic and most robust formula for wrap around is:

ModdedValue = InputValue mod CoreSize

if ModdedValue < 0 then ModdedValue = ModdedValue + CoreSize

OutputValue = ModdedValue

This takes care of two things:

It wraps around large positive values to the range of 0 to CoreSize-1.

It wraps around large negative values to the range of 0 to CoreSize-1.

Some examples:

Suppose the core size is 24.

Suppose A is 20, Suppose B is 5

Suppose subtraction occurs B - A.

The following happens:

5 - 20 = -15

The wrap around kicks in:

-15 mod 24 = -15

-15 < 0: yes

-15 + 24 = 9

Wrap around value for -15 is 9.

So result is: 5 - 20 = 9

This can be checked as follows suppose we want to go back:

9 + 20 = 29

29 mod 24 = 5

Or suppose we do a full core cycle of -24; we should end up on 9 again:

9 - 24 = -15

-15 < 0: yes

-15 + 24 = 9
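The wrap around formula and the examples above can be checked with a short Python sketch. One caveat: Python's own `%` operator already returns a non-negative result for a positive modulus, so the helper `trunc_mod` (a hypothetical name) is included only to imitate the `mod` of Pascal or the `%` of C, where -15 mod 24 gives -15 as in the text:

```python
def trunc_mod(value, modulus):
    """Remainder that keeps the sign of value, like mod in Pascal or % in C."""
    remainder = abs(value) % modulus
    return -remainder if value < 0 else remainder


def wrap(value, core_size):
    """Wrap any integer into the range 0 .. core_size - 1."""
    modded = trunc_mod(value, core_size)
    if modded < 0:
        modded += core_size
    return modded


print(wrap(5 - 20, 24))   # 9
print(wrap(9 + 20, 24))   # 5
print(wrap(9 - 24, 24))   # 9
```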

(This same formula is also used to wrap around code, modifier and addressing fields in extension 2009 except instead of using core size it will use code count, or modifier count or addressing count).

Some other formulas are at play as well for slight optimizations.

Suppose two positive values are multiplied; then it's sufficient to only do a mod. The branch and addition are skipped to save CPU cycles.

Now that I have attempted to explain to you the basis/basics of redcode programs it is time to reveal to you the full specification (and power !;):)) of the Extension 2009 !

7. Extension 2009

This is the part you have all probably been waiting for ! =D

This part, section 7, describes the extension 2009 as I, Skybuck Flying, designed it and implemented it in my own Delphi core executor/mars and soon-to-be C/PMARS.

Section 2 of 2

(Begin of Redcode Extension 2009, Specification 1.00 created on 16 december 2009 by Skybuck Flying):

"Full Specification for Redcode Extension 2009":

by

Skybuck Flying

18 instruction codes:

DAT = 0

MOV = 1

ADD = 2

SUB = 3

MUL = 4

DIV = 5

MOD = 6

JMP = 7

JMZ = 8

JMN = 9

DJN = 10

SPL = 11

SLT = 12

CMP = 13

SEQ = 13  same as CMP

SNE = 14

NOP = 15

LDP = 16

STP = 17

27 instruction modifiers:

A = 0

B = 1

AB = 2

BA = 3

F = 4

X = 5

I = 6

C = 7    * NEW *

CA = 8    * NEW *

CB = 9    * NEW *

AC = 10    * NEW *

BC = 11    * NEW *

M = 12    * NEW *

MA = 13    * NEW *

MB = 14    * NEW *

AM = 15    * NEW *

BM = 16    * NEW *

Y = 17    * NEW *

YA = 18    * NEW *

YB = 19    * NEW *

AY = 20    * NEW *

BY = 21    * NEW *

Z = 22    * NEW *

ZA = 23    * NEW *

ZB = 24    * NEW *

AZ = 25    * NEW *

BZ = 26    * NEW *

8 instruction addressing modes:

# = 0

$ = 1

@ = 2

< = 3

> = 4

* = 5

{ = 6

} = 7

The (implementation) specifications for the old modifiers are unchanged:

.A Instructions read and write A-fields.

.B Instructions read and write B-fields.

.AB Instructions read the A-field of the A-instruction and the B-field of the B-instruction and write to B-fields.

.BA Instructions read the B-field of the A-instruction and the A-field of the B-instruction and write to A-fields.

.F Instructions read both A- and B-fields of the A- and B-instruction and write to both A- and B-fields (A to A and B to B).

.X Instructions read both A- and B-fields of the A- and B-instruction and write to both A- and B-fields exchanging fields (A to B and B to A).

.I Instructions read and write entire instructions.

The (implementation) specifications for the new modifiers are as follows:

.C Instructions read and write the Code-field.

.CA Instructions read the Code-field of the A-instruction and the A-field of the B-instruction and write to A-fields.

.CB Instructions read the Code-field of the A-instruction and the B-field of the B-instruction and write to B-fields.

.AC Instructions read the A-field of the A-instruction and the Code-field of the B-instruction and write to Code-fields.

.BC Instructions read the B-field of the A-instruction and the Code-field of the B-instruction and write to Code-fields.

.M Instructions read and write the Modifier-field.

.MA Instructions read the Modifier-field of the A-instruction and the A-field of the B-instruction and write to A-fields.

.MB Instructions read the Modifier-field of the A-instruction and the B-field of the B-instruction and write to B-fields.

.AM Instructions read the A-field of the A-instruction and the Modifier-field of the B-instruction and write to Modifier-fields.

.BM Instructions read the B-field of the A-instruction and the Modifier-field of the B-instruction and write to Modifier-fields.

.Y Instructions read and write the AddressingA-field.

.YA Instructions read the AddressingA-field of the A-instruction and the A-field of the B-instruction and write to A-fields.

.YB Instructions read the AddressingA-field of the A-instruction and the B-field of the B-instruction and write to B-fields.

.AY Instructions read the A-field of the A-instruction and the AddressingA-field of the B-instruction and write to AddressingA-fields.

.BY Instructions read the B-field of the A-instruction and the AddressingA-field of the B-instruction and write to AddressingA-fields.

.Z Instructions read and write the AddressingB-field.

.ZA Instructions read the AddressingB-field of the A-instruction and the A-field of the B-instruction and write to A-fields.

.ZB Instructions read the AddressingB-field of the A-instruction and the B-field of the B-instruction and write to B-fields.

.AZ Instructions read the A-field of the A-instruction and the AddressingB-field of the B-instruction and write to AddressingB-fields.

.BZ Instructions read the B-field of the A-instruction and the AddressingB-field of the B-instruction and write to AddressingB-fields.

Modifier notes:

.C (acts on Code-field)

.M (acts on Modifier-field)

.Y (acts on AddressingA-field)  (AddressingA=Addressing Mode A)

.Z (acts on AddressingB-field) (AddressingB=Addressing Mode B)

All other new modifiers are combinations of these 4.
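Because the new modifiers follow a regular pattern, the whole numbering can be generated rather than typed out. The following Python sketch uses hypothetical names (`build_modifiers`, `fields`) and only restates the table and descriptions above:

```python
OLD_MODIFIERS = ["A", "B", "AB", "BA", "F", "X", "I"]
NEW_FIELDS = ["C", "M", "Y", "Z"]  # Code, Modifier, AddressingA, AddressingB


def build_modifiers():
    """Map each modifier name to its number (0 .. 26)."""
    names = list(OLD_MODIFIERS)
    for k in NEW_FIELDS:
        # Same order as the table: K, KA, KB, AK, BK
        names += [k, k + "A", k + "B", "A" + k, "B" + k]
    return {name: number for number, name in enumerate(names)}


MODIFIERS = build_modifiers()


def fields(name):
    """For one- and two-letter field modifiers (not F, X, I):
    (field read from the A-instruction,
     field read from and written to in the B-instruction)."""
    if name in ("F", "X", "I"):
        raise ValueError("F, X and I act on several fields at once")
    return name[0], name[-1]
```

For example, `fields("CA")` gives `("C", "A")`: read the Code-field of the A-instruction, and read and write the A-field of the B-instruction.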

Additional instruction notes:

DAT

NOP

SPL

JMP

are not affected by modifiers.

Additional p-space notes:

Single fields can be stored in PSpace (Warrior Private Space).

PSpace fields/elements must be large enough to store a code, modifier, addressing mode or value field.

Important:

P-Space elements must contain enough bits to store whatever field is largest !

Extreme example:

CoreSize = 8

Bits used for value field: 3

Bits used for code field: 5

PSpace elements/fields should have at least 5 bits of storage space.
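The rule above (take the largest field) can be written as a small Python sketch. The function names are hypothetical; the counts 18, 27 and 8 come from this specification:

```python
def bits_for(count):
    """Bits needed to store the values 0 .. count - 1."""
    return max(1, (count - 1).bit_length())


def pspace_bits(core_size, code_count=18, modifier_count=27, addressing_count=8):
    """Minimum width of a PSpace element: the widest of all field types."""
    return max(
        bits_for(core_size),         # value fields
        bits_for(code_count),        # code field
        bits_for(modifier_count),    # modifier field
        bits_for(addressing_count),  # addressing mode fields
    )


print(pspace_bits(8))  # 5
```

For the extreme example with CoreSize = 8 the value field needs only 3 bits, so the 5 bits of the code and modifier fields decide the result.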

Additional implementation tips and tricks and recommendations:

There are 18 instructions, which means codes need 5 bits.

There are 27 modifiers, which means modifiers need 5 bits.

There are 8 addressing modes, which means addressing mode A needs 3 bits.

There are 8 addressing modes, which means addressing mode B needs 3 bits.

5 + 5 + 3 + 3 = 16 bits which fits nicely in one word.

It is therefore recommended to compress these four fields into one 16 bit word field as follows:

Bit position:

15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0

Bit data:

C4, C3, C2, C1, C0, M4, M3, M2, M1, M0, Y2, Y1, Y0, Z2, Z1, Z0

This is not required, it is just a recommendation.
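One possible way to implement the recommended 16 bit layout, sketched in Python with hypothetical function names `pack` and `unpack`. The shifts follow the bit positions listed above (code in bits 15 to 11, modifier in bits 10 to 6, AddressingA in bits 5 to 3, AddressingB in bits 2 to 0):

```python
def pack(code, modifier, addressing_a, addressing_b):
    """Compress the four small fields into one 16 bit word."""
    return (code << 11) | (modifier << 6) | (addressing_a << 3) | addressing_b


def unpack(word):
    """Return (code, modifier, addressing_a, addressing_b) from a 16 bit word."""
    return (
        (word >> 11) & 0x1F,  # C4..C0
        (word >> 6) & 0x1F,   # M4..M0
        (word >> 3) & 0x07,   # Y2..Y0
        word & 0x07,          # Z2..Z0
    )


# MOV = 1, I = 6, $ = 1, $ = 1
word = pack(1, 6, 1, 1)
print(word)          # 2441
print(unpack(word))  # (1, 6, 1, 1)
```

The sketch assumes all four inputs are already wrapped into their valid ranges; out-of-range values would spill into neighbouring fields.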

Additional wrap around notes:

Code field must wrap around CodeCount which is 18. (Code mod 18) (After mod, if negative add 18).

Modifier field must wrap around ModifierCount which is 27. (Modifier mod 27) (After mod, if negative add 27).

AddressingA field must wrap around AddressingCount which is 8. (AddressingA mod 8) (After mod, if negative add 8).

AddressingB field must wrap around AddressingCount which is 8. (AddressingB mod 8) (After mod, if negative add 8).
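These four wrap around rules could be sketched in Python like this. The constant and function names are hypothetical. Note that Python's `%` already returns a non-negative result for a positive count, so the "if negative add" step is built in; in C or Pascal it has to be written out explicitly:

```python
CODE_COUNT = 18
MODIFIER_COUNT = 27
ADDRESSING_COUNT = 8


def wrap_field(value, count):
    """Wrap a code, modifier or addressing value into 0 .. count - 1."""
    return value % count  # in Python the result is never negative here


# DAT (0) minus 1 wraps to STP (17); modifier BZ (26) plus 1 wraps to A (0).
print(wrap_field(0 - 1, CODE_COUNT))       # 17
print(wrap_field(26 + 1, MODIFIER_COUNT))  # 0
```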

Important:

Future extensions must apply these wrap around modes to warriors written for this specification/extension 2009 for correct execution.

Warriors should have some way to indicate for what specification/extension they were written.

For example for parsers a comment section could be included:

;REDCODE2009

or

;EXT2009

For example for binary storage a "redcode version field" could be included:

REDCODE88 = 0

REDCODE94 = 1

REDCODE2009 = 2
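A parser could combine both ideas roughly as in the following Python sketch. The function name `detect_version` is hypothetical, and falling back to REDCODE94 when no marker is found is purely an assumption of this sketch, not something this specification prescribes:

```python
VERSIONS = {"REDCODE88": 0, "REDCODE94": 1, "REDCODE2009": 2}


def detect_version(source, default="REDCODE94"):
    """Return the version number for a warrior's source text.

    Looks for a ;REDCODE2009 or ;EXT2009 comment line (case-insensitive).
    """
    for line in source.splitlines():
        tag = line.strip().upper()
        if tag in (";REDCODE2009", ";EXT2009"):
            return VERSIONS["REDCODE2009"]
    return VERSIONS[default]


print(detect_version(";EXT2009\nMOV.I $0, $1"))  # 2
```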

Conceptual instruction model for Extension 2009:

<Code>.<Modifier> <Addressing Mode A> <Value A>, <Addressing Mode B> <Value B>

or slightly shorter:

<Code>.<Modifier> <AddressingA> <ValueA>, <AddressingB> <ValueB>

or super short and in relation to new modifiers:

Code = C

Modifier = M

AddressingA = Y

ValueA = A

AddressingB = Z

ValueB = B

<C>.<M> <Y> <A>,<Z> <B>

Final word/comments from me/Skybuck:

I hope you like the new specification/extension 2009, pmars executables and source code will follow shortly. Possibly followed by special hill software for running and outputting hills to textfiles and html.

I also look forward to working on specification/extension 2010 which will have new instructions and possibly even time travelling support ! ;) :)

Also if you have trouble implementing this specification/extension into your own core executor and have questions about it, don't hesitate to contact me ! ;)

I can be reached (via e-mail) at skybuck2000@hotmail.com

Sometimes I also read the following newsgroup:

rec.games.corewar

I hope you will have fun with this specification/extension in the future !

I hope you will be writing lots of interesting/new or improved warriors !

I hope to see some new 2009 hills !

For now you can think about new capabilities until the new software is released which should follow within a few days so stay tuned ! ;) =D

(No warrior examples yet, you will have to figure that out for now... maybe I add some later on... but so many combinations a bit much to be doing examples for all of them.. but maybe I add some in the future as examples ;) :))

Bye for now,

Skybuck Flying ! The one and only ! ;) =D =D

(End of Redcode Extension 2009, Specification 1.00 created on 16 december 2009 by Skybuck Flying)