Why does Forth reserve two cells per variable?
While trying to figure out how Forth manages the dictionary (and memory in general), I came across this page. Being familiar with C, I have no problem with the concept of pointers, and I think I understood it correctly. However, there are some exercises at the end of the page, and I noticed something strange there.
Exercise 9.4, assuming that DATE has been defined with VARIABLE, asks what the difference is between
DATE .
and
' DATE .
and Exercise 9.5 asks the same about the user variable BASE.
According to the answers provided, both phrases give the same result (also with BASE). Trying this in Win32Forth, however, gives results that differ by 4 bytes (1 cell). Here's what I did:
here . 4494668 ok
variable x ok
x . 4494672 ok
' x . 4494668 ok
Creating another variable gives a similar result:
variable y ok
y . 4494680 ok
' y . 4494676 ok
So it looks like each variable gets not one cell (for the value), but two cells. The variable itself points to where the actual value is stored, and fetching the contents at the execution token (using ' x ?) gives 0040101F for both variables.
For exercise 9.5 my results are:
base . 195F90 ok
' base . 40B418 ok
They are not even close to each other. However, the answer to this exercise indicates that the results may depend on how BASE is defined.
Returning to normal variables, my main question is, why are two cells reserved for each variable?
Additionally:
- Since only one cell contains the actual value, what does the contents of the other cell mean?
- Is this specific to Win32Forth? What happens in other implementations?
- Is this different for runtime and compile time variables?
- How do the answers to the above questions apply to user variables (e.g. BASE)?
EDIT1: Ok, so Forth also stores a header for each variable, and with ' you get the address of that header. From my tests, I would then conclude that the header uses only one cell, which is not enough for all the information the header should contain. Second, according to the exercise, getting the address of a variable should give the same result in both cases, which seems to contradict the existence of a header at all.
My feeling is that this is all very implementation-specific. If so, what is happening in Win32Forth, and what should happen according to the exercise?
This is roughly what a definition looks like in the dictionary with a traditional memory layout. Note that implementations can deviate from this, sometimes a lot. In particular, the order of the fields can be different.
Link to previous word (one cell)
Flags (a few bits)
Name length (one byte, less a few bits)
Name string (variable)
Code field (one cell)
Parameter field (variable)
Everything except the code and parameter fields is considered the header. The code field usually comes right before the parameter field.
Ticking a word with ' gives you an XT, or execution token. It can be anything implementation-specific, but in many cases it is the address of the code field.
Executing a word created with CREATE or VARIABLE gives you the address of the parameter field.
This is probably why the two addresses in Win32Forth differ by 4 bytes, or one cell. I don't know why the answers to the exercises say there should be no difference.
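The relationship between the two addresses can be checked with the standard word >BODY, which converts the execution token of a word defined via CREATE into the address of its parameter (data) field. The following is a minimal sketch, assuming an ANS-style system. The name Z is made up for illustration, and the printed addresses will differ from system to system.

```forth
create z  0 ,        \ like VARIABLE, but built with CREATE so >BODY is guaranteed to apply
z .                  \ executing Z pushes the address of its parameter field
' z .                \ ticking Z pushes its execution token (implementation-defined)
' z >body z = .      \ prints -1 (true) on an ANS system: >BODY maps the xt to the parameter field
```

On a system where the execution token is the code field address and the code field occupies one cell, ' z >body amounts to adding one cell, which would match the 4-byte difference seen in the question.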
Assuming BASE is a user variable, it probably works like this: each task has its own user area in which the user variables are allocated. Every user variable knows its specific offset within this area. Ticking BASE gives you an XT, which is the same for all tasks. Executing BASE calculates the address by adding its offset to the base of the current task's user area.