Сколько нужно примитивов для реализации форт системы?

2020-12-27 в 1:39, admin, рубрики: forth, ненормальное программирование

В 1992-м году проходил очередной конкурс по обфусцированному программированию на языке С. Один из представленных проектов был небольшой форт системой. Меня поразило, что виртуальная машина была реализована всего в 794 байтах С кода. Остальная часть форт системы загружалась из исходника на форте. После изучения проекта первоначальный восторг уступил место разочарованию, так как автор использовал не совсем “честный” трюк: для парсинга фортового исходника он использовал функцию scanf(). С этого момента меня терзал вопрос — сколько нужно примитивов для реализации форт системы без подобных трюков?

Через некоторое время я познакомился с форт системой sod32 написанной Lennart Benschop. У этой форт системы виртуальная машина написана на С, а двоичный образ, который загружается в виртуальную машину, не зависит от порядка байтов в слове хост системы. sod32 имеет 32 примитива, вот они:





NOOP
( --- )
Do nothing


SWAP
( x1 x2 --- x2 x1 )
Swap the two top items on the stack.


ROT
( x1 x2 x3 --- x2 x3 x1 )
Rotate the three top items on the stack.


0=
( x --- f)
f is true if and only if x is 0.


NEGATE
( n1 --- -n1)
Negate top number on the stack.


UM*
( u1 u2 --- ud )
Multiply two unsigned numbers, giving double result.


C@
( c-addr --- c)
Fetch character c at c-addr.


@
( a-addr --- x)
Fetch cell x at a-addr.


+
( w1 w2 --- w3) 
Add the top two numbers on the stack.


AND
( x1 x2 --- x3)
Bitwise and of the top two cells on the stack.


OR
( x1 x2 --- x3)
Bitwise or of the top two cells on the stack.


XOR
( x1 x2 --- x3)
Bitwise exclusive or of the top two cells on the stack.


U<
( u1 u2 ---- f)
f is true if and only if unsigned number u1 is less than u2.


<
( n1 n2 --- f)
f is true if and only if signed number n1 is less than n2.


LSHIFT
( x1 u --- x2)
Shift x1 left by u bits, zeros are added to the right.


RSHIFT
( x1 u --- x2)
Shift x1 right by u bits, zeros are added to the left.


UM/MOD
( ud u1 --- urem uquot)
Divide the unsigned double number ud by u1, giving unsigned quotient and remainder.


+CY
( n1 n2 cy1 --- n3 cy2)
Add n1 and n2 and the carry flag cy1 (must be 0 or 1) giving sum n3 and carry flag cy2. (n3 cy2 can be seen as a double number)


SCAN1
( x d --- u)
Scan x for the first 1 bit. u is the position of that bit (counted from the scan direction) and 32 if x=0. d is the scan direction, 0 is left-to-right, 1 is right-to-left.


SPECIAL
( x ---)
Any of a large number of special instructions, indicated by x.


DROP
( x ---)
Discard the top item on the stack.


>R
( x ---)
Push x on the return stack.


C!A
( c c-addr --- c-addr)
Store character c at c-addr, keep address.


!A
( x a-addr --- a-addr)
Store cell x at a-addr, keep address.


DUP
( x --- x x )
Duplicate the top cell on the stack.


OVER
( x1 x2 --- x1 x2 x1)
Copy the second cell of the stack.


R@
( --- x)
x is a copy of the top of the return stack.


R>
( --- x)
Pop the top of the return stack and place it on the stack.


0
( --- 0)
The constant number 0.


1
( --- 1)
The constant number 1.


4
( --- 4)
The constant number 4.


LIT
( --- lit)
Push literal on the stack (literal number is in-line).

Я понял, что у меня появился шанс найти ответ на свой вопрос и я начал превращать примитивы в высокоуровневые определения. Хочу сразу отметить, что вся эта деятельность имеет чисто академический смысл. Применить полученные результаты на практике вряд ли получится из-за потери производительности. В процессе своих экспериментов я отслеживал изменения размера двоичного образа форт системы и время выполнения набора тестов за авторством John Hayes. Новый двоичный образ я строил командой

echo 'S" extend-cross.4" INCLUDED '|./sod32 kernel.img

а тесты запускал вот так:

time ./sod32 kernel.img <<< $(echo -e 'S" tester.fr" INCLUDEDn12345nBYE')

В таблице ниже вы можете увидеть как каждое изменение повлияло на размер и производительность. Ссылки из колонки «изменение» ведут на соответствующий коммит в гитхабе.

изменение	размер kernel.img	время исполнения tester.fr
original sod32	10164	0m0.059s
lshift, rshift	10312	0m0.071s
+, um*, um/mod	11552	0m0.123s
c@, c!a	11952	0m0.795s
0=, negate <, u<	12340	0m2.800s
drop	12784	0m5.022s
swap, rot, over	13436	0m5.860s
sp@, sp!, rp@, rp!, dup	13680	0m8.696s
r@, r>, >r	14160	0m15.463s
and, or, xor	14336	0m21.198s
0, 1, 4	15236	0m21.671s
0branch	15912	0m41.765s

В итоге размер двоичного образа форт системы увеличился с 10164 до 15912 (+57%), производительность упала в 708 раз (почти 3 порядка). Возможно производительность можно было бы улучшить если отпрофилировать код и оптимизировать узкие места, но я уверен, что результат все равно будет очень медленным по сравнению с исходной sod32.

С 32-х примитивов плюс дополнительной логики во внутреннем интерпретаторе я пришел к 7: nop, nand, !, @, um+, special, lit. Во внутреннем интерпретаторе осталась логика для исполнения примитивов и высокоуровневых определения (call), а также логика для завершения высокоуровневого определения (next). Ответ на свой вопрос я нашел: форт систему можно построить на базе 9-ти примитивов (или 8-ми, если nop не обязательно должен быть примитивом). Меня смущает то, что для доступа к памяти присутствует целых 3 примитива: @,! и lit, но я не придумал, как этого можно избежать. Я вполне мог что-то упустить, так что если вы знаете как можно избавиться от бОльшего количества примитивов — пожалуйста напишите в комментариях.

Автор: kt97679

Источник

Информация

Комментарии

Рекомендуем

Сколько нужно примитивов для реализации форт системы?

NOOP	( --- )	Do nothing
SWAP	( x1 x2 --- x2 x1 )	Swap the two top items on the stack.
ROT	( x1 x2 x3 --- x2 x3 x1 )	Rotate the three top items on the stack.
0=	( x --- f)	f is true if and only if x is 0.
NEGATE	( n1 --- -n1)	Negate top number on the stack.
UM*	( u1 u2 --- ud )	Multiply two unsigned numbers, giving double result.
C@	( c-addr --- c)	Fetch character c at c-addr.
@	( a-addr --- x)	Fetch cell x at a-addr.
+	( w1 w2 --- w3)	Add the top two numbers on the stack.
AND	( x1 x2 --- x3)	Bitwise and of the top two cells on the stack.
OR	( x1 x2 --- x3)	Bitwise or of the top two cells on the stack.
XOR	( x1 x2 --- x3)	Bitwise exclusive or of the top two cells on the stack.
U<	( u1 u2 ---- f)	f is true if and only if unsigned number u1 is less than u2.
<	( n1 n2 --- f)	f is true if and only if signed number n1 is less than n2.
LSHIFT	( x1 u --- x2)	Shift x1 left by u bits, zeros are added to the right.
RSHIFT	( x1 u --- x2)	Shift x1 right by u bits, zeros are added to the left.
UM/MOD	( ud u1 --- urem uquot)	Divide the unsigned double number ud by u1, giving unsigned quotient and remainder.
+CY	( n1 n2 cy1 --- n3 cy2)	Add n1 and n2 and the carry flag cy1 (must be 0 or 1) giving sum n3 and carry flag cy2. (n3 cy2 can be seen as a double number)
SCAN1	( x d --- u)	Scan x for the first 1 bit. u is the position of that bit (counted from the scan direction) and 32 if x=0. d is the scan direction, 0 is left-to-right, 1 is right-to-left.
SPECIAL	( x ---)	Any of a large number of special instructions, indicated by x.
DROP	( x ---)	Discard the top item on the stack.
>R	( x ---)	Push x on the return stack.
C!A	( c c-addr --- c-addr)	Store character c at c-addr, keep address.
!A	( x a-addr --- a-addr)	Store cell x at a-addr, keep address.
DUP	( x --- x x )	Duplicate the top cell on the stack.
OVER	( x1 x2 --- x1 x2 x1)	Copy the second cell of the stack.
R@	( --- x)	x is a copy of the top of the return stack.
R>	( --- x)	Pop the top of the return stack and place it on the stack.
0	( --- 0)	The constant number 0.
1	( --- 1)	The constant number 1.
4	( --- 4)	The constant number 4.
LIT	( --- lit)	Push literal on the stack (literal number is in-line).

Информация

Комментарии

Рекомендуем

Сколько нужно примитивов для реализации форт системы?

Рекомендованный контент