monic part 5: writing C programs on the calculator
|
01-01-2021, 01:35 AM
(This post was last modified: 12-26-2022 11:58 PM by F-73P.)
Post: #1
|
|||
|
|||
monic part 5: writing C programs on the calculator
Previous: monic part 4: arbitrary-precision rational arithmetic
Next: monic part 6: the mC programming language and compiler I've started writing an editor and translator for a subset of the C programming language. I'm not sure what the proper computer science term for the translator is. Given a program, e.g. find the factorial of the positive integer on stack level 1: Code:
the translator produces a sequence of codes. When the program is run the codes are fed to a switch statement, which executes the corresponding instruction. Does this make the translator a compiler, or is it an interpreter? The C language combines all the power of assembly language with all the ease-of-use of assembly language |
|||
01-01-2021, 11:20 AM
(This post was last modified: 01-01-2021 11:23 AM by grsbanks.)
Post: #2
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
(01-01-2021 01:35 AM)F-73P Wrote: the translator produces a sequence of codes. When the program is run the codes are fed to a switch statement, which executes the corresponding instruction. Does this make the translator a compiler, or is it an interpreter? More like an interpreter. An interpreter executes the "translated" program as it is converting it (as does your system). A compiler transforms the original into something else that you can store and then run at a later date once the "translation" is complete. A couple of optimisations of your code, BTW. You can't do "n--" if n is a floating point number. Also, this assumes that the value of n has been checked for validity and that it is in fact a positive integer value (expressed as a floating point). Code: int main(void) { There are only 10 types of people in this world. Those who understand binary and those who don't. |
|||
01-05-2021, 03:50 AM
Post: #3
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
(01-01-2021 01:35 AM)F-73P Wrote: the translator produces a sequence of codes. When the program is run the codes are fed to a switch statement, which executes the corresponding instruction. Does this make the translator a compiler, or is it an interpreter? Very nice work, judging from your description! This is not a trivial accomplishment. I assume you are lexing/parsing the source code just once, so this is a compiler that produces (byte) codes. Your virtual machine executes them in a loop with a switch statement. This approach is in principle similar to the JVM. Just to compare, I wrote a small C compiler that produces JVM class files from "mini C" source code. "I count on old friends to remain rational" |
|||
01-06-2021, 04:28 AM
(This post was last modified: 02-01-2021 06:08 AM by F-73P.)
Post: #4
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
(01-01-2021 11:20 AM)grsbanks Wrote: You can't do "n--" if n is a floating point number. "n--" compiles without errors for n of type double on the compilers I tried (and gives n-1.0), so I assume it's supported by the standard they are using. However it's not good programming style, as "n--" could be mistaken for the predecessor of the floating-point number, which in general is not equal to n-1.0, so your suggestion is better. (01-05-2021 03:50 AM)robve Wrote: I wrote a small C compiler that produces JVM class files from "mini C" source code. Great work! I also like the manual, comprehensive and very professional. I wrote the code before I read anything about compilers. The front end consists of (what I now know is called) a lexical analyser and the parser is just Dijkstra's shunting yard algorithm. I'm now wondering whether I should: a) Continue with my ad hoc approach. This is the easiest option but there is no syntax or semantic checking, so it will never be a complete programming environment. b) Learn about grammars, abstract syntax trees and top-down parsing and write the parser by hand. This sounds very difficult. c) Find an open source compiler and try to port it across. My preference is for option b), despite the difficulties. In that case I think I should choose a programming language with a simpler grammar than C as the source language. The C language combines all the power of assembly language with all the ease-of-use of assembly language |
|||
01-06-2021, 06:34 AM
Post: #5
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
Option b, definitely. You wrote a compiler without knowing anything about compilers-- imagine what you'll accomplish once you know what you're doing!
|
|||
01-06-2021, 09:59 AM
Post: #6
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
(01-06-2021 06:34 AM)mpark Wrote: Option b, definitely. You wrote a compiler without knowing anything about compilers-- imagine what you'll accomplish once you know what you're doing! I too would recommend b. Tools like flex and bison can help tremendously. It's been a while since I used them last time but IIRC you get by with just cursory knowledge of the grammars theory. |
|||
01-08-2021, 02:59 AM
Post: #7
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
Definitely option b). By contrast, it will quickly become much more complicated to complete option a) when the language gets more complex. Option b) is not as difficult to comprehend as you may think. You could implement this with a hand-written lexical analyzer and write a recursive descent parser to do the translation. Or use flex and bison. The first chapters of the https://en.wikipedia.org/wiki/Compilers:..._and_Tools aka the "(Red) Dragon Book" shows how. Here are some lecture notes to get started. To limit the syntax to arithmetic syntax, operator precedence parsers are very handy and relatively easy to implement, see https://en.wikipedia.org/wiki/Operator-p...nce_parser The syntax of Prolog is parsed with operator precedence parsing. The interesting part is that the language syntax itself can be extended on the fly by the user defining his/her prefix, index, and postfix operators. Also Pakrat parsing has gained some popularity, see https://en.wikipedia.org/wiki/Parsing_ex...on_grammar Hope this helps! "I count on old friends to remain rational" |
|||
01-12-2021, 08:01 PM
Post: #8
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
(01-06-2021 09:59 AM)vaklaff Wrote: Tools like flex and bison can help tremendously. It's been a while since I used them last time but IIRC you get by with just cursory knowledge of the grammars theory. I second using Flex and Bison instead of rolling your own parser as it's *much* easier. You should get the book "flex & bison: Text Processing Tools" by by John Levine and published by O'Reilly. The book has a well written introduction to BNF grammars. I skipped that part though as I've read the "Dragon book" and due to my knowledge of theoretical computer science from uni. A fellow HP48 hacker and I were coding a "high-level assembler" for the Saturn processor with the "assembler" being a kind of C and Saturn assembly hybrid. I had bought both the paperback and Kindle editions of the book and it was a really great read -- it made coding the assembler / compiler or whatever-you-want-to-call-it *much* easier than rolling our own as the C99 grammar is very complicated. The project is on hold now as I've had numerous other things to deal with in my Real Life [TM], but I'll eventually get back to it and finish it, hopefully Regards, Jonathan Aeternitas modo est. Longa non est, paene nil. |
|||
02-01-2021, 06:36 AM
(This post was last modified: 02-03-2021 01:08 AM by F-73P.)
Post: #9
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
Thank you to everyone for the links and great advice - option b) was definitely the best choice.
I've ported the source code for the "TINY" compiler presented in Compiler Construction, Principles and Practice to the calculator: Execution time is not too bad, with 1000! (2568 digits, on stack level 6) calculated in about 6 seconds: I haven't added scrolling yet so I reduced the font size to see more digits: The "TINY" language has the following BNF grammar: and the following extended BNF grammar: I added bytecode generation to the parser and this was pretty straightforward, except for if-then-else and repeat-until statements; for these I used a stack to keep track of the destination labels and jump-if-false bytecodes for nested statements. I would now like to implement the "C-" language from the same book, which is more powerful than "TINY" ("C-" adds support for functions and arrays). Here is a factorial program written in "C-": "C-" has the following BNF grammar: My impression is that a recursive descent parser is not too difficult to write provided the programming language is small and the grammar is written in extended BNF. Flex and Bison are of course very powerful and useful tools but my philosophy with the "monic" calculator project has been to minimise the number of tools and libraries used. So if I can rewrite the grammar for "C-" in extended BNF I can use the "TINY" parser as a guide to write a parser for "C-". If anyone is interested in helping me with this please let me know. EDIT: Thank you to Jonathan Busby for translating the "C-" BNF grammar above into an ISO/IEC 14977 "standard" (see the Wikipedia article) EBNF grammar, which I will use to write the "C-" parser: Code: program = declaration-list ; The C language combines all the power of assembly language with all the ease-of-use of assembly language |
|||
02-03-2021, 09:32 PM
Post: #10
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
(02-01-2021 06:36 AM)F-73P Wrote: [snip] Note that the above EBNF grammar has a slight glitch, which is my fault. I accidentally incorrectly translated the "C-" BNF grammar arithmetic expressions into an EBNF grammar which made all the arithmetic expressions change from left-associative to right-associative. Here is the fix : Code: additive-expression = term , { addop , term } ; Hope this doesn't cause any inconvenience or confusion. N.B. Also, note that I assumed when writing the EBNF grammar that the Monic author would be implementing an LL(1) recursive descent parser, which reads its input from left to right and recurses in a depth-first manner. If this is not the case, then the EBNF grammar will need to be modified. Regards, Jonathan Aeternitas modo est. Longa non est, paene nil. |
|||
02-12-2021, 11:01 PM
(This post was last modified: 02-17-2021 10:52 PM by Jonathan Busby.)
Post: #11
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
Well, I found more absent-minded stupid associativity errors in the EBNF grammar for C- . Here is the corrected grammar :
Code: program = declaration-list ; Sorry for any inconvenience. Regards, Jonathan NOTE #1 : I had also accidentally left out some non-superfluous non-terminals. Also, the "C-" assignment operator ( "=" ) is *right associative*, just like in full blown C. So, if that part of the grammar looks like it needs correcting, then just know that it doesn't because I intentionally made the EBNF version right associative to be fully compatible with the "C-" BNF grammar. EDIT #1 : Incorporated changes mentioned in later posts into above EBNF grammar. Aeternitas modo est. Longa non est, paene nil. |
|||
02-13-2021, 06:50 PM
(This post was last modified: 02-17-2021 10:22 PM by Jonathan Busby.)
Post: #12
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
(02-12-2021 11:01 PM)Jonathan Busby Wrote: Well, I found more absent-minded stupid associativity errors in the EBNF grammar for C- . Here is the corrected grammar : The following can be simplified : Code: statement-list = statement , { statement } | empty ; The simplified version is : Code: statement-list = [ statement , { statement } ] ; Regards, Jonathan EDIT #1 : Incorporated above changes into EBNF grammar Aeternitas modo est. Longa non est, paene nil. |
|||
02-14-2021, 11:40 AM
Post: #13
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
Great work Jonathan!
I've added "++" and "--" to the grammar, as well as "for" loops. I've modified the scanner accordingly and completed a parser recogniser. Now for the fun part: adding code generation. The C language combines all the power of assembly language with all the ease-of-use of assembly language |
|||
02-15-2021, 11:00 PM
Post: #14
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
(02-14-2021 11:40 AM)F-73P Wrote: I've added "++" and "--" to the grammar Just make sure you used the correct associativity and precedence for the post and pre increment and decrement operators. The post increment and decrement operators ( eg. "x++" and "x--" ) are one level of operator precedence up from the pre increment and decrement operators and they're also left associative. For the lower precedence pre increment and decrement operators ( eg. "++x" and "--x" ) note they're right associative. Furthermore, you have to take into account the differences between the two classes of post and pre increment and decrement operators with respect to how they alter an expression. What I mean is that something like "y = x++" will assign x to y and then increment x whereas "y = ++x" will increment x first and then assign it to y. You should consult the C99 grammar which can be found here ( Look in Annex A ). If you already know all of this, then please don't interpret my comments as patronizing you in any way, I just like to be sure Regards, Jonathan Aeternitas modo est. Longa non est, paene nil. |
|||
02-17-2021, 09:14 AM
(This post was last modified: 02-17-2021 09:15 AM by F-73P.)
Post: #15
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
Thanks Jonathan for your helpful advice, always appreciated.
I removed "int" from the type_specifier grammar production and replaced it with "apq" (arbitrary precision rational), "apf" (arbitrary precision float - not supported on the calculator yet) and "int32_t" (maybe?). I also added "read" and "write" from the TINY programming language, and post increment and decrement operators (just to increment/decrement variables for now, not used in assignments). There were a few aspects of the original BNF grammar I found confusing: 1) expression_stmt -> expression ; | ; I couldn't figure out what the ; on its own is for, so I changed the production to expression_stmt -> expression ; 2) expression -> var = expression | simple expression At first I thought this was a typo as I couldn't see how you can have e.g. expression -> var = expression Replace non-terminal expression with var = expression to obtain: expression -> var = var = expression e.g. a=b=100 and so I changed the production to expression -> var = simple expression | simple expression But then I saw your comment explaining that = is right-associative and that a=b=100 is indeed legal. So I'll go back and fix that I added some code generation instructions to the parser and now simple C programs like the factorial program below can be executed: The C language combines all the power of assembly language with all the ease-of-use of assembly language |
|||
02-17-2021, 10:48 PM
Post: #16
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
(02-17-2021 09:14 AM)F-73P Wrote: Thanks Jonathan for your helpful advice, always appreciated. You're welcome Quote:I removed "int" from the type_specifier grammar production and replaced it with "apq" (arbitrary precision rational), "apf" (arbitrary precision float - not supported on the calculator yet) and "int32_t" (maybe?). Nice Keep up the good work Quote:There were a few aspects of the original BNF grammar I found confusing: This is to allow empty expressions. You need this in cases such as : Code: while(x != somevalue) ; Which could be used for busy waiting ( eg. a "spinlock" ) Quote:2) expression -> var = expression | simple expression Yep In the EBNF grammar it is : Code: expression = { var , "=" } , simple-expression ; Quote:I added some code generation instructions to the parser and now simple C programs like the factorial program below can be executed: Nice work! I look forward to seeing how your project evolves Regards, Jonathan Aeternitas modo est. Longa non est, paene nil. |
|||
02-18-2021, 06:53 PM
Post: #17
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
(02-17-2021 10:48 PM)Jonathan Busby Wrote:Quote:There were a few aspects of the original BNF grammar I found confusing: The above was just a random example. If you google for eg. "C empty expression uses" then you'll get many examples of different uses of "null" or empty expressions other than my simple ( and somewhat contrived ) example. Regards, Jonathan Aeternitas modo est. Longa non est, paene nil. |
|||
02-19-2021, 10:39 PM
Post: #18
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
(02-18-2021 06:53 PM)Jonathan Busby Wrote:(02-17-2021 10:48 PM)Jonathan Busby Wrote: This is to allow empty expressions. You need this in cases such as : Also, googling for "C while loop empty loop statement" turns up a number of excellent examples of the usage of while loops with empty loop statements ( ie. "while(something) ;" ), especially the StackOverflow answers. Regards, Jonathan Aeternitas modo est. Longa non est, paene nil. |
|||
11-16-2021, 07:58 AM
(This post was last modified: 12-11-2021 02:27 AM by F-73P.)
Post: #19
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
I've created a new type specifier ("local") for local variables (which can be declared at the beginning of a function and are valid only in that function) and another type specifier for arrays ("array"). I've removed the other type specifiers from the grammar, as the first word of each object pushed onto the stack determines the object type. I've also added Boolean data types. Below is the EBNF grammar for the mC programming language (tokens are in uppercase/quotation marks):
Code:
Some test programs: 1) factorial using a while loop Code:
2) factorial using recursion Code:
3) squaring a number using nested for loops Code:
The C language combines all the power of assembly language with all the ease-of-use of assembly language |
|||
12-05-2021, 09:44 AM
(This post was last modified: 12-11-2021 08:59 AM by F-73P.)
Post: #20
|
|||
|
|||
RE: monic part 5: writing C programs on the calculator
As the compiler for the programming language (mC) introduced in the previous post nears completion, I would like to provide a brief overview of how the compiler works and share some source code.
The compiler is comprised of 3 main components - the scanner, parser and virtual machine. In this post I'll describe the scanner. Below is the C code for the scanner (which is based on the scanner in Kenneth C. Louden's excellent "Compiler Construction:Principles and Practice"): Code:
The code for the "definitions.h" header file is: Code:
The program source code is an array of characters terminated by the null character (i.e. a string). To improve readability, the string is padded with white space (blanks,tabs and newlines). Comments (text between /* and */) are ignored. The source code for the example program in "scanner.c v1.00" is: Code:
The numbers,letters,words and special symbols that make up the source code are called "lexemes". Each lexeme belongs to a special category called a "token". The scanner extracts each lexeme from the source code and determine its token type. To test the scanner, add the following two files to the project: Code:
Code:
Build the project, start a Debug Session (I use the Keil uVision5 simulator) and repeatedly step over the line "token=getToken();" in "parser.c" to extract the lexemes and determine the corresponding token (the lexemes are stored in the variable "lexemeString"): Code:
This information is passed to the parser, which will be the subject of the next post. The C language combines all the power of assembly language with all the ease-of-use of assembly language |
|||
« Next Oldest | Next Newest »
|
User(s) browsing this thread: 3 Guest(s)