
I'm reading the Dragon Book. The following is from the start of Section 3.1.3.


Question

I'm reading the Dragon Book. The following is from the start of Section 3.1.3.

When more than one lexeme can match a pattern, the lexical analyzer must provide the subsequent compiler phases additional information about the particular lexeme that matched. For example, the pattern for token number matches both 0 and 1, but it is extremely important for the code generator to know which lexeme was found in the source program. Thus, in many cases, the lexical analyzer returns to the parser not only a token name, but an attribute value that describes the lexeme represented by the token; the token name influences parsing decisions, while the attribute value influences translation of tokens after the parse.

From what I understand, the symbol table stores the variable name and some details like the type, scope, etc. So if a character 0 is found by the lexical analyzer, it matches the pattern for a number, so it uses the token name number, and the token becomes <number, attrb>.

As per the snippet I have cited above, I don't understand what data is stored in the symbol table for numbers. Is the value of the number stored in the symbol table?

Explanation / Answer

Typical lexers return a sequence of pairs, where each pair consists of the token type and an optional value. For a lexeme such as 12345, the token type will be something like "number" and the value will be 12345. If the lexer only emitted the information that there was a numeric constant in the input, then the later phases of the compiler would have no way to know which number it was, and that is obviously important.
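To make the pair idea concrete, here is a minimal sketch of such a lexer in Python. The names Token, TOKEN_SPEC and tokenize are made up for this illustration, and the token set (number, id, op) is an assumption, not the Dragon Book's grammar.

```python
import re
from typing import Iterator, NamedTuple, Optional, Union


class Token(NamedTuple):
    """A <token name, attribute value> pair (hypothetical type for this sketch)."""
    name: str
    value: Optional[Union[int, str]] = None


# Assumed toy token set: one regular expression per token name.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("op", r"[+\-*/=()]"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text: str) -> Iterator[Token]:
    """Yield one Token per lexeme; whitespace is matched but not emitted."""
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        pos = match.end()
        kind = match.lastgroup  # name of the alternative that matched
        lexeme = match.group()
        if kind == "skip":
            continue
        if kind == "number":
            # Every number lexeme gets the same token name; the attribute
            # value records which number it actually was.
            yield Token("number", int(lexeme))
        else:
            yield Token(kind, lexeme)


if __name__ == "__main__":
    for token in tokenize("x = 0 + 12345"):
        print(token)
```

Here 0 and 12345 both come out with the token name number, and only the attribute value tells them apart, which is the point the quoted passage is making.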

I don't understand why you started talking about the symbol table at the end of your question, and I think you may be confused. The quotation you gave says nothing at all about the symbol table, which usually belongs to a later phase of compilation. A symbol table maps symbols (that is, names) to information about them, such as type and scope. The quotation in your question is about tokens, not symbols. Tokens are not usually stored in a symbol table, and there is no reason to store numbers in a symbol table: the value of a number travels with the token as its attribute. Typically a parser will have a stack, and will push whole tokens onto the stack, and pop them off again as needed.
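As a rough illustration of the stack idea, the sketch below pushes whole tokens onto a stack and pops them off when it recognises "number + number". It is a toy for sums only, not the LR parsing algorithm from the book, and the names Token and evaluate_sum are invented for this example. Note that no symbol table is involved: each number's value is read from the token itself.

```python
from typing import List, NamedTuple, Optional, Union


class Token(NamedTuple):
    """A <token name, attribute value> pair (hypothetical type for this sketch)."""
    name: str
    value: Optional[Union[int, str]] = None


PLUS = Token("op", "+")


def evaluate_sum(tokens: List[Token]) -> int:
    """Evaluate input of the form: number (+ number)* using a stack of tokens."""
    stack: List[Token] = []
    for token in tokens:
        stack.append(token)  # shift: push the whole token, attribute included
        # reduce: if the top of the stack is "number + number", pop the three
        # tokens and push a single number token carrying their sum.
        if (
            len(stack) >= 3
            and stack[-1].name == "number"
            and stack[-2] == PLUS
            and stack[-3].name == "number"
        ):
            right = stack.pop()
            stack.pop()  # discard the "+" token
            left = stack.pop()
            stack.append(Token("number", left.value + right.value))
    if len(stack) != 1 or stack[0].name != "number":
        raise SyntaxError("expected input of the form: number (+ number)*")
    return stack[0].value


if __name__ == "__main__":
    tokens = [Token("number", 0), PLUS, Token("number", 1), PLUS, Token("number", 12345)]
    print(evaluate_sum(tokens))
```

The token name (number, op) decides when a reduction applies, while the attribute value is what the computation actually uses, matching the split the quoted passage describes.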