How does a JavaScript parser work?

I am trying to understand how JS is actually parsed. But my searches return either some very vaguely documented "parser/generator" project (I don't even know what that means), or instructions for parsing JS with the JS engine through some magic "parse" method. I don't want to wade through a pile of code and spend forever trying to understand it (I could, but it would take too long).

I want to know how an arbitrary string of JS code is actually turned into objects, functions, variables, and so on. I also want to know the procedures and techniques that turn that string into things that are stored, referenced, and executed.

Is there any documentation / links for this?

+3




2 answers


Parsers probably work in all sorts of ways, but fundamentally they go through a tokenization stage first and then hand the result to the compiler, which turns it into a program if it can. For example, given:

function foo(a) {
  alert(a);
}

      

the parser will skip any leading whitespace up to the first character, the letter "f". It will collect characters until it gets something that doesn't belong, here the whitespace, which indicates the end of the token. It starts again with the "f" of "foo" and continues until it gets to "(", so it now has the tokens "function" and "foo". It knows "(" is a token on its own, so that makes 3 tokens. It then gets "a" followed by ")", which are two more tokens to make 5, and so on.

Whitespace is only needed between tokens that would otherwise be ambiguous (for example, there must be either whitespace or another token between "function" and "foo").
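The tokenization step described above can be sketched in a few lines. This is only an illustration, not how any real engine does it: the `tokenize` name and the `{ type, value }` token shape are assumptions, and the sketch knows nothing about numbers, strings, comments, or multi-character operators (anything that is not an identifier becomes a one-character punctuator).

```javascript
// Minimal tokenizer sketch (hypothetical helper, not taken from a real engine).
function tokenize(source) {
  const tokens = [];
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    if (/\s/.test(ch)) {
      // Whitespace only separates tokens; it never becomes one.
      i++;
    } else if (/[A-Za-z_$]/.test(ch)) {
      // Collect characters until one does not belong to the identifier.
      const start = i;
      while (i < source.length && /[A-Za-z0-9_$]/.test(source[i])) i++;
      tokens.push({ type: "identifier", value: source.slice(start, i) });
    } else {
      // Everything else is treated as a single-character token.
      tokens.push({ type: "punctuator", value: ch });
      i++;
    }
  }
  return tokens;
}

const tokens = tokenize("function foo(a) {\n  alert(a);\n}");
console.log(tokens.map((t) => t.value).join(" "));
// function foo ( a ) { alert ( a ) ; }
```

Note that "function" comes out as a plain identifier here; deciding that it is a keyword is left to the next stage.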



Once tokenization is complete, the tokens go to the compiler, which sees that "function" is an identifier and interprets it as the keyword "function". It then gets "foo", an identifier that the language grammar says is the name of the function. Then "(" marks the opening of a grouping operator and hence the beginning of the formal parameter list, and so on.
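As a rough sketch of that second stage, the code below walks the tokens of a function declaration in the order just described. Everything in it is an assumption made for illustration: the `parseFunctionDeclaration` name, the one-line regex tokenizer, the returned object shape, and the decision to keep the body as raw tokens instead of parsing statements. A real engine follows the full ECMAScript grammar.

```javascript
// Keyword list is deliberately tiny; it exists only so "function" is not accepted as a name.
const KEYWORDS = new Set(["function"]);

function parseFunctionDeclaration(source) {
  // Crude tokenization: an identifier, or any single non-whitespace character.
  const tokens = source.match(/[A-Za-z_$][A-Za-z0-9_$]*|\S/g) || [];
  let pos = 0;

  // Consume one specific token or fail.
  function expect(value) {
    if (tokens[pos] !== value) {
      throw new SyntaxError(`Expected "${value}" but found "${tokens[pos]}"`);
    }
    pos++;
  }

  // Consume one identifier that is not a keyword.
  function identifier() {
    const name = tokens[pos];
    if (name === undefined || !/^[A-Za-z_$]/.test(name) || KEYWORDS.has(name)) {
      throw new SyntaxError(`Expected identifier but found "${name}"`);
    }
    pos++;
    return name;
  }

  expect("function");          // the keyword
  const name = identifier();   // the grammar says this is the function name
  expect("(");                 // start of the formal parameter list
  const params = [];
  if (tokens[pos] !== ")") {
    params.push(identifier());
    while (tokens[pos] === ",") {
      pos++;
      params.push(identifier());
    }
  }
  expect(")");
  expect("{");

  // Collect body tokens up to the matching closing brace.
  const body = [];
  let depth = 1;
  while (pos < tokens.length) {
    const t = tokens[pos++];
    if (t === "{") depth++;
    else if (t === "}" && --depth === 0) {
      return { type: "FunctionDeclaration", name, params, body };
    }
    body.push(t);
  }
  throw new SyntaxError("Unterminated function body");
}

console.log(JSON.stringify(parseFunctionDeclaration("function foo(a) {\n  alert(a);\n}")));
// {"type":"FunctionDeclaration","name":"foo","params":["a"],"body":["alert","(","a",")",";"]}
```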

Compilers can process tokens one at a time, or they can grab them in chunks, or do all sorts of weird things to make them run faster.

You can also read "How do C/C++ parsers work?", which gives a few more pointers. Or just use Google.

+2




While it doesn't closely match what the actual JS engines do, you might be interested in reading Douglas Crockford's article on Top Down Operator Precedence, which includes code for a small working lexer and parser, written in the subset of JavaScript that it parses. The code is very readable and concise (with good accompanying explanations), and it at least gives you an outline of how a real implementation might work.
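To give a feel for the technique, here is a minimal sketch of top-down operator precedence parsing for arithmetic expressions. It is not Crockford's code: the binding-power table, the `parseExpression` and `show` names, and the node shapes are all assumptions chosen for illustration.

```javascript
// Binding powers: a higher number binds tighter. Values are illustrative only.
const BINDING_POWER = new Map([["+", 10], ["-", 10], ["*", 20], ["/", 20]]);

function parseExpression(source) {
  // Crude tokenization: a number, an identifier, or any single non-whitespace character.
  const tokens = source.match(/\d+|[A-Za-z_$][A-Za-z0-9_$]*|\S/g) || [];
  let pos = 0;

  // How a token behaves at the start of an expression.
  function prefix(token) {
    if (token === undefined) throw new SyntaxError("Unexpected end of input");
    if (/^\d+$/.test(token)) return { type: "Number", value: Number(token) };
    if (/^[A-Za-z_$]/.test(token)) return { type: "Name", name: token };
    if (token === "(") {
      const inner = expression(0);
      if (tokens[pos++] !== ")") throw new SyntaxError('Expected ")"');
      return inner;
    }
    throw new SyntaxError(`Unexpected token "${token}"`);
  }

  function expression(rightBindingPower) {
    let left = prefix(tokens[pos++]);
    // Keep absorbing infix operators that bind tighter than the caller's operator.
    while ((BINDING_POWER.get(tokens[pos]) || 0) > rightBindingPower) {
      const operator = tokens[pos++];
      const right = expression(BINDING_POWER.get(operator));
      left = { type: "Binary", operator, left, right };
    }
    return left;
  }

  const tree = expression(0);
  if (pos < tokens.length) throw new SyntaxError(`Unexpected token "${tokens[pos]}"`);
  return tree;
}

// Render the tree with explicit parentheses so the grouping is visible.
function show(node) {
  if (node.type === "Binary") {
    return `(${show(node.left)} ${node.operator} ${show(node.right)})`;
  }
  return node.type === "Number" ? String(node.value) : node.name;
}

console.log(show(parseExpression("1 + 2 * 3 - a")));
// ((1 + (2 * 3)) - a)
```

The whole precedence logic lives in one loop and one table, which is why the approach stays compact as operators are added.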



A more common method than Crockford's Top Down Operator Precedence is recursive descent, which is used in Narcissus, a complete implementation of JS in JS.
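In recursive descent, each grammar rule becomes one function, and the functions call each other the way the rules refer to each other. The sketch below is not taken from Narcissus; it is a minimal illustration for integer arithmetic, and the `evaluate` name and the three-rule grammar in the comments are assumptions. To keep it short it computes the value directly instead of building a tree.

```javascript
function evaluate(source) {
  // Crude tokenization: a run of digits, or any single non-whitespace character.
  const tokens = source.match(/\d+|\S/g) || [];
  let pos = 0;

  // expression := term (("+" | "-") term)*
  function expression() {
    let value = term();
    while (tokens[pos] === "+" || tokens[pos] === "-") {
      const operator = tokens[pos++];
      const right = term();
      value = operator === "+" ? value + right : value - right;
    }
    return value;
  }

  // term := factor (("*" | "/") factor)*
  function term() {
    let value = factor();
    while (tokens[pos] === "*" || tokens[pos] === "/") {
      const operator = tokens[pos++];
      const right = factor();
      value = operator === "*" ? value * right : value / right;
    }
    return value;
  }

  // factor := NUMBER | "(" expression ")"
  function factor() {
    const token = tokens[pos++];
    if (token === undefined) throw new SyntaxError("Unexpected end of input");
    if (/^\d+$/.test(token)) return Number(token);
    if (token === "(") {
      const value = expression();
      if (tokens[pos++] !== ")") throw new SyntaxError('Expected ")"');
      return value;
    }
    throw new SyntaxError(`Unexpected token "${token}"`);
  }

  const result = expression();
  if (pos < tokens.length) throw new SyntaxError(`Unexpected token "${tokens[pos]}"`);
  return result;
}

console.log(evaluate("2 * (3 + 4) - 5"));
// 9
```

Operator precedence falls out of the nesting: `term` is called from `expression`, so multiplication is grouped before addition ever sees its operands.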

+1








