class SyntaxTree::Parser
Parser is a subclass of Ripper, the parser interface in Ruby's standard library. It subscribes to the stream of tokens and nodes coming from the parser and builds up a syntax tree.
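The subclassing approach described above can be sketched with nothing but the standard library. The following is a minimal illustration, not syntax_tree's own code: `CommentCollector` is a hypothetical class that overrides one Ripper scanner event (`on_comment`) and records what it sees, in the same spirit as Parser collecting its `comments` list.

```ruby
require "ripper"

# Hypothetical illustration class; it is not part of syntax_tree.
class CommentCollector < Ripper
  attr_reader :comments

  def initialize(source, *)
    super
    @comments = []
  end

  # Ripper fires this scanner event once for every comment token.
  # `lineno` is provided by Ripper and reports the token's line.
  def on_comment(value)
    @comments << [lineno, value.chomp]
    value
  end
end

collector = CommentCollector.new("# hello\nfoo = 1 # trailing\n")
collector.parse
p collector.comments
```

Each entry pairs a line number with the comment text. The real Parser builds node objects rather than plain arrays.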
Attributes
comments[R]
line_counts[R]
- Array[ SingleByteString | MultiByteString ]
-
the list of objects that represent the start of each line in character offsets
source[R]
- String
-
the source being parsed
tokens[R]
- Array[ untyped ]
-
a running list of tokens that have been found in the source. This list changes frequently because certain nodes “consume” these tokens to determine their bounds.
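To make the idea of "consuming" a token concrete, here is a small sketch under stated assumptions. The `Token` struct and the `consume_keyword` helper are hypothetical names invented for this illustration; syntax_tree's real token nodes and lookup helpers are different. The point is only the pattern: search backward for the most recent matching token, remove it from the running list, and use its position as the start of the node.

```ruby
# Hypothetical token shape for illustration only.
Token = Struct.new(:type, :value, :start_char)

# Find the most recent keyword token with the given name, delete it
# from the list, and return it so the caller can read its position.
def consume_keyword(tokens, name)
  index = tokens.rindex { |token| token.type == :kw && token.value == name }
  raise ArgumentError, "no #{name} keyword found" unless index

  tokens.delete_at(index)
end

# Tokens as they might be recorded while scanning "def foo; end".
tokens = [
  Token.new(:kw, "def", 0),
  Token.new(:ident, "foo", 4),
  Token.new(:kw, "end", 9)
]

def_keyword = consume_keyword(tokens, "def")
puts def_keyword.start_char # prints 0
puts tokens.length          # prints 2
```

Because consumed tokens are removed, the list shrinks as nodes are built.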
Public Class Methods
new(source, *)
Calls superclass method
# File lib/syntax_tree/parser.rb, line 116
def initialize(source, *)
  super

  # We keep the source around so that we can refer back to it when we're
  # generating the AST. Sometimes it's easier to just reference the source
  # string when you want to check if it contains a certain character, for
  # example.
  @source = source

  # This is the full set of comments that have been found by the parser.
  # It's a running list. At the end of every block of statements, they will
  # go in and attempt to grab any comments that are on their own line and
  # turn them into regular statements. So at the end of parsing the only
  # comments left in here will be comments on lines that also contain code.
  @comments = []

  # This is the current embdoc (comments that start with =begin and end with
  # =end). Since they can't be nested, there's no need for a stack here, as
  # there can only be one active. These end up getting dumped into the
  # comments list before getting picked up by the statements that surround
  # them.
  @embdoc = nil

  # This is an optional node that can be present if the __END__ keyword is
  # used in the file. In that case, this will represent the content after
  # that keyword.
  @__end__ = nil

  # Heredocs can actually be nested together if you're using interpolation,
  # so this is a stack of heredoc nodes that are currently being created.
  # When we get to the token that finishes off a heredoc node, we pop the
  # top one off. If there are others surrounding it, then the body events
  # will now be added to the correct nodes.
  @heredocs = []

  # This is a running list of tokens that have fired. It's useful mostly for
  # maintaining location information. For example, if you're inside the
  # handle of a def event, then in order to determine where the AST node
  # started, you need to look backward in the tokens to find a def keyword.
  # Most of the time, when a parser event consumes one of these events, it
  # will be deleted from the list. So ideally, this list stays pretty short
  # over the course of parsing a source string.
  @tokens = TokenList.new

  # Here we're going to build up a list of SingleByteString or
  # MultiByteString objects. They're each going to represent a string in the
  # source. They are used by the `char_pos` method to determine where we are
  # in the source string.
  @line_counts = []
  last_index = 0

  @source.each_line do |line|
    @line_counts << if line.size == line.bytesize
      SingleByteString.new(last_index)
    else
      MultiByteString.new(last_index, line)
    end

    last_index += line.size
  end

  # Make sure line counts is filled out with the first and last line at
  # minimum so that it has something to compare against if the parser is in
  # a lineno=2 state for an empty file.
  @line_counts << SingleByteString.new(0) if @line_counts.empty?
  @line_counts << SingleByteString.new(last_index)
end
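The line-offset bookkeeping at the end of the constructor can be illustrated in isolation. The sketch below is a simplified stand-in, not the library's implementation: `line_starts` and `char_offset` are hypothetical helpers, and plain integers replace the SingleByteString and MultiByteString objects. It assumes, as Ripper does, that columns are reported in bytes, which is why lines containing multibyte characters need a conversion step.

```ruby
# Hypothetical helper: character offset at which each line begins,
# plus one trailing entry for the end of the source.
def line_starts(source)
  starts = []
  last_index = 0

  source.each_line do |line|
    starts << last_index
    last_index += line.size # size counts characters, not bytes
  end

  # Mirror the constructor: an empty source still gets a first entry.
  starts << 0 if starts.empty?
  starts << last_index
  starts
end

# Hypothetical helper: convert a 1-based line number and a byte column
# into a character offset. Assumes byte_column falls on a character
# boundary, as token columns do.
def char_offset(source, starts, lineno, byte_column)
  line = source.lines[lineno - 1] || ""
  starts[lineno - 1] + line.byteslice(0, byte_column).size
end

source = "é = 1\nfoo\n"
starts = line_starts(source)
p starts                              # prints [0, 6, 10]
p char_offset(source, starts, 1, 3)   # prints 2, the "=" after "é "
p char_offset(source, starts, 2, 1)   # prints 7
```

On a line with only single-byte characters the byte column and character column coincide, so the conversion is a plain addition. That is the distinction the two string classes in the constructor encode.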