class SyntaxTree::Parser

Parser is a subclass of Ripper that subscribes to the stream of tokens and nodes emitted during parsing and builds up a syntax tree from them.
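As a rough illustration of what "subscribing" to Ripper means, the sketch below subclasses Ripper from the standard library and overrides a single scanner event handler. The `CommentCollector` class is hypothetical and much simpler than SyntaxTree::Parser; it only shows the general mechanism of defining `on_*` methods that Ripper calls while it parses.

```ruby
require "ripper"

# Hypothetical example class, not part of SyntaxTree. It records every
# `#` comment that Ripper reports, along with the line it was found on.
class CommentCollector < Ripper
  attr_reader :comments

  def initialize(source, *)
    super
    @comments = []
  end

  # Ripper calls this scanner event handler once per comment token. The
  # token value includes the trailing newline, so we chomp it.
  def on_comment(value)
    @comments << [lineno, value.chomp]
    value
  end
end

collector = CommentCollector.new("# first\nfoo = 1 # second\n")
collector.parse
collector.comments.each { |line, text| puts "#{line}: #{text}" }
```

SyntaxTree::Parser follows the same pattern but defines handlers for many more events and builds node objects rather than plain arrays.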

Attributes

comments[R]
Array[ Comment | EmbDoc ]

the list of comments that have been found while parsing the source.

line_counts[R]
Array[ SingleByteString | MultiByteString ]

the list of objects that represent the start of each line in character offsets

source[R]
String

the source being parsed

tokens[R]
Array[ untyped ]

a running list of tokens that have been found in the source. This list changes a lot as certain nodes will “consume” these tokens to determine their bounds.
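The idea of a node "consuming" a token can be sketched as follows. This is an assumption-laden illustration rather than the library's implementation: the `Token` struct and the `consume` helper are hypothetical names, and a plain Array stands in for the parser's token list.

```ruby
# Hypothetical token record: a type, the matched text, and its start offset.
Token = Struct.new(:type, :value, :offset)

# Search backward for the most recent matching token, remove it from the
# running list, and return it so the caller can use its location.
def consume(tokens, type, value)
  index = tokens.rindex { |token| token.type == type && token.value == value }
  raise ArgumentError, "no #{type} token #{value.inspect} found" unless index

  tokens.delete_at(index)
end

tokens = [
  Token.new(:keyword, "def", 0),
  Token.new(:ident, "foo", 4),
  Token.new(:keyword, "end", 9)
]

# A handler for a method definition would look back for the `def` keyword
# to learn where the node starts.
start_token = consume(tokens, :keyword, "def")
puts start_token.offset
puts tokens.size
```

Removing the token once it has been used keeps the list short and prevents a later node from mistakenly claiming the same keyword.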

Public Class Methods

new(source, *)
Calls superclass method
# File lib/syntax_tree/parser.rb, line 116
def initialize(source, *)
  super

  # We keep the source around so that we can refer back to it when we're
  # generating the AST. Sometimes it's easier to just reference the source
  # string when you want to check if it contains a certain character, for
  # example.
  @source = source

  # This is the full set of comments that have been found by the parser.
  # It's a running list. At the end of every block of statements, they will
  # go in and attempt to grab any comments that are on their own line and
  # turn them into regular statements. So at the end of parsing the only
  # comments left in here will be comments on lines that also contain code.
  @comments = []

  # This is the current embdoc (comments that start with =begin and end with
  # =end). Since they can't be nested, there's no need for a stack here, as
  # there can only be one active. These end up getting dumped into the
  # comments list before getting picked up by the statements that surround
  # them.
  @embdoc = nil

  # This is an optional node that can be present if the __END__ keyword is
  # used in the file. In that case, this will represent the content after
  # that keyword.
  @__end__ = nil

  # Heredocs can actually be nested together if you're using interpolation,
  # so this is a stack of heredoc nodes that are currently being created.
  # When we get to the token that finishes off a heredoc node, we pop the
  # top one off. If there are others surrounding it, then the body events
  # will now be added to the correct nodes.
  @heredocs = []

  # This is a running list of tokens that have fired. It's useful mostly for
  # maintaining location information. For example, if you're inside the
  # handler of a def event, then in order to determine where the AST node
  # started, you need to look backward in the tokens to find a def keyword.
  # Most of the time, when a parser event consumes one of these tokens, it
  # will be deleted from the list. So ideally, this list stays pretty short
  # over the course of parsing a source string.
  @tokens = TokenList.new

  # Here we're going to build up a list of SingleByteString or
  # MultiByteString objects. They're each going to represent a line in the
  # source. They are used by the `char_pos` method to determine where we are
  # in the source string.
  @line_counts = []
  last_index = 0

  @source.each_line do |line|
    @line_counts << if line.size == line.bytesize
      SingleByteString.new(last_index)
    else
      MultiByteString.new(last_index, line)
    end

    last_index += line.size
  end

  # Make sure line counts is filled out with the first and last line at
  # minimum so that it has something to compare against if the parser is in
  # a lineno=2 state for an empty file.
  @line_counts << SingleByteString.new(0) if @line_counts.empty?
  @line_counts << SingleByteString.new(last_index)
end
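The line bookkeeping at the end of the constructor can be approximated with plain integers. The sketch below is a simplified stand-in, not the library's code: `line_starts` and `char_offset` are hypothetical helpers that replace the SingleByteString and MultiByteString objects, and `char_offset` assumes that the column it receives is a byte offset within the line, which is why lines containing multibyte characters need a conversion step.

```ruby
# Record the character offset at which each line starts, plus one final
# entry for the end of the source, mirroring the loop in the constructor.
def line_starts(source)
  starts = []
  last_index = 0

  source.each_line do |line|
    starts << last_index
    last_index += line.size
  end

  starts << 0 if starts.empty?
  starts << last_index
  starts
end

# Convert a 1-based line number and a byte column into a character offset.
# Counting the characters in the leading bytes handles multibyte lines.
def char_offset(source, starts, lineno, byte_column)
  line = source.lines[lineno - 1] || ""
  starts[lineno - 1] + line.byteslice(0, byte_column).size
end

source = "a = 1\nbé = 2\n"
starts = line_starts(source)
offset = char_offset(source, starts, 2, 3)

puts starts.inspect
puts source[offset].inspect
```

For a line made up only of single-byte characters the byte column and the character column are the same, so the conversion reduces to an addition; that is the case the constructor detects with `line.size == line.bytesize`.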

Helper methods

Ripper event handlers
