• Java 2 Platform Standard Edition 5.0 API specification
• OpenJDK source code repository for library classes
• Source code from Data structures and problem-solving using Java, third edition
One of the traditional difficulties for novice Scheme programmers is making sure that the parentheses in each expression are correctly balanced and nested. Today we'll write a Java class that provides several methods that can help with this problem.
A correctly balanced and nested expression either contains no parentheses at all, or else consists of a left parenthesis, zero or more correctly balanced and nested expressions, and a right parenthesis, in that order. For the purpose of parenthesis-matching, we'll ignore everything except parentheses; our objective is simply to make sure that every left parenthesis precedes and mates to a right parenthesis, and vice versa, and that everything that is enclosed in a mated pair of parentheses is similarly balanced.
We can test this property with the help of a stack, as follows. We start with an empty stack. Given a string -- the expression to be checked -- we walk through the characters of the string from left to right, looking at each one. Whenever we find a left parenthesis, we push it onto the stack. Whenever we find a right parenthesis, we pop the stack. We ignore all other characters. If, at the end of this process, the stack is once more empty, then we encountered exactly the same number of left and right parentheses; moreover, each right parenthesis can be mated to the left parenthesis that we popped off the stack when we encountered it.
A stack is the appropriate data structure to use here, because of its "last in, first out" discipline: Any right parenthesis that we encounter should be mated to the most recently encountered left parenthesis that has not already been mated (and hence removed from the stack).
What if the nesting of the parentheses is incorrect, as in the string "())())((()", say? Having equal numbers of left and right parentheses doesn't guarantee that they can be mated up correctly. Using a stack takes care of that problem, too: In processing an expression that has equal numbers of left and right parentheses but incorrect nesting, we'll always reach a point at which we try to pop an empty stack. In that case, we'll just catch the resulting exception and report that the expression isn't correctly nested and balanced.
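The stack strategy described above might be sketched roughly as follows. This is only one possible shape, not a model solution: the class name BalanceSketch is made up for illustration, and the choice of java.util.Stack (whose pop method throws EmptyStackException on an empty stack) is an assumption.

```java
import java.util.EmptyStackException;
import java.util.Stack;

// Hypothetical class name, used only for this illustration.
class BalanceSketch {

    // Returns true if every left parenthesis in the expression is mated
    // to a later right parenthesis, and vice versa.
    static boolean isBalanced(String expression) {
        Stack<Character> pending = new Stack<Character>();
        try {
            for (int i = 0; i < expression.length(); i++) {
                char c = expression.charAt(i);
                if (c == '(') {
                    pending.push(c);   // remember an unmated left parenthesis
                } else if (c == ')') {
                    pending.pop();     // mate it; throws if nothing is pending
                }
                // every other character is ignored
            }
        } catch (EmptyStackException e) {
            // a right parenthesis appeared with no left parenthesis to mate
            return false;
        }
        // anything still on the stack is an unmated left parenthesis
        return pending.isEmpty();
    }
}
```

With this sketch, BalanceSketch.isBalanced("(define (f x) (* x x))") is true, while both "())())((()" (a pop of an empty stack) and "(()" (a stack that is not empty at the end) give false.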
Create a paren package, and within that package a Parser class. Write a method isBalanced for the Parser
class that takes a String as argument and returns a boolean value
indicating whether the parentheses in that string are correctly nested and
balanced, using the stack strategy described above. The stack can be a
local variable of the method. You can use java.util.Stack, java.util.LinkedList, or a Stack class of your own design and
construction, as you prefer.
Write a method, closeOff, that takes a String as argument and returns a similar String, but with as many right parentheses added at the end as are required to mate all the unmated left parentheses in the given string. So, for instance, if the argument is "((() ()) (()", closeOff should return "((() ()) (()))".
Traditionally, grouping in mathematical expressions is indicated not only by parentheses, but also by square brackets, curly braces, angle brackets, and even more exotic symbol pairs. Usually, such expressions aren't considered well-formed unless the symbol that is placed at the left end of a group mates to a symbol at the right end that is its mirror image in shape -- a left square bracket to a right square bracket, a less-than character to a greater-than character, and so on.
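To make the mirror-image rule concrete, here is a rough sketch of how a stack can check it, and how the same stack can tell us which right-end characters are still missing. The class name MirrorPairs and the fixed pair of strings LEFTS and RIGHTS are assumptions made for illustration; matching positions in the two strings are taken to be mates.

```java
import java.util.Stack;

// Hypothetical class name, used only for this illustration.
class MirrorPairs {
    // Assumed symbol sets: LEFTS.charAt(k) mates with RIGHTS.charAt(k).
    private static final String LEFTS = "([{<";
    private static final String RIGHTS = ")]}>";

    // True if every grouping symbol is mated to its mirror image.
    static boolean isBalanced(String text) {
        Stack<Character> open = new Stack<Character>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            int k = RIGHTS.indexOf(c);
            if (LEFTS.indexOf(c) >= 0) {
                open.push(c);
            } else if (k >= 0) {
                if (open.isEmpty()) {
                    return false;              // nothing to mate with
                }
                char left = open.pop();
                if (left != LEFTS.charAt(k)) {
                    return false;              // wrong shape of mate
                }
            }
        }
        return open.isEmpty();
    }

    // Appends the right-end mates of all unmated left-end characters.
    // A right-end character that has no mate in the input is skipped.
    static String closeOff(String text) {
        Stack<Character> open = new Stack<Character>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (LEFTS.indexOf(c) >= 0) {
                open.push(c);
            } else if (RIGHTS.indexOf(c) >= 0 && !open.isEmpty()) {
                open.pop();
            }
        }
        StringBuilder result = new StringBuilder(text);
        while (!open.isEmpty()) {
            char left = open.pop();            // most recent unmated symbol first
            result.append(RIGHTS.charAt(LEFTS.indexOf(left)));
        }
        return result.toString();
    }
}
```

Under these assumptions, MirrorPairs.isBalanced("({<}>)") is false, and MirrorPairs.closeOff("([{") yields "([{}])". How closeOff should treat input whose existing symbols are already mismatched is left open here; this sketch simply pops without checking.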
Revise the isBalanced method so that it pushes any of the left-end characters (left parenthesis, left square bracket, left curly brace, less-than character) onto the stack when it is encountered, and pops the stack when any of the right-end characters (right parenthesis, right square bracket, right curly brace, greater-than character) is encountered, but returns false if it discovers that the left-end character popped in this way is not the mirror-image character of the right-end character that has just been encountered. (So, for instance, giving it the argument "({<}>)" should result in false.)
Revise the closeOff method to finish off an unbalanced String, as in the previous section, but have it supply the correct right-end mates for the unmated left-end characters in the given String.
Give the Parser class a constructor that takes two String values as arguments. The two strings should be of equal length.
A Parser should treat the characters in the first string as left-end
characters and those in the second string as the corresponding right-end
characters. Adapt the isBalanced and closeOff methods to
treat these characters as the grouping symbols. (So new
Parser("([{<", ")]}>") should reproduce the behavior described in this
section, and new Parser("(", ")") the behavior from the previous
section.)
Give Parser a method that takes a correctly balanced string and returns a list containing every substring of that string that begins with a left-end character and ends with the mated right-end character. (Hint: When you push the left-end character onto the stack, store its position in the string along with it.)
The Parser will
work pretty well on actual Scheme code, provided that none of the comments,
string literals, or character literals that occur in the code contains any
of the symbols used for grouping. If we can just arrange for isBalanced and closeOff to ignore comments, string literals, and
character literals, we have a working utility. Write a static method that
takes any String as argument and returns a similar string, but with
the Scheme comments, character literals, and string literals removed.
(Note: Any semicolon that occurs in the text of a Scheme expression but is
not part of a string or character literal marks the beginning of a comment
that ends at the next following newline character. Any occurrence of the
sequence #\ that is not part of a string literal marks the beginning
of a character literal. The rest of the character literal, after the #\, is either a single character or one of the names space or newline. And
any occurrence of a double quotation mark that is not part of a comment or
character literal indicates the beginning of a string literal that ends at
the next following unescaped double quotation mark. As one traverses the
interior of the string literal from left to right, any unescaped occurrence
of a backslash causes the character that follows it to be escaped.)
Revise isBalanced and closeOff so that they strip out all Scheme comments, character literals, and string literals before attempting to mate the parentheses. Make sure that the string that closeOff returns retains all of these comments, character literals, and string literals, however.
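The stripping rules in the note above could be sketched as a single left-to-right scan. The names SchemeStripper and strip are hypothetical, and the sketch rests on a few assumptions: the newline that ends a comment is kept, the only multi-character literals recognized after #\ are the names space and newline, and an unterminated string literal is taken to run to the end of the input.

```java
// Hypothetical class name, used only for this illustration.
class SchemeStripper {

    // Returns the source with comments, character literals, and string
    // literals removed; all other characters are copied unchanged.
    static String strip(String source) {
        StringBuilder out = new StringBuilder();
        int n = source.length();
        int i = 0;
        while (i < n) {
            char c = source.charAt(i);
            if (c == ';') {
                // comment: skip up to, but not including, the next newline
                while (i < n && source.charAt(i) != '\n') {
                    i++;
                }
            } else if (c == '#' && i + 1 < n && source.charAt(i + 1) == '\\') {
                // character literal: skip "#\" and then the character or name
                i += 2;
                if (source.startsWith("newline", i)) {
                    i += 7;
                } else if (source.startsWith("space", i)) {
                    i += 5;
                } else if (i < n) {
                    i++;
                }
            } else if (c == '"') {
                // string literal: skip to the next unescaped double quote
                i++;
                while (i < n && source.charAt(i) != '"') {
                    if (source.charAt(i) == '\\') {
                        i++;   // the backslash escapes the next character
                    }
                    i++;
                }
                i++;           // step past the closing quote, if there was one
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }
}
```

With a helper like this, isBalanced can run its scan on strip(text), and closeOff can work out the missing right-end characters from strip(text) while appending them to the original text. One case to watch: if the original text ends inside a comment, characters appended directly to it would land inside that comment.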