Below you’ll find three different ways of pattern matching XML data within Scala. XML is a first-class entity within Scala: you write XML directly in your source code and the Scala compiler replaces it with correctly typed objects from the scala.xml package. XML literals can be used both to construct values and as patterns in a match, although there are limits on what an XML pattern may contain.
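As a minimal sketch of both uses, the snippet below builds a value from an XML literal and then matches on one. It assumes a current Scala 2 compiler (2.13) with the scala-xml module on the classpath, so its syntax is newer than the listing in this post; the object name XmlLiteralSketch and the element names are made up for illustration.

```scala
import scala.xml._

// Hypothetical names throughout; assumes Scala 2.13 plus the scala-xml module.
object XmlLiteralSketch {
  // An XML literal is an ordinary expression of type scala.xml.Elem.
  val greeting: Elem = <greeting>hello</greeting>

  // The same syntax used as a pattern; {body} binds the element's single child node.
  def describe(node: Node): String = node match {
    case <greeting>{body}</greeting> => "greeting: " + body.text
    case other                       => "other: " + other.label
  }

  def main(args: Array[String]): Unit = {
    println(describe(greeting))
    println(describe(<farewell>bye</farewell>))
  }
}
```

The first case matches only when the element has exactly one child; the sequence form {c @ _ *} used in the listing further down binds any number of children.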
These are by no means the only ways to transform and work with XML. The scala.xml.transform package has several utility classes that can ease the construction of rewrite rules.
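For a rough idea of what a rewrite rule looks like, here is a small sketch using RewriteRule and RuleTransformer from scala.xml.transform. It assumes Scala 2.13 with the scala-xml module on the classpath; the rule itself (renaming i elements to em) and the names RewriteSketch, italicsToEm and rewrite are invented for this illustration.

```scala
import scala.xml._
import scala.xml.transform.{RewriteRule, RuleTransformer}

// Hypothetical example rule; assumes Scala 2.13 plus the scala-xml module.
object RewriteSketch {
  // Renames every <i> element to <em>, leaving its attributes and children untouched.
  val italicsToEm: RewriteRule = new RewriteRule {
    override def transform(n: Node): Seq[Node] = n match {
      case e: Elem if e.label == "i" => e.copy(label = "em")
      case other                     => other
    }
  }

  // RuleTransformer applies the rule throughout the whole tree.
  def rewrite(doc: Node): Node = {
    val transformer = new RuleTransformer(italicsToEm)
    transformer(doc)
  }

  def main(args: Array[String]): Unit = {
    println(rewrite(<node>This is <i>some</i> text.</node>))
  }
}
```

The rule only says what to do with a single node; walking the tree and rebuilding the changed parents is left to the transformer.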
The example below compiles with the current (tip) rev of the Scala 2 compiler. Note the use of Scala 2’s new, smarter type inference for case matching in “italics2”: when an “Elem” is matched, the variable “e” is typed as Elem within the match.
Regular expression pattern matching is still being added to the Scala 2 compiler, so it is a little “twitchy” at the moment.
Example 1 uses inline XML pattern matching. Example 2 uses the case classes that are Scala’s native XML representation. Example 3 uses Scala’s built-in XPath-like matching. Note that example 3 does not have quite the same logic as the other two, and prints a different result: it finds every “i” element in the document, including the one inside “title”, whereas the first two only descend into “node” elements.
package demo;

/** This code is in the public domain. @author Ross Judson */
object DemoXML {
  import scala.xml._;

  def italics(node: Node): unit = node match {
    case <i>{contents}</i> =>
      Console.println("Italic: " + contents)
    case <node>{c @ _ *}</node> =>
      for (val child <- c) italics(child)
    case _ => { }
  }

  def italics2(e: Node): unit = {
    e match {
      case Elem(_, "node", _, _, c @ _ *) =>
        c.foreach(k => italics2(k))
      case Elem(_, "i", _, _, c @ _ *) =>
        Console.println("Italic: " + e.text);
      case _ => { }
    }
  }

  def italics3(doc: Elem) =
    for (val ital <- doc \\ "i")
      Console.println("Italic: " + ital.text);

  def go(title: String) = {
    val doc =
      <node>
        <node>This is <i>some</i> text content.
          <node>This is <i>deeper</i> stuff.</node>
        </node>
        <node>I am some text.
          <title>I am <i>{title}</i>.</title>
          This is a sentence with an <i>italicized</i> entry.
        </node>
      </node>;

    Console.println("Looking at " + doc \\ "title");
    Console.println("First version");
    italics(doc);
    Console.println("Second version");
    italics2(doc);
    Console.println("Third version");
    italics3(doc);
  }

  def main(args: Array[String]) = {
    go("XML Patterns");
  }
}
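The XPath-like operators used in example 3 come in two forms, and the difference explains why that example sees more than the other two. The sketch below contrasts them on a tiny document; it assumes Scala 2.13 with the scala-xml module on the classpath, and the object and value names are hypothetical.

```scala
import scala.xml._

// Hypothetical names; assumes Scala 2.13 plus the scala-xml module.
object ProjectionSketch {
  val doc: Elem = <node><i>top</i><node><i>nested</i></node></node>

  // \ selects matching direct children only.
  val direct: List[String] = (doc \ "i").map(_.text).toList

  // \\ selects matching descendants at any depth, in document order.
  val all: List[String] = (doc \\ "i").map(_.text).toList

  def main(args: Array[String]): Unit = {
    println(direct)
    println(all)
  }
}
```

Here direct holds only "top", while all holds "top" followed by "nested". Because \\ ignores which elements sit in between, italics3 reaches italics that the hand-written recursion in the first two versions never visits.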