Seventh Week: Assignments
Posted 4 Oct 2005 by herman
A reading assignment and the next homework.
Read Chapter Six this week. You can see margin notes
and a list of some of the important keywords in this file:
The fourth homework is due at noon on Friday 14 October (several
days after the midterm exam). The assignment is
here: homework4.pdf, and two examples of
XML you can use for this assignment are
xhtml-0.txt and xfillin-0.txt.
The fourth homework deadline has been moved to noon on
Wednesday 19 October (because of many conflicts with midterm
Here's an interesting Python issue that stumped one student --- and it's
a tricky problem, worth knowing about!
Recall that using xml.dom.minidom, once an XML file is parsed
into the tree, you navigate the tree by following lists of
children from the root node. How do you get such a list? Suppose
X is a node of the document tree. Now, according to the
online documentation, the Python attribute "childNodes" of X should
return a list of the children of X; the documentation states that this
is a "read-only" attribute.
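As a rough illustration of navigating by childNodes, here is a minimal sketch. The sample document and the helper name walk are made up for this example; they are not part of the homework files.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, invented for this illustration.
doc = parseString("<form><name>Ann</name><age>30</age></form>")
X = doc.documentElement  # the root element, <form>

def walk(node, depth=0):
    """Return (depth, nodeName) pairs for node and all of its descendants."""
    rows = [(depth, node.nodeName)]
    for child in node.childNodes:  # the list of children of this node
        rows.extend(walk(child, depth + 1))
    return rows

# Print the tree, indenting each node by its depth.
for depth, name in walk(X):
    print("  " * depth + name)
```

Note that the text inside each element shows up as its own child node, named #text.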
The student found otherwise. That is, you can change what the
childNodes attribute contains, and this can really
discombobulate your XML tree, to say the least. But why would
you change the childNodes attribute, other than a programming error?
Here's the surprising answer.
>>> t = [1,2,3]
>>> s = t
>>> s += [4,5]
>>> t
[1, 2, 3, 4, 5]
As you see, the assignment s = t didn't copy the list, it
just made another reference to it. So what do you think
might happen with this sequence of statements?
S = X.childNodes
S += X.childNodes[0].childNodes
Yes, that's right, it will change the tree because
the statement adding to the list S actually adds to the
X.childNodes list, on account of S being just a reference
to that list.
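To see the effect concretely, here is a small sketch run against a made-up document (the element names a, b, c, d, e are assumptions chosen only for this illustration). It records the children of X before and after the += statement.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, invented for this illustration.
doc = parseString("<a><b><c/><d/></b><e/></a>")
X = doc.documentElement  # <a>, whose children are <b> and <e>

before = [n.nodeName for n in X.childNodes]

S = X.childNodes                 # S is a second name for the same list
S += X.childNodes[0].childNodes  # extends that list in place

after = [n.nodeName for n in X.childNodes]

print(before)             # children of X before the +=
print(after)              # children of X after the +=
print(S is X.childNodes)  # are S and X.childNodes the same object?
```

The children of b now also appear in the child list of a, even though nothing was ever assigned to X.childNodes directly.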
So what to do about this? Maybe what the programmer
wanted instead was to make S become a new copy of
the list, not just another reference to the same list.
Here's a way to do this in Python, though not a very
compact one:
S = []  # make S initially an empty list
for i in X.childNodes:
    S.append(i)  # (or S += [ i ] would work, too)
Is there a shorter-to-type way to do the same thing?
Yes, it turns out that
S = X.childNodes[0:]
will make S a copy of the list, rather than just
a reference to X.childNodes. However, this is
somewhat obscure, I admit (interestingly, in the standard
Python interpreter the same slice of a tuple does not make
a new object, but since a tuple cannot be modified, that
is harmless). In fact, I implicitly used the fact
that slicing creates a copy of the list in my
xml-cleanup.py example, and if you followed
that example to do the homework, you probably didn't
encounter this problem.
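As a quick check that a copy really protects the tree, here is a sketch using the slice and, as an alternative I am adding for comparison, the built-in list() call. The sample document is again invented for the illustration.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, invented for this illustration.
doc = parseString("<a><b><c/><d/></b><e/></a>")
X = doc.documentElement

S = X.childNodes[0:]    # slice: a new list holding the same nodes
T = list(X.childNodes)  # list(): another common way to copy a list

S += X.childNodes[0].childNodes  # changes only the copy S

names_in_tree = [n.nodeName for n in X.childNodes]
names_in_copy = [n.nodeName for n in S]

print(names_in_tree)      # the tree's own child list
print(names_in_copy)      # the extended copy
print(S is X.childNodes)  # is the copy a distinct object?
```

Both forms make a shallow copy: the new list is separate, but the nodes in it are the same node objects as in the tree, so changing a node itself would still affect the tree.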
Here's a short program that solves the programming problem:
xmlForm.py. Notice that it uses the "getElementsByTagName"
method to locate a tag with the desired name; this is considerably
simpler than writing another iteration to search the tree of terms
(getElementsByTagName is documented under "Element Objects" in the
online manual for the Python minidom).
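For readers who have not used the method, here is a minimal sketch of getElementsByTagName. It is not the xmlForm.py program; the document, the tag names, and the helper text_of are assumptions made up for this example.

```python
from xml.dom.minidom import parseString

# Hypothetical form document; the element names are invented.
doc = parseString(
    "<form><field><name>first</name></field>"
    "<field><name>second</name></field></form>"
)

def text_of(element):
    """Concatenate the text nodes directly under element."""
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )

# Find every <name> element anywhere in the document, however deeply nested.
names = [text_of(e) for e in doc.getElementsByTagName("name")]
print(names)
```

The method searches all descendants, so no hand-written loop over childNodes is needed to find the tags.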