XML
Jun 21, 2021
objective: record XML terminology
- attributes
- elements
XML sample:
<root>
  <docs>
    <doc id="1"> <!-- id is an attribute of doc -->
      <text> blahblah </text> <!-- text is a child element of doc -->
    </doc>
    <doc id="3">
      <text> blahblah2 </text>
    </doc>
  </docs>
</root>
To retrieve these elements and attributes in Python:
docs = root.findall('./docs/doc')  # root: the parsed <root> element (e.g. via xml.etree.ElementTree)
rows = []
for doc in docs:
    # retrieve the id attribute of doc
    doc_id = doc.attrib['id']
    # find the child <text> element and read its contents with .text
    txt = doc.find('./text').text
    rows.append({'id': doc_id, 'text': txt})