python - issues turning a dict element into a string to search w/ regex

Question

This one might best be summarized as "Python isn't letting me do something stupid," but I digress. I have a bad habit of turning XML into strings so I can use regex to fetch something out of it, rather than actually being a good person and doing the XPath thing.

I'm having an issue currently where I'm looping through a list of dicts (the dicts themselves being several levels deep, containing encoded XML). I'm trying to do re.findall(pattern, str(listitem)), which is giving me an "unhashable type: 'DictionaryElement'" error. Any ideas?

Edit: this is PubMed API code using Biopython:

handle = Entrez.efetch(db="pubmed", id=pmids, retmode="xml")
records = Entrez.read(handle)
records = list(records)

meshterms = {}
for y in records:
    meshterms[y] = re.findall(r'(?<=DescriptorName\'\:\sStringElement\(\').+?(?=\')',str(y))

y would include something that looks like:

{u'MedlineCitation': DictElement({u'OtherID': [], u'OtherAbstract': [], u'CitationSubset': ['IM'], u'KeywordList': [], u'DateCreated': {u'Month': '11', u'Day': '20', u'Year': '2012'}, u'SpaceFlightMission': [], u'GeneralNote': [], u'Article': DictElement({u'ArticleDate': [], u'Pagination': {u'MedlinePgn': '140-54'}, u'AuthorList': ListElement([DictElement({u'LastName': 'Goupil', u'Initials': 'L', u'NameID': [], u'ForeName': 'Louise'}, attributes={u'ValidYN': u'Y'}), DictElement({u'LastName': 'Bekinschtein', u'Initials': 'T', u'NameID': [], u'ForeName': 'Tristan'}, attributes={u'ValidYN': u'Y'})], attributes={u'Type': u'authors', u'CompleteYN': u'Y'}), u'Language': ['eng'], u'PublicationTypeList': ['Journal Article'], u'Journal': {u'ISSN': StringElement('0003-9829', attributes={u'IssnType': u'Print'}), u'ISOAbbreviation': 'Arch Ital Biol', u'JournalIssue': DictElement({u'Volume': '150', u'Issue': '2-3', u'PubDate': {u'Month': 'Jun', u'Year': '2012'}}, attributes={u'CitedMedium': u'Print'}), u'Title': 'Archives italiennes de biologie'}, u'Affiliation': 'MRC Cognition and Brain Sciences Unit, 15 Chauces Road, CB2 7EF, Cambridge,UK Email: louisegoupil@hotmal.fr.', u'ArticleTitle': 'Cognitive processing during the transition to sleep.', u'ELocationID': [StringElement('10.4449/aib.v150i2.1247', attributes={u'ValidYN': u'Y', u'EIdType': u'doi'})], u'Abstract': {u'AbstractText': ['Several dramatic physiological and behaviourl changes occur during the transition from wakefulness to sleep. The process is regarded as a grey area of consciousness between attentive wakefulness and slow wave sleep. Although there is evidence of neurophysiological integration decay as signalled by sleep EEG elements, changes in power spectra and coherence, thalamocortical connectivity in fMRI, and single neuron changes in firing patterns, little is known about the cognitive and behavioural dynamics of these transitions. 
Hereby we revise the body and brain physiology, behaviour and phenomenology of these changes of consciousness and propose an experimental framework to integrate the two aspects of consciousness that interact in the transition, wakefulness and awareness.']}}, attributes={u'PubModel': u'Print'}), u'PMID': StringElement('23165874', attributes={u'Version': u'1'}), u'MedlineJournalInfo': {u'MedlineTA': 'Arch Ital Biol', u'Country': 'Italy', u'NlmUniqueID': '0372441', u'ISSNLinking': '0003-9829'}}, attributes={u'Owner': u'NLM', u'Status': u'In-Data-Review'}), u'PubmedData': {u'ArticleIdList': [StringElement('23165874', attributes={u'IdType': u'pubmed'})], u'PublicationStatus': 'ppublish', u'History': [DictElement({u'Month': '2', u'Day': '07', u'Year': '2012'}, attributes={u'PubStatus': u'accepted'}), DictElement({u'Minute': '0', u'Month': '11', u'Day': '21', u'Hour': '6', u'Year': '2012'}, attributes={u'PubStatus': u'entrez'}), DictElement({u'Minute': '0', u'Month': '11', u'Day': '21', u'Hour': '6', u'Year': '2012'}, attributes={u'PubStatus': u'pubmed'}), DictElement({u'Minute': '0', u'Month': '11', u'Day': '21', u'Hour': '6', u'Year': '2012'}, attributes={u'PubStatus': u'medline'})]}

where my regex is trying to pull the contents of the StringElement under DescriptorName (incidentally, not present in the above record, but you get the idea).

Thanks!

score 0 · Accepted Answer

Well, this got really odd, but I was able to do it by writing the record out to a scratch file and reading it back in as a plain string:

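# Write the record's string form to a scratch file; "with" closes (and flushes) it.
with open("mcs_scratch.txt", "w") as scratch:
    scratch.write(str(y))
# Read it back: y is now a plain string instead of a dict-like record.
with open("mcs_scratch.txt", "r") as scratch:
    y = scratch.read()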

Somehow I doubt this qualifies as good practice, but it works.
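
For what it's worth, the error most likely comes from meshterms[y] rather than from str(y) or re.findall: dictionary keys have to be hashable, and a dict-like record is not. The file round trip works because y ends up as a plain string, which is hashable. A minimal sketch of a shorter route, keying the results by PMID instead, is below. The plain dicts stand in for the parsed Biopython records, and the simplified pattern targets ArticleTitle; both are assumptions made so the example runs without Biopython or network access.

```python
import re

# Stand-in for the parsed records: plain dicts instead of Biopython's
# dict-like elements (an assumption, so this runs offline).
records = [
    {"MedlineCitation": {
        "PMID": "23165874",
        "ArticleTitle": "Cognitive processing during the transition to sleep.",
    }},
]

# Simplified illustrative pattern: grabs the quoted value after ArticleTitle
# in the record's string form.
pattern = r"(?<=ArticleTitle': ').+?(?=')"

results = {}
error_message = None
for record in records:
    try:
        # Using the record itself as a key fails: dicts are unhashable.
        results[record] = re.findall(pattern, str(record))
    except TypeError as exc:
        error_message = str(exc)

    # Using a string (here the PMID) as the key works, no scratch file needed.
    key = str(record["MedlineCitation"]["PMID"])
    results[key] = re.findall(pattern, str(record))

print(error_message)
print(results)
```

The same idea should carry over to the original loop by replacing meshterms[y] with meshterms[str(y['MedlineCitation']['PMID'])], assuming every record has that field.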
