The new RefSeq release from NCBI is compatible wit

Posted 2019-05-18 21:12

I'm new to Python and especially to Biopython. I'm trying to fetch some information as XML with Entrez.efetch and then read it. Last week this script worked well:

from Bio import Entrez
handle = Entrez.efetch(db="Protein", id="YP_008872780.1", retmode="xml")
records = Entrez.read(handle)

But now I'm getting an error:

> Bio.Entrez.Parser.ValidationError: Failed to find tag 'GBSeq_xrefs' in
    the DTD. To skip all tags that are not represented in the DTD, please
    call Bio.Entrez.read or Bio.Entrez.parse with validate=False.

So I ran this:

records = Entrez.read(handle, validate=False)

But I'm still getting an error:

TypeError: 'str' object does not support item assignment

After some research I realized that NCBI has made changes to RefSeq that add new tags to the XML file (for GenPept).

Do I need to change something in the DTD to support these new tags?

1 answer
可以哭但决不认输i
#2 · 2019-05-18 21:52

It appears that my DTD file was out of date.
A new version can be found here or here.
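
To check whether a local DTD is the stale one, something like the following rough sketch might help. It scans a DTD's text for `<!ELEMENT ...>` declarations and reports whether a given tag (here `GBSeq_xrefs`, from the error message) is declared. The helper names `declared_elements`, `dtd_declares` and `bundled_dtd_dir` are made up for illustration. The directory `Bio/Entrez/DTDs` and the file name `NCBI_GBSeq.dtd` are assumptions about how Biopython lays out its bundled DTDs, so adjust them if your installation differs.

```python
import re
from pathlib import Path


def declared_elements(dtd_text):
    """Return the set of element names declared with <!ELEMENT ...> in a DTD.

    This is a plain text scan, not a real DTD parser: it does not expand
    parameter entities and does not skip declarations inside comments.
    """
    return set(re.findall(r"<!ELEMENT\s+([A-Za-z_:][\w.:-]*)", dtd_text))


def dtd_declares(dtd_text, tag):
    """True if the DTD text declares an element named `tag`."""
    return tag in declared_elements(dtd_text)


def bundled_dtd_dir():
    """Guess the DTD directory shipped with Biopython, or None if unavailable.

    Assumption: the DTDs live in a "DTDs" folder next to Bio/Entrez/__init__.py.
    """
    try:
        import Bio.Entrez
    except ImportError:
        return None
    return Path(Bio.Entrez.__file__).parent / "DTDs"


if __name__ == "__main__":
    dtd_dir = bundled_dtd_dir()
    # Assumed file name for the GBSeq DTD; check the directory listing if unsure.
    dtd_path = dtd_dir / "NCBI_GBSeq.dtd" if dtd_dir else None
    if dtd_path is None or not dtd_path.is_file():
        print("Could not locate the GBSeq DTD; pass the path manually.")
    elif dtd_declares(dtd_path.read_text(), "GBSeq_xrefs"):
        print("DTD declares GBSeq_xrefs; it looks up to date.")
    else:
        print("DTD lacks GBSeq_xrefs; it is probably out of date.")
```

If the tag is missing, replacing the DTD with a current copy (or upgrading Biopython, which ships its own copies) should let `Entrez.read(handle)` validate again without `validate=False`.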
