Pull Request #418 Output a more helpful error message on malformed bulk upload CSV data.

Malformed metadata keys in a bulk upload CSV file previously lead to a
somewhat obscure ConnectionResetError. Now the offending key is named
in an error message which should greatly help users who encounter this
problem.

The XML parser in the standard library does not properly verify tag
names. I avoided the regular expression solution at first and chose
lxml because the proper regex to verify XML tag names is rather
complex. But as it turns out, the Internet Archive only allows a very
limited subset of those characters in their metadata keys
(`'[A-Za-z][.-0-9A-Za-z_]+`). (Verified by trying to create keys
containing all other Unicode characters from 0 to x10FFFF against the
metadata API.)

See:
https://archive.org/services/docs/api/metadata-schema/index.html#internet-archive-metadata

maxz

Pull request event #1087 passed

  • Ran for
  • New branch build
AMD64
no language set
Git
Sorry, we're having troubles fetching jobs. Please try again later.