Commit d59c9cd5 authored Jun 29, 2017 by David L. Jones

[lit] Fix some convoluted logic around Unicode encoding, and de-duplicate...

[lit] Fix some convoluted logic around Unicode encoding, and de-duplicate across modules that used it.

Summary:
In Python2 and Python3, the various (non-)?Unicode string types are sort of
spaghetti. Python2 has unicode support tacked on via the 'unicode' type, which
is distinct from 'str' (which are bytes). Python3 takes the "unicode-everywhere"
approach, with 'str' representing a Unicode string.

Both have a 'bytes' type. In Python3, it is the only way to represent raw bytes.
However, in Python2, 'bytes' is an alias for 'str'. This leads to interesting
problems when an interface requires a precise type, but has to run under both
Python2 and Python3.

The previous logic appeared to be correct in all cases, but went through more
layers of indirection than necessary. This change does the necessary conversions
in one shot, with documentation about which paths might be taken in Python2 or
Python3.

Reviewers: zturner, modocache

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D34793

llvm-svn: 306625

parent 17277f13

Show whitespace changes

Inline Side-by-side

Please to comment