decodetree: Open files with encoding='utf-8'

When decodetree.py was added in commit 568ae7efae, QEMU was
using Python 2 which happily reads UTF-8 files in text mode.
Python 3 requires either a UTF-8 locale or an explicit encoding
passed to open(). Now that Python 3 is required, explicitly use
UTF-8 encoding for decodetree source files.

To avoid further problems with the user locale, also explicitly
use UTF-8 encoding for the generated C files.

Make it explicit that both input and output are plain text by
using the 't' mode.

This fixes:

  $ /usr/bin/python3 scripts/decodetree.py test.decode
  Traceback (most recent call last):
    File "scripts/decodetree.py", line 1397, in <module>
      main()
    File "scripts/decodetree.py", line 1308, in main
      parse_file(f, toppat)
    File "scripts/decodetree.py", line 994, in parse_file
      for line in f:
    File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
      return codecs.ascii_decode(input, self.errors)[0]
  UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 80:
  ordinal not in range(128)
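
The failure above can be reproduced outside of decodetree.py with a
minimal sketch. The file name, its contents and the read_lines() helper
are hypothetical, and encoding='ascii' only stands in for the default
that open() would pick under a non-UTF-8 locale such as C:

```python
import os
import tempfile


def read_lines(path, encoding):
    """Read all lines of a text file using an explicit encoding."""
    with open(path, 'rt', encoding=encoding) as f:
        return list(f)


# Hypothetical stand-in for a .decode file with a non-ASCII comment.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'test.decode')
    with open(path, 'wb') as f:
        f.write('# Copyright Philippe Mathieu-Daudé\n'.encode('utf-8'))

    # Assumption: 'ascii' simulates the locale-derived default encoding.
    try:
        read_lines(path, 'ascii')
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True

    # With an explicit UTF-8 encoding the locale no longer matters.
    utf8_lines = read_lines(path, 'utf-8')
```

Passing the encoding explicitly makes the result independent of the
environment the script is run from.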

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Suggested-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210110000240.761122-1-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Philippe Mathieu-Daudé 2021-01-10 01:02:40 +01:00 committed by Richard Henderson
parent 10061ffe56
commit 4cacecaaa2
1 changed files with 6 additions and 3 deletions

@@ -20,6 +20,7 @@
 # See the syntax and semantics in docs/devel/decodetree.rst.
 #
+import io
 import os
 import re
 import sys
@@ -1304,7 +1305,7 @@ def main():
     for filename in args:
         input_file = filename
-        f = open(filename, 'r')
+        f = open(filename, 'rt', encoding='utf-8')
         parse_file(f, toppat)
         f.close()
@@ -1324,9 +1325,11 @@ def main():
     prop_size(stree)
     if output_file:
-        output_fd = open(output_file, 'w')
+        output_fd = open(output_file, 'wt', encoding='utf-8')
     else:
-        output_fd = sys.stdout
+        output_fd = io.TextIOWrapper(sys.stdout.buffer,
+                                     encoding=sys.stdout.encoding,
+                                     errors="ignore")
     output_autogen()
     for n in sorted(arguments.keys()):
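
The effect of wrapping a binary stream with errors="ignore", as done for
sys.stdout above, can be illustrated with a small sketch. The
wrap_output() helper and the in-memory buffer are hypothetical, and
'ascii' is only an assumed example of a restrictive stdout encoding:

```python
import io


def wrap_output(binary_stream, encoding):
    """Wrap a binary stream as text, dropping unencodable characters."""
    return io.TextIOWrapper(binary_stream, encoding=encoding,
                            errors="ignore")


# In-memory buffer standing in for sys.stdout.buffer.
buf = io.BytesIO()
out = wrap_output(buf, 'ascii')

# The 'é' cannot be encoded as ASCII, so it is dropped silently
# instead of raising UnicodeEncodeError.
out.write('/* Daudé */')
out.flush()
written = buf.getvalue()
```

The trade-off of errors="ignore" is that output written to a
non-UTF-8 stdout may silently lose characters rather than abort.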