This is DebianDoc-SGML, an SGML-based documentation formatting package
used for the Debian manuals.

To install it on a non-Debian system, edit the Makefile and then run
`make' and `make install'.

The changelog is in the debian subdirectory.

Ardo van Rangelrooij <ardo@debian.org>
Ian Jackson <ijackson@gnu.ai.mit.edu>

-----------------------------------------------------------------------------
Message to the future maintainer(s): (Osamu Aoki)

Since 2005 I have refactored and extended the DebianDoc-SGML package,
adding some UTF-8 support, DebianDoc-SGML pretty print support, XHTML
support, DocBook-XML output support, Wiki support, etc. I have to say this
has been a steep learning experience for me, as I had no formal SGML
education before.

To help future maintainers get started quickly, I summarize helper
information here at the end of this README file, which is seen only in the
source tree.

* package structure

(Please refer to the user documentation for an explanation based on the
installed file locations.)

This package is made of the following files:

|-- COPYING (GPL2)
|-- Makefile
|-- README (This file)
|-- debian (Debian package meta-data)
|   |-- README.Debian (User documentation)
|   |-- TODO
|   |-- changelog
|   |-- compat
|   |-- control
|   |-- copyright
|   |-- debiandoc-sgml.install
|   |-- debiandoc-sgml.postinst
|   |-- debiandoc-sgml.postrm
|   |-- debiandoc-sgml.prerm
|   |-- debiandoc-sgml.sgmlcatalogs
|   `-- rules
|-- sgml (DTD definition)
|   |-- dtd
|   |   |-- catalog
|   |   |-- debiandoc.dcl
|   |   `-- debiandoc.dtd
|   `-- entities
|       |-- catalog
|       |-- debiandoc-lat1
|       `-- debiandoc-lat2
`-- tools
    |-- bin (source for executables)
    |   |-- fixlatex
    |   |-- mkconversions
    |   |-- saspconvert
    |   `-- template
    |-- lib
    |   |-- Format (output formatting engine)
    |   |   |-- Alias.pm (alias (.pm) definition of format)
    |   |   |-- Driver.pm
    |   |   |-- Format.pm
    |   |   |-- HTML.pm (format driver for HTML)
    |   |   |-- LaTeX.pm (format driver for LaTeX)
    |   |   |-- Texinfo.pm
    |   |   |-- Text.pm
    |   |   |-- TextOV.pm
    |   |   `-- XML.pm
    |   |-- Locale (locale and format specific data)
    |   |   |-- Alias.pm (alias definition of locale values)
    |   |   |-- SGML (locale independent data for SGML)
    |   |   |-- XML (locale independent data for XML)
    |   |   |-- convert-encoding (conversion script for the locale data)
    |   |   |-- ca_ES.ISO8859-1 (data for the ca_ES.ISO8859-1 locale)
    |   |   |   |-- HTML (locale specific data for HTML)
    |   |   |   |-- LaTeX (locale specific data for LaTeX)
    |   |   |   |-- Texinfo
    |   |   |   |-- Text
    |   |   |   `-- TextOV
    |   |   ......... (directory for all locales)
    |   `-- Map (Mapping for non ASCII characters)
    |       |-- Alias.pm
    |       |-- HTML.pm
    |       |-- LaTeX.pm
    |       |-- Texinfo.pm
    |       |-- Text.pm
    |       |-- TextOV.pm
    |       `-- XML.pm
    `-- man (manual page)
        `-- debiandoc-sgml.1

* How to add new locale

 1. Create a directory named after the locale by copying
    en_US.ISO8859-1.
 2. Translate the phrases and make any other needed changes.
 3. Create the data for alternative encodings, such as UTF-8, using the
    convert-encoding script.
 4. Adjust the UTF-8 data for Unicode: utf-8 for HTML, utf8 for LaTeX.
 5. Add the new locales to Locale/Alias.pm.

* main conversion scripts debiandoc2*

To make all the debiandoc2* commands consistent, I have merged all of them
completely into one template file 'tools/bin/template' and added support
for a few new formats, using the existing scripts as my guide.

All the debiandoc2* commands are generated by the script
'tools/bin/mkconversions', which parses this unified script source
'tools/bin/template'. (This infrastructure of shell/sed combination was
there when I started, so please do not ask me why I did not use CPP for
this.)

For debugging purposes, I provide 'make diff', which creates a 'diff -u'
for all the debiandoc2* commands against the currently installed version.
This functionality was added to help the developer understand the
implications of changes made to the 'tools/bin/template' file and to avoid
unintended changes to the existing scripts when adding features.
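The idea behind the template and 'make diff' can be sketched as follows.
This is only an illustration in Python, not the real mkconversions logic
(which is shell/sed): the @@FORMAT@@ and @@EXT@@ markers, the template
text and the format table are all made up for this sketch. It generates
one script per format from a single template and then produces a
'diff -u' style comparison against a previous set of scripts.

```python
import difflib

# Hypothetical template; the @@...@@ markers are invented for this sketch
# and are NOT the syntax used by tools/bin/template.
TEMPLATE = """#!/bin/sh
# debiandoc2@@FORMAT@@ - convert DebianDoc-SGML to @@FORMAT@@
output_ext=@@EXT@@
"""

# Hypothetical format table (format name -> output file extension).
FORMATS = {"html": ".html", "text": ".txt", "latex": ".tex"}


def generate(template, formats):
    """Return {script_name: script_text}, one script per format."""
    scripts = {}
    for fmt, ext in formats.items():
        text = template.replace("@@FORMAT@@", fmt).replace("@@EXT@@", ext)
        scripts["debiandoc2" + fmt] = text
    return scripts


def diff_against(installed, generated):
    """Return {script_name: unified diff text}; empty text means no change."""
    result = {}
    for name, new_text in generated.items():
        old_text = installed.get(name, "")
        diff_lines = difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile="installed/" + name,
            tofile="generated/" + name,
        )
        result[name] = "".join(diff_lines)
    return result


if __name__ == "__main__":
    installed = generate(TEMPLATE, FORMATS)
    # Simulate an edit to the template and see which scripts it touches.
    edited = TEMPLATE.replace("output_ext", "out_ext")
    for name, diff in diff_against(installed, generate(edited, FORMATS)).items():
        print(diff, end="")
```

A single edit to the template shows up in the diff of every generated
script, which is the kind of side effect 'make diff' is meant to expose.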
Basically, these generated scripts use an SGML parser to produce an output
text file such as plain text, HTML, LaTeX source, etc. For PostScript and
PDF output, the LaTeX source is further processed to produce the desired
results.

Since the Chinese Big5 encoding is not compatible with TeX (and thus not
with LaTeX either), the internal fixlatex script is run on the generated
LaTeX source before it is handed to LaTeX. This is because the 2nd byte of
the 16 bit Big5 encoding uses ASCII ranges, which makes some 16 bit
characters collide with meta characters such as \ { } used in the LaTeX
context. (The same problem should happen with the Japanese Shift-JIS
encoding, but we do not support this encoding now, so we do not suffer
from it.)

The new -X option enables the use of user-provided locale-dependent data.
Running "make test" executes a test build sequence using the package
source version of the locale-dependent data. The -X option is most useful
when fixing a locale-dependent problem or testing new locale data.

The -s option, together with an updated fixlatex script, can be used to
add Japanese Shift-JIS encoding support. But the -X option is the better
choice for debugging in most cases.

For adjusting language specific data such as the LaTeX starting code:

 * study Format/LaTeX.pm,
 * play with the -X option as described in README.Debian and the manpage
   to find the right alternative to the
   /usr/share/perl5/DebianDoc_SGML/Locale/* data,
 * adjust the tools/lib/Locale/Alias.pm and
   tools/lib/Locale/xx_YY.encoding/LaTeX.pm files in the source code.

* The meaning of %locale

This has the following contents for LaTeX. The Format/LaTeX.pm file uses
the values defined here.

    %locale = (
        'babel' => '',
        'inputenc' => '',
        'abstract' => '',
        'copyright notice' => '',
        'before begin document' => '',
        'after begin document' => '',
        'before end document' => '',
        'pdfhyperref' => ''
    );

 * The first 2 are used to define the language scheme based on the babel
   macro. For CJK, these can be left undefined.
 * The next 2 are the words used for "abstract" and "copyright notice" in
   the pertinent language.
 * The next 3 are recent additions which provide very flexible ways to
   create proper LaTeX source. CJK uses these. (They can be omitted for
   European languages.)
 * The last one defines how hyperrefs for PDF are generated with the
   hyperref package. (We may need this to be defined otherwise for UTF-8,
   but I do not know.) "hypertex" is the default value if none is given.
   For UTF-8 locales, I use "unicode" as the value at this moment.
 * For the LaTeX language dependent parameter, I use the babel name of
   the "*.sty" file from /usr/share/texmf-texlive/tex/generic/babel if
   available.
   Exceptions:
    * vietnam
    * lithuanian
 * Read "The Not So Short Introduction to LaTeX 2ε" by Tobias Oetiker to
   get some idea of LaTeX.
 * Read "The CJK package for LaTeX 2ε - Multilingual support beyond babel"
   by Werner Lemberg to get some idea of CJK. It looks like the current
   CJK environment (2007/08) is not good enough for UTF-8.
 * Read the CTAN archive for unicode. (me too.)
    * http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=unicode
    * http://tug.ctan.org/tex-archive/macros/latex/contrib/unicode/

A similar thing can be done for HTML with %locale. The Format/HTML.pm file
uses the value for "charset" in it when generating HTML.

Package requirements:

As for required packages (especially for LaTeX processing (PS, PDF
formats)), see the cjk-latex-* packages. Please note that a ghostscript
interpreter such as gs-gpl or gs-esp should (not must) be installed too
for PDF thumbnail generation.

Conversion functions back to normalized SGML and XML formats are
available. The generated XML requires some manual action.

Osamu Aoki <osamu@debian.org> Sat, 04 Aug 2007 21:46:45 +0900
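
A footnote on the Big5 problem mentioned under the main conversion
scripts: the collision is easy to check by hand. The following is a small
Python sketch, not part of this package and not what fixlatex does; the
function name and the sample strings are chosen only for illustration. It
lists the characters whose second Big5 byte is one of the LaTeX meta
characters \ { }.

```python
# Byte values of the LaTeX meta characters named in this README.
META = {0x5C: "\\", 0x7B: "{", 0x7D: "}"}


def colliding_chars(text, encoding="big5"):
    """Return (char, meta) pairs whose 2nd encoded byte is a meta character."""
    hits = []
    for ch in text:
        raw = ch.encode(encoding)
        # Only two-byte characters can hide an ASCII byte in the 2nd position.
        if len(raw) == 2 and raw[1] in META:
            hits.append((ch, META[raw[1]]))
    return hits


if __name__ == "__main__":
    # Sample text: the first three characters are well-known Big5 cases
    # whose second byte is 0x5C (backslash); the last two are harmless.
    for ch, meta in colliding_chars("許功蓋中文"):
        print(ch, ch.encode("big5").hex(), "collides with", meta)
```

Any character reported here would be misread by TeX unless the source is
preprocessed first, which is why a fix-up step is needed for Big5 input.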