Babel: past work

Updated: 2013-04-01


Due to an unexpected change in my working hours I had to delay the beta to (I hope) December. The good news is that I have time to continue maintaining babel in the near future. On the other hand, I’m somewhat stuck on one of the last tasks: hooks.


Bug: \bibitem is out of sync with \selectlanguage in the aux file. The reason is that \bibitem (like some other commands, in fact) writes with \immediate, while \selectlanguage doesn’t.
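The timing difference behind this bug can be seen in isolation with a minimal sketch. It does not use babel or \bibitem at all; the stream name \notefile and the .dem extension are made up for the illustration.

```latex
\documentclass{article}
% \notefile and the .dem extension are hypothetical names for this demo
\newwrite\notefile
\immediate\openout\notefile=\jobname.dem
\begin{document}
Some text.
% Deferred write: stored as a node and executed only when the page is shipped out
\write\notefile{deferred: written at shipout}%
% Immediate write: executed as soon as TeX reads it
\immediate\write\notefile{immediate: written at once}%
\end{document}
```

In the resulting .dem file the “immediate” line comes first, although it appears second in the source. The same reordering happens in the aux file when one command writes immediately and another does not.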

Another bug: \aliasshorthand didn’t make an alias; it just \let the new shorthand to the original character at the moment the macro was used, whatever meaning it had then (usually the non-active value, in the preamble). Another problem is that user shorthands are activated and deactivated by languages, so their behaviour is somewhat unpredictable. Perhaps a new \useshorthands*, which would make sure the shorthand is always activated, would be useful.
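Why a plain \let does not give a real alias can be shown with ordinary macros. This is only a sketch of the underlying TeX behaviour, not babel’s code; \original and \alias are hypothetical names.

```latex
\documentclass{article}
% \original and \alias are hypothetical names for this demo
\def\original{first meaning}
\let\alias\original            % \alias receives a copy of the meaning \original has right now
\def\original{second meaning}  % later redefinitions do not reach \alias
\typeout{\meaning\alias}       % the log shows: macro:->first meaning
\begin{document}
\alias
\end{document}
```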

By the way, for the sake of clarity I’m using the following terminology: “active” is used in the TeX sense, ie, with catcode 13, and therefore “non-active” means “letter” or “other” (or in the case of ^, “superscript”). “Activated” applies to an active character which behaves like a shorthand, while “deactivated” applies to a shorthand using the “system” value, typically the corresponding non-active char (but not always; for example, ~ as a non-breaking space). Note that a deactivated shorthand is still an active character, whose definition is usually a certain non-active char.
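In plain TeX terms, the last point can be sketched as follows. This is a hand-made illustration of an active character whose definition is just the non-active character; it is not how babel actually implements shorthands.

```latex
\documentclass{article}
\begin{document}
\begingroup
  \catcode`\"=\active  % " is now active in the TeX sense (catcode 13)
  \def"{\string"}%     still active, but it only yields the non-active character
  A "quoted" word.
\endgroup
\end{document}
```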

For babel/4196 I’ve extended \defineshorthand so that it can be used to redefine language shorthands. (A few macros for discretionaries will be added, too.) This way, we’ll have both simplicity and flexibility.

You could start with, say (with pseudo-code):

\defineshorthand{"*}{(use a soft hyphen)}
\defineshorthand{"-}{(use a hard hyphen)}

However, the behaviour of hyphens is language dependent. For example, in languages like Polish and Portuguese, a hard hyphen inside a compound word is repeated at the beginning of the next line. You could set:

\defineshorthand[polish,portugese]{"-}{(use a repeated hyphen)}

You have a single unified shorthand ("-), with a content-based meaning (“compound word hyphen”) whose visual behavior is that expected in each context.
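For reference, the pseudo-code might look roughly like this with the \babelhyphen macro that later babel releases provide. This is a sketch under that assumption (the macro was not yet available when this note was written); the language list simply mirrors the one used above, and the exact form of the optional argument has not been checked against a released version.

```latex
% Preamble fragment; assumes a babel release that provides \babelhyphen
\useshorthands{"}                            % declare " as a user shorthand character
\defineshorthand{"*}{\babelhyphen{soft}}     % soft hyphen
\defineshorthand{"-}{\babelhyphen{hard}}     % hard hyphen
\defineshorthand[polish,portugese]{"-}{\babelhyphen{repeat}} % repeated hyphen
```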


The switching mechanism of \foreignlanguage and otherlanguage* does not work correctly, because \originalTeX is either missing or incorrectly built. For example, with


the German text is typeset using Greek characters. The aux file is wrong, too. From now on, except for the date and captions groups, \foreignlanguage and otherlanguage* will mimic the behaviour of \selectlanguage (in other words, they will share its code).


Bugs discovered recently are related to patterns and primes.

Patterns were loaded before the previous language was closed (even before \originalTeX). Now they are loaded after the selected language has been set up (after \extras...).

Primes and hats are not handled correctly. The following raises an error (‘Double superscript’):


And the following, too:



One of the problems one must face when loading files from different sources and authors is that of incompatibilities. While packages can be loaded separately and there are tools like \PassOptionsToPackage, babel languages are loaded in one go, and there is no way to insert code between them.

I’ve analyzed 5 approaches:

1) Loading a ‘prebabel’ package which would define \AtEndOfLanguage, with code to be used when the main babel package is loaded. I find this somewhat clumsy, but it could be useful if other packages and classes need to preset some option.

2) Delaying the loading of all languages to a \babelinput, so that you can use \AtEndOfLanguage. After some tests, I still find it a bit confusing, and you can do nothing before loading babel (eg, “passing” some option to a language).

3) Delaying the loading of separate languages, so that \AtEndOfLanguage is not required. This option doesn’t seem possible with current babel.

4) Loading named local config files, much like bblopts.cfg but only when requested (with a package option config=file). This could be useful in general and I’ve added it. However, presetting options is not easy (but not impossible).

5) Executing macros following certain naming conventions (eg, \babelafteritalian), so that you can define them with \(re)newcommand. Unfortunately, this leaves to the user the responsibility of defining them in the proper way (remember LaTeX doesn’t provide a way to add stuff to an existing macro).

Well, there are still further approaches, like having package options after-latin=\string\dosomething or after-latin=dosomething, but things are getting worse.
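To illustrate the difficulty with approach 5: \newcommand fails if the macro already exists and \renewcommand throws away what was there, so appending requires kernel internals. A minimal sketch, assuming the hypothetical hook name \babelafteritalian from above and the internal kernel macro \g@addto@macro (which only works for macros without parameters):

```latex
\makeatletter
\providecommand\babelafteritalian{}  % create the hook only if it does not exist yet
% Each call appends to the current definition instead of replacing it
\g@addto@macro\babelafteritalian{\typeout{first addition}}
\g@addto@macro\babelafteritalian{\typeout{second addition}}
\makeatother
```

Every package or user touching the hook would have to follow this same pattern; a single \renewcommand anywhere silently discards the earlier additions.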


When investigating babel/3800 I discovered some issues related to options that need to be addressed, mainly the fact that combining package and global options does not have well-defined behaviour (the main language is sometimes the last named one, but sometimes it’s not). The doc says you can use either package options or global options, but this is not enforced, and they can be mixed without complaint (just with unexpected behaviour).

Furthermore, ldf files not bundled with babel are not recognized as global options. So, I’ve added some code to make sure global options are properly recognized, to raise an error (or perhaps only a warning, I’m not sure) if the main language is not the last named one, and to set the main language explicitly in cases like


Oddly, this sets english as the main language. Now it complains with a message saying you have to use


Taking the opportunity, I’ve added another option, which I had already written, to control shorthands (though it is neither finished nor thoroughly tested).