Integration with GNU gettext #165

Open
JulienPalard opened this issue Jan 15, 2016 · 18 comments

Comments

@JulienPalard

I think Smarty would benefit from a well-designed gettext plugin. But before doing this, I think we should discuss what "well designed" means for a gettext plugin.

Specs:

  • Plural with count
  • Context
  • Domain
  • Able to mix HTML contexts (url (urlencode), html (htmlencode))
  • Allow translators to swap parameters (Typically using %2s, %1s)

I'll present the ideas using the following, typical example:

I pushed <a href="%s">%s on %s</a>!

First things first: I think we can't use a |trans modifier, because modifier parameters can't themselves be modified, and at the very least capitalizing a parameter before it is given to gettext may be useful.

So we may use either a function:

{trans string="string"}

Or a block:

{trans}string{/trans}

The example using the function may give:

{trans string='I pushed <a href="%s">%s on %s</a>!' 1=$url|urlencode 2=$what|htmlencode 3=$where|htmlencode}

And with a block:

{trans 1=$url|urlencode 2=$what|htmlencode 3=$where|htmlencode}I pushed <a href="%s">%s on %s</a>!{/trans}

That's more or less the same but some things to note here:

  • 1, 2, 3 are ugly; we'd better use positional parameters (see "Positional parameters ?" #164)
  • Positional parameters are better, but it is still not obvious which parameter maps to which place

Just for the record, a version with positional parameters and a context:

{trans $url|urlencode $what|htmlencode $where|htmlencode}I pushed <a href="%s">%s on %s</a>!{/trans}

To enhance readability, one may think of:

{trans}I pushed <a href="{$url|urlencode}">{$what|htmlencode} on {$where|htmlencode}</a>!{/trans}

That's clearly more readable. Sadly, it won't work with the current implementation of block plugins, as the content is processed (strings are replaced) before being given to the block.

This may be doable using a preprocessor plugin that rewrites the "in-place syntax" to a "parametrized syntax".

Obviously, this last syntax needs a specialized xgettext that will replace {...} in strings with %1s, %2s, etc.: there's NO way we want {$url|urlencode} in po files. (Also consider that the translations may be reused from other languages, so %1s is cool: it's supported by C, Java, whatever...)
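As a rough sketch of that extraction step, here is how the {...} tags could be turned into numbered markers. It is written in Python purely for illustration (a real tool would more likely be PHP or a wrapper around xgettext); the name to_msgid and the regular expression are assumptions, and nested braces are not handled.

```python
import re

# Matches one Smarty variable tag such as {$url|urlencode}.
# Assumption: tags never contain nested braces.
PLACEHOLDER = re.compile(r"\{\$[^{}]+\}")

def to_msgid(body: str) -> str:
    """Replace each {$...} tag with a numbered marker: %1s, %2s, ..."""
    counter = 0

    def numbered(_match):
        nonlocal counter
        counter += 1
        return f"%{counter}s"

    return PLACEHOLDER.sub(numbered, body)

print(to_msgid('I pushed <a href="{$url|urlencode}">{$what|htmlencode} on {$where|htmlencode}</a>!'))
# prints: I pushed <a href="%1s">%2s on %3s</a>!
```

The %1s form is kept because it is the one used in this thread; note that printf's own argument-reordering syntax is %1$s, %2$s.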

I'm waiting for feedback on those ideas, while trying to implement the preprocessor.

@uwetews
Contributor

uwetews commented Jan 27, 2016

So what about

{trans url=$url|urlencode what=$what|htmlencode where=$where|htmlencode}I pushed <a href="%url">%what on %where</a>!{/trans}

It could be implemented as a standard block plugin: the index names of the $params array in the plugin would correspond to the %name markers in the content, which could easily be replaced.
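A minimal sketch of that replacement step, in Python for illustration only (the actual plugin would be PHP). The function name fill_named is hypothetical, and the translation lookup that would run on the content before substitution is left out.

```python
import re

def fill_named(content: str, params: dict) -> str:
    """Replace %name markers with params[name]; unknown names are left untouched."""
    def lookup(match):
        name = match.group(1)
        return str(params.get(name, match.group(0)))

    # A marker is '%' followed by an identifier, e.g. %url or %where.
    return re.sub(r"%([A-Za-z_]\w*)", lookup, content)

print(fill_named(
    'I pushed <a href="%url">%what on %where</a>!',
    {"url": "/foo", "what": "commit", "where": "git"},
))
# prints: I pushed <a href="/foo">commit on git</a>!
```

Matching whole identifiers (rather than doing one plain string replace per parameter) avoids a marker such as %url accidentally matching inside a longer one such as %urls.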

@JulienPalard
Author

That's true, but:

{trans url=$url|urlencode what=$what|htmlencode where=$where|htmlencode}I pushed <a href="%url">%what on %where</a>!{/trans}

is less readable, less maintainable, a bit of a bug magnet (a typo in a param name vs. the placeholder name), and longer than:

{trans}I pushed <a href="{$url|urlencode}">{$what|htmlencode} on {$where|htmlencode}</a>!{/trans}

Plus, with my proposed syntax, migrating non-translatable sentences to translatable ones is easy: just add {trans} and {/trans} around your sentence.

Also, my proposed syntax allows the {trans} tag to take parameters without conflicting with the values to be replaced, and gettext typically needs some parameters like count, context, and domain.

Finally, rewriting using a preprocessor seems necessary to allow syntax like:

{trans count=$foobar_count}One foobar.{plural}{$foobar_count} foobars.{/trans}

When placeholders are used, this is clearly more readable than giving the plural form as a parameter.

I started to work on a demo preprocessor as a Smarty plugin (yay) using regex ( :( ) and it works well. I just need some time to meditate on the pros and cons of using it, which is also why I need feedback.
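To make the {plural} idea concrete, here is a small sketch of how the block body could be split and handed to an ngettext-style lookup. It uses Python's standard gettext module as a stand-in for the PHP gettext extension; render_plural is a hypothetical name, and the body is assumed to have already had its {$...} tags rewritten to markers.

```python
import gettext

def render_plural(body: str, count: int, translations=None) -> str:
    """Split the block body at {plural} and pick a form with ngettext."""
    singular, separator, plural = body.partition("{plural}")
    if not separator:
        raise ValueError("missing {plural} separator")
    if translations is None:
        # With no catalog loaded, NullTranslations returns the singular
        # form when count == 1 and the plural form otherwise.
        translations = gettext.NullTranslations()
    return translations.ngettext(singular, plural, count)

print(render_plural("One foobar.{plural}%1s foobars.", 1))  # prints: One foobar.
print(render_plural("One foobar.{plural}%1s foobars.", 3))  # prints: %1s foobars.
```

With a real catalog loaded, the choice of form would follow that language's plural rule rather than the English-like default.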

@Stadly

Stadly commented Feb 1, 2016

Great work! I'm searching for a gettext implementation to use myself, and have been looking at these projects:

But I really like your proposed syntax. So readable and easy to use!

{trans}I pushed <a href="{$url|urlencode}">{$what|htmlencode} on {$where|htmlencode}</a>!{/trans}
{trans count=$foobar_count}One foobar.{plural}{$foobar_count} foobars.{/trans}

@manuelcanga

Maybe with this, plus a callback function invoked whenever a variable is used. For example:

function smarty_block_translate($params, $content, Smarty_Internal_Template $template, &$repeat)
{
    if ($repeat) {
        $smarty = new Smarty();
        // callback_when_var_is_used() is a proposed hook, not an existing Smarty method.
        $smarty->callback_when_var_is_used(function ($var) use ($params) {
            return __($params[$var]);
        });
        $content = $smarty->fetch('string:' . $content);
        $repeat = false;
    }

    return $content;
}

@JulienPalard
Author

@manuelcanga So you're translating variables one by one?

@cdp1337

cdp1337 commented May 3, 2016

I'm just leaving what we do currently, as it addresses i18n from within smarty.

We created a function called t and use string placeholders for the text to replace.

For example:

<a href="/somewhere" title="{t 'STRING_GO_SOMEWHERE'}">
{t 'STRING_CLICK_TO_GO_SOMEWHERE'}
</a>

Then, the 't' function uses our internal i18n logic to translate that string to the indexed string in a text file based on user language selection.

We needed a plain text solution for the translation strings because end-users need to be able to edit them when necessary.

As for singular/plural scenarios, 'STRING_N_THINGS' gets mapped to 3 indexes from within the translation file; STRING_0_THINGS, STRING_1_THINGS, and STRING_N_THINGS. Toggling between them is based on input values.

{assign var='qty' value=0}
{t 'STRING_THERE_ARE_N_THINGS' $qty}
// "There are 0 things"

{assign var='qty' value=1}
{t 'STRING_THERE_ARE_N_THINGS' $qty}
// "There is 1 thing"

{assign var='qty' value=2}
{t 'STRING_THERE_ARE_N_THINGS' $qty}
// "There are 2 things"
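The 0/1/N toggling described above could look roughly like the sketch below. It is written in Python for illustration only; the catalog contents, the %d marker, and the rule of swapping the "_N_" segment of the key are assumptions, not the actual implementation.

```python
# Hypothetical catalog; the keys follow the 0/1/N naming described above.
CATALOG = {
    "STRING_THERE_ARE_0_THINGS": "There are %d things",
    "STRING_THERE_ARE_1_THINGS": "There is %d thing",
    "STRING_THERE_ARE_N_THINGS": "There are %d things",
}

def variant_key(key: str, qty: int) -> str:
    """Swap the _N_ segment of the key for _0_ or _1_ when qty is 0 or 1."""
    suffix = {0: "0", 1: "1"}.get(qty, "N")
    return key.replace("_N_", f"_{suffix}_", 1)

def t(key: str, qty: int) -> str:
    """Look up the variant for this quantity and insert the quantity."""
    return CATALOG[variant_key(key, qty)] % qty

for qty in (0, 1, 2):
    print(t("STRING_THERE_ARE_N_THINGS", qty))
# prints: There are 0 things / There is 1 thing / There are 2 things
```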

@JulienPalard
Author

@cdp1337 Beware that your version doesn't handle contexts, which are sometimes necessary.

Also, how do you translate and correctly encode I pushed <a href="%s">%s on %s</a>!?

@cdp1337

cdp1337 commented May 3, 2016

I hadn't thought of contexts, but the developer is more than welcome to prefix their i18n string with something meaningful for the context; such as STRING_MENU_PRINTER_OPEN and STRING_MENU_FILE_OPEN.

For the example of I pushed <a href="%s">%s on %s</a>!:

// en.ini
STRING_I_PUSHED_LINK_S_S_ON_S = 'I pushed <a href="[%1%]">[%2%] on [%3%]</a>!'
// de.ini
STRING_I_PUSHED_LINK_S_S_ON_S = 'Ich drückte <a href="[%1%]">[%2%] am [%3%]</a>!'

// Smarty template
{t 'STRING_I_PUSHED_LINK_S_S_ON_S' '/place/not/here' 'foobar' '2016.01.01'}

In the event that you want one variable repeated, you simply use the index however many times necessary. Such as:

// en.ini
STRING_THING_ABOUT_S = 'The thing about [%1%] is that [%1%] are [%1%], and only [%1%] are [%1%].'

// Smarty template
{t 'STRING_THING_ABOUT_S' 'cats'}
{t 'STRING_THING_ABOUT_S' 'dogs'}
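For illustration, the [%N%] substitution could be sketched as follows. This is Python rather than the PHP behind the 't' function, and fill_indexed is a hypothetical name; out-of-range indexes are simply left in place here, which is an assumption about the desired behaviour.

```python
import re

def fill_indexed(template: str, *args) -> str:
    """Replace each [%N%] marker with the N-th argument (1-based)."""
    def lookup(match):
        index = int(match.group(1))
        if 1 <= index <= len(args):
            return str(args[index - 1])
        return match.group(0)  # leave unknown indexes untouched

    return re.sub(r"\[%(\d+)%\]", lookup, template)

print(fill_indexed(
    'I pushed <a href="[%1%]">[%2%] on [%3%]</a>!',
    "/place/not/here", "foobar", "2016.01.01",
))
# prints: I pushed <a href="/place/not/here">foobar on 2016.01.01</a>!
```

Because markers carry their own index, a translation is free to reorder or repeat them without any change to the template call.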

Now, I'm not implying that my method is the best way to achieve this; in fact, I'd love feedback on how it could be improved! It's just how I have implemented it so far.

@JulienPalard
Author

And your example can properly encode (given urlencode as u and htmlencode as h, but that's just an example):

{t 'STRING_I_PUSHED_LINK_S_S_ON_S' '/place/not/here'|u 'foobar'|h '2016.01.01'|h}

Which is nice. How do you disambiguate parameters and quantity? Also, does Smarty let you pass positional parameters? How?

About enhancements: I'm clearly not a big fan of having "translation keys" like "STRING_I_PUSHED_LINK_S_S_ON_S". They don't give translators enough context on what "S" is, so they force you to produce a first translation yourself (double work).

I prefer the GNU gettext approach of having a "C" language, which is the "developer" one and can be translated to English and any other language. This way, your untranslated interface still looks nice, and you provide a lot of information to translators (an already "correct" sentence).

Finally, why not consider using the .po format, with .mo files in production and the GNU gettext extension? They are no harder to generate than .ini files, they are more explicit about how plurals are handled (which differs between languages!), and they natively handle contexts (like you, by prefixing the string in the .mo file, but presented nicely in the .po file).
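To illustrate the context prefixing: GNU gettext stores a message that has a context under the key "context, EOT character (\x04), msgid". The Python sketch below mimics that lookup with a plain dict; the catalog entries, context names, and function names are made up for the example.

```python
# GNU gettext joins the context and the msgid with the EOT character.
CONTEXT_SEPARATOR = "\x04"

def mo_key(msgid: str, context=None) -> str:
    """Build the lookup key used for a message with an optional context."""
    if context is None:
        return msgid
    return f"{context}{CONTEXT_SEPARATOR}{msgid}"

def pgettext(catalog: dict, context: str, msgid: str) -> str:
    """Look up msgid in the given context, falling back to msgid itself."""
    return catalog.get(mo_key(msgid, context), msgid)

# Hypothetical catalog: the same source string, translated per context.
catalog = {
    mo_key("Open", "menu.file"): "Ouvrir",
    mo_key("Open", "menu.printer"): "Déverrouiller",
}

print(pgettext(catalog, "menu.file", "Open"))     # prints: Ouvrir
print(pgettext(catalog, "menu.printer", "Open"))  # prints: Déverrouiller
print(pgettext(catalog, "menu.edit", "Open"))     # prints: Open
```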

@cdp1337

cdp1337 commented May 5, 2016

cdp1337@9aba258

I modified Smarty core by replacing the line

$compiler->trigger_template_error('too many shorthand attributes', null, true);

with

$_indexed_attr[] = $mixed;

I found it strange that some internal functions could handle positional arguments, but that this isn't exposed to third-party developers. With this approach, any developer can write a function that accepts index-based arguments. This does introduce ambiguity in some cases, but I felt the risk of ambiguity was outweighed by simple functions (such as date and other functions that only accept one parameter) being able to have cleaner calls in the template. {date date="2016-01-01"} just seemed silly/redundant.


In my system, the translation string calls the positioned argument by the key [%1%], [%2%], etc. This does introduce chance of confusion, as the order of the parameters must be exact in the template, but that hasn't been an issue for us yet. If the developer switches the order of parameters in the template, then the corresponding logic will need to change. To alleviate this concern, we can pass named arguments into the function and call them with the same syntax; [%VARNAME%].


Personally, I've found that moving the translation strings out of the template has helped speed up development. E.g., I don't have to worry about what text is contained in message N; all I care about is that it's STRING_FOR_SOMETHING, and I move on. The translation can be done later by someone else if necessary. This also helps segregate different portions of code: the tpl is responsible for the layout and how data is displayed ONLY, while the translation strings reside in a different location entirely, just like CSS/JS. This is a somewhat extreme example, but it seems logical to me. It's also how Android development functions: the interface XML contains literal strings which are then referenced by the lang files elsewhere. The framework benefits because it has a list of what literal strings are available in the system, and translators benefit because everything they need is in one file, as opposed to digging through the templates to see what's there.

I'm also doing a trick of writing the en_US version of the string in the supplemental translation files as a comment just above the literal string. This way if someone bilingual is doing translation, they have the English version of the word/phrase available right there while they're scrolling through the file.

One important downside I've found thus far is ini files are single-line only, so if you need to pass in a paragraph for translation, that entire paragraph is on one line :( At some point we may switch to JSON or YAML files to alleviate this, which will be a simple fix in the translation subsystem to check for LANG.yml in addition to LANG.ini, so I'm not overly concerned about that migration.

@uwetews
Contributor

uwetews commented May 5, 2016

@cdp1337
Your modification does break Smarty's error handling and is causing the continuous integration test on Travis CI to fail.

@cdp1337

cdp1337 commented May 5, 2016

@uwetews That's because there's a test case that checks the exact bug/feature I changed. It expects an exception to be thrown in that case, and with my change it goes through cleanly. Would you like me to update that test case and submit another pull request?

@JulienPalard
Author

@cdp1337 @uwetews I really like the possibility of using positional arguments; they are sometimes really nice and clear (like you said, I prefer {date "2016-01-01"} over {date date="2016-01-01"}). A bit off-topic, but not too much.

@JulienPalard
Author

@cdp1337 I thought a bit about your syntax, and came back to my abandoned git branch where I worked on it in January ^^

First, I prefer %1s, %2s, %3s as markers because they are compatible with printf, and therefore with many languages. That allows you (and other Smarty users, if we more or less settle on a syntax at the end of this thread) to reuse your translations: if you have multiple applications or multiple backends, if you switch from one language to another, or even if you write an extension in C, whatever...

Also, using positional markers is highly important, because a lot of languages use different order in their sentences. But using named markers may be easier for translators to work with, giving them a bit of context about what is what.

Using your syntax, and allowing positional arguments, we can do:

{t 'I pushed <a href="%1s">%2s on %3s</a>!' "/foo" "commit" "git"}

Leaving the named arguments to gettext parameters, which is nice and readable:

{t 'I pushed <a href="%1s">%2s on %3s</a>!' $url|urlencode $subject|htmlencode $target|htmlencode count=5 context=menu}

Yet I prefer my syntax, having the downside of needing a prefilter, which is:

{trans}I pushed <a href="{$url|urlencode}">{$subject|htmlencode} on {$target|htmlencode}</a>!{/trans}

Ultimately, the prefilter only turns the {trans}...{/trans} block into a single {t ...} tag, changing each {...} into a positional, numbered %s. So the two syntaxes are not incompatible: one simply builds on the other.

This last version really mixes text into the template, the complete opposite of hiding the string in a tag like:

{t 'STRING_I_PUSHED_LINK_S_S_ON_S' $url|urlencode $subject|htmlencode $target|htmlencode}

I read your argument about "tags speed up development" but I'm not sure I get it. For me, the {trans} block allows developers to write pure Smarty code like I pushed <a href="{$url|urlencode}">{$subject|htmlencode} on {$target|htmlencode}</a>! and just drop {trans} around it, and BIM, it's internationalized. (Note that this is particularly handy for internationalizing existing code that is not yet internationalized.)

On the other hand, with your syntax, the developer has to declare the TAG in another file, which is more work and makes porting non-internationalized code very time-consuming.

With both syntaxes, we can imagine an 'xgettext' tool parsing Smarty code to automatically generate .pot files.

I currently have an implementation of the prefilter that I can share if someone wants to test it. My prefilter changes {gettext}Hello {$target}{/gettext} to {gettext $target}Hello %1s{/gettext}, and I have a smarty_block_gettext that takes over from there.

I still need to write the "xgettext" to make it really useful; we don't want to generate pot files manually.
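The rewrite described above can be sketched with two regular expressions. This is a Python illustration under stated assumptions, not the actual prefilter (which would be a PHP Smarty plugin): the function names are hypothetical, only {gettext} tags without attributes are matched, and variable tags containing spaces or nested braces are not handled.

```python
import re

# A {gettext}...{/gettext} block with no attributes on the opening tag.
BLOCK = re.compile(r"\{gettext\}(.*?)\{/gettext\}", re.DOTALL)
# A variable tag such as {$target} or {$url|urlencode}.
TAG = re.compile(r"\{(\$[^{}]+)\}")

def rewrite_block(match):
    """Move each {$...} tag into the opening tag, leaving %1s, %2s, ... behind."""
    args = []

    def numbered(tag):
        args.append(tag.group(1))
        return f"%{len(args)}s"

    body = TAG.sub(numbered, match.group(1))
    opening = "{gettext" + "".join(f" {arg}" for arg in args) + "}"
    return f"{opening}{body}{{/gettext}}"

def prefilter(source: str) -> str:
    """Rewrite every {gettext} block found in the template source."""
    return BLOCK.sub(rewrite_block, source)

print(prefilter("{gettext}Hello {$target}{/gettext}"))
# prints: {gettext $target}Hello %1s{/gettext}
```

Text outside {gettext} blocks is passed through unchanged, so the rewrite can run over a whole template.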
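As a sketch of what such a tool would need to emit for each extracted string, here is a minimal .pot entry writer in Python. The function name pot_entry is hypothetical, and a real .pot file would also need the header entry, which is omitted here.

```python
def pot_entry(msgid: str, reference: str) -> str:
    """Format one .pot entry with a '#: file:line' reference comment."""
    escaped = (
        msgid.replace("\\", "\\\\")  # escape backslashes first
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'#: {reference}\nmsgid "{escaped}"\nmsgstr ""\n'

print(pot_entry('I pushed <a href="%1s">%2s on %3s</a>!', "index.tpl:12"))
# prints:
# #: index.tpl:12
# msgid "I pushed <a href=\"%1s\">%2s on %3s</a>!"
# msgstr ""
```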

@cdp1337

cdp1337 commented May 6, 2016

Oddly enough I've never been a huge fan of printf/sprintf... don't know why. Probably why I opted for [%N%] over %N[sdf].

What I meant by speeding up development is when initially writing the template file I don't have the translation string available so I usually had to come up with it on-the-fly. This usually meant I'd spend a few minutes going back and forth on a few variations instead of continuing on with the template. As for the tag in another file, the framework handles all that automatically.

@JulienPalard
Author

About not having the translation while writing the tpl: that's normal. In the GNU gettext way, you write your message in the tpl in an imaginary "developer language" called "C"; it can then be translated by translators into proper English (and other languages).

There is, in fact, only a slim difference between all-caps tags and "C strings": the latter are tags too, not really meant to be shown to users. But they're cool because:

  • Sometimes they don't even need to be translated (like login, logout, ...)
  • As a developer you have nothing more to do (no external language file to edit); in fact, with my prefilter, the only thing to do is add {trans} and {/trans} and keep normal Smarty syntax in between.
  • Your developer version is "looking good"

@wisskid
Member

wisskid commented Jan 29, 2020

@wisskid
Member

wisskid commented Apr 29, 2024

See #1005 (comment)

6 participants