Discussion:
Website translations with Haunt
pelzflorian (Florian Pelz)
2017-12-09 18:06:19 UTC
Hello,

First of all, I want to say thank you to the Guile, Haunt, ffi-helper
and related projects’ developers.

I built my personal website [1] using David Thompson’s Haunt [2] and
recently talked my university’s Islamic Students’ Association
(Islamische Hochschulvereinigung) into using Haunt for their
not-yet-finished website as well, because I think the concept of Haunt
and SHTML is superior to alternatives. However, to make the
website multilingual (the user can choose to view it in German or
English), I have so far used an association list with assoc-ref, which
is not very convenient since every string has to be added in two
places, i.e. in the SHTML code and in the association list where the
code looks for translations.

I want to ask for your thoughts on my new solution since translations
are probably important to many Haunt users. In particular, I believe
there was some discussion on Website translation on the Guile or Guix
lists as well.

I did not want to use the ordinary gettext functions because that
would mean calling setlocale very often to switch languages. It seems
the Gettext system is not designed for rapidly changing locales, but
maybe I am wrong about this and many setlocale calls would not be that
bad.

Using Matt Wette’s ffi-helper [3] and libgettextpo from GNU
Gettext [4], I have now written code to convert a po file into an
association list in which I can look up translations. (Note that I had
to patch the c99dev branch of nyacc before building nyacc and
ffi-helper because the Makefile had spaces instead of tabs in one
place and it did not find guile and guild. Maybe these patches were
only necessary because of a broken setup on my part, though.)

I used this dot.ffi file to create libgettextpo bindings:

(define-ffi-module (gettext-po)
  #:include '("gettext-po.h")
  #:library '("libgettextpo"))

Then I wrote a procedure to convert a po file to an association list
to look up the msgstr for a msgid. Note that by lingua I basically
mean a locale.

(use-modules
 (gettext-po)
 (system ffi-help-rt)
 ((system foreign)
  #:prefix ffi:)
 […])

(define xerror-handler-struct
  (make-struct-po_xerror_handler)) ; TODO SET HANDLERS:
;; […]

(define (translations-for-lingua lingua)
  "Returns po/<lingua>.po converted to an association list of msgid–msgstr pairs."
  ;; TODO: STILL DISREGARDING PLURALS AND OTHER INFORMATION
  (let* ((po-file
          (po_file_read_v3
           (string-append "po/" lingua ".po")
           (pointer-to xerror-handler-struct)))
         (translations
          (if (ffi:null-pointer? (unwrap~pointer po-file))
              '()
              ;; otherwise:
              (let ((iter (po_message_iterator po-file ffi:%null-pointer)))
                (let loop ((message (po_next_message iter)))
                  (if (ffi:null-pointer? (unwrap~pointer message))
                      (begin
                        (po_message_iterator_free iter)
                        '())
                      ;; otherwise:
                      (cons
                       (cons
                        (ffi:pointer->string (po_message_msgid message))
                        (ffi:pointer->string (po_message_msgstr message)))
                       (loop (po_next_message iter)))))))))
    (if (not (ffi:null-pointer? (unwrap~pointer po-file)))
        (po_file_free po-file))
    translations))



I did this for every locale and made a second association list mapping
the locales to the msgid–msgstr association lists. Then I wrote
translated-msg to do the lookup.


(define (translations-entry-for-lingua lingua)
  "Returns a pair of LINGUA and an association list of its translations."
  (cons
   lingua
   (translations-for-lingua lingua)))

(define translated-msg
  ;; gettext is not used directly because it would require repeated
  ;; setlocale calls, which should not be necessary.
  ;; See: https://stackoverflow.com/questions/3398113/php-gettext-problems
  (let ((translation-lists
         (map translations-entry-for-lingua linguas)))
    (define (with-default value default)
      (if value value
          default))
    (lambda (msgid lingua)
      "Returns the msgstr for MSGID from the po file for LINGUA."
      (let ((translations (assoc-ref translation-lists lingua)))
        (with-default
         (assoc-ref translations msgid)
         msgid)))))
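
The two-level lookup can be tried without any po files. The following is only a minimal sketch: sample-translation-lists and lookup-msg are hypothetical names, and the German strings are made-up sample data standing in for what translations-for-lingua would return. Unlike the code above, it also guards against a lingua that has no entry at all.

```scheme
;; Hypothetical stand-in data; in the thread's code these lists are
;; read from po/<lingua>.po by translations-for-lingua.
(define sample-translation-lists
  '(("de" . (("Hello" . "Hallo")
             ("← Back to home page" . "← Zurück zur Startseite")))
    ("en" . (("Hello" . "Hello")))))

(define (lookup-msg msgid lingua)
  "Return the msgstr for MSGID in LINGUA, or MSGID itself if none is found."
  ;; An unknown lingua yields #f from the outer assoc-ref, so fall back
  ;; to an empty list before the inner lookup.
  (let ((translations (or (assoc-ref sample-translation-lists lingua) '())))
    (or (assoc-ref translations msgid)
        msgid)))

(lookup-msg "Hello" "de")    ; found in the "de" list
(lookup-msg "Goodbye" "de")  ; no entry, falls back to the msgid
(lookup-msg "Hello" "fr")    ; unknown lingua, falls back to the msgid
```

Note that the empty string counts as true in Scheme, so an entry whose msgstr is empty would be returned as it is instead of falling back to the msgid.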


As a Gettext-like shorthand I wrote a macro called _ which calls the
above translated-msg function. It takes the locale from the
current-lingua variable, so the macro deliberately breaks hygiene.

(define-syntax _
  (lambda (x)
    "Gettext-like shorthand for translated-msg with what currently is current-lingua."
    (syntax-case x ()
      ((_ msg)
       (with-syntax ((current-lingua (datum->syntax x 'current-lingua)))
         #'(translated-msg msg current-lingua))))))


I use it as in this excerpt:

(define (back-button-for-lingua lingua)
  "SXML for a link back to the home page."
  (let ((current-lingua lingua))
    `(a (@ (href ,(string-append "/index" "-" lingua ".html"))
           (class "full-width-link"))
        ,(_ "← Back to home page"))))
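
For comparison, one possible way to avoid breaking hygiene is a Guile parameter that carries the lingua for the duration of a page build. This is only a sketch of an alternative design, not what the code above does: the procedure name tr is hypothetical (xgettext would then need --keyword=tr), and translated-msg is replaced here by a trivial stand-in so that the example runs on its own.

```scheme
;; Hypothetical stand-in for the thread's translated-msg, so that this
;; sketch is self-contained.
(define (translated-msg msgid lingua)
  (string-append "[" lingua "] " msgid))

;; A parameter holding the lingua for the dynamic extent of a page build.
(define current-lingua (make-parameter "en"))

(define (tr msgid)
  "Gettext-like shorthand that reads the lingua from the parameter."
  (translated-msg msgid (current-lingua)))

(define (back-button-for-lingua lingua)
  "SXML for a link back to the home page."
  (parameterize ((current-lingua lingua))
    `(a (@ (href ,(string-append "/index-" lingua ".html"))
           (class "full-width-link"))
        ,(tr "← Back to home page"))))

(back-button-for-lingua "de")
```

With this design the lingua no longer has to be a lexical variable named current-lingua at every call site; the trade-off is that the shorthand becomes an ordinary procedure with dynamically scoped state.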


Then I ran the xgettext program from the terminal to create a pot file
from all strings marked with _.

xgettext -f po/POTFILES -o po/pelzfloriande-website.pot --from-code=UTF-8 --copyright-holder="" --package-name="pelzfloriande-website" --msgid-bugs-address="***@pelzflorian.de" --keyword=_

Xgettext autodetected all the strings that were marked for
translation. This is much better than my previous approach, where I
had to list all of them by hand in an association list.

To create a po file from a pot file, I do the usual:

cd po
msginit -l de --no-translator
msginit -l en --no-translator


Then I filled out the po files using gtranslator. I can now run
“haunt build” with an appropriate GUILE_LOAD_PATH, for me currently:

GUILE_LOAD_PATH=$HOME/keep/projects/pelzfloriande-website:$HOME/build/nyacc/src/nyacc/examples:$GUILE_LOAD_PATH GUILE_LOAD_COMPILED_PATH=$GUILE_LOAD_COMPILED_PATH:$HOME/.cache/guile/ccache/2.2-LE-8-3.A/home/florian/keep/projects/pelzfloriande-website haunt build



Is this the right approach?

You can find my unfinished, not very clean code at [5].

Regards,
Florian


[1] https://pelzflorian.de
[2] https://haunt.dthompson.us/
[3] https://savannah.nongnu.org/projects/nyacc/
[4] https://www.gnu.org/software/gettext/manual/html_node/libgettextpo.html
[5] https://pelzflorian.de/git/pelzfloriande-website/commit/?id=5f97bf157eaddcfe722c97dcab349b7dcfbbcd9d
ng0
2017-12-09 18:15:29 UTC
Hey Florian,

Interesting work. I opened a bug for Guix a very long time ago,
and its content also applied to Guile (as well as to infotropique,
the project I initiated, and very likely to many other websites that
use Haunt, such as Haunt's own, 8sync, etc.).

I saw the first versions of this a while back in the code
for your website. I'll check out the changes soon, exciting work!

My idea was a reimplementation of prep, a text format and HTML document
generator that can include multiple languages in its source files.
It was written by lynX back in the early 90s (or earlier), iirc, and
is still used today.
--
GnuPG: A88C8ADD129828D7EAC02E52E22F9BBFEE348588
GnuPG: https://c.n0.is/ng0_pubkeys/tree/keys
WWW: https://n0.is
pelzflorian (Florian Pelz)
2017-12-09 21:08:57 UTC
Post by ng0
Hey Florian,
interesting work. I opened a bug a very long time ago for Guix,
and the content applied to Guile (as well as to the project
I initiated (infotropique) and very likely many other Haunt
using websites like Haunt itself, 8sync, etc).
I've seen the first versions of this a while back in the code
for your website, I'll check out the changes soon, exciting work!
Oh, thank you. :) Some code is not clean and more proof-of-concept.
Post by ng0
My idea was a reimplementation of prep (a text format and html document
generator that is capable to include multiple languages in its source
files, written by lynX back in the early 90s (or earlier) iirc (and
still used today).
Hmm I can find neither prep nor lynX in a quick Web search.

Po files are separate from the source code files. I don’t know if
prep has such a separation or if such a separation really is better.

Then again, another issue with HTML/SHTML is that e.g. a hyperlink within
the text is difficult to handle with traditional format strings
without breaking the separation:

* The link text should be translated in the context of the text
surrounding the link

* but the link formatting, and maybe the link destination URL,
probably should not be changeable by the translator.

Does prep handle this?

Regards,
Florian
ng0
2017-12-09 22:29:49 UTC
Post by pelzflorian (Florian Pelz)
Post by ng0
Hey Florian,
interesting work. I opened a bug a very long time ago for Guix,
and the content applied to Guile (as well as to the project
I initiated (infotropique) and very likely many other Haunt
using websites like Haunt itself, 8sync, etc).
I've seen the first versions of this a while back in the code
for your website, I'll check out the changes soon, exciting work!
Oh, thank you. :) Some code is not clean and more proof-of-concept.
Post by ng0
My idea was a reimplementation of prep (a text format and html document
generator that is capable to include multiple languages in its source
files, written by lynX back in the early 90s (or earlier) iirc (and
still used today).
Hmm I can find neither prep nor lynX in a quick Web search.
lynX is the author of many things, among them the psyc and psyc2 protocols,
psyced, the /me command in IRC, etc. This page is generated from prep: http://my.pages.de
(http://my.pages.de/me should show more).

So prep is at git://git.psyced.org/git/prep unless I have used the
wrong .org name for the onion I use for it.

git clone git://git.psyced.org/git/stru should give you one project that
shows what a complete website can look like. I've only helped with the
translation of one article on a different website. It's definitely
interesting to work with.
Post by pelzflorian (Florian Pelz)
Po files are separate from the source code files. I don’t know if
prep has such a separation or if such a separation really is better.
Then again, another issue with HTML/SHTML is that e.g. a hyperlink within
the text is difficult to handle with traditional format strings
* The link text should be translated in the context of the text
surrounding the link
* but the link formatting and maybe the link destination URL maybe
should not be changeable by the translator.
Does prep handle this?
To some extent. prep is not SGML. It's been a while since I've looked
into it, but basically it has its own syntax. The following snippet
(from pages/my/convivenza.mlm) doesn't show its full usage,
but it shows how links are handled:


I)#section Approfondimenti
D)#section Vertiefung
E)#section Further Reading

<ol>
<li>(de) [https://wiki.piratenpartei.de/Wahlen/Bund/2013/Analyse/Disziplin_und_Fairness Piratenpartei, Analyse zur Bundestagswahl 2013: Disziplin und Fairness];
<li>(en) [http://piratetimes.net/what-is-happening-in-germany/ Andrew Reitemeyer, What is Happening in Germany];
<li>(en) [http://my.pages.de/ppp-hurting carlo von lynX, How to build a Participatory Political Party: Stop the Hurting], citing (en) Rick Falkvinge, Swarmwise – Paragraphs on Infighting and Moderation;
I)<li>(it) [https://it.wikipedia.org/wiki/Classificazione_delle_fallacie Classificazione delle fallacie];
DE)<li>(en) [https://en.wikipedia.org/wiki/Formal_fallacy Logical Fallacies];
[https://yourlogicalfallacyis.com yourlogicalfallacyis], [http://www.logicalfallacies.info logicalfallacies.info];
## <li>(en) [http://www.sociology.org/content/vol006.001/liu.html What Does Research Say about the Nature of Computer-mediated Communication?], 2002: «Kiesler et al. (1985) reported that they could not find any influence of CMC environments on physiological arousal, nor on emotions or self-evaluations. In addition, Kiesler et al. found that participants in CMC groups evaluated each other lower than those in FtF groups. From the perspective of Kiesler et al. (1985), CMC environments were impersonal.»;
## <li>(en) Kiesler, Sproull, [https://www.sciencedirect.com/science/article/pii/074959789290047B Group decision making and communication technology]: «Experiments show that, compared with a face-to-face meeting, a computer-mediated discussion leads to delays; more explicit and outspoken advocacy; “flaming;” more equal participation among group members; and more extreme, unconventional, or risky decisions.»;
## (((fn (en) [https://blog.discourse.org/2013/03/the-universal-rules-of-civilized-discourse/ The Universal Rules of Civilized Discourse], 2013: «The principles in the default Discourse community behavior FAQ were distilled, as best we could, from the common, shared community guidelines of more than 50 forums active for a decade or more.»)))
<li>(it) [https://pad.partito-pirata.it/p/2014-rispetto discussione strutturata sulla Convivenza, 2014];
<li>(it) [https://pad.partito-pirata.it/p/2014-altrostatuto GdL Stru, Raccomandazioni Strutturali per facilitare una Partecipazione Orizzontale all'AltraEuropa, 2014];
</ol>

## I) TODO FIXME
D)#section Alternative Ansätze
E)#section Alternative Approaches

D)Unter
DE)Restorative Justice\
DE)(((fn (en) [https://www.rpiassn.org/practice-areas/what-is-restorative-justice/ Restorative Justice])))
D)versteht sich eine Methode, Frieden zwischen den Mitwirkenden zu schaffen durch quasi-öffentlicher Aussprache und Einigung auf Maßnahmen zur Wiederherstellung einer Gerechtigkeit.
E)is a method to re-establish peace between the participants by nearly public ventilation and agreement on measures to restore justice.
D)Die Einfachheit dieses Ansatzes ist verlockend, wahrt allerdings nicht die Bedürfnisse nach Privatsphäre der Mitwirkenden, weswegen ein gewähltes Schiedsgericht diese Herausforderung besser erfüllen kann.
E)The simplicity of this method is inviting, it doesn't however take the needs for privacy of the participants in consideration, whereby a Court of Arbitration can serve a better job.
D)Ausserdem setzt es erst nach entstandenem Schaden an, während unserer Ansatz auf Prävention zielt.
E)Also, it only comes into play after damage done, whereas our approach is oriented on prevention.

I)#section Fonti
D)#section Quellen
E)#section References
## I)#section Annotazioni
## D)#section Fußnoten
## E)#section Footnotes

#footnotes

#space 4
#repost convivenza





The format for links ([ ], []() etc.) can be defined, as far as I remember.
So far I haven't done very much with this. I've had a chat about prep with its
developer, and that's about it.
To the best of my knowledge, nothing else has been developed that lets you
conveniently define multiple languages in one source file, apart from plain S/XML,
and this is what I find interesting about prep. Whether I end up writing a
reimplementation in Guile or not, there are some interesting ideas that could
possibly be applied to your approach.
I'll need some time to compare the two approaches and find similarities.

Maybe my reply already helps, along with reading the (really small) codebase
of prep.
Thanks,
N.
pelzflorian (Florian Pelz)
2017-12-13 14:53:33 UTC
Post by ng0
Post by pelzflorian (Florian Pelz)
Post by ng0
My idea was a reimplementation of prep (a text format and html document
generator that is capable to include multiple languages in its source
files, written by lynX back in the early 90s (or earlier) iirc (and
still used today).
Hmm I can find neither prep nor lynX in a quick Web search.
lynX is the author of many things, among them the psyc and psyc2 protol,
psyced, the /me command in IRC etc. this page is generated from prep: http://my.pages.de
(http://my.pages.de/me should show more).
I see. lynX deserves a better Duckduckgo search ranking than he has. ;)

There is a lot to look at; thank you for the link. I now also
understand better what kind of organization youbroketheinternet is.

Either way,
Post by ng0
Post by pelzflorian (Florian Pelz)
Po files are separate from the source code files. I don’t know if
prep has such a separation or if such a separation really is better.
Then again, another issue with HTML/SHTML is that e.g. a hyperlink within
the text is difficult to handle with traditional format strings
* The link text should be translated in the context of the text
surrounding the link
* but the link formatting and maybe the link destination URL maybe
should not be changeable by the translator.
Does prep handle this?
To some extent. prep is not SGML. It's been a while since I've looked
into it but basically you have a unique syntax. The following snippet
(from pages/my/convivenza.mlm) doesn't show it's full usage,
I)#section Approfondimenti
D)#section Vertiefung
E)#section Further Reading
[…]
To my best knowledge nothing has been developed that allows to conveniently
define multiple languages in one source file outside from plain S/XML, and
this is what I find interesting about prep. Wether I end up writing a reimplementation
in Guile or not, there are some interesting ideas that could possibly be applied
to your approach.
I'll need some time to find time to compare the two approaches and find similarities.
Maybe my reply already helps, with reading the provided (really small) codebase
of prep.
Thank you, I looked at it. This is certainly interesting but
combining SHTML and translation in one file is not what I want. I
agree that it has advantages, but I have settled on this for now:


(define (paragraph-about-a-manatee-for-lingua lingua)
  (let* ((current-lingua lingua))
    `(p
      ,@(__ "img1|This|| is an image of a ||link1|https://en.wikipedia.org/wiki/Manatee|manatee||."
            `(("img1" .
               ,(lambda (text)
                  `(img (@ (src "Loading Image...")
                           (alt ,text)))))
              ("link1" .
               ,(lambda (url text)
                  `(a (@ (href ,url))
                      ,text))))))))





I implemented it like this:

(define translated-msg
  ;; gettext is not used directly because it would require repeated
  ;; setlocale calls, which should not be necessary.
  ;; See: https://stackoverflow.com/questions/3398113/php-gettext-problems
  (let […]
    (lambda (msgid lingua)
      "Returns the msgstr for MSGID from the po file for LINGUA."
      […])))

(define (translated-multipart-msg msg lingua assoc)
  "Looks up MSG for LINGUA via Gettext and returns a list of its parts.

Parts are separated by two vertical bars ||. If a part is prefixed by
a text followed by a vertical bar, the text is used as a key to look up
a lambda in the association list ASSOC. If found, the lambda is called
on the remainder of the part and the result added to the list. If it is
not found or there is no vertical bar, the entire part is added to the
list."
  (define (split-along-unescaped-matches str pattern)
    "Splits along pattern unless pattern is escaped by a backslash."
    (let loop ((remainder str) ; what to match with
               (start 0)) ; where to start matching, used to ignore escaped matches
      (let ((match (string-match pattern remainder start)))
        (if match ; if there is a match:
            (if (and
                 ;; if match not at the beginning
                 (not (= (match:start match) 0))
                 (eq? ; and escaped by a backslash
                  (string-ref
                   remainder
                   (- (match:start match) 1))
                  #\\))
                ;; then continue matching after the escaped match:
                (loop
                 (string-append ; the same as remainder but
                  (string-drop-right (match:prefix match) 1) ; drop backslash
                  (match:substring match)
                  (match:suffix match))
                 (- (match:end match) 1))
                ;; otherwise:
                (cons
                 ;; everything before the match
                 (match:prefix match)
                 (loop ; recursive call
                  (match:suffix match) ; on everything after the match
                  0))) ; start matching at start
            ;; if pattern did not match:
            (list remainder)))))
  (let ((msgstr-parts
         (split-along-unescaped-matches
          (translated-msg msg lingua)
          "\\\|\\\|")))
    (map
     (lambda (msgstr-part)
       (let* ((subparts
               (split-along-unescaped-matches
                msgstr-part
                "\\\|"))
              (part-lambda (assoc-ref assoc (car subparts)))
              (args (cdr subparts)))
         (if part-lambda
             (apply part-lambda args)
             msgstr-part)))
     msgstr-parts)))

(define-syntax __
  (lambda (x)
    "Gettext shorthand for multipart messages separated by || in the string."
    (syntax-case x ()
      ((__ msg assoc)
       (with-syntax ((current-lingua (datum->syntax x 'current-lingua)))
         #'(translated-multipart-msg msg current-lingua assoc))))))
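
To see what the splitting produces, here is a simplified, self-contained sketch of the same idea. It leaves out the translation lookup and the backslash escaping, and the names split-on and expand-parts as well as the image path "manatee.png" are hypothetical, chosen only for this illustration.

```scheme
(define (split-on str sep)
  "Split STR at every occurrence of the string SEP (no escape handling)."
  (let loop ((rest str))
    (let ((idx (string-contains rest sep)))
      (if idx
          (cons (substring rest 0 idx)
                (loop (substring rest (+ idx (string-length sep)))))
          (list rest)))))

(define (expand-parts msgstr handlers)
  "Split MSGSTR at || and apply the handler named before the first | of a part."
  (map (lambda (part)
         (let* ((subparts (split-on part "|"))
                (handler (assoc-ref handlers (car subparts))))
           (if handler
               (apply handler (cdr subparts))
               part)))
       (split-on msgstr "||")))

(define sample-handlers
  `(("img1" . ,(lambda (text)
                 ;; "manatee.png" is a made-up path for this sketch
                 `(img (@ (src "manatee.png") (alt ,text)))))
    ("link1" . ,(lambda (url text)
                  `(a (@ (href ,url)) ,text)))))

(expand-parts
 "img1|This|| is an image of a ||link1|https://en.wikipedia.org/wiki/Manatee|manatee||."
 sample-handlers)
```

The call returns a list of four parts: the img element with alt text "This", the plain string " is an image of a ", the a element linking to the Wikipedia page with the text "manatee", and the plain string ".". A translator can therefore reorder or reword the parts without touching the markup that the handlers generate.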



Regards,
Florian
Matt Wette
2017-12-10 15:22:55 UTC
Post by pelzflorian (Florian Pelz)
(define xerror-handler-struct
;; […]
First of all, FFI helper + Guile can't deal with this pattern: using varargs function
members in structs. This would require things like `va_arg' in libffi and Guile. I
have posted a request on the libffi dev site. Your example also brought up some gaps
in the ffi helper. I think I may have a workaround for you, though. Try adding code
like the following to your dot-ffi file. In function calls that want an error handler
specified, use std-po-error-handler.


(define-ffi-module (gettext-po)
  #:include '("gettext-po.h")
  #:library '("libgettextpo"))

(define-public std-po-error-handler
  (let* ((error
          (lambda (status errnum format)
            (simple-format #t "~A\n" (ffi:pointer->string format))))
         (error-p
          (ffi:procedure->pointer ffi:void error (list ffi:int ffi:int '*)))
         ;;
         (error_at_line
          (lambda (status errnum filename lineno format)
            (simple-format #t "~A\n" (ffi:pointer->string format))))
         (error_at_line-p
          (ffi:procedure->pointer ffi:void error_at_line
                                  (list ffi:int ffi:int '* ffi:int '*)))
         ;;
         (multiline_warning
          (lambda (prefix message)
            (simple-format #t "~A ~A\n"
                           (ffi:pointer->string prefix)
                           (ffi:pointer->string message))))
         (multiline_warning-p
          (ffi:procedure->pointer ffi:void multiline_warning (list '* '*)))
         ;;
         (multiline_error
          (lambda (prefix message)
            (simple-format #t "~A ~A\n" prefix message)))
         (multiline_error-p
          (ffi:procedure->pointer ffi:void multiline_error (list '* '*)))
         ;;
         (eh-struct (make-struct-po_error_handler)))

    (fh-object-set! eh-struct 'error error-p)
    (fh-object-set! eh-struct 'error_at_line error_at_line-p)
    (fh-object-set! eh-struct 'multiline_warning multiline_warning-p)
    (fh-object-set! eh-struct 'multiline_error multiline_error-p)
    ;;
    (make-po_error_handler_t
     (ffi:pointer-address
      ((fht-unwrap struct-po_error_handler*)
       (pointer-to eh-struct))))))
pelzflorian (Florian Pelz)
2017-12-10 19:21:43 UTC
Post by Matt Wette
Post by pelzflorian (Florian Pelz)
(define xerror-handler-struct
;; […]
First of all, FFI helper + Guile can't deal with this pattern: using varargs function
members in structs. This would require things like `va_arg' in libffi and Guile. I
have posted a request on the libffi dev site. Your example also brought up some gaps
in the ffi helper.
Thank you. I’m sorry to say that it did not work.

Actually it is not the “struct po_error_handler” but the
“struct po_xerror_handler” which I need. I believe the “struct
po_error_handler” is not used anymore in current Gettext but I am not
sure. varargs are not needed for “struct po_xerror_handler” (even
though support for them is desirable in general).

Hmm, I had earlier tried mostly the same as what you propose, but for
the xerror handler, and it did not work: fh-object-set! apparently did
not have any effect, i.e. a subsequent fh-object-ref returned 0 and on
error the callback handler function was called at address 0, causing a
SIGSEGV.

Either way, I tried your code for “struct po_error_handler” and put it
in my dot.ffi to see if it works.
Post by Matt Wette
I think I may have a workaround for you, though. Try to add code
like the following to your dot-ffi file. In functions calls that want a error handler
specified use std-po-error-handler.
(define-ffi-module (gettext-po)
#:include '("gettext-po.h")
#:library '("libgettextpo"))
(define-public std-po-error-handler
(let* ((error
(lambda (status errnum format)
(simple-format #t "~A\n" (ffi:pointer->string format))))
(error-p
(ffi:procedure->pointer ffi:void error (list ffi:int ffi:int '*)))
;;
(error_at_line
(lambda (status errnum filename lineno format)
(simple-format #t "~A\n" (ffi:pointer->string format))))
(error_at_line-p
(ffi:procedure->pointer ffi:void error_at_line
(list ffi:int ffi:int '* ffi:int '*)))
;;
(multiline_warning
(lambda (prefix message)
(simple-format #t "~A ~A\n"
(ffi:pointer->string prefix)
(ffi:pointer->string message))))
(multiline_warning-p
(ffi:procedure->pointer ffi:void multiline_warning (list '* '*)))
;;
(multiline_error
(lambda (prefix message)
(simple-format #t "~A ~A\n" prefix message)))
(multiline_error-p
(ffi:procedure->pointer ffi:void multiline_error (list '* '*)))
;;
(eh-struct (make-struct-po_error_handler)))
(fh-object-set! eh-struct 'error error-p)
I inserted an

(display (fh-object-ref eh-struct 'error)) (newline)

at this point in the dot.ffi file. Then when I ran

(use-modules (gettext-po))

from the REPL it printed 0, so presumably this does not work either.
It seems like the same issue.
Post by Matt Wette
(fh-object-set! eh-struct 'error_at_line error_at_line-p)
(fh-object-set! eh-struct 'multiline_warning multiline_warning-p)
(fh-object-set! eh-struct 'multiline_error multiline_error-p)
;;
(make-po_error_handler_t
(ffi:pointer-address
((fht-unwrap struct-po_error_handler*)
(pointer-to eh-struct))))))
By the way, what I forgot to mention is that I needed to replace

#include <stdlib.h>

in the gettext-po.h header file by

typedef long size_t;

otherwise “guild compile-ffi gettext-po.ffi” would fail with the error
message

ffi-help: WARNING: the FFI helper is experimental
(unknown):1: not found: "gnu/stubs-32.h"
compile-ffi: parse failed

So this change is needed in order to reproduce my issue.

Regards,
Florian
Matt Wette
2017-12-10 22:35:16 UTC
Post by pelzflorian (Florian Pelz)
Thank you. I’m sorry to say that it did not work.
Actually it is not the “struct po_error_handler” but the
“struct po_xerror_handler” which I need. I believe the “struct
po_error_handler” is not used anymore in current Gettext but I am not
sure. varargs are not needed for “struct po_xerror_handler” (even
though support for them is desirable in general).
Hmm I tried mostly the same as you propose before for the xerror
handler and it did not work: fh-object-set! apparently did not have
any effect, i.e. a subsequent fh-object-ref returned 0 and on error
the callback handler function was called at address 0, causing a
SIGSEGV.
Either way, I tried your code for “struct po_error_handler” and put it
in my dot.ffi to see if it works.
OK. I will look at this.
Post by pelzflorian (Florian Pelz)
By the way, what I forgot to mention is that I needed to replace
#include <stdlib.h>
in the gettext-po.h header file by
typedef long size_t;
otherwise “guild compile-ffi gettext-po.ffi” would fail with the error
message
ffi-help: WARNING: the FFI helper is experimental
(unknown):1: not found: "gnu/stubs-32.h"
compile-ffi: parse failed
The FH compiler executed gcc to find all the hidden include directories. If you don't have gcc
in your path (you didn't seem to have guile in your path) then it won't find those directories.
You can add a `-I path-to-gcc-inc-dirs' argument to the command line or make sure "gcc" is in your path.

Matt
pelzflorian (Florian Pelz)
2017-12-12 07:51:46 UTC
Post by Matt Wette
Post by pelzflorian (Florian Pelz)
By the way, what I forgot to mention is that I needed to replace
#include <stdlib.h>
in the gettext-po.h header file by
typedef long size_t;
otherwise “guild compile-ffi gettext-po.ffi” would fail with the error
message
ffi-help: WARNING: the FFI helper is experimental
(unknown):1: not found: "gnu/stubs-32.h"
compile-ffi: parse failed
The FH compiler executed gcc to find all the hidden include directories. If you don't have gcc
in your path (you didn't seem to have guile in your path) then it won't find those directories.
You can add a `-I path-to-gcc-inc-dirs' argument to the command line or make sure "gcc" is in your path.
Matt
Thank you for the tip with the -I option.

Hmm I’ve since destroyed the Parabola operating system and
switched to GuixSD. It still happens when I run

$ guild compile-ffi -I /gnu/store/0jjgg2bk6qmx87sdksm7bd2b3z10yd6j-gettext-0.19.8.1/include -I /gnu/store/n6nvxlk2j8ysffjh3jphn1k5silnakh6-glibc-2.25/include gettext-po.ffi
ffi-help: WARNING: the FFI helper is experimental
(unknown):1: not found: "gnu/stubs-32.h"
compile-ffi: parse failed

A gnu/stubs-64.h exists though.

Apparently I need to install the 32-bit version of glibc in
order for it to work. I tried

$ guix build -s i686-linux glibc
[...]
@ build-succeeded /gnu/store/g7fj77yfv1m4xilfqxvzggm5kd20i10z-glibc-2.25.drv -
/gnu/store/km57wad98gyghrbj8pwydcscsh9y4n4d-glibc-2.25-debug
/gnu/store/n0nvyn4lbcawfdbmd0blydrsp5wll75n-glibc-2.25

Now
$ guild compile-ffi -I /gnu/store/0jjgg2bk6qmx87sdksm7bd2b3z10yd6j-gettext-0.19.8.1/include -I /gnu/store/n0nvyn4lbcawfdbmd0blydrsp5wll75n-glibc-2.25/include gettext-po.ffi
ffi-help: WARNING: the FFI helper is experimental
wrote `gettext-po.scm'

worked with the original header file.

(

Removing /usr/bin/gcc leads to a
different error message:

$ guild compile-ffi gettext-po.scm
/bin/sh: gcc: command not found
[...]

)

Regards,
Florian
ng0
2017-12-12 08:03:02 UTC
Post by pelzflorian (Florian Pelz)
Post by Matt Wette
Post by pelzflorian (Florian Pelz)
By the way, what I forgot to mention is that I needed to replace
#include <stdlib.h>
in the gettext-po.h header file by
typedef long size_t;
otherwise “guild compile-ffi gettext-po.ffi” would fail with the error
message
ffi-help: WARNING: the FFI helper is experimental
(unknown):1: not found: "gnu/stubs-32.h"
compile-ffi: parse failed
The FH compiler executed gcc to find all the hidden include directories. If you don't have gcc
in your path (you didn't seem to have guile in your path) then it won't find those directories.
You can add a `-I path-to-gcc-inc-dirs' argument to the command line or make sure "gcc" is in your path.
Matt
Thank you for the tip with the -I option.
Hmm I’ve since destroyed the Parabola operating system and
switched to GuixSD. It still happens when I run
$ guild compile-ffi -I /gnu/store/0jjgg2bk6qmx87sdksm7bd2b3z10yd6j-gettext-0.19.8.1/include -I /gnu/store/n6nvxlk2j8ysffjh3jphn1k5silnakh6-glibc-2.25/include gettext-po.ffi
ffi-help: WARNING: the FFI helper is experimental
(unknown):1: not found: "gnu/stubs-32.h"
compile-ffi: parse failed
A gnu/stubs-64.h exists though.
Apparently I need to install the 32-bit version of glibc in
order for it to work. I tried
$ guix build -s i686-linux glibc
[...]
@ build-succeeded /gnu/store/g7fj77yfv1m4xilfqxvzggm5kd20i10z-glibc-2.25.drv -
/gnu/store/km57wad98gyghrbj8pwydcscsh9y4n4d-glibc-2.25-debug
/gnu/store/n0nvyn4lbcawfdbmd0blydrsp5wll75n-glibc-2.25
Now
$ guild compile-ffi -I /gnu/store/0jjgg2bk6qmx87sdksm7bd2b3z10yd6j-gettext-0.19.8.1/include -I /gnu/store/n0nvyn4lbcawfdbmd0blydrsp5wll75n-glibc-2.25/include gettext-po.ffi
ffi-help: WARNING: the FFI helper is experimental
wrote `gettext-po.scm'
worked with the original header file.
(
Removing /usr/bin/gcc leads to a different error message:
$ guild compile-ffi gettext-po.scm
/bin/sh: gcc: command not found
[...]
)
Solution: You don't install gcc on Guix. You install gcc-toolchain.
I suspect that you have 'gcc' 'glibc' etc installed via
guix package --install gcc glibc
in your profile. That's not how it works on Guix.
Post by pelzflorian (Florian Pelz)
Regards,
Florian
--
GnuPG: A88C8ADD129828D7EAC02E52E22F9BBFEE348588
GnuPG: https://c.n0.is/ng0_pubkeys/tree/keys
WWW: https://n0.is
pelzflorian (Florian Pelz)
2017-12-12 09:30:40 UTC
Post by ng0
Post by pelzflorian (Florian Pelz)
Apparently I need to install the 32-bit version of glibc in
order for it to work. I tried
$ guix build -s i686-linux glibc
[...]
@ build-succeeded /gnu/store/g7fj77yfv1m4xilfqxvzggm5kd20i10z-glibc-2.25.drv -
/gnu/store/km57wad98gyghrbj8pwydcscsh9y4n4d-glibc-2.25-debug
/gnu/store/n0nvyn4lbcawfdbmd0blydrsp5wll75n-glibc-2.25
Now
$ guild compile-ffi -I /gnu/store/0jjgg2bk6qmx87sdksm7bd2b3z10yd6j-gettext-0.19.8.1/include -I /gnu/store/n0nvyn4lbcawfdbmd0blydrsp5wll75n-glibc-2.25/include gettext-po.ffi
ffi-help: WARNING: the FFI helper is experimental
wrote `gettext-po.scm'
worked with the original header file.
Solution: You don't install gcc on Guix. You install gcc-toolchain.
I suspect that you have 'gcc' 'glibc' etc installed via
guix package --install gcc glibc
in your profile. That's not how it works on Guix.
Yes, I only had gcc in my environment (via
“guix environment --ad-hoc”). gcc-toolchain does not seem
to include the needed 32-bit gnu/stubs-32.h either when
built for x86_64. I also have to set the glibc or
gcc-toolchain include path from the store anyway otherwise
“guild compile-ffi” cannot find stdlib.h.

$ guix environment --ad-hoc gcc-toolchain
$ cd build/nyacc/src/nyacc/examples
$ source env.sh
$ cd ~/keep/projects/pelzfloriande-website
$ guild compile-ffi gettext-po.ffi
ffi-help: WARNING: the FFI helper is experimental
(unknown):1: not found: "gettext-po.h"
compile-ffi: parse failed

$ guild compile-ffi -I /gnu/store/0jjgg2bk6qmx87sdksm7bd2b3z10yd6j-gettext-0.19.8.1/include gettext-po.ffi
ffi-help: WARNING: the FFI helper is experimental
(unknown):1: not found: "stdlib.h"
compile-ffi: parse failed

$ guild compile-ffi -I /gnu/store/z1y36la9q1xkc5i5vcxqm7d995nrngmn-gcc-toolchain-7.2.0/include -I /gnu/store/0jjgg2bk6qmx87sdksm7bd2b3z10yd6j-gettext-0.19.8.1/include gettext-po.ffi
ffi-help: WARNING: the FFI helper is experimental
(unknown):1: not found: "gnu/stubs-32.h"
compile-ffi: parse failed

On Parabola I probably needed the lib32-glibc package.

Regards,
Florian
Matt Wette
2017-12-12 13:45:50 UTC
The FFI Helper uses `gcc --print-search-dirs' to locate gcc directories. It also adds /usr/include.
I don't understand why it is not finding them. It also uses `gcc -dM -E' to determine #defines.
Even with that, on my macOS system, I need to make fixes. Can you determine if some gcc
command, via `gcc --print-search-dirs' will find the correct includes? Maybe I should add a
`--with-gcc' command line argument.

Thanks,
Matt
pelzflorian (Florian Pelz)
2017-12-12 18:47:00 UTC
Post by Matt Wette
The FFI Helper uses `gcc --print-search-dirs' to locate gcc directories. It also adds /usr/include.
I don't understand why it is not finding them. It also uses `gcc -dM -E' to determine #defines.
Even with that, on my macOS system, I need to make fixes. Can you determine if some gcc
command, via `gcc --print-search-dirs' will find the correct includes? Maybe I should add a
`--with-gcc' command line argument.
Thanks,
Matt
There are no include directories found at all by
„gcc --print-search-dirs“ on my GuixSD or Arch installation.

$ gcc --print-search-dirs
install: /gnu/store/h3z6nshhdlc8zgh4mi13x1br03xipi9r-gcc-7.2.0-lib/lib/gcc/x86_64-unknown-linux-gnu/7.2.0/
programs: =/gnu/store/bhv43hzkfwcrvm2grq9fiw5bh1h5j3vc-gcc-7.2.0/libexec/gcc/x86_64-unknown-linux-gnu/7.2.0/:/gnu/store/bhv43hzkfwcrvm2grq9fiw5bh1h5j3vc-gcc-7.2.0/libexec/gcc/x86_64-unknown-linux-gnu/7.2.0/:/gnu/store/bhv43hzkfwcrvm2grq9fiw5bh1h5j3vc-gcc-7.2.0/libexec/gcc/x86_64-unknown-linux-gnu/:/gnu/store/h3z6nshhdlc8zgh4mi13x1br03xipi9r-gcc-7.2.0-lib/lib/gcc/x86_64-unknown-linux-gnu/7.2.0/:/gnu/store/h3z6nshhdlc8zgh4mi13x1br03xipi9r-gcc-7.2.0-lib/lib/gcc/x86_64-unknown-linux-gnu/:/gnu/store/h3z6nshhdlc8zgh4mi13x1br03xipi9r-gcc-7.2.0-lib/lib/gcc/x86_64-unknown-linux-gnu/7.2.0/../../../../../../../x86_64-unknown-linux-gnu/bin/x86_64-unknown-linux-gnu/7.2.0/:/gnu/store/h3z6nshhdlc8zgh4mi13x1br03xipi9r-gcc-7.2.0-lib/lib/gcc/x86_64-unknown-linux-gnu/7.2.0/../../../../../../../x86_64-unknown-linux-gnu/bin/
libraries: =/gnu/store/zrmhjw6kha4ghra2dkr06kldarxybgkw-profile/lib/x86_64-unknown-linux-gnu/7.2.0/:/gnu/store/zrmhjw6kha4ghra2dkr06kldarxybgkw-profile/lib/:/gnu/store/h3z6nshhdlc8zgh4mi13x1br03xipi9r-gcc-7.2.0-lib/lib/gcc/x86_64-unknown-linux-gnu/7.2.0/:/gnu/store/h3z6nshhdlc8zgh4mi13x1br03xipi9r-gcc-7.2.0-lib/lib/gcc/x86_64-unknown-linux-gnu/7.2.0/../../../../../../../x86_64-unknown-linux-gnu/lib/x86_64-unknown-linux-gnu/7.2.0/:/gnu/store/h3z6nshhdlc8zgh4mi13x1br03xipi9r-gcc-7.2.0-lib/lib/gcc/x86_64-unknown-linux-gnu/7.2.0/../../../../../../../x86_64-unknown-linux-gnu/lib/:/gnu/store/h3z6nshhdlc8zgh4mi13x1br03xipi9r-gcc-7.2.0-lib/lib/gcc/x86_64-unknown-linux-gnu/7.2.0/../../../x86_64-unknown-linux-gnu/7.2.0/:/gnu/store/h3z6nshhdlc8zgh4mi13x1br03xipi9r-gcc-7.2.0-lib/lib/gcc/x86_64-unknown-linux-gnu/7.2.0/../../../:/gnu/store/3h31zsqxjjg52da5gp3qmhkh4x8klhah-glibc-2.25/libx86_64-unknown-linux-gnu/7.2.0/:/gnu/store/3h31zsqxjjg52da5gp3qmhkh4x8klhah-glibc-2.25/lib
$ gcc --print-search-dirs | grep include
$

It looks similar but less verbose on Arch. On Parabola / Arch Linux,
ffi-helper looked for the 32-bit header, which was not installed, on
a 64-bit system.

According to https://gcc.gnu.org/ml/gcc-help/2007-09/msg00205.html,
which I found with a very quick Web search, one can check the output
from „cpp -Wp,-v“.

$ cpp -Wp,-v
ignoring nonexistent directory "/no-gcc-local-prefix/include"
ignoring nonexistent directory "/gnu/store/h3z6nshhdlc8zgh4mi13x1br03xipi9r-gcc-7.2.0-lib/lib/gcc/x86_64-unknown-linux-gnu/7.2.0/../../../../../../../x86_64-unknown-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
/gnu/store/zrmhjw6kha4ghra2dkr06kldarxybgkw-profile/include
/gnu/store/h3z6nshhdlc8zgh4mi13x1br03xipi9r-gcc-7.2.0-lib/lib/gcc/x86_64-unknown-linux-gnu/7.2.0/include
/gnu/store/h3z6nshhdlc8zgh4mi13x1br03xipi9r-gcc-7.2.0-lib/lib/gcc/x86_64-unknown-linux-gnu/7.2.0/include-fixed
/gnu/store/3h31zsqxjjg52da5gp3qmhkh4x8klhah-glibc-2.25/include
End of search list.

This looks right. However this may be fragile. Passing these
directories with the -I option, together with the path to Gettext’s
include directory, allows me to compile-ffi the gettext-po.ffi file.
However, when <stdlib.h> is included, ffi-helper still looks for
gnu/stubs-32.h, which is not installed, on a 64-bit system with a
64-bit glibc and 64-bit gcc.
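
If the output of „cpp -Wp,-v“ were used to collect include directories, the relevant lines could be picked out roughly as follows. This is only a sketch: the procedure name include-dirs-from-cpp-output is made up for illustration, it is not what ffi-helper does, and it assumes the two marker lines appear exactly as in the output above.

```scheme
(use-modules (srfi srfi-1))   ; find-tail, take-while

;; Hypothetical helper: LINES is the output of `cpp -Wp,-v' as a list
;; of strings.  Return the directories listed between the
;; "#include <...> search starts here:" line and "End of search list.".
(define (include-dirs-from-cpp-output lines)
  (let ((tail (find-tail
               (lambda (line)
                 (string=? line "#include <...> search starts here:"))
               lines)))
    (if tail
        (map string-trim-both
             (take-while
              (lambda (line)
                (not (string=? line "End of search list.")))
              (cdr tail)))
        '())))

;; Shortened sample input taken from the output above.
(include-dirs-from-cpp-output
 '("#include \"...\" search starts here:"
   "#include <...> search starts here:"
   " /gnu/store/3h31zsqxjjg52da5gp3qmhkh4x8klhah-glibc-2.25/include"
   "End of search list."))
```

Capturing the real output (cpp prints this list on standard error) is left out here; each returned directory would then be passed with its own -I option.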

Regards,
Florian
Matt Wette
2017-12-10 23:00:15 UTC
Post by Matt Wette
(fh-object-set! eh-struct 'error error-p)
I was able to duplicate getting 0. The problem was the argument `error-p'.
(The bytestructures i/f seems to be silent here about the incorrect argument.)

Please try instead the following:

(fh-object-set! eh-struct 'error (ffi:pointer-address error-p))
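
The difference between the two arguments can be seen with Guile's (system foreign) module alone. In this sketch the callback is a made-up placeholder rather than a real libgettextpo handler; it only shows that a pointer object and its address are different kinds of Scheme values.

```scheme
(use-modules (system foreign))

;; Placeholder callback taking one int and returning nothing;
;; a real handler would have the signature from gettext-po.h.
(define error-p
  (procedure->pointer void
                      (lambda (status) #t)
                      (list int)))

;; error-p is a foreign pointer object ...
(pointer? error-p)

;; ... whereas pointer-address yields the address as an exact integer,
;; which is what the corrected call above stores in the struct field.
(define error-address (pointer-address error-p))
(integer? error-address)
```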

Matt
pelzflorian (Florian Pelz)
2017-12-12 08:17:48 UTC
Post by Matt Wette
Post by Matt Wette
(fh-object-set! eh-struct 'error error-p)
I was able to duplicate getting 0. The problem was the argument `error-p'.
(The bytestructures i/f seems to be silent here about the incorrect argument.)
(fh-object-set! eh-struct 'error (ffi:pointer-address error-p))
Matt
It works! Thank you. :)

Regards,
Florian
Thompson, David
2017-12-14 13:23:50 UTC
Hey everyone!
Hi!
Post by pelzflorian (Florian Pelz)
I want to ask for your thoughts on my new solution since translations
are probably important to many Haunt users. In particular, I believe
there was some discussion on Website translation on the Guile or Guix
lists as well.
I’d love those sites to support translation! If we could integrate your
solution as a Haunt extension or something, that’d be great.
I agree! Thanks to everyone that is working on this. I would love to
include translation support in Haunt but I'm completely clueless about
the right way to do things, so I'd be happy to receive some patches
that implement it nicely. :)

Thanks again!

- Dave
pelzflorian (Florian Pelz)
2017-12-15 11:39:22 UTC
Post by Thompson, David
Hey everyone!
Hi!
Post by pelzflorian (Florian Pelz)
I want to ask for your thoughts on my new solution since translations
are probably important to many Haunt users. In particular, I believe
there was some discussion on Website translation on the Guile or Guix
lists as well.
I’d love those sites to support translation! If we could integrate your
solution as a Haunt extension or something, that’d be great.
I agree! Thanks to everyone that is working on this. I would love to
include translation support in Haunt but I'm completely clueless about
the right way to do things, so I'd be happy to receive some patches
that implement it nicely. :)
Thanks again!
- Dave
I would be really glad to see this upstream. I will take a look at
how to integrate ffi-helper in the build system and then send patches
with my approach, unless you dislike the syntax.

Currently current-lingua must always be passed to every procedure,
because that is where my gettext syntax (_ "Hello") reads the locale
from. Passing it explicitly to each procedure may be the right
approach, but something like the way G-expressions wrap the store may
be better; I have not really looked at G-expressions yet, let alone
fully understood them. We can discuss the syntax later though when I
have working patches.

Regards,
Florian
Christopher Lemmer Webber
2017-12-15 03:48:40 UTC
Post by pelzflorian (Florian Pelz)
I want to ask for your thoughts on my new solution since translations
are probably important to many Haunt users. In particular, I believe
there was some discussion on Website translation on the Guile or Guix
lists as well.
And not just important to Haunt... many sxml using projects!

At one point Mark Weaver and I were talking about something like a
special quasiquote that could be used for translations with gettext,
since many translations are very tricky in situations like:

`(p "Download from "
    (a (@ (href ,link))
       "this webpage") ".")

The naive approach would be to have different translations for
"Download from " and "this webpage" and (egads) "."

However this is called "lego translations" and unfortunately is nowhere
near as elegant as lego... different languages have different
word order so there is no safe way to "break apart" a translation like
this into bits.

The right way to do it using a special gettext quasiquote would probably
be like:

#_(p "Download from "
     (a (@ (href #_,link))
        "this webpage") ".")

or some such thing, which could then produce a string for translation
that looks like:

"(p \"Download from \"
(a (@ (href #_,$1))
\"this webpage\") \".\")"

or some such thing.

What do you think about this direction? It would require some new
tooling, but I think it's the best way forward probably?
pelzflorian (Florian Pelz)
2017-12-15 08:34:47 UTC
Post by Christopher Lemmer Webber
At one point Mark Weaver and I were talking about something like a
special quasiquote that could be used for translations with gettext,
since many translations are very tricky in situations like:
`(p "Download from "
(a (@ (href ,link))
"this webpage") ".")
The naive approach would be to have different translations for
"Download from " and "this webpage" and (egads) "."
However this is called "lego translations" and unfortunately is nowhere
near as elegant as lego... different languages have different
word order so there is no safe way to "break apart" a translation like
this into bits.
The right way to do it using a special gettext quasiquote would probably
be like:
#_(p "Download from "
(a (@ (href #_,link))
"this webpage") ".")
or some such thing, which could then produce a string for translation
that looks like:
"(p \"Download from \"
(a (@ (href #_,$1))
\"this webpage\") \".\")"
or some such thing.
What do you think about this direction? It would require some new
tooling, but I think it's the best way forward probably?
I currently use

(div (@ (id "content"))
     ,body
     (footer
      (div (@ (id "contact"))
           (h1
            (@ (class "contact-heading"))
            ,(_ "Contact me:"))
           (div
            ,(_ "Mail:")
            " "
            (a (@ (href "mailto:***@pelzflorian.de"))
               "***@pelzflorian.de"))
           (div
            "XMPP:"
            " "
            (a
             (@
              (href
               "xmpp:***@chat.pelzflorian.de?message"))
             "***@chat.pelzflorian.de"))
           (div
            ,@(__ "GnuPG key: ||gnupglink_|| (valid until \
||gnupgexp_||)"
                  `(("gnupglink_" .
                     ,(a-href
                       "/files/key.asc"
                       "0x4947055B"))
                    ("gnupgexp_" .
                     "01/27/2019")))))
      (div (@ (id "source-code-link"))
           ,@(__ "Find the source code for this website \
||link_|here||."
                 `(("link_" .
                    ,(lambda (text)
                       `(a (@ (href ,(build-url
                                      "git"
                                      "pelzfloriande-website")))
                           ,text))))))
      (div (@ (id "powered-by"))
           ,@(__ "Powered by \
||link_|https://www.gnu.org/software/guile/|GNU Guile|| and \
||link_|https://haunt.dthompson.us/|Haunt||."
                 `(("link_" . ,a-href))))))



This works with existing Gettext. Special syntax would perhaps be
easier to write, but I don’t know what kind of Gettext string it
should produce. "Find the source code for this website \
||link_|here||." uses the symbols (well, strings) like "link_"
specified in the code.

(Note that I want a clear separation between Gettext and Scheme code.
This is not always desirable, but often is.)
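
To make the marker format concrete, here is a minimal sketch of how a string such as "Find the source code for this website ||link_|here||." could be expanded into SHTML pieces. It is not the actual implementation of __ from the website's source: the procedures split-on and expand-markers are hypothetical, no Gettext lookup is done, and the sketch assumes that markers are delimited by || with fields separated by a single |.

```scheme
;; Split STR at every occurrence of the substring SEP.
(define (split-on str sep)
  (let loop ((start 0) (acc '()))
    (let ((idx (string-contains str sep start)))
      (if idx
          (loop (+ idx (string-length sep))
                (cons (substring str start idx) acc))
          (reverse (cons (substring str start) acc))))))

;; Expand the ||key|arg...|| markers in STR using TABLE, an association
;; list from marker names to either a value or a procedure.  A procedure
;; is applied to the remaining fields of the marker; any other value is
;; inserted as is.  Returns a list of strings and SHTML fragments.
(define (expand-markers str table)
  (let loop ((pieces (split-on str "||")) (marker? #f) (acc '()))
    (if (null? pieces)
        (reverse acc)
        (let* ((piece (car pieces))
               (expanded
                (if marker?
                    (let* ((fields (string-split piece #\|))
                           (value (assoc-ref table (car fields))))
                      (cond ((procedure? value) (apply value (cdr fields)))
                            (value value)
                            (else piece))) ; unknown marker: keep its text
                    piece)))
          ;; Pieces alternate between plain text and marker contents.
          (loop (cdr pieces) (not marker?) (cons expanded acc))))))

(expand-markers
 "Find the source code for this website ||link_|here||."
 `(("link_" . ,(lambda (text) `(a (@ (href "/git/")) ,text)))))
;; => ("Find the source code for this website "
;;     (a (@ (href "/git/")) "here")
;;     ".")
```

Because the result is a list, it is spliced into the surrounding SHTML with ,@ as in the code above. In a real version the string would first be looked up in the translation catalog, so that translators can move the markers around freely.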

Regards,
Florian
ng0
2017-12-15 12:06:28 UTC
Post by Christopher Lemmer Webber
Post by pelzflorian (Florian Pelz)
I want to ask for your thoughts on my new solution since translations
are probably important to many Haunt users. In particular, I believe
there was some discussion on Website translation on the Guile or Guix
lists as well.
And not just important to Haunt... many sxml using projects!
What's the difference (except for the visible syntax/writing) between
SXML and XML? When I've looked into translations not just prep came to
my mind but XML has native translation support - just no one wants to
work with plain XML when it can be avoided ;)

Is there a chance that we can make use of this native translation ability
while staying user friendly?

It's been really long since I've looked into this, so excuse the gap of knowledge
and fuzzy "what if..."s.
Post by Christopher Lemmer Webber
At one point Mark Weaver and I were talking about something like a
special quasiquote that could be used for translations with gettext,
since many translations are very tricky in situations like:
`(p "Download from "
(a (@ (href ,link))
"this webpage") ".")
The naive approach would be to have different translations for
"Download from " and "this webpage" and (egads) "."
However this is called "lego translations" and unfortunately is nowhere
near as elegant as lego... different languages have different
word order so there is no safe way to "break apart" a translation like
this into bits.
The right way to do it using a special gettext quasiquote would probably
be like:
#_(p "Download from "
(a (@ (href #_,link))
"this webpage") ".")
or some such thing, which could then produce a string for translation
that looks like:
"(p \"Download from \"
(a (@ (href #_,$1))
\"this webpage\") \".\")"
or some such thing.
What do you think about this direction? It would require some new
tooling, but I think it's the best way forward probably?
--
GnuPG: A88C8ADD129828D7EAC02E52E22F9BBFEE348588
GnuPG: https://c.n0.is/ng0_pubkeys/tree/keys
WWW: https://n0.is
pelzflorian (Florian Pelz)
2017-12-15 14:25:01 UTC
Post by ng0
Post by Christopher Lemmer Webber
Post by pelzflorian (Florian Pelz)
I want to ask for your thoughts on my new solution since translations
are probably important to many Haunt users. In particular, I believe
there was some discussion on Website translation on the Guile or Guix
lists as well.
And not just important to Haunt... many sxml using projects!
What's the difference (except for the visible syntax/writing) between
SXML and XML? When I've looked into translations not just prep came to
my mind but XML has native translation support - just no one wants to
work with plain XML when it can be avoided ;)
Is there a chance that we can make use of this native translation ability
while staying user friendly?
It's been really long since I've looked into this, so excuse the gap of knowledge
and fuzzy "what if..."s.
I found only this [1] for XML native translation support.

IMHO it should be up to the SHTML author to specify the lang attribute
etc. But the lang attribute and perhaps text direction should
be used when writing example code or default styles.

About the marking of translations: I prefer explicitly marking the
strings that should be translated instead of including all of them in
the Gettext pot file. Explicit markings are flexible which I believe
is good when interleaving SHTML code with Scheme procedures.

The w3 document [1] from above recommends extracting every element
content that is not marked as its:translate="no" by default. It is
however difficult to automatically decide which parts of a Scheme file
are (S)HTML and which parts are not, so I do not like this approach.
Also often an element’s attributes need translation and sometimes the
text content does not.

In GTK+ UI files, which are XML, each tag can be marked with a
translatable="yes" attribute. Gettext then extracts each marked
string into the pot file. (Although the Glade UI designer
automatically adds translatable=yes to all elements with text. As
above, this would make autodetection difficult when other Scheme code
is part of the file.)

I also prefer the much shorter _ syntax that is common for Gettext
rather than a translatable=yes attribute because SHTML is usually
written manually.

tl;dr I would rather not use this “implicit” marking in Haunt.

Regards,
Florian

[1] https://www.w3.org/TR/xml-i18n-bp/#AuthoringTime
Ricardo Wurmus
2017-12-16 09:54:15 UTC
Post by ng0
Post by Christopher Lemmer Webber
Post by pelzflorian (Florian Pelz)
I want to ask for your thoughts on my new solution since translations
are probably important to many Haunt users. In particular, I believe
there was some discussion on Website translation on the Guile or Guix
lists as well.
And not just important to Haunt... many sxml using projects!
What's the difference (except for the visible syntax/writing) between
SXML and XML? When I've looked into translations not just prep came to
my mind but XML has native translation support - just no one wants to
work with plain XML when it can be avoided ;)
I would not call this “native translation support”. There is a
specification for tags and XPath rules that extract text from XML
documents.

Here’s an example relating to text segmentation:

https://www.w3.org/TR/xml-i18n-bp/#DevSeg

This doesn’t help us much, though. It is an example of a rule set to
extract strings in a way that is sensible for a translator.

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net
pelzflorian (Florian Pelz)
2017-12-16 12:37:36 UTC
Personally, I still prefer manually marking what goes into the
Gettext pot file, and I believe my idea for its syntax is
suitable.

For example, this is a multi-part internationalized paragraph from my
website:

(p ,@(__ "“Don’t Hang” by default uses the words from \
the ||ic_|/usr/share/dict|| directory, but it can deal with any list of \
expressions in a text file with one expression per line. ||samplelink_|Here|| \
is an example word list file compiled with words from \
||wiktionarylink_|Wiktionary’s list of 1000 basic English words|| which you \
can use if you want simpler words. This sample word list is available under \
the terms of ||ccbysalink_|the CC-BY-SA 3.0 Unported license||, because \
Wiktionary uses this license and the words are taken from there."
         `(("ic_" .
            ,(lambda (text)
               `(span (@ (class "inline-code"))
                      ,text)))
           ("samplelink_" .
            ,(lambda (text)
               (a-href
                "sample-word-lists/english-words.txt"
                text)))
           ("wiktionarylink_" .
            ,(lambda (text)
               (a-href
                "https://en.wiktionary.org/wiki/Appendix:1000_b\
asic_English_words"
                text)))
           ("ccbysalink_" .
            ,(lambda (text)
               (a-href
                "sample-word-lists/CCBYSA-3.0-UNPORTED.txt"
                text))))))

But since there seems to be a demand for automatic extraction without
marking each piece of translatable text, maybe Haunt should offer that
as well. This would involve deciding which parts of a Scheme file are
SHTML code and which parts are just Scheme. Maybe that can mostly be
deduced from context but some false positives may need to be dealt
with. Also in SHTML code there is the issue of whether e.g. URLs
should go into the pot file or not.

I will take a look at Haunt and upstreaming next week. Currently my
main concern is build system integration. I want to just run „ninja
mywebsite-pot“ to build the pot file. But GNU Autotools are
complicated, Meson/Ninja depend on Python, and Haunt should probably
integrate with each of them eventually.


Regards,
Florian
pelzflorian (Florian Pelz)
2017-12-15 14:01:16 UTC
Hi!
Post by pelzflorian (Florian Pelz)
I want to ask for your thoughts on my new solution since translations
are probably important to many Haunt users. In particular, I believe
there was some discussion on Website translation on the Guile or Guix
lists as well.
I’d love those sites to support translation! If we could integrate your
solution as a Haunt extension or something, that’d be great.
Speaking of Gettext, it might be interesting to see whether there ideas
https://www.gnu.org/software/gnun/
Hmm I don’t think Haunt should replace too many functions of the build
system. (The current “haunt build” and “haunt serve” are few and
helpful when not using a bigger build system, even though maybe they
should go away in the long run in favor of build system integration.)
As I understand, GNUnited Nations mostly means integration into the
GNU Autotools build system or Makefiles. Integration is desirable
IMHO but I believe not everyone wants the same workflow as GNUnited
Nations, so Make targets like theirs probably should not be part of
Haunt.

Regards,
Florian
pelzflorian (Florian Pelz)
2017-12-16 19:30:41 UTC
I'm very interested in this subject because I help with Guile and Guix
websites, and I usually work with multilingual websites. I have no idea of
what would be the right way to do i18n of websites written in Scheme,
though. So I will just join this conversation as a potential user of your
solutions :)
:)
Post by pelzflorian (Florian Pelz)
I did not want to use the ordinary gettext functions in order to not
call setlocale very often to switch languages. It seems the Gettext
system is not designed for rapidly changing locales, but maybe I am
wrong about this and very many setlocale calls would not be that bad.
For what it's worth, I use ordinary gettext and `setlocale` in my website,
which is not Haunt-based, but it is Guile Scheme and statically generated
too. So far, it works ok.
Performance is what motivated me to avoid repeated setlocale calls.
I have now measured the impact of my approach, and for my website
repeatedly calling setlocale and gettext is actually slightly faster
than transforming a po file into an association list and
assoc-ref’ing the list. Transforming the po file only becomes faster
when the same msgid is used very many times.

So it is probably best *not* to add ffi-helper to Haunt after all and
just use Gettext because while repeated setlocale is bad in theory, it
is faster in practice for normal websites and it does not really
matter much anyway. Then again, for long running applications, not
using setlocale is better.


If you want detailed timings, read on, otherwise feel free to skip to
the end of this e-mail.

For my German and English website with the code at
https://pelzflorian.de/git/pelzfloriande-website/ this is the result
of timing my current approach, which avoids repeated setlocale and
standard gettext calls but instead uses libgettextpo to create an
association list of msgids and msgstrs from the respective po files
(i.e. not from compiled mo files).

I put

#!/bin/sh
GUILE_LOAD_PATH=$GUILE_LOAD_PATH:$HOME/keep/projects/pelzfloriande-website:$HOME/build/nyacc/src/nyacc/examples LD_LIBRARY_PATH=/gnu/store/0jjgg2bk6qmx87sdksm7bd2b3z10yd6j-gettext-0.19.8.1/lib haunt build

inside a file called launch.sh. I then ran

$ time ./launch.sh
[
]
./launch.sh 2.43s user 0.33s system 83% cpu 3.317 total
./launch.sh 2.47s user 0.33s system 103% cpu 2.703 total
./launch.sh 2.43s user 0.36s system 103% cpu 2.700 total
./launch.sh 2.56s user 0.33s system 103% cpu 2.783 total

When instead not loading gettext-po, ffi-help-rt and Guile’s system
foreign modules, but running msgfmt to transform the po files to mo
files, moving them to ./de/LC_MESSAGES/pelzfloriande.mo and just using
standard Gettext and setlocale with

(bindtextdomain "pelzfloriande" "/home/florian/keep/projects/pelzfloriande-website")
(bind-textdomain-codeset "pelzfloriande" "UTF-8")
(textdomain "pelzfloriande")

(define (locale-for-lingua lingua)
  (assoc-ref
   '(("de" . "de_DE.UTF-8")
     ("en" . "en_US.UTF-8"))
   lingua))

(define (translated-msg msgid lingua)
  (begin
    (setlocale LC_ALL (locale-for-lingua lingua))
    (gettext msgid)))


I got the following measurements and verified that the translation is
still working:

$ time ./launch.sh
building pages in 'site'...
copying asset 'css/common.css' → '/css/common.css'
[
]
./launch.sh 2.01s user 0.29s system 102% cpu 2.241 total

For multiple runs:
./launch.sh 2.01s user 0.29s system 102% cpu 2.241 total
./launch.sh 2.06s user 0.31s system 102% cpu 2.302 total
./launch.sh 2.15s user 0.33s system 104% cpu 2.387 total
./launch.sh 1.99s user 0.32s system 102% cpu 2.246 total



When using setlocale but only when the lingua has changed from the
last call to _:

(define old-lingua "")

(define (translated-msg msgid lingua)
  (begin
    (if (not (equal? old-lingua lingua))
        (begin
          (setlocale LC_ALL (locale-for-lingua lingua))
          (set! old-lingua lingua)))
    (gettext msgid)))

./launch.sh 2.10s user 0.32s system 103% cpu 2.332 total
./launch.sh 2.03s user 0.31s system 102% cpu 2.283 total
./launch.sh 2.11s user 0.36s system 102% cpu 2.408 total
./launch.sh 2.05s user 0.30s system 102% cpu 2.296 total



When adding the following in a div:

,@(let loop ((i 0))
    (if (< i 10000)
        (cons
         (_ "Home page")
         (loop (1+ i)))
        '()))

and verifying it is correctly translated,
this is the result for my implementation:

./launch.sh 4.48s user 0.33s system 89% cpu 5.356 total
./launch.sh 4.52s user 0.36s system 102% cpu 4.737 total
./launch.sh 4.44s user 0.38s system 104% cpu 4.619 total
./launch.sh 4.49s user 0.40s system 103% cpu 4.735 total

With a setlocale call for each _:

./launch.sh 4.46s user 0.36s system 101% cpu 4.736 total
./launch.sh 4.64s user 0.39s system 103% cpu 4.875 total
./launch.sh 4.65s user 0.33s system 103% cpu 4.838 total
./launch.sh 4.66s user 0.33s system 103% cpu 4.833 total

This is the result for a cached setlocale call:

./launch.sh 4.39s user 0.37s system 102% cpu 4.624 total
./launch.sh 4.17s user 0.32s system 88% cpu 5.086 total
./launch.sh 4.09s user 0.32s system 102% cpu 4.276 total
./launch.sh 4.16s user 0.35s system 103% cpu 4.345 total




When adding the following in the div instead

,@(let loop ((i 0))
    (if (< i 10000)
        (cons
         (let ((current-lingua "de"))
           (_ "Home page"))
         (cons
          (let ((current-lingua "en"))
            (_ "Home page"))
          (loop (1+ i))))
        '()))

this is my current implementation

./launch.sh 6.36s user 0.36s system 99% cpu 6.733 total
./launch.sh 6.34s user 0.34s system 103% cpu 6.470 total
./launch.sh 6.00s user 0.39s system 103% cpu 6.195 total

this is without caching setlocale

./launch.sh 8.74s user 0.38s system 101% cpu 8.986 total
./launch.sh 8.70s user 0.36s system 102% cpu 8.872 total
./launch.sh 8.86s user 0.40s system 99% cpu 9.300 total

this is with caching setlocale

./launch.sh 8.95s user 0.37s system 93% cpu 9.979 total
./launch.sh 8.60s user 0.39s system 103% cpu 8.712 total
./launch.sh 8.81s user 0.34s system 95% cpu 9.581 total


In this contrived example, my implementation is faster. Note that my
implementation may or may not be slower when not using the same
translation very often but instead using a longer PO file.
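
One reason a longer PO file could hurt the association-list approach is that assoc-ref walks the list entry by entry. A possible mitigation, shown here purely as a sketch, is to copy the catalog into a hash table once and look msgids up there. The catalog contents below are invented sample data, the names alist->catalog-table and lookup-msg are hypothetical, and whether this is measurably faster for a real site has not been tested.

```scheme
;; Invented sample catalog in the shape described above: an
;; association list mapping msgid strings to msgstr strings.
(define catalog-alist
  '(("Home page" . "Startseite")
    ("Contact me:" . "Kontakt:")))

;; Copy the association list into a hash table keyed by msgid.
(define (alist->catalog-table alist)
  (let ((table (make-hash-table)))
    (for-each (lambda (entry)
                (hash-set! table (car entry) (cdr entry)))
              alist)
    table))

(define catalog-table (alist->catalog-table catalog-alist))

;; Look up MSGID; fall back to the msgid itself when there is no
;; translation, as gettext does.
(define (lookup-msg msgid)
  (hash-ref catalog-table msgid msgid))

(lookup-msg "Home page")     ; => "Startseite"
(lookup-msg "Not in catalog") ; => "Not in catalog"
```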
For internationalization, I know the convention is to use _, but I don't
like that, so I use the alias l10n instead.
We should definitely let the user define the syntax like in the Guile
manual. If you want l10n, then use l10n, which is less confusing when
using _ for pattern matching. But I will stick to _ for my website.
For internationalizing complex blocks that should not be translated in
`(p "Hi! I play "
" in "
`(p
"Hi! I play ~SPORT~ in ~PLACE~."
"futsal"
"Tokyo"
This interleaving is like a format string and is common in
applications, but it separates the value of ~SPORT~ from the context
in which it should be translated. I prefer my approach with
multi-part translations with

,@(__ "This is a ||em_|multi-part translation||."
      `(("em_" .
         ,(lambda (text)
            `(em ,text)))))
Currently, I use xgettext manually and Poedit for working with translation
catalogs, but I'd like to manage translations in the future like this
# Create new translation catalogs for Finnish and Japanese.
$ site catalog-new fi ja
# Update translation catalogs with new translation strings.
$ site catalog-update
# Compile translation catalogs (generate .mo files)
$ site catalog-compile
Yes. This is a good user interface. Maybe this should be part of the
haunt command and not require a build system after all.

To be fully localized, I also have to pass IETF Language Tags around in the
website code, so that I get the right content when rendering the templates
in a given language.
My 2¢
Yes, me too. I wonder if this should be wrapped into custom syntax
maybe like the Guix store in G-expressions, but I’m not sure.

Regards,
Florian
