Discussion:
Introducing GNUPaste (and guile-wiredtiger future)
Amirouche Boubekki
2017-12-17 14:39:52 UTC
Héllo,
Hello!
I am excited to share GNUPaste! This is a really simple web app
similar to paste.lisp.org built with Guile. I have a linode running it
from git on GuixSD.
https://paste.freshbakedyams.com (Please use it!)
Source: https://github.com/kristoferbuffington/gnupaste.git
Currently the frontend uses twitter bootstrap + jquery and highlightjs
from a CDN. It really doesn't need all that boilerplate. It will
definitely change in the future. GNUPaste depends on guile-wiredtiger
and guile-fibers to compile.
Thanks for considering guile-wiredtiger.

Basically, guile-wiredtiger is not yet compatible with guile-fibers
in the general case, because fibers spawns several threads
and runs several fibers in each thread (and I think that fibers
can be stolen by other threads, but I am not sure).

The way wiredtiger works is that there is one
*connection* (called *environment* in guile-wiredtiger)
per database and one *session* per "thread" of execution
(called *context* in guile-wiredtiger).

I changed the naming because they are different from the
original things. Both environment and context are backed
by fluids.

In a pre-fork thread model, one must use with-context [1]
after the fork to populate the current fluid with a specific
context.

[1]
https://framagit.org/a-guile-mind/guile-wiredtiger/blob/master/wiredtiger/extra.scm#L328

The thing is that when using fibers, the thread of execution
is a fiber, so the wiredtiger extra abstraction called context
fails: a context/session must not be shared between different
fibers, even if they are executed in the same thread.
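
To make the failure mode concrete, here is a small analogy in Python
rather than Guile. It assumes a per-thread slot (threading.local)
standing in for the fluid, and a hypothetical Session class standing
in for a wiredtiger session; it says nothing about how Guile fluids
actually behave under fibers. Two cooperative tasks on the same
thread each install "their" session, yield once, and then read the
slot back:

```python
import asyncio
import threading

# Per-thread slot, standing in for a fluid that holds the current context.
_local = threading.local()


class Session:
    """Hypothetical stand-in for a database session; not a real binding."""

    def __init__(self, owner):
        self.owner = owner


async def handler(name, seen):
    # Each task installs its own session in the per-thread slot...
    _local.session = Session(name)
    # ...then yields, which lets the other task run on the same thread.
    await asyncio.sleep(0)
    # On resume, the slot holds whatever the last task stored there.
    seen[name] = _local.session.owner


def run_demo():
    seen = {}

    async def main():
        await asyncio.gather(handler("a", seen), handler("b", seen))

    asyncio.run(main())
    return seen


print(run_demo())  # {'a': 'b', 'b': 'b'}
```

Task "a" ends up using the session that task "b" installed, which is
the kind of sharing a session-per-thread design is meant to rule out.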

In other words, the guile-wiredtiger extra abstractions context
and environment are handy in a single-threaded program because
they avoid passing the context around. They are also handy in
simple multithreaded settings where you don't need to pass the
environment around (to create a new session per thread).
But they fail in the advanced use case of guile-fibers.

This won't trigger a crash under low traffic, but will fail
under load and with advanced use of guile-wiredtiger, like
multi-statement transactions. This can be mitigated by turning off
preemption in guile-fibers, but again it's not a perfect solution.

I failed to create an API that makes simple things simple and
complex things possible. There might be an escape if fibers
implemented fluids for fibers, something like PEP 550 [2].
But anyway, it won't be perfect, so I will rework the current
databases in guile-wiredtiger to completely avoid the use of
fluids and instead pass around the database connection and session.

[2] https://www.python.org/dev/peps/pep-0550/
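
For readers unfamiliar with the PEP 550 idea, here is a sketch of
what "fluids for fibers" looks like on the Python side. PEP 550
itself was later withdrawn in favor of PEP 567, which added the
contextvars module in Python 3.7. The variable name current_session
and the session strings are made up for illustration. Each asyncio
task runs in its own copy of the context, so a value set in one task
is not visible in another:

```python
import asyncio
import contextvars

# Task-local slot: every asyncio task gets its own copy of the context.
current_session = contextvars.ContextVar("current_session")


async def handler(name, seen):
    # Set the session for this task only.
    current_session.set(f"session-{name}")
    # Yield so that the other task runs in between.
    await asyncio.sleep(0)
    # The value is still the one this task stored.
    seen[name] = current_session.get()


def run_demo():
    seen = {}

    async def main():
        await asyncio.gather(handler("a", seen), handler("b", seen))

    asyncio.run(main())
    return seen


print(run_demo())  # {'a': 'session-a', 'b': 'session-b'}
```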

I started doing this in culturia [3], but it's far from
complete, since I need to convert all the databases (grf3,
feature-space and ix) to that style.

Also, it's a backward incompatible change.

[3]
https://github.com/a-guile-mind/culturia.one/commit/15328e53fcb51d43f14de2f9f21d0b309237969a
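
The explicit style could look roughly like the sketch below. It is
written in Python with invented names (Connection, Session,
save_paste, load_paste) and an in-memory dict instead of a real
store, so it is not the culturia code or the guile-wiredtiger API.
It only shows the shape of the change: one connection per database,
one session per unit of execution, and every procedure taking the
session as an argument instead of reading it from a fluid.

```python
class Connection:
    """Hypothetical stand-in for a database connection (one per database)."""

    def __init__(self, path):
        self.path = path
        self.data = {}  # in-memory stand-in for the on-disk store

    def open_session(self):
        # One session per thread, fiber or request; never shared.
        return Session(self)


class Session:
    """Hypothetical stand-in for a session bound to one connection."""

    def __init__(self, connection):
        self.connection = connection


def save_paste(session, key, text):
    # The session is an explicit argument, not ambient state.
    session.connection.data[key] = text


def load_paste(session, key):
    return session.connection.data.get(key)


connection = Connection("/tmp/example-db")  # path is illustrative only
writer = connection.open_session()
reader = connection.open_session()
save_paste(writer, "paste-1", "hello")
print(load_paste(reader, "paste-1"))  # hello
```

The cost is the one mentioned above: every caller has to change,
which is why this is a backward incompatible change.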

At the end of the day, I don't think I want to maintain grf3,
feature-space and ix inside guile-wiredtiger because they are
much more than wiredtiger bindings. So I am pondering dropping
those database abstractions from guile-wiredtiger and focusing
on improving the core bindings (like actionable exceptions)
and maybe improving bindings coverage.

For the next release, 0.8, of guile-wiredtiger, the abstractions
grf3, feature-space and ix will be deprecated. After that, in the
0.9 release, they will be moved to the example folder (or better,
forked by other people to be maintained separately).
$ guix system disk-image gnupaste-system.scm
Then boot it up in a VPS.
That will be neat!
Thanks!
Kris
--
Amirouche ~ amz3 ~ http://www.hyperdev.fr
Kristofer Buffington
2017-12-17 17:35:57 UTC
Post by Amirouche Boubekki
Héllo,
Hello!
I am excited to share GNUPaste! This is a really simple web app
similar to paste.lisp.org built with Guile. I have a linode running it
from git on GuixSD.
https://paste.freshbakedyams.com (Please use it!)
Source: https://github.com/kristoferbuffington/gnupaste.git
Currently the frontend uses twitter bootstrap + jquery and highlightjs
from a CDN. It really doesn't need all that boilerplate. It will
definitely change in the future. GNUPaste depends on guile-wiredtiger
and guile-fibers to compile.
Thanks for considering guile-wiredtiger.
Basically, guile-wiredtiger is not yet compatible with guile-fibers
in the general case, because fibers spawns several threads
and runs several fibers in each thread (and I think that fibers
can be stolen by other threads, but I am not sure).
The way wiredtiger works is that there is one
*connection* (called *environment* in guile-wiredtiger)
per database and one *session* per "thread" of execution
(called *context* in guile-wiredtiger).
I changed the naming because they are different from the
original things. Both environment and context are backed
by fluids.
In a pre-fork thread model, one must use with-context [1]
after the fork to populate the current fluid with a specific
context.
[1]
https://framagit.org/a-guile-mind/guile-wiredtiger/blob/master/wiredtiger/extra.scm#L328
The thing is that when using fibers, the thread of execution
is a fiber, so the wiredtiger extra abstraction called context
fails: a context/session must not be shared between different
fibers, even if they are executed in the same thread.
I have noticed that I get "Resource busy" issues if multiple requests
are handled by the server simultaneously. I tried running (with-env ...)
in the request-handler instead, but it is the same problem. Multiple
environments can't open the database simultaneously either.
Post by Amirouche Boubekki
In other words, the guile-wiredtiger extra abstractions context
and environment are handy in a single-threaded program because
they avoid passing the context around. They are also handy in
simple multithreaded settings where you don't need to pass the
environment around (to create a new session per thread).
But they fail in the advanced use case of guile-fibers.
This won't trigger a crash under low traffic, but will fail
under load and with advanced use of guile-wiredtiger, like
multi-statement transactions. This can be mitigated by turning off
preemption in guile-fibers, but again it's not a perfect solution.
I failed to create an API that makes simple things simple and
complex things possible. There might be an escape if fibers
implemented fluids for fibers, something like PEP 550 [2].
But anyway, it won't be perfect, so I will rework the current
databases in guile-wiredtiger to completely avoid the use of
fluids and instead pass around the database connection and session.
[2] https://www.python.org/dev/peps/pep-0550/
I started doing this in culturia [3], but it's far from
complete, since I need to convert all the databases (grf3,
feature-space and ix) to that style.
Also, it's a backward incompatible change.
[3]
https://github.com/a-guile-mind/culturia.one/commit/15328e53fcb51d43f14de2f9f21d0b309237969a
At the end of the day, I don't think I want to maintain grf3,
feature-space and ix inside guile-wiredtiger because they are
much more than wiredtiger bindings. So I am pondering dropping
those database abstractions from guile-wiredtiger and focusing
on improving the core bindings (like actionable exceptions)
and maybe improving bindings coverage.
IMHO I think feature-space, ix and grf3 don't really belong in
guile-wiredtiger.
Post by Amirouche Boubekki
For the next release, 0.8, of guile-wiredtiger, the abstractions
grf3, feature-space and ix will be deprecated. After that, in the
0.9 release, they will be moved to the example folder (or better,
forked by other people to be maintained separately).
I really like the advanced abstractions. I'm not sure I'm knowledgeable
enough to fork and improve them yet.
Post by Amirouche Boubekki
$ guix system disk-image gnupaste-system.scm
Then boot it up in a VPS.
That will be neat!
Thanks!
Kris
I really don't want to rely on a database server. It would be easy
enough to store pastes on the filesystem, maybe even with (guix store)
or use git like tekuti and get the benefit of revision history with
paste modifications.

Thanks for the update!
Kris
Amirouche
2017-12-18 19:38:28 UTC
Post by Kristofer Buffington
Post by Amirouche Boubekki
Héllo,
Hello!
I am excited to share GNUPaste! This is a really simple web app
similar to paste.lisp.org built with Guile. I have a linode running it
from git on GuixSD.
https://paste.freshbakedyams.com (Please use it!)
Source: https://github.com/kristoferbuffington/gnupaste.git
Currently the frontend uses twitter bootstrap + jquery and highlightjs
from a CDN. It really doesn't need all that boilerplate. It will
definitely change in the future. GNUPaste depends on guile-wiredtiger
and guile-fibers to compile.
Thanks for considering guile-wiredtiger.
Basically, guile-wiredtiger is not yet compatible with guile-fibers
in the general case, because fibers spawns several threads
and runs several fibers in each thread (and I think that fibers
can be stolen by other threads, but I am not sure).
The way wiredtiger works is that there is one
*connection* (called *environment* in guile-wiredtiger)
per database and one *session* per "thread" of execution
(called *context* in guile-wiredtiger).
I changed the naming because they are different from the
original things. Both environment and context are backed
by fluids.
In a pre-fork thread model, one must use with-context [1]
after the fork to populate the current fluid with a specific
context.
[1]
https://framagit.org/a-guile-mind/guile-wiredtiger/blob/master/wiredtiger/extra.scm#L328
The thing is that when using fibers, the thread of execution
is a fiber, so the wiredtiger extra abstraction called context
fails: a context/session must not be shared between different
fibers, even if they are executed in the same thread.
I have noticed that I get "Resource busy" issues if multiple requests
are handled by the server simultaneously. I tried running (with-env ...)
in the request-handler instead, but it is the same problem. Multiple
environments can't open the database simultaneously either.
I will try to release 0.8 with a fix before the end of the year.
Post by Kristofer Buffington
Post by Amirouche Boubekki
In other words, the guile-wiredtiger extra abstractions context
and environment are handy in a single-threaded program because
they avoid passing the context around. They are also handy in
simple multithreaded settings where you don't need to pass the
environment around (to create a new session per thread).
But they fail in the advanced use case of guile-fibers.
This won't trigger a crash under low traffic, but will fail
under load and with advanced use of guile-wiredtiger, like
multi-statement transactions. This can be mitigated by turning off
preemption in guile-fibers, but again it's not a perfect solution.
I failed to create an API that makes simple things simple and
complex things possible. There might be an escape if fibers
implemented fluids for fibers, something like PEP 550 [2].
But anyway, it won't be perfect, so I will rework the current
databases in guile-wiredtiger to completely avoid the use of
fluids and instead pass around the database connection and session.
[2] https://www.python.org/dev/peps/pep-0550/
I started doing this in culturia [3], but it's far from
complete, since I need to convert all the databases (grf3,
feature-space and ix) to that style.
Also, it's a backward incompatible change.
[3]
https://github.com/a-guile-mind/culturia.one/commit/15328e53fcb51d43f14de2f9f21d0b309237969a
At the end of the day, I don't think I want to maintain grf3,
feature-space and ix inside guile-wiredtiger because they are
much more than wiredtiger bindings. So I am pondering dropping
those database abstractions from guile-wiredtiger and focusing
on improving the core bindings (like actionable exceptions)
and maybe improving bindings coverage.
IMHO I think feature-space, ix and grf3 don't really belong in
guile-wiredtiger.
agree
Post by Kristofer Buffington
Post by Amirouche Boubekki
For the next release, 0.8, of guile-wiredtiger, the abstractions
grf3, feature-space and ix will be deprecated. After that, in the
0.9 release, they will be moved to the example folder (or better,
forked by other people to be maintained separately).
I really like the advanced abstractions. I'm not sure I'm knowledgeable
enough to fork and improve them yet.
This is not happening before I have a replacement for them.
When I think replacement, I think about a database (server?)
that is hopefully easy to use (in the spirit of mongodb)
but also ACID across documents (i.e. one that cares about
your data).
Post by Kristofer Buffington
Post by Amirouche Boubekki
$ guix system disk-image gnupaste-system.scm
Then boot it up in a VPS.
That will be neat!
Thanks!
Kris
I really don't want to rely on a database server. It would be easy
enough to store pastes on the filesystem, maybe even with (guix store)
or use git like tekuti and get the benefit of revision history with
paste modifications.
I don't like the idea of a database server either. That said,
without one, you have to code a secure API to allow
third parties to access the database, i.e. in a web context,
set up a RESTful API with credentials, secure tokens, etc.

For instance, how will you delete old pastes in your application?

Anyway, the next step is to make the guile-wiredtiger abstractions
work flawlessly with guile-fibers.
Post by Kristofer Buffington
Thanks for the update!
Kris
Andy Wingo
2017-12-18 09:06:25 UTC
Post by Amirouche Boubekki
Basically, guile-wiredtiger is not yet compatible with guile-fibers
in the general case, because fibers spawns several threads
and runs several fibers in each thread (and I think that fibers
can be stolen by other threads, but I am not sure).
Note that it's possible to run fibers with only one kernel thread. See
the docs. Also note that in fibers (and indeed in Guile threads), a
newly spawned fiber (or thread) inherits the fluid values that were
current when the thread was spawned. Fluid values in other fibers or
threads are unaffected.

Anyway, I reply to offer some more general notes :) If what you need is
sequential access to a database, you can arrange to access the database
from a single fiber. That fiber can communicate with others via
channels (for example). If the fiber migrates to another thread, that
usually doesn't matter -- it's as if a kernel thread migrated to a
different CPU. The memory model of Guile and fibers ensures that there
will be no problems. You do end up having to route database requests to
that fiber, usually via messages over channels, but that can be OK --
see
https://blog.acolyer.org/2017/12/04/ffwd-delegation-is-much-faster-than-you-think/.
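
The delegation pattern described above can be sketched as follows.
This is Python with asyncio rather than Guile with fibers: an
asyncio.Queue stands in for a channel, a task stands in for the
fiber, and a plain dict stands in for the database. The names
database_owner and call are invented for the example. Only the owner
task ever touches the store; everyone else sends it a request and
waits for the reply.

```python
import asyncio


async def database_owner(requests, store):
    """The only task that touches `store`, so access is sequential."""
    while True:
        op, key, value, reply = await requests.get()
        if op == "stop":
            reply.set_result(None)
            return
        if op == "put":
            store[key] = value
            reply.set_result(True)
        elif op == "get":
            reply.set_result(store.get(key))


async def call(requests, op, key=None, value=None):
    # Send one request to the owner and wait for its reply.
    reply = asyncio.get_running_loop().create_future()
    await requests.put((op, key, value, reply))
    return await reply


def run_demo():
    async def main():
        requests = asyncio.Queue()  # stands in for a channel
        owner = asyncio.create_task(database_owner(requests, {}))
        # Several concurrent writers, all routed through the owner.
        await asyncio.gather(
            *(call(requests, "put", f"paste-{i}", f"body {i}") for i in range(3))
        )
        result = await call(requests, "get", "paste-1")
        await call(requests, "stop")
        await owner
        return result

    return asyncio.run(main())


print(run_demo())  # body 1
```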

Sometimes though you need real thread affinity between some external
resource and a fiber. In that case the usual solution is to spawn a
thread instead of a fiber, and access the resource only in that thread.
You can still use channels to communicate between that thread and other
fibers running on your system, if that's what you want.
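
A rough sketch of that second option, again in Python as an analogy:
a dedicated OS thread owns the resource, and callers hand it work
over a queue.Queue (standing in for a channel). The helper names
start_owner_thread and submit are invented for the example. Every
operation on the store runs on the owner thread, whichever thread
submits it.

```python
import queue
import threading


def start_owner_thread(store):
    """Start a thread that is the only one allowed to touch `store`."""
    requests = queue.Queue()

    def loop():
        while True:
            item = requests.get()
            if item is None:  # shutdown signal
                return
            func, reply = item
            # The function runs here, on the owner thread.
            reply.put(func(store))

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return requests, thread


def submit(requests, func):
    # Hand a function to the owner thread and wait for its result.
    reply = queue.Queue(maxsize=1)
    requests.put((func, reply))
    return reply.get()


def run_demo():
    requests, thread = start_owner_thread({})
    # Record which thread actually performs the write.
    submit(requests, lambda s: s.__setitem__("who", threading.get_ident()))
    who = submit(requests, lambda s: s["who"])
    requests.put(None)
    thread.join()
    return who == thread.ident and who != threading.get_ident()


print(run_demo())  # True
```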

Cheers,

Andy