Discussion:
optimizing lazy sequences: srfi-41 vs delayed continuation with values + promise vs delayed continuation with cons + lambda
Amirouche Boubekki
2018-02-25 14:16:48 UTC
Hello all,

I know it's not good to optimize first, but I got into it
and now I need to know.

A few months ago, I created a lazy sequence library that retains
history, based on delayed continuations. It can be summarized
as follows:

(use-modules (ice-9 match))

(define-public (list->traversi lst)
  (let loop ((lst lst))
    (lambda ()
      (if (null? lst)
          '()
          ;; here the car of the pair contains the first value
          ;; and an empty list as history
          (cons (cons (car lst) '()) (loop (cdr lst)))))))

(define-public (traversi->list traversi)
  (let loop ((traversi traversi)
             (out '()))
    (match (traversi)
      ('() (reverse out))
      ((item . next) (loop next (cons (car item) out))))))

(define-public (traversi-map proc traversi)
  (let loop ((traversi traversi))
    (lambda ()
      (match (traversi)
        ('() '())
        ((item . next)
         (cons (cons (proc (car item)) item) (loop next)))))))

There are other procedures, but that is the gist of it.
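
To make the history retention concrete, here is a small self-contained
sketch. It repeats the two definitions above with plain 'define' instead
of 'define-public' and forces the first element of a mapped traversi.
The names 'mapped' and 'first-item' and the sample input are only for
illustration.

```scheme
(use-modules (ice-9 match))

(define (list->traversi lst)
  (let loop ((lst lst))
    (lambda ()
      (if (null? lst)
          '()
          (cons (cons (car lst) '()) (loop (cdr lst)))))))

(define (traversi-map proc traversi)
  (let loop ((traversi traversi))
    (lambda ()
      (match (traversi)
        ('() '())
        ((item . next)
         (cons (cons (proc (car item)) item) (loop next)))))))

;; Illustrative input: map 1+ over a two-element list.
(define mapped (traversi-map 1+ (list->traversi '(10 20))))

;; Calling the thunk yields a pair: (item . next-thunk).
;; The item is the new value consed onto the previous item,
;; so the value from before the map stays reachable as history.
(define first-item (car (mapped)))

(display first-item) ; prints (11 10)
(newline)
```

The car of each item is the current value and the cdr is the chain of
earlier values, which is why 'traversi->list' only collects the car.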

I tested it on a parser combinator for CSV and it seemed much
faster than the SRFI-41 based parser combinators [0].

[0] https://git.dthompson.us/guile-parser-combinators.git

Now I have another idea: what if I replace the 'cons' returned
by the traversi procedures with 'values', and replace the
lambda with 'delay', with 'force' used at the correct
places? That results in the following procedures:

(define-public (list->stream lst)
  (let loop ((lst lst))
    (if (null? lst)
        (delay (values #f #f))
        (delay (values (car lst) (loop (cdr lst)))))))

(define-public (stream->list stream)
  (let loop ((stream stream)
             (out '()))
    (call-with-values (lambda () (force stream))
      (lambda (value next)
        (if next
            (loop next (cons value out))
            (reverse! out))))))

(define-public (stream-map proc stream)
  (let loop ((stream stream))
    (call-with-values (lambda () (force stream))
      (lambda (value next)
        (if next
            (delay (values (proc value) (loop next)))
            (delay (values #f #f)))))))

I tested those procedures in the REPL with ,time and
the results are not conclusive:

scheme@(guile-user)> (define (nop o) #t)

scheme@(guile-user)> ,time (nop (traversi->list (traversi-map (lambda (x) (1+ x)) (list->traversi (iota (expt 2 10))))))
$1 = #t
;; 0.028090s real time, 0.037698s run time. 0.012421s spent in GC.
scheme@(guile-user)> ,time (nop (traversi->list (traversi-map (lambda (x) (1+ x)) (list->traversi (iota (expt 2 10))))))
$2 = #t
;; 0.025622s real time, 0.025614s run time. 0.000000s spent in GC.
scheme@(guile-user)> ,time (nop (traversi->list (traversi-map (lambda (x) (1+ x)) (list->traversi (iota (expt 2 10))))))
$3 = #t
;; 0.034165s real time, 0.034146s run time. 0.000000s spent in GC.

scheme@(guile-user)> ,time (nop (stream->list (stream-map (lambda (x) (1+ x)) (list->stream (iota (expt 2 10))))))
$4 = #t
;; 0.031616s real time, 0.041882s run time. 0.014514s spent in GC.
scheme@(guile-user)> ,time (nop (stream->list (stream-map (lambda (x) (1+ x)) (list->stream (iota (expt 2 10))))))
$5 = #t
;; 0.026148s real time, 0.026141s run time. 0.000000s spent in GC.
scheme@(guile-user)> ,time (nop (stream->list (stream-map (lambda (x) (1+ x)) (list->stream (iota (expt 2 10))))))
$6 = #t
;; 0.028477s real time, 0.037107s run time. 0.011057s spent in GC.

Q: What in theory should be faster?

Q: How can I test / prove that one approach must be faster?

From the usability & readability point of view I find the
values + promise much more readable and easier to code.
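
On the question of how to test this: the runs above each take about
30 ms and differ as much from one run to the next as between the two
implementations. One possible way to get steadier numbers is to repeat
the workload many times and average. The helper 'average-seconds' below
is a hypothetical name, and the workload is only a placeholder, not one
of the pipelines from this post.

```scheme
;; Hypothetical helper: call THUNK COUNT times and return the
;; average wall-clock time per call, in seconds.
(define (average-seconds count thunk)
  (gc) ; collect first so earlier garbage is not charged to this run
  (let ((start (get-internal-real-time)))
    (let loop ((i 0))
      (when (< i count)
        (thunk)
        (loop (+ i 1))))
    (/ (exact->inexact (- (get-internal-real-time) start))
       internal-time-units-per-second
       count)))

;; Placeholder workload (an assumption for this sketch): an eager map
;; over a list of 2^16 integers.
(define (workload)
  (map 1+ (iota (expt 2 16))))

(display (average-seconds 100 workload))
(newline)
```

Replacing 'workload' with the traversi, promise and SRFI-41 pipelines in
turn, on an input much larger than 2^10, gives figures that are easier
to compare than single ,time runs.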

TIA!
Amirouche Boubekki
2018-02-25 14:30:16 UTC
I forgot the SRFI-41 timings. They are worse:

scheme@(guile-user)> (use-modules (srfi srfi-41))
scheme@(guile-user)> (define (nop o) #t)
scheme@(guile-user)> ,time (nop (stream->list (stream-map (lambda (x) (1+ x)) (list->stream (iota (expt 2 10))))))
$1 = #t
;; 0.034429s real time, 0.043788s run time. 0.012689s spent in GC.
scheme@(guile-user)> ,time (nop (stream->list (stream-map (lambda (x) (1+ x)) (list->stream (iota (expt 2 10))))))
$2 = #t
;; 0.033329s real time, 0.041973s run time. 0.010852s spent in GC.
scheme@(guile-user)> ,time (nop (stream->list (stream-map (lambda (x) (1+ x)) (list->stream (iota (expt 2 10))))))
$3 = #t
;; 0.056261s real time, 0.064563s run time. 0.010440s spent in GC.
--
Amirouche ~ amz3 ~ http://www.hyperdev.fr
Amirouche Boubekki
2018-02-25 22:39:05 UTC
Amirouche Boubekki
2018-02-25 23:31:39 UTC
Post by Amirouche Boubekki
(define (lazyseq-with-stream)
  (list->stream (iota max)))
This is wrong.

It must be implemented as:

(define-stream (lazyseq-with-stream)
  (stream-let loop ((v 1))
    (stream-cons v (loop (+ 1 v)))))

I get the same segfault with the following error:

Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
Mark H Weaver
2018-03-02 23:22:00 UTC
Hi,

Amirouche Boubekki <***@hypermove.net> writes:
[...]
Post by Amirouche Boubekki
Now I have another idea: what if I replace the 'cons' returned
by the traversi procedures with 'values', and replace the
lambda with 'delay', with 'force' used at the correct
places? That results in the following procedures:
(define-public (list->stream lst)
  (let loop ((lst lst))
    (if (null? lst)
        (delay (values #f #f))
        (delay (values (car lst) (loop (cdr lst)))))))
(define-public (stream->list stream)
  (let loop ((stream stream)
             (out '()))
    (call-with-values (lambda () (force stream))
      (lambda (value next)
        (if next
            (loop next (cons value out))
            (reverse! out))))))
(define-public (stream-map proc stream)
  (let loop ((stream stream))
    (call-with-values (lambda () (force stream))
      (lambda (value next)
        (if next
            (delay (values (proc value) (loop next)))
            (delay (values #f #f)))))))
This code assumes that promises can store multiple values. Although
Guile's legacy core promises *accidentally* support multiple values
today, there's no guarantee that they will continue to do so in the
future. None of the standards allow this, and Guile's manual states in
the documentation for 'delay' that "The effect of <expression> returning
multiple values is unspecified."

Supporting multiple values in promises makes them more complex, and
inevitably less efficient.

SRFI-45 promises are simpler than Guile's legacy core promises in two
ways: (1) they do not include built-in thread synchronization, and (2)
they do not support multiple values.

I would recommend using SRFI-45 promises. Although they are implemented
in Scheme, last I checked I found that they were about as fast as
Guile's legacy core promises implemented in C, presumably because of the
built-in thread synchronization in our core promises.

In any case, I would strongly recommend against writing code that
assumes that Guile's promises can hold multiple values.
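
As a rough illustration of that advice, here is a minimal sketch in
which every promise holds exactly one value: either '() at the end of
the sequence, or a pair of the element and the next promise. It uses
'delay' and 'force' from SRFI-45. The names 'list->lazy', 'lazy->list'
and 'lazy-map' are hypothetical, chosen so they do not clash with
SRFI-41, and this is one possible shape, not a tuned implementation.

```scheme
(use-modules (srfi srfi-45))

;; Each promise yields a single value:
;; '() for the end, or (element . promise-for-the-rest).
(define (list->lazy lst)
  (let loop ((lst lst))
    (delay
      (if (null? lst)
          '()
          (cons (car lst) (loop (cdr lst)))))))

(define (lazy->list seq)
  (let loop ((seq seq)
             (out '()))
    (let ((cell (force seq)))
      (if (null? cell)
          (reverse out)
          (loop (cdr cell) (cons (car cell) out))))))

;; Nothing is forced until the result itself is forced.
(define (lazy-map proc seq)
  (let loop ((seq seq))
    (delay
      (let ((cell (force seq)))
        (if (null? cell)
            '()
            (cons (proc (car cell)) (loop (cdr cell))))))))

(display (lazy->list (lazy-map 1+ (list->lazy (iota 5)))))
(newline) ; prints (1 2 3 4 5)
```

Compared with the 'values' version, this allocates one pair per element,
but it relies only on behaviour that the promise specification defines.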

Regards,
Mark
