Lý sự pclouds

Tháng bảy 2019
T2	T3	T4	T5	T6	T7	CN
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31

Kho tháng 7/2019

Thứ tư, 31 Tháng 7 năm 2019 18:38:04 ICT

Writer monad in Scheme

Just playing. The Writer monad could be written probably like this:

(import (scheme base)
        (scheme write)
        (srfi 8)
        (srfi 99)
        (only (util list)
              intersperse))

(define-record-type <writer> make-writer #t
  (value writer-value) (written writer-written))

(define (return x)
  (make-writer x (string-append "returning " (number->string x))))

(define (>>= a f)
  (define (str->list x)
    (if (string? x) (list x) x))

  (let ([b (f (writer-value a))])
    (make-writer (writer-value b)
                 (append (str->list (writer-written a))
                         (str->list (writer-written b))))))

(define (run-writer x)
  (values (writer-value x)
          (apply string-append
                 (intersperse "\n" (writer-written x)))))

(define (add x)
  (make-writer (+ x 1) "adding one"))
(define (mul x)
  (make-writer (* x 2) "multiply by two"))

(receive (value log)
    (run-writer (>>= (>>= (return 2) add) mul))
  (display value)
  (newline)
  (display log)
  (newline))

We add 1 to 2, then multiply by 2. The result is

$ gosh ~/t/writer.scm 
6
returning 2
adding one
multiply by two

The monad effect is collecting the strings "adding one" and "multiply by two".

The >>= part is far from easy to read. So with some magic to recreate "do" syntax (stolen from here)

(define-syntax monad-do
  (syntax-rules (let1)
    [(_ e) e]
    [(_ (let1 var e1) e2 ...)
     (>>= e1 (lambda (var) (monad-do e2 ...)))]
    [(_ e1 e2 ...)
     (>>= e1 (lambda (_) (monad-do e2 ...)))]))

we can rewrite the main code to

    (run-writer (monad-do (let1 foo (return 2))
                          (let1 bar (add foo))
                          (mul bar)))

which is much easier to read (and very imperative looking). If we reverse the arguments of >>= though, to

(define (>>= f a)

the main code does not look that bad to require "do"

(run-writer (>>= mul (>>= add (return 2))))

But yeah monad is basically it. It takes a monad value and a function. It calls the function with the real value inside the monadic value, then it combines the context from the first monad value and the one produced by the function.

Tác giả: pclouds | Liên kết tĩnh

Thứ hai, 29 Tháng 7 năm 2019 19:44:08 ICT

I know monad!

Or I don't. Actually I'm not sure I do.

The blame goes to SRFI-159, SRFI-165 and SRFI-166. But monad seems interesting. Basically it's a way to manage some "side context" while applying a sequence of functions in a neat way.

One example, in a pure functional program, how do you manage state? The program state must change, but the in functional programming state is immutable.

The easiest way is passing a state variable in all functions, and return a pair of whatever-value-that-function-produces and the new state. It works, but it's a but cumbersome.

With monad and the bind operator >>= (at least in haskell), the state management part is taken out and the function can just focus on the true logic, not worrying too much about states.

Of course the function still needs access to the state, or to produce a new state. That's when the ask and local function from Reader monad comes in. This reader monad is represented as "environmental monad" in SRFI-165. Different name, same beast (though Scheme code is still easier to understand even if not as concise)

There's also Writer monad to "keep a log" on the side. The real State monad looks a bit more complicated. But well that's something for another monday.

Tác giả: pclouds | Liên kết tĩnh

Thứ ba, 16 Tháng 7 năm 2019 19:43:57 ICT

Think recursive and you understand...

... too bad thinking recursively is not that easy

(define-syntax let-xcb
  (syntax-rules (font gc pixmap)
    [(_ ((name (font args ...))) body ...)
     (let ([name (make-xid)])
       (dynamic-wind
         (lambda () (xcb-open-font :fid name args ...))
         (lambda () body ...)
         (lambda () (xcb-close-font :font name))))]
    [(_ ((name (gc args ...))) body ...)
     (let ([name (make-xid)])
       (dynamic-wind
         (lambda () (xcb-create-gc :cid name args ...))
         (lambda () body ...)
         (lambda () (xcb-free-gc :gc name))))]
    [(_ ((name (pixmap args ...))) body ...)
     (let ([name (make-xid)])
       (dynamic-wind
         (lambda () (xcb-create-pixmap :pid name args ...))
         (lambda () body ...)
         (lambda () (xcb-free-pixmap :pixmap name))))]
    [(_ ((name (key args ...))) body ...)
     (syntax-error "invalid keyword" key)]
    [(_ ((name value) rest ...) body ...)
     (let-xcb ((name value))
              (let-xcb (rest ...)
                       body ...))]
    [(_ () body ...)
     (begin body ...)]))

Pretty nice "RAII" in Scheme actually. Though of course if xcb-open-font could associate itself with the corresponding "destructor" procedure, it would be even nicer. But that feels a bit heavy.

Of course it's also nicer to reduce duplication in this macro. It's probably possible, but this is easier to read.

Tác giả: pclouds | Liên kết tĩnh

Thứ năm, 11 Tháng 7 năm 2019 22:23:42 ICT

http://freshmeat.sourceforge.net/ oh the memories...

Tác giả: pclouds | Liên kết tĩnh

Thứ năm, 11 Tháng 7 năm 2019 15:53:07 ICT

xwd in Scheme hú hú!

Tác giả: pclouds | Liên kết tĩnh

Thứ sáu, 05 Tháng 7 năm 2019 19:15:15 ICT

Examining std::function<>

C is simple. C++ is complicated. If you have a function pointer variable "fn" in C, gdb's "print" would tell you exactly what it points to. And list *fn would even show you the source code.

Try that on std::function<> which is the C++ way of carrying function pointer around. The printout is confusing at best.

At least if std::function<> contains a simple C function (or static class function), things aren't quite as bad.

(gdb) p f
$1 = {
  <std::_Maybe_unary_or_binary_function<int, int, int>> = {
    <std::binary_function<int, int, int>> = {<No data fields>}, <No data fields>},
  <std::_Function_base> = {
    static _M_max_size = 16,
    static _M_max_align = 8,
    _M_functor = {
      _M_unused = {
        _M_object = 0x400947 <real(int, int)>,
        _M_const_object = 0x400947 <real(int, int)>,
        _M_function_pointer = 0x400947 <real(int, int)>,
        _M_member_pointer = &virtual table offset 4196678, this adjustment 72704
      },
      _M_pod_data = "G\t@\000\000\000\000\000\000\034\001\000\000\000\000"
    },
    _M_manager = 0x400fd4 <std::_Function_base::_Base_manager<int (*)(int, int)>::_M_manager(std::_Any_data&, std::_Any_data const&, std::_Manager_operation)>
  },
  members of std::function<int(int, int)>:
  _M_invoker = 0x400f81 <std::_Function_handler<int (int, int), int (*)(int, int)>::_M_invoke(std::_Any_data const&, int&&, int&&)>
}

It's still quite confusing, but _M_function_pointer kinda hints what it is already. Which is correct in this case. But let's talk a bit about some other fields here before moving on to more complicated functions.

std::function is a quite simple class actually. It has 32-bytes, which is 16 bytes of opaque data and two function pointers: _M_manager and _M_invoker.

I'm not sure about the manager thing (probably for opaque data management), but the invoker is the equivalent of virtual 'operator()'. Except in this case, std::function avoids virtual methods and everything, probably to stay small and cheap.

When you invoke an std::function, its invoker is called with the 16-byte opaque data above. When a regular function is stored in a std::function, the invoker version used will take consider the first 8 bytes on the opaque data as the function address, which is what you see above (_M_unused is a union by the way).

For the record, the function is this

(gdb) l *f._M_functor._M_unused._M_function_pointer
0x400947 is in real(int, int) (func1.cpp:4).
1       #include <functional>
2
3       int real(int v1, int v2)
4       {
5               return v1 + v2;
6       }
7
8       int main(int ac, const char **av)
9       {
10              std::function<int(int, int)> f = real;

OK what if you store something more sophisticated, like a std::bind result? You may see something like this

(gdb) p f2
$1 = {
  <std::_Maybe_unary_or_binary_function<int, int>> = {
    <std::unary_function<int, int>> = {<No data fields>}, <No data fields>},
  <std::_Function_base> = {
    static _M_max_size = 16,
    static _M_max_align = 8,
    _M_functor = {
      _M_unused = {
        _M_object = 0x615e70,
        _M_const_object = 0x615e70,
        _M_function_pointer = 0x615e70,
        _M_member_pointer = (void (std::_Undefined_class::*)(std::_Undefined_class * const)) 0x615e70, this adjustment 4199941
      },
      _M_pod_data = "p^a\000\000\000\000\000\005\026@\000\000\000\000"
    },
    _M_manager = 0x400e7d <std::_Function_base::_Base_manager<std::_Bind<int (*(std::_Placeholder<1>, int))(int, int)> >::_M_manager(std::_Any_data&, std::_Any_data const&, std::_Manager_operation)>
  },
  members of std::function<int(int)>:
  _M_invoker = 0x400e3f <std::_Function_handler<int (int), std::_Bind<int (*(std::_Placeholder<1>, int))(int, int)> >::_M_invoke(std::_Any_data const&, int&&)>
}

The _M_unused part does not make sense anymore. But the trick is this, the entire std::bind result (which is called a "functor") is actually allocated dynamically and its pointer is _M_object (or any other fields in that union, of course). If you know its type (remember std::bind is a template) then you can cast it back and maybe see something.

And the type is actually here, in the _M_invoker line, second template argument. So what if we cast it back and print it out?

(gdb) p *(std::_Bind<int (*(std::_Placeholder<1>, int))(int, int)>*)f2._M_functor._M_unused._M_object
$2 = {
  <std::_Weak_result_type<int (*)(int, int)>> = {
    <std::_Weak_result_type_impl<int (*)(int, int)>> = {<No data fields>}, <No data fields>},
  members of std::_Bind<int (*(std::_Placeholder<1>, int))(int, int)>:
  _M_f = 0x400887 <real(int, int)>,
  _M_bound_args = std::tuple containing = {
    [1] = {
      <std::_Placeholder<1>> = {<No data fields>}, <No data fields>},
    [2] = 2
  }
}

This looks a lot better. The _M_f line actually tells us about the function (and we can of course locate it with l *$2._M_f. But the more interesting thing here is the bound arguments. The original std::bind is this:

std::bind(real, std::placeholders::_1, 2);

And we can see from _M_bound_args that the first argument is _Placeholder<1>, and the second one the value 2. The function prototype at _M_f line already hints about the placeholder positions.

(Note, if you really examine raw data, std::tuple<> actually stores data in backward order)

This should work for class member functions as well, at least non-virtual ones. Doing the same trick we see

(gdb) p *(std::_Bind<int (Foo::*(Foo*, std::_Placeholder<1>, int))(int, int)>*)0x615e70
$2 = {
  <std::_Weak_result_type<int (Base::*)(int, int)>> = {
    <std::_Weak_result_type_impl<int (Base::*)(int, int)>> = {<No data fields>}, <No data fields>},
  members of std::_Bind<int (Base::*(Base*, std::_Placeholder<1>, int))(int, int)>:
  _M_f = &virtual Base::real(int, int),
  _M_bound_args = std::tuple containing = {
    [1] = 0x7fffffffdb30,
    [2] = {
      <std::_Placeholder<1>> = {<No data fields>}, <No data fields>},
    [3] = 2
  }
}

Pretty much the same. The first argument is "this" pointer of course. I deliberately bound to the base virtual function to hide the function signature, so you don't really know what exact derived class it is. But normally you should have the right class name at line _M_f.

Anyway to find out what exact function it is then? There is (and it's how correct virtual function is executed). You need to look at the virtual function table.

Actually you won't have to look very far into the virtual table. You already know the first argument is "this" pointer. This is something new: the first pointer in this object should always be the vtable (unless there are no virtual functions of course, which is not the case here), so:

(gdb) set print asm-demangle on
(gdb) x/a 0x7fffffffdb30
0x7fffffffdb30: 0x401db8 <vtable for Foo1+16>

There it is, it's "Foo1::real()". Knowing the class name, you use gdb to decode the vtable:

(gdb) info vtbl (Foo1*)0x7fffffffdb30
vtable for 'Foo1' @ 0x401db8 (subobject @ 0x7fffffffdb30):
[0]: 0x400e18 <Foo1::real(int, int)>

and locate the function using that function address:

(gdb) l *0x400e18
0x400e18 is in Foo1::real(int, int) (func4.cpp:13).
8       };
9
10      class Foo1 : public Base
11      {
12      public:
13              int real(int v1, int v2) override
14              {
15                      return v1 * v2;
16              }
17      };

If you want to decode the vtable yourself, it's not that hard. "x/a" should give you the function addresses when you give it the vtable address 0x401db8.

That address is actually in the middle of the virtual table. There are two words before it to form class hierarchy and give type info. But that address is the start of the first virtual function (or more if you have more than one).

That's it. I still don't like fat pointer but what can you do about it.

What about lambda? It's not that bad from a quick glance

(gdb) p f6
$1 = {
  <std::_Maybe_unary_or_binary_function<int, int>> = {
    <std::unary_function<int, int>> = {<No data fields>}, <No data fields>},
  <std::_Function_base> = {
    static _M_max_size = 16,
    static _M_max_align = 8,
    _M_functor = {
      _M_unused = {
        _M_object = 0x1,
        _M_const_object = 0x1,
        _M_function_pointer = 0x1,
        _M_member_pointer = &virtual table offset 0, this adjustment 4197477
      },
      _M_pod_data = "\001\000\000\000\000\000\000\000e\f@\000\000\000\000"
    },
    _M_manager = 0x4008e8 <std::_Function_base::_Base_manager<main(int, char const**)::<lambda(int)> >::_M_manager(std::_Any_data &, const std::_Any_data &, std::_Manager_operation)>
  },
  members of std::function<int(int)>:
  _M_invoker = 0x4008a9 <std::_Function_handler<int(int), main(int, char const**)::<lambda(int)> >::_M_invoke(const std::_Any_data &, int &&)>
}

since you can figure out roughly where the lambda is from based on the "namespace" part in _M_invoker, which is main(int, char const **). But if you want even more accuracy, it gets a bit ugly.

Not sure if it's always is the case, but in this case, lambda code is embedded in the invoker. The first step, even with "/m" to show source code, does not help much:

(gdb) disassemble f6._M_invoker
Dump of assembler code for function std::_Function_handler<int(int), main(int, char const**)::<lambda(int)> >::_M_invoke(const std::_Any_data &, int &&):
   0x00000000004008a9 <+0>:     push   %rbp
   0x00000000004008aa <+1>:     mov    %rsp,%rbp
   0x00000000004008ad <+4>:     push   %rbx
   0x00000000004008ae <+5>:     sub    $0x18,%rsp
   0x00000000004008b2 <+9>:     mov    %rdi,-0x18(%rbp)
   0x00000000004008b6 <+13>:    mov    %rsi,-0x20(%rbp)
   0x00000000004008ba <+17>:    mov    -0x18(%rbp),%rax
   0x00000000004008be <+21>:    mov    %rax,%rdi
   0x00000000004008c1 <+24>:    callq  0x4009fd <std::_Function_base::_Base_manager<main(int, char const**)::<lambda(int)> >::_M_get_pointer(const std::_Any_data &)>
   0x00000000004008c6 <+29>:    mov    %rax,%rbx
   0x00000000004008c9 <+32>:    mov    -0x20(%rbp),%rax
   0x00000000004008cd <+36>:    mov    %rax,%rdi
   0x00000000004008d0 <+39>:    callq  0x400be9 <std::forward<int>(std::remove_reference<int>::type&)>
   0x00000000004008d5 <+44>:    mov    (%rax),%eax
   0x00000000004008d7 <+46>:    mov    %eax,%esi
   0x00000000004008d9 <+48>:    mov    %rbx,%rdi
   0x00000000004008dc <+51>:    callq  0x4006fc <<lambda(int)>::operator()(int) const>
   0x00000000004008e1 <+56>:    add    $0x18,%rsp
   0x00000000004008e5 <+60>:    pop    %rbx
   0x00000000004008e6 <+61>:    pop    %rbp
   0x00000000004008e7 <+62>:    retq
End of assembler dump.

The second step however does show something. Disassembling the last callq:

(gdb) disassemble/m 0x4006fc
Dump of assembler code for function <lambda(int)>::operator()(int) const:
11              std::function<int(int)> f6 = [v1](int v2) ->int { return v1 + v2; };
   0x00000000004006fc <+0>:     push   %rbp
   0x00000000004006fd <+1>:     mov    %rsp,%rbp
   0x0000000000400700 <+4>:     mov    %rdi,-0x8(%rbp)
   0x0000000000400704 <+8>:     mov    %esi,-0xc(%rbp)
   0x0000000000400707 <+11>:    mov    -0x8(%rbp),%rax
   0x000000000040070b <+15>:    mov    (%rax),%edx
   0x000000000040070d <+17>:    mov    -0xc(%rbp),%eax
   0x0000000000400710 <+20>:    add    %edx,%eax
   0x0000000000400712 <+22>:    pop    %rbp
   0x0000000000400713 <+23>:    retq

End of assembler dump.

Line 11 is where the lambda is from:

(gdb) l 11
6       }
7
8       int main(int ac, const char **av)
9       {
10              int v1 = 1;
11              std::function<int(int)> f6 = [v1](int v2) ->int { return v1 + v2; };
12              return f6(3);
13      }

Not exactly sure how "v1" is captured. Well... not in the mood to stare at these disassembly listings for long. At higher optimization level, that secondary call may be cut out and _M_invoker contains the lambda body too:

(gdb) disassemble/m f7._M_invoker
Dump of assembler code for function std::_Function_handler<int(int, int), rrr()::<lambda(int, int)> >::_M_invoke(const std::_Any_data &, int &&, int &&):
4               return [](int v1, int v2) ->int { return v1 + v2; };
   0x00000000004007b0 <+0>:     mov    (%rsi),%eax
   0x00000000004007b2 <+2>:     add    (%rdx),%eax

End of assembler dump.
(gdb) l *0x00000000004007b0
0x4007b0 is in std::_Function_handler<int(int, int), rrr()::<lambda(int, int)> >::_M_invoke(const std::_Any_data &, int &&, int &&) (zz.cpp:4).
1       #include <functional>
2       std::function<int(int, int)> rrr()
3       {
4               return [](int v1, int v2) ->int { return v1 + v2; };
5       }
6

Tác giả: pclouds | Liên kết tĩnh | Mánh và mẹo

Thứ ba, 02 Tháng 7 năm 2019 20:18:31 ICT

The key word of matrix multiplication is linear transformation

Tác giả: pclouds | Liên kết tĩnh