Debugging GNU Smalltalk on NetBSD: Episode I

Posted on February 26, 2011

In the previous post I described how to build GNU Smalltalk on the fascinating NetBSD operating system. The interpreter worked fine, but I wanted something more than simple scripts.

The problem

So I tried to run Seaside. netstat showed that port 8080 was open, but I could not reach http://localhost:8080/seaside in the browser.

My first suspicion fell on sockets. Of course, it would be hard to debug sockets through tools as complicated as Swazoo and Seaside, so I took Samuel Montgomery-Blinn's simple TCP echo server example for my tests. I simplified the code slightly so that it runs in a single green thread, serves a single client, and handles only a single message:

Eval [
    | client server string |

    PackageLoader fileInPackage: #TCP.

    "Listen on port 8000 and block until a client connects."
    server := TCP.ServerSocket port: 8000.
    server waitForConnection.

    client := server accept.

    "Read one line from the client and echo it back."
    string := client nextLine.
    client nextPutAll: string; nextPut: Character nl.

    client flush.
    client close.
]

This sample works fine on GNU/Linux, but not on NetBSD. I successfully connected to port 8000 with telnet, but after typing a message and hitting Enter, the server did not reply with an echo. The server process still hung in memory.

Great. Time to take a look under the hood and understand how GNU Smalltalk sockets work.

Sockets: it's streams all the way down

GNU Smalltalk sockets are implemented in a cute way. The "end-user" objects are not actually sockets; they are just adaptors that implement the Stream interface over concrete socket implementations.

Stream
  Sockets.AbstractSocket
    Sockets.DatagramSocket
      Sockets.MulticastSocket
    Sockets.ServerSocket
    Sockets.StreamSocket
      Sockets.Socket

Caption: End-user class hierarchy

It is obvious that a socket class does not actually implement methods like #nextLine: such a method is abstract and is implemented once in the Stream class. The design patterns literature calls them "template methods"; I call it good OO design. A template method is expressed in terms of other methods whose behavior may be specified or changed in subclasses.
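To see the idea in miniature, here is a toy subclass of my own (not library code): it implements only the primitive methods #next and #atEnd, yet it inherits every high-level reader that Stream builds on top of them.

"A toy Stream: only #next and #atEnd are provided; template
 methods such as #next: (and #nextLine) come from Stream for free."
Stream subclass: FiveXStream [
    | count |
    FiveXStream class >> new [ ^self basicNew setCount ]
    setCount [ count := 0 ]
    atEnd [ ^count >= 5 ]
    next [ count := count + 1. ^$x ]
]

Eval [
    (FiveXStream new next: 5) printNl  "five $x characters, collected by the inherited Stream>>next:"
]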

The underlying implementations are actually FileDescriptors.

Stream
  FileDescriptor
    Sockets.AbstractSocketImpl
      Sockets.DatagramSocketImpl
        Sockets.MulticastSocketImpl
          Sockets.UDPSocketImpl
        Sockets.OOBSocketImpl
        Sockets.RawSocketImpl
          Sockets.ICMP6SocketImpl
          Sockets.ICMPSocketImpl
        Sockets.UnixDatagramSocketImpl
      Sockets.SocketImpl
        Sockets.TCPSocketImpl
        Sockets.UnixSocketImpl

Caption: Implementation class hierarchy

Again, it is quite logical: core BSD sockets are represented as file descriptors in user space (remember that everything is a file in Unix). Depending on the type of a file descriptor, calling common system calls (such as read(2), write(2), fcntl(2)) on it will invoke different code in kernel space.

Files, sockets, and I/O in general are all communication with the outside world. This cannot be implemented in pure Smalltalk: at the lowest level we have to deal with the API that the operating system provides for us. In the case of files and sockets, we work with file descriptors, which on Unix systems are plain integer values.

In GNU Smalltalk, file descriptors are represented by the FileDescriptor class. Every instance of this class holds a numeric instance variable fd, which is the actual Unix file descriptor.
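This is easy to see from a running image. A small check of my own (I am assuming the #fd accessor here; the accessor name may differ between versions):

Eval [
    | file |
    file := FileStream open: '/tmp/fd-demo.txt' mode: FileStream write.
    file fd printNl.   "the underlying Unix descriptor number, e.g. 3"
    file close
]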

All the high-level I/O methods that a programmer uses in an application are expressed in terms of low-level access methods like #fileOp:, #fileOp:ifFail:, #fileOp:with:, #fileOp:with:ifFail: and so on. These methods all call the same primitive, VMpr_FileDescriptor_fileOp, and the subsequent processing happens on the VM side. Depending on the index passed to #fileOp: from a higher-level method, a different file operation is performed.

The base socket implementation class, AbstractSocketImpl, overrides the #fileOp: methods to call the VMpr_FileDescriptor_socketOp primitive instead of VMpr_FileDescriptor_fileOp.
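The effect of that override is easy to model with ordinary methods standing in for the two primitives (a toy sketch of my own, with a made-up opcode; not the kernel code):

"The subclass reroutes the single low-level entry point, and every
 high-level method inherits the new behavior unchanged."
Object subclass: ToyFileImpl [
    fileOp: opIndex [ ^'file primitive, op ', opIndex printString ]
    size [ ^self fileOp: 9 ]   "9 is an illustrative opcode, not the real one"
]

ToyFileImpl subclass: ToySocketImpl [
    fileOp: opIndex [ ^'socket primitive, op ', opIndex printString ]
]

Eval [
    ToyFileImpl new size printNl.    "'file primitive, op 9'"
    ToySocketImpl new size printNl   "'socket primitive, op 9'"
]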

Now, after digging into the implementation details, let's return to the echo server example. If we interrupt the hung server process, we get the following stack trace:

Sockets.TCPSocketImpl(Sockets.AbstractSocketImpl)>>ensureReadable
optimized [] in Sockets.StreamSocket>>newReadBuffer:
Sockets.ReadBuffer>>atEnd
Sockets.Socket(Sockets.StreamSocket)>>peek
Sockets.Socket(Sockets.StreamSocket)>>atEnd
Sockets.Socket(Stream)>>nextLine

As we can see, our process got stuck on the call to AbstractSocketImpl>>ensureReadable, which was implicitly invoked through a chain of calls from Stream>>nextLine.

The Stream>>nextLine method does a simple thing: it checks whether data is available and reads it byte by byte until a newline character is reached.
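In sketch form it behaves roughly like this (a simplified model of my own; the real method also copes with carriage returns):

"Collect characters until a newline or the end of the stream."
Eval [
    | input line |
    input := ReadStream on: 'hello', (String with: Character nl), 'world'.
    line := WriteStream on: String new.
    [input atEnd or: [input peek = Character nl]]
        whileFalse: [line nextPut: input next].
    input atEnd ifFalse: [input next].   "consume the newline itself"
    line contents printNl                "prints 'hello'"
]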

AbstractSocketImpl>>ensureReadable is a little more interesting. It blocks the current Smalltalk thread and waits until there is data available for reading. It involves the VMpr_FileDescriptor_socketOp primitive too. Let's now go down from Smalltalk to the virtual machine side.

Asynchronous I/O for the win

Our sample server is synchronous. First it waits for a client connection, and then it waits again until the client sends us a line of text. All these operations are synchronous: we cannot do anything else inside a single Smalltalk thread while waiting for an event.

Such operations are called "blocking". If we wrote our echo server in C, we would use blocking sockets, so system calls like accept(2) and recv(2) would block our server process until a client connects or sends some data, respectively. It is a very simple and straightforward scheme that is often used in simple applications.

We could assume that GNU Smalltalk's #waitForConnection and #nextLine are implemented in the same way, since these methods provide the same blocking behavior, but actually that is not true.

GNU Smalltalk implements green threads (a.k.a. Smalltalk Processes) for multitasking inside the VM; it does not support native system threads. Calling accept(2) or recv(2) on a truly blocking socket would therefore block the entire virtual machine for the duration of the call. That is completely unacceptable, so socket I/O is implemented in a more cute way, with non-blocking sockets.
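The cooperative behavior is easy to observe. In the snippet below (my own test; it assumes port 8001 is free), one green thread parks itself in #waitForConnection while the other keeps printing, which would be impossible if the wait were a blocking accept(2):

"The forked process waits for a connection, yet the main process
 keeps running: the wait blocks only one Smalltalk process, not
 the whole VM."
Eval [
    | server |
    PackageLoader fileInPackage: #TCP.
    [server := TCP.ServerSocket port: 8001.
     server waitForConnection.
     'client arrived' printNl] fork.
    1 to: 3 do: [:i |
        ('still alive: ', i printString) printNl.
        (Delay forSeconds: 1) wait]
]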

When a Smalltalk process needs to wait for a specific event (a client connection or incoming data) on a specific socket, AbstractSocketImpl>>ensureReadable is called. #ensureReadable creates and waits on a Semaphore to block the current Smalltalk process.
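The blocking pattern itself is plain Smalltalk. Here it is in miniature (my own sketch: a Delay stands in for the VM's SIGIO machinery, which is what signals the semaphore in reality):

"The caller sleeps on a Semaphore until another process signals it."
Eval [
    | ready |
    ready := Semaphore new.
    [(Delay forSeconds: 1) wait.   "stand-in for 'data arrives'"
     ready signal] fork.
    ready wait.                    "the current process blocks here"
    'descriptor is ready, safe to read' printNl
]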

On the virtual machine side, via calls to the VMpr_FileDescriptor_socketOp primitive with operation codes 14 and 13, the following happens:

  1. A SIGIO signal handler is installed for the socket;
  2. The socket is added to the table of polled descriptors;
  3. If there is no code to execute and all Smalltalk processes are sleeping (waiting for data), sigsuspend(2) is called. In this state the virtual machine process sleeps, waiting for the arrival of any Unix signal. (I have not tested it, but I assume the VM process can handle SIGIO even without calling sigsuspend(2).)
  4. When there is activity on a file descriptor, i.e. an incoming connection or incoming data, the VM process receives SIGIO and the signal handler installed in the first step is executed;
  5. The handler checks the table of polled descriptors: for every descriptor that is ready for I/O, the VM unlocks the corresponding semaphore, and the corresponding Smalltalk process resumes its execution;
  6. The descriptor is removed from the table of polled descriptors.

Now we are back on the Smalltalk side. After resuming from #ensureReadable, we know that the descriptor is ready for I/O and that calling accept(2) or recv(2) will not block the interpreter. That's it!

A set of simple debugging printfs inserted into the VM showed that the VM really does go to sleep after the call to #nextLine. It looks like the gst process simply does not receive SIGIO on incoming data. I saw only one way to check that: debugging the NetBSD kernel.

See also: Episode II