In the previous post I described how to build GNU Smalltalk on NetBSD, a fascinating operating system. The interpreter worked fine, but I wanted something more than just simple scripts.
So I tried to run Seaside.
netstat showed that port 8080 was open, but I could not reach http://localhost:8080/seaside in the browser.
My first suspicion fell on sockets. Of course, it would be hard to debug sockets through tools as complicated as Swazoo and Seaside, so I took Samuel Montgomery-Blinn’s simple TCP echo server example for the tests. I simplified the code slightly so that it runs in a single green thread, serves a single client and handles a single message:
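    Eval [
        | client server string |
        "Load the TCP package and listen on port 8000."
        PackageLoader fileInPackage: #TCP.
        server := TCP.ServerSocket port: 8000.
        "Wait for a single client and accept the connection."
        server waitForConnection.
        client := server accept.
        "Read one line and echo it back, followed by a newline."
        string := client nextLine.
        client nextPutAll: string; nextPut: Character nl.
        client flush.
        client close.
    ]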
This sample works fine on GNU/Linux, but does not work on NetBSD. I successfully connected to port 8000 with telnet, but after typing a message and hitting Enter I got no echo back from the server. The server process just hung in memory.
Great, it is time to take a look under the hood and to understand how GNU Smalltalk sockets work.
Sockets: it is streams all the way down
GNU Smalltalk sockets are implemented in a cute way. The “end-user” objects are not actually sockets; they are just adaptors that implement the Stream interface on top of concrete socket implementations.
    Stream
        Sockets.AbstractSocket
            Sockets.DatagramSocket
                Sockets.MulticastSocket
            Sockets.ServerSocket
            Sockets.StreamSocket
                Sockets.Socket
Caption: End-user class hierarchy
Obviously, a socket class does not implement methods like #nextLine itself: such methods are generic and implemented once in the Stream class. Design patterns call these “template methods”, I call it good OO design. Template methods are expressed in terms of other methods whose behavior may be specified or changed in subclasses.
The underlying implementations are actually file descriptors:
    Stream
        FileDescriptor
            Sockets.AbstractSocketImpl
                Sockets.DatagramSocketImpl
                    Sockets.MulticastSocketImpl
                        Sockets.UDPSocketImpl
                    Sockets.OOBSocketImpl
                    Sockets.RawSocketImpl
                        Sockets.ICMP6SocketImpl
                        Sockets.ICMPSocketImpl
                    Sockets.UnixDatagramSocketImpl
                Sockets.SocketImpl
                    Sockets.TCPSocketImpl
                    Sockets.UnixSocketImpl
Caption: Implementation class hierarchy
Again, this is quite logical: the core BSD sockets are represented as file descriptors in user space (remember that everything is a file in Unix). Depending on the type of the file descriptor, calling a common system call (such as fcntl(2)) on it invokes different code in kernel space.
Files, sockets and all other I/O are communication with the outside world. It cannot be implemented in pure Smalltalk: at the lowest level we have to deal with the API that the operating system provides. In the case of files and sockets we work with file descriptors, which are integer values on Unix systems.
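To make the “everything is a file descriptor” point concrete, here is a minimal C sketch. It is not code from GNU Smalltalk, and fd_is_socket and open_sample_fds are names made up for this illustration. It asks the kernel, via fstat(2), what kind of object hides behind a plain integer descriptor:

```c
#define _DEFAULT_SOURCE   /* glibc: expose S_ISSOCK even under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if fd refers to a socket, 0 if it refers to something else,
   and -1 if fstat(2) fails (for example, on a closed descriptor). */
int fd_is_socket(int fd)
{
    struct stat st;

    assert(fd >= 0);              /* a negative descriptor is a caller bug */
    if (fstat(fd, &st) != 0)
        return -1;
    return S_ISSOCK(st.st_mode) ? 1 : 0;
}

/* Hypothetical helper: opens one descriptor of each kind.
   Both results are ordinary ints. Returns 0 on success, -1 on error. */
int open_sample_fds(int *file_fd, int *sock_fd)
{
    int sv[2];

    *file_fd = open("/dev/null", O_RDONLY);
    if (*file_fd < 0)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        close(*file_fd);
        return -1;
    }
    close(sv[1]);                 /* one end is enough for the demonstration */
    *sock_fd = sv[0];
    return 0;
}
```

The same integer type is used for both descriptors; only the kernel knows which code path stands behind each of them.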
In GNU Smalltalk, file descriptors are represented by the FileDescriptor class. Every object of this class holds a numeric instance variable fd, which is the actual Unix file descriptor.
All the high-level I/O methods that the programmer uses in an application are expressed in terms of low-level access methods like #fileOp:with:ifFail: and so on. These methods all call the same primitive, VMpr_FileDescriptor_fileOp, and the rest of the processing happens on the VM side. Depending on the index passed to #fileOp: from the higher-level method, a different file operation is performed.
The basic socket implementation class AbstractSocketImpl overrides the #fileOp: methods to call the VMpr_FileDescriptor_socketOp primitive instead of VMpr_FileDescriptor_fileOp.
Now, after digging into the implementation details, let’s return to the echo server example. If we interrupt the hung server process, we get the following stack trace:
    Sockets.TCPSocketImpl(Sockets.AbstractSocketImpl)>>ensureReadable
    optimized [] in Sockets.StreamSocket>>newReadBuffer:
    Sockets.ReadBuffer>>atEnd
    Sockets.Socket(Sockets.StreamSocket)>>peek
    Sockets.Socket(Sockets.StreamSocket)>>atEnd
    Sockets.Socket(Stream)>>nextLine
As we can see, our process is stuck in the call to AbstractSocketImpl>>ensureReadable, which was implicitly invoked via a chain of calls from Stream>>nextLine.
The Stream>>nextLine method does a simple thing: it checks whether there is data available and reads it byte by byte until a newline character is reached.
AbstractSocketImpl>>ensureReadable is a little more interesting. It blocks the current Smalltalk thread and waits until there is data available for reading. It involves the VMpr_FileDescriptor_socketOp primitive too. Let’s now go down from Smalltalk to the virtual machine side.
Asynchronous I/O for the win
Our sample server is synchronous. First it waits for a client connection, and then it waits again until the client sends us a line of text. All these operations are synchronous: we cannot do anything else inside a single Smalltalk thread while waiting for an event.
Such operations are called “blocking”. If we wrote our echo server in C, we would use blocking sockets, so system calls like accept(2) and recv(2) would block our server process until a client connects and sends some data, respectively. It is a very simple and straightforward scheme that is often used in simple applications.
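For comparison, the blocking version of the echo step might look roughly like this in C. This is only a sketch under a few assumptions: echo_one_line is a hypothetical helper, the client socket is assumed to have been returned by accept(2) already, and unlike the Smalltalk example it keeps the newline in the buffer instead of stripping it and adding it back.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Read one line from a connected, blocking socket and send it back.
   If the buffer fills up before a newline arrives, the partial line is echoed.
   Returns the number of bytes echoed, or -1 on error or early EOF. */
ssize_t echo_one_line(int client, char *buf, size_t cap)
{
    size_t len = 0;
    size_t sent = 0;

    assert(buf != NULL && cap > 0);

    /* Consume input byte by byte up to and including the newline. */
    while (len < cap) {
        ssize_t n = recv(client, &buf[len], 1, 0);  /* sleeps until data arrives */
        if (n <= 0)
            return -1;  /* error, or the peer closed before sending '\n' */
        if (buf[len++] == '\n')
            break;
    }

    /* Send the line back; send(2) may accept fewer bytes than requested. */
    while (sent < len) {
        ssize_t n = send(client, buf + sent, len - sent, 0);
        if (n < 0)
            return -1;
        sent += (size_t)n;
    }
    return (ssize_t)len;
}
```

While recv(2) sleeps here, the whole process sleeps with it, which is fine for a single-purpose C program and exactly the problem for a VM that runs many green threads.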
We could assume that GNU Smalltalk’s #nextLine is implemented in the same way, since the method gives us the same blocking behavior, but that is not actually true.
GNU Smalltalk implements green threads (aka Smalltalk Processes) for multitasking inside the VM and does not support native system threads, so calling recv(2) on a truly blocking socket would block the entire virtual machine for the duration of the call. That is completely unacceptable, so socket I/O is implemented in a cuter way, with non-blocking sockets.
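What “non-blocking” means at the system-call level can be shown with a short C sketch. It is not taken from the GNU Smalltalk sources; set_nonblocking and try_recv are hypothetical helper names, and the -2 return value is a convention invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Switch fd to non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Try to read without ever putting the process to sleep.
   Returns the byte count (0 means EOF), -1 on a real error,
   or -2 when there is simply no data yet. */
ssize_t try_recv(int fd, void *buf, size_t cap)
{
    ssize_t n;

    assert(buf != NULL);
    n = recv(fd, buf, cap, 0);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

With a socket in this mode, a single-threaded VM never gets stuck inside recv(2); what remains is the question of how it learns when to try again.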
When a Smalltalk process needs to wait for a specific event (a client connection or incoming data) on a specific socket, AbstractSocketImpl>>ensureReadable is called. #ensureReadable creates a Semaphore and waits on it to block the current Smalltalk process.
On the virtual machine side, via calls to the VMpr_FileDescriptor_socketOp primitive with operation codes 14 and 13, the following happens:
- A SIGIO signal handler is installed for the socket;
- The socket is added to a table of polled descriptors;
- If there is no code to execute and all Smalltalk processes are sleeping (waiting for data), sigsuspend(2) is called. In this state the virtual machine process sleeps until any Unix signal arrives. I have not tested it, but I assume that the VM process can handle SIGIO even without calling sigsuspend(2);
- If there is activity on a file descriptor, i.e. an incoming connection or data, the VM process receives SIGIO and the signal handler (installed in the first step) is executed;
- The handler checks the table of polled descriptors. For every descriptor that is ready for I/O, the VM unlocks the corresponding semaphore and the corresponding Smalltalk process resumes execution;
- The descriptor is removed from the table of polled descriptors.
Now we are back on the Smalltalk side. After resuming from #ensureReadable, we know that the descriptor is ready for I/O and that calling recv(2) will not block the interpreter. That’s it!
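The general SIGIO technique described above can be sketched in C for a single descriptor. This is an illustration of the mechanism only, not the VM’s actual code: watch_fd, sleep_until_readable and on_sigio are hypothetical names, there is no table of descriptors and no semaphore, and I assume a platform where O_ASYNC and F_SETOWN are available for sockets.

```c
#define _DEFAULT_SOURCE   /* glibc: expose O_ASYNC even under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static volatile sig_atomic_t sigio_seen = 0;

static void on_sigio(int signo)
{
    (void)signo;
    sigio_seen = 1;   /* only set a flag; the real work happens outside */
}

/* Ask the kernel to send SIGIO to this process on activity on fd.
   SIGIO is left blocked so it can only be delivered inside sigsuspend(2).
   Returns 0 on success, -1 on error. */
int watch_fd(int fd)
{
    struct sigaction sa;
    sigset_t block;
    int flags;

    assert(fd >= 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigio;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGIO, &sa, NULL) < 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGIO);
    if (sigprocmask(SIG_BLOCK, &block, NULL) < 0)
        return -1;

    if (fcntl(fd, F_SETOWN, getpid()) < 0)   /* deliver SIGIO to us */
        return -1;
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_ASYNC | O_NONBLOCK) < 0)
        return -1;
    return 0;
}

/* Sleep until SIGIO arrives, then report whether fd is readable (1) or not (0). */
int sleep_until_readable(int fd)
{
    sigset_t wait_mask;
    struct pollfd pfd;

    /* Current mask minus SIGIO: the signal is deliverable only while we sleep. */
    sigprocmask(SIG_BLOCK, NULL, &wait_mask);
    sigdelset(&wait_mask, SIGIO);

    while (!sigio_seen)
        sigsuspend(&wait_mask);
    sigio_seen = 0;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    return (poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

The whole scheme depends on the kernel actually delivering SIGIO when data arrives: if it never does, sleep_until_readable sleeps forever, which looks just like the hung echo server.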
A few debugging printfs inserted into the VM showed that the VM really does go to sleep after the call to #nextLine. It looks like the gst process simply does not receive SIGIO on incoming data. I saw only one way to check that: debugging the NetBSD kernel.
See also: Episode II