Even though Haskell is considered to be a high-level programming language, it’s a good fit for various system-level applications and services as well, similar to how Go is used for many applications.
A lot of functionality can be implemented in high-level user-space libraries, but at some point we must interact with the system and its kernel through syscalls. In the GHC Haskell ecosystem, we use the wrappers provided by the system C library instead of directly calling into the kernel, which doesn’t take a lot of effort thanks to GHC’s excellent Foreign Function Interface support. Exposing the
int openat(int dirfd, const char *pathname, int flags)
function from libc to Haskell code is as simple as
foreign import capi safe "fcntl.h openat"
  c_openat :: CInt -> CString -> CInt -> IO CInt
Now Haskell code can call c_openat to open a file.
The above code is, however, a bit too low-level for most practical purposes:
- When openat fails, it returns -1 and errno is set. Checking whether the return value of the c_openat action is -1 is not very Haskell-esque: this is where we expect an exception to be thrown.
- The CInt type is very generic. Instead, we’d like to use a type specific to file descriptors, so a corresponding read, write or close action will take such a type as an argument, not any arbitrary CInt.
- c_openat takes a CString argument, which is a Ptr CChar. In high-level code, this is not a type we regularly work with. An API using a less clumsy path type would be welcome.
Luckily, there’s a popular library in the ecosystem which provides just that: unix. In its System.Posix.IO module, it provides
openFdAt :: Maybe Fd -> FilePath -> OpenMode -> OpenFileFlags -> IO Fd
which is a wrapper around c_openat. It uses the Fd type where applicable. When the call fails, it throws an IOError, using handy utility functions like
Foreign.C.Error.throwErrnoIfMinus1 :: (Eq a, Num a) => String -> IO a -> IO a
It takes a FilePath (a.k.a. String) as argument, internally turning this into a CString for the lifetime of the call to c_openat using
Foreign.C.String.withCString :: String -> (CString -> IO a) -> IO a
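To make the combination of these pieces concrete, here is a minimal sketch of how such a wrapper could be assembled on a POSIX system. It is not the actual implementation in unix: the FileDescriptor type and the openReadOnlyAt function are hypothetical names made up for this illustration, and the sketch always opens the file read-only instead of taking flags.

```haskell
{-# LANGUAGE CApiFFI #-}

import Foreign.C.Error (throwErrnoIfMinus1)
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CInt (..))

-- The raw binding shown earlier.
foreign import capi safe "fcntl.h openat"
  c_openat :: CInt -> CString -> CInt -> IO CInt

-- Constants are read from the C header instead of being hard-coded.
foreign import capi "fcntl.h value AT_FDCWD" atFdCwd :: CInt
foreign import capi "fcntl.h value O_RDONLY" oRdOnly :: CInt

-- Hypothetical descriptor type, so that a plain CInt cannot be passed by
-- accident. (unix ships its own Fd type for this purpose.)
newtype FileDescriptor = FileDescriptor CInt
  deriving (Eq, Show)

-- Hypothetical wrapper: opens a path read-only, relative to an optional
-- directory descriptor, and throws an IOError based on errno on failure.
openReadOnlyAt :: Maybe FileDescriptor -> FilePath -> IO FileDescriptor
openReadOnlyAt dir path =
  -- The CString only lives for the duration of the callback.
  withCString path $ \cpath ->
    FileDescriptor
      <$> throwErrnoIfMinus1 "openReadOnlyAt" (c_openat dirfd cpath oRdOnly)
  where
    -- Without a directory descriptor, resolve relative to the working directory.
    dirfd = case dir of
      Nothing -> atFdCwd
      Just (FileDescriptor fd) -> fd
```

Calling openReadOnlyAt Nothing "." should yield a descriptor for the current directory, whereas a path that does not exist results in an exception rather than a -1 the caller has to remember to check.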
Some problems with unix
The unix package has been serving the Haskell ecosystem well for many years
(and, without a doubt, for many years to come). However, it’s not without its
shortcomings:
- The FilePath type (as used in, e.g., the System.Posix.IO module) being a String comes with performance implications. Hence, the System.Posix.IO module was cloned into System.Posix.IO.ByteString and reworked to use ByteString values as paths. This duplication requires certain code changes to be applied in multiple modules.
- Neither FilePath nor ByteString is a suitable type for paths, because of encoding issues. Hence, new clones of applicable modules were created, now using PosixPath values (e.g., System.Posix.IO.PosixString), further increasing code duplication.
- The unix library exposes functions that are not necessarily available on all supported platforms, e.g., the WASM/WASI platform lacks some. Availability of library functions is checked using a configure script at build time, but when some library function is not found, the corresponding binding in unix is implemented, unconditionally, as
  ioError (ioeSetLocation unsupportedOperation "...")
  This breaks the “If it compiles, it works” mantra, since any call to the function will most definitely not work.
- unix provides somewhat-high-level access to the system functions, though one can argue the wrappers are, sometimes, too high-level, and lower-level interfaces are not made available. As an example, the openFdAt call above takes an OpenFileFlags argument, which is a structure whose fields are mapped to bits in the flags argument of openat. However, if one wants to use a flag that’s not available in OpenFileFlags (say, O_PATH on a Linux system), this is not possible. This forced me to implement another set of bindings for openat in landlock-hs: not a lot of effort, but duplication nonetheless.
- Most system functions are effectful (that’s why they exist in the first place), so a big part of the unix API lives in IO. When working with monad stacks layered on top of IO, this requires a lot of liftIOing.
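The third point is easy to demonstrate in isolation. The sketch below uses only base; missingCall and describeMissingCall are hypothetical names standing in for a binding whose C function was not found at configure time and for code that depends on it.

```haskell
import GHC.IO.Exception (unsupportedOperation)
import System.IO.Error (catchIOError, ioeGetLocation, ioeSetLocation)

-- Hypothetical binding for a function the configure script did not find.
-- It type-checks exactly like a working binding, but every call fails.
missingCall :: Int -> IO Int
missingCall _ = ioError (ioeSetLocation unsupportedOperation "missingCall")

-- Hypothetical caller: the compiler accepts it without complaint, and the
-- problem only shows up once the action is actually run.
describeMissingCall :: IO String
describeMissingCall =
  (show <$> missingCall 0) `catchIOError` \e ->
    pure ("failed in " ++ ioeGetLocation e)
```

Nothing in the type of missingCall tells a caller that it cannot succeed on this platform, which is exactly the problem described above.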
The xinu experiment
To experiment with alternative implementation strategies to expose system
functions to Haskell code, I created the xinu
package. Actually, two packages:
- xinu-ffi, which exposes FFI bindings to library functions, using a configure script to detect system capabilities. If a library function is not found at configure time, xinu-ffi will not provide a binding to it. Hence, the library API can depend on the build environment. However, a xinu-ffi.h header file is installed so dependent packages can detect availability of functions using the CPP language extension.
- xinu, and a couple of internal libraries, which expose higher-level APIs to a developer.
The xinu library, similar to unix, supports multiple path types. However,
unlike unix, this is not implemented by copying the modules. Instead, it
leverages GHC’s
Backpack feature,
which brings ML module functor-like functionality to Haskell. Basically,
Backpack allows us to write code, abstracted over a module for which we only
provide the signature (i.e., the types and functions it must expose). Then, at
build time, we can specify one or more implementations of this signature and
create instantiations of the abstract module. Hence, without any code
duplication, there’s System.Xinu.IO.FilePath and System.Xinu.IO.ByteString,
both instantiations of System.Xinu.IO.Abstract (over
System.Xinu.Path.FilePath and System.Xinu.Path.ByteString). The latter are
two modules implementing a rather simple signature:
signature System.Xinu.Path (
    Path
  , toString
  , withPath
  ) where

import Control.Monad.Catch (MonadMask)
import Control.Monad.IO.Class (MonadIO)
import Foreign.C.String (CString)

data Path

-- Execute an action, passing the given Path as a CString.
withPath :: (MonadIO m, MonadMask m) => Path -> (CString -> m a) -> m a

-- Used for error reporting.
toString :: Path -> String
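For illustration, a module satisfying this signature for FilePath could look roughly like the sketch below. This is an assumption about the general shape, not the code shipped in xinu: the module header is left out so the snippet stands alone, it depends on the exceptions package for bracket, and roundTrip is a hypothetical helper added purely to show withPath in use.

```haskell
import Control.Monad.Catch (MonadMask, bracket)
import Control.Monad.IO.Class (MonadIO, liftIO)
import Foreign.C.String (CString, newCString, peekCString)
import Foreign.Marshal.Alloc (free)

-- The abstract Path of the signature is filled in with a type synonym.
type Path = FilePath

-- Allocate a C string for the path, run the action, and free the string
-- afterwards, even if the action throws.
withPath :: (MonadIO m, MonadMask m) => Path -> (CString -> m a) -> m a
withPath path = bracket (liftIO (newCString path)) (liftIO . free)

-- A FilePath already is a String.
toString :: Path -> String
toString = id

-- Hypothetical helper: marshal a path to C and read it back.
roundTrip :: Path -> IO String
roundTrip path = withPath path peekCString
```

A ByteString-based implementation would keep the same three names and only change how the CString is obtained, which is what lets the abstract module be reused unchanged.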
Unlike the unix library, xinu exposes a different API depending on which
library functions are available in the build environment (based on the findings of
xinu-ffi’s configure script). Similar to xinu-ffi, it provides a xinu.h
header file, so dependents who care about availability of functions (e.g., to
provide different implementations based on what’s available) can use CPP.
Finally, xinu functions aren’t tied to IO (though they’re SPECIALIZEd for it).
Instead, they rely on the MonadIO type-class to liftIO xinu-ffi functions
where necessary. Furthermore, errors are reported and (where applicable) safely
handled using the MonadThrow and MonadMask type-classes from the
exceptions package. This
allows xinu functions to be used within arbitrary monad stacks (assuming
instances of MonadIO and MonadThrow/MonadMask are present).
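The effect of this design on calling code can be sketched with a much simpler system call. The example below is not taken from xinu: getProcessId and describeProcess are hypothetical names, it binds getpid instead of openat, it assumes a POSIX system, and it leaves out error handling (and thus MonadThrow/MonadMask) entirely.

```haskell
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import System.Posix.Types (CPid (..))

-- Low-level binding, living in IO like any FFI import.
foreign import ccall unsafe "getpid"
  c_getpid :: IO CPid

-- Hypothetical higher-level function: usable in any monad with a MonadIO
-- instance, with a specialised copy for plain IO.
getProcessId :: MonadIO m => m CPid
getProcessId = liftIO c_getpid
{-# SPECIALIZE getProcessId :: IO CPid #-}

-- Inside a monad stack, no liftIO is needed at the call site.
describeProcess :: ReaderT String IO String
describeProcess = do
  label <- ask
  pid <- getProcessId
  pure (label ++ ": " ++ show pid)
```

Running runReaderT describeProcess "pid" produces the label followed by the current process ID, and the same getProcessId can still be called directly from IO.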
So, xinu-ffi exposes
System.Xinu.IO.FFI.c_openat :: Int32 -> CString -> Int32 -> Word32 -> IO Int32
while xinu exposes, among others,
System.Xinu.IO.FilePath.openat :: (MonadIO m, MonadMask m)
                               => Maybe Int32
                               -> FilePath
                               -> Int32
                               -> Maybe Word32
                               -> m Int32
and
System.Xinu.IO.ByteString.openat :: (MonadIO m, MonadMask m)
                                 => Maybe Int32
                                 -> ByteString
                                 -> Int32
                                 -> Maybe Word32
                                 -> m Int32
Of course, the intent is to use higher-level types (like Fd) instead. This
should be fairly simple to do, especially using
coerce
since we expect (or rather, require) Fd to have the same representation as Int32
(at least on this platform).
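A small sketch of what that could look like follows. It assumes a platform where CInt is a newtype over Int32, and rawNextDescriptor is a hypothetical stand-in for an Int32-based function; it performs no system call at all.

```haskell
import Data.Coerce (coerce)
import Data.Int (Int32)
import Foreign.C.Types (CInt (..))
import System.Posix.Types (Fd (..))

-- Hypothetical stand-in for a low-level, Int32-based function.
rawNextDescriptor :: Maybe Int32 -> IO Int32
rawNextDescriptor = pure . maybe 0 (+ 1)

-- The same function at a higher-level type. coerce can unwrap Fd (a newtype
-- over CInt) and CInt (a newtype over Int32 here) because both constructors
-- are imported, so no wrapper code is needed.
nextDescriptor :: Maybe Fd -> IO Fd
nextDescriptor = coerce rawNextDescriptor
```

If CInt were not represented by Int32 on some platform, this definition would simply fail to compile there, which fits the "require" in the paragraph above. Note that this works because IO is a concrete type here; for a function polymorphic in its monad, something like fmap coerce would likely be needed on the result instead.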
Learnings and Questions
It’s a bit too soon to know where this experiment could be heading. However, some early experiences:
- Backpack doesn’t work when a package has Build-Type: Configure, which is a bummer: it forces xinu-ffi to be a separate package, which increases maintenance burden. If it could be a (public) internal library, this would make things quite a bit easier.
- A multi-package repository (using cabal.project) and Backpack seem to trigger a dependency resolution issue in Cabal, causing cabal install --dependencies-only all to fail.
- stack doesn’t support Backpack. This is a known issue but limits adoption of Backpack in the ecosystem, which is a shame, since ML-style module functors are a great way to reduce code duplication without incurring any runtime performance overhead.
- Haddock doesn’t like Mixins, internal libraries and Reexported-Modules. Basically, no decent API documentation of xinu can be generated. There are several related issues filed upstream.
- Given the new OsPath family of types, does it make sense to keep providing FilePath, ByteString and other implementations?
- More tests are needed to ensure exception handling is working as desired. How is this code different from the unliftio package?
Conclusion
Even though xinu is in no way meant to be as extensive as unix, it’s always
interesting to explore different approaches to a given problem, especially when
new functionality can be leveraged which wasn’t available when the original
solution was coded.
I’d love to hear your feedback. Would a library like xinu be of any use to
you? What’s missing? Head to the repository’s
Discussions section!