apropos

A Breakneck Guide to Nim

Note: This article is a work-in-progress with many unfinished and entirely missing sections.


Nim is a general-purpose programming language designed by Andreas Rumpf (Araq). It can be variously described as all of the following:

Code written in Nim looks like this:

import std/strformat

type Person = object
  name: string
  age: Natural # Ensures the age is non-negative

let people = @[
  Person(name: "John", age: 45),
  Person(name: "Kate", age: 30)
]

proc printAges(people: seq[Person]) =
  for person in people:
    echo fmt"{person.name} is {person.age} years old"

printAges(people)

Without further ado, let’s jump right into it.

Design Decisions

significant whitespace

Perhaps the most obvious feature of Nim is its syntax: it looks like Python! Where’d all the brackets go? Statement blocks in Nim are determined through significant whitespace.

import std/sugar
func takesALambda(a: (string, string) -> string) =
  discard # implementation omitted

# regular syntax for multi-line lambdas
takesALambda(
  func (a, b: string): string =
    return a & b # strings are concatenated with `&`, not `+`
)

# same example using syntax sugar. types can be elided!
takesALambda((a, b) => (a & b))

More specifically: significant whitespace is only consulted outside of expressions, where it determines the scope of the next line. Because indentation does not determine scope inside an expression, a multi-line expression must be broken after an operator: the trailing operator is what tells the parser that the expression continues on the next line.

# break long lines like this...
hereAreSomeQuiteLongFunctions() + andYetAnother() +
  thatReturnValuesForthwith() # indentation here may vary for aesthetics

# not like this! compilation error
hereAreSomeQuiteLongFunctions() + andYetAnother()
  + thatReturnValuesForthwith() # the above is a complete expression

Standard indentation practice uses two spaces, but any (consistent) number works.

Indenting with tabs in Nim is disallowed at the compiler level. I consider this an excellent design decision. If you absolutely must: adding #? replace(sub = "\t", by = "  ") as the first line of any Nim file will cause the compiler to treat tabulation characters as two spaces when compiling.

uniform function call syntax

A particularly unique feature of Nim (okay, not unique - D did it first) is what is known as uniform function call syntax (or UFCS for short).

In short, the following statements are equivalent under UFCS:

let a = "Hello, UFCS!"
echo a.len()
echo len(a)
echo a.len

This is made possible by the revelation that there isn’t all that much difference between a method call, a function call that takes a class, and an inherent property of a type. To quote a community member: “How many times have you pondered whether an operation should be a function, member, or method? It’s just a distracting detail with no benefit. And now you don’t need to care!”

This makes many things much nicer in practice. Function chaining, in particular, is now easy: a.foo().bar().baz()
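To make the chaining point concrete, here is a minimal sketch. The shout helper is hypothetical and exists only for this example; strip and toUpperAscii come from std/strutils.

```nim
import std/strutils

# `shout` is a hypothetical helper, defined only for this example.
func shout(s: string): string =
  s.toUpperAscii & "!"

let greeting = "  hello, ufcs  "

# Nested call syntax reads inside-out...
let nested = shout(strip(greeting))

# ...while the UFCS chain reads left to right.
let chained = greeting.strip.shout

doAssert nested == chained
echo chained
```

Both spellings compile to the same calls: the chain is purely a reading aid.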

style insensitivity

Nim is partially style insensitive.

In other words: identifiers in Nim are considered equal if - aside from their first character - they match with underscores removed and when taken to lowercase. This first character exception is so that code like let dog: Dog can be written.

In code:

import std/strutils # for replace and toLowerAscii

func same(a, b: string): bool =
  a[0] == b[0] and
  a.replace("_", "").toLowerAscii == b.replace("_", "").toLowerAscii
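The same rule can be watched in action on real identifiers. This is a small sketch that assumes the default compiler settings (no --styleCheck flag passed); the name fooBar is made up for the example.

```nim
let fooBar = 1

# All of these spellings resolve to the same identifier as `fooBar`:
doAssert foo_bar == 1
doAssert foobar == 1
doAssert fooBAR == 1

# `FooBar` does NOT match: the first character is compared exactly.
doAssert not declared(FooBar)
```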

This has proven to be somewhat of a controversial feature. Critics say it hampers IDE support, breaks tooling, makes reading documentation harder, and can cause consistency issues. Proponents say language servers handle it fine, alternative tooling is available, you get used to reading documentation, and it helps with codebase consistency.

I like it quite a lot. Being able to adopt a consistent snake_case or camelCase style in your codebase regardless of what external libraries do is a great boon, and the optional (but likely to become default) --styleCheck flag can report inconsistencies as hints or errors. I would encourage anyone to try it out for a little while before flaming it.

Basic Syntax

let var const

There are three assignment keywords in Nim: let (for immutable variables), var (for mutable variables), and const (for compile-time evaluated constants).

const testValues = [1, 0, 25]

let immutable = "This variable cannot be changed."

var mutable: string
mutable = stdin.readLine()
mutable = "Disregarding user input..."

The = operator is used for assignment and reassignment.

Note that you can declare a (mutable) variable without assigning anything to it. It then takes a default value that depends on its type (0 for integers, "" for strings, and so on). The exceptions are ref and ptr values, which start out as nil: more on nil/notnil later.
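A quick sketch of what those defaults look like in practice. The variable names are arbitrary, and the assertions simply restate the zero value of each type.

```nim
var count: int
var ratio: float
var name: string
var flag: bool
var items: seq[int]
var node: ref int

doAssert count == 0
doAssert ratio == 0.0
doAssert name == ""
doAssert flag == false
doAssert items.len == 0
doAssert node.isNil # ref and ptr variables start out as nil
```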

statements vs. expressions

conditional assignment

more on indentation

comments

Comments are prefixed with the pound sign #.

Documentation comments are prefixed with ##. Common convention (and the one used by nim doc) is to put documentation comments directly beneath function signatures or type declarations.

Multiline comments are made with #[ ... ]# and can be nested.
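The three comment styles side by side, as a small sketch (the double procedure is a made-up example):

```nim
proc double(n: int): int =
  ## Returns `n` multiplied by two.
  ## `nim doc` picks this comment up for the generated documentation.
  n * 2 # a regular comment: ignored by `nim doc`

#[
  A multiline comment.
  #[ Multiline comments can be nested. ]#
  Still inside the outer comment.
]#

echo double(21)
```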

imports and includes

logic and operations

Control Flow

if / elif / else

when / else

when statements are statically (compile-time) evaluated if statements. The else keyword can be used with them.

There's not much else to say about when statements themselves, but it helps to see examples of the kinds of conditions they can evaluate.

import std/strutils

# check the target platform or backend
when defined(macosx):
  discard # macOS-specific code here
when defined(js):
  discard # code for the JavaScript backend here

# check if the file is compiled with `-d:release`
when not defined(release):
  discard # do some debug code here

# check if some code compiles with no errors
when compiles(3 + 4):
  discard # the `+` operation is defined for integers

# check whether a library provides a certain feature
when not declared(strutils.toUpper):
  discard # let's provide our own, then

case / of

The case statement allows for basic compiler-checked pattern matching. A case statement must handle all possibilities.

var x = stdin.readChar()
case x
of 'a'..'z', 'A'..'Z':
  echo "A letter!"
of '0'..'9':
  echo "A number!"
else:
  echo "Something else!"

Idiomatic Nim neither puts a colon after the case parameter nor indents the of blocks. Both of those are, however, valid Nim. (This may change in the future).
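The "must handle all possibilities" rule is easiest to see with an enum. In this sketch (the Direction type and turnRight function are invented for illustration) every value is covered, so no else branch is needed; deleting any of branch should turn it into a compile-time error.

```nim
type Direction = enum
  North, East, South, West

func turnRight(d: Direction): Direction =
  # Every enum value is covered, so no `else` branch is needed.
  case d
  of North: East
  of East: South
  of South: West
  of West: North

echo turnRight(North)
```

Note that case is used as an expression here: the value of the matching branch becomes the function's return value.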

for / in

todo

while

todo

block / break / continue

todo

try / finally / except / raise

todo

Type System

Nim has a static (i.e. compile-time evaluated) and comprehensive type system.

basic types

ints

floats

chars

bools

strings

object types

object variants

By combining the case statement with Nim’s object types, it is possible to create what are known as object variants. Object variants can have different fields depending on the value of the matched field. These are also known by a wide variety of other names, including variant types, tagged unions, and discriminated unions.

This is best explained with an example:

import std/tables

type NodeKind = enum
  Text, Element

type Node = ref object
  x, y: float
  width, height: float
  case kind: NodeKind # the discriminator field needs its type spelled out
  of Text:
    text: string
  of Element:
    tag: string
    attributes: Table[string, string]
    children: seq[Node]

In many cases, variant types provide a more idiomatic alternative to generics. However, they have their limitations: field names may not be reused across cases, and the kind of the variant is just a field within the object rather than a higher-level identifier as in Rust’s enums.
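Here is a rough sketch of how such a variant might be constructed and consumed. It uses a trimmed-down Node (no layout fields or attributes) to stay short, and the render function is purely illustrative.

```nim
type
  NodeKind = enum
    Text, Element

  Node = ref object
    case kind: NodeKind
    of Text:
      text: string
    of Element:
      tag: string
      children: seq[Node]

func render(node: Node): string =
  # Branch on the discriminator, then touch only that branch's fields.
  case node.kind
  of Text:
    result = node.text
  of Element:
    result = "<" & node.tag & ">"
    for child in node.children:
      result.add render(child)
    result.add "</" & node.tag & ">"

# The `kind` is chosen at construction time, alongside the matching fields.
let tree = Node(kind: Element, tag: "p", children: @[
  Node(kind: Text, text: "Hello, "),
  Node(kind: Element, tag: "b", children: @[Node(kind: Text, text: "Nim")])
])

echo render(tree)
```

Accessing a field that belongs to a different branch (say, tree.text here) is not caught by the type checker; with the default runtime checks enabled it is reported as a defect when the program runs.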

openarrays

Often, it’s helpful to write code that can deal with multiple kinds of iterable types: for example, a function that prints out every element in an array or a sequence. While generics are powerful, they run into a limitation here: arrays are sized! We would have to explicitly parametrize over every size of the array that we want to use. Furthermore, even disregarding arrays, it is frequently helpful to be generic over both sequences and strings, and annotating : string | seq[T] every time gets old.

Openarrays solve this! The special openArray[T] parameter type accepts arrays of any length and sequences, whatever their element type T is, and strings are accepted as openArray[char]. Openarrays are only available as parameter types.
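A short sketch of one signature covering all of those cases. The sum and countSpaces procedures are made up for the example.

```nim
proc sum(values: openArray[int]): int =
  for v in values:
    result += v

proc countSpaces(chars: openArray[char]): int =
  for c in chars:
    if c == ' ':
      inc result

doAssert sum([1, 2, 3]) == 6           # array[3, int]
doAssert sum([10, 20, 30, 40]) == 100  # array[4, int]
doAssert sum(@[4, 5]) == 9             # seq[int]
doAssert countSpaces("a b c") == 2     # string, passed as openArray[char]
doAssert countSpaces(@['x', ' ']) == 1 # seq[char]
```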

Functions and Procedures

procedures

What are typically known as functions in other languages are known in Nim as procedures.

Procedures use the proc keyword, followed by a name, (optional) parameters, an (optional) return type, and the procedure body.

proc plusOrMinus(a: bool, b, c: int): int =
  if a:
    return b + c
  else:
    return b - c

# You don't even need parentheses if your procedure doesn't take parameters!
proc anotherProcedure =
  echo "This procedure doesn't return anything."

functions

What Nim considers functions are typically known as pure functions in other languages. Functions are declared identically to procedures, only with the func keyword instead of the proc keyword. Functions are statically guaranteed by the compiler to have no side effects.

Side effects are considered to be any action modifying state that the function cannot reach through its own parameters. This includes modifying a global variable declared outside of the function, and I/O. Modifying : var T parameters (more on those later) is allowed inside a func, since the mutation is spelled out in the signature.

Side effects do not currently include the modification of a ref type (more on those later), but this behavior is expected to change in the near future. See: {.experimental: "strictFuncs".}

As a special exception, the debugEcho procedure is not considered to have side effects - despite dealing with I/O - through compiler magic. This is to allow for easier debugging of pure functions.
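A small sketch of where the compiler draws the line. The names are invented for the example, and the commented-out lines are the ones I would expect the compiler to reject inside a func.

```nim
var counter = 0

func square(n: int): int =
  debugEcho "squaring ", n # allowed: `debugEcho` is exempt from the check
  # echo "squaring ", n    # rejected: `echo` performs I/O
  # inc counter            # rejected: modifies a global variable
  n * n

proc squareAndCount(n: int): int =
  inc counter # fine in a `proc`
  n * n

echo square(4)
echo squareAndCount(3)
```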

Note that this guide has used the terms procedure and function interchangeably, and will continue to do so.

return and result

The return keyword returns the provided value and instantly exits the function, just like many other languages.

While you can simply return out of a procedure any time, Nim also provides an implicit result variable.

The result variable is initialized to the default value of the return type at the beginning of the function’s scope. If nothing has been explicitly returned by the end of the scope, the current value of result is returned. This allows for writing cleaner, more idiomatic code.

An example of return and result is as follows:
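A minimal sketch, with two made-up procedures: one that leans on return for an early exit, and one that builds up result and never returns explicitly.

```nim
proc firstNegative(values: seq[int]): int =
  # `return` exits as soon as a match is found.
  for v in values:
    if v < 0:
      return v
  return 0

proc evens(values: seq[int]): seq[int] =
  # `result` starts out as an empty seq[int] and is returned implicitly.
  for v in values:
    if v mod 2 == 0:
      result.add v

doAssert firstNegative(@[3, -1, -7]) == -1
doAssert firstNegative(@[1, 2]) == 0
doAssert evens(@[1, 2, 3, 4]) == @[2, 4]
```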

discard

The discard keyword allows for calling a procedure that returns a value without doing anything with that value. This is best explained with an example:

proc returns(): bool =
  echo "This procedure runs some code and returns a value."
  return true

# fails: expression `returns()` is of type `bool` and has to be used...
returns()

# compiles: ... or discarded
discard returns()

parameters

Every parameter must have a type. Multiple adjacent parameters sharing a type can have their types elided.

proc subtract(a: int, b: int): int =
  return a - b

proc supersubtract(a, b, c: int): int =
  return a - b - c

Multiple parameters of the same type… the auto feature… however, this is broadly considered to be an anti-pattern.

ref… copied in…

sink parameters

mutable parameters

Parameters are immutable by default and passed by value (Copyed, for Rust programmers). The var keyword is reused in function signatures to denote a mutable parameter. This is best explained with an example:

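proc immutableParameters(a: bool) =
  a = true # Error: `a` cannot be assigned to

proc mutableParameters(a: var bool) =
  a = true

var a = false # must be `var`: a `let` cannot be passed as a `var` parameter
immutableParameters(a) # never gets this far: the proc body above fails to compile
mutableParameters(a) # Compiles fine, a is now true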

The compiler will try and optimize copies into moves, and can be helped out some by the programmer. More on that later.

Note that ref types behave somewhat unintuitively as parameters. A ref type is simply an automatically-managed pointer to some memory. The pointer (memory address), not the memory itself, is copied into the function. This makes the data of ref types mutable without the var annotation.

A var on a ref type parameter, then, lets you change which group of data the caller's ref variable is pointing to. This is rarely what you want.
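The difference between the two is easier to see than to describe. A sketch, with a made-up Counter type and two made-up procedures:

```nim
type Counter = ref object
  value: int

proc bump(c: Counter) =
  # No `var` needed: the pointer is copied, the data behind it is shared.
  c.value += 1

proc swapOut(c: var Counter) =
  # `var` lets us repoint the caller's variable at a brand new object.
  c = Counter(value: 100)

var counter = Counter(value: 1)
let alias = counter # a second reference to the same object

bump(counter)
doAssert counter.value == 2
doAssert alias.value == 2 # same object, so the change is visible here too

swapOut(counter)
doAssert counter.value == 100
doAssert alias.value == 2 # still points at the old object
```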

static parameters

The static keyword is also reused in function signatures to denote a parameter that must be known at compile time. This is best explained with an example:
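A minimal sketch, assuming a made-up repeatChar procedure: because n is static, it can be inspected with when inside the body, and passing a runtime value is rejected.

```nim
proc repeatChar(c: char, n: static int): string =
  # `n` is a compile-time constant here, so it can be used in `when`.
  when n < 0:
    {.error: "n must not be negative".}
  for _ in 0 ..< n:
    result.add c

const width = 3
doAssert repeatChar('-', width) == "---"
doAssert repeatChar('=', 2 + 2) == "===="

# var runtimeWidth = 3
# discard repeatChar('-', runtimeWidth) # rejected: not known at compile time
```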


Note that a: static T is shorthand for a: static[T].

varargs

The varargs keyword allows you to specify that a function can take a dynamic number of parameters. Only one parameter can be varargs, and it must be the last parameter in the function signature. This is best explained with an example:
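A small sketch, using a made-up total procedure with one ordinary parameter followed by the varargs parameter:

```nim
proc total(label: string, numbers: varargs[int]): string =
  var sum = 0
  for n in numbers: # inside the body, varargs behaves like an openArray
    sum += n
  label & ": " & $sum

doAssert total("none") == "none: 0"
doAssert total("some", 1, 2, 3) == "some: 6"
doAssert total("array", [4, 5]) == "array: 9" # an array is accepted too
```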

Metaprogramming

todo.

Interop

todo.

Memory Management

Nim’s memory management strategy is optimized reference counting with a cycle breaker. This may surprise some people, because one of Nim’s primary design goals is being efficient, and reference counting is typically considered to be less efficient than tracing GCs.

Nim’s version of reference counting (called ARC/ORC) brings to the table two things, however:

  1. Hard determinism
  2. Optimizing reference counts away with move semantics.

The former (hard determinism) comes from ARC/ORC not doing any sort of magic with deferred reference counts, and instead injecting destructors into the generated code. These injected destructors also provide other niceties, such as automagically-closing file streams.

The latter (optimizing reference counts away) comes from the move semantics used by Nim, Lobster, and Rust. Reference counting typically incurs a fairly substantial overhead: a counter increment every time an object is referenced, and a counter decrement + check every time something that references the object goes out of scope. However: if you can statically prove that no references exceed an object’s lifetime in the compilation step, then you don’t need that reference count.

This is the guiding principle behind Rust’s borrow checker…
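As a loose illustration of the move semantics mentioned above (using a seq rather than a ref, and a made-up consume procedure), here is a sketch of handing a value over instead of copying it. The sink annotation and the move procedure are real Nim features; how much the compiler manages to optimize on its own without the explicit move is not something this example demonstrates.

```nim
proc consume(data: sink seq[int]): int =
  # `sink` marks the parameter as taking ownership of its argument.
  data.len

var numbers = @[1, 2, 3]

# An explicit `move` hands the sequence over and resets `numbers`,
# so the underlying buffer does not need to be copied.
let n = consume(move(numbers))

doAssert n == 3
doAssert numbers.len == 0
```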

ARC/ORC also has the advantage of greatly simplifying memory management across threads, but we won’t get into that here. Mostly because I don’t understand how it works.