
Expressions

Expressions are the parts of the pipeline language that evaluate to some value. A multitude of different expression types known from other programming languages are supported by Safe-DS, from basic literals to lambdas.

Literals

Literals are the basic building blocks of expressions. They describe a fixed, constant value.

Int Literals

Int literals denote integers. They use the expected syntax. For example, the integer three is written as 3.

Float Literals

Float literals denote floating point numbers. There are two ways to specify them:

  • Decimal form: One half can be written as 0.5. Note that neither the integer part nor the decimal part can be omitted, so .5 and 0. are syntax errors.
  • Scientific notation: Writing very large or very small numbers in decimal notation can be cumbersome. In those cases, scientific notation is helpful. For example, one thousandth can be written in Safe-DS as 1.0e-3 or 1.0E-3. You can read this as 1.0 × 10⁻³. When scientific notation is used, it is allowed to omit the decimal part, so this can be shortened to 1e-3 or 1E-3.
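As a rough cross-check, Python float literals accept the same scientific notation, so the equivalences above can be verified there. This is only an analogy and the variable names are made up for illustration. Unlike Safe-DS, Python also accepts .5 and 0. as literals.

```python
# Python analogy for the Safe-DS float literal forms described above.
one_half = 0.5              # decimal form
thousandth_long = 1.0e-3    # scientific notation with a decimal part
thousandth_short = 1e-3     # decimal part omitted
thousandth_upper = 1E-3     # upper-case exponent marker

# All spellings denote the same value, 1.0 × 10⁻³.
same_value = thousandth_long == thousandth_short == thousandth_upper == 0.001
```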

String Literals

String literals describe text. Their syntax is simply text enclosed by double quotes: "Hello, world!". Various special characters can be denoted with escape sequences:

Escape sequence    Meaning
\b                 Backspace
\f                 Form feed
\n                 New line
\r                 Carriage return
\t                 Tab
\v                 Vertical tab
\0                 Null character
\'                 Single quote
\"                 Double quote
\{                 Opening curly brace (used for template strings)
\\                 Backslash
\uXXXX             Unicode character, where XXXX is its hexadecimal code

String literals can also contain raw line breaks:

"Hello,

world!"

In order to interpolate text with other computed values, use template strings.

Boolean Literals

To work with truthiness, Safe-DS has the two boolean literals false and true.

null Literal

To denote that a value is unknown or absent, use the literal null.

Operations

Operations are special functions that can be applied to one or two expressions. Safe-DS has a fixed set of operations that cannot be extended. We distinguish between

  • prefix operations (general form <operator> <operand>), and
  • infix operations (general form <left operand> <operator> <right operand>).

Operations on Numbers

Numbers can be negated using the unary - operator:

  • The integer negative three is -3.
  • The float negative three is -3.0.

The usual arithmetic operations are also supported for integers, floats and combinations of the two. Note that when either operand is a float, the whole expression is evaluated to a float.

  • Addition: 0 + 5 (result is an integer)
  • Subtraction: 6 - 2.9 (result is a float)
  • Multiplication: 1.1 * 3 (result is a float)
  • Division: 1.0 / 4.2 (result is a float)
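The same four examples behave analogously in Python, which makes the int/float rule easy to check. This is a sketch in Python rather than Safe-DS, and the names are illustrative. Note that Python's / always yields a float, even for two integers, so nothing should be inferred from it about integer division in Safe-DS.

```python
# Python analogy for the result types of the arithmetic examples above.
addition = 0 + 5            # int + int
subtraction = 6 - 2.9       # int - float
multiplication = 1.1 * 3    # float * int
division = 1.0 / 4.2        # float / float

# Collect the type name of each result for inspection.
result_types = {
    "addition": type(addition).__name__,
    "subtraction": type(subtraction).__name__,
    "multiplication": type(multiplication).__name__,
    "division": type(division).__name__,
}
```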

Finally, two numbers can be compared, which results in a boolean. The integer 3 for example is less than the integer 5. Safe-DS offers operators to do such checks for order:

  • Less than: 5 < 6
  • Less than or equal: 1 <= 3
  • Greater than or equal: 7 >= 7
  • Greater than: 9 > 2

Logical Operations

To work with logic, Safe-DS has the two boolean literals false and true as well as operations to work with them:

  • (Logical) negation (example not a): Output is true if and only if the operand is false:

        a        not a
        false    true
        true     false

  • Conjunction (example a and b): Output is true if and only if both operands are true. Note that the second operand is always evaluated, even if the first operand is false and, thus, already determines the result of the expression. The operator is not short-circuited:

        a and b      b = false    b = true
        a = false    false        false
        a = true     false        true

  • Disjunction (example a or b): Output is true if and only if at least one operand is true. Note that the second operand is always evaluated, even if the first operand is true and, thus, already determines the result of the expression. The operator is not short-circuited:

        a or b       b = false    b = true
        a = false    false        true
        a = true     true         true
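To make the difference between eager and short-circuited evaluation concrete, here is a small Python sketch. Python's own and/or operators do short-circuit, so the eager behavior described above is modelled with ordinary functions, whose arguments are always evaluated before the call. The helpers operand, eager_and and eager_or are hypothetical and not part of Safe-DS.

```python
calls = []

def operand(name, value):
    """Record that an operand was evaluated, then return its value."""
    calls.append(name)
    return value

def eager_and(left, right):
    # Both arguments are already evaluated when the function is entered.
    return left and right

def eager_or(left, right):
    # Same idea for disjunction.
    return left or right

# Eager conjunction: "b" is evaluated although "a" is already False.
eager_result = eager_and(operand("a", False), operand("b", True))
eager_calls = list(calls)

calls.clear()

# Python's built-in `and` short-circuits and never evaluates "b".
short_circuit_result = operand("a", False) and operand("b", True)
short_circuit_calls = list(calls)
```

Both variants produce the same truth value. They differ only in which operands get evaluated, which matters when an operand has side effects or is expensive to compute.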

Equality Checks

There are two different types of equality in Safe-DS, identity and structural equality. Identity checks if two objects are one and the same, whereas structural equality checks if two objects have the same structure and content. Using a real world example, two phones of the same type would be structurally equal but not identical. Both checks evaluate to the boolean true if the check succeeds and to false otherwise. The syntax for these operations is as follows:

  • Identity: 1 === 2
  • Structural equality: 1 == 2

Safe-DS also has shorthand versions for negated equality checks which should be used instead of an explicit logical negation with the not operator:

  • Negated identity: 1 !== 2
  • Negated structural equality: 1 != 2
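The phone example can be sketched in Python, where == on a dataclass compares content and is compares identity. This is an analogy only: it does not claim that Safe-DS implements === and == this way, and the Phone class is invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Phone:
    # A dataclass gets a generated __eq__ that compares all fields.
    model: str
    storage_gb: int

my_phone = Phone("Model X", 128)
your_phone = Phone("Model X", 128)   # same type of phone, different object
also_my_phone = my_phone             # second name for the same object

structurally_equal = my_phone == your_phone      # same content
identical = my_phone is your_phone               # not the same object
identical_to_itself = my_phone is also_my_phone  # the same object
```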

Elvis Operator

The elvis operator ?: (given its name because it resembles Elvis's haircut) is used to specify a default value that should be used instead if the left operand is null. This operator is not short-circuited, so both operands are always evaluated. In the following example, the whole expression evaluates to nullableExpression if this value is not null and to 42 if it is null:

nullableExpression ?: 42
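The behavior can be modelled with a short Python function, using None in place of null. Because it is a regular function, both arguments are evaluated before the call, which matches the non-short-circuited behavior described above. The name elvis is hypothetical.

```python
def elvis(left, right):
    """Return `left` unless it is None, in which case return `right`."""
    return right if left is None else left

default_used = elvis(None, 42)   # left operand is None, so the default wins
value_kept = elvis(7, 42)        # left operand is not None
zero_kept = elvis(0, 42)         # only None triggers the default, not 0
```

The last line shows why this is not the same as Python's or operator, which would also replace 0, empty strings and other falsy values.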

Template Strings

String literals can only be used to denote a fixed string. Sometimes, however, parts of the string have to be computed and then interpolated into the remaining text. This is done with template strings. Here is an example:

"1 + 2 = {{ 1 + 2 }}"

The syntax for template strings is similar to string literals: They are also delimited by double quotes, the text can contain escape sequences, and raw newlines can be inserted. The only additional syntax is template expressions, which are arbitrary expressions enclosed by {{ and }}. There must be no space between the two opening curly braces or between the two closing curly braces.

These template expressions are evaluated, converted to a string and inserted into the template string at their position. The template string in the example above is, hence, equivalent to the string literal "1 + 2 = 3".
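Python's f-strings follow the same three steps (evaluate, convert to a string, insert), so they can serve as a quick mental model. Note that Python uses single braces where the Safe-DS syntax shown above uses double braces. The variable names are illustrative.

```python
# Python f-string analogy for the template string example above.
template_result = f"1 + 2 = {1 + 2}"

# The same three steps spelled out by hand.
evaluated = 1 + 2                    # 1. evaluate the template expression
converted = str(evaluated)           # 2. convert the value to a string
manual_result = "1 + 2 = " + converted   # 3. insert it at its position
```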

References

References are used to refer to a declaration, such as a class or a placeholder. The syntax is simply the name of the declaration, as shown in the next snippet where we first declare a placeholder called one and then refer to it when computing the value for the placeholder called two:

val one = 1;
val two = one + one;

In order to refer to global declarations in other packages, we first need to import them.

Calls

Calls are used to trigger the execution of a specific action, which can, for example, be the creation of an instance of a class or executing the code in a segment. Let's look at an example:

First, we show the code of the segment that we want to call.

segment createDecisionTree(maxDepth: Int = 10) {
    // ... do something ...
}

This segment has a single parameter maxDepth, which must have type Int, and has the default value 10. Since it has a default value, we are not required to specify a value when we call this segment. The most basic legal call of the segment is, thus, this:

createDecisionTree()

This calls the segment createDecisionTree, using the default maxDepth of 10.

The syntax consists of these elements:

  • The callee of the call, which is the expression to call (here a reference to the segment createDecisionTree)
  • The list of arguments, which is delimited by parentheses. In this case the list is empty, so no arguments are passed.

If we want to override the default value of an optional parameter or if the callee has required parameters, we need to pass arguments. We can either use positional arguments or named arguments.

In the case of positional arguments, they are mapped to parameters by position, i.e. the first argument is assigned to the first parameter, the second argument is assigned to the second parameter and so forth. We do this in the following example to set maxDepth to 5:

createDecisionTree(5)

The syntax for a positional argument is simply the expression we want to pass as value.

Named arguments, however, are mapped to parameters by name. On the one hand, this can improve the readability of the code, since the meaning of a value becomes obvious. On the other hand, it allows us to override only specific optional parameters and keep the rest unchanged. Here is how to set maxDepth to 5 using a named argument:

createDecisionTree(maxDepth = 5)

These are the syntactic elements:

  • The name of the parameter for which we want to specify a value.
  • An equals sign.
  • The value to assign to the parameter.

Passing Multiple Arguments

We now add another parameter to the createDecisionTree segment:

segment createDecisionTree(isBinary: Boolean, maxDepth: Int = 10) {
    // ... do something ...
}

This allows us to show how multiple arguments can be passed:

createDecisionTree(isBinary = true, maxDepth = 5)

We have already seen the syntax for a single argument. If we want to pass multiple arguments, we just separate them by commas. A trailing comma is allowed.

Restrictions For Arguments

There are some restrictions regarding the choice of positional vs. named arguments and passing arguments in general:

  • For all parameters of the callee there must be at most one argument.
  • For all required parameters there must be exactly one argument.
  • After a named argument all arguments must be named.
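The three restrictions can be read as a small binding algorithm. The following Python sketch is a simplified model, not the actual Safe-DS checker: bind_arguments and PARAMETERS are hypothetical names. The third restriction is encoded by taking positional arguments as a list and named arguments as a separate dictionary, so positional arguments always come first.

```python
def bind_arguments(parameters, positional, named):
    """Map arguments to parameters, raising ValueError on a violation.

    parameters: list of (name, is_required) pairs in declaration order.
    positional: list of positional argument values.
    named: dict from parameter name to value (passed after the positional ones).
    """
    names = [name for name, _ in parameters]
    if len(positional) > len(names):
        raise ValueError("too many arguments")

    # Positional arguments are mapped by position.
    bound = dict(zip(names, positional))

    # Named arguments are mapped by name; at most one argument per parameter.
    for name, value in named.items():
        if name not in names:
            raise ValueError(f"unknown parameter: {name}")
        if name in bound:
            raise ValueError(f"more than one argument for: {name}")
        bound[name] = value

    # Every required parameter needs exactly one argument.
    for name, is_required in parameters:
        if is_required and name not in bound:
            raise ValueError(f"missing argument for: {name}")
    return bound

# Parameters of createDecisionTree(isBinary: Boolean, maxDepth: Int = 10).
PARAMETERS = [("isBinary", True), ("maxDepth", False)]

mixed_call = bind_arguments(PARAMETERS, [True], {"maxDepth": 5})
default_depth_call = bind_arguments(PARAMETERS, [], {"isBinary": True})
```

Parameters that are absent from the returned dictionary, such as maxDepth in the second call, would keep their default value.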

Depending on the callee, a call can do different things. The following table lists all legal callees and what happens if they are called:

Callee                            Meaning
Class                             Creates a new instance of the class. The class must have a constructor to be callable. The call evaluates to this new instance.
Enum Variant                      Creates a new instance of the enum variant. Enum variants are always callable. The call evaluates to this new instance.
Global Function                   Invokes the function and runs the associated Python code. The call evaluates to the result record of the function.
Method                            Invokes the method and runs the associated Python code. The call evaluates to the result record of the method.
Segment                           Invokes the segment and runs the Safe-DS code in its body. The call evaluates to the result record of the segment.
Block Lambda                      Invokes the lambda and runs the Safe-DS code in its body. The call evaluates to the result record of the lambda.
Expression Lambda                 Invokes the lambda and runs the Safe-DS code in its body. The call evaluates to the result record of the lambda.
Declaration with Callable Type    Calls whatever the value of the declaration is.

Result Record

The term result record warrants further explanation: A result record maps the results of a global function, method, segment, or lambda to their computed values.

If the result record only has a single entry, its value can be accessed directly. Otherwise, the result record must be deconstructed either by an assignment (can access multiple results) or by a member access (can access a single result).
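One way to picture a result record is as a mapping from result names to values. The Python sketch below models it as a dictionary. The functions divide_with_remainder and unwrap are hypothetical and only illustrate the two access patterns, not how Safe-DS represents result records internally.

```python
def divide_with_remainder(dividend, divisor):
    """Return a record with two named results."""
    return {"quotient": dividend // divisor, "remainder": dividend % divisor}

def unwrap(record):
    """Give direct access to the value if the record has a single entry."""
    if len(record) == 1:
        return next(iter(record.values()))
    return record

record = divide_with_remainder(12, 5)

# Selecting a single result by name (comparable to a member access).
only_remainder = record["remainder"]

# Taking the record apart into several results (comparable to an assignment).
quotient, remainder = record["quotient"], record["remainder"]

# A record with a single entry can be used directly as its value.
single_value = unwrap({"result": 42})
```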

Null-Safe Calls

If an expression can be null, it cannot be used as the callee of a normal call. Instead, a null-safe call must be used. A null-safe call evaluates to null if its callee is null. Otherwise, it works just like a normal call. This is particularly useful for chaining.

The syntax is identical to a normal call except that we replace the () with ?():

nullableCallee?()
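In Python terms, with None standing in for null, a null-safe call behaves roughly like the following helper. The name null_safe_call is hypothetical and this is only a model of the semantics described above.

```python
def null_safe_call(callee, *args, **kwargs):
    """Model of callee?(...): None if the callee is None, else a normal call."""
    if callee is None:
        return None
    return callee(*args, **kwargs)

missing_callee_result = null_safe_call(None, 1, 2)   # callee is None
normal_call_result = null_safe_call(max, 1, 2)       # behaves like max(1, 2)
```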

Member Accesses

A member access is used to refer to members of a complex data structure such as a class, an enum, or the result record of a call.

The general syntax of a member access is this:

<receiver>.<member>

Here, the receiver is some expression (the legal choices are explained below), while the member is always a reference.

Member Access of Class Members

To understand how we can access members of a class we must first look briefly at a declaration of the class we use in the following examples:

class DecisionTree() {
    static attr verboseTraining: Boolean

    attr maxDepth: Int
}

This class has a static attribute called verboseTraining, which has type Boolean. Static means that the attribute is shared between all instances of the class and can be accessed on the class itself, rather than a specific instance.

Moreover, the class has an instance attribute maxDepth, which is an integer. This must be accessed on a specific instance of the class.

Member Access of Static Class Member

Let us look at how to access the static attribute verboseTraining to retrieve its value:

DecisionTree.verboseTraining

These are the syntactic elements of this member access:

  • The receiver, which is the name of the class (here DecisionTree)
  • A dot.
  • The name of the static member of the class (here verboseTraining)

Note that we cannot access a static member from an instance of the class. We must use the class itself.

Member Access of Instance Class Member

Contrary to static member accesses, we can only access instance members on an instance of a class:

DecisionTree().maxDepth

We now take apart the syntax again:

  • The receiver, here a call of the constructor of the class DecisionTree. This creates an instance of this class.
  • A dot.
  • The name of the instance member (here maxDepth).

Note that instance members cannot be accessed from the class itself, but only from its instances.

Member Access of Enum Variants

A member access can also be used to access the variants of an enum. Here is the declaration of the enum that we use in the example:

enum SvmKernel {
    Linear,
    RBF
}

This enum is called SvmKernel and has the two variants Linear and RBF.

We can access the variant Linear using this member access:

SvmKernel.Linear

These are the elements of the syntax:

  • The receiver, which is the name of the enum (here SvmKernel).
  • A dot.
  • The name of the variant (here Linear).

This syntax is identical to the member access of static class members.

Member Access of Results

If the result record that is produced by a call has multiple results, we can use a member access to select a single one. Here is the global function we use to explain this concept:

fun divideWithRemainder(dividend: Int, divisor: Int) -> (quotient: Int, remainder: Int)

The global function divideWithRemainder has two parameters, namely dividend and divisor, both of which have type Int. It produces two results, quotient and remainder, which also have type Int.

If we are only interested in the remainder of 12 divided by 5, we can use a member access:

divideWithRemainder(12, 5).remainder

Here are the syntactic elements:

  • The receiver, which is a call.
  • A dot.
  • The name of the result (here remainder).

While it is also possible to access the result by name if the result record contains only a single entry, there is no need to do so, since this result can be used directly. If you still use a member access and the singular result of the call has the same name as an instance member of the corresponding class, the instance member wins.

To explain this concept further, we need the following declarations:

class ValueWrapper {
    attr value: Int
}

fun createValueWrapper() -> value: ValueWrapper

We first declare a class called ValueWrapper, which has an attribute value of type Int. Next, we declare a function, which is supposed to create an instance of the class ValueWrapper and put it into the result value.

Let us now look at this member access:

createValueWrapper().value

This evaluates to the attribute, i.e. an integer, rather than the result, which would be an instance of ValueWrapper.

If you want the result instead, simply omit the member access:

createValueWrapper()

Null-Safe Member Accesses

If an expression can be null, it cannot be used as the receiver of a regular member access, since null does not have members. Instead, a null-safe member access must be used. A null-safe member access evaluates to null if its receiver is null. Otherwise, it evaluates to the accessed member, just like a normal member access.

The syntax is identical to a normal member access except that we replace the dot with the operator ?.:

nullableExpression?.member

Indexed Accesses

An indexed access is used to access elements of a list by index or values of a map by key. In the following example, we use an indexed access to retrieve the first element of the values list:

segment printFirst(values: List<Int>) {
    print(values[0]);
}

These are the elements of the syntax:

  • An expression that evaluates to a list or map (here the reference values).
  • An opening square bracket.
  • The index, which is an expression that evaluates to an integer. The first element has index 0.
  • A closing square bracket.

Note that accessing a value at an index outside the bounds of the values list currently only raises an error at runtime.

Null-Safe Indexed Accesses

If an expression can be null, it cannot be used as the receiver of a regular indexed access. Instead, a null-safe indexed access must be used. A null-safe indexed access evaluates to null if its receiver is null. Otherwise, it works just like a normal indexed access. This is particularly useful for chaining.

The syntax is identical to a normal indexed access except that we replace the [] with ?[]:

nullableList?[0]
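Null-safe accesses compose naturally, because each step passes null along instead of failing. The Python sketch below models a null-safe indexed access and a null-safe member access, with None standing in for null. The helper names and the sample data are made up for illustration.

```python
from types import SimpleNamespace

def null_safe_index(receiver, index):
    """Model of receiver?[index]."""
    return None if receiver is None else receiver[index]

def null_safe_member(receiver, member_name):
    """Model of receiver?.member."""
    return None if receiver is None else getattr(receiver, member_name)

wrappers = [SimpleNamespace(value=1), SimpleNamespace(value=2)]

# Comparable to wrappers?[0]?.value with a list that is present.
first_value = null_safe_member(null_safe_index(wrappers, 0), "value")

# Comparable to missing?[0]?.value, where the list itself is null.
missing_value = null_safe_member(null_safe_index(None, 0), "value")
```

Note that the helpers only guard against a None receiver. An index outside the bounds of a list that is present still raises an error at runtime.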

Chaining

Multiple calls, member accesses, and indexed accesses can be chained together. Let us first look at the declaration of the class we need for the example:

class LinearRegression() {
    fun drawAsGraph()
}

This is a class LinearRegression, which has a constructor and an instance method called drawAsGraph.

We can then use those declarations in a segment:

segment mySegment(regressions: List<LinearRegression>) {
    regressions[0].drawAsGraph();
}

This segment is called mySegment and has a parameter regressions of type List<LinearRegression>.

In the body of the segment we then

  1. access the first instance in the list using an indexed access,
  2. access the instance method drawAsGraph of this instance using a member access,
  3. call this method.

Lambdas

If you want to write reusable blocks of code, use a segment. However, sometimes you need to create a highly application-specific callable that can be passed as an argument to some function or returned as the result of a segment. We will explain this concept by filtering a list. Here are the relevant declarations:

class IntList {
    fun filter(filterFunction: (element: Int) -> shouldKeep: Boolean) -> filteredList: IntList
}

fun intListOf(elements: List<Int>) -> result: IntList

First, we declare a class IntList, which has a single method called filter. The filter method returns a single result called filteredList, which is a new IntList. filteredList is supposed to only contain the elements of the receiving IntList for which the filterFunction parameter returns true.

Second, we declare a global function intListOf that is supposed to wrap elements into an IntList.

Say, we now want to keep only the elements in the list that are less than 10. We can do this by declaring a segment:

segment keepLessThan10(a: Int) -> shouldKeep: Boolean {
    yield shouldKeep = a < 10;
}

Here is how to solve the task of keeping only elements below 10 with this segment:

intListOf([1, 4, 11]).filter(keepLessThan10)

The call to intListOf is just there to create an IntList that we can use for filtering. The interesting part is the argument we pass to the filter method, which is simply a reference to the segment we declared above.

The problem here is that this solution is very cumbersome and verbose. We need to come up with a name for a segment that we will likely use only once. Moreover, the segment must declare the types of its parameters and its results in its header. Finally, the declaration of the segment has to happen in a separate location than its use. We can solve those issues with lambdas.

Block Lambdas

We will first rewrite the above solution using a block lambda, which is essentially a segment without a name and more concise syntax that can be declared where it is needed:

intListOf([1, 4, 11]).filter(
    (a) { yield shouldKeep = a < 10; }
)

While this appears longer than the solution with segments, note that it replaces both the declaration of the segment as well as the reference to it.

Here are the syntactic elements:

  • A list of parameters, which is enclosed in parentheses. Individual parameters are separated by commas.
  • The body, which is a list of statements enclosed in curly braces. Note that each statement is terminated by a semicolon.

The results of a block lambda are declared in its body using assignments.

Expression Lambdas

Often, the body of a block lambda only consists of yielding a single result, as is the case in the example above. The syntax of block lambdas is quite verbose for such a common use-case, which is why Safe-DS has expression lambdas as a shorter but less flexible alternative. Using an expression lambda we can rewrite the example above as

intListOf([1, 4, 11]).filter(
    (a) -> a < 10
)

These are the syntactic elements:

  • A list of parameters, which is enclosed in parentheses. Individual parameters are separated by commas.
  • An arrow ->.
  • The expression that should be returned.
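For comparison, the same filtering task looks as follows in Python, once with a named function (comparable to the segment) and once with a lambda (comparable to the expression lambda). This is an analogy using Python's built-in filter, not the IntList class declared above.

```python
values = [1, 4, 11]

# Named, reusable callable: needs a name and a separate declaration.
def keep_less_than_10(a: int) -> bool:
    return a < 10

with_named_function = list(filter(keep_less_than_10, values))

# Anonymous callable declared right where it is needed.
with_lambda = list(filter(lambda a: a < 10, values))
```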

Closures

Note: This is an advanced concept, so feel free to skip this section initially.

Both block lambdas and expression lambdas are closures, which means they remember the values of placeholders and parameters that can be accessed within their body at the time of their creation. Here is an example:

segment lazyValue(value: Int) -> result: () -> storedValue: Int {
    yield result = () -> value;
}

This deserves further explanation: We declare a segment lazyValue. It takes a single required parameter value with type Int. It produces a single result called result, which has a callable type that takes no parameters and produces a single result called storedValue with type Int. In the body of the segment we then assign an expression lambda to the result result.

The interesting part here is that we refer to the parameter value within the expression of the lambda. Since lambdas are closures, this means the current value is stored when the lambda is created. When we later call this lambda, exactly this value is returned.
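A Python version of lazyValue shows the same effect. Strictly speaking, Python closures capture variables rather than values, but since the parameter is never reassigned here, the observable behavior matches the description above. The names are illustrative.

```python
def lazy_value(value: int):
    """Return a callable that remembers `value` from the time of creation."""
    return lambda: value

stored_one = lazy_value(1)
stored_two = lazy_value(2)

# Each lambda returns the value it was created with, even though the
# calls to lazy_value have long finished.
first_result = stored_one()
second_result = stored_two()
```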

Restrictions

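At the moment, lambdas can only be used if the context determines the types of their parameters. Concretely, this means we can use lambdas in these two places:

  • As an argument that is assigned to a parameter with a declared type in a call.
  • As the value that is assigned to a result of a segment with a declared type.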

Type Casts

The compiler can infer the type of an expression in almost all cases. However, sometimes its type has to be specified explicitly. This is called a type cast. Here is an example:

dataset.getColumn("age") as Column<Int>

A type cast is written as follows:

  • The expression to cast.
  • The keyword as.
  • The type to cast to.

Type casts are only allowed if the type of the expression is unknown. They cannot be used to override the inferred type of an expression.

Precedence

We all know that 2 + 3 * 7 is 23 and not 35. The reason is that the * operator has a higher precedence than the + operator and is, therefore, evaluated first. These precedence rules are necessary for all types of expressions listed above and shown in the following list. The higher up an expression is in the list, the higher its precedence and the earlier it is evaluated. Expressions listed beside each other have the same precedence and are evaluated from left to right:

If the default precedence of operators is not sufficient, parentheses can be used to force a part of an expression to be evaluated first.
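The introductory example can be checked directly. The snippet is written in Python, where * likewise binds more tightly than +, and the variable names are illustrative.

```python
default_precedence = 2 + 3 * 7    # multiplication is evaluated first
forced_addition = (2 + 3) * 7     # parentheses force the addition first
```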