Expression language

Overview

Sperta has its own expression language called SEL. A SEL expression combines features, literals, and operators to output a value.

If you have worked with SQL, Python, or any programming language, the syntax should look very familiar to you.

Unlike SQL or Python, SEL is strongly typed. This means Sperta can detect data type mismatches in addition to syntax errors as soon as you save the expression. Take a look at Features to learn what features can be used in an expression and their data types.

SEL is fast, orders of magnitude faster than alternative implementations using sandboxed JavaScript.

Literals

Literals are values written exactly as they are meant to be interpreted. For example, if you have an expression fraud_score > 0.9, then fraud_score is a feature and 0.9 is a literal.

Here are some examples:

  • Integer literals
    • 1
    • -2
  • Double literals
    • -1.5
    • 1.
    • 7.3e4
    • 7.3E4
  • String literals
    • "john@example.com"
    • 'john@example.com'
    • """Some comments"""
    • '''Some comments'''
  • Boolean literals
    • true
    • false
  • List literals. The values must be of the same data type:
    • [1, 2, 3]
    • ["John", "Jessie", "Mark"]
  • Map literals. The keys must be of the same data type (Integer, Boolean, String). The values must be of the same data type too:
    • {"US": 0.95, "MX": 0.85}
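The same-type rule for list and map literals can be pictured with a small Python sketch. This is only an analogue of the rule stated above, not how Sperta validates expressions; the helper names is_valid_list and is_valid_map are made up for illustration.

```python
def element_types(values):
    """Return the set of exact Python types found in values."""
    return {type(v) for v in values}


def is_valid_list(values):
    # All elements must share one type; an empty list trivially qualifies.
    return len(element_types(values)) <= 1


def is_valid_map(mapping):
    key_types = element_types(mapping.keys())
    value_types = element_types(mapping.values())
    # Keys are limited to Integer, Boolean, or String in the text above.
    keys_allowed = key_types <= {int, bool, str}
    return keys_allowed and len(key_types) <= 1 and len(value_types) <= 1


print(is_valid_list([1, 2, 3]))                   # True
print(is_valid_list([1, 2.0]))                    # False: Integer mixed with Double
print(is_valid_map({"US": 0.95, "MX": 0.85}))     # True
print(is_valid_map({"US": 0.95, "MX": 1}))        # False: Double mixed with Integer
```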

Zero values

Sperta automatically assigns a zero value to missing input features based on the data type. This way, expressions are less likely to fail since there are no null values. The zero values for each data type are:

  • 0 for Integer
  • 0.0 for Double
  • "" (empty string) for String
  • false for Boolean
  • timestamp("1970-01-01T00:00:00Z") (Unix epoch) for Timestamp
  • {} (empty JSON) for JSON
  • [] (empty list) for List
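To make the zero-value behavior concrete, here is a minimal Python sketch of the idea. It is not Sperta's implementation: the resolve_inputs helper, the schema format, and the choice to treat None as missing are all assumptions made for illustration.

```python
from datetime import datetime, timezone

# Zero value per data type, as listed above. The keys are illustrative
# labels, not Sperta identifiers.
ZERO_VALUES = {
    "Integer": 0,
    "Double": 0.0,
    "String": "",
    "Boolean": False,
    "Timestamp": datetime(1970, 1, 1, tzinfo=timezone.utc),  # Unix epoch
    "JSON": {},
    "List": [],
}


def resolve_inputs(schema, payload):
    """Return a value for every feature in schema, defaulting missing ones.

    schema maps feature name -> data type label; payload holds the values
    that were actually supplied. None is treated as missing (an assumption).
    """
    resolved = {}
    for name, data_type in schema.items():
        if payload.get(name) is not None:
            resolved[name] = payload[name]
        else:
            zero = ZERO_VALUES[data_type]
            # Copy mutable zero values so features never share one object.
            resolved[name] = zero.copy() if isinstance(zero, (dict, list)) else zero
    return resolved


schema = {"fraud_score": "Double", "country_code": "String", "tags": "List"}
features = resolve_inputs(schema, {"country_code": "US"})
print(features)                        # {'fraud_score': 0.0, 'country_code': 'US', 'tags': []}
print(features["fraud_score"] > 0.9)   # False, instead of failing on a null
```

One consequence worth keeping in mind: a missing score and a genuine score of 0.0 look the same to the expression.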

Operators

Operators in SEL perform mathematical, relational, or logical operations. The following operators are supported:

  • Logical operators: not, and, or
    • When used together in an expression, not has the highest precedence and or has the lowest precedence.
    • For example, in 3 > 2 and not 2 > 1 or 4 > 3, not 2 > 1 is evaluated first.
    • If you are not sure about the precedence, you can use parentheses to override it or to make it explicit. For example, you can rewrite the previous expression as (3 > 2 and (not 2 > 1)) or 4 > 3
  • Negation operator: -
    • For example, if a feature score has a value of 0.9, then -score outputs a value of -0.9
  • Arithmetic operators: *, /, and % share the same precedence, which is higher than that of + and -. Operators with the same precedence are evaluated from left to right.
    • For example, the result of 18 / 2 * 3 + 1 is 28
    • If you are not sure about the precedence, you can use parentheses to override it or to make it explicit. The previous expression is equivalent to ((18 / 2) * 3) + 1, whereas (18 / (2 * 3)) + 1 is 4
    • % performs a modulo operation. For example, the result of 10 % 3 is 1.
    • Since SEL is strongly typed, you can’t mix Integer and Double in arithmetic operations. Sperta will report an error for this expression: 4.0 * 3. Instead, first convert them to the same data type: 4.0 * double(3)
    • You can use + to concatenate strings, e.g. first_name + " " + last_name
    • You can also use + and - between timestamps and durations:
      • timestamp("2024-02-16T05:13:45Z") - timestamp("2024-02-15T05:13:45Z") > duration("23h")
      • timestamp("2024-02-16T05:13:45Z") + duration("2h") < timestamp("2024-02-16T08:13:45Z")
      • duration("2h") - duration("30m")
  • Relational operators: ==, !=, >, >=, <, <=
    • You can use == and != to compare booleans and strings. You can use all relational operators to compare numbers, timestamps, and durations.
    • Compare booleans: device_spoofed == true
    • Compare strings: country_code != "US"
    • Compare numbers: fraud_score >= 0.9
    • Compare timestamps: timestamp("2024-02-16T00:00:00Z") < timestamp("2024-02-16T01:00:00Z")
    • Compare durations: duration("2h") > duration("80m")
  • Inclusion operator: in
    • Use the inclusion operator to test whether a value is included in a List or a Map
    • Example: billing_country_code in ["US", "MX"]
    • Note that not in is not yet supported. Instead, you can write not (email_domain in ["111.com", "freemail.com"])
    • You can’t use in to check whether any element of a list exists in another list. Instead, use the exists() function on the list. For example, if countries is a list, the expression will look like countries.exists(country, country in ["US", "MX"])
  • Conditional operator: ?:
    • Similar to conditional steps, the conditional operator lets you output a value based on a condition.
    • For example, score > 700 ? 200.0 : 100.0 returns 200.0 if score is higher than 700 and 100.0 otherwise.
    • You can even output a feature using the conditional operator, which makes it more flexible than a conditional step. For example, score > 700 ? requested_amount : 100.0.
    • You can also use the conditional operator to assign default values to features. For example: model_score == 0.0 ? 0.5 : model_score
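If it helps to see several of the bullets above side by side, the following Python sketch mirrors them one to one. It is a rough analogue rather than SEL itself: datetime and timedelta stand in for timestamp() and duration(), and the feature values are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Timestamps and durations.
t1 = datetime(2024, 2, 15, 5, 13, 45, tzinfo=timezone.utc)
t2 = datetime(2024, 2, 16, 5, 13, 45, tzinfo=timezone.utc)
cutoff = datetime(2024, 2, 16, 8, 13, 45, tzinfo=timezone.utc)

gap_over_23h = (t2 - t1) > timedelta(hours=23)          # timestamp - timestamp > duration
before_cutoff = t2 + timedelta(hours=2) < cutoff        # timestamp + duration < timestamp
remaining = timedelta(hours=2) - timedelta(minutes=30)  # duration - duration

# Hypothetical feature values, chosen only for illustration.
email_domain = "example.com"
countries = ["UK", "MX"]
model_score = 0.0  # zero value, i.e. the feature was missing

# SEL: not (email_domain in ["111.com", "freemail.com"])
domain_allowed = not (email_domain in ["111.com", "freemail.com"])

# SEL: countries.exists(country, country in ["US", "MX"])
any_match = any(country in ["US", "MX"] for country in countries)

# SEL: model_score == 0.0 ? 0.5 : model_score
effective_score = 0.5 if model_score == 0.0 else model_score

print(gap_over_23h, before_cutoff, remaining)      # True True 1:30:00
print(domain_allowed, any_match, effective_score)  # True True 0.5
```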

If the precedence rules are not clear to you, take a look at the technical reference of operator precedence.

Operator precedence reference

| Precedence | Operator | Description | Associativity | Example Usage |
|---|---|---|---|---|
| 1 | () | Functions | Left-to-right | "john".size() |
| 2 | - (unary) | Negation | Right-to-left | -score |
| 2 | not | Logical NOT | Right-to-left | not device_spoofed |
| 3 | * | Multiplication | Left-to-right | a * 3 |
| 3 | / | Division | Left-to-right | a / 3 |
| 3 | % | Modulo | Left-to-right | a % 3 |
| 4 | + | Addition | Left-to-right | a + 3 |
| 4 | - (binary) | Subtraction | Left-to-right | a - 3 |
| 5 | ==, !=, <, >, <=, >= | Relations | Left-to-right | a > 3 |
| 5 | in | Inclusion test | Left-to-right | country in ["US", "MX"] |
| 6 | and | Logical AND | Left-to-right | a > 3 and b > 1 |
| 7 | or | Logical OR | Left-to-right | a > 3 or b > 1 |
| 8 | ?: | Conditional | Right-to-left | score > 700 ? 200.0 : 100.0 |
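Python happens to rank not, and, and or in the same relative order as the table above, so a short Python sketch can show why explicit parentheses matter. This only illustrates the relative order of the three logical operators on boolean inputs; it says nothing about how SEL parses other operators, and the function names are made up.

```python
def implicit(a, b, c):
    # Relies on precedence alone: not binds tightest, then and, then or.
    return a and not b or c


def explicit(a, b, c):
    # The same expression with the default grouping spelled out.
    return (a and (not b)) or c


def regrouped(a, b, c):
    # A different grouping, which only parentheses can force.
    return a and (not b or c)


print(implicit(False, True, True))   # True
print(explicit(False, True, True))   # True
print(regrouped(False, True, True))  # False
```

The first two functions agree for every combination of inputs, while the third differs whenever a is false and c is true.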

Built-in functions

| Name | Type | Description | Example Usage | Example Result |
|---|---|---|---|---|
| size | (string) -> integer or string.() -> integer | String size | size("john") or "john".size() | 4 |
| size | (list(A)) -> integer or list(A).() -> integer | List size | size([1, 2, 3]) or [1, 2, 3].size() | 3 |
| size | (map(A, B)) -> integer or map(A, B).() -> integer | Map size | size({"a": 1}) or {"a": 1}.size() | 1 |
| contains | string.(string) -> boolean | Tests whether the string operand contains the substring | "Android Samsung 2.0".contains("Android") | true |
| lower | string.() -> string | Converts a string to lower case | "John".lower() | "john" |
| upper | string.() -> string | Converts a string to upper case | "john".upper() | "JOHN" |
| startsWith | string.(string) -> boolean | Tests whether the string operand starts with the substring | "4154314238".startsWith("415") | true |
| endsWith | string.(string) -> boolean | Tests whether the string operand ends with the substring | "abc@gmail.com".endsWith("gmail.com") | true |
| substring | string.(integer, integer) -> string | Returns the substring from an inclusive start index to an exclusive end index. Indexes start at 0. | "415-555-6666".substring(0, 3) | "415" |
| format | string.(list) -> string | Formats the specified values and inserts them into the string's placeholders | "/transaction_risk?id=%s&time=%d".format(["abc", 123]) | "/transaction_risk?id=abc&time=123" |
| matches | string.(string) -> boolean | Tests whether the string matches the regular expression | "a".matches("[abc]+") | true |
| bool | (string) -> boolean | Converts a string to a boolean | bool("true") | true |
| double | (integer) -> double | Converts an integer to a double | double(100) | 100.0 |
| double | (string) -> double | Converts a string to a double | double("100") | 100.0 |
| int | (double) -> integer | Converts a double to an integer. Rounds toward zero. Errors if the result is out of range. | int(5.3) | 5 |
| int | (string) -> integer | Converts a string to an integer | int("100") | 100 |
| int | (timestamp) -> integer | Converts a timestamp to an integer in seconds since Unix epoch | int(timestamp("2024-02-16T05:13:45Z")) | 1708060425 |
| string | (integer) -> string | Converts an integer to a string | string(100) | "100" |
| string | (double) -> string | Converts a double to a string | string(100.0) | "100.0" |
| string | (boolean) -> string | Converts a boolean to a string | string(true) | "true" |
| string | (timestamp) -> string | Converts a timestamp to a string | string(timestamp("2024-02-16T05:13:45Z")) | "2024-02-16T05:13:45Z" |
| string | (duration) -> string | Converts a duration to a string | string(duration("2h")) | "2h" |
| timestamp | (string) -> timestamp | Converts a string to a timestamp | timestamp("2024-02-16T05:13:45Z") | - |
| duration | (string) -> duration | Converts a string to a duration. Supports the following suffixes: "h" (hour), "m" (minute), "s" (second), "ms" (millisecond), "us" (microsecond), and "ns" (nanosecond). Duration strings may be zero, negative, fractional, and/or compound. Examples: "0", "-1.5h", "1m6s" | duration("2h") | - |
| time.now | () -> timestamp | Gets the current timestamp | time.now() | timestamp("2024-02-16T05:13:45Z") |
| getDate | timestamp.() -> integer | Gets the day of month from the timestamp in UTC, one-based | timestamp("2024-02-16T05:13:45Z").getDate() | 16 |
| getDate | timestamp.(string) -> integer | Gets the day of month from the timestamp in the given timezone, one-based | timestamp("2024-02-16T05:13:45Z").getDate("-08:00") | 15 |
| getDayOfWeek | timestamp.() -> integer | Gets the day of week from the timestamp in UTC, zero-based, zero for Sunday | timestamp("2024-02-16T05:13:45Z").getDayOfWeek() | 5 |
| getDayOfWeek | timestamp.(string) -> integer | Gets the day of week from the timestamp in the given timezone, zero-based, zero for Sunday | timestamp("2024-02-16T05:13:45Z").getDayOfWeek("-08:00") | 4 |
| all | list.(element, predicate) -> boolean | Tests whether a predicate holds for all elements of a list | ["US", "UK"].all(country, country in ["US", "MX"]) | false |
| exists | list.(element, predicate) -> boolean | Tests whether a predicate holds for any element of a list | ["US", "UK"].exists(country, country in ["US", "MX"]) | true |
| exists_one | list.(element, predicate) -> boolean | Tests whether a predicate holds for exactly one element of a list | ["US", "UK"].exists_one(country, country in ["US", "UK", "MX"]) | false |
| map | list.(element, expression) -> list | Transforms a list by applying an expression to each element | [1, 2, 3].map(e, e*e) | [1, 4, 9] |
| filter | list.(element, predicate) -> list | Returns the sublist of elements for which the predicate evaluates to true | [15, 5, 25].filter(e, e > 10) | [15, 25] |
| numeric.round | (double) -> integer | Rounds the double to the nearest integer | numeric.round(10.3) | 10 |
| numeric.pow | (double, double) -> double | Calculates x to the power y | numeric.pow(2.0, 3.0) | 8.0 |
| math.least | (integer…) -> integer | Finds the smallest integer among the arguments or a list | math.least(2, 1, 3) or math.least([2, 1, 3]) | 1 |
| math.least | (double…) -> double | Finds the smallest double among the arguments or a list | math.least(2.0, 1.0, 3.0) or math.least([2.0, 1.0, 3.0]) | 1.0 |
| math.greatest | (integer…) -> integer | Finds the biggest integer among the arguments or a list | math.greatest(2, 1, 3) or math.greatest([2, 1, 3]) | 3 |
| math.greatest | (double…) -> double | Finds the biggest double among the arguments or a list | math.greatest(2.0, 1.0, 3.0) or math.greatest([2.0, 1.0, 3.0]) | 3.0 |
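The timestamp results in the table can be cross-checked with Python's standard datetime module. The sketch below is an analogue under stated assumptions, not SEL: the helpers parse_offset, get_date, and get_day_of_week are hypothetical names, and the offset parser only handles the "+HH:MM" / "-HH:MM" form used in the examples.

```python
from datetime import datetime, timedelta, timezone

ts = datetime(2024, 2, 16, 5, 13, 45, tzinfo=timezone.utc)

# Analogue of int(timestamp(...)): whole seconds since the Unix epoch.
epoch_seconds = int(ts.timestamp())


def parse_offset(offset):
    """Turn an offset such as "-08:00" into a fixed-offset timezone."""
    sign = -1 if offset.startswith("-") else 1
    hours, minutes = offset.lstrip("+-").split(":")
    return timezone(sign * timedelta(hours=int(hours), minutes=int(minutes)))


def get_date(t, offset="+00:00"):
    """Day of month, one-based, at the given UTC offset."""
    return t.astimezone(parse_offset(offset)).day


def get_day_of_week(t, offset="+00:00"):
    """Day of week, zero-based with 0 for Sunday, at the given UTC offset."""
    local = t.astimezone(parse_offset(offset))
    # isoweekday() runs from 1 (Monday) to 7 (Sunday); % 7 maps Sunday to 0.
    return local.isoweekday() % 7


print(epoch_seconds)                    # 1708060425
print(get_date(ts))                     # 16
print(get_date(ts, "-08:00"))           # 15
print(get_day_of_week(ts))              # 5 (Friday)
print(get_day_of_week(ts, "-08:00"))    # 4 (Thursday)
```

The day changes with the offset because 05:13 UTC on the 16th is still 21:13 on the 15th at -08:00.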

Regular expressions

The matches() function tests a string against a regular expression and returns true if the string matches.

Because the backslash is also the escape character inside string literals, there are two ways to write a backslash in a regular expression:

  • Use \\ instead of \. For example, instead of s.matches("^trashymail\.(com|net)$"), you need to use s.matches("^trashymail\\.(com|net)$")
  • Use a raw string with an r prefix for the regular expression and use \. For example, s.matches(r"^trashymail\.(com|net)$")
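Python string literals follow the same two conventions, so the difference can be tried out there. This is a Python sketch rather than SEL: the matches helper is a stand-in built on re.search, and whether SEL's matches() searches or requires a full match is not assumed here, since the ^ and $ anchors make both behave the same for this pattern.

```python
import re

escaped = "^trashymail\\.(com|net)$"  # normal string, backslash doubled
raw = r"^trashymail\.(com|net)$"      # raw string, single backslash

# Both literals produce exactly the same pattern text.
print(escaped == raw)  # True


def matches(value, pattern):
    """Return True if the pattern is found in value."""
    return re.search(pattern, value) is not None


print(matches("trashymail.com", raw))      # True
print(matches("trashymailxcom", raw))      # False: the escaped dot is literal
# Without the backslash, the dot matches any character.
print(matches("trashymailxcom", "^trashymail.(com|net)$"))  # True
```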