Regular Expressions: A brief introduction

A regular expression, or in short regex, is a string of characters that specifies a pattern. It is a very powerful tool, and is used in a wide variety of applications from search (and replace) and string validation to lexical analysis (which is the first step in a compiler stack where source code is converted into a stream of tokens).

The languages of regular expressions coincides with the languages recognized by finite state automata, which means that for every regular expression, there exists an automaton that can recognize it.

Most programming languages support regex, including for example Python, C, C++, Java, JavaScript and Dart.

@GARABATOKID

If you don’t have a cat, bad news, you’ll have to learn it 😀

Basics of regex

A single character is it self a regular expression that recognized once and only that character.

Let’s consider the following string: ‘I am excited to learn regular expressions!’, characters in bold are considered matched.

Using regular expression r = ‘a’ will match ‘I am excited to learn regular expressions!’

We can use the boolean or operator | to match either r1 or r2: r = r1 | r2

r = ‘am|to’ matches ‘I am excited to learn regular expressions!’

Operators can be applied recursively to extend the regular expression.

Let’s look at what we can do with regex:

Regex Matched set of strings
hi{hi}
hi | hello{hi, hello}
zz*a mandatory z followed by zero or more z: {z,zz,zzz,zzzz,zzzzz,…}
(haha)+at least one occurrence of haha: {haha,hahahaha,hahahahahaha,…}
analy(s|z)e{analyse, analyze}
analog(ue)?? means optional: {analog, analogue}
o{2}exactly two occurrences of letter o: {oo}
[a-z]matches any lowercase letter in the range from a to z: {a,b,c,d,e,f,g,…,z}
^footballmatches any string that begins with football
football$matches any string that ends with football
\dmatches any digit: {0,1,2,3,4,5,6,7,8,9}
[0-9]matches any digit: {0,1,2,3,4,5,6,7,8,9}
.matches any character
a non exhaustive list of regex operators

Regex for validation

Let’s say we are building an application for a specific university or workplace, and we would like to allow users to register using emails that belong only to the domain of the university or company, and it enforces some rules on the special characters used (only – or _)

Allowed emails:

user@mycompany.com

firstname_lastname@mycompany.com

firstname-lastname@mycompany.com

The string we are trying to match here is:

[Any alpha numeric string with characters (- _) ]@mycompany.com

We can use the following regex: [A-Za-z0-9_-]+@mycompany\.com (we need to use \. to escape the dot character since using only . matches has another semantic of matching every character)

There are two groups we can focus on here:

The outer group: ( )+ matches at least one occurrence of the contents of the parentheses

The inner group: [A-Za-z0-9_-] matches either A-Z or a-z or 0-9 or _ or

followed by a fixed string @mycompany\.com

You can find and interact with the above regex example here.

Thanks for reading 🙂

Enjoy regexing!

Introduction to Functional Programming with Scheme (Part 1)

Scheme is a multi-paradigm language, developed in the 1970s by Guy L. Steele and Gerald Jay Sussman. Scheme is a minimal dialect of the LISP family, and although it supports both functional and procedural paradigms, Scheme is mainly functional. It is a language that is built with the purpose of learning about the core concepts of programming languages.

Procedural programming focuses on statements

Functional programming focuses on expressions

Expressions have values. A functional program is an expression who’s value is a sequence of instructions for the computer to carry out.

Statements don’t have values and instead modify the state of some conceptual machine.

omnimike

In this tutorial we will learn a dialect of Scheme called Racket.

To follow along, feel free to download DrRacket, the IDE of Racket. I will proceed to call Scheme and Racket terms interchangeably.

Racket

In programming languages such as C, Java and others, the procedure call is f(x,y)

In Scheme the equivalent is (f x y)

The basic primitives of Scheme are:

  • Booleans: #t, #f
  • Characters: #\a, #\B
  • Numbers: 123, 1.23e+10, 3/4
  • Vectors: #( 1 2 3 4 5)
  • Strings: “Hello world!”
  • Symbols: symbol-a, symbol-b
  • Pairs: (x . z)
  • Lists: (1 2 3 4 5)

To write comments we use ;

Writing our first procedure

+ is a built-in procedure, hence, to add two numbers we can call (+ 1 2) which yields the expression 3

We can create our own procedure that adds two to a number

#lang racket

(lambda (x) (+ x 2))
; Output => #<procedure>

Now this lambda is an anonymous function that we can use directly instead of the above +

#lang racket

((lambda (x) (+ x 2)) 1) 
; Output => 3

We can do better, we can store the lambda in a variable:

#lang racket

(define add2
  (lambda (x) (+ x 2)))
;And then we can reference it
(add2 1)
; Output => 3

Lets make a define a procedure the calculate the cube of a number

#lang racket

(define cube
   (lambda (x) (* x x x)))
;And then we can reference it
(cube 3)
; Output => 27

We use a quote ‘ if we would like to stop the evaluation of a list, for example, if we would like to find if a number exists in a list:

#lang racket
(member 10 '(1 4 5 10 9 7))
; If we don't use quote ', we will get an error because the compiler will evaluate it as (f arg1 arg2 arg3) and not as a list
; Output => '(10 9 7) ; It returns the sublist from which the number 10 begins

(member 50 '(1 4 5 10 9 7))

; Output => #f ; False since 50 doesn't exist within the list

Recursion in Scheme

The classical recursion Hello World is the factorial. Please note that if is a keyword, and not a procedure, (if cond a b) executes a if cond is true, otherwise b

(define (factorial n)
  (if (= n 0) ; Checking for the base case
      1 ; if the base case is true we return 1
      (* n (factorial (- n 1))))) ; otherwise we recurse passing n-1 as an argument

With that, we conclude Part 1 of the Introduction to Functional Programming with Scheme.

You can find the above code on my github

I would like to express my gratitude for professors Matteo Pradella & Michele Chiari who taught me the course Principles of Programming Languages at PoliMi

In the upcoming parts, we will dive deeper into Scheme and its constructs

Brace yourself for even more parentheses 😀

Thanks for reading! and stay tuned.

Introduction to Higher Order Functions

A higher order function (HOF) is a function that takes one or more functions as arguments or return a function as a result, or both. They are supported by a wide range of programming languages including the most popular ones such as Python, C++ and JavaScript.

An example of Higher Order Functions that typically built-in within programming languages are:

  1. map
  2. fold
  3. filter

Although they are simple concepts, knowing how to use them can very much improve the way you write code :). So let’s get started!

In this tutorial we’ll use Python, without any loss of generalization the same concepts apply to any other language.

1. map

The signature of map function in Python is map(function, iterable,)

map operates on iterables (lists, tuples, strings or any sequence that can be iterated on with a for loop)

map is used to apply a function to each item in the iterable, or in other words to map each item to something else.

Example: Let’s say we have this list of numbers, and we would to return zero for each negative number, and return the square of each positive number.

'''
Map function
'''
import math

list_of_nums = [1, 5, 10, -5, 8, -4, 3]

def my_function(x):
    if x < 0:
        return 0
    else:
        return math.pow(x, 2)

new_list_of_nums = list(map(my_function, list_of_nums))

print(new_list_of_nums)

#Output: [1, 25, 100, 0, 64, 0, 9]

2. reduce

reduce is the Pythonic version of foldl in Functional Programming, it can be used to apply operations that can accumulate the iterable, such as summing numbers, or concatenation of strings.

The reduce in Python is available in the functools module

The signature of reduce function in Python is reduce(function, iterable)

In reduce, the function we apply must have two arguments. This time we’ll use a lambda,

A lambda is an anonymous function that has the following syntax: lambda arguments : expression

In the following example, we are finding the sum of the first 100 natural numbers, the way this works is as follows it starts from the first item of the list from left to right:  ((((1+2)+3)+4)+5)...+100) = 5050

'''
Reduce function
'''
import functools

list_of_nums = list(range(1, 101))

sum_of_nums = functools.reduce(lambda x, y: x + y, list_of_nums)

print(sum_of_nums)

#Output: 5050

3. filter

The filter function as the name suggests, filters an iterable based on the boolean value returned by the function we pass.

The signature of filter function in Python is filter(function, iterable)

In the following example, we are filtering the list such that lowercase characters are removed.

'''
Filter function
'''

list_of_chars= ['A','B','c','D','e','F',]

filtered_list = list(filter(lambda x: x.isupper(), list_of_chars))

print(filtered_list)

#Output: ['A', 'B', 'D', 'F']

You can also find the code on my github.

Thanks for reading and happy coding!