Basics of Relational Algebra

There are many methods for modeling data and defining queries on it. You heard about SQL, yeah, but how about Relational Algebra? Try this article for a beginner-friendly introduction -and maybe bookmark it for quick reviews ‘till you get used to the logic!

What is a Relation?

Relation, under the concept of Computer Science, is any connection in between two or more data tables. You can create a relation by take a singular data or an entire data column from a table and add it into another one, adding multiple tables together to create one big table, or compare data columns in multiple tables by creating a connection between them. In short; you define a relation in between multiple data tables, combine data from those tables and end up with a new data.

What is a Relational Database?

A relational database is a database with pre-determined relations between data items. Each row on the data table on a relational database will represent an object or entity, while each column will represent an attribute of said object or entity.

While working with Relational Databases, we will have a terminology to follow. The columns on a data table, for example, will now be called an attribute. The rows are now tuples instead. If two tuples are exactly the same, we will call them duplicates.

A key is an attribute that identifies that table. Keys can be used to mark rows, make rows easier to manage, filter rows, access data within the table easily or even make rows from different tables related to each other. There are many types of keys;

What is Relational Algebra?

Every database management system must define a query language, a language that is used to access the data stored in the database and search within. There are multiple query language options such as SQL, CQL, and you have guessed right: Relational Algebra.

In relational algebra, the input will be the table from which we get the data and the output will be a temporary table holding the data the user asked for. Both the input and the output are relations. Working with operations in relational algebra show similarities and common points with sets in mathematics.

Relational algebra works through the entire data table, and therefore unlike other query languages such as SQL, you won’t have to use loops in order to iterate within the database one item at a time.

Operations

There are plenty of operations we can perform on the data using Relational Algebra or any other query language. For now, let’s focus on the basic operations we can perform using Relational Algebra.

σsubject = "health" ^ price = "450"(Books)

The command above will search from within the Books table and bring you the tuples where the subject attribute matches health and price attribute matches 450. Remember from the logical operations that ^ sign stands for “and” operation.

∏subject, author (Books)

The example above will bring you the attributes subject and author within the table Books. But if you have the multiple novels from Charles Dickens, it will print Novel, Charles Dickens only once.

∏ author (Books) ∪ ∏ author (Articles)

In the example above, the projection operator will project the author attribute from the Books table and the author attribute from the Articles table. The union operator in between will list the authors from the articles table right under the authors from the books table. However, if Jack London has both books and articles, his name will be listed once only since the union operator will eliminate duplicates just like the projection operator. Keep in mind that you can union multiple attributes from as many tables as you like.

∏ author (Books) − ∏ author (Articles)

Let’s say that the author attribute from Books table consists of Charles Dickens, Jack London and Emily Dickinson. And the author attribute from the Articles table consists of Jack London and Shereen Bhan. The outcome will be a table consisting of Charles Dickens and Emily Dickenson.

Let’s say that we have an id and an author attribute on both books and articles table. Books table consisting of 1, 2, 3 for Charles Dickens, Jack London and Emily Dickinson while Articles table consists of A, B for Jack London and Shereen Bhan.

Books Χ Articles

The line above will give us the following table;

1 Charles Dickens A Jack London

1 Charles Dickens B Shereen Bhan

2 Jack London A Jack London

2 Jack London B Shereen Bhan

3 Emily Dickinson A Jack London

3 Emily Dickinson B Shereen Bhan

Notice: Just like the sum and multiply operators we use in Math, these operators can also be used together.

ρ strawberry (∏ author (Books) ∪ ∏ author (Articles))

The example above will create a new data table with the names of all authors from Books table and Articles table combined, and name this new table “strawberry”.

∏ author (Books) ∩ ∏ author (Articles)

The example above will list out the common authors in both lists, in this case, Jack London.

Gym members ⋈ Zumba class members

Let’s say that the Gym members table consists of name, membership term while the Zuma class members table consists of name and sex. If the Gym members table consists of Adriana Lima, 3 months and Miranda Kerr, 4 months while Zumba class members consist of Adriana Lima, Female and Kelly Gale, Female. The combination table will take the common attribute “name” and give us the data table that consists of Adriana Lima, 3 months, Female.

Gym members ⟕ Zumba class members

The example above will give us the list below;

Adriana Lima, 3 months, Female

Miranda Kerr, 4 months, Null

Gym members ⟖ Zumba class members

The example above will give us the data table below;

Adriana Lima, Female, 3 months

Kelly Gale, Female, Null

Gym members ⟗ Zumba class members

The example above will give the following table as the outcome;

Adriana Lima, Female, 3 Months

Miranda Kerr, Null, 4 Months

Kelly Gale, Female, Null

Gym members ÷ ∏ name (Zumba class members)

The example above will first create a table with only the names of the Zumba class members: Adriana Lima and Kelly Gale. And then, it will go through the gym members list to find the tuples with the exact same data in the name attribute: Adriana Lima, 3 months. Finally, it will exclude the name attribute which is already common and the output table will only consist of the data “3 months”.

g sum(membership term)(Gym members)

The example above will give out the sum of all membership terms in the Gym members table, returning a data table named sum(membership term) with only one tuple, one which holds the value “7” months (3+4).

Changing the Database with Relational Algebra

There are multiple operations we can perform on the original database with the help of Relational Algebra.

Gym Members <- Gym Members - σname = "Adriana Lima" (Gym Members)

The example above will delete the tuple where the name is “Adriana Lima” from the Gym Members database, and push the output database on the original Gym Members database.

Gym Members <- Gym Members ∪ {("Stella Maxwell", 4)}

The example above will create a new data table with the values “Stella Maxwell” as name and “4” as membership term. Then it will combine the new data table with the Gym Members table and finally, push the outcome over the previous Gym Members table.

Gym Members <- ∏ name, membership term *2 (Gym Members)

The example above will double the membership terms on the Gym Members database while keeping the name attributes the exact same and then push the new table over the old Gym Members table.

I write about all that I’m interested in: Tech, Software, Physics, Math, and all development tools!