Introduction
Apache Cassandra uses CQL (Cassandra Query Language) for communicating with its database. CQL is similar to SQL in that it also presents data in tables, organized into rows and columns.
Every column in a Cassandra table has an assigned data type that defines the type (or range) of the values it can store and the operations that can be performed on those values without causing an error.
Read on to learn about Cassandra data types and how they differ.
Cassandra Data Types
Apache Cassandra supports a rich set of data types, including:
- Built-in data types
- Collection data types
- User-defined data types
Note: Apache Cassandra is a Wide-Column NoSQL database. If you want to learn more about these types of databases, read NoSQL database types. And if you are interested in NoSQL core concepts and features, refer to What is NoSQL.
Built-In Data Types
Cassandra has many data types for which it provides built-in support. These are also referred to as primitive data types. They come pre-defined and you can directly refer to any of them.
Data Type | Constants | Description |
---|---|---|
ascii | strings | The ASCII data type stores character strings restricted to the US-ASCII character set, in which a numeric code represents each character (for instance, T is 84). US-ASCII is a 7-bit encoding that covers 128 characters, so strings that contain any other characters need the TEXT data type. |
boolean | booleans | BOOLEAN is used for columns that hold one of two possible values: true or false. |
blob | blobs | BLOB is short for “Binary Large Object” and is used for storing arbitrary bytes, written as hexadecimal constants (for example, 0xCAFE). It can hold binary data such as images, videos, and audio files. Because of their size, such values require more space than other data types, so blobs are usually best kept small. |
decimal | integers, floats | DECIMAL is a variable-precision decimal type, which makes it convenient for storing currency data. A value has two components: precision (the total number of digits: 4 in 5.754) and scale (the number of digits after the decimal point: 3 in 5.754). It stores the value 5.754 as two separate units: the unscaled integer 5754 and the scale 3. |
double | integers, floats | If you need to store fractional values that do not require the exactness of currency values, you can use the DOUBLE data type. It represents a 64-bit double-precision (IEEE 754) floating point number. |
float | integers, floats | The FLOAT data type stores fractional values as a 32-bit single-precision (IEEE 754) floating point number. You shouldn’t use it with data that requires high accuracy, since it is less precise than both the DOUBLE and DECIMAL data types. |
int | integers | INT data type is used to store 32-bit signed integers. |
smallint | integers | SMALLINT stores 16-bit signed integers. |
bigint | integers | BIGINT stores 64-bit signed integers. |
text | strings | Use the TEXT data type for text data. It stores a UTF-8 encoded string. |
varchar | strings | VARCHAR is an alias for TEXT and likewise stores a UTF-8 encoded string. Unlike VARCHAR in SQL, it does not take a maximum length. |
inet | strings | Use the INET data type to store and manage IP addresses, written as strings. INET can store both IPv4 and IPv6 host addresses. |
counter | integers | The COUNTER data type holds a 64-bit signed integer in a counter column. You cannot set its value directly; it supports only two operations, incrementing and decrementing, and is commonly used to count page views. |
time | integers, strings | The TIME data type stores a time of day without a date. You can write it as a string in the format hh:mm:ss (optionally with fractional seconds) or as an integer number of nanoseconds since midnight. It offers nanosecond precision. |
date | integers, strings | The DATE data type stores a calendar date without a time of day, written as a string in the format YYYY-MM-DD. This data type also accepts integer constants. |
timestamp | integers, strings | The TIMESTAMP data type stores a single point in time that includes both the date and the time, with millisecond precision. You can write it as a string, for example in the format YYYY-MM-DD hh:mm:ss, or as an integer number of milliseconds since the Unix epoch (January 1, 1970 UTC). |
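To make the numeric rows in the table more concrete, here is a minimal Python sketch (not Cassandra code) that derives the integer ranges from the bit widths and compares how 32-bit, 64-bit, and decimal representations hold the value 5.754. The helper names `signed_range` and `as_float32` are made up for this illustration, and Python's `struct` and `decimal` modules only stand in for the corresponding Cassandra types.

```python
import struct
from decimal import Decimal

# Bit widths of the signed integer types listed in the table.
INTEGER_BITS = {"smallint": 16, "int": 32, "bigint": 64}


def signed_range(bits):
    """Return the (min, max) pair a signed integer of the given width can hold."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for name, bits in INTEGER_BITS.items():
    low, high = signed_range(bits)
    print(f"{name:<8} {low} .. {high}")


def as_float32(value):
    """Round-trip a Python float (64-bit) through a 32-bit IEEE 754 encoding."""
    return struct.unpack(">f", struct.pack(">f", value))[0]


price = "5.754"
print("float  :", as_float32(float(price)))  # 32-bit: rounding error becomes visible
print("double :", float(price))              # 64-bit: closer, but still a binary approximation
print("decimal:", Decimal(price))            # exact decimal digits

# A decimal value is an unscaled integer plus a scale.
sign, digits, exponent = Decimal(price).as_tuple()
unscaled = int("".join(map(str, digits)))
scale = -exponent
print(unscaled, scale)  # 5754 3
```

The 32-bit round trip changes the value slightly, which is why float and double are a poor fit for currency amounts, while the decimal representation keeps every digit.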
Note: To learn more about Cassandra, see Cassandra vs MongoDB.
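The integer forms of the time and timestamp types from the table can be sketched in plain Python as well. This is an illustration of the arithmetic only, assuming nanoseconds since midnight for time and milliseconds since the Unix epoch for timestamp; the function names are hypothetical, and the integer form of date is left out.

```python
from datetime import datetime, time, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def time_to_nanos(t):
    """Nanoseconds since midnight for a time of day."""
    seconds = t.hour * 3600 + t.minute * 60 + t.second
    return seconds * 1_000_000_000 + t.microsecond * 1_000


def timestamp_to_millis(dt):
    """Milliseconds since the Unix epoch for a timezone-aware datetime."""
    delta = dt - EPOCH
    return delta.days * 86_400_000 + delta.seconds * 1_000 + delta.microseconds // 1_000


print(time_to_nanos(time(8, 12, 54)))
# 29574000000000
print(timestamp_to_millis(datetime(2021, 3, 15, 8, 12, 54, tzinfo=timezone.utc)))
# 1615795974000
```

The timestamp helper truncates anything finer than a millisecond, which mirrors the lower precision of timestamp compared to time.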
Collection Data Types
You can use one of the collection data types if you want to store multiple values in a single column.
Cassandra supports three kinds of collection data types:
- Maps. Cassandra can store data as sets of key-value pairs using the Map data type. Each key is unique within the map and labels its value, so you can update or remove an individual entry by its key. Cassandra returns the entries sorted by key.
- Sets. You can store multiple unique values using the Set data type. Bear in mind that the elements are not kept in the order you inserted them; Cassandra returns them sorted by value.
- Lists. If you need to store multiple values in a specific order, you can use the List data type. Unlike sets, lists can store duplicate values.
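As a rough analogy rather than a picture of Cassandra's storage format, the three collection types behave much like Python's dict, set, and list. The sketch below uses a hypothetical `users` table (shown only in a comment) and hypothetical `read_back_*` helpers to mimic the order in which values come back when read.

```python
# Hypothetical CQL table that the Python values below stand in for:
#   CREATE TABLE users (
#       id uuid PRIMARY KEY,
#       phones map<text, text>,
#       emails set<text>,
#       logins list<text>
#   );

phones = {"work": "555-0101", "home": "555-0100"}             # map<text, text>
emails = {"b@example.com", "a@example.com", "b@example.com"}  # set<text>
logins = ["laptop", "phone", "laptop"]                        # list<text>


def read_back_map(pairs):
    """Maps have unique keys and come back sorted by key."""
    return dict(sorted(pairs.items()))


def read_back_set(values):
    """Sets drop duplicates and come back sorted by value."""
    return sorted(set(values))


def read_back_list(values):
    """Lists keep both the insertion order and any duplicates."""
    return list(values)


print(read_back_map(phones))
print(read_back_set(emails))
print(read_back_list(logins))
```

The duplicate email disappears from the set, while the duplicate login survives in the list, which is the main practical difference between the two types.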
User-Defined Data Type
The last group of Cassandra data types is user-defined data types (UDTs). As the name suggests, a UDT allows you (the user) to create your own data type based on your requirements.
A UDT consists of multiple named fields, each with its own data type, stored inside a single column. Once you create your user-defined data type, you can add new fields to it or rename existing ones with ALTER TYPE. Individual fields cannot be removed, but you can drop the whole type with DROP TYPE once nothing uses it.
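To show what such a type definition looks like, the following Python sketch renders the CQL statements for a hypothetical `address` type. The helper functions and the field names are made up for illustration; only the `CREATE TYPE` and `ALTER TYPE ... ADD` statement shapes come from CQL.

```python
def create_type_cql(name, fields):
    """Render a CQL CREATE TYPE statement from a {field_name: cql_type} mapping."""
    body = ", ".join(f"{field} {cql_type}" for field, cql_type in fields.items())
    return f"CREATE TYPE {name} ({body});"


def add_field_cql(name, field, cql_type):
    """Render the CQL statement that adds one field to an existing type."""
    return f"ALTER TYPE {name} ADD {field} {cql_type};"


# Hypothetical UDT describing a postal address.
address_fields = {"street": "text", "city": "text", "zip_code": "int"}

print(create_type_cql("address", address_fields))
# CREATE TYPE address (street text, city text, zip_code int);
print(add_field_cql("address", "country", "text"))
# ALTER TYPE address ADD country text;
```

A table column can then use the new type by name, in the same way it would use a built-in type.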
Conclusion
Now you should have an understanding of the primary groups of data types used in Apache Cassandra: built-in, collection, and user-defined. This NoSQL database offers a broad range of data types to use when storing and managing your data.
If you would like to test this out, check out our article on how to install Cassandra on Ubuntu or how to install Cassandra on your Windows machine.