Table

What is table?

A table is a structured arrangement of data or information presented in rows and columns. It's commonly used to organize and display data in a clear and systematic manner, making it easier to read, analyze, and interpret. Tables are the main data storage location of Avalanchio that powers reports, dashboards, machine learning, semantic search and other forms of data analysis.

In the backend, Avalanchio stores the data in a distrbuted fault tolerant architecture. Data in the table are mutable, and can be looked up by id or secondary indexes in constant time. Tables support SQL queries, including multi table JOIN operations.

Key features of the tables

Support billions of records in a table. Data are stored in a compressed format. Tables that are primarily readonly and big in size, can be configured to be stored as compressed columnar file formats providing even greater scalability.
As soon as you create a table, REST API is created automatically for the table to support SQL queries and CRUD operations.
Tables supports secondary indexes. Secondary indexes speeds up certain lookup and range queries significantly, providing low latency queries at high concurrency usage scenarios.
Tables has a schema. A table supports thousands on columns. Each column has a type such as Integer, Bigint, Text, Textarea etc. The data types are validated on write. Check the documentation for supported field types.
Table supports Json data type. Queries can extract nested fields from the JSON fields.
Foreign keys can be defined as a Lookup data type to build relationship among tables, but the field values are not currently enforced during write.
You can encrypt sensitive data by turning on field encryption.
You can turn on field history tracking to record the historical values of field.

What are system fields?

System fields are automatically generated or managed by a system and used for internal purposes and are not directly editable by users. You can turn off the system fields if you find those fields are relevent for certain use cases.

Field	Type	Purpose
ID	Bigint	This is a unique value and acts a record identifier. You can provide a value during write process. If not provided, system will automatically generate one during the write process. System generated values are monotonically increasing sequential numbers.
Created By ID	Bigint	User ID who creates the record.
Created On	Timestamp	Timestamp at which the record is created. Like other timestamp values, it is stored as Unix epoch milliseconds.
Last Modified By ID	Bigint	User ID who has modified the record the last time.
Last Modified On	Timestamp	Timestamp at which the record is last modified.
Owner ID	Bigint	User ID who owns the record. Record ownership drives the record accessibility.
Name	Text	User friendly handle for record. In look up values, value in the fields are displayed to the user.

Table Configurations

Configuration	Description
Label	User-friendly name of the table. It is used for display purpose. The label can be changed for an existing table.
Name	API name of the table. It must be a valid SQL table name and it cannot be changed for an existing table.
Allow Create	Controls whether the users are allowed to create a new record in the table from the user interface.
Allow Edit	Controls whether the users are allowed to edit an existing record in the table from the user interface.
Allow Delete	Controls whether the users are allowed to delete an existing record from the table from the user interface.
Enable Field History Tracking	Controls whether field history are tracked. Enabling this configuration alone is not sufficient to track field history. You can to configure the columns (in the column configuration) to mark which columns will be tracked for value changes. If this setting is unset, the fields will not be tracked even if you turn on field history tracking at the table level.
Time Index Column	This column is used for the queries involving time, e.g. find events in the last 1 hour.
Skip System Fields	Skip the system generated audit fields - Created By User Id, Last Modified By User Id, Created On, Last Modified On and Owner Id. For each record, these fields can add 16-40 bytes of overhead. Skipping the system field will reduce the overall storage footprint.

image not found

Difference between table name and lable

Name: A table name refers to the name given to a specific table that stores data. Each table has a unique name, and the table name is used to identify and reference the table when performing operations such as querying or modifying data. The name must start with an alphabet and can only contain alpha-neumeric characters. No white space character or any special characters are allowed.

Example: the name for "Campaign Finance" table will be "campaign_finance".

Label: It is a user-friendly name of a table. It is used for end user reports.

Example: The label for the campaign_finance table will be "Campaign Finance" to make it clearer to users.

Table name convention

Must start with an alphabet
Can contain only alphabets, numeric digits or underscore (_)
Any form space character is not allowed
Any non-ascii character is not allowed
Must be at least 2 chacter wide
Must be at maximum 128 chacter wide

Who can manage tables in Avalanchio?

The administrator can create tables and configure the permissions to the standard users.

How to query a table?

API user can query the table using SQL query or lookup by ID. The end user can view records using query builder or reports.

Figure: SQL Query on table

Figure: Query builder on table records

Figure: Report on table

Truncate a table

Administrators can delete all the records from a table but keep the configuration of the table by truncating a table.

image not found

Drop a table

Dropping table will delete the table configuration as well as the records in it.

image not found

Column Configuration

An administrator can define custom columns on the table to build the data model that fits the use case. Following are the key features of the columns.

Key features of the column

Configuration	Description
Required	Columns marked "required" must have a valid value while saving.
Searchable	Columns marked "searchable" appear in the query builder screen as filtering option. SQL query has no effect of this configuration.
Enable Field Audit	Controls whether the values changes of the column are recorded. In the table configuration "Enable Field Audit" must be set for recording the changes.
Indexed	Build secondary index on this column. Only certain column type can be used for secondary indexes: Text, Email, Integer, Bigint, Date, Timestamp and UUID.
Unique	Column value must be unique in the table. The unique field must be indexed.
Encrypt	Encrypt the column value at the storage layer. It is uses AES/ECB/PKCS5Padding as encryption cipher. (1)
Mask	Masked fields are hidden to the user who does not have permission to view the masked fields. (1)
Length	Length of embedding vector. It is applicable for Text type and Embedding type columns.
Maximum Length	Maximum length of text field. It is applicable for Text and Textarea type columns.

(1) Please note, masked fields and the encryption fields always return NULL in the queries. The values are accessed while accessing the record by id.

image not found

Supported Column Types

Type	Description
Text	String data UTF-8 encoded values. Max length: 65,535
Textarea	String data UTF-8 encoded values. Max length: 4,294,967,295
Picklist	Picklist of choice values.
Boolean	True/False values. The field can be NULL as well.
Integer	Integer value. Value range from -2,147,483,648 to 2,147,483,647. Internally, it is stored as 4 byte integer.
Currency	Currency type data. Internally it is stored as 8 bytes signed floating point number.
Percent	Percentage value. Internally it is stored as 4 bytes signed floating point number.
Phone	Phone number
Email	Valid email text
URL	Valid URL text
Bigint	Signed integer value. Value range from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. Internally, it is stored as 8 byte integer.
Float	Signed floating point value. Value range: 1.175494351E-38 to 3.402823466E+38.Internally, it is stored as 4 byte integer with maximum precision 6-7 digits.
Double	Signed float point value. Value range: 2.2250738585072014E-308 to 1.7976931348623158E+308. Internally, it is stored as 4 bytes floating point number with maximum precision 15-16 digits.
Date	Date field type without time.
Timestamp	Timestamp field type containing date and time. Internally data is stored in milli-seconds precision. While sending to this send the timestamp either in epoch milliseconds value or timestamp in ISO 8601 format.
Json	Json value. Max length in the string format: 4,294,967,295
Lookup	Lookup relationship to other table. This fields stored the ID (Bigint) of the record. This field requires "Lookup Table" field to indicate the target table name.
Embedding	Embedding vector for the text field. It contains a vector of float32 numbers. Embedding field requires a length to validate vector length. Usually all records are expected to have same embedding length. To know more about Embedding type, look into
UUID	UUID type field. Internall it is stored as 16 byte data, while during the processing it is processed as 36 character value.
Blob	Binary data. Max length in bytes: 4,294,967,295. While sending the binary data over REST api, please send the data in Base64 encoded string format.

Note: the maximum length of the data types such as Text, Textarea, Blob, Json may be effective less because of other restriction such limit on the maximum byte size of a request, number of records per request etc.

What is the difference between column name and lable?

Name: The column name refers to the name assigned to a specific column in a table. It serves as an identifier for the column within the schema and is used to reference the column in SQL queries, data manipulation operations, and database administration tasks. Column names are unique within a table and follow naming conventions same as that of the table name.

Label: The label refers to the text or descriptive name assigned to a column in a user interface or report to help users understand the meaning or purpose of the column. It is typically displayed to users in forms, tables, reports, or other UI elements to provide context and guidance on the data being presented. Labels are often more user-friendly and descriptive than column names and may be customized based on the target audience or application requirements. Example: The label for the "campaign_id" column might be "Campaign Id" to make it clearer to users. Label does not follow the naming convention of the name.

While the name of the column cannot be changed, but the label can be changed.

Lookup Field

It is similar to foreign key in the relational database. If a column type is lookup then in the lookup table field the table name is mentioned to which column type is mapped to. Actual field value is a Bigint type (8 byte integer) which is the ID of the related record from the lookup table.

Secondary Index

A certain queries requiring sorting and filtering can be executed significantly faster by indexing the column. Internally, index values are stored seperately. For equality filters, the secondary indexes returns the data in constant time. For scan operations, such as those requiring inequality checks, are also improved. Just for illustration, while the main table can be scanned at a rate 80,000 records per second, the indexed values can be scanned at 400,000 records/second. Additionally, the some indexed values are stored in sorted manner, which help speed up the queries requiring sorting. You can index up to 5 columns per table.

However, it's important to note that creating indexes also has some drawbacks, including increased storage space requirements and potential performance degradation during data modification operations. Moreever, the index values require extra storage. Therefore, indexes should be used judiciously and only on columns that are frequently used in queries and provide significant performance benefits.

Composite indexes are not currently supported.

image not found

Drop columns

User can select the columns and delete them with "Delete Selected Columns" button. That will remove selected columns from the table. Note, the selected columns are deleted from the users, however, the columns are not truly deleted, but rather marked "Deleted". Storage for the deleted columns are not immediatedly released, while running repair, the storage allocated for the deleted columns are release. If the table is dropped, the data for both active and deleted columns along with any indexed data are deleted with immediate effect.

image not found

What happens if you turn on indexing on an existing column?

When you change the index settings such as marking a field as Indexed, or turning on unique property, the database automatically updates the existing record to take effect of the changes at the storage layer. However, the backend is distributed, some nodes may not be available during index updation, which can create inconsistency in the indexed data. To repair the index, you can explicitly run the reindex job to fix any potential issues. During the reindexing, the backend rebuilds the indexes.

Hot-Cold configuration

The hot storage layer of Avalanchio is optimized for rapid retrieval and frequent edits. However, if table records beyond a certain point in time—such as one year—are no longer frequently queried, the data can be exported using a hot-warm configuration. In this configuration, the warm data is stored as columnar Parquet files on HDD. While queries can still access this data, the latency will generally be higher compared to data stored in the hot storage layer.