Chapter 6. Database tuning

Table of Contents

Creating large databases
Databases limits
Tuning the host system
Creating multi-volumes databases using datafiles and dataspaces
Managing indexes
Creating and deleting indexes
Listing, updating and moving indexes
Getting statistics on indexes
Getting and setting index default dataspace
Managing collections
EyeDB collection implementations
Getting and setting collection default implementation
Getting and updating a particular collection implementation
Getting statistics on a collection implementation
Getting and setting collection default dataspace
Managing locations
What is a location in EyeDB?
Managing instances locations
Managing attributes locations
Managing collections locations
Managing indexes locations

This chapter explains how to tune EyeDB to improve performance, in terms of either system response time or database size.

The first section describes how to build multi-volume databases that can contain hundreds of terabytes of data. The following two sections explain how to manage indexes and collections to improve system response time. The last section describes how to specify a location (i.e. a dataspace) for EyeDB entities such as class instances, attributes, collections and indexes.

Creating large databases

Creating large databases (i.e. larger than a terabyte) requires partitioning the database across several volumes, which can be either physical disks or partitions of a disk. EyeDB supports multi-volume databases using datafiles (see the section called “Managing datafiles”) and dataspaces (see the section called “Managing dataspaces”).

Databases limits

Tuning the host system

Note that the host system imposes resource limits on the processes running on it, in particular on process memory size and process virtual memory size. As EyeDB file access relies heavily on virtual memory and file mapping, these limits must be either removed or set to a high value.

On Linux, resource limits can be retrieved and set with the getrlimit and setrlimit functions, which can be called from a C or C++ program. For a detailed description of these functions, consult the getrlimit(2) and setrlimit(2) manual pages.
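As an illustration of the getrlimit and setrlimit calls mentioned above, here is a minimal sketch in C. The helper names print_limit and raise_soft_limit_to_hard are hypothetical and are not part of EyeDB. The sketch assumes that RLIMIT_RSS and RLIMIT_AS are the limits reported by ulimit -m and ulimit -v respectively. Note that getrlimit reports values in bytes, whereas bash ulimit reports them in units of 1024 bytes.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: print the soft limit of a resource.
   Values returned by getrlimit are expressed in bytes. */
void print_limit(const char *name, int resource)
{
    struct rlimit lim;

    if (getrlimit(resource, &lim) != 0) {
        perror("getrlimit");
        return;
    }
    if (lim.rlim_cur == RLIM_INFINITY)
        printf("%s: unlimited\n", name);
    else
        printf("%s: %llu bytes\n", name, (unsigned long long)lim.rlim_cur);
}

/* Hypothetical helper: raise the soft limit of a resource to its hard
   limit, which an unprivileged process is always allowed to do.
   Returns 0 on success, -1 on failure (errno is set). */
int raise_soft_limit_to_hard(int resource)
{
    struct rlimit lim;

    if (getrlimit(resource, &lim) != 0)
        return -1;
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(resource, &lim);
}
```

A program would typically call raise_soft_limit_to_hard(RLIMIT_AS) and raise_soft_limit_to_hard(RLIMIT_RSS) at startup, before opening a database. Raising the hard limit itself requires appropriate privileges and is not covered by this sketch.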

When using command line tools, an easy way to retrieve and set resource limits is to use the bash ulimit command. For a further description of this command, consult the bash(1) manual page.

Two resource limits must be set:

  • the maximum memory size, which can be retrieved with the command ulimit -m

  • the maximum virtual memory size, which can be retrieved with the command ulimit -v

Example 6.1. Retrieving and setting resource limits using ulimit

ulimit -m
1024
ulimit -m unlimited
ulimit -m
unlimited
ulimit -v
8192
ulimit -v unlimited
ulimit -v
unlimited

Creating multi-volumes databases using datafiles and dataspaces