Installing HBase in 2024

Initial version: 2024-10-08
Last update: 2024-10-08

In this tutorial you will learn how to install HBase.

Preface
Most of the steps follow the description given in my previous tutorial, Install and work with Apache HBase.

Communication from host to guest
My goal in this installation is to treat the virtual machine like a remote server: the virtual machine hosts HBase, serves web applications that use it, and so on, while client code runs on the host.

VirtualBox offers plenty of networking modes, two of which are of interest here: Bridged networking and NAT Network. The first allows you to run servers in a guest, while the second is mostly intended for communication initiated from the guest.

In my case the first and easiest option could not be applied, so I had to use a "trick" to work with NAT. The trick is well known as port forwarding. It turned out not to be hard; for details please refer to VirtualBox Network Settings: Complete Guide.

Figure: Port forwarding setup


The guest IP 10.0.2.15 is the default address that VirtualBox assigns to a virtual machine's first NAT adapter.

The first rule allows you to use ssh to connect from host to guest. The second will be used for communication with HBase.

It may sound silly, but remember to install ssh on the guest before you start testing. First test the connection from guest to guest:
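
Once both rules are in place, a quick way to check them from the host is to try opening a TCP connection to each forwarded port. The snippet below is only a minimal sketch: the helper name `port_is_open` is made up for illustration, 2222 is the host port used for ssh in this tutorial, and 9090 is assumed to be the host port of the second rule (it is the port the Python client uses later on).

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Host-side ends of the forwarding rules (assumed values, adjust to your setup).
    forwarded = {"ssh": 2222, "hbase thrift": 9090}
    for name, port in forwarded.items():
        state = "open" if port_is_open("127.0.0.1", port) else "closed"
        print(f"{name:13s} 127.0.0.1:{port} -> {state}")
```

Keep in mind that an open port only tells you that something accepts connections on the host side. It may be the forwarding rule alone, so it does not prove that the service inside the guest is answering; the ssh test described next is the real check.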


nosql@nosql-virtualbox:~$ ssh 127.0.0.1
The authenticity of host '127.0.0.1 (127.0.0.1)' can't be established.
ED25519 key fingerprint is SHA256:JutNl+moUSIwNnE2Tw6YiynbWh9Vl3MrOqmQEYsK1+s.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '127.0.0.1' (ED25519) to the list of known hosts.
nosql@127.0.0.1's password: 
Welcome to Ubuntu 24.04.1 LTS (GNU/Linux 6.8.0-45-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/pro

Expanded Security Maintenance for Applications is not enabled.

94 updates can be applied immediately.
45 of these updates are standard security updates.
To see these additional updates run: apt list --upgradable

8 additional security updates can be applied with ESM Apps.
Learn more about enabling ESM Apps service at https://ubuntu.com/esm

Last login: Fri Sep 27 00:53:09 2024 from 10.0.2.2
nosql@nosql-virtualbox:~$
Of course in this case you log into the same system. This is not important, because the only thing you care about is whether the connection is established successfully.


nosql@nosql-virtualbox:~$ exit
logout
Connection to 127.0.0.1 closed.
nosql@nosql-virtualbox:~$
If there are no errors or problems, you can try to connect from host to guest (fulmanp-ThinkPad-T540p is the host, nosql-virtualbox is the guest); remember to use the -p flag with the value 2222:


fulmanp@fulmanp-ThinkPad-T540p:~$ ssh -p 2222 nosql@127.0.0.1
The authenticity of host '[127.0.0.1]:2222 ([127.0.0.1]:2222)' can't be established.
ED25519 key fingerprint is SHA256:JutNl+moUSIwNnE2Tw6YiynbWh9Vl3MrOqmQEYsK1+s.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '[127.0.0.1]:2222' (ED25519) to the list of known hosts.
nosql@127.0.0.1's password: 
Welcome to Ubuntu 24.04.1 LTS (GNU/Linux 6.8.0-45-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/pro

Expanded Security Maintenance for Applications is not enabled.

0 updates can be applied immediately.

8 additional security updates can be applied with ESM Apps.
Learn more about enabling ESM Apps service at https://ubuntu.com/esm


The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

nosql@nosql-virtualbox:~$ cd ~/Desktop/nosql/
nosql@nosql-virtualbox:~/Desktop/nosql$ ls -l
total 8
drwxrwxr-x 8 nosql nosql 4096 Sep 26 23:39 hbase-2.5.10
drwxrwxr-x 4 nosql nosql 4096 Sep 26 23:39 tmp
nosql@nosql-virtualbox:~/Desktop/nosql$ exit
logout
Connection to 127.0.0.1 closed.
Install Java
You need to have both JRE and JDK installed (mostly you need the JRE, but the JDK is needed, for example, to use the jps command).

Figure: Synaptic and packages


After installation remember to set the $JAVA_HOME environment variable:


nosql@nosql-virtualbox:~/Desktop/nosql$ java -version
openjdk version "21.0.4" 2024-07-16
OpenJDK Runtime Environment (build 21.0.4+7-Ubuntu-1ubuntu224.04)
OpenJDK 64-Bit Server VM (build 21.0.4+7-Ubuntu-1ubuntu224.04, mixed mode, sharing)
nosql@nosql-virtualbox:~/Desktop/nosql$ echo $JAVA_HOME
[empty]
nosql@nosql-virtualbox:~/Desktop/nosql$ update-java-alternatives --list
java-1.21.0-openjdk-amd64      2111       /usr/lib/jvm/java-1.21.0-openjdk-amd64
Because in this case Java is installed in the /usr/lib/jvm/java-1.21.0-openjdk-amd64 directory, add the line below to conf/hbase-env.sh:


export JAVA_HOME=/usr/lib/jvm/java-1.21.0-openjdk-amd64
Alternatively, you can add it directly to the .bashrc file so that it is set permanently. If you want it to take effect right away, remember to use the source command:


nosql@nosql-virtualbox:~$ echo $JAVA_HOME

nosql@nosql-virtualbox:~$ source ~/.bashrc
nosql@nosql-virtualbox:~$ echo $JAVA_HOME
/usr/lib/jvm/java-1.21.0-openjdk-amd64
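
If you are not sure which directory to use as JAVA_HOME, you can also derive it from the java binary found on the PATH. The following is a minimal sketch, assuming the usual JDK layout where the binary lives in `<JAVA_HOME>/bin/java`; the function names are hypothetical.

```python
import os
import shutil
from typing import Optional


def java_home_from_binary(java_path: str) -> str:
    """Resolve symlinks and strip the trailing bin/java to get the Java root directory."""
    real = os.path.realpath(java_path)
    return os.path.dirname(os.path.dirname(real))


def detect_java_home() -> Optional[str]:
    """Return the Java root directory for the java binary on PATH, or None if there is none."""
    java = shutil.which("java")
    return java_home_from_binary(java) if java else None


if __name__ == "__main__":
    print(detect_java_home())
```

Because symlinks are resolved, the printed directory name may differ from the one listed by update-java-alternatives if that name is itself a symbolic link; both should point to the same installation.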


Install HBase on guest
  1. Create directory to store downloaded files
    
    nosql@nosql-virtualbox:~/Desktop$ mkdir ~/Desktop/install
    nosql@nosql-virtualbox:~/Desktop$ mkdir ~/Desktop/install/hbase
    
  2. Download files. I downloaded stable release 2.5.10, dated 2024.07.24. It was a real surprise for me, because it means that HBase is still maintained and under active development:

    Figure: HBase download page (September, 2024)


  3. Verify file integrity
    
    nosql@nosql-virtualbox:~/Desktop/install/hbase$ sha512sum hbase-2.5.10-bin.tar.gz
    [... s1: SHA 512 SUM of downloaded binary file ...]
    nosql@nosql-virtualbox:~/Desktop/install/hbase$ cat hbase-2.5.10-bin.tar.gz.sha512
    [... s2: Correct SHA 512 SUM of a binary file ...]
    
    Of course s1 must agree with s2.
  4. Extract archive and move to final destination (~/Desktop/nosql/ in my case):
    
    nosql@nosql-virtualbox:~/Desktop/install/hbase$ tar zxvf hbase-2.5.10-bin.tar.gz
    nosql@nosql-virtualbox:~/Desktop/install/hbase$ mkdir ~/Desktop/nosql
    nosql@nosql-virtualbox:~/Desktop/install/hbase$ mv hbase-2.5.10 ~/Desktop/nosql/
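
The comparison of s1 and s2 from step 3 can also be done by a script instead of by eye. Below is a minimal sketch using Python's hashlib. The layout of the .sha512 file is an assumption: it may contain the digest as one hex string or split into space-separated groups over several lines, so the sketch strips all whitespace and ignores case before comparing. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha512_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(archive: str, checksum_file: str) -> bool:
    """Check whether the archive's digest appears in the published checksum file."""
    # Remove all whitespace and lowercase the text, so that both a plain
    # "digest  filename" line and a grouped, multi-line digest are handled.
    published = "".join(Path(checksum_file).read_text().split()).lower()
    return sha512_of(archive) in published
```

Called from the download directory as `checksum_matches('hbase-2.5.10-bin.tar.gz', 'hbase-2.5.10-bin.tar.gz.sha512')`, it should return True for an intact download.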
    


Test the basic functionality of HBase
  1. Start HBase:
    
    nosql@nosql-virtualbox:~/Desktop/nosql$ /home/nosql/Desktop/nosql/hbase-2.5.10/bin/start-hbase.sh
    running master, logging to /home/nosql/Desktop/nosql/hbase-2.5.10/bin/../logs/hbase-nosql-master-nosql-virtualbox.out
    
  2. Verify that all HBase processes are running:
    
    nosql@nosql-virtualbox:~/Desktop/nosql$ jps
    6134 Jps
    5436 HMaster
    
  3. Start HBase Shell:
    
    nosql@nosql-virtualbox:~/Desktop/nosql$ hbase-2.5.10/bin/hbase shell
    HBase Shell
    Use "help" to get list of supported commands.
    Use "exit" to quit this interactive shell.
    For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
    Version 2.5.10, ra3af60980c61fb4be31e0dcd89880f304d01098a, Thu Jul 18 22:45:17 PDT 2024
    Took 0.0169 seconds                                                                                                              
    hbase:001:0> list
    TABLE                                                                                                                            
    0 row(s)
    Took 0.6565 seconds                                                                                                              
    => []
    hbase:002:0> create_namespace 'test'
    Took 0.2385 seconds                                                                                                              
    hbase:003:0> create 'test:table1', 'f1'
    Created table test:table1
    Took 0.7019 seconds                                                                                                              
    => Hbase::Table - test:table1
    hbase:004:0> list
    TABLE                                                                                                                            
    test:table1                                                                                                                      
    1 row(s)
    Took 0.0193 seconds                                                                                                              
    => ["test:table1"]
    hbase:005:0> put 'test:table1', 'key005', 'f1:column_a', 'value_005_f1_a'
    Took 0.2712 seconds                                                                                                              
    hbase:006:0> scan 'test:table1'
    ROW                               COLUMN+CELL                                                                                    
     key005                           column=f1:column_a, timestamp=2024-09-26T23:56:09.939, value=value_005_f1_a                    
    1 row(s)
    Took 0.0981 seconds                                                                                                              
    hbase:007:0> exit
    
  4. Connect again to check if your changes in database are still there:
    
    nosql@nosql-virtualbox:~/Desktop/nosql$ hbase-2.5.10/bin/hbase shell
    HBase Shell
    Use "help" to get list of supported commands.
    Use "exit" to quit this interactive shell.
    For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
    Version 2.5.10, ra3af60980c61fb4be31e0dcd89880f304d01098a, Thu Jul 18 22:45:17 PDT 2024
    Took 0.0029 seconds                                                                                                              
    hbase:001:0> scan 'test:table1'
    ROW                               COLUMN+CELL                                                                                    
     key005                           column=f1:column_a, timestamp=2024-09-26T23:56:09.939, value=value_005_f1_a                    
    1 row(s)
    Took 0.7532 seconds                                                                                                              
    hbase:002:0> exit
    
    I have had some problems with disappearing data: sometimes my database is empty after a restart. Running the following just before stopping the database (not just before exiting the shell) may help:

    
    hbase memstore manual flush
    hbase:003:0> flush 'test:table1'
    Took 0.5539 seconds
    
  5. Run Thrift to allow communication from different programming languages:
    
    nosql@nosql-virtualbox:~/Desktop/nosql$ cd hbase-2.5.10/bin/
    nosql@nosql-virtualbox:~/Desktop/nosql/hbase-2.5.10/bin$ ./stop-hbase.sh 
    stopping hbase.............
    nosql@nosql-virtualbox:~/Desktop/nosql/hbase-2.5.10/bin$ ./hbase-daemon.sh start thrift
    running thrift, logging to /home/nosql/Desktop/nosql/hbase-2.5.10/bin/../logs/hbase-nosql-thrift-nosql-virtualbox.out
    nosql@nosql-virtualbox:~/Desktop/nosql/hbase-2.5.10/bin$ ./start-hbase.sh 
    running master, logging to /home/nosql/Desktop/nosql/hbase-2.5.10/bin/../logs/hbase-nosql-master-nosql-virtualbox.out
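
After restarting everything it is worth repeating the jps check from step 2. This can be automated by parsing the jps output, as in the minimal sketch below. The function names are hypothetical, and the process name `ThriftServer` is an assumption about how the Thrift daemon shows up in jps; `HMaster` is the name seen in the output above.

```python
def java_process_names(jps_output: str) -> set:
    """Extract process names from jps output lines of the form '<pid> <name>'."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2 and parts[0].isdigit():
            names.add(parts[1].strip())
    return names


def missing_processes(jps_output: str, expected=("HMaster", "ThriftServer")) -> list:
    """Return the expected process names that do not appear in the jps output."""
    return sorted(set(expected) - java_process_names(jps_output))


if __name__ == "__main__":
    # Sample taken from step 2, where Thrift was not started yet.
    sample = "6134 Jps\n5436 HMaster\n"
    print(missing_processes(sample))
```

On the guest you could feed it the real output, for example the text captured with `subprocess.run(['jps'], capture_output=True, text=True).stdout`.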
    


Write Python script and execute it on host
  1. Create working directory
    
    fulmanp@fulmanp-ThinkPad-T540p:~/Desktop$ mkdir nosql
    
  2. Create directory for virtual environments configuration:
    
    fulmanp@fulmanp-ThinkPad-T540p:~/Desktop$ mkdir python_virtual_environments
    fulmanp@fulmanp-ThinkPad-T540p:~/Desktop$ cd python_virtual_environments/
    
  3. Create virtual environment:
    
    fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/python_virtual_environments$ python3 -m venv nosql
    
    I got the following message:

    
    The virtual environment was not created successfully because ensurepip is not
    available.  On Debian/Ubuntu systems, you need to install the python3-venv
    package using the following command.
    
        apt install python3.12-venv
    
    You may need to use sudo with that command.  After installing the python3-venv
    package, recreate your virtual environment.
    
    Failing command: /home/fulmanp/Desktop/python_virtual_environments/nosql/bin/python3
    
    so I ran sudo apt install python3.12-venv:

    
    fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/python_virtual_environments$ sudo apt install python3.12-venv
    [sudo] password for fulmanp: 
    Reading package lists... Done
    Building dependency tree... Done
    Reading state information... Done
    [...]
    After this operation, 2 771 kB of additional disk space will be used.
    Do you want to continue? [Y/n] 
    Get:1 http://pl.archive.ubuntu.com/ubuntu noble/universe amd64 python3-pip-whl all 24.0+dfsg-1ubuntu1 [1 702 kB]
    [...]
    Setting up python3.12-venv (3.12.3-1ubuntu0.2) ...
    
    and then again:

    
    fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/python_virtual_environments$ python3 -m venv nosql
    
  4. Activate virtual environment:
    
    fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/python_virtual_environments$ source /home/fulmanp/Desktop/python_virtual_environments/nosql/bin/activate
    (nosql) fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/python_virtual_environments$
    
  5. Add system components needed by happybase (which is used to simplify communication from Python to HBase):
    
    (nosql) fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/python_virtual_environments$ sudo apt-get update
    (nosql) fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/python_virtual_environments$ sudo apt-get install gcc
    (nosql) fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/python_virtual_environments$ sudo apt-get install python3-dev
    
  6. Install happybase:
    
    (nosql) fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/python_virtual_environments$ python3 -m pip install happybase
    
  7. Create simple test program:
    
    (nosql) fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/python_virtual_environments$ cd ../nosql/
    (nosql) fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/nosql$ ls -l
    total 0
    (nosql) fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/nosql$ touch hbase_simple_test.py
    (nosql) fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/nosql$ ls -l
    total 4
    -rw-rw-r-- 1 fulmanp fulmanp 575 wrz 27 01:21 hbase_simple_test.py
    
    Paste the following code as hbase_simple_test.py content:

    
    import happybase
    
    def test():
        connection = happybase.Connection(host='127.0.0.1',
                                          port=9090,
                                          autoconnect=True)
    
        print(f'Tables: {connection.tables()}')
    
        table = connection.table('test:table1')
    
        rows = table.rows([b'key005'])
        for key, data in rows:
          print(f'row key={key}, data={data}')
          for k in data:
            print(f'  data key={k}')
            print(f'  data[{k}]={data[k]}')
            
            print(f'  decode data key: {k.decode("utf-8")}')
            # Get value and decode it:
            val = data[k].decode("utf-8")
            print(f'  Decoded value={val}')
            print(f'  Encoded decoded value: {val.encode("utf-8")}')
            print('  ---')
    
    
    if __name__ == '__main__':
        test()
    
  8. Execute the simple test program. When I first tried to execute it:

    
    (nosql) fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/nosql$ python3 hbase_simple_test.py
    
    I got a message:

    
    (nosql) fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/nosql$ python3 hbase_simple_test.py 
    Traceback (most recent call last):
      File "/home/fulmanp/Desktop/nosql/hbase_simple_test.py", line 1, in <module>
        import happybase
      File "/home/fulmanp/Desktop/python_virtual_environments/nosql/lib/python3.12/site-packages/happybase/__init__.py", line 6, in <module>
        import pkg_resources as _pkg_resources
    ModuleNotFoundError: No module named 'pkg_resources'
    
    I had to install setuptools:

    
    (nosql) fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/nosql$ python3 -m pip install setuptools
    
    Now execution was successful:

    
    (nosql) fulmanp@fulmanp-ThinkPad-T540p:~/Desktop/nosql$ python3 hbase_simple_test.py 
    Tables: [b'test:table1']
    row key=b'key005', data={b'f1:column_a': b'value_005_f1_a'}
      data key=b'f1:column_a'
      data[b'f1:column_a']=b'value_005_f1_a'
      decode data key: f1:column_a
      Decoded value=value_005_f1_a
      Encoded decoded value: b'value_005_f1_a'
      ---
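
The test program above only reads data. Writing from the host works the same way through happybase; the following is a minimal sketch, not a definitive implementation. The row key `key006`, its value, and the helper names `put_and_scan` and `demo` are made up for illustration. The host, port and table name are the ones used earlier in this tutorial.

```python
def put_and_scan(table, row_key: bytes, cells: dict) -> dict:
    """Store cells under row_key, then return all rows of the table as a dict."""
    table.put(row_key, cells)
    return {key: data for key, data in table.scan()}


def demo():
    # Imported here so that put_and_scan can be reused without happybase installed.
    import happybase

    connection = happybase.Connection(host='127.0.0.1', port=9090)
    try:
        table = connection.table('test:table1')
        # Hypothetical row; column names must use an existing family (f1).
        rows = put_and_scan(table, b'key006', {b'f1:column_a': b'value_006_f1_a'})
        for key, data in rows.items():
            print(f'row key={key}, data={data}')
    finally:
        connection.close()
```

To try it, call `demo()` at the end of the file while HBase and Thrift are running on the guest. Afterwards a scan in the HBase shell should show the new row next to key005.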