Accumulo on Hortonworks Sandbox
Accumulo is not included in the Ambari installation, so it has to be installed manually. If you want to do some development with it, the quickest way to get an instance up and running is the Hortonworks Sandbox. However, due to differences in installation procedures, getting this working isn’t quite as straightforward as it could be.
Here are some notes on the procedure to help you on your way.
Prerequisites:
Download the Hortonworks Sandbox and start it in your virtual machine manager; I’m using VirtualBox here. Networking settings are important too: I set the adapter to NAT so that the VM runs on a 10.0.2.0 network and the management web pages are reachable from your host at http://127.0.0.1/. This keeps everything simple, and external repos can be reached through the host’s internet connection.
- Sandbox address: http://127.0.0.1:8000/about/
- Ambari address: http://127.0.0.1:8080/
Go to the Ambari management page (login is admin/admin) and verify that the services we need are up and running: HDFS, MapReduce2, YARN and ZooKeeper. I also like to start Ambari Metrics and its collector so that I can see the activity, but it’s not required.
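The same check can be scripted against Ambari’s REST API instead of clicking through the web page. The sketch below is only a minimal example, and it rests on assumptions that are not part of these notes: that the cluster is named Sandbox, that the services are registered as HDFS, MAPREDUCE2, YARN and ZOOKEEPER, and that curl is available on the host. The helper names service_state and check_services are made up for illustration.

```shell
#!/bin/sh
# Assumed values: adjust them to your own setup.
AMBARI_URL="http://127.0.0.1:8080"
CLUSTER="Sandbox"            # assumption: default cluster name on the sandbox
CREDENTIALS="admin:admin"

# Read an Ambari service JSON document on stdin and print its state,
# for example STARTED or INSTALLED.
service_state() {
    grep -o '"state" *: *"[A-Z_]*"' | head -n 1 | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Query each required service and print "<service> <state>".
# Prints UNKNOWN when no state could be read (e.g. Ambari is unreachable).
check_services() {
    for service in HDFS MAPREDUCE2 YARN ZOOKEEPER; do
        state=$(curl -s -u "$CREDENTIALS" \
            "$AMBARI_URL/api/v1/clusters/$CLUSTER/services/$service?fields=ServiceInfo/state" \
            | service_state)
        echo "$service ${state:-UNKNOWN}"
    done
}
```

Calling check_services after sourcing the file should list each of the four services with the state STARTED; anything else means the service still needs to be started from the Ambari page.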
Procedure:
- Log in via ssh to the sandbox, login root/hadoop.
- Install the Accumulo package:
    yum install accumulo
- Accumulo is installed under /usr/hdp/2.2.4.2-2/accumulo/ (version numbers may differ).
- Copy an example configuration set to the root config directory. Select a configuration according to your memory constraints, but always pick a standalone set, e.g.
    cp /usr/hdp/2.2.4.2-2/accumulo/conf/examples/2GB/standalone/* /usr/hdp/2.2.4.2-2/accumulo/conf
- Edit the file accumulo-env.sh and set the following variables accordingly.
    JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64
    HADOOP_PREFIX=/usr/hdp/2.2.4.2-2/hadoop
    ZOOKEEPER_HOME=/usr/hdp/2.2.4.2-2/zookeeper
    test -z "$ACCUMULO_HOME" && export ACCUMULO_HOME=/usr/hdp/2.2.4.2-2/accumulo

    ACCUMULO_MONITOR_BIND_ALL="true"
- Edit the file accumulo-site.xml and set the value tags shown below to hadoop. This is very important so that Accumulo can interact with ZooKeeper.
    <property>
      <name>instance.secret</name>
      <value>hadoop</value>
      <description>A secret unique to a given instance that all servers must know in order to communicate with one another. Change it before initialization. To change it later use ./bin/accumulo org.apache.accumulo.server.util.ChangeSecret --old [oldpasswd] --new [newpasswd], and then update this file.</description>
    </property>

    <property>
      <name>trace.token.property.password</name>
      <value>hadoop</value>
      <!-- change this to the root user's password, and/or change the user below -->
    </property>
- Now we have to change the accumulo user properties. Edit /etc/passwd and change:
    accumulo:x:495:486:accumulo:/var/lib/accumulo:/bin/bash
  to:
    accumulo:x:495:501:accumulo:/home/accumulo:/bin/bash
- Create the home directory (you need to su - hdfs to run the hadoop commands).
    mkdir -p /home/accumulo/data
    hadoop fs -mkdir -p /home/accumulo/data
- Change permissions and ownership.
    chown accumulo:hadoop /home/accumulo
    chown accumulo:hadoop /home/accumulo/data
    hadoop fs -chown -R accumulo:hadoop /home/accumulo/data
    chmod 777 /home/accumulo/data
    hadoop fs -chmod -R 777 /home/accumulo/data
- Now you are ready to initialize Accumulo. This step writes the configuration information into ZooKeeper.
    su - accumulo
    cd /usr/hdp/2.2.4.2-2/accumulo/conf
    . ./accumulo-env.sh
    accumulo init
- When prompted, enter the instance name, which can be anything you like, and the secret (the initial root password prompt in the output below), which must be hadoop.
    2015-05-21 16:09:27,651 [fs.VolumeManagerImpl] WARN : dfs.datanode.synconclose set to false in hdfs-site.xml: data loss is possible on hard system reset or power loss
    2015-05-21 16:09:27,653 [init.Initialize] INFO : Hadoop Filesystem is hdfs://sandbox.hortonworks.com:8020
    2015-05-21 16:09:27,654 [init.Initialize] INFO : Accumulo data dirs are [hdfs://sandbox.hortonworks.com:8020/accumulo]
    2015-05-21 16:09:27,654 [init.Initialize] INFO : Zookeeper server is localhost:2181
    2015-05-21 16:09:27,654 [init.Initialize] INFO : Checking if Zookeeper is available. If this hangs, then you need to make sure zookeeper is running
    Instance name : horton
    Enter initial password for root (this may not be applicable for your security setup): ******
    Confirm initial password for root: ******
    2015-05-21 16:11:46,827 [Configuration.deprecation] INFO : dfs.replication.min is deprecated. Instead, use dfs.namenode.replication.min
    2015-05-21 16:11:47,156 [Configuration.deprecation] INFO : dfs.block.size is deprecated. Instead, use dfs.blocksize
    2015-05-21 16:11:47,598 [conf.AccumuloConfiguration] INFO : Loaded class : org.apache.accumulo.server.security.handler.ZKAuthorizor
    2015-05-21 16:11:47,600 [conf.AccumuloConfiguration] INFO : Loaded class : org.apache.accumulo.server.security.handler.ZKAuthenticator
    2015-05-21 16:11:47,603 [conf.AccumuloConfiguration] INFO : Loaded class : org.apache.accumulo.server.security.handler.ZKPermHandler
- You are now ready to start Accumulo.
    /usr/hdp/2.2.4.2-2/accumulo/bin/start-all.sh

    Starting monitor on localhost
    WARN : Max open files on localhost is 1024, recommend 32768
    Starting tablet servers .... done
    Starting tablet server on localhost
    WARN : Max open files on localhost is 1024, recommend 32768
    2015-05-21 16:16:11,222 [fs.VolumeManagerImpl] WARN : dfs.datanode.synconclose set to false in hdfs-site.xml: data loss is possible on hard system reset or power loss
    2015-05-21 16:16:11,270 [server.Accumulo] INFO : Attempting to talk to zookeeper
    2015-05-21 16:16:11,539 [server.Accumulo] INFO : Zookeeper connected and initialized, attemping to talk to HDFS
    2015-05-21 16:16:11,966 [server.Accumulo] INFO : Connected to HDFS
    Starting master on localhost
    WARN : Max open files on localhost is 1024, recommend 32768
    Starting garbage collector on localhost
    WARN : Max open files on localhost is 1024, recommend 32768
    Starting tracer on localhost
    WARN : Max open files on localhost is 1024, recommend 32768
Congratulations, you have successfully installed and started Accumulo. You can now monitor your instance at http://127.0.0.1:50095/
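If the monitor page does not come up, it can help to confirm that all five server processes from the start-up output are actually alive. The helper below is a rough sketch, not part of Accumulo: missing_roles is a made-up name, and it assumes each server appears in the process list with a command line containing "org.apache.accumulo.start.Main" followed by its role name (monitor, tserver, master, gc, tracer).

```shell
#!/bin/sh
# Print the Accumulo roles that do NOT appear in a process listing read
# from stdin. Assumption: each server shows up in the listing as
# "org.apache.accumulo.start.Main <role> ...".
missing_roles() {
    listing=$(cat)
    for role in monitor tserver master gc tracer; do
        printf '%s\n' "$listing" \
            | grep -qF "org.apache.accumulo.start.Main $role" \
            || printf '%s\n' "$role"
    done
}
```

On the sandbox, running ps -eo args | missing_roles should print nothing when everything is up; any role it prints is worth looking for in the Accumulo logs.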
Troubleshooting:
If you see this exception during start-up:
    2015-05-13 09:28:18,736 [start.Main] ERROR: Thread 'org.apache.accumulo.master.state.SetGoalState' died.
    org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /accumulo/ca8f1eff-042c-46b6-9365-261e98fc6f0e/masters/goal_state
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
        at org.apache.accumulo.fate.zookeeper.ZooUtil.putData(ZooUtil.java:288)
        at org.apache.accumulo.fate.zookeeper.ZooUtil.putPersistentData(ZooUtil.java:267)
        at org.apache.accumulo.fate.zookeeper.ZooReaderWriter.putPersistentData(ZooReaderWriter.java:68)
        at org.apache.accumulo.master.state.SetGoalState.main(SetGoalState.java:47)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.accumulo.start.Main$2.run(Main.java:130)
        at java.lang.Thread.run(Thread.java:745)
This indicates that Accumulo doesn’t have sufficient permissions to write into ZooKeeper. Check that you have configured all the file and user permissions correctly, but above all verify that the secret in the accumulo-site.xml config file matches the value you entered at the init stage. It is perfectly safe to set this secret value again using:
    ./bin/accumulo org.apache.accumulo.server.util.ChangeSecret
You will be prompted for the original value and the new value that will get inserted into zookeeper.
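Before changing anything, it is worth reading back what is actually configured, since a stray edit in accumulo-site.xml is easy to miss. The snippet below is a small sketch for that: site_value is a hypothetical helper, and it assumes the value tag sits either on the same line as the name tag or on the line directly after it, as in the example configuration shown earlier.

```shell
#!/bin/sh
# Print the <value> of a property in an accumulo-site.xml style file.
# $1: path to the XML file, $2: property name.
# Assumption: <value> is on the same line as <name> or on the next line.
site_value() {
    grep -A 1 "<name>$2</name>" "$1" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p' \
        | head -n 1
}
```

For example, site_value /usr/hdp/2.2.4.2-2/accumulo/conf/accumulo-site.xml instance.secret prints the configured secret, and the same call with trace.token.property.password prints the tracer password, so both can be compared with what was typed during init.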