Building hadoop-2.6.0-cdh5.5.2 from source with Snappy compression support

程序员文章站 2024-03-18 10:16:46

1. Download the tarball

Download hadoop-2.6.0-cdh5.5.2-src.tar.gz

2. Extract and inspect

# tar zxvf hadoop-2.6.0-cdh5.5.2-src.tar.gz
# cd hadoop-2.6.0-cdh5.5.2

# head -15 BUILDING.txt 
Build instructions for Hadoop
----------------------------------------------------------------------------------
Requirements:

* Unix System
* JDK 1.7+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* CMake 2.6 or newer (if compiling native code), must be 3.0 or newer on Mac
* Zlib devel (if compiling native code)
* openssl devel ( if compiling native hadoop-pipes )
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)
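
The requirements listed in BUILDING.txt can be checked before starting the build. The following is a minimal sketch, assuming the tools are expected on PATH under the command names java, mvn, protoc and cmake. The helper name missing_tools is made up for this example and is not part of Hadoop.

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not something Hadoop provides.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Example call (prints nothing when every tool is found):
#   missing_tools java mvn protoc cmake
```

An empty result only shows that the commands exist. It does not confirm that their versions match the requirements above.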

3. Install dependencies

# yum -y install svn ncurses-devel gcc gcc-c++ protobuf-compiler
# yum -y install lzo-devel zlib-devel autoconf automake libtool cmake openssl-devel
# yum -y install fuse fuse-devel pkgconfig protobuf-devel
# yum -y install bzip2 bzip2-devel snappy snappy-devel jansson-devel

1) Install protobuf

Download protobuf-2.5.0.tar.gz

# tar zxvf protobuf-2.5.0.tar.gz -C /usr/local/
# cd /usr/local/protobuf-2.5.0
# ./configure
# make && make install

2) Install findbugs

Download findbugs-3.0.1.tar.gz

# tar zxvf findbugs-3.0.1.tar.gz -C /usr/local/

3) Install maven

Download apache-maven-3.5.3-bin.tar.gz

# tar zxvf apache-maven-3.5.3-bin.tar.gz -C /usr/local/

4) Install ant

Download apache-ant-1.9.11-bin.tar.gz

# tar zxvf apache-ant-1.9.11-bin.tar.gz -C /usr/local/

5) Install snappy

Download snappy-1.1.3.tar.gz

# tar zxvf snappy-1.1.3.tar.gz -C /usr/local/ 
# cd /usr/local/snappy-1.1.3
# ./configure
# make && make install

6) Install the JDK

http://www.oracle.com/technetwork/java/javase/downloads/java-archive-downloads-javase7-521261.html

4. Configure environment variables

export JAVA_HOME=/usr/java/jdk1.7.0_80
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export MAVEN_HOME=/usr/local/apache-maven-3.5.3
export FINDBUGS_HOME=/usr/local/findbugs-3.0.1 
export PROTOBUF_HOME=/usr/local/protobuf-2.5.0 
export ANT_HOME=/usr/local/apache-ant-1.9.11

export PATH=$PATH:$JAVA_HOME/bin:$MAVEN_HOME/bin:$FINDBUGS_HOME/bin:$ANT_HOME/bin
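
A wrong version number in one of these paths is easy to miss, so it can help to confirm that each variable points to a directory that exists. This is a minimal sketch. The helper name check_home_dirs is hypothetical, and it assumes the variables have already been exported in the current shell.

```shell
#!/bin/sh
# For each variable NAME given as an argument, report it when its value is
# not an existing directory. Returns non-zero if any variable fails.
# check_home_dirs is a hypothetical helper name used only in this sketch.
check_home_dirs() {
    status=0
    for name in "$@"; do
        # Indirect lookup: read the value of the variable called $name.
        eval "dir=\${$name:-}"
        if [ ! -d "$dir" ]; then
            echo "$name does not point to a directory: $dir"
            status=1
        fi
    done
    return $status
}

# Example call:
#   check_home_dirs JAVA_HOME MAVEN_HOME FINDBUGS_HOME PROTOBUF_HOME ANT_HOME
```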

5. Verify

# java -version
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
# mvn -v
Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T15:58:13+08:00)
Maven home: /usr/local/apache-maven-3.5.3
Java version: 1.7.0_80, vendor: Oracle Corporation
Java home: /usr/java/jdk1.7.0_80/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-693.21.1.el7.x86_64", arch: "amd64", family: "unix"
# findbugs -version
3.0.1
# protoc --version
libprotoc 2.5.0
# ls -lh /usr/local/lib | grep snappy
-rw-r--r-- 1 root root 511K May 23 13:06 libsnappy.a
-rwxr-xr-x 1 root root  955 May 23 13:06 libsnappy.la
lrwxrwxrwx 1 root root   18 May 23 13:06 libsnappy.so -> libsnappy.so.1.3.0
lrwxrwxrwx 1 root root   18 May 23 13:06 libsnappy.so.1 -> libsnappy.so.1.3.0
-rwxr-xr-x 1 root root 253K May 23 13:06 libsnappy.so.1.3.0
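
The two checks that matter most for this build, the protoc version and the presence of the snappy shared library, can also be scripted. This is a sketch under assumptions: the helper names is_protoc_250 and has_snappy_so are made up here, and the expected string is the `libprotoc 2.5.0` output shown above.

```shell
#!/bin/sh
# Succeed only if the given `protoc --version` output is exactly
# "libprotoc 2.5.0", the version BUILDING.txt asks for.
is_protoc_250() {
    [ "$1" = "libprotoc 2.5.0" ]
}

# Succeed if the given directory contains at least one libsnappy.so* file.
has_snappy_so() {
    ls "$1"/libsnappy.so* >/dev/null 2>&1
}

# Example calls:
#   is_protoc_250 "$(protoc --version)" || echo "unexpected protoc version"
#   has_snappy_so /usr/local/lib || echo "libsnappy.so not found"
```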

6. Build Hadoop

# mvn clean package -Pdist,native -DskipTests -Dtar -Dbundle.snappy -Dsnappy.lib=/usr/local/lib

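Parameters:

-Pdist,native   : build the distribution and recompile the Hadoop native libraries
-DskipTests     : skip the tests
-Dtar           : package the result as a tar archive
-Dbundle.snappy : bundle the snappy library into the distribution to add Snappy compression support
-Dsnappy.lib=/usr/local/lib  : directory that contains the snappy library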
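
If snappy is installed somewhere other than /usr/local/lib, only the last flag changes. The sketch below assembles the same command line with the library directory as a parameter. The function name hadoop_build_cmd is hypothetical, and the flags are exactly the ones described above.

```shell
#!/bin/sh
# Print the mvn command line used in this article, with the snappy library
# directory taken from the first argument (default: /usr/local/lib).
# hadoop_build_cmd is a hypothetical helper; it only prints, it does not build.
hadoop_build_cmd() {
    snappy_dir=${1:-/usr/local/lib}
    echo "mvn clean package -Pdist,native -DskipTests -Dtar" \
         "-Dbundle.snappy -Dsnappy.lib=$snappy_dir"
}

# Example: print the command first, then run it from the source root.
#   hadoop_build_cmd /usr/local/lib
```

Printing the command before running it makes it easy to confirm the snappy path matches the directory listed in the verification step.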

Build log:

main:
     [exec] $ tar cf hadoop-2.6.0-cdh5.5.2.tar hadoop-2.6.0-cdh5.5.2
     [exec] $ gzip -f hadoop-2.6.0-cdh5.5.2.tar
     [exec] 
     [exec] Hadoop dist tar available at: /data/soft/hadoop-2.6.0-cdh5.5.2/hadoop-dist/target/hadoop-2.6.0-cdh5.5.2.tar.gz
     [exec] 
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ hadoop-dist ---
[INFO] Building jar: /data/soft/hadoop-2.6.0-cdh5.5.2/hadoop-dist/target/hadoop-dist-2.6.0-cdh5.5.2-javadoc.jar
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop Main ................................. SUCCESS [05:22 min]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [01:44 min]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [01:02 min]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [  0.829 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [01:06 min]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [01:16 min]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [08:11 min]
[INFO] Apache Hadoop Auth ................................. SUCCESS [03:38 min]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [ 23.072 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [07:31 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 20.975 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [01:51 min]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [  0.040 s]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [06:48 min]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [01:25 min]
[INFO] Apache Hadoop HDFS BookKeeper Journal .............. SUCCESS [01:39 min]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [  5.574 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [  0.037 s]
[INFO] hadoop-yarn ........................................ SUCCESS [  0.093 s]
[INFO] hadoop-yarn-api .................................... SUCCESS [01:30 min]
[INFO] hadoop-yarn-common ................................. SUCCESS [01:24 min]
[INFO] hadoop-yarn-server ................................. SUCCESS [  0.108 s]
[INFO] hadoop-yarn-server-common .......................... SUCCESS [ 10.047 s]
[INFO] hadoop-yarn-server-nodemanager ..................... SUCCESS [01:54 min]
[INFO] hadoop-yarn-server-web-proxy ....................... SUCCESS [  2.645 s]
[INFO] hadoop-yarn-server-applicationhistoryservice ....... SUCCESS [  6.131 s]
[INFO] hadoop-yarn-server-resourcemanager ................. SUCCESS [ 17.503 s]
[INFO] hadoop-yarn-server-tests ........................... SUCCESS [  1.072 s]
[INFO] hadoop-yarn-client ................................. SUCCESS [  5.614 s]
[INFO] hadoop-yarn-applications ........................... SUCCESS [  0.036 s]
[INFO] hadoop-yarn-applications-distributedshell .......... SUCCESS [  3.449 s]
[INFO] hadoop-yarn-applications-unmanaged-am-launcher ..... SUCCESS [  2.293 s]
[INFO] hadoop-yarn-site ................................... SUCCESS [  0.069 s]
[INFO] hadoop-yarn-registry ............................... SUCCESS [  5.647 s]
[INFO] hadoop-yarn-project ................................ SUCCESS [  7.496 s]
[INFO] hadoop-mapreduce-client ............................ SUCCESS [  0.169 s]
[INFO] hadoop-mapreduce-client-core ....................... SUCCESS [ 27.196 s]
[INFO] hadoop-mapreduce-client-common ..................... SUCCESS [ 16.749 s]
[INFO] hadoop-mapreduce-client-shuffle .................... SUCCESS [  4.179 s]
[INFO] hadoop-mapreduce-client-app ........................ SUCCESS [  9.747 s]
[INFO] hadoop-mapreduce-client-hs ......................... SUCCESS [  6.914 s]
[INFO] hadoop-mapreduce-client-jobclient .................. SUCCESS [ 24.696 s]
[INFO] hadoop-mapreduce-client-hs-plugins ................. SUCCESS [  2.478 s]
[INFO] hadoop-mapreduce-client-nativetask ................. SUCCESS [01:22 min]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [  4.960 s]
[INFO] hadoop-mapreduce ................................... SUCCESS [  6.066 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 19.190 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 33.327 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [  2.458 s]
[INFO] Apache Hadoop Archive Logs ......................... SUCCESS [  2.178 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [  5.748 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [  4.453 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [  2.662 s]
[INFO] Apache Hadoop Ant Tasks ............................ SUCCESS [  3.827 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [  2.868 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [  7.180 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [  5.756 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [01:10 min]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [ 20.748 s]
[INFO] Apache Hadoop Client ............................... SUCCESS [  7.300 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [  1.666 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [  6.151 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [  9.300 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [  0.298 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [01:40 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 56:41 min
[INFO] Finished at: 2018-05-23T14:23:10+08:00
[INFO] Final Memory: 136M/390M
[INFO] ------------------------------------------------------------------------
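
With more than sixty modules in the Reactor Summary, a failed or skipped module is easy to overlook. The following sketch filters a saved build log down to the summary lines that are not marked SUCCESS. It assumes the log was captured to a file (for example with `mvn ... | tee build.log`, where build.log is a hypothetical file name), and the function name non_success_modules is made up for this example.

```shell
#!/bin/sh
# Read a Maven build log on stdin and print the Reactor Summary lines
# (module name followed by a run of dots) that do not contain SUCCESS.
non_success_modules() {
    awk '/^\[INFO\] .* \.\.+ / && !/ SUCCESS/ { print }'
}

# Example call:
#   non_success_modules < build.log
```

No output means every module listed in the summary was reported as SUCCESS.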

7. Check

# hadoop checknative -a
18/05/23 16:01:09 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
18/05/23 16:01:09 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop:  true /opt/cloudera/CDH/hadoop-2.6.0-cdh5.5.2/lib/libhadoop.so.1.0.0
zlib:    true /lib64/libz.so.1
snappy:  true /opt/cloudera/CDH/hadoop-2.6.0-cdh5.5.2/lib/libsnappy.so.1
lz4:     true revision:99
bzip2:   true /lib64/libbz2.so.1
openssl: true /lib64/libcrypto.so
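
The same check can be reduced to a list of libraries that failed to load, which is handy in a deployment script. This is a minimal sketch that assumes the output keeps the `name: true|false path` layout shown above. The function name unloaded_native_libs is hypothetical.

```shell
#!/bin/sh
# Read `hadoop checknative -a` output on stdin and print the name of each
# library whose second field is "false". Log lines are ignored because
# their second field is a timestamp or another word.
unloaded_native_libs() {
    awk '$2 == "false" { sub(/:$/, "", $1); print $1 }'
}

# Example call:
#   hadoop checknative -a 2>&1 | unloaded_native_libs
```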

8. Q & A

1) Error: Unable to load native-hadoop library for your platform

# hadoop checknative -a
18/05/23 14:41:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Native library checking:
hadoop:  false 
zlib:    false 
snappy:  false 
lz4:     false 
bzip2:   false 
openssl: false 
18/05/23 14:41:24 INFO util.ExitUtil: Exiting with status 

Solution:

# Download the matching prebuilt native libraries from http://dl.bintray.com/sequenceiq/sequenceiq-bin/.

# Run the following commands:

# tar -xvf hadoop-native-64-2.6.0.tar -C $HADOOP_HOME/lib/
# tar -xvf hadoop-native-64-2.6.0.tar -C $HADOOP_HOME/lib/native

# Configure environment variables

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_HOME/lib/native"

# Copy the native libraries produced by the build

# cp $HADOOP_SRC_HOME/hadoop-dist/target/hadoop-2.6.0-cdh5.5.2/lib/native/* $HADOOP_HOME/lib/native/
# cp $HADOOP_SRC_HOME/hadoop-dist/target/hadoop-2.6.0-cdh5.5.2/lib/native/* $HADOOP_HOME/lib/