1. 如何編譯Apache Hadoop2.4.0源代碼
安裝JDK
hadoop是java寫的,編譯hadoop必須安裝jdk。
從oracle官網下載jdk,下載地址是http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html,選擇 jdk-7u45-linux-x64.tar.gz下載。
tar -zxvfjdk-7u45-linux-x64.tar.gz
會生成一個文件夾jdk1.7.0_45,然後設置環境變數中。
執行命令 vi/etc/profile,增加以下內容到配置文件中,結果顯示如下
export JAVA_HOME=/usr/java/jdk1.7.0_45
export JAVA_OPTS="-Xms1024m-Xmx1024m"
exportCLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar:$CLASSPATH
export PATH=$JAVA_HOME/bin:$PATH
保存退出文件後,執行以下命令
source /etc/profile
java –version 看到顯示的版本信息即正確。
安裝maven
hadoop源碼是使用maven組織管理的,必須下載maven。從maven官網下載,下載地址是http://maven.apache.org/download.cgi,選擇 apache-maven-3.1.0-bin.tar.gz 下載,不要選擇3.1下載。
執行以下命令解壓縮jdk
tar -zxvf apache-maven-3.1.0-bin.tar.gz
會生成一個文件夾apache-maven-3.1.0,然後設置環境變數中。
執行命令vi /etc/profile,編輯結果如下所示
MAVEN_HOME=/usr/maven/apache-maven-3.1.0
export MAVEN_HOME
export PATH=${PATH}:${MAVEN_HOME}/bin
保存退出文件後,執行以下命令
source /etc/profile
mvn -version
如果看到下面的顯示信息,證明配置正確了。
2. hadoop源碼修改了,編譯成功後,將編譯後的hadoop文件直接拿來搭建么,還是需要經過什麼處理呢
把你編譯後的hadoop源碼丟到原來的hadoop集群環境中去 即覆蓋hadoop安裝目錄下的原hadoop-core-xxx.jar 同樣的所有節點都需要更新 然後重啟集群
3. linux下,如何在上不了外網的情況下編譯hadoop源碼作為一名學生,真是太愁苦了!!
1.首先確認,你的linux上有沒有安裝ant
2.確認你的 各種環境是否配置正確(不會的話去看hadoop開發指南)
我把我的build.xml發給你
<?xml version="1.0"?>
<!--build.xml - a simple Ant buildfile -->
<project name="Simple Buildfile" default="compile" basedir=".">
<!--Set up the env prefix for environment variable -->
<property environment="env"/>
<!-- The directory cotaining source code -->
<property name="src.dir" value="src" />
<!-- Temporary build directories -->
<property name="build.dir" value="build"/>
<property name="build.classes" value="${build.dir}/classes"/>
<property name="build.lib" value="${build.dir}/lib"/>
<property name="jar.name" value="howtouse.jar"/>
<!-- Hadoop path -->
<property name="hadoop.path" value="${env.HADOOP_HOME}"/>
<!-- Target to create the build directories prior to the -->
<!-- compile target -->
<target name="prepare">
<mkdir dir="${build.dir}"/>
<mkdir dir="${build.classes}"/>
<mkdir dir="${build.lib}"/>
</target>
<target name="clean" description="Removes all generated files.">
<delete dir="${build.dir}"/>
</target>
<target name="compile" depends="prepare" description="Compiles all source code.">
<javac srcdir="${src.dir}" destdir="${build.classes}" includeantruntime="false"
deprecation="false">
<classpath>
<pathelement location="${hadoop.path}/hadoop-core-0.20.203.0.jar"/>
<pathelement location="${hadoop.path}/hadoop-tools-0.20.203.0.jar"/>
<pathelement location="${hadoop.path}/hadoop-examples-0.20.203.0.jar"/>
<pathelement location="${hadoop.path}/lib/mockito-all-1.8.5.jar"/>
<pathelement location="${hadoop.path}/lib/junit-4.5.jar"/>
<pathelement location="${hadoop.path}/lib/commons-math-2.1.jar"/>
<pathelement location="${hadoop.path}/lib/commons-cli-1.2.jar"/>
<pathelement location="${hadoop.path}/contrib/datajoin/hadoop-datajoin-0.20.203.0.jar"/>
</classpath>
</javac>
</target>
<target name="jar" depends="compile" description="Generates jar in the dist directory.">
<!-- Exclude unit tests from the final Jar file -->
<jar jarfile="${jar.name}" basedir="${build.classes}" excludes="**/*Test.class"/>
</target>
<target name="all" depends="clean,jar" description="Cleans,compiles,then builds the Jar file."/>
</project>
確保以上條件後
1.將自己寫的java文件放到 /home/hadoop/ant/build/src 使用XFTP
2.在HDFS中新建一個文件目錄專門用來裝資源文件 hadoop fs -mkdir /user/src
3.將simple.txt測試文件放到HDFS中 hadoop fs -put/user/src
4.將/home/hadoop/ant/build/src下的java文件編譯打包成play.jar(名字由build.xml決定)
方法:編譯之前必須保證與build.xml同級目錄下 ant jar
5.hadoop單機模式運行的時候必須保證在lib目錄下
hadoop jar play.jar hadoop/MaxTemperature /user/src/simple.txt output
4. 怎麼使用eclipse編譯hadoop源碼
使用eclipse編譯hadoop源碼
1,建立一個Hadoop源碼文件夾。
2、svn 檢出hadoop1.0.4的源碼。svn checkout http://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.4
注意:如果在ubuntu下直接上面語句報錯,可能需要執行下面的語句
sudo apt-get install autoconf
sudo apt-get install libtool
3、在檢出完成後的目錄下執行
ant eclipse.然後將源碼導入到eclipse中。
4、修改 release-1.0.4/src/contrib/gridmix/src/Java/org/apache/hadoop/mapred/gridmix/Gridmix.java
將兩處的 Enum<? extends T> 改成 Enum<?>
5、編譯器設置及編譯。
右擊工程名,Properties-->Builders-->New--->Ant Builder
New_Builder --> Edit: Name: hadoop-Builder.Main:Builderfile(builder.xml的位置):/home/nacey/workspace/source-workspace/hadoop-1.0.4;Targets—>Manual Build: jar
然後選擇菜單Project-->Build Project
在/home/nacey/workspace/source-workspace/hadoop-1.0.4/build文件夾下會生成三個開發 jar 包:
hadoop-client-1.0.4-SNAPSHOT.jar
hadoop-core-1.0.4-SNAPSHOT.jar
hadoop-minicluster-1.0.4-SNAPSHOT.jar
去掉"-SNAPSHOT"即可替換hadoop-1.0.4 下的同名 jar 包.
注意如果要在集群中使用自己編譯的jar,則需要替換集群中的所有機器。不然會出現版本不匹配。
5. 如何編譯Apache Hadoop2.2.0源代碼
下載hadoop-2.2.0-src.tar.gz 下載。
執行以下命令解壓縮jdk
tar -zxvf hadoop-2.2.0-src.tar.gz
會生成一個文件夾 hadoop-2.2.0-src。源代碼中有個bug,這里需要修改一下,編輯目錄/usr/local/hadoop-2.2.0-src/hadoop-common-project/hadoop-auth中的文件pom.xml,執行以下命令
gedit pom.xml
在第55行下增加以下內容
<dependency>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty-util</artifactId>
<scope>test</scope>
</dependency>
保存退出即可。
上述bug詳見https://issues.apache.org/jira/browse/HADOOP-10110,在hadoop3中修復了,離我們太遙遠了。
好了,現在進入到目錄/usr/local/hadoop-2.2.0-src中,執行命令
mvn package -DskipTests -Pdist,native,docs
如果沒有執行第4步,把上面命令中的docs去掉即可,就不必生成文檔了。
該命令會從外網下載依賴的jar,編譯hadoop源碼,需要花費很長時間,你可以吃飯了。
在等待n久之後,可以看到如下的結果:
[INFO] Apache Hadoop Main ................................ SUCCESS [6.936s]
[INFO] Apache Hadoop Project POM ......................... SUCCESS [4.928s]
[INFO] Apache Hadoop Annotations ......................... SUCCESS [9.399s]
[INFO] Apache Hadoop Assemblies .......................... SUCCESS [0.871s]
[INFO] Apache Hadoop Project Dist POM .................... SUCCESS [7.981s]
[INFO] Apache Hadoop Maven Plugins ....................... SUCCESS [8.965s]
[INFO] Apache Hadoop Auth ................................ SUCCESS [39.748s]
[INFO] Apache Hadoop Auth Examples ....................... SUCCESS [11.081s]
[INFO] Apache Hadoop Common .............................. SUCCESS [10:41.466s]
[INFO] Apache Hadoop NFS ................................. SUCCESS [26.346s]
[INFO] Apache Hadoop Common Project ...................... SUCCESS [0.061s]
[INFO] Apache Hadoop HDFS ................................ SUCCESS [12:49.368s]
[INFO] Apache Hadoop HttpFS .............................. SUCCESS [41.896s]
[INFO] Apache Hadoop HDFS BookKeeper Journal ............. SUCCESS [41.043s]
[INFO] Apache Hadoop HDFS-NFS ............................ SUCCESS [9.650s]
[INFO] Apache Hadoop HDFS Project ........................ SUCCESS [0.051s]
[INFO] hadoop-yarn ....................................... SUCCESS [1:22.693s]
[INFO] hadoop-yarn-api ................................... SUCCESS [1:20.262s]
[INFO] hadoop-yarn-common ................................ SUCCESS [1:30.530s]
[INFO] hadoop-yarn-server ................................ SUCCESS [0.177s]
[INFO] hadoop-yarn-server-common ......................... SUCCESS [15.781s]
[INFO] hadoop-yarn-server-nodemanager .................... SUCCESS [40.800s]
[INFO] hadoop-yarn-server-web-proxy ...................... SUCCESS [6.099s]
[INFO] hadoop-yarn-server-resourcemanager ................ SUCCESS [37.639s]
[INFO] hadoop-yarn-server-tests .......................... SUCCESS [4.516s]
[INFO] hadoop-yarn-client ................................ SUCCESS [25.594s]
[INFO] hadoop-yarn-applications .......................... SUCCESS [0.286s]
[INFO] hadoop-yarn-applications-distributedshell ......... SUCCESS [10.143s]
[INFO] hadoop-maprece-client ........................... SUCCESS [0.119s]
[INFO] hadoop-maprece-client-core ...................... SUCCESS [55.812s]
[INFO] hadoop-yarn-applications-unmanaged-am-launcher .... SUCCESS [8.749s]
[INFO] hadoop-yarn-site .................................. SUCCESS [0.524s]
[INFO] hadoop-yarn-project ............................... SUCCESS [16.641s]
[INFO] hadoop-maprece-client-common .................... SUCCESS [40.796s]
[INFO] hadoop-maprece-client-shuffle ................... SUCCESS [7.628s]
[INFO] hadoop-maprece-client-app ....................... SUCCESS [24.066s]
[INFO] hadoop-maprece-client-hs ........................ SUCCESS [13.243s]
[INFO] hadoop-maprece-client-jobclient ................. SUCCESS [16.670s]
[INFO] hadoop-maprece-client-hs-plugins ................ SUCCESS [3.787s]
[INFO] Apache Hadoop MapRece Examples .................. SUCCESS [17.012s]
[INFO] hadoop-maprece .................................. SUCCESS [6.459s]
[INFO] Apache Hadoop MapRece Streaming ................. SUCCESS [12.149s]
[INFO] Apache Hadoop Distributed Copy .................... SUCCESS [15.968s]
[INFO] Apache Hadoop Archives ............................ SUCCESS [5.851s]
[INFO] Apache Hadoop Rumen ............................... SUCCESS [18.364s]
[INFO] Apache Hadoop Gridmix ............................. SUCCESS [14.943s]
[INFO] Apache Hadoop Data Join ........................... SUCCESS [9.648s]
[INFO] Apache Hadoop Extras .............................. SUCCESS [5.763s]
[INFO] Apache Hadoop Pipes ............................... SUCCESS [16.289s]
[INFO] Apache Hadoop Tools Dist .......................... SUCCESS [3.261s]
[INFO] Apache Hadoop Tools ............................... SUCCESS [0.043s]
[INFO] Apache Hadoop Distribution ........................ SUCCESS [56.188s]
[INFO] Apache Hadoop Client .............................. SUCCESS [10.910s]
[INFO] Apache Hadoop Mini-Cluster ........................ SUCCESS [0.321s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 40:00.444s
[INFO] Finished at: Thu Dec 26 12:42:24 CST 2013
[INFO] Final Memory: 109M/362M
[INFO] ------------------------------------------------------------------------
6. 為什麼要編譯 hadoop 源碼 怎麼在eclipse里寫hadoop程序啊~ 上不了外網,hadoop源碼編譯不了,該怎麼
1:編譯了hadoop,可以方便的查看某個函數的實現。如果不編譯就只是自己去翻源代碼了。更重要的是如果你編譯了hadoop,你可以根據自己的需要改動hadoop的某些實現機制。(hadoop開源的好處).
2:編程hadoop程序是不需要編譯hadoop源碼的。你可以參看網上hadoop安裝教程。
關於hadoop編程,歡迎訪問我的博客:http://blog.csdn.net/jackydai987
7. 如何在hadoop-2.6.0上編譯運行自己編寫的java代碼
在不使用eclipse情況使java程序在hadoop 2.2中運行的完整過程。整個過程中其實分為java程序的編譯,生成jar包,運行測試。
這三個步驟運用的命令都比較簡單,主要的還是如何找到hadoop 2.2提供給java程序用來編譯的jar包。具體可以查看:
HADOOP_HOME/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib目錄
下面會通過一個在hadoop中創建一個目錄的JAVA例子來進行演示
具體代碼如下:
package com.wan.demo;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class HADemo {
public static void main(String[] args) {
// TODO Auto-generated method stub
mkdir(args[0]);
}
public static void mkdir(String dir){
Configuration configuration=new Configuration();
FileSystem fs;
try {
fs = FileSystem.get(configuration);
fs.mkdirs(new Path(dir));
fs.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
把HADemo.java文件拷貝到linux環境中
配置HADOOP_HOME/bin到環境中,啟動集群,進入HADemo.java文件目錄中
注:下面的lib目錄裡面的文件由HADOOP_HOME/share/hadoop/httpfs/tomcat/webapps/ webhdfs/WEB-INF/lib目錄中獲取,下面做的目的是為了縮減命令長度
1.編譯java
# mkdir class
#Javac -classpath .:lib/hadoop-common-2.2.0.jar:lib/hadoop-annotations-2.2.0.jar -d class HADemo.java
2.生成jar包
#jar -cvf hademo.jar -C class/ .
added manifest
adding: com/(in = 0) (out= 0)(stored 0%)
adding: com/wan/(in = 0) (out= 0)(stored 0%)
adding: com/wan/demo/(in = 0) (out= 0)(stored 0%)
adding: com/wan/demo/HADemo.class(in = 844) (out= 520)(deflated 38%)
3.測試運行
#hadoop jar hademo.jar com.wan.demo.HADemo /test
檢測:
#hadoop fs -ls /
結束!
8. 為什麼要編譯apache hadoop2.2.0源代碼
hadoop是不需要編譯的,解壓就可以直接使用了,操作如下: tar –zxvf hadoop-2.2.0.tar.gz #解壓"hadoop-2.2.0.tar.gz"安裝包mv hadoop-2.2.0 /usr/local/hadoop #將"hadoop-2.2.0"移動到/usr/local目錄下chown –R hadoop:hadoop /usr/local/had...
9. hadoop搭建時為什麼要重新編譯源碼的解釋
把你編譯後的hadoop源碼丟到原來的hadoop集群環境中去 即覆蓋hadoop安裝目錄下的原hadoop-core-xxx.jar 同樣的所有節點都需要更新 然後重啟集群
10. 你好,我編譯hadoop2.4源碼,成功之後導入eclipse中,但是怎麼還是有報錯,缺少包這類的問題呢
你把hadoop安裝目錄下的各個share目錄下的lib下的jar包導入項目,那些是hadoop依賴的jar包