Hadoop - Introduction
Here we focus on how to install Hadoop in a Windows 10 environment rather than on the details of the framework, but we start with a brief definition.
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
I used Hadoop 2.8.0, though you can use any stable version.
Download Hadoop 2.8.0
Download Java JDK 1.8.0
Check whether Java 1.8.0 is already installed on your system, using the command prompt:
java -version
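The same check can be scripted. The following is a minimal Python sketch, not part of the original steps; the function names are hypothetical, and it assumes the usual `java -version` banner in which the version appears in double quotes.

```python
import re
import subprocess


def parse_java_version(output):
    """Return the quoted version string from `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', output)
    return match.group(1) if match else None


def installed_java_version():
    """Run `java -version`; return the version string, or None if Java is missing."""
    try:
        # `java -version` writes its banner to stderr, not stdout.
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    return parse_java_version(result.stderr)


def is_java_8(version):
    """True when the version string belongs to the 1.8 line used in this guide."""
    return version is not None and version.startswith("1.8.")
```

Calling `is_java_8(installed_java_version())` returns False both when Java is missing and when a different major version is first on the Path.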
If Java is not installed on your system, first install Java under "D:\JAVA" or on your preferred drive.
Extract the file Hadoop 2.8.0.tar.gz or Hadoop-2.8.0.zip and place it under "D:\Hadoop"; you can use any preferred location:
[1] After extraction you get another tar file.
[2] Go inside the Hadoop-2.8.0.tar folder and extract again.
[3] Copy the leaf folder "hadoop-2.8.0", move it to the root folder "D:\Hadoop", and remove all other files and folders.
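If no archive tool is at hand, the two-stage extraction can be done in one pass with Python's standard `tarfile` module. This is only a sketch under assumptions: `extract_hadoop` is a hypothetical helper, the archive is a trusted .tar.gz downloaded from Apache, and any symbolic links inside it can be created by your Windows account.

```python
import tarfile
from pathlib import Path


def extract_hadoop(archive, destination):
    """Unpack a .tar.gz archive into destination; return its top-level names."""
    dest = Path(destination)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:gz" decompresses and untars in a single step.
    with tarfile.open(archive, "r:gz") as tar:
        top_level = sorted({member.name.split("/")[0] for member in tar.getmembers()})
        tar.extractall(dest)
    return top_level
```

For example, `extract_hadoop(r"D:\Downloads\hadoop-2.8.0.tar.gz", r"D:\Hadoop")` (the download path is illustrative) should leave the "hadoop-2.8.0" folder directly under "D:\Hadoop".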
Set the HADOOP_HOME and JAVA_HOME environment variables (User variables) on Windows 10:
This PC -> Right Click -> Properties -> Advanced System Settings -> Advanced -> Environment Variables
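After setting the variables, open a new command prompt so they take effect. A quick way to confirm them is a small Python check like the sketch below; `check_home_variables` is a hypothetical helper, and it only verifies that each variable is set and points to an existing folder.

```python
import os
from pathlib import Path

REQUIRED_VARS = ("HADOOP_HOME", "JAVA_HOME")


def check_home_variables(env=None):
    """Return a list of problems; an empty list means both variables look fine."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not Path(value).is_dir():
            problems.append(f"{name} points to a missing folder: {value}")
    return problems
```

Running `print(check_home_variables())` in the new prompt lists anything that still needs fixing.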
STEP - 3: Configure System variables
Next, add the following entries, including the Hadoop bin directory and the Java bin directory, to the system variable Path:
Variable: Path
Value:
- D:\Hadoop\hadoop-2.8.0\bin
- D:\Hadoop\hadoop-2.8.0\sbin
- D:\Hadoop\hadoop-2.8.0\share\hadoop\common\*
- D:\Hadoop\hadoop-2.8.0\share\hadoop\hdfs
- D:\Hadoop\hadoop-2.8.0\share\hadoop\hdfs\lib\*
- D:\Hadoop\hadoop-2.8.0\share\hadoop\hdfs\*
- D:\Hadoop\hadoop-2.8.0\share\hadoop\yarn\lib\*
- D:\Hadoop\hadoop-2.8.0\share\hadoop\yarn\*
- D:\Hadoop\hadoop-2.8.0\share\hadoop\mapreduce\lib\*
- D:\Hadoop\hadoop-2.8.0\share\hadoop\mapreduce\*
- D:\Hadoop\hadoop-2.8.0\share\hadoop\common\lib\*
- D:\Java\jdk1.8.0_171\bin
Create some dedicated folders:
- Create folder "data" under "D:\Hadoop\hadoop-2.8.0".
- Create folder "datanode" under "D:\Hadoop\hadoop-2.8.0\data".
- Create folder "namenode" under "D:\Hadoop\hadoop-2.8.0\data".
- Create a folder to store temporary data during execution of a project, such as "D:\Hadoop\hadoop-2.8.0\temp".
- Create a log folder, such as "D:\Hadoop\hadoop-2.8.0\userlog".
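The folders listed above can also be created in one go. This is a minimal Python sketch; `create_hadoop_dirs` is a hypothetical helper, and the folder names are the ones used in this guide.

```python
from pathlib import Path

# Folder layout used in this guide, relative to the Hadoop install folder.
SUBDIRS = ("data/datanode", "data/namenode", "temp", "userlog")


def create_hadoop_dirs(hadoop_home):
    """Create the data, temp and log folders; return the paths that now exist."""
    root = Path(hadoop_home)
    created = []
    for sub in SUBDIRS:
        target = root / sub
        # parents=True also creates "data"; exist_ok=True makes reruns harmless.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

For the layout in this post, call `create_hadoop_dirs(r"D:\Hadoop\hadoop-2.8.0")`.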
Now configure four key files with the minimal required settings:
- core-site.xml
- hdfs-site.xml
- mapred-site.xml
- yarn-site.xml
[1] Edit the file D:/Hadoop/hadoop-2.8.0/etc/hadoop/core-site.xml, paste the XML below, and save the file.
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
[2] Rename "mapred-site.xml.template" to "mapred-site.xml", edit the file D:/Hadoop/hadoop-2.8.0/etc/hadoop/mapred-site.xml, paste the XML below, and save the file.
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
[3] Edit the file D:/Hadoop/hadoop-2.8.0/etc/hadoop/hdfs-site.xml, paste the XML below, and save the file.
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/D:/Hadoop/hadoop-2.8.0/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/D:/Hadoop/hadoop-2.8.0/data/datanode</value>
</property>
</configuration>
[4] Edit the file D:/Hadoop/hadoop-2.8.0/etc/hadoop/yarn-site.xml, paste the XML below, and save the file.
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/D:/Hadoop/hadoop-2.8.0/userlog</value>
<final>true</final>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/D:/Hadoop/hadoop-2.8.0/temp/nm-localdir</value>
</property>
</configuration>
[5] Edit the file D:/Hadoop/hadoop-2.8.0/etc/hadoop/hadoop-env.cmd: in the line "JAVA_HOME=%JAVA_HOME%", replace %JAVA_HOME% with the actual path of your JDK folder (the value of your JAVA_HOME user variable).
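A typo in one of these files usually only shows up when a daemon fails to start, so it can help to read the files back before going further. The Python sketch below is an illustration, not a Hadoop tool; all function names are hypothetical. It parses a `*-site.xml` document into a dictionary, extracts host and port from `fs.defaultFS`, and shows how a Windows folder maps to the `/D:/...` form used in the values above.

```python
import xml.etree.ElementTree as ET
from pathlib import PureWindowsPath
from urllib.parse import urlparse


def read_properties(xml_text):
    """Map each <property>'s <name> to its <value> in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def default_fs_endpoint(props):
    """Return (host, port) taken from the fs.defaultFS URI."""
    uri = urlparse(props["fs.defaultFS"])
    return uri.hostname, uri.port


def to_hadoop_path(windows_path):
    """Convert a Windows folder path to the /D:/... form used in the config values."""
    return "/" + PureWindowsPath(windows_path).as_posix()


def missing_properties(props, required):
    """Return the required property names that are absent from props."""
    return [name for name in required if name not in props]
```

For example, reading hdfs-site.xml with `read_properties(open(path).read())` and comparing `props["dfs.namenode.name.dir"]` with `to_hadoop_path(r"D:\Hadoop\hadoop-2.8.0\data\namenode")` confirms that the configured folder matches the one created earlier.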
STEP - 6: Manage Hadoop configuration
Time to manage the Hadoop configuration. Download the file Hadoop Configuration.zip:
- https://mindtreeonline-my.sharepoint.com/:u:/g/personal/m1045767_mindtree_com1/EaT3z5mOMyFAkdba7ywoPqoBCpZ920qGbajvMTXQvYGYkA?e=iOiP6y
- Delete the bin folder D:\Hadoop\hadoop-2.8.0\bin and replace it with the bin folder from the file you just downloaded (Hadoop Configuration.zip).
- Check the Hadoop version: open a command prompt and type: D:\> hadoop version
- Run the namenode format command: in the command prompt, change to "D:\Hadoop\hadoop-2.8.0\bin" and run "hdfs namenode -format".
STEP - 7: Start Hadoop
Time to start Hadoop: open a command prompt, change directory to "D:\Hadoop\hadoop-2.8.0\sbin", and type "start-all.cmd" to start Hadoop. Make sure the following daemons are running:
- Hadoop Datanode
- Hadoop Namenode
- Yarn Nodemanager
- Yarn Resourcemanager
You can also verify this in a browser:
- Namenode (hdfs) - http://localhost:50070
- Datanode - http://localhost:50075
- All Applications (cluster) - http://localhost:8088
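The browser check can be automated by testing whether each web UI port accepts a connection. This is a minimal Python sketch, assuming the ports listed above; `port_is_open` and `check_web_uis` are hypothetical helper names.

```python
import socket

# Web UI ports listed above.
UI_PORTS = {"NameNode": 50070, "DataNode": 50075, "ResourceManager": 8088}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_web_uis(host="localhost"):
    """Return a dict mapping each daemon name to whether its web UI port answers."""
    return {name: port_is_open(host, port) for name, port in UI_PORTS.items()}
```

An open port only shows that something is listening; opening the URL in a browser is still the way to confirm the page itself.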
Since the "start-all.cmd" command is deprecated, you can instead run the commands below, in this order:
- "start-dfs.cmd" and
- "start-yarn.cmd"
STEP - 8: Stop Hadoop
To stop the services, run the corresponding stop command, such as:
- stop-all.cmd
- stop-dfs.cmd
- stop-yarn.cmd
Congratulations, Hadoop is installed! 😊
STEP - 9: Some Hands on activity
For example, copy a file from the local file system to HDFS:
- hadoop fs -mkdir -p /raj/data
- hadoop fs -ls /raj
- hadoop fs -copyFromLocal D:/testhdfs.xlsx /raj/data/
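The same sequence can be driven from a script. The Python sketch below is one possible wrapper, not an official client; the helper names are hypothetical, and it assumes the `hadoop` command is reachable through the Path configured earlier and that the daemons are running.

```python
import shutil
import subprocess


def hdfs_upload_commands(local_file, hdfs_dir):
    """Return the argument lists that create the HDFS folder, copy the file, and list it."""
    return [
        ["hadoop", "fs", "-mkdir", "-p", hdfs_dir],
        ["hadoop", "fs", "-copyFromLocal", local_file, hdfs_dir],
        ["hadoop", "fs", "-ls", hdfs_dir],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        # shutil.which honours PATHEXT on Windows, so it can locate hadoop.cmd.
        executable = shutil.which(argv[0])
        if executable is None:
            raise FileNotFoundError(f"{argv[0]} was not found on the Path")
        subprocess.run([executable, *argv[1:]], check=True)
```

Usage: `run_all(hdfs_upload_commands("D:/testhdfs.xlsx", "/raj/data/"))`.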
19/10/30 16:53:53 INFO service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService failed in state INITED; cause: java.lang.NullPointerException
java.lang.NullPointerException
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:636)
at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:630)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:861)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:625)
at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:1071)
at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:177)
at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:206)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:738)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:734)
at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:734)
at org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.createDir(DirectoryCollection.java:478)
at org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.createDir(DirectoryCollection.java:476)
at org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.createDir(DirectoryCollection.java:476)
at org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.createDir(DirectoryCollection.java:476)
at org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.createDir(DirectoryCollection.java:476)
at org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.createNonExistentDirs(DirectoryCollection.java:275)
at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.serviceInit(LocalDirsHandlerService.java:206)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.nodemanager.NodeHealthCheckerService.serviceInit(NodeHealthCheckerService.java:50)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:357)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:636)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:684)
Hey Jaseer, I am unable to access this link: https://mindtreeonline-my.sharepoint.com/:u:/g/personal/m1045767_mindtree_com1/EaT3z5mOMyFAkdba7ywoPqoBCpZ920qGbajvMTXQvYGYkA?e=iOiP6y, which is used to get hadoop configuration file. If you have already downloaded that zip file, can you please send it to bilalshafqat0336@gmail.com? So i can also use it. Thanks.
I am getting an issue while accessing this link:
https://mindtreeonline-my.sharepoint.com/:u:/g/personal/m1045767_mindtree_com1/EaT3z5mOMyFAkdba7ywoPqoBCpZ920qGbajvMTXQvYGYkA?e=iOiP6y
to download Hadoop configuration file.
It says that user bilalshafqat0336@gmail.com is not listed in users directory, seems like access issue, please resolve it.
Hi!
https://drive.google.com/file/d/1AMqV4F5ybPF4ab4CeK8B3AsjdGtQCdvy/view
You can see:
https://github.com/TuanHenry/Step-by-step-install-Hadoop-on-Windows-10/wiki
Hi,
I am not able to use a SQL count query in Hive; I get the error below:
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2).
Please let me know how to solve this.
How do I access this link?
https://mindtreeonline-my.sharepoint.com/:u:/g/personal/m1045767_mindtree_com1/EaT3z5mOMyFAkdba7ywoPqoBCpZ920qGbajvMTXQvYGYkA?e=iOiP6y
If anyone has this file, please email it to mansih.spit551@gmail.com! It's very urgent. Thanks.
hi!
I didn't understand what it means to close the command line. I have opened the file for editing, but what should I do now?
Edit file D:/Hadoop/hadoop-2.8.0/etc/hadoop/hadoop-env.cmd by closing the command line"JAVA_HOME=%JAVA_HOME%"
You need to enter the Java path; you can find it in the user variable.