Detailed Big Data Platform Setup Tutorial (CDH)

Contents

1  Introduction
   1.1  Purpose
2  Detailed setup steps
   2.1  Preparation
        2.1.1  Add hostnames
        2.1.2  Add sub-users
        2.1.3  Set up passwordless SSH login
        2.1.4  Disable SELinux
        2.1.5  Disable the firewall
        2.1.6  Install the JDK
   2.2  Install the Hadoop cluster
        2.2.1  Zookeeper
               2.2.1.1  Configure Zookeeper
               2.2.1.2  Using Zookeeper
        2.3.2  Hadoop
               2.3.2.1  Configure Hadoop
               2.3.2.2  Starting Hadoop for the first time
        2.3.3  Spark
               2.3.3.1  Install Scala (all nodes)
               2.3.3.2  Install Spark
        2.3.4  Hive
               2.3.4.1  Deploy the MySQL master-slave cluster
               2.3.4.2  Configure Hive
        2.3.5  Sqoop
               2.3.5.1  Configure Sqoop
               2.3.5.2  Using Sqoop
   2.4  Install the HBase cluster
        2.4.1  HBase
               2.4.1.2  Deploy the distributed HBase cluster
               2.4.1.3  Operating HBase
        2.4.2  Kafka
               2.4.2.1  Distributed Kafka deployment
               2.4.2.2  Using Kafka
        2.4.3  Kafka-Monitor
               2.4.3.1  Configure Kafka-Monitor
   2.5  Environment variables
        2.5.1  Environment variables on the Hadoop nodes
        2.5.2  Environment variables on the HBase cluster nodes

1. Introduction

1.1 Purpose

This tutorial is written against CentOS 7.3 and describes how to build a big data platform from the following components: Zookeeper, HDFS, YARN, MapReduce2, HBase, Spark, Hive and Sqoop. Two clusters are deployed: one Hadoop cluster and one HBase cluster.

Hadoop cluster management nodes (2): hadoopManager01, hadoopManager02
    NameNode (hadoop), DFSZKFailoverController (hadoop), ResourceManager (hadoop), Hive (MySQL), Sqoop, MySQL
Hadoop cluster data nodes (3): hadoop01, hadoop02, hadoop03
    JournalNode (hadoop), DataNode (hadoop), QuorumPeerMain (Zookeeper), Spark (master/worker), NodeManager (hadoop)
HBase cluster management nodes (2): hbaseManager01, hbaseManager02
    NameNode (hadoop), DFSZKFailoverController (hadoop), ResourceManager (hadoop), HMaster (hbase), KafkaOffsetMonitor
HBase cluster data nodes (3): hbase01, hbase02, hbase03
    JournalNode (hadoop), DataNode (hadoop), Zookeeper, HRegionServer (hbase), Kafka, NodeManager (hadoop)

Figure 1.1 Components

2. Detailed setup steps

2.1 Preparation

Perform these steps on every node.

2.1.1 Add hostnames

Set each machine's hostname and add hostname-to-IP mappings to the /etc/hosts file on every node (this can be skipped if you have a DNS server):

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.19.31 hadoop01
192.168.19.32 hadoop02
192.168.19.33 hadoop03
192.168.19.34 hadoop04
192.168.19.35 hadoop05

Figure 2-1-1 Adding hostnames

2.1.2 Add sub-users

Add the sub-users on all hosts: the Hadoop cluster uses sub-user hadoop, the HBase cluster uses sub-user hbase.

adduser hadoop
adduser hbase

2.1.3 Set up passwordless SSH login

Generate an SSH key pair for the sub-user on each host, append every host's id_rsa.pub to authorized_keys, copy authorized_keys to all nodes, and set its permissions to 644:

chown -R hadoop:hadoop /home/hadoop
chmod 700 /home/hadoop
chmod 700 /home/hadoop/.ssh
chmod 644 /home/hadoop/.ssh/authorized_keys
chmod 600 /home/hadoop/.ssh/id_rsa

When finished, verify the setup: it succeeds when every host can SSH to every other host without a password.

2.1.4 Disable SELinux

On all nodes set the value in /etc/selinux/config to disabled, then reboot:

SELINUX=disabled

Check with /usr/sbin/sestatus.

2.1.5 Disable the firewall

Stop and disable the firewall on all nodes with the following commands:

systemctl stop firewalld.service
systemctl disable firewalld.service
systemctl status firewalld.service

2.1.6 Install the JDK

All Hadoop components require the JDK, so it must be installed first. This tutorial uses jdk-8u162-linux-x64.rpm. Download the package from the official site, copy it to each node, and install it:

yum install -y jdk-8u162-linux-x64.rpm

[root@hadoop01 ~]# java -version
java version "1.8.0_162"
Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode)

Figure 2-1-5 Installing the JDK

2.2 Install the Hadoop cluster

Installation order: Zookeeper -> Hadoop -> Spark -> Hive -> Sqoop

2.2.1 Zookeeper

Install and configure Zookeeper on nodes hadoop01, hadoop02 and hadoop03 as sub-user hadoop.

2.2.1.1 Configure Zookeeper

1. Create the required directories:

mkdir -p /home/hadoop/opt/data/zookeeper
mkdir -p /home/hadoop/opt/data/zookeeper/zookeeper_log

2. Upload zookeeper-3.4.5-cdh5.10.0.tar.gz to /home/hadoop, then extract it:

tar -zxvf zookeeper-3.4.5-cdh5.10.0.tar.gz

3. Create /home/hadoop/zookeeper-3.4.5-cdh5.10.0/conf/zoo.cfg:

[root@hadoop01 conf]# cat zoo.cfg
tickTime=2000
initLimit=5
syncLimit=2
dataDir=/home/hadoop/opt/data/zookeeper
dataLogDir=/home/hadoop/opt/data/zookeeper/zookeeper_log
clientPort=2181
server.33=hadoop01:2888:3888
server.34=hadoop02:2888:3888
server.35=hadoop03:2888:3888

4. On each node create the file myid in /home/hadoop/opt/data/zookeeper and write the matching id into it:

hadoop01: write 33 into myid
hadoop02: write 34 into myid
hadoop03: write 35 into myid

2.2.1.2 Using Zookeeper

1. Start Zookeeper on each node:

/home/hadoop/zookeeper-3.4.5-cdh5.10.0/bin/zkServer.sh start

2. Test a client connection (clientPort is 2181):

/home/hadoop/zookeeper-3.4.5-cdh5.10.0/bin/zkCli.sh -server hadoop01:2181

3. Check the status:

/home/hadoop/zookeeper-3.4.5-cdh5.10.0/bin/zkServer.sh status

2.3.2 Hadoop

Configure Hadoop on all nodes as sub-user hadoop.

2.3.2.1 Configure Hadoop

1. Extract hadoop-2.6.0-cdh5.10.0.tar.gz to /home/hadoop:

tar -zxvf hadoop-2.6.0-cdh5.10.0.tar.gz

2. Create the required directories:

mkdir -p /home/hadoop/opt/data/hadoop/tmp
mkdir -p /home/hadoop/opt/data/hadoop/hadoop-name
mkdir -p /home/hadoop/opt/data/hadoop/hadoop-data
mkdir -p /home/hadoop/opt/data/hadoop/editsdir/dfs/journalnode
mkdir -p /home/hadoop/opt/data/hadoop/nm-local-dir
mkdir -p /home/hadoop/opt/data/hadoop/hadoop_log
mkdir -p /home/hadoop/opt/data/hadoop/userlogs
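The myid values written in step 4 of section 2.2.1.1 must match the server.N ids declared in zoo.cfg, which is easy to get wrong when done by hand on each node. The following sketch derives a node's id from its zoo.cfg entry; write_myid and its arguments are illustrative names, and the paths in the example default to the ones used in this tutorial:

```shell
#!/bin/sh
# write_myid CFG DATADIR HOST
# Looks up the server.N line in CFG whose hostname equals HOST,
# writes N into DATADIR/myid, and prints N (empty if no match).
write_myid() {
  cfg=$1; datadir=$2; host=$3
  # "server.33=hadoop01:2888:3888" split on '.', '=' and ':' gives
  # fields: server | 33 | hadoop01 | 2888 | 3888
  id=$(awk -F'[.=:]' -v h="$host" '/^server\./ && $3 == h {print $2}' "$cfg")
  [ -n "$id" ] && printf '%s\n' "$id" > "$datadir/myid"
  printf '%s\n' "$id"
}

# Example (run on hadoop02 it would write 34):
# write_myid /home/hadoop/zookeeper-3.4.5-cdh5.10.0/conf/zoo.cfg \
#            /home/hadoop/opt/data/zookeeper "$(hostname)"
```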
3. Edit /home/hadoop/hadoop-2.6.0-cdh5.10.0/etc/hadoop/hadoop-env.sh:

# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_162

4. Configure HDFS HA. The configuration files are core-site.xml and hdfs-site.xml; the full settings are in the hadoop folder.

5. Configure YARN HA. The configuration files are yarn-site.xml and mapred-site.xml; the full settings are in the hadoop folder. (Configure yarn.resourcemanager.ha.id separately on each management node so that it identifies the current management node.)

6. On the NodeManager nodes, place the file spark-2.3.0-yarn-shuffle.jar into /home/hadoop/hadoop-2.6.0-cdh5.10.0/share/hadoop/yarn/

2.3.2.2 Starting Hadoop for the first time

1. On namenode1, create the namespace:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/bin/hdfs zkfc -formatZK

Check the log for: ActiveStandbyElector: Successfully created /hadoop-ha/bigdatacluster in ZK.

2. Start the journalnodes (hadoop01, hadoop02, hadoop03):

/home/hadoop/hadoop-2.6.0-cdh5.10.0/sbin/hadoop-daemon.sh start journalnode

Check: /home/hadoop/opt/data/hadoop/hadoop_log/hadoop-hadoop-journalnode-hadoop02.log

3. Format HDFS on the primary namenode only; this generates the unique cluster ID:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/bin/hdfs namenode -format bigdatacluster

Check: no errors reported.

4. Start the namenode process on the primary namenode:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/sbin/hadoop-daemon.sh start namenode

Check: /home/hadoop/opt/data/hadoop/hadoop_log/hadoop-hadoop-namenode-hadoopmanager01.log

5. On the standby namenode, copy and synchronize the metadata from the primary:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/bin/hdfs namenode -bootstrapStandby

Check for: Exiting with status 0

6. Start the namenode process on the standby namenode:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/sbin/hadoop-daemon.sh start namenode

Check: /home/hadoop/opt/data/hadoop/hadoop_log/hadoop-hadoop-namenode-hadoopmanager02.log

7. Start DFSZKFailoverController on both namenode nodes:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/sbin/hadoop-daemon.sh start zkfc

Check: /home/hadoop/opt/data/hadoop/hadoop_log/hadoop-hadoop-zkfc-hadoopmanager02.log

8. Start HDFS:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/sbin/start-dfs.sh

9. Start YARN:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/sbin/start-yarn.sh

Check: /home/hadoop/hadoop-2.6.0-cdh5.10.0/logs/yarn-hadoop-resourcemanager-hadoopmanager01.log and http://172.16.20.11:8088/cluster/nodes

2.3.3 Spark

Deploy Spark on nodes hadoop01, hadoop02 and hadoop03 as sub-user hadoop.

2.3.3.1 Install Scala (all nodes)

Install Scala as root.

1. Upload the package, extract it, and move it to /usr/local:

tar zxvf scala-2.12.5.tgz
mv scala-2.12.5 /usr/local

2. Configure the environment variables and source the file to apply them:

vi /etc/profile
export SCALA_HOME=/usr/local/scala-2.12.5
export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$SCALA_HOME/bin:$SPARK_HOME/bin
source /etc/profile

3. Check Scala:

scala -version

2.3.3.2 Install Spark

1. Upload spark-2.3.0-bin-hadoop2.6.tgz and extract it to /home/hadoop:

tar -zxvf spark-2.3.0-bin-hadoop2.6.tgz

2. Configure spark-env.sh:

HADOOP_CONF_DIR=/home/hadoop/hadoop-2.6.0-cdh5.10.0
SPARK_HOME=/home/hadoop/spark-2.3.0-bin-hadoop2.6

3. Start Spark on hadoop01-03:

/home/hadoop/spark-2.3.0-bin-hadoop2.6/sbin/start-all.sh

4. Check the web UI:

http://IP:8080/

2.3.4 Hive

Configure Hive on node hadoopManager01 as sub-user hadoop, then deploy MySQL master-slave on hadoopManager01-02 as root.

2.3.4.1 Deploy the MySQL master-slave cluster

1. Remove the default MariaDB installation:

rpm -qa | grep -i mariadb
rpm -e --nodeps mariadb-libs-5.5.52-1.el7.x86_64

2. Upload the MySQL package, then extract it:

tar -xvf mysql-5.7.21-1.el7.x86_64.rpm-bundle.tar

3. The packages have dependencies, so installation order matters: client depends on libs, and server depends on common and client. Install in the following order:

yum install perl -y && yum install net-tools -y
rpm -ivh mysql-community-common-5.7.21-1.el7.x86_64.rpm
rpm -ivh mysql-community-libs-5.7.21-1.el7.x86_64.rpm
rpm -ivh mysql-community-client-5.7.21-1.el7.x86_64.rpm
rpm -ivh mysql-community-server-5.7.21-1.el7.x86_64.rpm

4. To ensure the database directory and files are owned by the mysql login user, initialize as follows if you run the mysql service as root:

mysqld --initialize --user=mysql

5. Start the MySQL database:

systemctl start mysqld.service
systemctl status mysqld.service

6. Get the initial password from the log, then log in to MySQL:

cat /var/log/mysqld.log
mysql -uroot -p

7. Set a new password:

mysql> set password=password("2018");
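Steps 6 and 7 above can be partly automated: during mysqld --initialize, MySQL 5.7 writes a line of the form "A temporary password is generated for root@localhost: ..." to /var/log/mysqld.log. A small sketch to pull the password out (the function name is illustrative):

```shell
#!/bin/sh
# mysql_temp_password LOGFILE
# Prints the most recent temporary root password recorded in LOGFILE
# (mysqld --initialize logs: "A temporary password is generated
#  for root@localhost: <password>").
mysql_temp_password() {
  sed -n 's/.*temporary password is generated for root@localhost: //p' "$1" | tail -n 1
}

# Example:
# mysql -uroot -p"$(mysql_temp_password /var/log/mysqld.log)"
```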
8. Grant privileges for remote access:

mysql> grant all privileges on *.* to 'mysql'@'%' identified by '2018';
mysql> flush privileges;

10. Configure the master database (hadoopManager01):

vi /etc/my.cnf

Add the following parameters:

log-bin=mysql-bin
server-id=2
binlog-ignore-db=information_schema
binlog-ignore-db=cluster
binlog-ignore-db=mysql
binlog-do-db=test

Restart the database, log in, then run:

grant FILE on *.* to 'mysql'@'172.16.20.12' identified by '2018';
grant replication slave on *.* to 'mysql'@'172.16.20.12' identified by '2018';
flush privileges;
SHOW MASTER STATUS;

11. Configure the slave node:

vi /etc/my.cnf

Add the following parameters:

log-bin=mysql-bin
server-id=3
binlog-ignore-db=information_schema
binlog-ignore-db=cluster
binlog-ignore-db=mysql
replicate-do-db=test
replicate-ignore-db=mysql
log-slave-updates
slave-skip-errors=all
slave-net-timeout=60

Restart the database, log in, then run:

CHANGE MASTER TO MASTER_HOST='172.16.20.11', MASTER_USER='mysql', MASTER_PASSWORD='2018', MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=883;
stop slave;
start slave;

2.3.4.2 Configure Hive

1. Log in to MySQL, create a user for Hive, and grant it privileges:

mysql> create user 'hive'@'%' identified by 'hive';
mysql> grant all on *.* to 'hive'@'%' identified by 'hive';
mysql> flush privileges;

2. Create the required directories:

mkdir -p /home/hadoop/opt/data/hive
mkdir -p /home/hadoop/opt/data/hive/logs

3. Upload hive-1.1.0-cdh5.10.0.tar.gz and extract it to /home/hadoop:

tar -zxvf hive-1.1.0-cdh5.10.0.tar.gz

4. Create the file hive-site.xml:

<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoopmanager01:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
  </property>
  <!-- hwi -->
  <property>
    <name>hive.hwi.war.file</name>
    <value>lib/hive-hwi-1.1.0-cdh5.10.0.jar</value>
    <description>This sets the path to the HWI war file, relative to $HIVE_HOME.</description>
  </property>
  <property>
    <name>hive.hwi.listen.host</name>
    <value>0.0.0.0</value>
    <description>This is the host address the Hive Web Interface will listen on</description>
  </property>
  <property>
    <name>hive.hwi.listen.port</name>
    <value>9999</value>
    <description>This is the port the Hive Web Interface will listen on</description>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/home/hadoop/opt/data/hive/hive-${user.name}</value>
    <description>Scratch space for Hive jobs</description>
  </property>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/home/hadoop/opt/data/hive/${user.name}</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
</configuration>

5. Create hive-env.sh:

cp hive-env.sh.template hive-env.sh

# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/home/hadoop/hadoop-2.6.0-cdh5.10.0
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/home/hadoop/hive-1.1.0-cdh5.10.0/conf
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/home/hadoop/hive-1.1.0-cdh5.10.0/lib

6. Upload the MySQL JDBC jar into Hive's lib directory:

tar -zxvf mysql-connector-java-5.1.46.tar.gz
cp mysql-connector-java-5.1.46.jar /home/hadoop/hive-1.1.0-cdh5.10.0/lib

2.3.5 Sqoop

Configure Sqoop on node hadoopManager01 as sub-user hadoop.

2.3.5.1 Configure Sqoop

1. Upload sqoop-1.4.6-cdh5.10.0.tar.gz to /home/hadoop, then extract it:

tar -zxvf sqoop-1.4.6-cdh5.10.0.tar.gz

2. Copy the MySQL JDBC driver into Sqoop's lib directory:

cp mysql-connector-java-5.1.46.jar /home/hadoop/sqoop-1.4.6-cdh5.10.0/lib

3. Edit hbase-env.sh:

export JAVA_HOME=/usr/java/jdk1.8.0_162
export HBASE_LOG_DIR=/home/hadoop/data/hbase/logs
export HADOOP_HOME=/home/hadoop/hadoop-2.6.0-cdh5.10.0
export HBASE_MANAGES_ZK=false
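Both the Hive metastore and Sqoop sit on top of the MySQL deployment from section 2.3.4.1, so it is worth confirming that replication is healthy before continuing: Slave_IO_Running and Slave_SQL_Running must both report Yes in SHOW SLAVE STATUS. A minimal check as a sketch (slave_ok is an illustrative helper name, not part of MySQL):

```shell
#!/bin/sh
# slave_ok STATUS_TEXT
# Succeeds only if both replication threads report "Yes" in the
# output of: mysql -uroot -p -e 'SHOW SLAVE STATUS\G'
slave_ok() {
  printf '%s\n' "$1" | grep -q 'Slave_IO_Running: Yes' &&
  printf '%s\n' "$1" | grep -q 'Slave_SQL_Running: Yes'
}

# Example:
# if slave_ok "$(mysql -uroot -p2018 -e 'SHOW SLAVE STATUS\G')"; then
#   echo "replication healthy"
# fi
```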
4. Configure sqoop-env.sh:

export HADOOP_COMMON_HOME=/home/hadoop/hadoop-2.6.0-cdh5.10.0
export HADOOP_MAPRED_HOME=/home/hadoop/hadoop-2.6.0-cdh5.10.0
export HIVE_HOME=/home/hadoop/hive-1.1.0-cdh5.10.0

2.3.5.2 Using Sqoop

List all databases in MySQL:

sqoop list-databases --connect jdbc:mysql://localhost:3306/ --username mysql --password 2018

2.4 Install the HBase cluster

HBase cluster installation order: Zookeeper -> Hadoop -> HBase -> Kafka -> KafkaOffsetMonitor

Install Zookeeper and Hadoop as described in sections 2.2.1 and 2.3.2, paying attention to the user name configuration.

2.4.1 HBase

Configure the HBase cluster on all nodes.

2.4.1.2 Deploy the distributed HBase cluster

1. Upload hbase-1.2.0-cdh5.10.0.tar.gz to /home/hbase, then extract it:

tar -zxvf hbase-1.2.0-cdh5.10.0.tar.gz

2. Create the required directories:

mkdir -p /home/hbase/opt/data/hbase/logs
mkdir -p /home/hbase/opt/data/hbase/zookeeper
mkdir -p /home/hbase/opt/data/hbase/tmp

3. Edit hbase-env.sh:

export JAVA_HOME=/usr/java/jdk1.8.0_162
export HBASE_LOG_DIR=/home/hbase/opt/data/hbase/logs
export HADOOP_HOME=/home/hbase/hadoop-2.6.0-cdh5.10.0
export HBASE_MANAGES_ZK=false

4. Edit hbase-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements.  See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership.  The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License.  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://bigdatacluster/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.master.port</name>
    <value>16000</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hbase01,hbase02,hbase03</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/hbase/opt/data/hbase/zookeeper</value>
  </property>
  <property>
    <name>hbase.tmp.dir</name>
    <value>/home/hbase/opt/data/hbase/tmp</value>
  </property>
  <property>
    <name>hbase.coprocessor.user.region.classes</name>
    <value>org.apache.hadoop.hbase.coprocessor.AggregateImplementation</value>
  </property>
  <property>
    <name>hbase.superuser</name>
    <value>hbase,root,hadoop</value>
  </property>
  <property>
    <name>hbase.security.authorization</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.coprocessor.master.classes</name>
    <value>org.apache.hadoop.hbase.security.access.AccessController</value>
  </property>
  <property>
    <name>hbase.coprocessor.region.classes</name>
    <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
  </property>
</configuration>
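With several *-site.xml files in play, a quick way to spot-check a single property without opening the file is to extract it by name. This naive sketch assumes the formatting used above (one <name> and one <value> tag per line); for anything more robust a real XML tool such as xmllint is the better choice. xml_prop is an illustrative name:

```shell
#!/bin/sh
# xml_prop FILE PROPERTY
# Prints the <value> that follows <name>PROPERTY</name> in a
# Hadoop-style *-site.xml (assumes one tag per line, as above).
xml_prop() {
  awk -v n="<name>$2</name>" '
    index($0, n)                                { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      # strip the 7-char "<value>" prefix and 8-char "</value>" suffix
      print substr($0, RSTART + 7, RLENGTH - 15); exit
    }' "$1"
}

# Example:
# xml_prop hbase-site.xml hbase.zookeeper.quorum
```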




